All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
@ 2010-06-13 19:20 Pedro Garcia
  2010-06-13 21:56 ` Ben Hutchings
  0 siblings, 1 reply; 35+ messages in thread
From: Pedro Garcia @ 2010-06-13 19:20 UTC (permalink / raw)
  To: netdev

Hi,

I am using kernel 2.6.26 in a linux box, and I have another box in the
network using 802.1p (priority tagging, but no VLAN).

Without the 8021q module loaded in the kernel, all 802.1p packets are
silently discarded (probably as expected, as the protocol is not loaded in
the kernel).

When I load 8021q module, these packets are forwarded to the module, but
they are discarded also as VLAN 0 is not configured.

I think this should not be the default behaviour, as VLAN 0 is not really
a VLAN, so it should be treated differently.

I could define the VLAN 0 (ip link add link eth0 name eth0.dot1p type vlan
id 0), but then I have a lot of issues with the ARP table entries, as to
ping the other box, outgoing traffic goes through eth0, but incoming arp
reply ends up in eth0.dot1p. In the end this means I can not communicate
with the box using 802.1p unless I use 802.1p tagging for all traffic in
the network (the linux box and all other), which is not a must of the spec.

I have developed a patch for vlan_dev.c which makes VLAN 0 to be just
reintroduced to netif_rx but with no VLAN tagging if VLAN 0 has not been
defined, so the default behaviour is to ignore the VLAN tagging and accept
the packet as if it was not tagged, and one can still define something
different for VLAN 0 if desired (so it is backwards compatible).

=======================================================================
*** linux-source-2.6.26/net/8021q/vlan_dev.c    2008-07-13
23:51:29.000000000 +0200
--- vlan_patch/net/8021q/vlan_dev.c     2010-06-13 20:24:46.000000000
+0200
*************** int vlan_skb_recv(struct sk_buff *skb, s
*** 151,156 ****
--- 151,157 ----
        struct vlan_hdr *vhdr;
        unsigned short vid;
        struct net_device_stats *stats;
+       struct net_device *vlan_dev;
        unsigned short vlan_TCI;
  
        skb = skb_share_check(skb, GFP_ATOMIC);
*************** int vlan_skb_recv(struct sk_buff *skb, s
*** 165,176 ****
        vid = (vlan_TCI & VLAN_VID_MASK);
  
        rcu_read_lock();
!       skb->dev = __find_vlan_dev(dev, vid);
!       if (!skb->dev) {
                pr_debug("%s: ERROR: No net_device for VID: %u on dev:
%s\n",
                         __func__, (unsigned int)vid, dev->name);
                goto err_unlock;
        }
  
        skb->dev->last_rx = jiffies;
  
--- 166,191 ----
        vid = (vlan_TCI & VLAN_VID_MASK);
  
        rcu_read_lock();
!       vlan_dev = __find_vlan_dev(dev, vid);
!       if(vlan_dev) {
!               skb->dev = vlan_dev;
!       }
!       else if(vid) {
                pr_debug("%s: ERROR: No net_device for VID: %u on dev:
%s\n",
                         __func__, (unsigned int)vid, dev->name);
                goto err_unlock;
        }
+       else {
+               /* 2010-06-13: Pedro Garcia
+                  The packet is VLAN tagged, but VID is 0 and the user
has
+                  not defined anything for VLAN 0, so it is a 802.1p
packet.
+                  We will just netif_rx it later to the original
interface,
+                  but with the skb->proto set to the wrapped proto, so we
do 
+                  nothing here. */
+ 
+               pr_debug("%s: INFO: VLAN 0 used as default VLAN on dev:
%s\n",
+                        __func__, dev->name);
+       }
  
        skb->dev->last_rx = jiffies;
=======================================================================

I do not really have much experience in touching the kernel so maybe I
have done it totally wrong..., but there are no major changes applied, and
this way the 8021q module is more transparently similar to the expected
behaviour of 802.1p (VLAN 0 means no VLAN).

Regards,
Pedro


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
  2010-06-13 19:20 [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet) Pedro Garcia
@ 2010-06-13 21:56 ` Ben Hutchings
  2010-06-14 16:49   ` Pedro Garcia
  0 siblings, 1 reply; 35+ messages in thread
From: Ben Hutchings @ 2010-06-13 21:56 UTC (permalink / raw)
  To: Pedro Garcia; +Cc: netdev

I have no particular opinion on this change, but you need to read and
follow Documentation/SubmittingPatches.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
  2010-06-13 21:56 ` Ben Hutchings
@ 2010-06-14 16:49   ` Pedro Garcia
  2010-06-14 17:02     ` Ben Hutchings
  0 siblings, 1 reply; 35+ messages in thread
From: Pedro Garcia @ 2010-06-14 16:49 UTC (permalink / raw)
  To: netdev; +Cc: Ben Hutchings

On Sun, 13 Jun 2010 22:56:30 +0100, Ben Hutchings
<bhutchings@solarflare.com> wrote:
> I have no particular opinion on this change, but you need to read and
> follow Documentation/SubmittingPatches.
> 
> Ben.

Sorry, first kernel patch, and I did not know about it. I resubmit with
the correct style / format:

I am using kernel 2.6.26 in a linux box, and I have another box in the
network using 802.1p (priority tagging, but no VLAN).

Without the 8021q module loaded in the kernel, all 802.1p packets are
silently discarded (probably as expected, as the protocol is not loaded in
the kernel).

When I load 8021q module, these packets are forwarded to the module, but
they are discarded also as VLAN 0 is not configured.

I think this should not be the default behaviour, as VLAN 0 is not really
a VLAN, so it should be treated differently.

I could define the VLAN 0 (ip link add link eth0 name eth0.dot1p type vlan
id 0), but then I have a lot of issues with the ARP table entries, as to
ping the other box, outgoing traffic goes through eth0, but incoming arp
reply ends up in eth0.dot1p. In the end this means I can not communicate
with the box using 802.1p unless I use 802.1p tagging for all traffic in
the network (the linux box and all other), which is not a must of the
spec.

I have developed a patch for vlan_dev.c which makes VLAN 0 to be just
reintroduced to netif_rx but with no VLAN tagging if VLAN 0 has not been
defined, so the default behaviour is to ignore the VLAN tagging and accept
the packet as if it was not tagged, and one can still define something
different for VLAN 0 if desired (so it is backwards compatible).

Signed-off-by: Pedro Garcia <pedro.netdev@dondevamos.com>
--- net/8021q/vlan_dev.c.orig   2008-07-13 23:51:29.000000000 +0200
+++ net/8021q/vlan_dev.c        2010-06-14 18:07:35.000000000 +0200
@@ -151,6 +151,7 @@ int vlan_skb_recv(struct sk_buff *skb, s
        struct vlan_hdr *vhdr;
        unsigned short vid;
        struct net_device_stats *stats;
+       struct net_device *vlan_dev;
        unsigned short vlan_TCI;
 
        skb = skb_share_check(skb, GFP_ATOMIC);
@@ -165,11 +166,23 @@ int vlan_skb_recv(struct sk_buff *skb, s
        vid = (vlan_TCI & VLAN_VID_MASK);
 
        rcu_read_lock();
-       skb->dev = __find_vlan_dev(dev, vid);
-       if (!skb->dev) {
+       vlan_dev = __find_vlan_dev(dev, vid);
+       if (vlan_dev) {
+               skb->dev = vlan_dev;
+       } else if (vid) {
                pr_debug("%s: ERROR: No net_device for VID: %u on dev:
%s\n",
                         __func__, (unsigned int)vid, dev->name);
                goto err_unlock;
+       } else {
+               /* 2010-06-13: Pedro Garcia
+                  The packet is VLAN tagged, but VID is 0 and the user
has
+                  not defined anything for VLAN 0, so it is a 802.1p
packet.
+                  We will just netif_rx it later to the original
interface,
+                  but with the skb->proto set to the wrapped proto, so we
do
+                  nothing here. */
+
+               pr_debug("%s: INFO: VLAN 0 used as default VLAN on dev:
%s\n",
+                        __func__, dev->name);
        }
 
        skb->dev->last_rx = jiffies;


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
  2010-06-14 16:49   ` Pedro Garcia
@ 2010-06-14 17:02     ` Ben Hutchings
  2010-06-14 17:11       ` Patrick McHardy
  0 siblings, 1 reply; 35+ messages in thread
From: Ben Hutchings @ 2010-06-14 17:02 UTC (permalink / raw)
  To: Pedro Garcia; +Cc: netdev

On Mon, 2010-06-14 at 18:49 +0200, Pedro Garcia wrote:
> On Sun, 13 Jun 2010 22:56:30 +0100, Ben Hutchings
> <bhutchings@solarflare.com> wrote:
> > I have no particular opinion on this change, but you need to read and
> > follow Documentation/SubmittingPatches.
> > 
> > Ben.
> 
> Sorry, first kernel patch, and I did not know about it. I resubmit with
> the correct style / format:
[...]

Sorry, no you haven't.

- Networking changes go through David Miller's net-next-2.6 tree so you
need to use that as the baseline, not 2.6.26
- Patches should be applicable with -p1, not -p0 (so if you use diff,
you should run it from one directory level up)
- The patch was word-wrapped

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
  2010-06-14 17:02     ` Ben Hutchings
@ 2010-06-14 17:11       ` Patrick McHardy
  2010-06-14 19:12         ` Eric Dumazet
  2010-06-14 19:42         ` Joe Perches
  0 siblings, 2 replies; 35+ messages in thread
From: Patrick McHardy @ 2010-06-14 17:11 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: Pedro Garcia, netdev

Ben Hutchings wrote:
> On Mon, 2010-06-14 at 18:49 +0200, Pedro Garcia wrote:
>   
>> On Sun, 13 Jun 2010 22:56:30 +0100, Ben Hutchings
>> <bhutchings@solarflare.com> wrote:
>>     
>>> I have no particular opinion on this change, but you need to read and
>>> follow Documentation/SubmittingPatches.
>>>
>>> Ben.
>>>       
>> Sorry, first kernel patch, and I did not know about it. I resubmit with
>> the correct style / format:
>>     
> [...]
>
> Sorry, no you haven't.
>
> - Networking changes go through David Miller's net-next-2.6 tree so you
> need to use that as the baseline, not 2.6.26
> - Patches should be applicable with -p1, not -p0 (so if you use diff,
> you should run it from one directory level up)
> - The patch was word-wrapped

Additionally:

- please use the proper comment style, meaning each line begins
  with a '*'

- the pr_debug statements may be useful for debugging, but are
  a bit excessive for the final version

- + /* 2010-06-13: Pedro Garcia

   We have changelogs for this, simply explaining what the code
   does is enough.

- Please CC the maintainer (which is me)

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
  2010-06-14 17:11       ` Patrick McHardy
@ 2010-06-14 19:12         ` Eric Dumazet
  2010-06-16  8:49           ` Pedro Garcia
  2010-06-14 19:42         ` Joe Perches
  1 sibling, 1 reply; 35+ messages in thread
From: Eric Dumazet @ 2010-06-14 19:12 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Ben Hutchings, Pedro Garcia, netdev

Le lundi 14 juin 2010 à 19:11 +0200, Patrick McHardy a écrit :
> Ben Hutchings wrote:
> > On Mon, 2010-06-14 at 18:49 +0200, Pedro Garcia wrote:
> >   
> >> On Sun, 13 Jun 2010 22:56:30 +0100, Ben Hutchings
> >> <bhutchings@solarflare.com> wrote:
> >>     
> >>> I have no particular opinion on this change, but you need to read and
> >>> follow Documentation/SubmittingPatches.
> >>>
> >>> Ben.
> >>>       
> >> Sorry, first kernel patch, and I did not know about it. I resubmit with
> >> the correct style / format:
> >>     
> > [...]
> >
> > Sorry, no you haven't.
> >
> > - Networking changes go through David Miller's net-next-2.6 tree so you
> > need to use that as the baseline, not 2.6.26
> > - Patches should be applicable with -p1, not -p0 (so if you use diff,
> > you should run it from one directory level up)
> > - The patch was word-wrapped
> 
> Additionally:
> 
> - please use the proper comment style, meaning each line begins
>   with a '*'
> 
> - the pr_debug statements may be useful for debugging, but are
>   a bit excessive for the final version
> 
> - + /* 2010-06-13: Pedro Garcia
> 
>    We have changelogs for this, simply explaining what the code
>    does is enough.
> 
> - Please CC the maintainer (which is me)
> --

Pedro, we have two kind of vlan setups :

accelerated and non accelerated ones.

Your patch address non accelated ones only, since you only touch
vlan_skb_recv()

Accelerated vlan can follow these paths :

1) NAPI devices

vlan_gro_receive() -> vlan_gro_common()

2) non NAPI devices

__vlan_hwaccel_rx() 

So you might also patch __vlan_hwaccel_rx() and vlan_gro_common()

Please merge following bits to your patch submission :

http://kerneltrap.org/mailarchive/linux-netdev/2010/5/23/6277868


Good luck for your first patch !



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
  2010-06-14 17:11       ` Patrick McHardy
  2010-06-14 19:12         ` Eric Dumazet
@ 2010-06-14 19:42         ` Joe Perches
  2010-06-14 20:03           ` Eric Dumazet
  1 sibling, 1 reply; 35+ messages in thread
From: Joe Perches @ 2010-06-14 19:42 UTC (permalink / raw)
  To: Pedro Garcia, Patrick McHardy; +Cc: Ben Hutchings, netdev

On Mon, 2010-06-14 at 19:11 +0200, Patrick McHardy wrote:
> Ben Hutchings wrote:
> > On Mon, 2010-06-14 at 18:49 +0200, Pedro Garcia wrote:
> >> On Sun, 13 Jun 2010 22:56:30 +0100, Ben Hutchings
> >> <bhutchings@solarflare.com> wrote:
> >>> I have no particular opinion on this change, but you need to read and
> >>> follow Documentation/SubmittingPatches.
> >> Sorry, first kernel patch, and I did not know about it. I resubmit with
> >> the correct style / format:
> > Sorry, no you haven't.
> > - Networking changes go through David Miller's net-next-2.6 tree so you
> > need to use that as the baseline, not 2.6.26
> > - Patches should be applicable with -p1, not -p0 (so if you use diff,
> > you should run it from one directory level up)
> > - The patch was word-wrapped

Pedro, you could use git format-patch and git send-email
http://linux.yyz.us/git-howto.html
http://www.kernel.org/pub/software/scm/git/docs/git-format-patch.html

> Additionally:
> - the pr_debug statements may be useful for debugging, but are
>   a bit excessive for the final version

Patrick, what's wrong with pr_debug?
Do you prefer pr_devel?

> - Please CC the maintainer (which is me)

scripts/get_maintainer.pl 


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
  2010-06-14 19:42         ` Joe Perches
@ 2010-06-14 20:03           ` Eric Dumazet
  0 siblings, 0 replies; 35+ messages in thread
From: Eric Dumazet @ 2010-06-14 20:03 UTC (permalink / raw)
  To: Joe Perches; +Cc: Pedro Garcia, Patrick McHardy, Ben Hutchings, netdev

Le lundi 14 juin 2010 à 12:42 -0700, Joe Perches a écrit :
> On Mon, 2010-06-14 at 19:11 +0200, Patrick McHardy wrote:

> > Additionally:
> > - the pr_debug statements may be useful for debugging, but are
> >   a bit excessive for the final version
> 
> Patrick, what's wrong with pr_debug?
> Do you prefer pr_devel?

In the patch context, this pr_debug() is not necessary.

Just remove it, since its a normal situation, no need to log anything,
ever.




^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
  2010-06-14 19:12         ` Eric Dumazet
@ 2010-06-16  8:49           ` Pedro Garcia
  2010-06-16  9:08             ` Eric Dumazet
                               ` (3 more replies)
  0 siblings, 4 replies; 35+ messages in thread
From: Pedro Garcia @ 2010-06-16  8:49 UTC (permalink / raw)
  To: netdev; +Cc: Patrick McHardy, Ben Hutchings, Eric Dumazet

On Mon, 14 Jun 2010 21:12:52 +0200, Eric Dumazet <eric.dumazet@gmail.com>
wrote:
> Le lundi 14 juin 2010 à 19:11 +0200, Patrick McHardy a écrit :
>> Ben Hutchings wrote:
>> > On Mon, 2010-06-14 at 18:49 +0200, Pedro Garcia wrote:
>> >   
>> >> On Sun, 13 Jun 2010 22:56:30 +0100, Ben Hutchings
>> >> <bhutchings@solarflare.com> wrote:
>> >>     
>> >>> I have no particular opinion on this change, but you need to read and
>> >>> follow Documentation/SubmittingPatches.
>> >>>
>> >>> Ben.
>> >>>       
>> >> Sorry, first kernel patch, and I did not know about it. I resubmit
>> >> with
>> >> the correct style / format:
>> >>     
>> > [...]
>> >
>> > Sorry, no you haven't.
>> >
>> > - Networking changes go through David Miller's net-next-2.6 tree so you
>> > need to use that as the baseline, not 2.6.26
>> > - Patches should be applicable with -p1, not -p0 (so if you use diff,
>> > you should run it from one directory level up)
>> > - The patch was word-wrapped
>> 
>> Additionally:
>> 
>> - please use the proper comment style, meaning each line begins
>>   with a '*'
>> 
>> - the pr_debug statements may be useful for debugging, but are
>>   a bit excessive for the final version
>> 
>> - + /* 2010-06-13: Pedro Garcia
>> 
>>    We have changelogs for this, simply explaining what the code
>>    does is enough.
>> 
>> - Please CC the maintainer (which is me)
>> --
> 
> Pedro, we have two kind of vlan setups :
> 
> accelerated and non accelerated ones.
> 
> Your patch address non accelated ones only, since you only touch
> vlan_skb_recv()
> 
> Accelerated vlan can follow these paths :
> 
> 1) NAPI devices
> 
> vlan_gro_receive() -> vlan_gro_common()
> 
> 2) non NAPI devices
> 
> __vlan_hwaccel_rx() 
> 
> So you might also patch __vlan_hwaccel_rx() and vlan_gro_common()
> 
> Please merge following bits to your patch submission :
> 
> http://kerneltrap.org/mailarchive/linux-netdev/2010/5/23/6277868
> 
> 
> Good luck for your first patch !

Here it is again. I added the modifications in http://kerneltrap.org/mailarchive/linux-netdev/2010/5/23/6277868 for HW accelerated incoming packets (it did not apply cleanly on the last version of
the kernel, so I applied manually). Now, if the VLAN 0 is not explicitly created by the user, VLAN 0 packets will be treated as no VLAN (802.1p packets), instead of dropping them.

The patch is now for two files: vlan_core (accel) and vlan_dev (non accel)

I can not test on HW accelerated devices, so if someone can check it I will appreciate (even though in the thread above it looked like yes). For non accel I tessted in 2.6.26. Now the patch is for
net-next-2.6, and it compiles OK, but I a have to setup a test environment to check it is still OK (should, but better to test).

Signed-off-by: Pedro Garcia <pedro.netdev@dondevamos.com>
--
diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
index 50f58f5..daaca31 100644
--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -8,6 +8,9 @@
 int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
                      u16 vlan_tci, int polling)
 {
+       struct net_device *vlan_dev;
+       u16 vlan_id;
+
        if (netpoll_rx(skb))
                return NET_RX_DROP;
 
@@ -16,10 +19,14 @@ int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
 
        skb->skb_iif = skb->dev->ifindex;
        __vlan_hwaccel_put_tag(skb, vlan_tci);
-       skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
+       vlan_id = vlan_tci & VLAN_VID_MASK;
+       vlan_dev = vlan_group_get_device(grp, vlan_id);
 
-       if (!skb->dev)
-               goto drop;
+       if (vlan_dev)
+               skb->dev = vlan_dev;
+       else
+               if (vlan_id)
+                       goto drop;
 
        return (polling ? netif_receive_skb(skb) : netif_rx(skb));
 
@@ -82,16 +89,22 @@ vlan_gro_common(struct napi_struct *napi, struct vlan_group *grp,
                unsigned int vlan_tci, struct sk_buff *skb)
 {
        struct sk_buff *p;
+       struct net_device *vlan_dev;
+       u16 vlan_id;
 
        if (skb_bond_should_drop(skb, ACCESS_ONCE(skb->dev->master)))
                skb->deliver_no_wcard = 1;
 
        skb->skb_iif = skb->dev->ifindex;
        __vlan_hwaccel_put_tag(skb, vlan_tci);
-       skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
-
-       if (!skb->dev)
-               goto drop;
+       vlan_id = vlan_tci & VLAN_VID_MASK;
+       vlan_dev = vlan_group_get_device(grp, vlan_id);
+
+       if (vlan_dev)
+               skb->dev = vlan_dev;
+       else
+               if (vlan_id)
+                       goto drop;
 
        for (p = napi->gro_list; p; p = p->next) {
                NAPI_GRO_CB(p)->same_flow =
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index 5298426..65512c3 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -142,6 +142,7 @@ int vlan_skb_recv(struct sk_buff *skb, struct net_device *dev,
 {
        struct vlan_hdr *vhdr;
        struct vlan_rx_stats *rx_stats;
+       struct net_device *vlan_dev;
        u16 vlan_id;
        u16 vlan_tci;
 
@@ -157,8 +158,18 @@ int vlan_skb_recv(struct sk_buff *skb, struct net_device *dev,
        vlan_id = vlan_tci & VLAN_VID_MASK;
 
        rcu_read_lock();
-       skb->dev = __find_vlan_dev(dev, vlan_id);
-       if (!skb->dev) {
+       vlan_dev = __find_vlan_dev(dev, vlan_id);
+
+       /* If the VLAN device is defined, we use it.
+        * If not, and the VID is 0, it is a 802.1p packet (not
+        * really a VLAN), so we will just netif_rx it later to the
+        * original interface, but with the skb->proto set to the
+        * wrapped proto: we do nothing here.
+        */
+
+       if (vlan_dev) {
+               skb->dev = vlan_dev;
+       } else if (vlan_id) {
                pr_debug("%s: ERROR: No net_device for VID: %u on dev: %s\n",
                         __func__, vlan_id, dev->name);
                goto err_unlock;



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
  2010-06-16  8:49           ` Pedro Garcia
@ 2010-06-16  9:08             ` Eric Dumazet
  2010-06-16 11:42               ` Patrick McHardy
  2010-06-27 23:21               ` Pedro Garcia
  2010-06-24 18:28             ` [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet) Pedro Garcia Pelaez
                               ` (2 subsequent siblings)
  3 siblings, 2 replies; 35+ messages in thread
From: Eric Dumazet @ 2010-06-16  9:08 UTC (permalink / raw)
  To: Pedro Garcia; +Cc: netdev, Patrick McHardy, Ben Hutchings

Le mercredi 16 juin 2010 à 10:49 +0200, Pedro Garcia a écrit :
> On Mon, 14 Jun 2010 21:12:52 +0200, Eric Dumazet <eric.dumazet@gmail.com>

> > Good luck for your first patch !
> 
> Here it is again. I added the modifications in http://kerneltrap.org/mailarchive/linux-netdev/2010/5/23/6277868 for HW accelerated incoming packets (it did not apply cleanly on the last version of
> the kernel, so I applied manually). Now, if the VLAN 0 is not explicitly created by the user, VLAN 0 packets will be treated as no VLAN (802.1p packets), instead of dropping them.
> 
> The patch is now for two files: vlan_core (accel) and vlan_dev (non accel)
> 
> I can not test on HW accelerated devices, so if someone can check it I will appreciate (even though in the thread above it looked like yes). For non accel I tessted in 2.6.26. Now the patch is for
> net-next-2.6, and it compiles OK, but I a have to setup a test environment to check it is still OK (should, but better to test).
> 
> Signed-off-by: Pedro Garcia <pedro.netdev@dondevamos.com>

OK, the patch itself is correct.

Now, could you please send it again with a proper changelog ?

In this changelog, please explain why patch is needed, and
keep lines short (< 72 chars), like the one you did in your first mail.

I'll then add my Signed-off-by, since I wrote the accelerated part ;)

Note : I wonder if another patch is needed, in case 8021q module is
_not_ loaded. We probably should accept vlan 0 frames in this case ?



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
  2010-06-16  9:08             ` Eric Dumazet
@ 2010-06-16 11:42               ` Patrick McHardy
  2010-06-16 13:28                 ` Pedro Garcia
  2010-06-27 23:21               ` Pedro Garcia
  1 sibling, 1 reply; 35+ messages in thread
From: Patrick McHardy @ 2010-06-16 11:42 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Pedro Garcia, netdev, Ben Hutchings

Eric Dumazet wrote:
> Le mercredi 16 juin 2010 à 10:49 +0200, Pedro Garcia a écrit :
>> Here it is again. I added the modifications in http://kerneltrap.org/mailarchive/linux-netdev/2010/5/23/6277868 for HW accelerated incoming packets (it did not apply cleanly on the last version of
>> the kernel, so I applied manually). Now, if the VLAN 0 is not explicitly created by the user, VLAN 0 packets will be treated as no VLAN (802.1p packets), instead of dropping them.
>>
>> The patch is now for two files: vlan_core (accel) and vlan_dev (non accel)
>>
>> I can not test on HW accelerated devices, so if someone can check it I will appreciate (even though in the thread above it looked like yes). For non accel I tessted in 2.6.26. Now the patch is for
>> net-next-2.6, and it compiles OK, but I a have to setup a test environment to check it is still OK (should, but better to test).
>>
>> Signed-off-by: Pedro Garcia <pedro.netdev@dondevamos.com>
>>     
>
> OK, the patch itself is correct.
>   

Yes, looks fine to me as well.

> Now, could you please send it again with a proper changelog ?
>
> In this changelog, please explain why patch is needed, and
> keep lines short (< 72 chars), like the one you did in your first mail.
>
> I'll then add my Signed-off-by, since I wrote the accelerated part ;)
>
> Note : I wonder if another patch is needed, in case 8021q module is
> _not_ loaded. We probably should accept vlan 0 frames in this case ?
>   

I agree that this would be best for consistency, but that would mean
adding more special cases to __netif_receive_skb().

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
  2010-06-16 11:42               ` Patrick McHardy
@ 2010-06-16 13:28                 ` Pedro Garcia
  2010-06-16 14:24                   ` Arnd Bergmann
  2010-06-16 14:24                   ` Eric Dumazet
  0 siblings, 2 replies; 35+ messages in thread
From: Pedro Garcia @ 2010-06-16 13:28 UTC (permalink / raw)
  To: netdev; +Cc: Eric Dumazet, Ben Hutchings, Patrick McHardy

On Wed, 16 Jun 2010 13:42:16 +0200, Patrick McHardy <kaber@trash.net>
wrote:
> Eric Dumazet wrote:
>> Le mercredi 16 juin 2010 à 10:49 +0200, Pedro Garcia a écrit :
>>> Here it is again. I added the modifications in
>>> http://kerneltrap.org/mailarchive/linux-netdev/2010/5/23/6277868 for HW
>>> accelerated incoming packets (it did not apply cleanly on the last
>>> version of
>>> the kernel, so I applied manually). Now, if the VLAN 0 is not
>>> explicitly created by the user, VLAN 0 packets will be treated as no
>>> VLAN (802.1p packets), instead of dropping them.
>>>
>>> The patch is now for two files: vlan_core (accel) and vlan_dev (non
>>> accel)
>>>
>>> I can not test on HW accelerated devices, so if someone can check it I
>>> will appreciate (even though in the thread above it looked like yes).
>>> For non accel I tessted in 2.6.26. Now the patch is for
>>> net-next-2.6, and it compiles OK, but I a have to setup a test
>>> environment to check it is still OK (should, but better to test).
>>>
>>> Signed-off-by: Pedro Garcia <pedro.netdev@dondevamos.com>
>>>     
>>
>> OK, the patch itself is correct.
>>   
> 
> Yes, looks fine to me as well.
> 
>> Now, could you please send it again with a proper changelog ?
>>
>> In this changelog, please explain why patch is needed, and
>> keep lines short (< 72 chars), like the one you did in your first mail.
>>
>> I'll then add my Signed-off-by, since I wrote the accelerated part ;)
>>
>> Note : I wonder if another patch is needed, in case 8021q module is
>> _not_ loaded. We probably should accept vlan 0 frames in this case ?
>>   
> 
> I agree that this would be best for consistency, but that would mean
> adding more special cases to __netif_receive_skb().

In my understanding, 802.1p is a "subset" of 802.1q, and they share the 
protocol number. We can do a 802.1p module, but in the end it will end
up reusing most of the code in 802.1q.

In any case defining a VLAN 0 ends up usually in problems with which table 
the ARP entries get stored in. This patch solves the problem to whom 
is not using VLAN 0 explicitly, but if somebody is using VLAN 0 tagging
it will work (whatever "work" means) as before.

Probably a definitive fix would be not to allow the definition of VLAN 0 
in 802.1q module and provide some other way to tag priority packets without
using a subinterface (maybe in the same module or a new 8021p one). I am 
having a look at the kernel to see what happens if we load two modules for 
the same protocol. 

By the way, the changelog I have to write is just the text before the 
patch?


Pedro

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
  2010-06-16 13:28                 ` Pedro Garcia
@ 2010-06-16 14:24                   ` Arnd Bergmann
  2010-06-16 15:28                     ` Patrick McHardy
  2010-06-16 14:24                   ` Eric Dumazet
  1 sibling, 1 reply; 35+ messages in thread
From: Arnd Bergmann @ 2010-06-16 14:24 UTC (permalink / raw)
  To: Pedro Garcia; +Cc: netdev, Eric Dumazet, Ben Hutchings, Patrick McHardy

On Wednesday 16 June 2010, Pedro Garcia wrote:
> Probably a definitive fix would be not to allow the definition of VLAN 0 
> in 802.1q module and provide some other way to tag priority packets without
> using a subinterface (maybe in the same module or a new 8021p one). I am 
> having a look at the kernel to see what happens if we load two modules for 
> the same protocol. 

On a related note, we will also need to support 802.1Qad provider bridges
at some point, which use yet another variation of the VLAN header (actually
two nested VLAN tags) with a different ethertype.
I need this for 802.1Qbg multi-channel VEPA (possibly also 802.1Qbh
port extenders), but I have not yet investigated how to implement this
in the VLAN module.

> By the way, the changelog I have to write is just the text before the 
> patch?

Yes.

	Arnd

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
  2010-06-16 13:28                 ` Pedro Garcia
  2010-06-16 14:24                   ` Arnd Bergmann
@ 2010-06-16 14:24                   ` Eric Dumazet
  1 sibling, 0 replies; 35+ messages in thread
From: Eric Dumazet @ 2010-06-16 14:24 UTC (permalink / raw)
  To: Pedro Garcia; +Cc: netdev, Ben Hutchings, Patrick McHardy

Le mercredi 16 juin 2010 à 15:28 +0200, Pedro Garcia a écrit :

> 
> In my understanding, 802.1p is a "subset" of 802.1q, and they share the 
> protocol number. We can do a 802.1p module, but in the end it will end
> up reusing most of the code in 802.1q.
> 

I was more thinking of a default ETH_P_8021Q rx handler (aka
vlan_skb_recv_minimal) with minimum handling (only accept vid=0 frames),
being overridden by real 8021q handler if module loaded/present.

> In any case defining a VLAN 0 ends up usually in problems with which table 
> the ARP entries get stored in. This patch solves the problem to whom 
> is not using VLAN 0 explicitly, but if somebody is using VLAN 0 tagging
> it will work (whatever "work" means) as before.
> 
> Probably a definitive fix would be not to allow the definition of VLAN 0 
> in 802.1q module and provide some other way to tag priority packets without
> using a subinterface (maybe in the same module or a new 8021p one). I am 
> having a look at the kernel to see what happens if we load two modules for 
> the same protocol. 
> 
> By the way, the changelog I have to write is just the text before the 
> patch?

Yes, you can take a look on any patch around for examples, like...

git show 6e327c11a91d190650df9aabe7d3694d4838bfa1

Check Documentation/SubmittingPatches   section 2)




^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
  2010-06-16 14:24                   ` Arnd Bergmann
@ 2010-06-16 15:28                     ` Patrick McHardy
  2010-06-16 18:26                       ` Arnd Bergmann
  0 siblings, 1 reply; 35+ messages in thread
From: Patrick McHardy @ 2010-06-16 15:28 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: Pedro Garcia, netdev, Eric Dumazet, Ben Hutchings

Arnd Bergmann wrote:
> On Wednesday 16 June 2010, Pedro Garcia wrote:
>   
>> Probably a definitive fix would be not to allow the definition of VLAN 0 
>> in 802.1q module and provide some other way to tag priority packets without
>> using a subinterface (maybe in the same module or a new 8021p one). I am 
>> having a look at the kernel to see what happens if we load two modules for 
>> the same protocol. 
>>     
>
> On a related note, we will also need to support 802.1Qad provider bridges
> at some point, which use yet another variation of the VLAN header (actually
> two nested VLAN tags) with a different ethertype.
> I need this for 802.1Qbg multi-channel VEPA (possibly also 802.1Qbh
> port extenders), but I have not yet investigated how to implement this
> in the VLAN module.
>   

Since we don't have any special VLAN handling in the bridging code, I
guess it comes down to optionally using a different ethertype value
(0x88a8) in the VLAN code. We probably also need some indication from
device drivers whether they are able to add these headers to avoid
trying to offload tagging in case they're not.



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
  2010-06-16 15:28                     ` Patrick McHardy
@ 2010-06-16 18:26                       ` Arnd Bergmann
  2010-06-16 18:58                         ` Eric Dumazet
  0 siblings, 1 reply; 35+ messages in thread
From: Arnd Bergmann @ 2010-06-16 18:26 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Pedro Garcia, netdev, Eric Dumazet, Ben Hutchings

On Wednesday 16 June 2010 17:28:23 Patrick McHardy wrote:

> Since we don't have any special VLAN handling in the bridging code, I
> guess it comes down to optionally using a different ethertype value
> (0x88a8) in the VLAN code. We probably also need some indication from
> device drivers whether they are able to add these headers to avoid
> trying to offload tagging in case they're not.

It's probably a little more than just supporting the new ethertype, but not
much. The outer tag can be handled like our current VLAN module does,
but the standard does not allow a regular frame to be encapsulated directly,
but rather requires one of 

1. In 802.1ad: an 802.1Q VLAN tag (ethertype 0x8100) followed by the frame
2. In 802.1ah: A service tag (ethertype 0x88e7) followed by the 802.1Q VLAN tag
   and then the frame.

Maybe what we can do is extend the vlan code to understand all three frame
formats (q, ad and ah) or at least the first two so we configure both the
provider VID and the Customer VID for the interface in case of 802.1ad but
only the regular VID in 802.1Q.

Device drivers can then flag whether they support both formats or just
the regular Q tag.

	Arnd

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
  2010-06-16 18:26                       ` Arnd Bergmann
@ 2010-06-16 18:58                         ` Eric Dumazet
  2010-06-17  8:56                           ` Vladislav Zolotarov
  0 siblings, 1 reply; 35+ messages in thread
From: Eric Dumazet @ 2010-06-16 18:58 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: Patrick McHardy, Pedro Garcia, netdev, Ben Hutchings

Le mercredi 16 juin 2010 à 20:26 +0200, Arnd Bergmann a écrit :
> On Wednesday 16 June 2010 17:28:23 Patrick McHardy wrote:
> 
> > Since we don't have any special VLAN handling in the bridging code, I
> > guess it comes down to optionally using a different ethertype value
> > (0x88a8) in the VLAN code. We probably also need some indication from
> > device drivers whether they are able to add these headers to avoid
> > trying to offload tagging in case they're not.
> 
> It's probably a little more than just supporting the new ethertype, but not
> much. The outer tag can be handled like our current VLAN module does,
> but the standard does not allow a regular frame to be encapsulated directly,
> but rather requires one of 
> 
> 1. In 802.1ad: an 802.1Q VLAN tag (ethertype 0x8100) followed by the frame
> 2. In 802.1ah: A service tag (ethertype 0x88e7) followed by the 802.1Q VLAN tag
>    and then the frame.
> 
> Maybe what we can do is extend the vlan code to understand all three frame
> formats (q, ad and ah) or at least the first two so we configure both the
> provider VID and the Customer VID for the interface in case of 802.1ad but
> only the regular VID in 802.1Q.
> 
> Device drivers can then flag whether they support both formats or just
> the regular Q tag.
> 
> 	Arnd

Speaking of device drivers, I see bnx2 (hardware accelerated) is able to
insert a 8021q tag in case no vlgrp is defined (the 8201q tag that was
removed by NIC)... interesting ping pong games, since our 8021q stack
will remove it again, eventually.

So VLAN 0 'problem' on bnx2 could be solved with following patch
(avoiding this insert if vtag==0)



diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c
index 522de9f..b5d4d05 100644
--- a/drivers/net/bnx2.c
+++ b/drivers/net/bnx2.c
@@ -3192,7 +3192,7 @@ bnx2_rx_int(struct bnx2 *bp, struct bnx2_napi *bnapi, int budget)
 				hw_vlan = 1;
 			else
 #endif
-			{
+			if (vtag) {
 				struct vlan_ethhdr *ve = (struct vlan_ethhdr *)
 					__skb_push(skb, 4);
 



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* RE: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
  2010-06-16 18:58                         ` Eric Dumazet
@ 2010-06-17  8:56                           ` Vladislav Zolotarov
  2010-06-17 10:28                             ` Eric Dumazet
  0 siblings, 1 reply; 35+ messages in thread
From: Vladislav Zolotarov @ 2010-06-17  8:56 UTC (permalink / raw)
  To: Eric Dumazet, Arnd Bergmann
  Cc: Patrick McHardy, Pedro Garcia, netdev, Ben Hutchings



> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On
> Behalf Of Eric Dumazet
> Sent: Wednesday, June 16, 2010 9:58 PM
> To: Arnd Bergmann
> Cc: Patrick McHardy; Pedro Garcia; netdev@vger.kernel.org; Ben Hutchings
> Subject: Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag"
> (802.1p packet)
> 
> Le mercredi 16 juin 2010 à 20:26 +0200, Arnd Bergmann a écrit :
> > On Wednesday 16 June 2010 17:28:23 Patrick McHardy wrote:
> >
> > > Since we don't have any special VLAN handling in the bridging code, I
> > > guess it comes down to optionally using a different ethertype value
> > > (0x88a8) in the VLAN code. We probably also need some indication from
> > > device drivers whether they are able to add these headers to avoid
> > > trying to offload tagging in case they're not.
> >
> > It's probably a little more than just supporting the new ethertype, but not
> > much. The outer tag can be handled like our current VLAN module does,
> > but the standard does not allow a regular frame to be encapsulated
> directly,
> > but rather requires one of
> >
> > 1. In 802.1ad: an 802.1Q VLAN tag (ethertype 0x8100) followed by the frame
> > 2. In 802.1ah: A service tag (ethertype 0x88e7) followed by the 802.1Q VLAN
> tag
> >    and then the frame.
> >
> > Maybe what we can do is extend the vlan code to understand all three frame
> > formats (q, ad and ah) or at least the first two so we configure both the
> > provider VID and the Customer VID for the interface in case of 802.1ad but
> > only the regular VID in 802.1Q.
> >
> > Device drivers can then flag whether they support both formats or just
> > the regular Q tag.
> >
> > 	Arnd
> 
> Speaking of device drivers, I see bnx2 (hardware accelerated) is able to
> insert a 8021q tag in case no vlgrp is defined (the 8201q tag that was
> removed by NIC)... interesting ping pong games, since our 8021q stack
> will remove it again, eventually.
> 
> So VLAN 0 'problem' on bnx2 could be solved with following patch
> (avoiding this insert if vtag==0)
> 
> 
> 
> diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c
> index 522de9f..b5d4d05 100644
> --- a/drivers/net/bnx2.c
> +++ b/drivers/net/bnx2.c
> @@ -3192,7 +3192,7 @@ bnx2_rx_int(struct bnx2 *bp, struct bnx2_napi *bnapi,
> int budget)
>  				hw_vlan = 1;
>  			else
>  #endif
> -			{
> +			if (vtag) {
>  				struct vlan_ethhdr *ve = (struct vlan_ethhdr *)
>  					__skb_push(skb, 4);
> 
> 
> 
> --

This way u will loose all the priority information that was on the VLAN header.

> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
  2010-06-17  8:56                           ` Vladislav Zolotarov
@ 2010-06-17 10:28                             ` Eric Dumazet
  2010-06-17 14:08                               ` Vladislav Zolotarov
  0 siblings, 1 reply; 35+ messages in thread
From: Eric Dumazet @ 2010-06-17 10:28 UTC (permalink / raw)
  To: Vladislav Zolotarov
  Cc: Arnd Bergmann, Patrick McHardy, Pedro Garcia, netdev, Ben Hutchings

Le jeudi 17 juin 2010 à 01:56 -0700, Vladislav Zolotarov a écrit :
> 
> > -----Original Message-----
> > From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On
> > Behalf Of Eric Dumazet
> > Sent: Wednesday, June 16, 2010 9:58 PM
> > To: Arnd Bergmann
> > Cc: Patrick McHardy; Pedro Garcia; netdev@vger.kernel.org; Ben Hutchings
> > Subject: Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag"
> > (802.1p packet)
> > 
> > Le mercredi 16 juin 2010 à 20:26 +0200, Arnd Bergmann a écrit :
> > > On Wednesday 16 June 2010 17:28:23 Patrick McHardy wrote:
> > >
> > > > Since we don't have any special VLAN handling in the bridging code, I
> > > > guess it comes down to optionally using a different ethertype value
> > > > (0x88a8) in the VLAN code. We probably also need some indication from
> > > > device drivers whether they are able to add these headers to avoid
> > > > trying to offload tagging in case they're not.
> > >
> > > It's probably a little more than just supporting the new ethertype, but not
> > > much. The outer tag can be handled like our current VLAN module does,
> > > but the standard does not allow a regular frame to be encapsulated
> > directly,
> > > but rather requires one of
> > >
> > > 1. In 802.1ad: an 802.1Q VLAN tag (ethertype 0x8100) followed by the frame
> > > 2. In 802.1ah: A service tag (ethertype 0x88e7) followed by the 802.1Q VLAN
> > tag
> > >    and then the frame.
> > >
> > > Maybe what we can do is extend the vlan code to understand all three frame
> > > formats (q, ad and ah) or at least the first two so we configure both the
> > > provider VID and the Customer VID for the interface in case of 802.1ad but
> > > only the regular VID in 802.1Q.
> > >
> > > Device drivers can then flag whether they support both formats or just
> > > the regular Q tag.
> > >
> > > 	Arnd
> > 
> > Speaking of device drivers, I see bnx2 (hardware accelerated) is able to
> > insert a 8021q tag in case no vlgrp is defined (the 8201q tag that was
> > removed by NIC)... interesting ping pong games, since our 8021q stack
> > will remove it again, eventually.
> > 
> > So VLAN 0 'problem' on bnx2 could be solved with following patch
> > (avoiding this insert if vtag==0)
> > 
> > 
> > 
> > diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c
> > index 522de9f..b5d4d05 100644
> > --- a/drivers/net/bnx2.c
> > +++ b/drivers/net/bnx2.c
> > @@ -3192,7 +3192,7 @@ bnx2_rx_int(struct bnx2 *bp, struct bnx2_napi *bnapi,
> > int budget)
> >  				hw_vlan = 1;
> >  			else
> >  #endif
> > -			{
> > +			if (vtag) {
> >  				struct vlan_ethhdr *ve = (struct vlan_ethhdr *)
> >  					__skb_push(skb, 4);
> > 
> > 
> > 
> > --
> 
> This way u will loose all the priority information that was on the VLAN header.

16bits vtag = 0 : there is no priority information.




^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
  2010-06-17 10:28                             ` Eric Dumazet
@ 2010-06-17 14:08                               ` Vladislav Zolotarov
  0 siblings, 0 replies; 35+ messages in thread
From: Vladislav Zolotarov @ 2010-06-17 14:08 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Arnd Bergmann, Patrick McHardy, Pedro Garcia, netdev, Ben Hutchings



> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On
> Behalf Of Eric Dumazet
> Sent: Thursday, June 17, 2010 1:29 PM
> To: Vladislav Zolotarov
> Cc: Arnd Bergmann; Patrick McHardy; Pedro Garcia; netdev@vger.kernel.org; Ben
> Hutchings
> Subject: RE: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag"
> (802.1p packet)
> 
> Le jeudi 17 juin 2010 à 01:56 -0700, Vladislav Zolotarov a écrit :
> >
> > > -----Original Message-----
> > > From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
> On
> > > Behalf Of Eric Dumazet
> > > Sent: Wednesday, June 16, 2010 9:58 PM
> > > To: Arnd Bergmann
> > > Cc: Patrick McHardy; Pedro Garcia; netdev@vger.kernel.org; Ben Hutchings
> > > Subject: Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag"
> > > (802.1p packet)
> > >
> > > Le mercredi 16 juin 2010 à 20:26 +0200, Arnd Bergmann a écrit :
> > > > On Wednesday 16 June 2010 17:28:23 Patrick McHardy wrote:
> > > >
> > > > > Since we don't have any special VLAN handling in the bridging code, I
> > > > > guess it comes down to optionally using a different ethertype value
> > > > > (0x88a8) in the VLAN code. We probably also need some indication from
> > > > > device drivers whether they are able to add these headers to avoid
> > > > > trying to offload tagging in case they're not.
> > > >
> > > > It's probably a little more than just supporting the new ethertype, but
> not
> > > > much. The outer tag can be handled like our current VLAN module does,
> > > > but the standard does not allow a regular frame to be encapsulated
> > > directly,
> > > > but rather requires one of
> > > >
> > > > 1. In 802.1ad: an 802.1Q VLAN tag (ethertype 0x8100) followed by the
> frame
> > > > 2. In 802.1ah: A service tag (ethertype 0x88e7) followed by the 802.1Q
> VLAN
> > > tag
> > > >    and then the frame.
> > > >
> > > > Maybe what we can do is extend the vlan code to understand all three
> frame
> > > > formats (q, ad and ah) or at least the first two so we configure both
> the
> > > > provider VID and the Customer VID for the interface in case of 802.1ad
> but
> > > > only the regular VID in 802.1Q.
> > > >
> > > > Device drivers can then flag whether they support both formats or just
> > > > the regular Q tag.
> > > >
> > > > 	Arnd
> > >
> > > Speaking of device drivers, I see bnx2 (hardware accelerated) is able to
> > > insert a 8021q tag in case no vlgrp is defined (the 8201q tag that was
> > > removed by NIC)... interesting ping pong games, since our 8021q stack
> > > will remove it again, eventually.
> > >
> > > So VLAN 0 'problem' on bnx2 could be solved with following patch
> > > (avoiding this insert if vtag==0)
> > >
> > >
> > >
> > > diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c
> > > index 522de9f..b5d4d05 100644
> > > --- a/drivers/net/bnx2.c
> > > +++ b/drivers/net/bnx2.c
> > > @@ -3192,7 +3192,7 @@ bnx2_rx_int(struct bnx2 *bp, struct bnx2_napi
> *bnapi,
> > > int budget)
> > >  				hw_vlan = 1;
> > >  			else
> > >  #endif
> > > -			{
> > > +			if (vtag) {
> > >  				struct vlan_ethhdr *ve = (struct vlan_ethhdr *)
> > >  					__skb_push(skb, 4);
> > >
> > >
> > >
> > > --
> >
> > This way u will loose all the priority information that was on the VLAN
> header.
> 
> 16bits vtag = 0 : there is no priority information.

According to IEEE 802.1p PCP=0 is legal priority, CFI bit is usually zero. So, VTAG=0 would mark a priority tagged frame with a priority 0 and it should be handled differently than a frame with no priority at all and your patch will prevent it.

> 
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
  2010-06-16  8:49           ` Pedro Garcia
  2010-06-16  9:08             ` Eric Dumazet
@ 2010-06-24 18:28             ` Pedro Garcia Pelaez
  2010-07-08 12:54             ` Vladislav Zolotarov
  2010-07-08 13:51             ` Vladislav Zolotarov
  3 siblings, 0 replies; 35+ messages in thread
From: Pedro Garcia Pelaez @ 2010-06-24 18:28 UTC (permalink / raw)
  To: Pedro Garcia; +Cc: netdev, Patrick McHardy, Ben Hutchings, Eric Dumazet

On Wed, 16 Jun 2010 10:49:27 +0200, Pedro Garcia
<pedro.netdev@dondevamos.com> wrote:
> On Mon, 14 Jun 2010 21:12:52 +0200, Eric Dumazet
<eric.dumazet@gmail.com>
> wrote:
>> Le lundi 14 juin 2010 à 19:11 +0200, Patrick McHardy a écrit :
>>> Ben Hutchings wrote:
>>> > On Mon, 2010-06-14 at 18:49 +0200, Pedro Garcia wrote:
>>> >   
>>> >> On Sun, 13 Jun 2010 22:56:30 +0100, Ben Hutchings
>>> >> <bhutchings@solarflare.com> wrote:
>>> >>     
>>> >>> I have no particular opinion on this change, but you need to read
>>> >>> and
>>> >>> follow Documentation/SubmittingPatches.
>>> >>>
>>> >>> Ben.
>>> >>>       
>>> >> Sorry, first kernel patch, and I did not know about it. I resubmit
>>> >> with
>>> >> the correct style / format:
>>> >>     
>>> > [...]
>>> >
>>> > Sorry, no you haven't.
>>> >
>>> > - Networking changes go through David Miller's net-next-2.6 tree so
>>> > you
>>> > need to use that as the baseline, not 2.6.26
>>> > - Patches should be applicable with -p1, not -p0 (so if you use
diff,
>>> > you should run it from one directory level up)
>>> > - The patch was word-wrapped
>>> 
>>> Additionally:
>>> 
>>> - please use the proper comment style, meaning each line begins
>>>   with a '*'
>>> 
>>> - the pr_debug statements may be useful for debugging, but are
>>>   a bit excessive for the final version
>>> 
>>> - + /* 2010-06-13: Pedro Garcia
>>> 
>>>    We have changelogs for this, simply explaining what the code
>>>    does is enough.
>>> 
>>> - Please CC the maintainer (which is me)
>>> --
>> 
>> Pedro, we have two kind of vlan setups :
>> 
>> accelerated and non accelerated ones.
>> 
>> Your patch address non accelated ones only, since you only touch
>> vlan_skb_recv()
>> 
>> Accelerated vlan can follow these paths :
>> 
>> 1) NAPI devices
>> 
>> vlan_gro_receive() -> vlan_gro_common()
>> 
>> 2) non NAPI devices
>> 
>> __vlan_hwaccel_rx() 
>> 
>> So you might also patch __vlan_hwaccel_rx() and vlan_gro_common()
>> 
>> Please merge following bits to your patch submission :
>> 
>> http://kerneltrap.org/mailarchive/linux-netdev/2010/5/23/6277868
>> 
>> 
>> Good luck for your first patch !
> 
> Here it is again. I added the modifications in
> http://kerneltrap.org/mailarchive/linux-netdev/2010/5/23/6277868 for HW
> accelerated incoming packets (it did not apply cleanly on the last
version
> of
> the kernel, so I applied manually). Now, if the VLAN 0 is not explicitly
> created by the user, VLAN 0 packets will be treated as no VLAN (802.1p
> packets), instead of dropping them.
> 
> The patch is now for two files: vlan_core (accel) and vlan_dev (non
accel)
> 
> I can not test on HW accelerated devices, so if someone can check it I
> will appreciate (even though in the thread above it looked like yes).
For
> non accel I tessted in 2.6.26. Now the patch is for
> net-next-2.6, and it compiles OK, but I a have to setup a test
environment
> to check it is still OK (should, but better to test).
> 

I tested the pacth under net-next-2.6, and it OOPSed the kernel (worked
under 2.6.26 but not under 2.6.35). I have found why and solved it, but
now, to my surprise, it only works when I leave the interface in
promiscuous mode.

After a lot of debugging, looks like the skb does not even arrive to
__netif_receive_skb, unless in promiscuous mode. Under what circunstances
could this happen?

Pedro 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
  2010-06-16  9:08             ` Eric Dumazet
  2010-06-16 11:42               ` Patrick McHardy
@ 2010-06-27 23:21               ` Pedro Garcia
  2010-06-30 20:16                 ` David Miller
  1 sibling, 1 reply; 35+ messages in thread
From: Pedro Garcia @ 2010-06-27 23:21 UTC (permalink / raw)
  To: netdev; +Cc: Patrick McHardy, Ben Hutchings, Eric Dumazet

On Wed, 16 Jun 2010 11:08:04 +0200, Eric Dumazet <eric.dumazet@gmail.com>
wrote:
> Le mercredi 16 juin 2010 à 10:49 +0200, Pedro Garcia a écrit :
>> On Mon, 14 Jun 2010 21:12:52 +0200, Eric Dumazet <eric.dumazet@gmail.com>
> 
>> > Good luck for your first patch !
>> 
>> Here it is again. I added the modifications in
>> http://kerneltrap.org/mailarchive/linux-netdev/2010/5/23/6277868 for HW
>> accelerated incoming packets (it did not apply cleanly on the last
>> version of
>> the kernel, so I applied manually). Now, if the VLAN 0 is not explicitly
>> created by the user, VLAN 0 packets will be treated as no VLAN (802.1p
>> packets), instead of dropping them.
>> 
>> The patch is now for two files: vlan_core (accel) and vlan_dev (non
>> accel)
>> 
>> I can not test on HW accelerated devices, so if someone can check it I
>> will appreciate (even though in the thread above it looked like yes). For
>> non accel I tessted in 2.6.26. Now the patch is for
>> net-next-2.6, and it compiles OK, but I a have to setup a test
>> environment to check it is still OK (should, but better to test).
>> 
>> Signed-off-by: Pedro Garcia <pedro.netdev@dondevamos.com>
> 
> OK, the patch itself is correct.
> 
> Now, could you please send it again with a proper changelog ?
> 
> In this changelog, please explain why patch is needed, and
> keep lines short (< 72 chars), like the one you did in your first mail.
> 
> I'll then add my Signed-off-by, since I wrote the accelerated part ;)
> 
> Note : I wonder if another patch is needed, in case 8021q module is
> _not_ loaded. We probably should accept vlan 0 frames in this case ?

Last version of the patch. Now I think it is OK, of course pending 
Eric's signed-off-by for the accel HW part.

If this is too long for a changelog, tell me and I will try to sum it
up:

- Without the 8021q module loaded in the kernel, all 802.1p packets 
(VLAN 0 but QoS tagging) are silently discarded (as expected, as 
the protocol is not loaded).

- Without this patch in 8021q module, these packets are forwarded to 
the module, but they are discarded also if VLAN 0 is not configured,
which should not be the default behaviour, as VLAN 0 is not really
a VLANed packet but a 802.1p packet. Defining VLAN 0 makes it almost
impossible to communicate with mixed 802.1p and non 802.1p devices on
the same network due to arp table issues.

- Changed logic to skip vlan specific code in vlan_skb_recv if VLAN 
is 0 and we have not defined a VLAN with ID 0, but we accept the 
packet with the encapsulated proto and pass it later to netif_rx.

- In the vlan device event handler, added some logic to add VLAN 0 
to HW filter in devices that support it (this prevented any traffic
in VLAN 0 to reach the stack in e1000e with HW filter under 2.6.35,
and probably also with other HW filtered cards, so we fix it here).

- In the vlan unregister logic, prevent the elimination of VLAN 0 
in devices with HW filter.

- The default behaviour is to ignore the VLAN 0 tagging and accept
the packet as if it was not tagged, but we can still define a 
VLAN 0 if desired (so it is backwards compatible).

Signed-off-by: Pedro Garcia <pedro.netdev@dondevamos.com>
--
diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index 3c1c8c1..d9abc43 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -155,9 +155,10 @@ void unregister_vlan_dev(struct net_device *dev, struct list_head *head)
        BUG_ON(!grp);
 
        /* Take it out of our own structures, but be sure to interlock with
-        * HW accelerating devices or SW vlan input packet processing.
+        * HW accelerating devices or SW vlan input packet processing if
+        * VLAN is not 0 (leave it there for 802.1p).
         */
-       if (real_dev->features & NETIF_F_HW_VLAN_FILTER)
+       if (vlan_id && (real_dev->features & NETIF_F_HW_VLAN_FILTER))
                ops->ndo_vlan_rx_kill_vid(real_dev, vlan_id);
 
        grp->nr_vlans--;
@@ -419,6 +420,14 @@ static int vlan_device_event(struct notifier_block *unused, unsigned long event,
        if (is_vlan_dev(dev))
                __vlan_device_event(dev, event);
 
+       if ((event == NETDEV_UP) &&
+           (dev->features & NETIF_F_HW_VLAN_FILTER) &&
+           (dev->netdev_ops->ndo_vlan_rx_add_vid)) {
+               pr_info("8021q: adding VLAN 0 to HW filter on device %s\n",
+                       dev->name);
+               dev->netdev_ops->ndo_vlan_rx_add_vid(dev, 0);
+       }
+
        grp = __vlan_find_group(dev);
        if (!grp)
                goto out;
diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
index 50f58f5..daaca31 100644
--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -8,6 +8,9 @@
 int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
                      u16 vlan_tci, int polling)
 {
+       struct net_device *vlan_dev;
+       u16 vlan_id;
+
        if (netpoll_rx(skb))
                return NET_RX_DROP;
 
@@ -16,10 +19,14 @@ int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
 
        skb->skb_iif = skb->dev->ifindex;
        __vlan_hwaccel_put_tag(skb, vlan_tci);
-       skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
+       vlan_id = vlan_tci & VLAN_VID_MASK;
+       vlan_dev = vlan_group_get_device(grp, vlan_id);
 
-       if (!skb->dev)
-               goto drop;
+       if (vlan_dev)
+               skb->dev = vlan_dev;
+       else
+               if (vlan_id)
+                       goto drop;
 
        return (polling ? netif_receive_skb(skb) : netif_rx(skb));
 
@@ -82,16 +89,22 @@ vlan_gro_common(struct napi_struct *napi, struct vlan_group *grp,
                unsigned int vlan_tci, struct sk_buff *skb)
 {
        struct sk_buff *p;
+       struct net_device *vlan_dev;
+       u16 vlan_id;
 
        if (skb_bond_should_drop(skb, ACCESS_ONCE(skb->dev->master)))
                skb->deliver_no_wcard = 1;
 
        skb->skb_iif = skb->dev->ifindex;
        __vlan_hwaccel_put_tag(skb, vlan_tci);
-       skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
-
-       if (!skb->dev)
-               goto drop;
+       vlan_id = vlan_tci & VLAN_VID_MASK;
+       vlan_dev = vlan_group_get_device(grp, vlan_id);
+
+       if (vlan_dev)
+               skb->dev = vlan_dev;
+       else
+               if (vlan_id)
+                       goto drop;
 
        for (p = napi->gro_list; p; p = p->next) {
                NAPI_GRO_CB(p)->same_flow =
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index 5298426..21f7229 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -142,6 +142,7 @@ int vlan_skb_recv(struct sk_buff *skb, struct net_device *dev,
 {
        struct vlan_hdr *vhdr;
        struct vlan_rx_stats *rx_stats;
+       struct net_device *vlan_dev;
        u16 vlan_id;
        u16 vlan_tci;
 
@@ -157,53 +158,69 @@ int vlan_skb_recv(struct sk_buff *skb, struct net_device *dev,
        vlan_id = vlan_tci & VLAN_VID_MASK;
 
        rcu_read_lock();
-       skb->dev = __find_vlan_dev(dev, vlan_id);
-       if (!skb->dev) {
-               pr_debug("%s: ERROR: No net_device for VID: %u on dev: %s\n",
-                        __func__, vlan_id, dev->name);
-               goto err_unlock;
-       }
-
-       rx_stats = per_cpu_ptr(vlan_dev_info(skb->dev)->vlan_rx_stats,
-                              smp_processor_id());
-       rx_stats->rx_packets++;
-       rx_stats->rx_bytes += skb->len;
-
-       skb_pull_rcsum(skb, VLAN_HLEN);
-
-       skb->priority = vlan_get_ingress_priority(skb->dev, vlan_tci);
+       vlan_dev = __find_vlan_dev(dev, vlan_id);
 
-       pr_debug("%s: priority: %u for TCI: %hu\n",
-                __func__, skb->priority, vlan_tci);
-
-       switch (skb->pkt_type) {
-       case PACKET_BROADCAST: /* Yeah, stats collect these together.. */
-               /* stats->broadcast ++; // no such counter :-( */
-               break;
-
-       case PACKET_MULTICAST:
-               rx_stats->multicast++;
-               break;
+       /* If the VLAN device is defined, we use it.
+        * If not, and the VID is 0, it is a 802.1p packet (not
+        * really a VLAN), so we will just netif_rx it later to the
+        * original interface, but with the skb->proto set to the
+        * wrapped proto: we do nothing here.
+        */
 
-       case PACKET_OTHERHOST:
-               /* Our lower layer thinks this is not local, let's make sure.
-                * This allows the VLAN to have a different MAC than the
-                * underlying device, and still route correctly.
-                */
-               if (!compare_ether_addr(eth_hdr(skb)->h_dest,
-                                       skb->dev->dev_addr))
-                       skb->pkt_type = PACKET_HOST;
-               break;
-       default:
-               break;
+       if (!vlan_dev) {
+               if (vlan_id) {
+                       pr_debug("%s: ERROR: No net_device for VID: %u on dev: %s\n",
+                                __func__, vlan_id, dev->name);
+                       goto err_unlock;
+               }
+               rx_stats = NULL;
+       } else {
+               skb->dev = vlan_dev;
+
+               rx_stats = per_cpu_ptr(vlan_dev_info(skb->dev)->vlan_rx_stats,
+                                       smp_processor_id());
+               rx_stats->rx_packets++;
+               rx_stats->rx_bytes += skb->len;
+
+               skb->priority = vlan_get_ingress_priority(skb->dev, vlan_tci);
+
+               pr_debug("%s: priority: %u for TCI: %hu\n",
+                        __func__, skb->priority, vlan_tci);
+
+               switch (skb->pkt_type) {
+               case PACKET_BROADCAST:
+                       /* Yeah, stats collect these together.. */
+                       /* stats->broadcast ++; // no such counter :-( */
+                       break;
+
+               case PACKET_MULTICAST:
+                       rx_stats->multicast++;
+                       break;
+
+               case PACKET_OTHERHOST:
+                       /* Our lower layer thinks this is not local, let's make
+                       * sure.
+                       * This allows the VLAN to have a different MAC than the
+                       * underlying device, and still route correctly.
+                       */
+                       if (!compare_ether_addr(eth_hdr(skb)->h_dest,
+                                               skb->dev->dev_addr))
+                               skb->pkt_type = PACKET_HOST;
+                       break;
+               default:
+                       break;
+               }
        }
 
+       skb_pull_rcsum(skb, VLAN_HLEN);
        vlan_set_encap_proto(skb, vhdr);
 
-       skb = vlan_check_reorder_header(skb);
-       if (!skb) {
-               rx_stats->rx_errors++;
-               goto err_unlock;
+       if (vlan_dev) {
+               skb = vlan_check_reorder_header(skb);
+               if (!skb) {
+                       rx_stats->rx_errors++;
+                       goto err_unlock;
+               }
        }
 
        netif_rx(skb);


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
  2010-06-27 23:21               ` Pedro Garcia
@ 2010-06-30 20:16                 ` David Miller
  2010-07-01 18:47                   ` Pedro Garcia
  0 siblings, 1 reply; 35+ messages in thread
From: David Miller @ 2010-06-30 20:16 UTC (permalink / raw)
  To: pedro.netdev; +Cc: netdev, kaber, bhutchings, eric.dumazet

From: Pedro Garcia <pedro.netdev@dondevamos.com>
Date: Mon, 28 Jun 2010 01:21:19 +0200

> Last version of the patch. Now I think it is OK, of course pending 
> Eric's signed-off-by for the accel HW part.

Eric, please review.

> 
> If this is too long for a changelog, tell me and I will try to sum it
> up:

To me, not commit message is too long, the more the better. :)

> +       if ((event == NETDEV_UP) &&
> +           (dev->features & NETIF_F_HW_VLAN_FILTER) &&
> +           (dev->netdev_ops->ndo_vlan_rx_add_vid)) {

There is no reason to surround this final NULL pointer check with
parenthesis, it just makes reading it confusing.
> +       if (vlan_dev)
> +               skb->dev = vlan_dev;
> +       else
> +               if (vlan_id)
> +                       goto drop;

Please format this as:

	if (a)
		b;
	else if (c)
		d;


> +       if (vlan_dev)
> +               skb->dev = vlan_dev;
> +       else
> +               if (vlan_id)
> +                       goto drop;

Same here.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
  2010-06-30 20:16                 ` David Miller
@ 2010-07-01 18:47                   ` Pedro Garcia
  2010-07-01 20:19                     ` Eric Dumazet
  0 siblings, 1 reply; 35+ messages in thread
From: Pedro Garcia @ 2010-07-01 18:47 UTC (permalink / raw)
  To: netdev; +Cc: Patrick McHardy, Ben Hutchings, Eric Dumazet, David Miller

The patch with the modifications suggested by David.

- Without the 8021q module loaded in the kernel, all 802.1p packets 
(VLAN 0 but QoS tagging) are silently discarded (as expected, as 
the protocol is not loaded).

- Without this patch in 8021q module, these packets are forwarded to 
the module, but they are discarded also if VLAN 0 is not configured,
which should not be the default behaviour, as VLAN 0 is not really
a VLANed packet but a 802.1p packet. Defining VLAN 0 makes it almost
impossible to communicate with mixed 802.1p and non 802.1p devices on
the same network due to arp table issues.

- Changed logic to skip vlan specific code in vlan_skb_recv if VLAN 
is 0 and we have not defined a VLAN with ID 0, but we accept the 
packet with the encapsulated proto and pass it later to netif_rx.

- In the vlan device event handler, added some logic to add VLAN 0 
to HW filter in devices that support it (this prevented any traffic
in VLAN 0 to reach the stack in e1000e with HW filter under 2.6.35,
and probably also with other HW filtered cards, so we fix it here).

- In the vlan unregister logic, prevent the elimination of VLAN 0 
in devices with HW filter.

- The default behaviour is to ignore the VLAN 0 tagging and accept
the packet as if it was not tagged, but we can still define a 
VLAN 0 if desired (so it is backwards compatible).

Signed-off-by: Pedro Garcia <pedro.netdev@dondevamos.com>
--
diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index 3c1c8c1..a2ad152 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -155,9 +155,10 @@ void unregister_vlan_dev(struct net_device *dev, struct list_head *head)
 	BUG_ON(!grp);
 
 	/* Take it out of our own structures, but be sure to interlock with
-	 * HW accelerating devices or SW vlan input packet processing.
+	 * HW accelerating devices or SW vlan input packet processing if
+	 * VLAN is not 0 (leave it there for 802.1p).
 	 */
-	if (real_dev->features & NETIF_F_HW_VLAN_FILTER)
+	if (vlan_id && (real_dev->features & NETIF_F_HW_VLAN_FILTER))
 		ops->ndo_vlan_rx_kill_vid(real_dev, vlan_id);
 
 	grp->nr_vlans--;
@@ -419,6 +420,14 @@ static int vlan_device_event(struct notifier_block *unused, unsigned long event,
 	if (is_vlan_dev(dev))
 		__vlan_device_event(dev, event);
 
+	if ((event == NETDEV_UP) &&
+	    (dev->features & NETIF_F_HW_VLAN_FILTER) &&
+	    dev->netdev_ops->ndo_vlan_rx_add_vid) {
+		pr_info("8021q: adding VLAN 0 to HW filter on device %s\n",
+			dev->name);
+		dev->netdev_ops->ndo_vlan_rx_add_vid(dev, 0);
+	}
+
 	grp = __vlan_find_group(dev);
 	if (!grp)
 		goto out;
diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
index 50f58f5..0f91f46 100644
--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -8,6 +8,9 @@
 int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
 		      u16 vlan_tci, int polling)
 {
+	struct net_device *vlan_dev;
+	u16 vlan_id;
+
 	if (netpoll_rx(skb))
 		return NET_RX_DROP;
 
@@ -16,9 +19,12 @@ int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
 
 	skb->skb_iif = skb->dev->ifindex;
 	__vlan_hwaccel_put_tag(skb, vlan_tci);
-	skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
+	vlan_id = vlan_tci & VLAN_VID_MASK;
+	vlan_dev = vlan_group_get_device(grp, vlan_id);
 
-	if (!skb->dev)
+	if (vlan_dev)
+		skb->dev = vlan_dev;
+	else if (vlan_id)
 		goto drop;
 
 	return (polling ? netif_receive_skb(skb) : netif_rx(skb));
@@ -82,15 +88,20 @@ vlan_gro_common(struct napi_struct *napi, struct vlan_group *grp,
 		unsigned int vlan_tci, struct sk_buff *skb)
 {
 	struct sk_buff *p;
+	struct net_device *vlan_dev;
+	u16 vlan_id;
 
 	if (skb_bond_should_drop(skb, ACCESS_ONCE(skb->dev->master)))
 		skb->deliver_no_wcard = 1;
 
 	skb->skb_iif = skb->dev->ifindex;
 	__vlan_hwaccel_put_tag(skb, vlan_tci);
-	skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
+	vlan_id = vlan_tci & VLAN_VID_MASK;
+	vlan_dev = vlan_group_get_device(grp, vlan_id);
 
-	if (!skb->dev)
+	if (vlan_dev)
+		skb->dev = vlan_dev;
+	else if (vlan_id)
 		goto drop;
 
 	for (p = napi->gro_list; p; p = p->next) {
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index 5298426..21f7229 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -142,6 +142,7 @@ int vlan_skb_recv(struct sk_buff *skb, struct net_device *dev,
 {
 	struct vlan_hdr *vhdr;
 	struct vlan_rx_stats *rx_stats;
+	struct net_device *vlan_dev;
 	u16 vlan_id;
 	u16 vlan_tci;
 
@@ -157,53 +158,69 @@ int vlan_skb_recv(struct sk_buff *skb, struct net_device *dev,
 	vlan_id = vlan_tci & VLAN_VID_MASK;
 
 	rcu_read_lock();
-	skb->dev = __find_vlan_dev(dev, vlan_id);
-	if (!skb->dev) {
-		pr_debug("%s: ERROR: No net_device for VID: %u on dev: %s\n",
-			 __func__, vlan_id, dev->name);
-		goto err_unlock;
-	}
-
-	rx_stats = per_cpu_ptr(vlan_dev_info(skb->dev)->vlan_rx_stats,
-			       smp_processor_id());
-	rx_stats->rx_packets++;
-	rx_stats->rx_bytes += skb->len;
-
-	skb_pull_rcsum(skb, VLAN_HLEN);
-
-	skb->priority = vlan_get_ingress_priority(skb->dev, vlan_tci);
+	vlan_dev = __find_vlan_dev(dev, vlan_id);
 
-	pr_debug("%s: priority: %u for TCI: %hu\n",
-		 __func__, skb->priority, vlan_tci);
-
-	switch (skb->pkt_type) {
-	case PACKET_BROADCAST: /* Yeah, stats collect these together.. */
-		/* stats->broadcast ++; // no such counter :-( */
-		break;
-
-	case PACKET_MULTICAST:
-		rx_stats->multicast++;
-		break;
+	/* If the VLAN device is defined, we use it.
+	 * If not, and the VID is 0, it is a 802.1p packet (not
+	 * really a VLAN), so we will just netif_rx it later to the
+	 * original interface, but with the skb->proto set to the
+	 * wrapped proto: we do nothing here.
+	 */
 
-	case PACKET_OTHERHOST:
-		/* Our lower layer thinks this is not local, let's make sure.
-		 * This allows the VLAN to have a different MAC than the
-		 * underlying device, and still route correctly.
-		 */
-		if (!compare_ether_addr(eth_hdr(skb)->h_dest,
-					skb->dev->dev_addr))
-			skb->pkt_type = PACKET_HOST;
-		break;
-	default:
-		break;
+	if (!vlan_dev) {
+		if (vlan_id) {
+			pr_debug("%s: ERROR: No net_device for VID: %u on dev: %s\n",
+				 __func__, vlan_id, dev->name);
+			goto err_unlock;
+		}
+		rx_stats = NULL;
+	} else {
+		skb->dev = vlan_dev;
+
+		rx_stats = per_cpu_ptr(vlan_dev_info(skb->dev)->vlan_rx_stats,
+					smp_processor_id());
+		rx_stats->rx_packets++;
+		rx_stats->rx_bytes += skb->len;
+
+		skb->priority = vlan_get_ingress_priority(skb->dev, vlan_tci);
+
+		pr_debug("%s: priority: %u for TCI: %hu\n",
+			 __func__, skb->priority, vlan_tci);
+
+		switch (skb->pkt_type) {
+		case PACKET_BROADCAST:
+			/* Yeah, stats collect these together.. */
+			/* stats->broadcast ++; // no such counter :-( */
+			break;
+
+		case PACKET_MULTICAST:
+			rx_stats->multicast++;
+			break;
+
+		case PACKET_OTHERHOST:
+			/* Our lower layer thinks this is not local, let's make
+			* sure.
+			* This allows the VLAN to have a different MAC than the
+			* underlying device, and still route correctly.
+			*/
+			if (!compare_ether_addr(eth_hdr(skb)->h_dest,
+						skb->dev->dev_addr))
+				skb->pkt_type = PACKET_HOST;
+			break;
+		default:
+			break;
+		}
 	}
 
+	skb_pull_rcsum(skb, VLAN_HLEN);
 	vlan_set_encap_proto(skb, vhdr);
 
-	skb = vlan_check_reorder_header(skb);
-	if (!skb) {
-		rx_stats->rx_errors++;
-		goto err_unlock;
+	if (vlan_dev) {
+		skb = vlan_check_reorder_header(skb);
+		if (!skb) {
+			rx_stats->rx_errors++;
+			goto err_unlock;
+		}
 	}
 
 	netif_rx(skb);


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
  2010-07-01 18:47                   ` Pedro Garcia
@ 2010-07-01 20:19                     ` Eric Dumazet
  2010-07-18 16:43                       ` Pedro Garcia
  0 siblings, 1 reply; 35+ messages in thread
From: Eric Dumazet @ 2010-07-01 20:19 UTC (permalink / raw)
  To: Pedro Garcia; +Cc: netdev, Patrick McHardy, Ben Hutchings, David Miller

Le jeudi 01 juillet 2010 à 20:47 +0200, Pedro Garcia a écrit :
> The patch with the modifications suggested by David.
> 
> - Without the 8021q module loaded in the kernel, all 802.1p packets 
> (VLAN 0 but QoS tagging) are silently discarded (as expected, as 
> the protocol is not loaded).
> 
> - Without this patch in 8021q module, these packets are forwarded to 
> the module, but they are discarded also if VLAN 0 is not configured,
> which should not be the default behaviour, as VLAN 0 is not really
> a VLANed packet but a 802.1p packet. Defining VLAN 0 makes it almost
> impossible to communicate with mixed 802.1p and non 802.1p devices on
> the same network due to arp table issues.
> 
> - Changed logic to skip vlan specific code in vlan_skb_recv if VLAN 
> is 0 and we have not defined a VLAN with ID 0, but we accept the 
> packet with the encapsulated proto and pass it later to netif_rx.
> 
> - In the vlan device event handler, added some logic to add VLAN 0 
> to HW filter in devices that support it (this prevented any traffic
> in VLAN 0 to reach the stack in e1000e with HW filter under 2.6.35,
> and probably also with other HW filtered cards, so we fix it here).
> 
> - In the vlan unregister logic, prevent the elimination of VLAN 0 
> in devices with HW filter.
> 
> - The default behaviour is to ignore the VLAN 0 tagging and accept
> the packet as if it was not tagged, but we can still define a 
> VLAN 0 if desired (so it is backwards compatible).
> 
> Signed-off-by: Pedro Garcia <pedro.netdev@dondevamos.com>


Seems fine but you need to respin your patch against latest net-next-2.6
tree.

Check your tree got commit 9618e2ffd78aaa (vlan: 64 bit rx counters)

Thanks !



^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
  2010-06-16  8:49           ` Pedro Garcia
  2010-06-16  9:08             ` Eric Dumazet
  2010-06-24 18:28             ` [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet) Pedro Garcia Pelaez
@ 2010-07-08 12:54             ` Vladislav Zolotarov
  2010-07-08 12:58               ` Vladislav Zolotarov
  2010-07-08 13:51             ` Vladislav Zolotarov
  3 siblings, 1 reply; 35+ messages in thread
From: Vladislav Zolotarov @ 2010-07-08 12:54 UTC (permalink / raw)
  To: Pedro Garcia, netdev; +Cc: Patrick McHardy, Ben Hutchings, Eric Dumazet

Margaret,
In order to keep exploring this we need the following data:
1. What is the scenario exactly? What is vMotion? Which kind of operation does it do? Which kind of traffic does it pass?
2. What is the nature of the failure- driver hang? PSOD? Loss of traffic?
3. We need a grcdump after the failure.

Thanks,
vlad

> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On
> Behalf Of Pedro Garcia
> Sent: Wednesday, June 16, 2010 11:49 AM
> To: netdev@vger.kernel.org
> Cc: Patrick McHardy; Ben Hutchings; Eric Dumazet
> Subject: Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag"
> (802.1p packet)
> 
> On Mon, 14 Jun 2010 21:12:52 +0200, Eric Dumazet <eric.dumazet@gmail.com>
> wrote:
> > Le lundi 14 juin 2010 à 19:11 +0200, Patrick McHardy a écrit :
> >> Ben Hutchings wrote:
> >> > On Mon, 2010-06-14 at 18:49 +0200, Pedro Garcia wrote:
> >> >
> >> >> On Sun, 13 Jun 2010 22:56:30 +0100, Ben Hutchings
> >> >> <bhutchings@solarflare.com> wrote:
> >> >>
> >> >>> I have no particular opinion on this change, but you need to read and
> >> >>> follow Documentation/SubmittingPatches.
> >> >>>
> >> >>> Ben.
> >> >>>
> >> >> Sorry, first kernel patch, and I did not know about it. I resubmit
> >> >> with
> >> >> the correct style / format:
> >> >>
> >> > [...]
> >> >
> >> > Sorry, no you haven't.
> >> >
> >> > - Networking changes go through David Miller's net-next-2.6 tree so you
> >> > need to use that as the baseline, not 2.6.26
> >> > - Patches should be applicable with -p1, not -p0 (so if you use diff,
> >> > you should run it from one directory level up)
> >> > - The patch was word-wrapped
> >>
> >> Additionally:
> >>
> >> - please use the proper comment style, meaning each line begins
> >>   with a '*'
> >>
> >> - the pr_debug statements may be useful for debugging, but are
> >>   a bit excessive for the final version
> >>
> >> - + /* 2010-06-13: Pedro Garcia
> >>
> >>    We have changelogs for this, simply explaining what the code
> >>    does is enough.
> >>
> >> - Please CC the maintainer (which is me)
> >> --
> >
> > Pedro, we have two kind of vlan setups :
> >
> > accelerated and non accelerated ones.
> >
> > Your patch address non accelated ones only, since you only touch
> > vlan_skb_recv()
> >
> > Accelerated vlan can follow these paths :
> >
> > 1) NAPI devices
> >
> > vlan_gro_receive() -> vlan_gro_common()
> >
> > 2) non NAPI devices
> >
> > __vlan_hwaccel_rx()
> >
> > So you might also patch __vlan_hwaccel_rx() and vlan_gro_common()
> >
> > Please merge following bits to your patch submission :
> >
> > http://kerneltrap.org/mailarchive/linux-netdev/2010/5/23/6277868
> >
> >
> > Good luck for your first patch !
> 
> Here it is again. I added the modifications in
> http://kerneltrap.org/mailarchive/linux-netdev/2010/5/23/6277868 for HW
> accelerated incoming packets (it did not apply cleanly on the last version of
> the kernel, so I applied manually). Now, if the VLAN 0 is not explicitly
> created by the user, VLAN 0 packets will be treated as no VLAN (802.1p
> packets), instead of dropping them.
> 
> The patch is now for two files: vlan_core (accel) and vlan_dev (non accel)
> 
> I can not test on HW accelerated devices, so if someone can check it I will
> appreciate (even though in the thread above it looked like yes). For non
> accel I tessted in 2.6.26. Now the patch is for
> net-next-2.6, and it compiles OK, but I a have to setup a test environment to
> check it is still OK (should, but better to test).
> 
> Signed-off-by: Pedro Garcia <pedro.netdev@dondevamos.com>
> --
> diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
> index 50f58f5..daaca31 100644
> --- a/net/8021q/vlan_core.c
> +++ b/net/8021q/vlan_core.c
> @@ -8,6 +8,9 @@
>  int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
>                       u16 vlan_tci, int polling)
>  {
> +       struct net_device *vlan_dev;
> +       u16 vlan_id;
> +
>         if (netpoll_rx(skb))
>                 return NET_RX_DROP;
> 
> @@ -16,10 +19,14 @@ int __vlan_hwaccel_rx(struct sk_buff *skb, struct
> vlan_group *grp,
> 
>         skb->skb_iif = skb->dev->ifindex;
>         __vlan_hwaccel_put_tag(skb, vlan_tci);
> -       skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
> +       vlan_id = vlan_tci & VLAN_VID_MASK;
> +       vlan_dev = vlan_group_get_device(grp, vlan_id);
> 
> -       if (!skb->dev)
> -               goto drop;
> +       if (vlan_dev)
> +               skb->dev = vlan_dev;
> +       else
> +               if (vlan_id)
> +                       goto drop;
> 
>         return (polling ? netif_receive_skb(skb) : netif_rx(skb));
> 
> @@ -82,16 +89,22 @@ vlan_gro_common(struct napi_struct *napi, struct
> vlan_group *grp,
>                 unsigned int vlan_tci, struct sk_buff *skb)
>  {
>         struct sk_buff *p;
> +       struct net_device *vlan_dev;
> +       u16 vlan_id;
> 
>         if (skb_bond_should_drop(skb, ACCESS_ONCE(skb->dev->master)))
>                 skb->deliver_no_wcard = 1;
> 
>         skb->skb_iif = skb->dev->ifindex;
>         __vlan_hwaccel_put_tag(skb, vlan_tci);
> -       skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
> -
> -       if (!skb->dev)
> -               goto drop;
> +       vlan_id = vlan_tci & VLAN_VID_MASK;
> +       vlan_dev = vlan_group_get_device(grp, vlan_id);
> +
> +       if (vlan_dev)
> +               skb->dev = vlan_dev;
> +       else
> +               if (vlan_id)
> +                       goto drop;
> 
>         for (p = napi->gro_list; p; p = p->next) {
>                 NAPI_GRO_CB(p)->same_flow =
> diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
> index 5298426..65512c3 100644
> --- a/net/8021q/vlan_dev.c
> +++ b/net/8021q/vlan_dev.c
> @@ -142,6 +142,7 @@ int vlan_skb_recv(struct sk_buff *skb, struct net_device
> *dev,
>  {
>         struct vlan_hdr *vhdr;
>         struct vlan_rx_stats *rx_stats;
> +       struct net_device *vlan_dev;
>         u16 vlan_id;
>         u16 vlan_tci;
> 
> @@ -157,8 +158,18 @@ int vlan_skb_recv(struct sk_buff *skb, struct net_device
> *dev,
>         vlan_id = vlan_tci & VLAN_VID_MASK;
> 
>         rcu_read_lock();
> -       skb->dev = __find_vlan_dev(dev, vlan_id);
> -       if (!skb->dev) {
> +       vlan_dev = __find_vlan_dev(dev, vlan_id);
> +
> +       /* If the VLAN device is defined, we use it.
> +        * If not, and the VID is 0, it is a 802.1p packet (not
> +        * really a VLAN), so we will just netif_rx it later to the
> +        * original interface, but with the skb->proto set to the
> +        * wrapped proto: we do nothing here.
> +        */
> +
> +       if (vlan_dev) {
> +               skb->dev = vlan_dev;
> +       } else if (vlan_id) {
>                 pr_debug("%s: ERROR: No net_device for VID: %u on dev: %s\n",
>                          __func__, vlan_id, dev->name);
>                 goto err_unlock;
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
  2010-07-08 12:54             ` Vladislav Zolotarov
@ 2010-07-08 12:58               ` Vladislav Zolotarov
  0 siblings, 0 replies; 35+ messages in thread
From: Vladislav Zolotarov @ 2010-07-08 12:58 UTC (permalink / raw)
  To: Vladislav Zolotarov, Pedro Garcia, netdev
  Cc: Patrick McHardy, Ben Hutchings, Eric Dumazet

Opps...
Sorry. Wrong thread. :)

Pls., discard!

> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On
> Behalf Of Vladislav Zolotarov
> Sent: Thursday, July 08, 2010 3:55 PM
> To: Pedro Garcia; netdev@vger.kernel.org
> Cc: Patrick McHardy; Ben Hutchings; Eric Dumazet
> Subject: RE: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag"
> (802.1p packet)
> 
> Margaret,
> In order to keep exploring this we need the following data:
> 1. What is the scenario exactly? What is vMotion? Which kind of operation
> does it do? Which kind of traffic does it pass?
> 2. What is the nature of the failure- driver hang? PSOD? Loss of traffic?
> 3. We need a grcdump after the failure.
> 
> Thanks,
> vlad
> 
> > -----Original Message-----
> > From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On
> > Behalf Of Pedro Garcia
> > Sent: Wednesday, June 16, 2010 11:49 AM
> > To: netdev@vger.kernel.org
> > Cc: Patrick McHardy; Ben Hutchings; Eric Dumazet
> > Subject: Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag"
> > (802.1p packet)
> >
> > On Mon, 14 Jun 2010 21:12:52 +0200, Eric Dumazet <eric.dumazet@gmail.com>
> > wrote:
> > > Le lundi 14 juin 2010 à 19:11 +0200, Patrick McHardy a écrit :
> > >> Ben Hutchings wrote:
> > >> > On Mon, 2010-06-14 at 18:49 +0200, Pedro Garcia wrote:
> > >> >
> > >> >> On Sun, 13 Jun 2010 22:56:30 +0100, Ben Hutchings
> > >> >> <bhutchings@solarflare.com> wrote:
> > >> >>
> > >> >>> I have no particular opinion on this change, but you need to read
> and
> > >> >>> follow Documentation/SubmittingPatches.
> > >> >>>
> > >> >>> Ben.
> > >> >>>
> > >> >> Sorry, first kernel patch, and I did not know about it. I resubmit
> > >> >> with
> > >> >> the correct style / format:
> > >> >>
> > >> > [...]
> > >> >
> > >> > Sorry, no you haven't.
> > >> >
> > >> > - Networking changes go through David Miller's net-next-2.6 tree so
> you
> > >> > need to use that as the baseline, not 2.6.26
> > >> > - Patches should be applicable with -p1, not -p0 (so if you use diff,
> > >> > you should run it from one directory level up)
> > >> > - The patch was word-wrapped
> > >>
> > >> Additionally:
> > >>
> > >> - please use the proper comment style, meaning each line begins
> > >>   with a '*'
> > >>
> > >> - the pr_debug statements may be useful for debugging, but are
> > >>   a bit excessive for the final version
> > >>
> > >> - + /* 2010-06-13: Pedro Garcia
> > >>
> > >>    We have changelogs for this, simply explaining what the code
> > >>    does is enough.
> > >>
> > >> - Please CC the maintainer (which is me)
> > >> --
> > >
> > > Pedro, we have two kind of vlan setups :
> > >
> > > accelerated and non accelerated ones.
> > >
> > > Your patch address non accelated ones only, since you only touch
> > > vlan_skb_recv()
> > >
> > > Accelerated vlan can follow these paths :
> > >
> > > 1) NAPI devices
> > >
> > > vlan_gro_receive() -> vlan_gro_common()
> > >
> > > 2) non NAPI devices
> > >
> > > __vlan_hwaccel_rx()
> > >
> > > So you might also patch __vlan_hwaccel_rx() and vlan_gro_common()
> > >
> > > Please merge following bits to your patch submission :
> > >
> > > http://kerneltrap.org/mailarchive/linux-netdev/2010/5/23/6277868
> > >
> > >
> > > Good luck for your first patch !
> >
> > Here it is again. I added the modifications in
> > http://kerneltrap.org/mailarchive/linux-netdev/2010/5/23/6277868 for HW
> > accelerated incoming packets (it did not apply cleanly on the last version
> of
> > the kernel, so I applied manually). Now, if the VLAN 0 is not explicitly
> > created by the user, VLAN 0 packets will be treated as no VLAN (802.1p
> > packets), instead of dropping them.
> >
> > The patch is now for two files: vlan_core (accel) and vlan_dev (non accel)
> >
> > I can not test on HW accelerated devices, so if someone can check it I will
> > appreciate (even though in the thread above it looked like yes). For non
> > accel I tessted in 2.6.26. Now the patch is for
> > net-next-2.6, and it compiles OK, but I a have to setup a test environment
> to
> > check it is still OK (should, but better to test).
> >
> > Signed-off-by: Pedro Garcia <pedro.netdev@dondevamos.com>
> > --
> > diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
> > index 50f58f5..daaca31 100644
> > --- a/net/8021q/vlan_core.c
> > +++ b/net/8021q/vlan_core.c
> > @@ -8,6 +8,9 @@
> >  int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
> >                       u16 vlan_tci, int polling)
> >  {
> > +       struct net_device *vlan_dev;
> > +       u16 vlan_id;
> > +
> >         if (netpoll_rx(skb))
> >                 return NET_RX_DROP;
> >
> > @@ -16,10 +19,14 @@ int __vlan_hwaccel_rx(struct sk_buff *skb, struct
> > vlan_group *grp,
> >
> >         skb->skb_iif = skb->dev->ifindex;
> >         __vlan_hwaccel_put_tag(skb, vlan_tci);
> > -       skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
> > +       vlan_id = vlan_tci & VLAN_VID_MASK;
> > +       vlan_dev = vlan_group_get_device(grp, vlan_id);
> >
> > -       if (!skb->dev)
> > -               goto drop;
> > +       if (vlan_dev)
> > +               skb->dev = vlan_dev;
> > +       else
> > +               if (vlan_id)
> > +                       goto drop;
> >
> >         return (polling ? netif_receive_skb(skb) : netif_rx(skb));
> >
> > @@ -82,16 +89,22 @@ vlan_gro_common(struct napi_struct *napi, struct
> > vlan_group *grp,
> >                 unsigned int vlan_tci, struct sk_buff *skb)
> >  {
> >         struct sk_buff *p;
> > +       struct net_device *vlan_dev;
> > +       u16 vlan_id;
> >
> >         if (skb_bond_should_drop(skb, ACCESS_ONCE(skb->dev->master)))
> >                 skb->deliver_no_wcard = 1;
> >
> >         skb->skb_iif = skb->dev->ifindex;
> >         __vlan_hwaccel_put_tag(skb, vlan_tci);
> > -       skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
> > -
> > -       if (!skb->dev)
> > -               goto drop;
> > +       vlan_id = vlan_tci & VLAN_VID_MASK;
> > +       vlan_dev = vlan_group_get_device(grp, vlan_id);
> > +
> > +       if (vlan_dev)
> > +               skb->dev = vlan_dev;
> > +       else
> > +               if (vlan_id)
> > +                       goto drop;
> >
> >         for (p = napi->gro_list; p; p = p->next) {
> >                 NAPI_GRO_CB(p)->same_flow =
> > diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
> > index 5298426..65512c3 100644
> > --- a/net/8021q/vlan_dev.c
> > +++ b/net/8021q/vlan_dev.c
> > @@ -142,6 +142,7 @@ int vlan_skb_recv(struct sk_buff *skb, struct
> net_device
> > *dev,
> >  {
> >         struct vlan_hdr *vhdr;
> >         struct vlan_rx_stats *rx_stats;
> > +       struct net_device *vlan_dev;
> >         u16 vlan_id;
> >         u16 vlan_tci;
> >
> > @@ -157,8 +158,18 @@ int vlan_skb_recv(struct sk_buff *skb, struct
> net_device
> > *dev,
> >         vlan_id = vlan_tci & VLAN_VID_MASK;
> >
> >         rcu_read_lock();
> > -       skb->dev = __find_vlan_dev(dev, vlan_id);
> > -       if (!skb->dev) {
> > +       vlan_dev = __find_vlan_dev(dev, vlan_id);
> > +
> > +       /* If the VLAN device is defined, we use it.
> > +        * If not, and the VID is 0, it is a 802.1p packet (not
> > +        * really a VLAN), so we will just netif_rx it later to the
> > +        * original interface, but with the skb->proto set to the
> > +        * wrapped proto: we do nothing here.
> > +        */
> > +
> > +       if (vlan_dev) {
> > +               skb->dev = vlan_dev;
> > +       } else if (vlan_id) {
> >                 pr_debug("%s: ERROR: No net_device for VID: %u on dev:
> %s\n",
> >                          __func__, vlan_id, dev->name);
> >                 goto err_unlock;
> >
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe netdev" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> N�����r��y���b�X��ǧv�^�)޺{.n�+���z�^�)���w*
> jg���\x1e�����ݢj/���z�ޖ��2�ޙ���&�)ߡ�a��\x7f��\x1e�G���h�\x0f�j:+v���w�٥

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
  2010-06-16  8:49           ` Pedro Garcia
                               ` (2 preceding siblings ...)
  2010-07-08 12:54             ` Vladislav Zolotarov
@ 2010-07-08 13:51             ` Vladislav Zolotarov
  3 siblings, 0 replies; 35+ messages in thread
From: Vladislav Zolotarov @ 2010-07-08 13:51 UTC (permalink / raw)
  To: Pedro Garcia, netdev; +Cc: Patrick McHardy, Ben Hutchings, Eric Dumazet

I've checked your patch on bnx2x HW acceleration capable device in both HW accelerated and none-accelerated modes and it looks good.

Thanks,
vlad

> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On
> Behalf Of Pedro Garcia
> Sent: Wednesday, June 16, 2010 11:49 AM
> To: netdev@vger.kernel.org
> Cc: Patrick McHardy; Ben Hutchings; Eric Dumazet
> Subject: Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag"
> (802.1p packet)
> 
> On Mon, 14 Jun 2010 21:12:52 +0200, Eric Dumazet <eric.dumazet@gmail.com>
> wrote:
> > Le lundi 14 juin 2010 à 19:11 +0200, Patrick McHardy a écrit :
> >> Ben Hutchings wrote:
> >> > On Mon, 2010-06-14 at 18:49 +0200, Pedro Garcia wrote:
> >> >
> >> >> On Sun, 13 Jun 2010 22:56:30 +0100, Ben Hutchings
> >> >> <bhutchings@solarflare.com> wrote:
> >> >>
> >> >>> I have no particular opinion on this change, but you need to read and
> >> >>> follow Documentation/SubmittingPatches.
> >> >>>
> >> >>> Ben.
> >> >>>
> >> >> Sorry, first kernel patch, and I did not know about it. I resubmit
> >> >> with
> >> >> the correct style / format:
> >> >>
> >> > [...]
> >> >
> >> > Sorry, no you haven't.
> >> >
> >> > - Networking changes go through David Miller's net-next-2.6 tree so you
> >> > need to use that as the baseline, not 2.6.26
> >> > - Patches should be applicable with -p1, not -p0 (so if you use diff,
> >> > you should run it from one directory level up)
> >> > - The patch was word-wrapped
> >>
> >> Additionally:
> >>
> >> - please use the proper comment style, meaning each line begins
> >>   with a '*'
> >>
> >> - the pr_debug statements may be useful for debugging, but are
> >>   a bit excessive for the final version
> >>
> >> - + /* 2010-06-13: Pedro Garcia
> >>
> >>    We have changelogs for this, simply explaining what the code
> >>    does is enough.
> >>
> >> - Please CC the maintainer (which is me)
> >> --
> >
> > Pedro, we have two kind of vlan setups :
> >
> > accelerated and non accelerated ones.
> >
> > Your patch address non accelated ones only, since you only touch
> > vlan_skb_recv()
> >
> > Accelerated vlan can follow these paths :
> >
> > 1) NAPI devices
> >
> > vlan_gro_receive() -> vlan_gro_common()
> >
> > 2) non NAPI devices
> >
> > __vlan_hwaccel_rx()
> >
> > So you might also patch __vlan_hwaccel_rx() and vlan_gro_common()
> >
> > Please merge following bits to your patch submission :
> >
> > http://kerneltrap.org/mailarchive/linux-netdev/2010/5/23/6277868
> >
> >
> > Good luck for your first patch !
> 
> Here it is again. I added the modifications in
> http://kerneltrap.org/mailarchive/linux-netdev/2010/5/23/6277868 for HW
> accelerated incoming packets (it did not apply cleanly on the last version of
> the kernel, so I applied manually). Now, if the VLAN 0 is not explicitly
> created by the user, VLAN 0 packets will be treated as no VLAN (802.1p
> packets), instead of dropping them.
> 
> The patch is now for two files: vlan_core (accel) and vlan_dev (non accel)
> 
> I can not test on HW accelerated devices, so if someone can check it I will
> appreciate (even though in the thread above it looked like yes). For non
> accel I tessted in 2.6.26. Now the patch is for
> net-next-2.6, and it compiles OK, but I a have to setup a test environment to
> check it is still OK (should, but better to test).
> 
> Signed-off-by: Pedro Garcia <pedro.netdev@dondevamos.com>
> --
> diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
> index 50f58f5..daaca31 100644
> --- a/net/8021q/vlan_core.c
> +++ b/net/8021q/vlan_core.c
> @@ -8,6 +8,9 @@
>  int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
>                       u16 vlan_tci, int polling)
>  {
> +       struct net_device *vlan_dev;
> +       u16 vlan_id;
> +
>         if (netpoll_rx(skb))
>                 return NET_RX_DROP;
> 
> @@ -16,10 +19,14 @@ int __vlan_hwaccel_rx(struct sk_buff *skb, struct
> vlan_group *grp,
> 
>         skb->skb_iif = skb->dev->ifindex;
>         __vlan_hwaccel_put_tag(skb, vlan_tci);
> -       skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
> +       vlan_id = vlan_tci & VLAN_VID_MASK;
> +       vlan_dev = vlan_group_get_device(grp, vlan_id);
> 
> -       if (!skb->dev)
> -               goto drop;
> +       if (vlan_dev)
> +               skb->dev = vlan_dev;
> +       else
> +               if (vlan_id)
> +                       goto drop;
> 
>         return (polling ? netif_receive_skb(skb) : netif_rx(skb));
> 
> @@ -82,16 +89,22 @@ vlan_gro_common(struct napi_struct *napi, struct
> vlan_group *grp,
>                 unsigned int vlan_tci, struct sk_buff *skb)
>  {
>         struct sk_buff *p;
> +       struct net_device *vlan_dev;
> +       u16 vlan_id;
> 
>         if (skb_bond_should_drop(skb, ACCESS_ONCE(skb->dev->master)))
>                 skb->deliver_no_wcard = 1;
> 
>         skb->skb_iif = skb->dev->ifindex;
>         __vlan_hwaccel_put_tag(skb, vlan_tci);
> -       skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
> -
> -       if (!skb->dev)
> -               goto drop;
> +       vlan_id = vlan_tci & VLAN_VID_MASK;
> +       vlan_dev = vlan_group_get_device(grp, vlan_id);
> +
> +       if (vlan_dev)
> +               skb->dev = vlan_dev;
> +       else
> +               if (vlan_id)
> +                       goto drop;
> 
>         for (p = napi->gro_list; p; p = p->next) {
>                 NAPI_GRO_CB(p)->same_flow =
> diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
> index 5298426..65512c3 100644
> --- a/net/8021q/vlan_dev.c
> +++ b/net/8021q/vlan_dev.c
> @@ -142,6 +142,7 @@ int vlan_skb_recv(struct sk_buff *skb, struct net_device
> *dev,
>  {
>         struct vlan_hdr *vhdr;
>         struct vlan_rx_stats *rx_stats;
> +       struct net_device *vlan_dev;
>         u16 vlan_id;
>         u16 vlan_tci;
> 
> @@ -157,8 +158,18 @@ int vlan_skb_recv(struct sk_buff *skb, struct net_device
> *dev,
>         vlan_id = vlan_tci & VLAN_VID_MASK;
> 
>         rcu_read_lock();
> -       skb->dev = __find_vlan_dev(dev, vlan_id);
> -       if (!skb->dev) {
> +       vlan_dev = __find_vlan_dev(dev, vlan_id);
> +
> +       /* If the VLAN device is defined, we use it.
> +        * If not, and the VID is 0, it is a 802.1p packet (not
> +        * really a VLAN), so we will just netif_rx it later to the
> +        * original interface, but with the skb->proto set to the
> +        * wrapped proto: we do nothing here.
> +        */
> +
> +       if (vlan_dev) {
> +               skb->dev = vlan_dev;
> +       } else if (vlan_id) {
>                 pr_debug("%s: ERROR: No net_device for VID: %u on dev: %s\n",
>                          __func__, vlan_id, dev->name);
>                 goto err_unlock;
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
  2010-07-01 20:19                     ` Eric Dumazet
@ 2010-07-18 16:43                       ` Pedro Garcia
  2010-07-18 22:39                         ` David Miller
  0 siblings, 1 reply; 35+ messages in thread
From: Pedro Garcia @ 2010-07-18 16:43 UTC (permalink / raw)
  To: netdev; +Cc: Patrick McHardy, Ben Hutchings, David Miller, Eric Dumazet

On Thu, 01 Jul 2010 22:19:14 +0200, Eric Dumazet <eric.dumazet@gmail.com>
wrote:

> Seems fine but you need to respin your patch against latest net-next-2.6
> tree.
> 
> Check your tree got commit 9618e2ffd78aaa (vlan: 64 bit rx counters)
> 
> Thanks !

The patch applied to the lastest tree:

- Without the 8021q module loaded in the kernel, all 802.1p packets 
(VLAN 0 but QoS tagging) are silently discarded (as expected, as 
the protocol is not loaded).
 
- Without this patch in 8021q module, these packets are forwarded to 
the module, but they are discarded also if VLAN 0 is not configured,
which should not be the default behaviour, as VLAN 0 is not really
a VLANed packet but a 802.1p packet. Defining VLAN 0 makes it almost
impossible to communicate with mixed 802.1p and non 802.1p devices on
the same network due to arp table issues.

- Changed logic to skip vlan specific code in vlan_skb_recv if VLAN 
is 0 and we have not defined a VLAN with ID 0, but we accept the 
packet with the encapsulated proto and pass it later to netif_rx.

- In the vlan device event handler, added some logic to add VLAN 0 
to HW filter in devices that support it (this prevented any traffic
in VLAN 0 to reach the stack in e1000e with HW filter under 2.6.35,
and probably also with other HW filtered cards, so we fix it here).

- In the vlan unregister logic, prevent the elimination of VLAN 0 
in devices with HW filter.

- The default behaviour is to ignore the VLAN 0 tagging and accept
the packet as if it was not tagged, but we can still define a 
VLAN 0 if desired (so it is backwards compatible).

Signed-off-by: Pedro Garcia <pedro.netdev@dondevamos.com>
--
diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index 3c1c8c1..a2ad152 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -155,9 +155,10 @@ void unregister_vlan_dev(struct net_device *dev, struct list_head *head)
 	BUG_ON(!grp);
 
 	/* Take it out of our own structures, but be sure to interlock with
-	 * HW accelerating devices or SW vlan input packet processing.
+	 * HW accelerating devices or SW vlan input packet processing if
+	 * VLAN is not 0 (leave it there for 802.1p).
 	 */
-	if (real_dev->features & NETIF_F_HW_VLAN_FILTER)
+	if (vlan_id && (real_dev->features & NETIF_F_HW_VLAN_FILTER))
 		ops->ndo_vlan_rx_kill_vid(real_dev, vlan_id);
 
 	grp->nr_vlans--;
@@ -419,6 +420,14 @@ static int vlan_device_event(struct notifier_block *unused, unsigned long event,
 	if (is_vlan_dev(dev))
 		__vlan_device_event(dev, event);
 
+	if ((event == NETDEV_UP) &&
+	    (dev->features & NETIF_F_HW_VLAN_FILTER) &&
+	    dev->netdev_ops->ndo_vlan_rx_add_vid) {
+		pr_info("8021q: adding VLAN 0 to HW filter on device %s\n",
+			dev->name);
+		dev->netdev_ops->ndo_vlan_rx_add_vid(dev, 0);
+	}
+
 	grp = __vlan_find_group(dev);
 	if (!grp)
 		goto out;
diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
index 1b9406a..01ddb04 100644
--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -8,6 +8,9 @@
 int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
 		      u16 vlan_tci, int polling)
 {
+	struct net_device *vlan_dev;
+	u16 vlan_id;
+
 	if (netpoll_rx(skb))
 		return NET_RX_DROP;
 
@@ -16,9 +19,12 @@ int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
 
 	skb->skb_iif = skb->dev->ifindex;
 	__vlan_hwaccel_put_tag(skb, vlan_tci);
-	skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
+	vlan_id = vlan_tci & VLAN_VID_MASK;
+	vlan_dev = vlan_group_get_device(grp, vlan_id);
 
-	if (!skb->dev)
+	if (vlan_dev)
+		skb->dev = vlan_dev;
+	else if (vlan_id)
 		goto drop;
 
 	return (polling ? netif_receive_skb(skb) : netif_rx(skb));
@@ -83,15 +89,20 @@ vlan_gro_common(struct napi_struct *napi, struct vlan_group *grp,
 		unsigned int vlan_tci, struct sk_buff *skb)
 {
 	struct sk_buff *p;
+	struct net_device *vlan_dev;
+	u16 vlan_id;
 
 	if (skb_bond_should_drop(skb, ACCESS_ONCE(skb->dev->master)))
 		skb->deliver_no_wcard = 1;
 
 	skb->skb_iif = skb->dev->ifindex;
 	__vlan_hwaccel_put_tag(skb, vlan_tci);
-	skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
+	vlan_id = vlan_tci & VLAN_VID_MASK;
+	vlan_dev = vlan_group_get_device(grp, vlan_id);
 
-	if (!skb->dev)
+	if (vlan_dev)
+		skb->dev = vlan_dev;
+	else if (vlan_id)
 		goto drop;
 
 	for (p = napi->gro_list; p; p = p->next) {
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index 7cb285f..3d59c9b 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -142,6 +142,7 @@ int vlan_skb_recv(struct sk_buff *skb, struct net_device *dev,
 {
 	struct vlan_hdr *vhdr;
 	struct vlan_rx_stats *rx_stats;
+	struct net_device *vlan_dev;
 	u16 vlan_id;
 	u16 vlan_tci;
 
@@ -157,55 +158,71 @@ int vlan_skb_recv(struct sk_buff *skb, struct net_device *dev,
 	vlan_id = vlan_tci & VLAN_VID_MASK;
 
 	rcu_read_lock();
-	skb->dev = __find_vlan_dev(dev, vlan_id);
-	if (!skb->dev) {
-		pr_debug("%s: ERROR: No net_device for VID: %u on dev: %s\n",
-			 __func__, vlan_id, dev->name);
-		goto err_unlock;
-	}
-
-	rx_stats = per_cpu_ptr(vlan_dev_info(skb->dev)->vlan_rx_stats,
-			       smp_processor_id());
-	u64_stats_update_begin(&rx_stats->syncp);
-	rx_stats->rx_packets++;
-	rx_stats->rx_bytes += skb->len;
-
-	skb_pull_rcsum(skb, VLAN_HLEN);
-
-	skb->priority = vlan_get_ingress_priority(skb->dev, vlan_tci);
+	vlan_dev = __find_vlan_dev(dev, vlan_id);
 
-	pr_debug("%s: priority: %u for TCI: %hu\n",
-		 __func__, skb->priority, vlan_tci);
-
-	switch (skb->pkt_type) {
-	case PACKET_BROADCAST: /* Yeah, stats collect these together.. */
-		/* stats->broadcast ++; // no such counter :-( */
-		break;
-
-	case PACKET_MULTICAST:
-		rx_stats->rx_multicast++;
-		break;
+	/* If the VLAN device is defined, we use it.
+	 * If not, and the VID is 0, it is a 802.1p packet (not
+	 * really a VLAN), so we will just netif_rx it later to the
+	 * original interface, but with the skb->proto set to the
+	 * wrapped proto: we do nothing here.
+	 */
 
-	case PACKET_OTHERHOST:
-		/* Our lower layer thinks this is not local, let's make sure.
-		 * This allows the VLAN to have a different MAC than the
-		 * underlying device, and still route correctly.
-		 */
-		if (!compare_ether_addr(eth_hdr(skb)->h_dest,
-					skb->dev->dev_addr))
-			skb->pkt_type = PACKET_HOST;
-		break;
-	default:
-		break;
+	if (!vlan_dev) {
+		if (vlan_id) {
+			pr_debug("%s: ERROR: No net_device for VID: %u on dev: %s\n",
+				 __func__, vlan_id, dev->name);
+			goto err_unlock;
+		}
+		rx_stats = NULL;
+	} else {
+		skb->dev = vlan_dev;
+
+		rx_stats = per_cpu_ptr(vlan_dev_info(skb->dev)->vlan_rx_stats,
+					smp_processor_id());
+		u64_stats_update_begin(&rx_stats->syncp);
+		rx_stats->rx_packets++;
+		rx_stats->rx_bytes += skb->len;
+
+		skb->priority = vlan_get_ingress_priority(skb->dev, vlan_tci);
+
+		pr_debug("%s: priority: %u for TCI: %hu\n",
+			 __func__, skb->priority, vlan_tci);
+
+		switch (skb->pkt_type) {
+		case PACKET_BROADCAST:
+			/* Yeah, stats collect these together.. */
+			/* stats->broadcast ++; // no such counter :-( */
+			break;
+
+		case PACKET_MULTICAST:
+			rx_stats->rx_multicast++;
+			break;
+
+		case PACKET_OTHERHOST:
+			/* Our lower layer thinks this is not local, let's make
+			 * sure.
+			 * This allows the VLAN to have a different MAC than the
+			 * underlying device, and still route correctly.
+			 */
+			if (!compare_ether_addr(eth_hdr(skb)->h_dest,
+						skb->dev->dev_addr))
+				skb->pkt_type = PACKET_HOST;
+			break;
+		default:
+			break;
+		}
+		u64_stats_update_end(&rx_stats->syncp);
 	}
-	u64_stats_update_end(&rx_stats->syncp);
 
+	skb_pull_rcsum(skb, VLAN_HLEN);
 	vlan_set_encap_proto(skb, vhdr);
 
-	skb = vlan_check_reorder_header(skb);
-	if (!skb) {
-		rx_stats->rx_errors++;
-		goto err_unlock;
+	if (vlan_dev) {
+		skb = vlan_check_reorder_header(skb);
+		if (!skb) {
+			rx_stats->rx_errors++;
+			goto err_unlock;
+		}
 	}
 
 	netif_rx(skb);


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
  2010-07-18 16:43                       ` Pedro Garcia
@ 2010-07-18 22:39                         ` David Miller
  2010-07-19 13:24                           ` [BUG net-next-2.6] vlan, bonding, bnx2 problems Eric Dumazet
  0 siblings, 1 reply; 35+ messages in thread
From: David Miller @ 2010-07-18 22:39 UTC (permalink / raw)
  To: pedro.netdev; +Cc: netdev, kaber, bhutchings, eric.dumazet

From: Pedro Garcia <pedro.netdev@dondevamos.com>
Date: Sun, 18 Jul 2010 18:43:25 +0200

> - Without the 8021q module loaded in the kernel, all 802.1p packets 
> (VLAN 0 but QoS tagging) are silently discarded (as expected, as 
> the protocol is not loaded).
>  
> - Without this patch in 8021q module, these packets are forwarded to 
> the module, but they are discarded also if VLAN 0 is not configured,
> which should not be the default behaviour, as VLAN 0 is not really
> a VLANed packet but a 802.1p packet. Defining VLAN 0 makes it almost
> impossible to communicate with mixed 802.1p and non 802.1p devices on
> the same network due to arp table issues.
> 
> - Changed logic to skip vlan specific code in vlan_skb_recv if VLAN 
> is 0 and we have not defined a VLAN with ID 0, but we accept the 
> packet with the encapsulated proto and pass it later to netif_rx.
> 
> - In the vlan device event handler, added some logic to add VLAN 0 
> to HW filter in devices that support it (this prevented any traffic
> in VLAN 0 to reach the stack in e1000e with HW filter under 2.6.35,
> and probably also with other HW filtered cards, so we fix it here).
> 
> - In the vlan unregister logic, prevent the elimination of VLAN 0 
> in devices with HW filter.
> 
> - The default behaviour is to ignore the VLAN 0 tagging and accept
> the packet as if it was not tagged, but we can still define a 
> VLAN 0 if desired (so it is backwards compatible).
> 
> Signed-off-by: Pedro Garcia <pedro.netdev@dondevamos.com>

Applied, thanks Pedro.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [BUG net-next-2.6] vlan, bonding, bnx2 problems
  2010-07-18 22:39                         ` David Miller
@ 2010-07-19 13:24                           ` Eric Dumazet
  2010-07-19 16:35                             ` David Miller
  2010-07-19 18:14                             ` Michael Chan
  0 siblings, 2 replies; 35+ messages in thread
From: Eric Dumazet @ 2010-07-19 13:24 UTC (permalink / raw)
  To: David Miller, Michael Chan; +Cc: pedro.netdev, netdev, kaber, bhutchings

Le dimanche 18 juillet 2010 à 15:39 -0700, David Miller a écrit :
> From: Pedro Garcia <pedro.netdev@dondevamos.com>
> Date: Sun, 18 Jul 2010 18:43:25 +0200
> 
> > - Without the 8021q module loaded in the kernel, all 802.1p packets 
> > (VLAN 0 but QoS tagging) are silently discarded (as expected, as 
> > the protocol is not loaded).
> >  
> > - Without this patch in 8021q module, these packets are forwarded to 
> > the module, but they are discarded also if VLAN 0 is not configured,
> > which should not be the default behaviour, as VLAN 0 is not really
> > a VLANed packet but a 802.1p packet. Defining VLAN 0 makes it almost
> > impossible to communicate with mixed 802.1p and non 802.1p devices on
> > the same network due to arp table issues.
> > 
> > - Changed logic to skip vlan specific code in vlan_skb_recv if VLAN 
> > is 0 and we have not defined a VLAN with ID 0, but we accept the 
> > packet with the encapsulated proto and pass it later to netif_rx.
> > 
> > - In the vlan device event handler, added some logic to add VLAN 0 
> > to HW filter in devices that support it (this prevented any traffic
> > in VLAN 0 to reach the stack in e1000e with HW filter under 2.6.35,
> > and probably also with other HW filtered cards, so we fix it here).
> > 
> > - In the vlan unregister logic, prevent the elimination of VLAN 0 
> > in devices with HW filter.
> > 
> > - The default behaviour is to ignore the VLAN 0 tagging and accept
> > the packet as if it was not tagged, but we can still define a 
> > VLAN 0 if desired (so it is backwards compatible).
> > 
> > Signed-off-by: Pedro Garcia <pedro.netdev@dondevamos.com>
> 
> Applied, thanks Pedro.

Hmm, current net-next-2.6 is not working with bonding and bnx2.

I got some fatal oops.

modprobe bond0
ifconfig bond0 down
echo 100 >/sys/class/net/bond0/bonding/miimon
echo 1 >/sys/class/net/bond0/bonding/mode
ifconfig bond0 up
ifenslave bond0 eth1 eth2
ip link set eth1 up
ip link set eth2 up

After some debugging to avoid crashes, I get :


[   31.784308] bonding: bond0: Setting MII monitoring interval to 100.
[   31.784391] bonding: bond0: setting mode to active-backup (1).
[   31.784900] 8021q: adding VLAN 0 to HW filter on device bond0
[   31.784903] ADDRCONF(NETDEV_UP): bond0: link is not ready
[   31.904440] ------------[ cut here ]------------
[   31.904500] WARNING: at drivers/net/bonding/bond_ipv6.c:185 bond_inet6addr_event+0x179/0x240 [bonding]()
[   31.904576] Hardware name: ProLiant BL460c G1
[   31.904629] Modules linked in: ipmi_si ipmi_msghandler hpilo bonding ipv6
[   31.904873] Pid: 4586, comm: ifenslave Tainted: G        W   2.6.35-rc1-01453-g3e12451-dirty #836
[   31.904948] Call Trace:
[   31.905002]  [<c13421c4>] ? printk+0x18/0x1c
[   31.905057]  [<c103c8fd>] warn_slowpath_common+0x6d/0xa0
[   31.905114]  [<f8cf5fd9>] ? bond_inet6addr_event+0x179/0x240 [bonding]
[   31.905172]  [<f8cf5fd9>] ? bond_inet6addr_event+0x179/0x240 [bonding]
[   31.905236]  [<c103c94d>] warn_slowpath_null+0x1d/0x20
[   31.905296]  [<f8cf5fd9>] bond_inet6addr_event+0x179/0x240 [bonding]
[   31.905354]  [<c105b061>] notifier_call_chain+0x41/0x60
[   31.905409]  [<c105b0cd>] atomic_notifier_call_chain+0x1d/0x20
[   31.905471]  [<f8b88b31>] addrconf_ifdown+0x211/0x320 [ipv6]
[   31.905529]  [<f8b897ae>] addrconf_notify+0x6e/0x870 [ipv6]
[   31.905586]  [<c1344912>] ? _raw_write_unlock_bh+0x12/0x20
[   31.905642]  [<c1344912>] ? _raw_write_unlock_bh+0x12/0x20
[   31.905701]  [<f8b8f1f0>] ? fib6_clean_all+0x70/0x80 [ipv6]
[   31.905770]  [<f8b8dda0>] ? fib6_age+0x0/0x90 [ipv6]
[   31.905830]  [<c104a106>] ? lock_timer_base+0x26/0x50
[   31.905884]  [<c104a279>] ? del_timer+0x69/0xb0
[   31.905938]  [<c134493d>] ? _raw_spin_unlock_bh+0xd/0x10
[   31.905997]  [<f8b8f267>] ? fib6_run_gc+0x67/0xe0 [ipv6]
[   31.906052]  [<c105b061>] notifier_call_chain+0x41/0x60
[   31.906107]  [<c105b19a>] raw_notifier_call_chain+0x1a/0x20
[   31.906165]  [<c129fe37>] call_netdevice_notifiers+0x27/0x60
[   31.906221]  [<c12ac0cd>] ? rtmsg_ifinfo+0xbd/0xf0
[   31.906276]  [<c12a183c>] __dev_notify_flags+0x5c/0x80
[   31.906333]  [<c12a1897>] dev_change_flags+0x37/0x60
[   31.906390]  [<c12f6291>] devinet_ioctl+0x591/0x6f0
[   31.906445]  [<c11726be>] ? copy_to_user+0x2e/0x40
[   31.906500]  [<c12f7212>] inet_ioctl+0xa2/0xd0
[   31.906555]  [<c128f65e>] sock_ioctl+0x4e/0x240
[   31.906610]  [<c10d3a44>] vfs_ioctl+0x34/0xa0
[   31.906664]  [<c10c7cab>] ? alloc_file+0x1b/0xa0
[   31.906718]  [<c128f610>] ? sock_ioctl+0x0/0x240
[   31.906771]  [<c10d4186>] do_vfs_ioctl+0x66/0x550
[   31.906827]  [<c1022ca0>] ? do_page_fault+0x0/0x350
[   31.906881]  [<c1022e41>] ? do_page_fault+0x1a1/0x350
[   31.906936]  [<c129098c>] ? sys_socket+0x5c/0x70
[   31.906990]  [<c1291860>] ? sys_socketcall+0x60/0x270
[   31.907045]  [<c10d46a9>] sys_ioctl+0x39/0x60
[   31.907099]  [<c1002bd0>] sysenter_do_call+0x12/0x26
[   31.907153] ---[ end trace 5c4638450a77a22f ]---
[   32.046479] BUG: scheduling while atomic: ifenslave/4586/0x00000100
[   32.046540] Modules linked in: ipmi_si ipmi_msghandler hpilo bonding ipv6
[   32.046784] Pid: 4586, comm: ifenslave Tainted: G        W   2.6.35-rc1-01453-g3e12451-dirty #836
[   32.046860] Call Trace:
[   32.046910]  [<c13421c4>] ? printk+0x18/0x1c
[   32.046965]  [<c10315c9>] __schedule_bug+0x59/0x60
[   32.047019]  [<c1342a2c>] schedule+0x57c/0x850
[   32.047074]  [<c104a106>] ? lock_timer_base+0x26/0x50
[   32.047128]  [<c1342f78>] schedule_timeout+0x118/0x250
[   32.047183]  [<c104a2c0>] ? process_timeout+0x0/0x10
[   32.047238]  [<c13430c5>] schedule_timeout_uninterruptible+0x15/0x20
[   32.047295]  [<c104a345>] msleep+0x15/0x20
[   32.047350]  [<c1227082>] bnx2_napi_disable+0x52/0x80
[   32.047405]  [<c122b56f>] bnx2_netif_stop+0x3f/0xa0
[   32.047460]  [<c122b62a>] bnx2_vlan_rx_register+0x5a/0x80
[   32.047516]  [<f8ced776>] bond_enslave+0x526/0xa90 [bonding]
[   32.047576]  [<f8b8f0d0>] ? fib6_clean_node+0x0/0xb0 [ipv6]
[   32.047634]  [<f8b8dda0>] ? fib6_age+0x0/0x90 [ipv6]
[   32.047689]  [<c129d2d3>] ? netdev_set_master+0x3/0xc0
[   32.047746]  [<f8cee4cb>] bond_do_ioctl+0x31b/0x430 [bonding]
[   32.047804]  [<c105b19a>] ? raw_notifier_call_chain+0x1a/0x20
[   32.047861]  [<c12abd5d>] ? __rtnl_unlock+0xd/0x10
[   32.047915]  [<c129f8cd>] ? __dev_get_by_name+0x7d/0xa0
[   32.047970]  [<c12a19b0>] dev_ifsioc+0xf0/0x290
[   32.048025]  [<f8cee1b0>] ? bond_do_ioctl+0x0/0x430 [bonding]
[   32.048081]  [<c12a1ce1>] dev_ioctl+0x191/0x610
[   32.048136]  [<c12eeb20>] ? udp_ioctl+0x0/0x70
[   32.048189]  [<c128f67c>] sock_ioctl+0x6c/0x240
[   32.048243]  [<c10d3a44>] vfs_ioctl+0x34/0xa0
[   32.048297]  [<c10c7cab>] ? alloc_file+0x1b/0xa0
[   32.048351]  [<c128f610>] ? sock_ioctl+0x0/0x240
[   32.048404]  [<c10d4186>] do_vfs_ioctl+0x66/0x550
[   32.048459]  [<c1022ca0>] ? do_page_fault+0x0/0x350
[   32.048513]  [<c1022e41>] ? do_page_fault+0x1a1/0x350
[   32.048568]  [<c129098c>] ? sys_socket+0x5c/0x70
[   32.048622]  [<c1291860>] ? sys_socketcall+0x60/0x270
[   32.048677]  [<c10d46a9>] sys_ioctl+0x39/0x60
[   32.048730]  [<c1002bd0>] sysenter_do_call+0x12/0x26
[   32.052025] bonding: bond0: enslaving eth1 as a backup interface with a down link.
[   32.100207] tg3 0000:14:04.0: PME# enabled
[   32.100222]  pci0000:00: wake-up capability enabled by ACPI
[   32.224488]  pci0000:00: wake-up capability disabled by ACPI
[   32.224492] tg3 0000:14:04.0: PME# disabled
[   32.348516] tg3 0000:14:04.0: BAR 0: set to [mem 0xfdff0000-0xfdffffff 64bit] (PCI address [0xfdff0000-0xfdffffff]
[   32.348524] tg3 0000:14:04.0: BAR 2: set to [mem 0xfdfe0000-0xfdfeffff 64bit] (PCI address [0xfdfe0000-0xfdfeffff]
[   32.363711] bonding: bond0: enslaving eth2 as a backup interface with a down link.



For bnx2, it seems commit 212f9934afccf9c9739921
was not sufficient to correct the "scheduling while atomic" bug...
enslaving a bnx2 on a bond device with one vlan already set :
 bond_enslave -> bnx2_vlan_rx_register -> bnx2_netif_stop -> bnx2_napi_disable -> msleep()

For the first oops, following patch cures it, but I am not pleased
with it. This zero-vid registration seems wrong at the beginning.

Thanks

[RFC net-next-2.6] bonding: fix bond_inet6addr_event() 

After commit ad1afb0039391 (vlan_dev: VLAN 0 should be treated
as "no vlan tag" (802.1p packet)),
bond_inet6addr_event() might be called with a NULL bond->vlgrp pointer, and
a non empty bond->vlan_list. vlan_group_get_device() is dereferencing a NULL pointer.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---

diff --git a/drivers/net/bonding/bond_ipv6.c b/drivers/net/bonding/bond_ipv6.c
index 969ffed..121b073 100644
--- a/drivers/net/bonding/bond_ipv6.c
+++ b/drivers/net/bonding/bond_ipv6.c
@@ -178,6 +178,8 @@ static int bond_inet6addr_event(struct notifier_block *this,
 		}
 
 		list_for_each_entry(vlan, &bond->vlan_list, vlan_list) {
+			if (!bond->vlgrp)
+				continue;
 			vlan_dev = vlan_group_get_device(bond->vlgrp,
 							 vlan->vlan_id);
 			if (vlan_dev == event_dev) {




^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [BUG net-next-2.6] vlan, bonding, bnx2 problems
  2010-07-19 13:24                           ` [BUG net-next-2.6] vlan, bonding, bnx2 problems Eric Dumazet
@ 2010-07-19 16:35                             ` David Miller
  2010-07-19 18:14                             ` Michael Chan
  1 sibling, 0 replies; 35+ messages in thread
From: David Miller @ 2010-07-19 16:35 UTC (permalink / raw)
  To: eric.dumazet; +Cc: mchan, pedro.netdev, netdev, kaber, bhutchings

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 19 Jul 2010 15:24:14 +0200

> [RFC net-next-2.6] bonding: fix bond_inet6addr_event() 
> 
> After commit ad1afb0039391 (vlan_dev: VLAN 0 should be treated
> as "no vlan tag" (802.1p packet)),
> bond_inet6addr_event() might be called with a NULL bond->vlgrp pointer, and
> a non empty bond->vlan_list. vlan_group_get_device() is dereferencing a NULL pointer.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

I'll apply this bandaid for now, but yes we need to think more
deeply about this.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [BUG net-next-2.6] vlan, bonding, bnx2 problems
  2010-07-19 13:24                           ` [BUG net-next-2.6] vlan, bonding, bnx2 problems Eric Dumazet
  2010-07-19 16:35                             ` David Miller
@ 2010-07-19 18:14                             ` Michael Chan
  2010-07-19 20:19                               ` Jay Vosburgh
  1 sibling, 1 reply; 35+ messages in thread
From: Michael Chan @ 2010-07-19 18:14 UTC (permalink / raw)
  To: Eric Dumazet, fubar; +Cc: David Miller, pedro.netdev, netdev, kaber, bhutchings

Adding Jay to CC.

On Mon, 2010-07-19 at 06:24 -0700, Eric Dumazet wrote:
> [   32.046479] BUG: scheduling while atomic: ifenslave/4586/0x00000100
> [   32.046540] Modules linked in: ipmi_si ipmi_msghandler hpilo
> bonding ipv6
> [   32.046784] Pid: 4586, comm: ifenslave Tainted: G        W
> 2.6.35-rc1-01453-g3e12451-dirty #836
> [   32.046860] Call Trace:
> [   32.046910]  [<c13421c4>] ? printk+0x18/0x1c
> [   32.046965]  [<c10315c9>] __schedule_bug+0x59/0x60
> [   32.047019]  [<c1342a2c>] schedule+0x57c/0x850
> [   32.047074]  [<c104a106>] ? lock_timer_base+0x26/0x50
> [   32.047128]  [<c1342f78>] schedule_timeout+0x118/0x250
> [   32.047183]  [<c104a2c0>] ? process_timeout+0x0/0x10
> [   32.047238]  [<c13430c5>] schedule_timeout_uninterruptible
> +0x15/0x20
> [   32.047295]  [<c104a345>] msleep+0x15/0x20
> [   32.047350]  [<c1227082>] bnx2_napi_disable+0x52/0x80
> [   32.047405]  [<c122b56f>] bnx2_netif_stop+0x3f/0xa0
> [   32.047460]  [<c122b62a>] bnx2_vlan_rx_register+0x5a/0x80
> [   32.047516]  [<f8ced776>] bond_enslave+0x526/0xa90 [bonding]
> [   32.047576]  [<f8b8f0d0>] ? fib6_clean_node+0x0/0xb0 [ipv6]
> [   32.047634]  [<f8b8dda0>] ? fib6_age+0x0/0x90 [ipv6]
> [   32.047689]  [<c129d2d3>] ? netdev_set_master+0x3/0xc0
> [   32.047746]  [<f8cee4cb>] bond_do_ioctl+0x31b/0x430 [bonding]
> [   32.047804]  [<c105b19a>] ? raw_notifier_call_chain+0x1a/0x20
> [   32.047861]  [<c12abd5d>] ? __rtnl_unlock+0xd/0x10
> [   32.047915]  [<c129f8cd>] ? __dev_get_by_name+0x7d/0xa0
> [   32.047970]  [<c12a19b0>] dev_ifsioc+0xf0/0x290
> [   32.048025]  [<f8cee1b0>] ? bond_do_ioctl+0x0/0x430 [bonding]
> [   32.048081]  [<c12a1ce1>] dev_ioctl+0x191/0x610
> [   32.048136]  [<c12eeb20>] ? udp_ioctl+0x0/0x70
> [   32.048189]  [<c128f67c>] sock_ioctl+0x6c/0x240
> [   32.048243]  [<c10d3a44>] vfs_ioctl+0x34/0xa0
> [   32.048297]  [<c10c7cab>] ? alloc_file+0x1b/0xa0
> [   32.048351]  [<c128f610>] ? sock_ioctl+0x0/0x240
> [   32.048404]  [<c10d4186>] do_vfs_ioctl+0x66/0x550
> [   32.048459]  [<c1022ca0>] ? do_page_fault+0x0/0x350
> [   32.048513]  [<c1022e41>] ? do_page_fault+0x1a1/0x350
> [   32.048568]  [<c129098c>] ? sys_socket+0x5c/0x70
> [   32.048622]  [<c1291860>] ? sys_socketcall+0x60/0x270
> [   32.048677]  [<c10d46a9>] sys_ioctl+0x39/0x60
> [   32.048730]  [<c1002bd0>] sysenter_do_call+0x12/0x26
> [   32.052025] bonding: bond0: enslaving eth1 as a backup interface
> with a down link.
> [   32.100207] tg3 0000:14:04.0: PME# enabled
> [   32.100222]  pci0000:00: wake-up capability enabled by ACPI
> [   32.224488]  pci0000:00: wake-up capability disabled by ACPI
> [   32.224492] tg3 0000:14:04.0: PME# disabled
> [   32.348516] tg3 0000:14:04.0: BAR 0: set to [mem
> 0xfdff0000-0xfdffffff 64bit] (PCI address [0xfdff0000-0xfdffffff]
> [   32.348524] tg3 0000:14:04.0: BAR 2: set to [mem
> 0xfdfe0000-0xfdfeffff 64bit] (PCI address [0xfdfe0000-0xfdfeffff]
> [   32.363711] bonding: bond0: enslaving eth2 as a backup interface
> with a down link.
> 
> 
> 
> For bnx2, it seems commit 212f9934afccf9c9739921
> was not sufficient to correct the "scheduling while atomic" bug...
> enslaving a bnx2 on a bond device with one vlan already set :
>  bond_enslave -> bnx2_vlan_rx_register -> bnx2_netif_stop ->
> bnx2_napi_disable -> msleep()
> 

There are a number of drivers that call napi_disable() during
->ndo_vlan_rx_regsiter().  bnx2 is lockless in the rx path and so we
need to disable NAPI rx processing and wait for it to be done before
modifying the vlgrp.

Jay, is there an alternative to holding the bond->lock when calling the
slave's ->ndo_vlan_rx_register()?



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [BUG net-next-2.6] vlan, bonding, bnx2 problems
  2010-07-19 18:14                             ` Michael Chan
@ 2010-07-19 20:19                               ` Jay Vosburgh
  2010-07-20 22:58                                 ` Jay Vosburgh
  0 siblings, 1 reply; 35+ messages in thread
From: Jay Vosburgh @ 2010-07-19 20:19 UTC (permalink / raw)
  To: Michael Chan
  Cc: Eric Dumazet, David Miller, pedro.netdev, netdev, kaber, bhutchings

Michael Chan <mchan@broadcom.com> wrote:

>Adding Jay to CC.
>
>On Mon, 2010-07-19 at 06:24 -0700, Eric Dumazet wrote:
>> [   32.046479] BUG: scheduling while atomic: ifenslave/4586/0x00000100
>> [   32.046540] Modules linked in: ipmi_si ipmi_msghandler hpilo
>> bonding ipv6
>> [   32.046784] Pid: 4586, comm: ifenslave Tainted: G        W
>> 2.6.35-rc1-01453-g3e12451-dirty #836
>> [   32.046860] Call Trace:
>> [   32.046910]  [<c13421c4>] ? printk+0x18/0x1c
>> [   32.046965]  [<c10315c9>] __schedule_bug+0x59/0x60
>> [   32.047019]  [<c1342a2c>] schedule+0x57c/0x850
>> [   32.047074]  [<c104a106>] ? lock_timer_base+0x26/0x50
>> [   32.047128]  [<c1342f78>] schedule_timeout+0x118/0x250
>> [   32.047183]  [<c104a2c0>] ? process_timeout+0x0/0x10
>> [   32.047238]  [<c13430c5>] schedule_timeout_uninterruptible
>> +0x15/0x20
>> [   32.047295]  [<c104a345>] msleep+0x15/0x20
>> [   32.047350]  [<c1227082>] bnx2_napi_disable+0x52/0x80
>> [   32.047405]  [<c122b56f>] bnx2_netif_stop+0x3f/0xa0
>> [   32.047460]  [<c122b62a>] bnx2_vlan_rx_register+0x5a/0x80
>> [   32.047516]  [<f8ced776>] bond_enslave+0x526/0xa90 [bonding]
>> [   32.047576]  [<f8b8f0d0>] ? fib6_clean_node+0x0/0xb0 [ipv6]
>> [   32.047634]  [<f8b8dda0>] ? fib6_age+0x0/0x90 [ipv6]
>> [   32.047689]  [<c129d2d3>] ? netdev_set_master+0x3/0xc0
>> [   32.047746]  [<f8cee4cb>] bond_do_ioctl+0x31b/0x430 [bonding]
>> [   32.047804]  [<c105b19a>] ? raw_notifier_call_chain+0x1a/0x20
>> [   32.047861]  [<c12abd5d>] ? __rtnl_unlock+0xd/0x10
>> [   32.047915]  [<c129f8cd>] ? __dev_get_by_name+0x7d/0xa0
>> [   32.047970]  [<c12a19b0>] dev_ifsioc+0xf0/0x290
>> [   32.048025]  [<f8cee1b0>] ? bond_do_ioctl+0x0/0x430 [bonding]
>> [   32.048081]  [<c12a1ce1>] dev_ioctl+0x191/0x610
>> [   32.048136]  [<c12eeb20>] ? udp_ioctl+0x0/0x70
>> [   32.048189]  [<c128f67c>] sock_ioctl+0x6c/0x240
>> [   32.048243]  [<c10d3a44>] vfs_ioctl+0x34/0xa0
>> [   32.048297]  [<c10c7cab>] ? alloc_file+0x1b/0xa0
>> [   32.048351]  [<c128f610>] ? sock_ioctl+0x0/0x240
>> [   32.048404]  [<c10d4186>] do_vfs_ioctl+0x66/0x550
>> [   32.048459]  [<c1022ca0>] ? do_page_fault+0x0/0x350
>> [   32.048513]  [<c1022e41>] ? do_page_fault+0x1a1/0x350
>> [   32.048568]  [<c129098c>] ? sys_socket+0x5c/0x70
>> [   32.048622]  [<c1291860>] ? sys_socketcall+0x60/0x270
>> [   32.048677]  [<c10d46a9>] sys_ioctl+0x39/0x60
>> [   32.048730]  [<c1002bd0>] sysenter_do_call+0x12/0x26
>> [   32.052025] bonding: bond0: enslaving eth1 as a backup interface
>> with a down link.
>> [   32.100207] tg3 0000:14:04.0: PME# enabled
>> [   32.100222]  pci0000:00: wake-up capability enabled by ACPI
>> [   32.224488]  pci0000:00: wake-up capability disabled by ACPI
>> [   32.224492] tg3 0000:14:04.0: PME# disabled
>> [   32.348516] tg3 0000:14:04.0: BAR 0: set to [mem
>> 0xfdff0000-0xfdffffff 64bit] (PCI address [0xfdff0000-0xfdffffff]
>> [   32.348524] tg3 0000:14:04.0: BAR 2: set to [mem
>> 0xfdfe0000-0xfdfeffff 64bit] (PCI address [0xfdfe0000-0xfdfeffff]
>> [   32.363711] bonding: bond0: enslaving eth2 as a backup interface
>> with a down link.
>> 
>> 
>> 
>> For bnx2, it seems commit 212f9934afccf9c9739921
>> was not sufficient to correct the "scheduling while atomic" bug...
>> enslaving a bnx2 on a bond device with one vlan already set :
>>  bond_enslave -> bnx2_vlan_rx_register -> bnx2_netif_stop ->
>> bnx2_napi_disable -> msleep()
>> 
>
>There are a number of drivers that call napi_disable() during
>->ndo_vlan_rx_regsiter().  bnx2 is lockless in the rx path and so we
>need to disable NAPI rx processing and wait for it to be done before
>modifying the vlgrp.
>
>Jay, is there an alternative to holding the bond->lock when calling the
>slave's ->ndo_vlan_rx_register()?

	I believe so.  The lock is held here nominally to mutex
bonding's vlan_list.  The bond_add_vlans_on_slave function actually does
the lock and call to ndo_vlan_rx_register (plus one add_vid call per
configured VLAN); I think the call frame in the above stack has been
optimized out.

	For the specific cases of bond_add_vlans_on_slave and
bond_del_vlans_from_slave, we should be able to get away without holding
the bond->lock because we also hold RTNL, and it looks like all changes
to the vlan_list are implicitly mutexed by RTNL because all VLAN add /
remove for device or vid end up being done under RTNL.

	The cases within bonding that change the vlan_list will still
have to hold bond->lock, because other call sites within bonding check
the vlan_list without RTNL (and it would be impractical to have them do
so).

	The patch is as follows; I'm compiling this now to test.  If it
pans out, I'll post a formal submission in a bit.

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 8228088..decddf5 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -565,10 +565,8 @@ static void bond_add_vlans_on_slave(struct bonding *bond, struct net_device *sla
 	struct vlan_entry *vlan;
 	const struct net_device_ops *slave_ops = slave_dev->netdev_ops;
 
-	write_lock_bh(&bond->lock);
-
 	if (list_empty(&bond->vlan_list))
-		goto out;
+		return;
 
 	if ((slave_dev->features & NETIF_F_HW_VLAN_RX) &&
 	    slave_ops->ndo_vlan_rx_register)
@@ -576,13 +574,10 @@ static void bond_add_vlans_on_slave(struct bonding *bond, struct net_device *sla
 
 	if (!(slave_dev->features & NETIF_F_HW_VLAN_FILTER) ||
 	    !(slave_ops->ndo_vlan_rx_add_vid))
-		goto out;
+		return;
 
 	list_for_each_entry(vlan, &bond->vlan_list, vlan_list)
 		slave_ops->ndo_vlan_rx_add_vid(slave_dev, vlan->vlan_id);
-
-out:
-	write_unlock_bh(&bond->lock);
 }
 
 static void bond_del_vlans_from_slave(struct bonding *bond,
@@ -592,10 +587,8 @@ static void bond_del_vlans_from_slave(struct bonding *bond,
 	struct vlan_entry *vlan;
 	struct net_device *vlan_dev;
 
-	write_lock_bh(&bond->lock);
-
 	if (list_empty(&bond->vlan_list))
-		goto out;
+		return;
 
 	if (!(slave_dev->features & NETIF_F_HW_VLAN_FILTER) ||
 	    !(slave_ops->ndo_vlan_rx_kill_vid))
@@ -614,9 +607,6 @@ unreg:
 	if ((slave_dev->features & NETIF_F_HW_VLAN_RX) &&
 	    slave_ops->ndo_vlan_rx_register)
 		slave_ops->ndo_vlan_rx_register(slave_dev, NULL);
-
-out:
-	write_unlock_bh(&bond->lock);
 }
 
 /*------------------------------- Link status -------------------------------*/

	
	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [BUG net-next-2.6] vlan, bonding, bnx2 problems
  2010-07-19 20:19                               ` Jay Vosburgh
@ 2010-07-20 22:58                                 ` Jay Vosburgh
  0 siblings, 0 replies; 35+ messages in thread
From: Jay Vosburgh @ 2010-07-20 22:58 UTC (permalink / raw)
  Cc: Michael Chan, Eric Dumazet, David Miller, pedro.netdev, netdev,
	kaber, bhutchings


Jay Vosburgh <fubar@us.ibm.com> wrote:

>Michael Chan <mchan@broadcom.com> wrote:
>
>>Adding Jay to CC.
>>
>>On Mon, 2010-07-19 at 06:24 -0700, Eric Dumazet wrote:
>>> [   32.046479] BUG: scheduling while atomic: ifenslave/4586/0x00000100
>>> [   32.046540] Modules linked in: ipmi_si ipmi_msghandler hpilo
>>> bonding ipv6
>>> [   32.046784] Pid: 4586, comm: ifenslave Tainted: G        W
>>> 2.6.35-rc1-01453-g3e12451-dirty #836
>>> [   32.046860] Call Trace:
>>> [   32.046910]  [<c13421c4>] ? printk+0x18/0x1c
>>> [   32.046965]  [<c10315c9>] __schedule_bug+0x59/0x60
>>> [   32.047019]  [<c1342a2c>] schedule+0x57c/0x850
>>> [   32.047074]  [<c104a106>] ? lock_timer_base+0x26/0x50
>>> [   32.047128]  [<c1342f78>] schedule_timeout+0x118/0x250
>>> [   32.047183]  [<c104a2c0>] ? process_timeout+0x0/0x10
>>> [   32.047238]  [<c13430c5>] schedule_timeout_uninterruptible
>>> +0x15/0x20
>>> [   32.047295]  [<c104a345>] msleep+0x15/0x20
>>> [   32.047350]  [<c1227082>] bnx2_napi_disable+0x52/0x80
>>> [   32.047405]  [<c122b56f>] bnx2_netif_stop+0x3f/0xa0
>>> [   32.047460]  [<c122b62a>] bnx2_vlan_rx_register+0x5a/0x80
>>> [   32.047516]  [<f8ced776>] bond_enslave+0x526/0xa90 [bonding]
>>> [   32.047576]  [<f8b8f0d0>] ? fib6_clean_node+0x0/0xb0 [ipv6]
>>> [   32.047634]  [<f8b8dda0>] ? fib6_age+0x0/0x90 [ipv6]
>>> [   32.047689]  [<c129d2d3>] ? netdev_set_master+0x3/0xc0
>>> [   32.047746]  [<f8cee4cb>] bond_do_ioctl+0x31b/0x430 [bonding]
>>> [   32.047804]  [<c105b19a>] ? raw_notifier_call_chain+0x1a/0x20
>>> [   32.047861]  [<c12abd5d>] ? __rtnl_unlock+0xd/0x10
>>> [   32.047915]  [<c129f8cd>] ? __dev_get_by_name+0x7d/0xa0
>>> [   32.047970]  [<c12a19b0>] dev_ifsioc+0xf0/0x290
>>> [   32.048025]  [<f8cee1b0>] ? bond_do_ioctl+0x0/0x430 [bonding]
>>> [   32.048081]  [<c12a1ce1>] dev_ioctl+0x191/0x610
>>> [   32.048136]  [<c12eeb20>] ? udp_ioctl+0x0/0x70
>>> [   32.048189]  [<c128f67c>] sock_ioctl+0x6c/0x240
>>> [   32.048243]  [<c10d3a44>] vfs_ioctl+0x34/0xa0
>>> [   32.048297]  [<c10c7cab>] ? alloc_file+0x1b/0xa0
>>> [   32.048351]  [<c128f610>] ? sock_ioctl+0x0/0x240
>>> [   32.048404]  [<c10d4186>] do_vfs_ioctl+0x66/0x550
>>> [   32.048459]  [<c1022ca0>] ? do_page_fault+0x0/0x350
>>> [   32.048513]  [<c1022e41>] ? do_page_fault+0x1a1/0x350
>>> [   32.048568]  [<c129098c>] ? sys_socket+0x5c/0x70
>>> [   32.048622]  [<c1291860>] ? sys_socketcall+0x60/0x270
>>> [   32.048677]  [<c10d46a9>] sys_ioctl+0x39/0x60
>>> [   32.048730]  [<c1002bd0>] sysenter_do_call+0x12/0x26
>>> [   32.052025] bonding: bond0: enslaving eth1 as a backup interface
>>> with a down link.
>>> [   32.100207] tg3 0000:14:04.0: PME# enabled
>>> [   32.100222]  pci0000:00: wake-up capability enabled by ACPI
>>> [   32.224488]  pci0000:00: wake-up capability disabled by ACPI
>>> [   32.224492] tg3 0000:14:04.0: PME# disabled
>>> [   32.348516] tg3 0000:14:04.0: BAR 0: set to [mem
>>> 0xfdff0000-0xfdffffff 64bit] (PCI address [0xfdff0000-0xfdffffff]
>>> [   32.348524] tg3 0000:14:04.0: BAR 2: set to [mem
>>> 0xfdfe0000-0xfdfeffff 64bit] (PCI address [0xfdfe0000-0xfdfeffff]
>>> [   32.363711] bonding: bond0: enslaving eth2 as a backup interface
>>> with a down link.
>>> 
>>> 
>>> 
>>> For bnx2, it seems commit 212f9934afccf9c9739921
>>> was not sufficient to correct the "scheduling while atomic" bug...
>>> enslaving a bnx2 on a bond device with one vlan already set :
>>>  bond_enslave -> bnx2_vlan_rx_register -> bnx2_netif_stop ->
>>> bnx2_napi_disable -> msleep()
>>> 
>>
>>There are a number of drivers that call napi_disable() during
>>->ndo_vlan_rx_regsiter().  bnx2 is lockless in the rx path and so we
>>need to disable NAPI rx processing and wait for it to be done before
>>modifying the vlgrp.
>>
>>Jay, is there an alternative to holding the bond->lock when calling the
>>slave's ->ndo_vlan_rx_register()?
>
>	I believe so.  The lock is held here nominally to mutex
>bonding's vlan_list.  The bond_add_vlans_on_slave function actually does
>the lock and call to ndo_vlan_rx_register (plus one add_vid call per
>configured VLAN); I think the call frame in the above stack has been
>optimized out.
>
>	For the specific cases of bond_add_vlans_on_slave and
>bond_del_vlans_from_slave, we should be able to get away without holding
>the bond->lock because we also hold RTNL, and it looks like all changes
>to the vlan_list are implicitly mutexed by RTNL because all VLAN add /
>remove for device or vid end up being done under RTNL.
>
>	The cases within bonding that change the vlan_list will still
>have to hold bond->lock, because other call sites within bonding check
>the vlan_list without RTNL (and it would be impractical to have them do
>so).
>
>	The patch is as follows; I'm compiling this now to test.  If it
>pans out, I'll post a formal submission in a bit.

	Just an update; the "VLAN 0" patch:

commit ad1afb00393915a51c21b1ae8704562bf036855f
Author: Pedro Garcia <pedro.netdev@dondevamos.com>
Date:   Sun Jul 18 15:38:44 2010 -0700

    vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)

	has broken a bunch of VLAN-related things in bonding (more than
just the ipv6 event thing that was already fixed).  Now, 8021q will do
an "add_vid" for VLAN 0 without doing a vlan_rx_register and supplying a
struct vlan_group; this confuses the existing bonding code, which
assumes that register comes first.

	I'm working out the best way to fix the VLAN breakage before I
can test the below patch (which may have to change).

	-J

>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 8228088..decddf5 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -565,10 +565,8 @@ static void bond_add_vlans_on_slave(struct bonding *bond, struct net_device *sla
> 	struct vlan_entry *vlan;
> 	const struct net_device_ops *slave_ops = slave_dev->netdev_ops;
>
>-	write_lock_bh(&bond->lock);
>-
> 	if (list_empty(&bond->vlan_list))
>-		goto out;
>+		return;
>
> 	if ((slave_dev->features & NETIF_F_HW_VLAN_RX) &&
> 	    slave_ops->ndo_vlan_rx_register)
>@@ -576,13 +574,10 @@ static void bond_add_vlans_on_slave(struct bonding *bond, struct net_device *sla
>
> 	if (!(slave_dev->features & NETIF_F_HW_VLAN_FILTER) ||
> 	    !(slave_ops->ndo_vlan_rx_add_vid))
>-		goto out;
>+		return;
>
> 	list_for_each_entry(vlan, &bond->vlan_list, vlan_list)
> 		slave_ops->ndo_vlan_rx_add_vid(slave_dev, vlan->vlan_id);
>-
>-out:
>-	write_unlock_bh(&bond->lock);
> }
>
> static void bond_del_vlans_from_slave(struct bonding *bond,
>@@ -592,10 +587,8 @@ static void bond_del_vlans_from_slave(struct bonding *bond,
> 	struct vlan_entry *vlan;
> 	struct net_device *vlan_dev;
>
>-	write_lock_bh(&bond->lock);
>-
> 	if (list_empty(&bond->vlan_list))
>-		goto out;
>+		return;
>
> 	if (!(slave_dev->features & NETIF_F_HW_VLAN_FILTER) ||
> 	    !(slave_ops->ndo_vlan_rx_kill_vid))
>@@ -614,9 +607,6 @@ unreg:
> 	if ((slave_dev->features & NETIF_F_HW_VLAN_RX) &&
> 	    slave_ops->ndo_vlan_rx_register)
> 		slave_ops->ndo_vlan_rx_register(slave_dev, NULL);
>-
>-out:
>-	write_unlock_bh(&bond->lock);
> }
>
> /*------------------------------- Link status -------------------------------*/

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2010-07-20 22:58 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-06-13 19:20 [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet) Pedro Garcia
2010-06-13 21:56 ` Ben Hutchings
2010-06-14 16:49   ` Pedro Garcia
2010-06-14 17:02     ` Ben Hutchings
2010-06-14 17:11       ` Patrick McHardy
2010-06-14 19:12         ` Eric Dumazet
2010-06-16  8:49           ` Pedro Garcia
2010-06-16  9:08             ` Eric Dumazet
2010-06-16 11:42               ` Patrick McHardy
2010-06-16 13:28                 ` Pedro Garcia
2010-06-16 14:24                   ` Arnd Bergmann
2010-06-16 15:28                     ` Patrick McHardy
2010-06-16 18:26                       ` Arnd Bergmann
2010-06-16 18:58                         ` Eric Dumazet
2010-06-17  8:56                           ` Vladislav Zolotarov
2010-06-17 10:28                             ` Eric Dumazet
2010-06-17 14:08                               ` Vladislav Zolotarov
2010-06-16 14:24                   ` Eric Dumazet
2010-06-27 23:21               ` Pedro Garcia
2010-06-30 20:16                 ` David Miller
2010-07-01 18:47                   ` Pedro Garcia
2010-07-01 20:19                     ` Eric Dumazet
2010-07-18 16:43                       ` Pedro Garcia
2010-07-18 22:39                         ` David Miller
2010-07-19 13:24                           ` [BUG net-next-2.6] vlan, bonding, bnx2 problems Eric Dumazet
2010-07-19 16:35                             ` David Miller
2010-07-19 18:14                             ` Michael Chan
2010-07-19 20:19                               ` Jay Vosburgh
2010-07-20 22:58                                 ` Jay Vosburgh
2010-06-24 18:28             ` [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet) Pedro Garcia Pelaez
2010-07-08 12:54             ` Vladislav Zolotarov
2010-07-08 12:58               ` Vladislav Zolotarov
2010-07-08 13:51             ` Vladislav Zolotarov
2010-06-14 19:42         ` Joe Perches
2010-06-14 20:03           ` Eric Dumazet

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.