linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* problem with IPoA (CLIP),  NAT, and VLANS
@ 2009-02-12 13:28 Karl Hiramoto
  2009-02-16 15:02 ` Karl Hiramoto
  0 siblings, 1 reply; 19+ messages in thread
From: Karl Hiramoto @ 2009-02-12 13:28 UTC (permalink / raw)
  To: netdev, netfilter; +Cc: LKML

   
Hi all,

I have a scenario with  CLIP  IPoA(RFC1577) atm link over ADSL on the
WAN, 801.1q  VLANs on the LAN, and  NAT/MASQUERADE   that does not work.


Network config:
Nat_host <-->  router <---> server

a ping from the Nat_host  reaches the server on the WAN fine, and the
ping comes back to the  router, but the ping response never reaches the
Nat_Host.    Using TRACE rules it seems the ICMP ping response gets lost
inside the router.   I see the same behavior with TCP and UDP.



Other similar cases that everything works fine.

1.    If i don't use VLAN's  on the LAN everything works fine with
NAT/MASQUERADE on the WAN  with Ethernet, ATM IPoA (RFC1577), and  ATM
IPoE (RFC2684).

2.    If I use  VLAN's on the LAN everything works fine with
NAT/MASQUERADE but only with  Ethernet and ATM IPoE(RFC2684) on the WAN.



Script to configure  CLIP on the WAN and VLANs on the LAN where NAT does
not work:

cat atm.vlan.sh
#!/bin/sh

vconfig add eth0 1
vconfig add eth0 2
ip addr add 192.168.88.1/24 broadcast 192.168.88.255 dev eth0.1
ip link set eth0.1 up

echo "NOTE: config switch vlans at this point"


ip addr
ip link
echo -------
echo " remove old eth0"
ip addr del 192.168.88.1/24 dev eth0
ip link
ip addr


atmarpd -b
atmarp -c atm0

ip addr add 10.1.1.178/24 dev atm0
ip link set atm0 up
atmarp -s 10.1.1.1 0.8.32
atmarp -a


#echo "delete route to LAN  ERROR ok here if no route exists"
#ip route del default via 192.168.88.1 dev eth0.1
echo "route via ATM"
ip route add default via 10.1.1.1

#flush tables and make default policy accept.
iptables -F; iptables -t nat -F; iptables -t mangle -F
iptables -P INPUT ACCEPT
iptables -P OUTPUT ACCEPT
iptables -P FORWARD ACCEPT

iptables -t nat -A POSTROUTING -o atm0 -j MASQUERADE
echo 1 > /proc/sys/net/ipv4/ip_forward



##########  Script to configure IPoA on the WAN without vlans  where
everything works fine:

cat atm.no.vlan.sh
#!/bin/sh

atmarpd -b
atmarp -c atm0

ip addr add 10.1.1.178/24 dev atm0
ip link set atm0 up
atmarp -s 10.1.1.1 0.8.32
atmarp -a

ip route del default via 192.168.88.1 dev eth0
ip route add default via 10.1.1.1

#flush tables and make default policy accept.
iptables -F; iptables -t nat -F; iptables -t mangle -F
iptables -P INPUT ACCEPT
iptables -P OUTPUT ACCEPT
iptables -P FORWARD ACCEPT

iptables -t nat -A POSTROUTING -o atm0 -j MASQUERADE
echo 1 > /proc/sys/net/ipv4/ip_forward


############

This WARN_ON() is occurring When using VLANs on IPoA and NAT

------------[ cut here ]------------
WARNING: at net/ipv4/netfilter/nf_nat_standalone.c:89 nf_nat_fn+0x44/0x194 [iptable_nat]()
Modules linked in: xt_MARK crc_ccitt nf_conntrack_pptp nf_conntrack_proto_gre ixp4xx_crypto ipt_MASQUERADE ipt_REDIRECT nf_nat_sip nf_conntrack
_sip nf_nat_h323 nf_conntrack_h323 nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat_irc nf_conntrack_irc ipt_addrtype iptable_n
at nf_nat xt_TCPMSS xt_pkttype nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_mark iptable_mangle iptable_filter ip_tables ixp4xx_sa
r ixp4xx_atm ixp_osal eagle utopia ipt_ULOG

[<c0025798>] (dump_stack+0x0/0x14) from [<c0031878>] (warn_on_slowpath+0x4c/0x68)
[<c003182c>] (warn_on_slowpath+0x0/0x68) from [<bf0d13a0>] (nf_nat_fn+0x44/0x194 [iptable_nat])
 r6:00000004 r5:c583d240 r4:bf0d24e8
[<bf0d135c>] (nf_nat_fn+0x0/0x194 [iptable_nat]) from [<bf0d1764>] (nf_nat_out+0x44/0xc4 [iptable_nat])
[<bf0d1720>] (nf_nat_out+0x0/0xc4 [iptable_nat]) from [<c01e5a9c>] (nf_iterate+0x64/0xd0)
 r5:c0313d70 r4:bf0d24e8
[<c01e5a38>] (nf_iterate+0x0/0xd0) from [<c01e5b6c>] (nf_hook_slow+0x64/0xf0)
[<c01e5b08>] (nf_hook_slow+0x0/0xf0) from [<c01f1c38>] (ip_output+0x84/0xa4)
[<c01f1bb4>] (ip_output+0x0/0xa4) from [<c01eea10>] (ip_forward_finish+0x44/0x4c)
 r4:c583d240
[<c01ee9cc>] (ip_forward_finish+0x0/0x4c) from [<c01eecdc>] (ip_forward+0x2c4/0x340)
 r4:c583d240
[<c01eea18>] (ip_forward+0x0/0x340) from [<c01ed5d8>] (ip_rcv_finish+0x338/0x35c)
 r7:c7d78000 r6:c034ce64 r5:c588e018 r4:c034d0ac
[<c01ed2a0>] (ip_rcv_finish+0x0/0x35c) from [<c01edb18>] (ip_rcv+0x23c/0x270)
[<c01ed8dc>] (ip_rcv+0x0/0x270) from [<c01d0fe8>] (netif_receive_skb+0x380/0x3c0)
 r7:00000800 r6:c7d78000 r5:c583d240 r4:c034d0ac
[<c01d0c68>] (netif_receive_skb+0x0/0x3c0) from [<c01d35fc>] (process_backlog+0x8c/0x128)
[<c01d3570>] (process_backlog+0x0/0x128) from [<c01d2fec>] (net_rx_action+0x60/0x1b8)
[<c01d2f8c>] (net_rx_action+0x0/0x1b8) from [<c0036458>] (__do_softirq+0x68/0x104)
[<c00363f0>] (__do_softirq+0x0/0x104) from [<c00367c8>] (irq_exit+0x44/0x4c)
[<c0036784>] (irq_exit+0x0/0x4c) from [<c0021068>] (__exception_text_start+0x68/0x84)
[<c0021000>] (__exception_text_start+0x0/0x84) from [<c00219c4>] (__irq_svc+0x24/0x80)
Exception stack(0xc0313f4c to 0xc0313f94)
3f40:                            c0333ad4 c78e4600 a0000013 00000000 c0022dd8
3f60: c0312000 c0022dd8 c0333148 0001d74c 69054041 0001d67c c0313fc0 c0313fa4
3f80: c0313f94 c0022ca0 c0022de0 60000013 ffffffff
 r5:0000001f r4:ffffffff
[<c0022c64>] (cpu_idle+0x0/0x58) from [<c025b078>] (rest_init+0x54/0x68)
 r7:c031636c r6:c001edb8 r5:c0332cc4 r4:c033f260
[<c025b024>] (rest_init+0x0/0x68) from [<c00089d0>] (start_kernel+0x244/0x2a4)
[<c000878c>] (start_kernel+0x0/0x2a4) from [<00008034>] (0x8034)
 r6:c001f1bc r5:c03331ac r4:000039fd
---[ end trace 223a280469e2bcdb ]---






Thanks for any help or info you can give me.   Please CC me on responses.


Karl.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: problem with IPoA (CLIP),  NAT, and VLANS
  2009-02-12 13:28 problem with IPoA (CLIP), NAT, and VLANS Karl Hiramoto
@ 2009-02-16 15:02 ` Karl Hiramoto
  2009-02-16 23:20   ` Jarek Poplawski
  0 siblings, 1 reply; 19+ messages in thread
From: Karl Hiramoto @ 2009-02-16 15:02 UTC (permalink / raw)
  To: netdev, netfilter; +Cc: LKML

Karl Hiramoto wrote:
>    
> Hi all,
>
> I have a scenario with  CLIP  IPoA(RFC1577) atm link over ADSL on the
> WAN, 801.1q  VLANs on the LAN, and  NAT/MASQUERADE   that does not work.
>
>
> Network config:
> Nat_host <-->  router <---> server
>
> a ping from the Nat_host  reaches the server on the WAN fine, and the
> ping comes back to the  router, but the ping response never reaches the
> Nat_Host.    Using TRACE rules it seems the ICMP ping response gets lost
> inside the router.   I see the same behavior with TCP and UDP.
>
>   

The problem ended up being the packet being corrupted when the vlan tag
was being added and skb_cow_head()  was being called.

Anyone know why skb_cow_head() would corrupt the packet?  Perhaps it was
not allocated correctly?     I'm using a big-endian ARM IXP435 board.

NET_SKB_PAD change was already discussed:
http://kerneltrap.org/mailarchive/linux-netdev/2009/2/7/4919204


Signed-off by: Karl Hiramoto <karl@hiramoto.org>

--- linux-2.6.28.4.a/include/linux/if_vlan.h    2009-02-06
22:47:45.000000000 +0100
+++ linux-2.6.28.4.b/include/linux/if_vlan.h    2009-02-16
13:55:35.000000000 +0100
@@ -183,9 +183,12 @@ static inline struct sk_buff *__vlan_put
 {
        struct vlan_ethhdr *veth;

-       if (skb_cow_head(skb, VLAN_HLEN) < 0) {
-               kfree_skb(skb);
-               return NULL;
+       /* If not enough headroom for vlan tag */
+       if (skb_headroom(skb) < VLAN_HLEN) {
+               if (skb_cow_head(skb, VLAN_HLEN) < 0) {
+                       kfree_skb(skb);
+                       return NULL;
+               }
        }
        veth = (struct vlan_ethhdr *)skb_push(skb, VLAN_HLEN)

--- linux-2.6.28.4.a/include/linux/skbuff.h     2009-02-06
22:47:45.000000000 +0100
+++ linux-2.6.28.4.b/include/linux/skbuff.h     2009-02-16
13:52:54.000000000 +0100
@@ -1263,7 +1263,7 @@ static inline int skb_network_offset(con
  * headroom, you should not reduce this.
  */
 #ifndef NET_SKB_PAD
-#define NET_SKB_PAD    16
+#define NET_SKB_PAD    32
 #endif

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: problem with IPoA (CLIP),  NAT, and VLANS
  2009-02-16 15:02 ` Karl Hiramoto
@ 2009-02-16 23:20   ` Jarek Poplawski
  2009-02-17  9:03     ` Patrick McHardy
  2009-02-17 11:49     ` Karl Hiramoto
  0 siblings, 2 replies; 19+ messages in thread
From: Jarek Poplawski @ 2009-02-16 23:20 UTC (permalink / raw)
  To: Karl Hiramoto; +Cc: netdev, netfilter, LKML

Karl Hiramoto wrote, On 02/16/2009 04:02 PM:
...
> The problem ended up being the packet being corrupted when the vlan tag
> was being added and skb_cow_head()  was being called.
> 
> Anyone know why skb_cow_head() would corrupt the packet?  Perhaps it was
> not allocated correctly?     I'm using a big-endian ARM IXP435 board.

Hi,
Very nice debugging, but I think your patch doesn't look like enough:
1) it skips copy for cloned skbs,
2) this skb_cow_head() is needed anyway, sometimes.
So the real bug should be found in skb_cow_head() or elsewhere.

I attach here 2 patches for testing:
1) skb->mac_header update: it looks like needed, but I don't know if
   it matters here,
2) an extention of your patch but with pskb_expand_head() called for
   one to one copy.

BTW, if it's not a big problem it would be nice to try this e.g. on
2.6.29-rc5.

Thanks,
Jarek P.

---> patch #1

 include/linux/if_vlan.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index f8ff918..e1ff5b1 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -210,6 +210,7 @@ static inline struct sk_buff *__vlan_put_tag(struct sk_buff *skb, u16 vlan_tci)
 
 	/* Move the mac addresses to the beginning of the new header. */
 	memmove(skb->data, skb->data + VLAN_HLEN, 2 * VLAN_ETH_ALEN);
+	skb->mac_header -= VLAN_HLEN;
 
 	/* first, the ethernet type */
 	veth->h_vlan_proto = htons(ETH_P_8021Q);

---> patch #2

 include/linux/if_vlan.h |   13 ++++++++++---
 1 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index f8ff918..e9a5eb1 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -202,9 +202,16 @@ static inline struct sk_buff *__vlan_put_tag(struct sk_buff *skb, u16 vlan_tci)
 {
 	struct vlan_ethhdr *veth;
 
-	if (skb_cow_head(skb, VLAN_HLEN) < 0) {
-		kfree_skb(skb);
-		return NULL;
+	if (skb_headroom(skb) < VLAN_HLEN) {
+		if (skb_cow_head(skb, VLAN_HLEN) < 0) {
+			kfree_skb(skb);
+			return NULL;
+		}
+	} else {
+		if (pskb_expand_head(skb, 0, 0, GFP_ATOMIC) < 0) {
+			kfree_skb(skb);
+			return NULL;
+		}
 	}
 	veth = (struct vlan_ethhdr *)skb_push(skb, VLAN_HLEN);
 

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: problem with IPoA (CLIP),  NAT, and VLANS
  2009-02-16 23:20   ` Jarek Poplawski
@ 2009-02-17  9:03     ` Patrick McHardy
  2009-02-17  9:32       ` [PATCH] " Jarek Poplawski
                         ` (2 more replies)
  2009-02-17 11:49     ` Karl Hiramoto
  1 sibling, 3 replies; 19+ messages in thread
From: Patrick McHardy @ 2009-02-17  9:03 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: Karl Hiramoto, netdev, netfilter, LKML

Jarek Poplawski wrote:
> Very nice debugging, but I think your patch doesn't look like enough:
> 1) it skips copy for cloned skbs,
> 2) this skb_cow_head() is needed anyway, sometimes.
> So the real bug should be found in skb_cow_head() or elsewhere.
> 
> I attach here 2 patches for testing:
> 1) skb->mac_header update: it looks like needed, but I don't know if
>    it matters here,
> 2) an extention of your patch but with pskb_expand_head() called for
>    one to one copy.
> 
> BTW, if it's not a big problem it would be nice to try this e.g. on
> 2.6.29-rc5.

The first patch looks fine. As for the second one, I would like
to understand why we're seing these packets. The VLAN driver uses
the original headroom plus the space it needs itself, which suggests
that the underlying driver specifies an incorrect headroom.
The driver doesn't appear to be part of the mainline kernel though.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH] Re: problem with IPoA (CLIP),  NAT, and VLANS
  2009-02-17  9:03     ` Patrick McHardy
@ 2009-02-17  9:32       ` Jarek Poplawski
  2009-02-17  9:39       ` [PATCH v2] " Jarek Poplawski
  2009-02-17  9:52       ` Jarek Poplawski
  2 siblings, 0 replies; 19+ messages in thread
From: Jarek Poplawski @ 2009-02-17  9:32 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Karl Hiramoto, netdev, netfilter, LKML

On Tue, Feb 17, 2009 at 10:03:57AM +0100, Patrick McHardy wrote:
> Jarek Poplawski wrote:
>> Very nice debugging, but I think your patch doesn't look like enough:
>> 1) it skips copy for cloned skbs,
>> 2) this skb_cow_head() is needed anyway, sometimes.
>> So the real bug should be found in skb_cow_head() or elsewhere.
>>
>> I attach here 2 patches for testing:
>> 1) skb->mac_header update: it looks like needed, but I don't know if
>>    it matters here,
>> 2) an extention of your patch but with pskb_expand_head() called for
>>    one to one copy.
>>
>> BTW, if it's not a big problem it would be nice to try this e.g. on
>> 2.6.29-rc5.
>
> The first patch looks fine. As for the second one, I would like
> to understand why we're seing these packets. The VLAN driver uses
> the original headroom plus the space it needs itself, which suggests
> that the underlying driver specifies an incorrect headroom.
> The driver doesn't appear to be part of the mainline kernel though.
>

The second patch is only for debugging: to check if it's about offset
change or some values corrupted btw.

Since the first patch looks quite natural I resend it for applying.
I hope you'll ack it when Karl sends "Tested-by:" after checking it
at least with the currently working (IPoE) vlan case.

Thanks,
Jarek P.
----------------->
vlan: Update skb->mac_header in __vlan_put_tag().

After moving mac addresses in __vlan_put_tag() skb->mac_header() needs
to be updated.

Reported-by: Karl Hiramoto <karl@hiramoto.org>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
---

 include/linux/if_vlan.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index f8ff918..e1ff5b1 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -210,6 +210,7 @@ static inline struct sk_buff *__vlan_put_tag(struct sk_buff *skb, u16 vlan_tci)
 
 	/* Move the mac addresses to the beginning of the new header. */
 	memmove(skb->data, skb->data + VLAN_HLEN, 2 * VLAN_ETH_ALEN);
+	skb->mac_header -= VLAN_HLEN;
 
 	/* first, the ethernet type */
 	veth->h_vlan_proto = htons(ETH_P_8021Q);


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2] Re: problem with IPoA (CLIP),  NAT, and VLANS
  2009-02-17  9:03     ` Patrick McHardy
  2009-02-17  9:32       ` [PATCH] " Jarek Poplawski
@ 2009-02-17  9:39       ` Jarek Poplawski
  2009-02-17 11:05         ` Karl Hiramoto
  2009-02-19  7:31         ` David Miller
  2009-02-17  9:52       ` Jarek Poplawski
  2 siblings, 2 replies; 19+ messages in thread
From: Jarek Poplawski @ 2009-02-17  9:39 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Karl Hiramoto, netdev, netfilter, LKML, David Miller

Sorry:
-----------------> take 2: changelog fix only

vlan: Update skb->mac_header in __vlan_put_tag().

After moving mac addresses in __vlan_put_tag() skb->mac_header needs
to be updated.

Reported-by: Karl Hiramoto <karl@hiramoto.org>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
---

 include/linux/if_vlan.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index f8ff918..e1ff5b1 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -210,6 +210,7 @@ static inline struct sk_buff *__vlan_put_tag(struct sk_buff *skb, u16 vlan_tci)
 
 	/* Move the mac addresses to the beginning of the new header. */
 	memmove(skb->data, skb->data + VLAN_HLEN, 2 * VLAN_ETH_ALEN);
+	skb->mac_header -= VLAN_HLEN;
 
 	/* first, the ethernet type */
 	veth->h_vlan_proto = htons(ETH_P_8021Q);


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: problem with IPoA (CLIP),  NAT, and VLANS
  2009-02-17  9:03     ` Patrick McHardy
  2009-02-17  9:32       ` [PATCH] " Jarek Poplawski
  2009-02-17  9:39       ` [PATCH v2] " Jarek Poplawski
@ 2009-02-17  9:52       ` Jarek Poplawski
  2 siblings, 0 replies; 19+ messages in thread
From: Jarek Poplawski @ 2009-02-17  9:52 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Karl Hiramoto, netdev, netfilter, LKML

On Tue, Feb 17, 2009 at 10:03:57AM +0100, Patrick McHardy wrote:
...
> The first patch looks fine. As for the second one, I would like
> to understand why we're seing these packets. The VLAN driver uses
> the original headroom plus the space it needs itself, which suggests
> that the underlying driver specifies an incorrect headroom.
> The driver doesn't appear to be part of the mainline kernel though.

On the other hand, if Karl added his patch it seems this change of
NET_SKB_PAD isn't enough, and if omitting skb_cow_head() works, then
the headroom isn't the main problem?

Jarek P.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] Re: problem with IPoA (CLIP),  NAT, and VLANS
  2009-02-17  9:39       ` [PATCH v2] " Jarek Poplawski
@ 2009-02-17 11:05         ` Karl Hiramoto
  2009-02-17 11:53           ` Jarek Poplawski
  2009-02-19  7:31         ` David Miller
  1 sibling, 1 reply; 19+ messages in thread
From: Karl Hiramoto @ 2009-02-17 11:05 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: Patrick McHardy, netdev, netfilter, LKML, David Miller



Jarek Poplawski wrote:
> Sorry:
> -----------------> take 2: changelog fix only
>
> vlan: Update skb->mac_header in __vlan_put_tag().
>
> After moving mac addresses in __vlan_put_tag() skb->mac_header needs
> to be updated.
>   
This works on IPoE.  I notice no change with or w/o this patch on IPoE
only routing.    However has no effect on the original problem i
reported with  IPoA --> NAT --> VLAN

Tested-by: Karl Hiramoto <karl@hiramoto.org>

> Reported-by: Karl Hiramoto <karl@hiramoto.org>
> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
> ---
>
>  include/linux/if_vlan.h |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
> index f8ff918..e1ff5b1 100644
> --- a/include/linux/if_vlan.h
> +++ b/include/linux/if_vlan.h
> @@ -210,6 +210,7 @@ static inline struct sk_buff *__vlan_put_tag(struct sk_buff *skb, u16 vlan_tci)
>  
>  	/* Move the mac addresses to the beginning of the new header. */
>  	memmove(skb->data, skb->data + VLAN_HLEN, 2 * VLAN_ETH_ALEN);
> +	skb->mac_header -= VLAN_HLEN;
>  
>  	/* first, the ethernet type */
>  	veth->h_vlan_proto = htons(ETH_P_8021Q);
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>   


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: problem with IPoA (CLIP),  NAT, and VLANS
  2009-02-16 23:20   ` Jarek Poplawski
  2009-02-17  9:03     ` Patrick McHardy
@ 2009-02-17 11:49     ` Karl Hiramoto
  2009-02-17 12:20       ` Jarek Poplawski
  2009-02-17 12:28       ` Patrick McHardy
  1 sibling, 2 replies; 19+ messages in thread
From: Karl Hiramoto @ 2009-02-17 11:49 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: netdev, netfilter, LKML

Jarek Poplawski wrote:
> Karl Hiramoto wrote, On 02/16/2009 04:02 PM:
> ...
>   
>> The problem ended up being the packet being corrupted when the vlan tag
>> was being added and skb_cow_head()  was being called.
>>
>> Anyone know why skb_cow_head() would corrupt the packet?  Perhaps it was
>> not allocated correctly?     I'm using a big-endian ARM IXP435 board.
>>     
>
> Hi,
> Very nice debugging, but I think your patch doesn't look like enough:
> 1) it skips copy for cloned skbs,
> 2) this skb_cow_head() is needed anyway, sometimes.
> So the real bug should be found in skb_cow_head() or elsewhere.
>
> I attach here 2 patches for testing:
> 2) an extention of your patch but with pskb_expand_head() called for
>    one to one copy.
>
>
>  include/linux/if_vlan.h |   13 ++++++++++---
>  1 files changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
> index f8ff918..e9a5eb1 100644
> --- a/include/linux/if_vlan.h
> +++ b/include/linux/if_vlan.h
> @@ -202,9 +202,16 @@ static inline struct sk_buff *__vlan_put_tag(struct sk_buff *skb, u16 vlan_tci)
>  {
>  	struct vlan_ethhdr *veth;
>  
> -	if (skb_cow_head(skb, VLAN_HLEN) < 0) {
> -		kfree_skb(skb);
> -		return NULL;
> +	if (skb_headroom(skb) < VLAN_HLEN) {
> +		if (skb_cow_head(skb, VLAN_HLEN) < 0) {
> +			kfree_skb(skb);
> +			return NULL;
> +		}
> +	} else {
> +		if (pskb_expand_head(skb, 0, 0, GFP_ATOMIC) < 0) {
> +			kfree_skb(skb);
> +			return NULL;
> +		}
>  	}
>  	veth = (struct vlan_ethhdr *)skb_push(skb, VLAN_HLEN);
>  
>   


This breaks IPoA -> NAT -->VLAN for me.  Before, with a virgin 2.6.28.4 
I had no traffic.  With this patch,  what was a 50ms ping, jumps to
1050ms  and randomly there are duplicate packets.

What Patrick,  said may be the case,  that the underlaying atm driver is
not specifying the headroom.   The driver is ported to 2.6.28 from
2.6.11 based on the BSD licensed intel libaries:
http://www.intel.com/design/network/products/npfamily/download_ixp400.htm

Where underneath a bunch of layers the skb is allocated with kmalloc like:

size = PAGE_ALIGN (size + CACHE_LINE_SIZE);
order = get_order (size);
page = alloc_pages (GFP_KERNEL, order);


Why though does the same driver work with IPoE  Bridged ATM   RFC2684  ?


The other module in use is CLIP  net/atm/clip.c      I don't have any
other hardware though to test the CLIP driver with though.


A side note:  so far the original patch i sent works in all cases i have
tested, but fails with tcpdump.   I suspect its because the skb gets cloned.

Thanks,


Karl.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] Re: problem with IPoA (CLIP),  NAT, and VLANS
  2009-02-17 11:05         ` Karl Hiramoto
@ 2009-02-17 11:53           ` Jarek Poplawski
  0 siblings, 0 replies; 19+ messages in thread
From: Jarek Poplawski @ 2009-02-17 11:53 UTC (permalink / raw)
  To: Karl Hiramoto; +Cc: Patrick McHardy, netdev, netfilter, LKML, David Miller

On Tue, Feb 17, 2009 at 12:05:24PM +0100, Karl Hiramoto wrote:
> 
> 
> Jarek Poplawski wrote:
> > Sorry:
> > -----------------> take 2: changelog fix only
> >
> > vlan: Update skb->mac_header in __vlan_put_tag().
> >
> > After moving mac addresses in __vlan_put_tag() skb->mac_header needs
> > to be updated.
> >   
> This works on IPoE.  I notice no change with or w/o this patch on IPoE
> only routing.    However has no effect on the original problem i
> reported with  IPoA --> NAT --> VLAN
> 
> Tested-by: Karl Hiramoto <karl@hiramoto.org>

Yes, since it didn't affect the working case with your patch, I didn't
expect it would matter here, but it looks like needed so good to know
it seems not to make anything wrong at least.

Thanks for testing,
Jarek P.

> 
> > Reported-by: Karl Hiramoto <karl@hiramoto.org>
> > Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
> > ---
> >
> >  include/linux/if_vlan.h |    1 +
> >  1 files changed, 1 insertions(+), 0 deletions(-)
> >
> > diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
> > index f8ff918..e1ff5b1 100644
> > --- a/include/linux/if_vlan.h
> > +++ b/include/linux/if_vlan.h
> > @@ -210,6 +210,7 @@ static inline struct sk_buff *__vlan_put_tag(struct sk_buff *skb, u16 vlan_tci)
> >  
> >  	/* Move the mac addresses to the beginning of the new header. */
> >  	memmove(skb->data, skb->data + VLAN_HLEN, 2 * VLAN_ETH_ALEN);
> > +	skb->mac_header -= VLAN_HLEN;
> >  
> >  	/* first, the ethernet type */
> >  	veth->h_vlan_proto = htons(ETH_P_8021Q);
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe netdev" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >
> >   
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: problem with IPoA (CLIP),  NAT, and VLANS
  2009-02-17 11:49     ` Karl Hiramoto
@ 2009-02-17 12:20       ` Jarek Poplawski
  2009-02-17 12:53         ` Karl Hiramoto
  2009-02-17 12:28       ` Patrick McHardy
  1 sibling, 1 reply; 19+ messages in thread
From: Jarek Poplawski @ 2009-02-17 12:20 UTC (permalink / raw)
  To: Karl Hiramoto; +Cc: netdev, netfilter, LKML

On Tue, Feb 17, 2009 at 12:49:07PM +0100, Karl Hiramoto wrote:
...
> A side note:  so far the original patch i sent works in all cases i have
> tested, but fails with tcpdump.   I suspect its because the skb gets cloned.

If there is something readable from this tcpdump, it should be helpful
to see a packet for working and non-working case during such ping
(with -nXX option).

Jarek P.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: problem with IPoA (CLIP),  NAT, and VLANS
  2009-02-17 11:49     ` Karl Hiramoto
  2009-02-17 12:20       ` Jarek Poplawski
@ 2009-02-17 12:28       ` Patrick McHardy
  1 sibling, 0 replies; 19+ messages in thread
From: Patrick McHardy @ 2009-02-17 12:28 UTC (permalink / raw)
  To: Karl Hiramoto; +Cc: Jarek Poplawski, netdev, netfilter, LKML

Karl Hiramoto wrote:
> What Patrick,  said may be the case,  that the underlaying atm driver is
> not specifying the headroom.   The driver is ported to 2.6.28 from
> 2.6.11 based on the BSD licensed intel libaries:
> http://www.intel.com/design/network/products/npfamily/download_ixp400.htm

If those are the ones I think they are, I'm not too surprised ...

> Where underneath a bunch of layers the skb is allocated with kmalloc like:
> 
> size = PAGE_ALIGN (size + CACHE_LINE_SIZE);
> order = get_order (size);
> page = alloc_pages (GFP_KERNEL, order);

How does it initialize the netdev struct?


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: problem with IPoA (CLIP),  NAT, and VLANS
  2009-02-17 12:20       ` Jarek Poplawski
@ 2009-02-17 12:53         ` Karl Hiramoto
  2009-02-17 13:37           ` Jarek Poplawski
  2009-02-17 23:12           ` Jarek Poplawski
  0 siblings, 2 replies; 19+ messages in thread
From: Karl Hiramoto @ 2009-02-17 12:53 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: netdev, netfilter, LKML

Jarek Poplawski wrote:
> On Tue, Feb 17, 2009 at 12:49:07PM +0100, Karl Hiramoto wrote:
> ...
>   
>> A side note:  so far the original patch i sent works in all cases i have
>> tested, but fails with tcpdump.   I suspect its because the skb gets cloned.
>>     
>
> If there is something readable from this tcpdump, it should be helpful
> to see a packet for working and non-working case during such ping
> (with -nXX option).
> Jarek P.
>   

Note:  I have the patches i sent applied,  plus the  "skb->mac_header -=
VLAN_HLEN;"   patch from Jarek on 2.6.28.4

Doing a tcpdump simultaneously  on the atm and eth0.1 on the linux router.


tcpdump -i atm0 -nvXX icmp
tcpdump: listening on atm0, link-type LINUX_SLL (Linux cooked), capture
size 68 bytes
12:47:15.431821 IP (tos 0x0, ttl  63, id 0, offset 0, flags [DF], proto:
ICMP (1), length: 84) 80.33.85.178 > 80.58.0.33: ICMP echo request, id
54787, seq 1, length 64
        0x0000:  0004 0013 0000 0000 0000 0000 0000 0800  ................
        0x0010:  4500 0054 0000 4000 3f01 457b 5021 55b2  E..T..@.?.E{P!U.
        0x0020:  503a 0021 0800 24cc d603 0001 d4b1 9a49  P:.!..$........I
        0x0030:  a130 0200 0809 0a0b 0c0d 0e0f 1011 1213  .0..............
        0x0040:  1415 1617                                ....
12:47:15.491209 IP (tos 0x0, ttl 126, id 51644, offset 0, flags [none],
proto: ICMP (1), length: 84) 80.58.0.33 > 80.33.85.178: ICMP echo reply,
id 54787, seq 1, length 64
        0x0000:  0000 0013 0000 0000 0000 0000 0000 0800  ................
        0x0010:  4500 0054 c9bc 0000 7e01 7cbe 503a 0021  E..T....~.|.P:.!
        0x0020:  5021 55b2 0000 2ccc d603 0001 d4b1 9a49  P!U...,........I
        0x0030:  a130 0200 0809 0a0b 0c0d 0e0f 1011 1213  .0..............
        0x0040:  1415 1617                                ....
12:47:16.442008 IP (tos 0x0, ttl  63, id 0, offset 0, flags [DF], proto:
ICMP (1), length: 84) 80.33.85.178 > 80.58.0.33: ICMP echo request, id
54787, seq 2, length 64
        0x0000:  0004 0013 0000 0000 0000 0000 0000 0800  ................
        0x0010:  4500 0054 0000 4000 3f01 457b 5021 55b2  E..T..@.?.E{P!U.
        0x0020:  503a 0021 0800 eda1 d603 0002 d5b1 9a49  P:.!...........I
        0x0030:  d759 0200 0809 0a0b 0c0d 0e0f 1011 1213  .Y..............
        0x0040:  1415 1617                                ....
12:47:16.498148 IP (tos 0x0, ttl 126, id 51784, offset 0, flags [none],
proto: ICMP (1), length: 84) 80.58.0.33 > 80.33.85.178: ICMP echo reply,
id 54787, seq 2, length 64
        0x0000:  0000 0013 0000 0000 0000 0000 0000 0800  ................
        0x0010:  4500 0054 ca48 0000 7e01 7c32 503a 0021  E..T.H..~.|2P:.!
        0x0020:  5021 55b2 0000 f5a1 d603 0002 d5b1 9a49  P!U............I
        0x0030:  d759 0200 0809 0a0b 0c0d 0e0f 1011 1213  .Y..............
        0x0040:  1415 1617                  



tcpdump -i eth0.1 -nvXX icmp
tcpdump: listening on eth0.1, link-type EN10MB (Ethernet), capture size
68 bytes
12:47:15.434163 IP (tos 0x0, ttl  64, id 0, offset 0, flags [DF], proto:
ICMP (1), length: 84) 192.168.88.2 > 80.58.0.33: ICMP echo request, id
54787, seq 1, length 64
        0x0000:  525e a930 50db 0015 c509 9b4a 0800 4500  R^.0P......J..E.
        0x0010:  0054 0000 4000 4001 d1a3 c0a8 5802 503a  .T..@.@.....X.P:
        0x0020:  0021 0800 24cc d603 0001 d4b1 9a49 a130  .!..$........I.0
        0x0030:  0200 0809 0a0b 0c0d 0e0f 1011 1213 1415  ................
        0x0040:  1617 1819                                ....
12:47:16.441748 IP (tos 0x0, ttl  64, id 0, offset 0, flags [DF], proto:
ICMP (1), length: 84) 192.168.88.2 > 80.58.0.33: ICMP echo request, id
54787, seq 2, length 64
        0x0000:  525e a930 50db 0015 c509 9b4a 0800 4500  R^.0P......J..E.
        0x0010:  0054 0000 4000 4001 d1a3 c0a8 5802 503a  .T..@.@.....X.P:
        0x0020:  0021 0800 eda1 d603 0002 d5b1 9a49 d759  .!...........I.Y
        0x0030:  0200 0809 0a0b 0c0d 0e0f 1011 1213 1415  ................
        0x0040:  1617 1819                                ....
12:47:16.498342 IP (tos 0x0, ttl 125, id 47253, offset 0, flags [none],
proto: ICMP (1), length: 84) 80.58.0.33 > 192.168.88.2: ICMP echo reply,
id 54787, seq 2, length 64
        0x0000:  9b4a 525e a930 50db 8100 0001 0800 4500  .JR^.0P.......E.
        0x0010:  0054 b895 0000 7d01 1c0e 503a 0021 c0a8  .T....}...P:.!..
        0x0020:  5802 0000 ca1b d603 0002 b5b1 9a49 24e0  X............I$.
        0x0030:  0000 0809 0a0b 0c0d 0e0f 1011 1213 1415  ................
        0x0040:  1617 1819                                ....



This last tcpdump is on the machine doing a ping.   Note the ID and the
time of day looks to be corrupt.

ping -c 2  80.58.0.33
PING 80.58.0.33 (80.58.0.33) 56(84) bytes of data.
64 bytes from 80.58.0.33: icmp_seq=2 ttl=125 time=32156 ms

--- 80.58.0.33 ping statistics ---
2 packets transmitted, 1 received, 50% packet loss, time 1010ms
rtt min/avg/max/mdev = 32156.693/32156.693/32156.693/0.000 ms


tcpdump -i eth0 icmp -vn -XX
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 96
bytes
13:47:16.143541 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto
ICMP (1), length 84) 192.168.88.2 > 80.58.0.33: ICMP echo request, id
54787, seq 1, length 64
        0x0000:  525e a930 50db 0015 c509 9b4a 0800 4500  R^.0P......J..E.
        0x0010:  0054 0000 4000 4001 d1a3 c0a8 5802 503a  .T..@.@.....X.P:
        0x0020:  0021 0800 24cc d603 0001 d4b1 9a49 a130  .!..$........I.0
        0x0030:  0200 0809 0a0b 0c0d 0e0f 1011 1213 1415  ................
        0x0040:  1617 1819 1a1b 1c1d 1e1f 2021 2223 2425  ...........!"#$%
        0x0050:  2627 2829 2a2b 2c2d 2e2f 3031 3233 3435  &'()*+,-./012345
13:47:17.154093 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto
ICMP (1), length 84) 192.168.88.2 > 80.58.0.33: ICMP echo request, id
54787, seq 2, length 64
        0x0000:  525e a930 50db 0015 c509 9b4a 0800 4500  R^.0P......J..E.
        0x0010:  0054 0000 4000 4001 d1a3 c0a8 5802 503a  .T..@.@.....X.P:
        0x0020:  0021 0800 eda1 d603 0002 d5b1 9a49 d759  .!...........I.Y
        0x0030:  0200 0809 0a0b 0c0d 0e0f 1011 1213 1415  ................
        0x0040:  1617 1819 1a1b 1c1d 1e1f 2021 2223 2425  ...........!"#$%
        0x0050:  2627 2829 2a2b 2c2d 2e2f 3031 3233 3435  &'()*+,-./012345
13:47:17.214073 IP (tos 0x0, ttl 125, id 47253, offset 0, flags [none],
proto ICMP (1), length 84) 80.58.0.33 > 192.168.88.2: ICMP echo reply,
id 54787, seq 2, length 64
        0x0000:  0015 c509 9b4a 525e a930 50db 0800 4500  .....JR^.0P...E.
        0x0010:  0054 b895 0000 7d01 1c0e 503a 0021 c0a8  .T....}...P:.!..
        0x0020:  5802 0000 ca1b d603 0002 b5b1 9a49 24e0  X............I$.
        0x0030:  0000 0809 0a0b 0c0d 0e0f 1011 1213 1415  ................
        0x0040:  1617 1819 1a1b 1c1d 1e1f 2021 2223 2425  ...........!"#$%
        0x0050:  2627 2829 2a2b 2c2d 2e2f 3031 3233 3435  &'()*+,-./012345





Thanks,

karl

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: problem with IPoA (CLIP),  NAT, and VLANS
  2009-02-17 12:53         ` Karl Hiramoto
@ 2009-02-17 13:37           ` Jarek Poplawski
  2009-02-17 23:12           ` Jarek Poplawski
  1 sibling, 0 replies; 19+ messages in thread
From: Jarek Poplawski @ 2009-02-17 13:37 UTC (permalink / raw)
  To: Karl Hiramoto; +Cc: netdev, netfilter, LKML

On Tue, Feb 17, 2009 at 01:53:25PM +0100, Karl Hiramoto wrote:
> Jarek Poplawski wrote:
> > On Tue, Feb 17, 2009 at 12:49:07PM +0100, Karl Hiramoto wrote:
> > ...
> >   
> >> A side note:  so far the original patch i sent works in all cases i have
> >> tested, but fails with tcpdump.   I suspect its because the skb gets cloned.

Hmm... I would like to make sure: if tcpdump breaks it too, then
we don't need to blame skb_cow_head(), do we? If it's like this then
it looks like the driver's problem with copied/cloned skbs?! (Maybe
because some "private" offset is overwritten?)

Jarek P.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: problem with IPoA (CLIP),  NAT, and VLANS
  2009-02-17 12:53         ` Karl Hiramoto
  2009-02-17 13:37           ` Jarek Poplawski
@ 2009-02-17 23:12           ` Jarek Poplawski
  2009-02-18 17:47             ` Karl Hiramoto
  1 sibling, 1 reply; 19+ messages in thread
From: Jarek Poplawski @ 2009-02-17 23:12 UTC (permalink / raw)
  To: Karl Hiramoto; +Cc: netdev, netfilter, LKML

Karl Hiramoto wrote, On 02/17/2009 01:53 PM:

> Jarek Poplawski wrote:
>> On Tue, Feb 17, 2009 at 12:49:07PM +0100, Karl Hiramoto wrote:
>> ...
>>   
>>> A side note:  so far the original patch i sent works in all cases i have
>>> tested, but fails with tcpdump.   I suspect its because the skb gets cloned.
>>>     
>> If there is something readable from this tcpdump, it should be helpful
>> to see a packet for working and non-working case during such ping
>> (with -nXX option).
>> Jarek P.
>>   
> 
> Note:  I have the patches i sent applied,  plus the  "skb->mac_header -=
> VLAN_HLEN;"   patch from Jarek on 2.6.28.4
> 
> Doing a tcpdump simultaneously  on the atm and eth0.1 on the linux router.

Nice job. Since tcpdump sees corrupted data, and we don't know if it's
before or after hitting the driver I'd suggest to try full skb_copy()
yet. So could you try if with your patch + the patch below tcpdump
still breaks these things?

BTW, I wonder what IXP400 config options do you use (especially
CONFIG_IXP400_ETH_SKB_RECYCLE)?

Jarek P.

--- patch #3 (for debugging only)

 net/core/dev.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index a17e006..b822a5d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1346,7 +1346,7 @@ static void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev)
 		if ((ptype->dev == dev || !ptype->dev) &&
 		    (ptype->af_packet_priv == NULL ||
 		     (struct sock *)ptype->af_packet_priv != skb->sk)) {
-			struct sk_buff *skb2= skb_clone(skb, GFP_ATOMIC);
+			struct sk_buff *skb2= skb_copy(skb, GFP_ATOMIC);
 			if (!skb2)
 				break;
 

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: problem with IPoA (CLIP),  NAT, and VLANS
  2009-02-17 23:12           ` Jarek Poplawski
@ 2009-02-18 17:47             ` Karl Hiramoto
  2009-02-18 21:05               ` Jarek Poplawski
  0 siblings, 1 reply; 19+ messages in thread
From: Karl Hiramoto @ 2009-02-18 17:47 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: netdev, netfilter, LKML

[-- Attachment #1: Type: text/plain, Size: 2729 bytes --]

Jarek Poplawski wrote:
> Karl Hiramoto wrote, On 02/17/2009 01:53 PM:
>
>   
>> Jarek Poplawski wrote:
>>     
>>> On Tue, Feb 17, 2009 at 12:49:07PM +0100, Karl Hiramoto wrote:
>>> ...
>>>   
>>>       
>>>> A side note:  so far the original patch i sent works in all cases i have
>>>> tested, but fails with tcpdump.   I suspect its because the skb gets cloned.
>>>>     
>>>>         
>>> If there is something readable from this tcpdump, it should be helpful
>>> to see a packet for working and non-working case during such ping
>>> (with -nXX option).
>>> Jarek P.
>>>   
>>>       
>> Note:  I have the patches i sent applied,  plus the  "skb->mac_header -=
>> VLAN_HLEN;"   patch from Jarek on 2.6.28.4
>>
>> Doing a tcpdump simultaneously  on the atm and eth0.1 on the linux router.
>>     
>
> Nice job. Since tcpdump sees corrupted data, and we don't know if it's
> before or after hitting the driver I'd suggest to try full skb_copy()
> yet. So could you try if with your patch + the patch below tcpdump
> still breaks these things?
>
> BTW, I wonder what IXP400 config options do you use (especially
> CONFIG_IXP400_ETH_SKB_RECYCLE)?
>
> Jarek P.

Thanks for the replies.  Jarek, the last debugging patch you sent did
not work.  It did give me a good hint though.  The attached patch in for
AF_PACKET   receive in the when tcpdump is active and which calls
skb_clone() did fix my issue.


CONFIG_IXP400_ETH_SKB_RECYCLE does not exist in the code i have..  From
what i downloaded from intel,  i stripped out all the stuff that is not
having to do with ATM.  The functionalities of ixp4xx_qmgr ixp4xx_npe
and ixp4xx_eth  are now in the mainline kernel.    Ideally it would be
nice to get what this library does with the atm hardware into the
mainline, however the code in it's current state would not meet kernel
standards, and is quite a mess.


But yes,  the skb->data is   recycled  in a memory pool,   and i think i
noticed a few times packets that were corrupt, were really pointing to
old recycled packets.   I haven't confirmed this yet though.


I did eliminate the first  patch i sent 
http://lkml.org/lkml/2009/2/16/163    to  __vlan_put_tag()

And now only use the patch Jarek sent:  http://lkml.org/lkml/2009/2/17/104

Now i don't have any problems with the vlan tags after changing my atm
driver to do skb_reserve() like:

skb = dev_alloc_skb(size + NET_SKB_PAD);

skb_reserve(skb, NET_SKB_PAD);


So something with my driver causes skb_clone() to corrupt the packet
but  calling skb_copy() instead  keeps everything working.   There are
definitely other cases where skb_clone() is called so really have to fix
this in the atm_dev, but not really sure at the moment where to look next.


Thanks.
--
Karl



[-- Attachment #2: tcpdump-fix-hack.patch --]
[-- Type: text/plain, Size: 513 bytes --]

diff -Naurp linux-2.6.28.4.a/net/packet/af_packet.c linux-2.6.28.4.b/net/packet/af_packet.c
--- linux-2.6.28.4.a/net/packet/af_packet.c	2009-02-06 22:47:45.000000000 +0100
+++ linux-2.6.28.4.b/net/packet/af_packet.c	2009-02-18 16:10:08.000000000 +0100
@@ -524,7 +524,7 @@ static int packet_rcv(struct sk_buff *sk
 		goto drop_n_acct;
 
 	if (skb_shared(skb)) {
-		struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC);
+		struct sk_buff *nskb = skb_copy(skb, GFP_ATOMIC);
 		if (nskb == NULL)
 			goto drop_n_acct;
 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: problem with IPoA (CLIP),  NAT, and VLANS
  2009-02-18 17:47             ` Karl Hiramoto
@ 2009-02-18 21:05               ` Jarek Poplawski
  2009-02-19  7:30                 ` Jarek Poplawski
  0 siblings, 1 reply; 19+ messages in thread
From: Jarek Poplawski @ 2009-02-18 21:05 UTC (permalink / raw)
  To: Karl Hiramoto; +Cc: netdev, netfilter, LKML

On Wed, Feb 18, 2009 at 06:47:57PM +0100, Karl Hiramoto wrote:
...
> Thanks for the replies.  Jarek, the last debugging patch you sent did
> not work.  It did give me a good hint though.  The attached patch in for
> AF_PACKET   receive in the when tcpdump is active and which calls
> skb_clone() did fix my issue.

Yes, good point!

> CONFIG_IXP400_ETH_SKB_RECYCLE does not exist in the code i have..  From
> what i downloaded from intel,  i stripped out all the stuff that is not
> having to do with ATM.  The functionalities of ixp4xx_qmgr ixp4xx_npe
> and ixp4xx_eth  are now in the mainline kernel.    Ideally it would be
> nice to get what this library does with the atm hardware into the
> mainline, however the code in it's current state would not meet kernel
> standards, and is quite a mess.
> 
> 
> But yes,  the skb->data is   recycled  in a memory pool,   and i think i
> noticed a few times packets that were corrupt, were really pointing to
> old recycled packets.   I haven't confirmed this yet though.
> 
> 
> I did eliminate the first  patch i sent 
> http://lkml.org/lkml/2009/2/16/163    to  __vlan_put_tag()
> 
> And now only use the patch Jarek sent:  http://lkml.org/lkml/2009/2/17/104

Yes, this patch looks like formally needed, but I guess currently it
isn't used by any path: otherwise we would know about it earlier.

> 
> Now i don't have any problems with the vlan tags after changing my atm
> driver to do skb_reserve() like:
> 
> skb = dev_alloc_skb(size + NET_SKB_PAD);
> 
> skb_reserve(skb, NET_SKB_PAD);
> 
> 
> So something with my driver causes skb_clone() to corrupt the packet
> but  calling skb_copy() instead  keeps everything working.   There are
> definitely other cases where skb_clone() is called so really have to fix
> this in the atm_dev, but not really sure at the moment where to look next.

Alas I've been mostly interested in verifying your first suspicion of
skb_cow_head() bug, and not so much in this driver ;-) IMHO after your
current findings the driver definitely looks like the main sinner. I'm
glad you found these hacks to make it workable, but I hope you realize
your data could be still corrupted in more or less visible way.

I looked only a bit into ixp400_eth.c without tracking libraries and
there are some rather strange things I didn't found in other drivers
like skb->truesize use. It looks like both skb and skb->header could
be used together for this recycling without respect for clones. If
so, this could still break in many places e.g.: if it affected you in
__vlan_put_tag() it seems this dev_queue_xmit_nit() could hit you too,
depending on your config or even size of packets. So I guess, knowing
this all, you should better try to hack this driver more yet.

Cheers,
Jarek P.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: problem with IPoA (CLIP),  NAT, and VLANS
  2009-02-18 21:05               ` Jarek Poplawski
@ 2009-02-19  7:30                 ` Jarek Poplawski
  0 siblings, 0 replies; 19+ messages in thread
From: Jarek Poplawski @ 2009-02-19  7:30 UTC (permalink / raw)
  To: Karl Hiramoto; +Cc: netdev, netfilter, LKML

On 18-02-2009 22:05, Jarek Poplawski wrote:
...
> depending on your config or even size of packets. So I guess, knowing
> this all, you should better try to hack this driver more yet.

...And the first step could be trying skb_copy() in the driver's
->hard_start_xmit() callback (of course, if you prefer data safety
over perfomance).

Jarek P.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] Re: problem with IPoA (CLIP), NAT, and VLANS
  2009-02-17  9:39       ` [PATCH v2] " Jarek Poplawski
  2009-02-17 11:05         ` Karl Hiramoto
@ 2009-02-19  7:31         ` David Miller
  1 sibling, 0 replies; 19+ messages in thread
From: David Miller @ 2009-02-19  7:31 UTC (permalink / raw)
  To: jarkao2; +Cc: kaber, karl, netdev, netfilter, linux-kernel

From: Jarek Poplawski <jarkao2@gmail.com>
Date: Tue, 17 Feb 2009 09:39:08 +0000

> vlan: Update skb->mac_header in __vlan_put_tag().
> 
> After moving mac addresses in __vlan_put_tag() skb->mac_header needs
> to be updated.
> 
> Reported-by: Karl Hiramoto <karl@hiramoto.org>
> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>

Patch applied, this is correct even though it doesn't fix the bug.

I'm pretty sure that packet corruption is a driver error of
some kind rather than some kind of generic problem.

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2009-02-19  7:32 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-02-12 13:28 problem with IPoA (CLIP), NAT, and VLANS Karl Hiramoto
2009-02-16 15:02 ` Karl Hiramoto
2009-02-16 23:20   ` Jarek Poplawski
2009-02-17  9:03     ` Patrick McHardy
2009-02-17  9:32       ` [PATCH] " Jarek Poplawski
2009-02-17  9:39       ` [PATCH v2] " Jarek Poplawski
2009-02-17 11:05         ` Karl Hiramoto
2009-02-17 11:53           ` Jarek Poplawski
2009-02-19  7:31         ` David Miller
2009-02-17  9:52       ` Jarek Poplawski
2009-02-17 11:49     ` Karl Hiramoto
2009-02-17 12:20       ` Jarek Poplawski
2009-02-17 12:53         ` Karl Hiramoto
2009-02-17 13:37           ` Jarek Poplawski
2009-02-17 23:12           ` Jarek Poplawski
2009-02-18 17:47             ` Karl Hiramoto
2009-02-18 21:05               ` Jarek Poplawski
2009-02-19  7:30                 ` Jarek Poplawski
2009-02-17 12:28       ` Patrick McHardy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).