linux-kernel.vger.kernel.org archive mirror
* 2.6.9 network regression killing amanda - 3c59x?
@ 2004-10-20 19:12 Matthias Andree
From: Matthias Andree @ 2004-10-20 19:12 UTC
  To: Linux-Kernel mailing list, linux-net

[Note: I'm not subscribed to linux-net, so please Cc: me on replies.]

Greetings,

After upgrading from 2.6.8.1 to 2.6.9, my Amanda server no longer
receives backups from its clients (except the one on its own loopback
interface), although estimates succeed (tcpdump excerpt below).

I suspect network driver trouble: the server never sees the sendbackup
packets from remote clients, and under 2.6.9 tcpdump logs a bad UDP
checksum (which, I believe, causes Linux to discard the packet). Under
2.6.8.1, 2.6.7 and SuSE's 2.6.5-7.108-default, tcpdump logs "udp sum ok"
and backups work.

Hardware: server uses amanda 2.4.4p2 and these cards
eth0 - 3Com 3c900 Combo (Boomerang) with 10Base2 Coax/BNC to client A
eth1 - 3Com 3c905 100BaseTX (Boomerang) with 10BaseT HD to DSL modem
eth2 - 3Com 3c905 100BaseTX (Boomerang) with 100BaseTX FD to client B
eth3 - VIA VT6102 (Rhine-II), unconnected

br0 bridges eth0 and eth2, with net.bridge.bridge-nf-call-iptables = 0.

client A uses a 3Com 3C900B-Combo and Linux 2.6.5
client B uses an Intel 82550 Pro/100 Ethernet and FreeBSD 4.10-RELEASE-p3


Has the 3c59x driver changed between 2.6.8.1 and 2.6.9?

Which patches or changesets are worth backing out?

Any directions for debugging?


Here are tcpdumps taken on the server with -ibr0. The client is connected
to eth0, the "Combo" card, and was unchanged between the server
reboots:

Linux 2.6.9:

20:11:47.529071 IP (tos 0x0, ttl  64, id 27, offset 0, flags [DF], length: 153) 192.168.0.48.10080 > 192.168.0.1.985: [bad udp cksum a00!] UDP, length: 125
0x0000   4500 0099 001b 4000 4011 b8b7 c0a8 0030        E.....@.@......0
0x0010   c0a8 0001 2760 03d9 0085 b9ec 416d 616e        ....'`......Aman
0x0020   6461 2032 2e34 2052 4550 2048 414e 444c        da.2.4.REP.HANDL
0x0030   4520 3030 302d 4630 4241 3038 3038 2053        E.000-F0BA0808.S
0x0040   4551 2031 3039 3832 3935 3634 360a 434f        EQ.1098295646.CO
0x0050   4e4e 4543 5420 4441 5441 2033 3238 3830        NNECT.DATA.32880
0x0060   204d 4553 4720 3332 3838 3120 494e 4445        .MESG.32881.INDE
0x0070   5820 3332 3838 3200 4f50 5449 4f4e 5320        X.32882.OPTIONS.
0x0080   6665 6174 7572 6573 3d66 6666 6666 6566        features=fffffef
0x0090   6639 6666 6530 663b 0a                         f9ffe0f;.

Linux 2.6.8.1:

20:48:25.880812 IP (tos 0x0, ttl  64, id 51, offset 0, flags [DF], length: 153) 192.168.0.48.10080 > 192.168.0.1.706 [udp sum ok] UDP, length: 125
0x0000   4500 0099 0033 4000 4011 b89f c0a8 0030        E....3@.@......0
0x0010   c0a8 0001 2760 02c2 0085 b90c 416d 616e        ....'`......Aman
0x0020   6461 2032 2e34 2052 4550 2048 414e 444c        da.2.4.REP.HANDL
0x0030   4520 3030 302d 4630 4241 3038 3038 2053        E.000-F0BA0808.S
0x0040   4551 2031 3039 3832 3937 3933 340a 434f        EQ.1098297934.CO
0x0050   4e4e 4543 5420 4441 5441 2033 3239 3133        NNECT.DATA.32913
0x0060   204d 4553 4720 3332 3931 3420 494e 4445        .MESG.32914.INDE
0x0070   5820 3332 3931 350a 4f50 5449 4f4e 5320        X.32915.OPTIONS.
0x0080   6665 6174 7572 6573 3d66 6666 6666 6566        features=fffffef
0x0090   6639 6666 6530 663b 0a                         f9ffe0f;.

-- 
Matthias Andree


* Re: 2.6.9 network regression killing amanda - 3c59x?
@ 2004-10-20 21:24 ` Andrew Morton
From: Andrew Morton @ 2004-10-20 21:24 UTC
  To: Matthias Andree; +Cc: linux-kernel, linux-net

Matthias Andree <matthias.andree@gmx.de> wrote:
>
> Has the 3c59x driver changed between 2.6.8.1 and 2.6.9?

Not much, really.  Just vlan support.

> Which patches or changesets are worth backing out?

Try the 2.6.8.1 driver in a 2.6.9 kernel?


* Re: 2.6.9 network regression killing amanda (was: 2.6.9 network regression killing amanda - 3c59x?)
@ 2004-10-20 22:10   ` Matthias Andree
From: Matthias Andree @ 2004-10-20 22:10 UTC
  To: Andrew Morton; +Cc: Matthias Andree, linux-kernel, linux-net

On Wed, 20 Oct 2004, Andrew Morton wrote:

> Matthias Andree <matthias.andree@gmx.de> wrote:
> >
> > Has the 3c59x driver changed between 2.6.8.1 and 2.6.9?
> 
> Not much, really.  Just vlan support.
> 
> > Which patches or changesets are worth backing out?
> 
> Try the 2.6.8.1 driver in a 2.6.9 kernel?

I tried the 2.6.8 3c59x driver (just changed 3c59x.c), problem persists.

I used the Rhine interface (VIA VT6102 Rhine II rev. 78), problem persists.

I removed the bridge and used the Rhine-II interface directly, problem persists.

Apparently the problem is not in the 3c59x or bridge code but somewhere
else.  Here's a tcpdump of eth3, my Rhine-II interface.  Again,
192.168.0.1 is the machine running Linux 2.6.9 with the failing amanda
server; 192.168.0.2 is the FreeBSD 4.10-RELEASE-p3 client.

00:03:00.604639 IP (tos 0x0, ttl  64, id 1, offset 0, flags [DF], length: 145) 192.168.0.1.680 > 192.168.0.2.10080: [udp sum ok] UDP, length: 117
00:03:00.634775 IP (tos 0x0, ttl  64, id 35321, offset 0, flags [none], length: 78) 192.168.0.2.10080 > 192.168.0.1.680: [udp sum ok] UDP, length: 50
00:03:00.637893 IP (tos 0x0, ttl  64, id 35322, offset 0, flags [none], length: 111) 192.168.0.2.10080 > 192.168.0.1.680: [udp sum ok] UDP, length: 83
00:03:00.637977 IP (tos 0x0, ttl  64, id 2, offset 0, flags [DF], length: 78) 192.168.0.1.680 > 192.168.0.2.10080: [udp sum ok] UDP, length: 50
00:03:00.638546 IP (tos 0x0, ttl  64, id 3, offset 0, flags [DF], length: 474) 192.168.0.1.680 > 192.168.0.2.10080: [udp sum ok] UDP, length: 446
00:03:00.667236 IP (tos 0x0, ttl  64, id 35323, offset 0, flags [none], length: 78) 192.168.0.2.10080 > 192.168.0.1.680: [udp sum ok] UDP, length: 50
00:03:23.874517 IP (tos 0x0, ttl  64, id 35324, offset 0, flags [none], length: 172) 192.168.0.2.10080 > 192.168.0.1.680: [udp sum ok] UDP, length: 144
00:03:23.874666 IP (tos 0x0, ttl  64, id 7, offset 0, flags [DF], length: 78) 192.168.0.1.680 > 192.168.0.2.10080: [udp sum ok] UDP, length: 50
00:03:28.476532 IP (tos 0x0, ttl  64, id 0, offset 0, flags [DF], length: 242) 192.168.0.1.683 > 192.168.0.2.10080: [udp sum ok] UDP, length: 214
00:03:28.505893 IP (tos 0x0, ttl  64, id 35325, offset 0, flags [none], length: 78) 192.168.0.2.10080 > 192.168.0.1.683: [udp sum ok] UDP, length: 50
00:03:28.534138 IP (tos 0x0, ttl  64, id 35326, offset 0, flags [none], length: 150) 192.168.0.2.10080 > 192.168.0.1.683: [bad udp cksum a!] UDP, length: 122
00:03:38.542777 IP (tos 0x0, ttl  64, id 35327, offset 0, flags [none], length: 150) 192.168.0.2.10080 > 192.168.0.1.683: [bad udp cksum a!] UDP, length: 122

I can provide hex dumps if desired.
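
For longer captures, the failing packets can be picked out mechanically. The following is a minimal sketch in Python, assuming the verbose tcpdump line format shown in the listing above; the regular expression and the function name bad_checksum_packets are made up for this illustration and may need adjusting for other tcpdump versions.

```python
import re

# Two lines copied from the eth3 capture above; the others are omitted.
SAMPLE = """\
00:03:28.505893 IP (tos 0x0, ttl  64, id 35325, offset 0, flags [none], length: 78) 192.168.0.2.10080 > 192.168.0.1.683: [udp sum ok] UDP, length: 50
00:03:28.534138 IP (tos 0x0, ttl  64, id 35326, offset 0, flags [none], length: 150) 192.168.0.2.10080 > 192.168.0.1.683: [bad udp cksum a!] UDP, length: 122
"""

# Assumed line layout: time, IP options in parentheses, "src > dst",
# a bracketed checksum verdict, then the UDP payload length.
LINE_RE = re.compile(
    r"^(?P<time>\S+) IP .*?\) (?P<src>\S+) > (?P<dst>[^\s:]+):? "
    r"\[(?P<status>udp sum ok|bad udp cksum [0-9a-f]+!)\] "
    r"UDP, length: (?P<length>\d+)$"
)


def bad_checksum_packets(text):
    """Return (time, src, dst, udp_length) for every line flagged as bad."""
    hits = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match and match.group("status").startswith("bad"):
            hits.append((match.group("time"), match.group("src"),
                         match.group("dst"), int(match.group("length"))))
    return hits


bad = bad_checksum_packets(SAMPLE)
for entry in bad:
    print(entry)
```

Run against the full listing, this would report only the two 122-byte replies from 192.168.0.2, which are the packets the server never acts on.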

-- 
Matthias Andree


* CULPRIT FOUND (was: 2.6.9 network regression killing amanda)
@ 2004-11-04 12:17     ` Matthias Andree
From: Matthias Andree @ 2004-11-04 12:17 UTC
  To: Andrew Morton, linux-kernel, linux-net

I found the issue that was breaking my amanda dumps.

Using the ip_conntrack_amanda component (either compiled into the kernel
or loaded as a module) corrupts the packets by replacing one LF with a
NUL byte. This invalidates the UDP checksum, causing the packets to be
discarded.
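
The substitution can be checked against the 2.6.9 capture from the first message of this thread. The sketch below is an illustration in Python, not part of the patch: it recomputes the UDP checksum over the IPv4 pseudo-header and the datagram, then restores the single NUL in the payload to LF and checks again. The helper names (ones_complement_sum, udp_checksum_ok) are made up for this example, and it assumes the only NUL in the payload is the byte that was overwritten.

```python
import struct

# Hex words copied from the 2.6.9 capture (ASCII column dropped).
CAPTURE_2_6_9 = """
4500 0099 001b 4000 4011 b8b7 c0a8 0030
c0a8 0001 2760 03d9 0085 b9ec 416d 616e
6461 2032 2e34 2052 4550 2048 414e 444c
4520 3030 302d 4630 4241 3038 3038 2053
4551 2031 3039 3832 3935 3634 360a 434f
4e4e 4543 5420 4441 5441 2033 3238 3830
204d 4553 4720 3332 3838 3120 494e 4445
5820 3332 3838 3200 4f50 5449 4f4e 5320
6665 6174 7572 6573 3d66 6666 6666 6566
6639 6666 6530 663b 0a
"""


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum, padding odd-length input with a zero byte."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def udp_checksum_ok(packet: bytes) -> bool:
    """Verify the UDP checksum of an IPv4 packet that starts at the IP header."""
    ihl = (packet[0] & 0x0F) * 4          # IP header length in bytes
    udp = packet[ihl:]                    # UDP header plus payload
    # Pseudo-header: source address, destination address, zero, protocol 17, UDP length.
    pseudo = packet[12:20] + struct.pack("!BBH", 0, 17, len(udp))
    # A datagram with a correct checksum sums to 0xFFFF.
    return ones_complement_sum(pseudo + udp) == 0xFFFF


packet = bytes.fromhex("".join(CAPTURE_2_6_9.split()))
ihl = (packet[0] & 0x0F) * 4
header, payload = packet[:ihl + 8], packet[ihl + 8:]

# Assumption: the one NUL in the text payload is the overwritten LF.
repaired = header + payload.replace(b"\x00", b"\n")

print("as captured:", udp_checksum_ok(packet))
print("NUL -> LF:  ", udp_checksum_ok(repaired))
```

With the bytes as captured the check fails, and with the NUL restored to LF it passes, which is consistent with a one-byte substitution made after the sender computed the checksum.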

Workaround: unload ip_conntrack_amanda.

Fix: I've sent a patch in a separate mail, Subject:
[BK PATCH] Fix ip_conntrack_amanda data corruption bug that breaks amanda dumps

-- 
Matthias Andree
