* netdev issues (3c905B)
@ 2001-02-21 0:06 Vibol Hou
  2001-02-21 0:21 ` Martin Moerman
  ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Vibol Hou @ 2001-02-21 0:06 UTC
To: Linux-Kernel

Hi,

I have some problems on a heavily loaded web server. The first is that the
kernel is spitting out a bunch of "NETDEV WATCHDOG: eth0: transmit timed
out" errors. I do not recall this happening in 2.4.0 under the same
conditions.

Another problem that I seem to have, of which I have had reports from
clients, is that the server has problems talking to clients using modems.
This didn't occur before with the 2.2 series kernel (all other things held
constant). It seems each time a client tries to load up any site on the
server, the connection will just die (or stall). This does not apply to
high-bandwidth connections (DSL and up) since everything seems fine on DSL
and faster, but I tried connecting using my dial-up account with Earthlink,
and the reports seem to be true. Can those of you on a 56k modem try
connecting to http://khmerconnection.com and see if the page loads? Apache
isn't the only service affected. It seems *any* TCP communication runs like
a turtle (even SSH: it takes minutes to log in, then minutes to echo each
letter; it doesn't do this on a DSL connection from the same computer).

The card that is exhibiting this problem is a 3c905B (lspci below):

00:08.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 30)
        Subsystem: 3Com Corporation: Unknown device 9055
        Flags: bus master, medium devsel, latency 80, IRQ 17
        I/O ports at e400 [size=128]
        Memory at e8001000 (32-bit, non-prefetchable) [size=128]
        Expansion ROM at e4000000 [disabled] [size=128K]
        Capabilities: [dc] Power Management version 1

dmesg shows hordes of these at high peak usage (300KBps+):

NETDEV WATCHDOG: eth0: transmit timed out
eth0: transmit timed out, tx_status 00 status e601.
  diagnostics: net 0cd8 media 8880 dma 0000003a.
eth0: Interrupt posted but not delivered -- IRQ blocked by another device?
  Flags; bus-master 1, full 0; dirty 9256291(3) current 9256291(3).
  Transmit list 00000000 vs. f7de5230.
  0: @f7de5200  length 80000042 status 00010042
  1: @f7de5210  length 8000004a status 8001004a
  2: @f7de5220  length 80000036 status 80010036
  3: @f7de5230  length 80000036 status 00010036
  4: @f7de5240  length 80000042 status 00010042
  5: @f7de5250  length 80000036 status 00010036
  6: @f7de5260  length 800005ea status 000105ea
  7: @f7de5270  length 800005ea status 000105ea
  8: @f7de5280  length 8000003a status 0001003a
  9: @f7de5290  length 8000003e status 0001003e
  10: @f7de52a0  length 8000003a status 0001003a
  11: @f7de52b0  length 8000003e status 0001003e
  12: @f7de52c0  length 8000003e status 0001003e
  13: @f7de52d0  length 8000004a status 0001004a
  14: @f7de52e0  length 8000004a status 0001004a
  15: @f7de52f0  length 8000003e status 0001003e
eth0: Resetting the Tx ring pointer.

Any ideas?

Thanks,
--
Vibol Hou

* Re: netdev issues (3c905B)
  2001-02-21 0:06 netdev issues (3c905B) Vibol Hou
@ 2001-02-21 0:21 ` Martin Moerman
  2001-02-21 0:34 ` Vibol Hou
  2001-02-21 9:47 ` 2.4 tcp very slow under certain circumstances (Re: netdev issues (3c905B)) Ookhoi
  2001-02-21 10:57 ` David S. Miller
  2 siblings, 1 reply; 18+ messages in thread
From: Martin Moerman @ 2001-02-21 0:21 UTC
To: Vibol Hou; +Cc: Linux-Kernel

Vibol,

I see that the card is on IRQ 17??? Can you send us /proc/interrupts?

/Martin

On Tue, 20 Feb 2001, Vibol Hou wrote:

[snip]
> The card that is exhibiting this problem is a 3c905B (lspci below):
>
> 00:08.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone]
> (rev 30)
>         Subsystem: 3Com Corporation: Unknown device 9055
>         Flags: bus master, medium devsel, latency 80, IRQ 17
[snip]
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

* RE: netdev issues (3c905B)
  2001-02-21 0:21 ` Martin Moerman
@ 2001-02-21 0:34 ` Vibol Hou
  0 siblings, 0 replies; 18+ messages in thread
From: Vibol Hou @ 2001-02-21 0:34 UTC
To: Martin Moerman; +Cc: Linux-Kernel

Hi Martin,

Here's /proc/interrupts:

           CPU0       CPU1
  0:    2748043    2754927    IO-APIC-edge  timer
  1:          2          0    IO-APIC-edge  keyboard
  2:          0          0          XT-PIC  cascade
  4:       2737       2892    IO-APIC-edge  serial
 17:    9573612    9568840   IO-APIC-level  eth0
 18:     483436     482421   IO-APIC-level  aic7xxx
NMI:    5505505    5505399
LOC:    5502609    5502508
ERR:          0

-Vibol
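
Martin's request for /proc/interrupts is presumably a check on whether eth0 shares its interrupt line with another device, which is what the driver's "IRQ blocked by another device?" message hints at. A minimal sketch of that check in C follows. The helper names `irq_total` and `irq_is_shared` are hypothetical, and the sketch assumes the usual row layout (an "N:" label, one counter per CPU, the controller name, then a comma-separated device list when a line is shared).

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Sum the per-CPU counters on one /proc/interrupts row, for example
 * " 17:    9573612    9568840   IO-APIC-level  eth0".
 * Returns 0 if the row has no "N:" label. (Hypothetical helper.) */
unsigned long irq_total(const char *line)
{
    const char *p = strchr(line, ':');
    unsigned long total = 0;

    if (p == NULL)
        return 0;
    p++;
    for (;;) {
        char *end;

        while (isspace((unsigned char)*p))
            p++;
        if (!isdigit((unsigned char)*p))
            break;                  /* reached the controller name or end */
        total += strtoul(p, &end, 10);
        p = end;
    }
    return total;
}

/* Assumption: devices sharing one IRQ appear as a comma-separated list,
 * so any comma on the row means the line is shared. (Hypothetical helper.) */
int irq_is_shared(const char *line)
{
    return strchr(line, ',') != NULL;
}
```

Applied to the rows above, the eth0 row has no comma, so under this assumption the card has IRQ 17 to itself and the interrupt counts are spread evenly over both CPUs.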

* 2.4 tcp very slow under certain circumstances (Re: netdev issues (3c905B))
  2001-02-21 0:06 netdev issues (3c905B) Vibol Hou
  2001-02-21 0:21 ` Martin Moerman
@ 2001-02-21 9:47 ` Ookhoi
  2001-02-21 13:12 ` Gregory Maxwell
  2001-02-21 10:57 ` David S. Miller
  2 siblings, 1 reply; 18+ messages in thread
From: Ookhoi @ 2001-02-21 9:47 UTC
To: Vibol Hou; +Cc: Linux-Kernel, sim

Hi!

> Another problem that I seem to have, of which I have had reports from
> clients, is that the server has problems talking to clients using modems.
> This didn't occur before with the 2.2 series kernel (all other things held
> constant). It seems each time a client tries to load up any site on the
> server, the connection will just die (or stall). This does not apply to
> high-bandwidth connections (DSL and up) since everything seems fine on DSL
> and faster, but I tried connecting using my dial-up account with Earthlink,
> and the reports seem to be true. Can those of you on a 56k modem try
> connecting to http://khmerconnection.com and see if the page loads? Apache
> isn't the only service affected. It seems *any* TCP communication runs like
> a turtle (even SSH. takes minutes to login, then minutes to echo each
> letter. doesn't do this on a DSL connection from the same computer).
>
> The card that is exhibiting this problem is a 3c905B (lspci below):
[cut]

We have exactly the same problem, but in our case it depends on the
following three conditions: 1, kernel 2.4 (2.2 is fine), 2, windows ip
header compression turned on, 3, a free internet access provider in
Holland called 'Wish' (which seems to stand for 'I Wish I had a faster
connection').
If we remove one of the three conditions, the connection is OK. It is
only tcp which is affected.
A packet on its way from linux server to windows client seems to get
dropped once and retransmitted. This makes the connection _very_ slow.

It seems that Simon has the same problem. Can I provide tcp dumps to help
find the cause of this problem?

Not sure yet whether this is only with 3Com NICs. Will test that.

	Ookhoi

Date: Fri, 16 Feb 2001 20:02:11 -0500
From: Simon Kirby <sim@stormix.com>
To: linux-kernel@vger.kernel.org, davem@redhat.com
Cc: Alan Evetts <alane@netnation.com>
Subject: Re: 2.4 TCP(?) timeouts

On Fri, Feb 16, 2001 at 07:08:05PM -0500, Simon Kirby wrote:

> Hello,
>
> Today we put 2.4.1 on our mail server after having seen it perform well on
> some other boxes. It seems now we are receiving a few calls every hour
> from customers reporting that the server tends to hang and eventually
> time out on them when downloading mail. All customers that have reported
> this problem so far are on a dialup connection. Apparently the server
> will stop transmitting data (or the client seems to think so), and then
> their mail client will time out.

We recorded a trace on the mail server end to one of the customers having
the problem. At first they closed the connection because their mail
client was set to a timeout of 1 minute, but then when they changed it to
5 seconds, it seemed to limp along further.

It seems to me just like there's a huge amount of packet loss, but
pinging the machine just after this shows 0% loss (just occasional jumps
in response time). During this trace, when long periods of nothing went
by, "netstat -tan |grep ip" showed nothing abnormal: a 0 byte receive
queue and some data in the send queue equal to what would be
retransmitted and eventually go through two minutes later.

nmap: Remote operating system guess: Windows 2000 Professional, Build 2128

16:26:14.738836 < client.1104 > mail.pop3: S 1263956200:1263956200(0) win 8760 <mss 536,nop,nop,sackOK> (DF)
16:26:14.738888 > mail.pop3 > client.1104: S 26894293:26894293(0) ack 1263956201 win 5840 <mss 1460,nop,nop,sackOK> (DF)
16:26:15.014145 < client.1104 > mail.pop3: . 1:1(0) ack 1 win 9112 (DF)
16:26:15.014866 > mail.pop3 > client.1104: P 1:92(91) ack 1 win 5840 (DF)
16:26:15.291998 < client.1104 > mail.pop3: P 1:16(15) ack 92 win 9021 (DF)
16:26:15.292199 > mail.pop3 > client.1104: . 92:92(0) ack 16 win 5840 (DF)
16:26:15.292305 > mail.pop3 > client.1104: P 92:115(23) ack 16 win 5840 (DF)
16:26:16.686295 > mail.pop3 > client.1104: P 92:115(23) ack 16 win 5840 (DF)
16:26:16.954563 < client.1104 > mail.pop3: P 16:30(14) ack 115 win 8998 (DF)
16:26:16.976908 > mail.pop3 > client.1104: P 115:137(22) ack 30 win 5840 (DF)
16:26:19.776322 > mail.pop3 > client.1104: P 115:137(22) ack 30 win 5840 (DF)
16:26:20.033951 < client.1104 > mail.pop3: P 30:36(6) ack 137 win 8976 (DF)
16:26:20.034063 > mail.pop3 > client.1104: P 137:149(12) ack 36 win 5840 (DF)
16:26:25.626301 > mail.pop3 > client.1104: P 137:149(12) ack 36 win 5840 (DF)
16:26:25.922151 < client.1104 > mail.pop3: P 36:42(6) ack 149 win 8964 (DF)
16:26:25.922254 > mail.pop3 > client.1104: P 149:219(70) ack 42 win 5840 (DF)
16:26:36.949499 < client.1104 > mail.pop3: P 36:42(6) ack 149 win 8964 (DF)
16:26:36.949533 > mail.pop3 > client.1104: . 219:219(0) ack 42 win 5840 <nop,nop, sack 1 {36:42} > (DF)
16:26:37.116302 > mail.pop3 > client.1104: P 149:219(70) ack 42 win 5840 (DF)
16:26:37.380554 < client.1104 > mail.pop3: P 42:50(8) ack 219 win 8894 (DF)
16:26:37.380645 > mail.pop3 > client.1104: . 219:219(0) ack 50 win 5840 (DF)
16:26:37.380709 > mail.pop3 > client.1104: P 219:231(12) ack 50 win 5840 (DF)
16:26:59.567440 < client.1104 > mail.pop3: P 42:50(8) ack 219 win 8894 (DF)
16:26:59.567476 > mail.pop3 > client.1104: . 231:231(0) ack 50 win 5840 <nop,nop, sack 1 {42:50} > (DF)
16:26:59.776301 > mail.pop3 > client.1104: P 219:231(12) ack 50 win 5840 (DF)
16:27:00.043125 < client.1104 > mail.pop3: P 50:59(9) ack 231 win 8882 (DF)
16:27:00.043186 > mail.pop3 > client.1104: . 231:231(0) ack 59 win 5840 (DF)
16:27:00.043475 > mail.pop3 > client.1104: . 231:767(536) ack 59 win 5840 (DF)
16:27:00.043491 > mail.pop3 > client.1104: P 767:1220(453) ack 59 win 5840 (DF)
16:27:44.399831 < client.1104 > mail.pop3: P 50:59(9) ack 231 win 8882 (DF)
16:27:44.399869 > mail.pop3 > client.1104: . 1220:1220(0) ack 59 win 5840 <nop,nop, sack 1 {50:59} > (DF)
16:27:44.836304 > mail.pop3 > client.1104: . 231:767(536) ack 59 win 5840 (DF)
16:27:45.295946 < client.1104 > mail.pop3: . 59:59(0) ack 767 win 9112 (DF)
16:27:45.296003 > mail.pop3 > client.1104: P 767:1220(453) ack 59 win 5840 (DF)
16:29:14.886322 > mail.pop3 > client.1104: P 767:1220(453) ack 59 win 5840 (DF)
16:29:15.264417 < client.1104 > mail.pop3: P 59:67(8) ack 1220 win 8659 (DF)
16:29:15.264479 > mail.pop3 > client.1104: . 1220:1220(0) ack 67 win 5840 (DF)
16:29:15.265127 > mail.pop3 > client.1104: . 1220:1756(536) ack 67 win 5840 (DF)
16:29:15.265145 > mail.pop3 > client.1104: . 1756:2292(536) ack 67 win 5840 (DF)
16:30:45.187652 < client.1104 > mail.pop3: P 59:67(8) ack 1220 win 8659 (DF)
16:30:45.187727 > mail.pop3 > client.1104: . 2292:2292(0) ack 67 win 5840 <nop,nop, sack 1 {59:67} > (DF)
16:31:16.326378 > mail.pop3 > client.1104: . 1220:1756(536) ack 67 win 5840 (DF)
16:31:17.513053 < client.1104 > mail.pop3: . 67:67(0) ack 1756 win 9112 (DF)
16:31:17.513129 > mail.pop3 > client.1104: . 1756:2292(536) ack 67 win 5840 (DF)
16:31:17.513143 > mail.pop3 > client.1104: . 2292:2828(536) ack 67 win 5840 (DF)
16:33:17.506376 > mail.pop3 > client.1104: . 1756:2292(536) ack 67 win 5840 (DF)
16:33:17.919146 < client.1104 > mail.pop3: . 67:67(0) ack 2292 win 9112 (DF)
16:33:17.919198 > mail.pop3 > client.1104: . 2292:2828(536) ack 67 win 5840 (DF)
16:33:17.919211 > mail.pop3 > client.1104: . 2828:3364(536) ack 67 win 5840 (DF)
16:35:17.916383 > mail.pop3 > client.1104: . 2292:2828(536) ack 67 win 5840 (DF)
16:35:18.401250 < client.1104 > mail.pop3: . 67:67(0) ack 2828 win 9112 (DF)
16:35:18.401394 > mail.pop3 > client.1104: . 2828:3364(536) ack 67 win 5840 (DF)
16:35:18.401414 > mail.pop3 > client.1104: . 3364:3900(536) ack 67 win 5840 (DF)
16:37:18.396373 > mail.pop3 > client.1104: . 2828:3364(536) ack 67 win 5840 (DF)
16:37:21.763859 < client.1104 > mail.pop3: . 67:67(0) ack 3364 win 9112 (DF)
16:37:21.764049 > mail.pop3 > client.1104: . 3364:3900(536) ack 67 win 5840 (DF)
16:37:21.764062 > mail.pop3 > client.1104: . 3900:4436(536) ack 67 win 5840 (DF)
16:42:22.308578 < client.1104 > mail.pop3: F 67:67(0) ack 3364 win 9112 (DF)
16:42:22.308625 > mail.pop3 > client.1104: R 26897657:26897657(0) win 0 (DF)

I'm not sure how the last part happened, but I'm guessing the server was
waiting for its next transmit to report that it had already closed the
connection, and the RST was sent out as a response to the socket already
being closed locally when the customer eventually closed the connection.

Would any of the networking changes in 2.4.1pre3 affect what is happening
here?

Simon-

[  Stormix Technologies Inc.  ][  NetNation Communications Inc. ]
[       sim@stormix.com       ][       sim@netnation.com        ]
[ Opinions expressed are not necessarily those of my employers. ]
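
Reading the timestamps in Simon's trace, the gap between the first copy of a server segment and its retransmission is about 1.4 s, then 2.8 s, 5.6 s, 11.2 s, 22.4 s, 44.8 s and 89.6 s, after which it settles at roughly 120 s. That is consistent with a retransmission timer that doubles after every timeout up to a ceiling, which would explain why the "send queue" data goes through "two minutes later". The sketch below only reproduces that arithmetic. The function name `rto_backoff_ms` is hypothetical, and the 1400 ms start and 120000 ms ceiling are values read off the trace, not taken from the kernel source.

```c
#include <stddef.h>

/* Fill out[0..n-1] with successive retransmission timeouts in
 * milliseconds: start at initial_ms, double after each timeout,
 * and never exceed cap_ms. (Hypothetical helper for illustration.) */
void rto_backoff_ms(unsigned long initial_ms, unsigned long cap_ms,
                    unsigned long *out, size_t n)
{
    unsigned long rto = initial_ms > cap_ms ? cap_ms : initial_ms;
    size_t i;

    for (i = 0; i < n; i++) {
        out[i] = rto;
        /* Compare against cap_ms / 2 so the doubling cannot overshoot. */
        rto = (rto > cap_ms / 2) ? cap_ms : rto * 2;
    }
}
```

Calling `rto_backoff_ms(1400, 120000, out, 9)` fills `out` with 1400, 2800, 5600, 11200, 22400, 44800, 89600, 120000, 120000, which lines up with the gaps in the trace to within a few milliseconds. The timer never drops back because each client acknowledgement only arrives after a retransmission, so the server never gets a clean round-trip sample.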

* Re: 2.4 tcp very slow under certain circumstances (Re: netdev issues (3c905B))
  2001-02-21 9:47 ` 2.4 tcp very slow under certain circumstances (Re: netdev issues (3c905B)) Ookhoi
@ 2001-02-21 13:12 ` Gregory Maxwell
  0 siblings, 0 replies; 18+ messages in thread
From: Gregory Maxwell @ 2001-02-21 13:12 UTC
To: Ookhoi; +Cc: Vibol Hou, Linux-Kernel, sim

On Wed, Feb 21, 2001 at 10:47:24AM +0100, Ookhoi wrote:
[snip]
> We have exactly the same problem but in our case it depends on the
> following three conditions: 1, kernel 2.4 (2.2 is fine), 2, windows ip
> header compression turned on, 3, a free internet access provider in
> Holland called 'Wish' (which seems to stand for 'I Wish I had a faster
> connection').
> If we remove one of the three conditions, the connection is OK. It is
> only tcp which is affected.
> A packet on its way from linux server to windows client seems to get
> dropped once and retransmitted. This makes the connection _very_ slow.
[snip]

It's been true for some time now that there are several firewalls, RAS,
and NAT devices that break TCP connections in subtle but horrible ways
when they encounter SACK, timestamps, header compression, or other
'exotic' features.

Has anyone compiled a list of such bugs so that a test application could be
created?

* Re: 2.4 tcp very slow under certain circumstances (Re: netdev issues (3c905B))
  2001-02-21 0:06 netdev issues (3c905B) Vibol Hou
  2001-02-21 0:21 ` Martin Moerman
  2001-02-21 9:47 ` 2.4 tcp very slow under certain circumstances (Re: netdev issues (3c905B)) Ookhoi
@ 2001-02-21 10:57 ` David S. Miller
  2001-02-21 11:33 ` Ookhoi
  ` (6 more replies)
  2 siblings, 7 replies; 18+ messages in thread
From: David S. Miller @ 2001-02-21 10:57 UTC
To: ookhoi; +Cc: Vibol Hou, Linux-Kernel, sim

Ookhoi writes:
 > We have exactly the same problem but in our case it depends on the
 > following three conditions: 1, kernel 2.4 (2.2 is fine), 2, windows ip
 > header compression turned on, 3, a free internet access provider in
 > Holland called 'Wish' (which seems to stand for 'I Wish I had a faster
 > connection').
 > If we remove one of the three conditions, the connection is OK. It is
 > only tcp which is affected.
 > A packet on its way from linux server to windows client seems to get
 > dropped once and retransmitted. This makes the connection _very_ slow.

:-( I hate these buggy systems.

Does this patch below fix the performance problem and are the windows
clients win2000 or win95?

--- include/net/ip.h.~1~	Mon Feb 19 00:12:31 2001
+++ include/net/ip.h	Wed Feb 21 02:56:15 2001
@@ -190,9 +190,11 @@
 
 static inline void ip_select_ident(struct iphdr *iph, struct dst_entry *dst)
 {
+#if 0
 	if (iph->frag_off&__constant_htons(IP_DF))
 		iph->id = 0;
 	else
+#endif
 		__ip_select_ident(iph, dst);
 }
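
To make the effect of the patch concrete: before it, a packet with the "don't fragment" bit set goes out with an IP ID of 0, and with the `#if 0` in place every packet gets an ID from `__ip_select_ident()` instead. The user-space sketch below models only that difference. The function names and the plain incrementing counter are assumptions for illustration (the real `__ip_select_ident()` is not shown in the thread), and the flag is tested in host byte order here rather than with `__constant_htons()`. The thread does not say why an ID of zero upsets the affected clients, so the sketch makes no claim about that.

```c
#include <stdint.h>

#define IP_DF 0x4000  /* "don't fragment" flag, host byte order in this model */

/* Behaviour before the patch: DF packets always carry ID 0, other
 * packets take the next value from the counter. */
uint16_t select_ident_unpatched(uint16_t frag_off, uint16_t *counter)
{
    if (frag_off & IP_DF)
        return 0;
    return (*counter)++;
}

/* Behaviour with the "#if 0" applied: the DF test is compiled out, so
 * every packet takes the next counter value. The counter is a
 * hypothetical stand-in for whatever __ip_select_ident() really does. */
uint16_t select_ident_patched(uint16_t frag_off, uint16_t *counter)
{
    (void)frag_off;
    return (*counter)++;
}
```

In the traces earlier in the thread every server packet is marked (DF), so under this model the unpatched path would stamp all of them with ID 0, while the patched path gives consecutive packets distinct IDs.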

* Re: 2.4 tcp very slow under certain circumstances (Re: netdev issues (3c905B))
  2001-02-21 10:57 ` David S. Miller
@ 2001-02-21 11:33 ` Ookhoi
  2001-02-21 17:17 ` Ookhoi
  ` (5 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Ookhoi @ 2001-02-21 11:33 UTC
To: David S. Miller; +Cc: Vibol Hou, Linux-Kernel, sim

Hi David,

> > We have exactly the same problem but in our case it depends on the
> > following three conditions: 1, kernel 2.4 (2.2 is fine), 2, windows ip
> > header compression turned on, 3, a free internet access provider in
> > Holland called 'Wish' (which seems to stand for 'I Wish I had a faster
> > connection').
> > If we remove one of the three conditions, the connection is OK. It is
> > only tcp which is affected.
> > A packet on its way from linux server to windows client seems to get
> > dropped once and retransmitted. This makes the connection _very_ slow.
>
> :-( I hate these buggy systems.
>
> Does this patch below fix the performance problem and are the windows
> clients win2000 or win95?

It is 95 in our case. I'll test the patch today and report back to you.
Thanks a lot!

	Ookhoi

* Re: 2.4 tcp very slow under certain circumstances (Re: netdev issues (3c905B))
  2001-02-21 10:57 ` David S. Miller
  2001-02-21 11:33 ` Ookhoi
@ 2001-02-21 17:17 ` Ookhoi
  2001-02-21 19:06 ` Vibol Hou
  ` (4 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Ookhoi @ 2001-02-21 17:17 UTC
To: David S. Miller; +Cc: Vibol Hou, Linux-Kernel, sim

Hi David!

> > We have exactly the same problem but in our case it depends on the
> > following three conditions: 1, kernel 2.4 (2.2 is fine), 2, windows ip
> > header compression turned on, 3, a free internet access provider in
> > Holland called 'Wish' (which seems to stand for 'I Wish I had a faster
> > connection').
> > If we remove one of the three conditions, the connection is OK. It is
> > only tcp which is affected.
> > A packet on its way from linux server to windows client seems to get
> > dropped once and retransmitted. This makes the connection _very_ slow.
>
> :-( I hate these buggy systems.
>
> Does this patch below fix the performance problem and are the windows
> clients win2000 or win95?

Yes, the problem is fixed! Thank you very much. :-) 'great' patch!

	Ookhoi

* RE: 2.4 tcp very slow under certain circumstances (Re: netdev issues (3c905B))
  2001-02-21 10:57 ` David S. Miller
  2001-02-21 11:33 ` Ookhoi
  2001-02-21 17:17 ` Ookhoi
@ 2001-02-21 19:06 ` Vibol Hou
  2001-02-21 19:22 ` Vibol Hou
  ` (3 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Vibol Hou @ 2001-02-21 19:06 UTC
To: David S. Miller, ookhoi; +Cc: Linux-Kernel, sim

Win2K here. I'll apply the patch and let you know what happens.

-Vibol

* RE: 2.4 tcp very slow under certain circumstances (Re: netdev issues (3c905B))
  2001-02-21 10:57 ` David S. Miller
  ` (2 preceding siblings ...)
  2001-02-21 19:06 ` Vibol Hou
@ 2001-02-21 19:22 ` Vibol Hou
  2001-02-21 22:30 ` Jordan Mendelson
  ` (2 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Vibol Hou @ 2001-02-21 19:22 UTC
To: David S. Miller, ookhoi; +Cc: Linux-Kernel, sim

It looks like the patch fixed the problem. TCP communication over modem
seems fine now with the same settings that didn't work earlier.

-Vibol

* Re: 2.4 tcp very slow under certain circumstances (Re: netdev issues (3c905B))
  2001-02-21 10:57 ` David S. Miller
  ` (3 preceding siblings ...)
  2001-02-21 19:22 ` Vibol Hou
@ 2001-02-21 22:30 ` Jordan Mendelson
  2001-02-22 8:28 ` Ookhoi
  2001-02-21 23:49 ` Jordan Mendelson
  2001-02-21 23:52 ` David S. Miller
  6 siblings, 1 reply; 18+ messages in thread
From: Jordan Mendelson @ 2001-02-21 22:30 UTC
To: David S. Miller; +Cc: ookhoi, Vibol Hou, Linux-Kernel, sim

"David S. Miller" wrote:
>
> Ookhoi writes:
>  > We have exactly the same problem but in our case it depends on the
>  > following three conditions: 1, kernel 2.4 (2.2 is fine), 2, windows ip
>  > header compression turned on, 3, a free internet access provider in
>  > Holland called 'Wish' (which seems to stand for 'I Wish I had a faster
>  > connection').
>  > If we remove one of the three conditions, the connection is OK. It is
>  > only tcp which is affected.
>  > A packet on its way from linux server to windows client seems to get
>  > dropped once and retransmitted. This makes the connection _very_ slow.
>
> :-( I hate these buggy systems.
>
> Does this patch below fix the performance problem and are the windows
> clients win2000 or win95?

I wanted to see if this would fix the problem I was seeing with Win9x
users on PPP w/ compression dialing up to Earthlink in the bay area
(there are others, but it's the only one I can reproduce).

I compiled 2.4.1 with this change and for some odd reason, the kernel
started dropping packets and became unusable (couldn't ssh in) after
around 4050 connections were opened. I tested it also with 2.4.1-ac20
and had the same problem right around 4050 connections.

This is on a VA Linux box with dual eepro100's (one used) connected to a
Cisco 6509.

* Re: 2.4 tcp very slow under certain circumstances (Re: netdev issues (3c905B))
From: Ookhoi @ 2001-02-22 8:28 UTC
To: Jordan Mendelson; +Cc: David S. Miller, Vibol Hou, Linux-Kernel, sim

Hi Jordan,

> >  > We have exactly the same problem but in our case it depends on the
> >  > following three conditions: 1, kernel 2.4 (2.2 is fine), 2, windows ip
> >  > header compression turned on, 3, a free internet access provider in
> >  > Holland called 'Wish' (which seems to stand for 'I Wish I had a faster
> >  > connection').
> >  > If we remove one of the three conditions, the connection is OK. It is
> >  > only tcp which is affected.
> >  > A packet on its way from linux server to windows client seems to get
> >  > dropped once and retransmitted. This makes the connection _very_ slow.
> >
> > :-( I hate these buggy systems.
> >
> > Does this patch below fix the performance problem and are the windows
> > clients win2000 or win95?
>
> I wanted to see if this would fix the problem I was seeing with Win9x
> users on PPP w/ compression dialing up to Earthlink in the bay area
> (there are others, but it's the only one I can reproduce).
>
> I compiled 2.4.1 with this change and for some odd reason, the kernel
> started dropping packets and became unusable (couldn't ssh in) after
> around 4050 connections were opened. I tested it also with 2.4.1-ac20
> and had the same problem right around 4050 connections.
>
> This is on a VA Linux box with dual eepro100's (one used) connected to a
> Cisco 6509.

I patched two computers, both running 2.4.1-ac20. One of them is a fairly
loaded webserver. They have uptimes of 15.15 and 16.30 hours, and are fine.
I didn't test with that many connections, though.

	Ookhoi

* Re: 2.4 tcp very slow under certain circumstances (Re: netdev issues (3c905B))
From: Jordan Mendelson @ 2001-02-21 23:49 UTC
To: David S. Miller; +Cc: ookhoi, Vibol Hou, Linux-Kernel, sim, netdev

"David S. Miller" wrote:
>
> Ookhoi writes:
>  > We have exactly the same problem but in our case it depends on the
>  > following three conditions: 1, kernel 2.4 (2.2 is fine), 2, windows ip
>  > header compression turned on, 3, a free internet access provider in
>  > Holland called 'Wish' (which seems to stand for 'I Wish I had a faster
>  > connection').
>  > If we remove one of the three conditions, the connection is OK. It is
>  > only tcp which is affected.
>  > A packet on its way from linux server to windows client seems to get
>  > dropped once and retransmitted. This makes the connection _very_ slow.
>
> :-( I hate these buggy systems.
>
> Does this patch below fix the performance problem and are the windows
> clients win2000 or win95?

Just a note however... this patch did fix the problem we were seeing
with retransmits and Win95 compressed PPP and dialup over earthlink in
the bay area.

Now, if it didn't have the side effect of dropping packets left and
right after ~4000 open connections (simultaneously), I could finally
move our production system to 2.4.x.

Jordan

* Re: 2.4 tcp very slow under certain circumstances (Re: netdev issues (3c905B))
From: David S. Miller @ 2001-02-21 23:52 UTC
To: Jordan Mendelson; +Cc: ookhoi, Vibol Hou, Linux-Kernel, sim, netdev

Jordan Mendelson writes:
 > Now, if it didn't have the side effect of dropping packets left and
 > right after ~4000 open connections (simultaneously), I could finally
 > move our production system to 2.4.x.

There is no reason my patch should have this effect.

All of this is what appears to be a bug in Windows TCP header
compression, if the ID field of the IPv4 header does not change then
it drops every other packet.

The change I posted as-is, is unacceptable because it adds unnecessary
cost to a fast path.  The final change I actually use will likely
involve using the TCP sequence numbers to calculate an "always
changing" ID number in the IPv4 headers to placate these broken
windows machines.

Later,
David S. Miller
davem@redhat.com
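
The idea in the message above, deriving an "always changing" IPv4 ID from the TCP sequence number, can be illustrated with a small user-space sketch. This is only a model under stated assumptions, not the change that went into the kernel: `struct model_iphdr`, `ident_from_seq()` and `model_set_ident()` are hypothetical names invented for this example, and the fold of the 32-bit sequence number into 16 bits is one arbitrary choice among many.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct iphdr; only the ID field matters here. */
struct model_iphdr {
    uint16_t id;
};

/* Hypothetical helper: fold a 32-bit TCP sequence number into 16 bits
 * by XOR-ing the high and low halves. */
static uint16_t ident_from_seq(uint32_t seq)
{
    return (uint16_t)((seq >> 16) ^ (seq & 0xffffu));
}

/* Hypothetical helper: stamp a header with an ID derived from the
 * sequence number of the segment it carries. */
static void model_set_ident(struct model_iphdr *iph, uint32_t seq)
{
    assert(iph != NULL);
    iph->id = ident_from_seq(seq);
}
```

Successive data segments advance the sequence number, so their IDs normally differ. Segments that do not advance it (pure ACKs, for instance) would repeat the same ID in this sketch, so a real implementation would presumably need something more than this fold.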

* Re: 2.4 tcp very slow under certain circumstances (Re: netdev issues (3c905B))
From: Jordan Mendelson @ 2001-02-22 0:10 UTC
To: David S. Miller; +Cc: ookhoi, Vibol Hou, Linux-Kernel, sim, netdev

"David S. Miller" wrote:
>
> Jordan Mendelson writes:
>  > Now, if it didn't have the side effect of dropping packets left and
>  > right after ~4000 open connections (simultaneously), I could finally
>  > move our production system to 2.4.x.
>
> There is no reason my patch should have this effect.

My guess is that the fast path prevented the need for looking up the
destination in some structure which is limited to ~4K entries (route
table?).

Jordan

* Re: 2.4 tcp very slow under certain circumstances (Re: netdev issues (3c905B))
From: Jordan Mendelson @ 2001-02-22 0:50 UTC
To: David S. Miller; +Cc: ookhoi, Vibol Hou, Linux-Kernel, sim, netdev

"David S. Miller" wrote:
>
> Jordan Mendelson writes:
>  > Now, if it didn't have the side effect of dropping packets left and
>  > right after ~4000 open connections (simultaneously), I could finally
>  > move our production system to 2.4.x.
>
> The change I posted as-is, is unacceptable because it adds unnecessary
> cost to a fast path.  The final change I actually use will likely
> involve using the TCP sequence numbers to calculate an "always
> changing" ID number in the IPv4 headers to placate these broken
> windows machines.

Just for kicks I modified the fast path to use a globally incremented
count to see if it would fix both the Win9x problem and my 4K connection
problem, and it appears to be working just fine.

What probably happened was the sheer number of packets at 4K connections
without the fast path just slowed everything down to a crawl.

Thanks Dave,

Jordan
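
A rough user-space sketch of the experiment described above (keep the fast path for DF packets, but hand out a globally incremented count instead of a constant 0) might look like the following. The exact change was not posted in this thread, so everything here is an assumption made for illustration: `struct model_iphdr`, `model_slow_ident()`, `model_select_ident()` and both counters are hypothetical, and `IP_DF` is tested in host byte order rather than with `__constant_htons()` as the kernel does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IP_DF 0x4000u  /* "don't fragment" flag, host byte order in this model */

/* Simplified stand-in for struct iphdr; only the fields involved here. */
struct model_iphdr {
    uint16_t id;
    uint16_t frag_off;
};

static uint16_t global_ident;  /* hypothetical global count; wraps at 65536 */
static uint16_t slow_ident;    /* stands in for __ip_select_ident()'s state */

/* Hypothetical stand-in for the slow path taken by non-DF packets. */
static void model_slow_ident(struct model_iphdr *iph)
{
    iph->id = ++slow_ident;
}

/* Fast path keeps its cheap DF test but no longer emits a constant ID. */
static void model_select_ident(struct model_iphdr *iph)
{
    assert(iph != NULL);
    if (iph->frag_off & IP_DF)
        iph->id = ++global_ident;  /* no per-destination lookup needed */
    else
        model_slow_ident(iph);
}
```

In this model two consecutive DF packets get different IDs, which is the property the Windows clients appear to need. A plain increment like this is not atomic, so on an SMP kernel two CPUs could occasionally produce the same value; whether that matters for this workaround is not something the thread settles.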

* Re: 2.4 tcp very slow under certain circumstances (Re: netdev issues (3c905B))
From: Simon Kirby @ 2001-02-27 0:21 UTC
To: David S. Miller; +Cc: Jordan Mendelson, ookhoi, Vibol Hou, Linux-Kernel, netdev

On Wed, Feb 21, 2001 at 03:52:37PM -0800, David S. Miller wrote:

> There is no reason my patch should have this effect.
>
> All of this is what appears to be a bug in Windows TCP header
> compression, if the ID field of the IPv4 header does not change then
> it drops every other packet.
>
> The change I posted as-is, is unacceptable because it adds unnecessary
> cost to a fast path.  The final change I actually use will likely
> involve using the TCP sequence numbers to calculate an "always
> changing" ID number in the IPv4 headers to placate these broken
> windows machines.

Has such a patch gone in to the kernel yet?

Simon-

[  Stormix Technologies Inc.  ][  NetNation Communications Inc. ]
[       sim@stormix.com       ][       sim@netnation.com        ]
[ Opinions expressed are not necessarily those of my employers. ]

* Re: 2.4 tcp very slow under certain circumstances (Re: netdev issues (3c905B))
From: David S. Miller @ 2001-02-27 0:26 UTC
To: Simon Kirby; +Cc: Jordan Mendelson, ookhoi, Vibol Hou, Linux-Kernel, netdev

Simon Kirby writes:
 > Has such a patch gone in to the kernel yet?

Yep, it is in both the zerocopy and AC patches.
(Linus is away at the moment)

Later,
David S. Miller
davem@redhat.com

end of thread (~2001-02-27 0:30 UTC)

Thread overview: 18+ messages

2001-02-21  0:06 netdev issues (3c905B) Vibol Hou
2001-02-21  0:21 ` Martin Moerman
2001-02-21  0:34 ` Vibol Hou
2001-02-21  9:47 ` 2.4 tcp very slow under certain circumstances (Re: netdev issues (3c905B)) Ookhoi
2001-02-21 13:12 ` Gregory Maxwell
2001-02-21 10:57 ` David S. Miller
2001-02-21 11:33 ` Ookhoi
2001-02-21 17:17 ` Ookhoi
2001-02-21 19:06 ` Vibol Hou
2001-02-21 19:22 ` Vibol Hou
2001-02-21 22:30 ` Jordan Mendelson
2001-02-22  8:28 ` Ookhoi
2001-02-21 23:49 ` Jordan Mendelson
2001-02-21 23:52 ` David S. Miller
2001-02-22  0:10 ` Jordan Mendelson
2001-02-22  0:50 ` Jordan Mendelson
2001-02-27  0:21 ` Simon Kirby
2001-02-27  0:26 ` David S. Miller