linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [ircu-development] Slow on high-MTU (local host) connections?
@ 2001-11-07  3:34 Carlo Wood
  2001-11-07  3:52 ` David S. Miller
  0 siblings, 1 reply; 5+ messages in thread
From: Carlo Wood @ 2001-11-07  3:34 UTC (permalink / raw)
  To: linux-kernel

On Mon, Nov 05, 2001 at 09:17:52PM -0500, Entrope wrote:
> GamesNET's seeing oddly long burst times when we connect srvx to ircu
> on the same machine (~1 minute).  These go away when connecting to
> another machine (connected via 100baseT; ~5 seconds).

I have reproduced this: 74 seconds with MTU == 16436 and 1.8 seconds(!)
with MTU == 8000.

I investigated this tonight and although it seems to be a kernel bug.

What typically happens is this (here process 'ircu' is bursting data
over a socket to process 'srvx2', using `localhost'):

UTC(sec.microsec) [who]   syscall <time spend in syscall>

1005101096.583475 [ircu ] send(6, "or the month: instead of /msg au"..., 1092, 0) = 1092 <0.000013>
1005101096.583551 [ircu ] poll([{fd=6, events=POLLIN}, {fd=5, events=POLLIN, revents=POLLIN},
                          {fd=4, events=POLLIN}, {fd=2, events=POLLIN}, {fd=1, events=POLLIN},
			  {fd=3, events=POLLIN}, {fd=0, events=POLLIN}], 7, 36000) = 1 <0.557834>
1005101096.583681 [srvx2] read(12, "otice from def, posted 08:38 PM,"..., 218556) = 16628 <0.000125>
1005101096.583867 [srvx2] read(12, "ure because you can\'t accidently"..., 201928) = 5432 <0.000041>
1005101096.583963 [srvx2] read(12, 0x401d089a, 196496) = -1 EAGAIN (Resource temporarily unavailable) <0.000007>
1005101096.585688 [srvx2] select(13, [12], [], NULL, {13, 0}) = 1 (in [12], left {12, 440000}) <0.556093>
1005101097.141491 [ircu ] time(NULL)            = 1005101097 <0.000005>
1005101097.141550 [ircu ] recv(5, "zAA NOTICE IXT :[\2users\2] Notice"..., 8192, 0) = 7948 <0.000070>

or

1005101096.040382 [srvx2] read(12, "ed 08:38 PM, 08/05/2001:\r\n:Globa"..., 87316) = 1898 <0.000026>
1005101096.040455 [srvx2] read(12, 0x401ec25e, 85418) = -1 EAGAIN (Resource temporarily unavailable) <0.000012>
1005101096.053036 [srvx2] select(13, [12], [], NULL, {13, 0}) = 1 (in [12], left {12, 740000}) <0.265021>
1005101096.072435 [ircu ] time(NULL)            = 1005101096 <0.000009>
1005101096.072503 [ircu ] send(5, "clan-dk.org Ak6f52 Ih2 :Juped Us"..., 488, 0) = 488 <0.000067>
1005101096.072628 [ircu ] send(5, "172 3 1005101095 JupeUser21 fake"..., 2048, 0) = 2048 <0.000023>
1005101096.072699 [ircu ] send(5, "SrQjW IiT :Juped User!\r\nI NICK J"..., 2048, 0) = 2048 <0.000021>
1005101096.072767 [ircu ] send(5, "005101095 JupeUser22 fake1.clan-"..., 2048, 0) = 2048 <0.000022>
1005101096.072836 [ircu ] send(5, "jC :Juped User!\r\nI NICK JupeUser"..., 2048, 0) = 2048 <0.000024>
1005101096.072909 [ircu ] send(5, "95 JupeUser22 fake1.clan-dk.org "..., 2048, 0) = 2048 <0.000023>
1005101096.072978 [ircu ] send(5, "ed User!\r\nI NICK JupeUser2290 3 "..., 2048, 0) = 2048 <0.000020>
1005101096.073045 [ircu ] send(5, "User23 fake1.clan-dk.org ATPpVB "..., 2048, 0) = 2048 <0.000024>
1005101096.073115 [ircu ] send(5, "!\r\nI NICK JupeUser2337 3 1005101"..., 2048, 0) = 1560 <0.000023>
1005101096.073190 [ircu ] poll([{fd=6, events=POLLIN}, {fd=5, events=POLLIN|POLLOUT, revents=POLLOUT},
                          {fd=4, events=POLLIN}, {fd=2, events=POLLIN}, {fd=1, events=POLLIN},
			  {fd=3, events=POLLIN}, {fd=0, events=POLLIN}], 7, 36000) = 1 <0.233812>
1005101096.307098 [ircu ] time(NULL)            = 1005101096 <0.000005>
1005101096.307154 [ircu ] send(5, "d User!\r\nI NICK JupeUser2355 3 1"..., 488, 0) = 488 <0.000035>

So basically, the long delay inside poll/select happens when
svrx enters select() *after* it got an EAGAIN error on a read().

I suggest to fix this for now by changing srvx to do the following:

- call read() with a small buffer (8 kb or something) and continue
  to call it until an ammount is returned that is less than this 8 kb.
  Then call select() again to see if there is more data.

The point is that when read() returns less than what you ask for,
there will not be more data -- unless you ask for ridiculous ammounts
of data, as you do; then of course it can never return that ammount.

-- 
Carlo Wood <carlo@alinoe.com>

PS CC to kernel list, because I think this delay should not be there.
   If anyone on the kernel list knows what causes this, then please
   CC me as I am not subscribed to this list.
PS2 Sent again to the kernel list because I used the wrong address
   the first time.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [ircu-development] Slow on high-MTU (local host) connections?
  2001-11-07  3:34 [ircu-development] Slow on high-MTU (local host) connections? Carlo Wood
@ 2001-11-07  3:52 ` David S. Miller
  2001-11-07 15:23   ` Carlo Wood
  2001-11-09 16:04   ` Entrope
  0 siblings, 2 replies; 5+ messages in thread
From: David S. Miller @ 2001-11-07  3:52 UTC (permalink / raw)
  To: carlo; +Cc: linux-kernel


Repeat your experiment with nagle disabled on the sockets
in question.

Franks a lot,
David S. Miller
davem@redhat.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [ircu-development] Slow on high-MTU (local host) connections?
  2001-11-07  3:52 ` David S. Miller
@ 2001-11-07 15:23   ` Carlo Wood
  2001-11-09 16:04   ` Entrope
  1 sibling, 0 replies; 5+ messages in thread
From: Carlo Wood @ 2001-11-07 15:23 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel

On Tue, Nov 06, 2001 at 07:52:57PM -0800, David S. Miller wrote:
> Repeat your experiment with nagle disabled on the sockets
> in question.

Thanks, I didn't try it yet - but I bet that will be it.
Sorry for the post to this list.

-- 
Carlo Wood <carlo@alinoe.com>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [ircu-development] Slow on high-MTU (local host) connections?
  2001-11-07  3:52 ` David S. Miller
  2001-11-07 15:23   ` Carlo Wood
@ 2001-11-09 16:04   ` Entrope
  1 sibling, 0 replies; 5+ messages in thread
From: Entrope @ 2001-11-09 16:04 UTC (permalink / raw)
  To: David S. Miller; +Cc: carlo, linux-kernel

Following up on this: Carlo and I both tried disabling Nagle on the
sockets (both ends).  We still saw the same slow sending pattern.

On one end (the "ircu" process in Carlo's original mail), the send and
receive buffers (SO_SNDBUF, SO_RCVBUF) had been set to 8 KB each.  If
these are increased to be larger than the loopback MTU, it makes the
problem go away.  (On the other, "srvx" end, the buffer sizes were
kept at the default.)

Is this expected behavior?  If so, why?  (The references I've found on
the web suggest that the send and receive buffer sizes would only come
into play if Nagle were still enabled.)

-- Entrope

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [ircu-development] Slow on high-MTU (local host) connections?
       [not found]   ` <20011107042153.A13705@alinoe.com>
@ 2001-11-07  3:35     ` Carlo Wood
  0 siblings, 0 replies; 5+ messages in thread
From: Carlo Wood @ 2001-11-07  3:35 UTC (permalink / raw)
  To: linux-kernel

The examples in the previous mail are from the case with MTU is 8000
(and were to only two occurances of a EAGAIN for read() actually).

Allow me show the statistics for both MTU's:

MTU 16436:

~>grep -B1 '\[srvx2\] select.*<0\.[1-9]' mtu16436 | grep '\[srvx2\] read' | wc --lines
    323
~>grep -B1 '\[srvx2\] select.*<0\.[1-9]' mtu16436 | grep '\[srvx2\] read.*EAGAIN' | wc --lines
    323

Conclusion: ALL calls to select() that took longer than 0.1 second
were following a call to read() that failed with EAGAIN.  In total 323 times.


MTU 8000:

~>grep -B1 '\[srvx2\] select.*<0\.[1-9]' mtu8000 | grep '\[srvx2\] read' | wc --lines
      2
~>grep -B1 '\[srvx2\] select.*<0\.[1-9]' mtu8000 | grep '\[srvx2\] read.*EAGAIN' | wc --lines
      2

Idem, but only two occurences.


The total number of calls to select in both are respectively:

~>grep '\[srvx2\] select.*' mtu16436 | wc --lines
   1221
~>grep '\[srvx2\] select.*' mtu8000 | wc --lines
   658

-- 
Carlo Wood <carlo@alinoe.com>

(Also forwarded because first I used a wrong address)

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2001-11-09 16:04 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2001-11-07  3:34 [ircu-development] Slow on high-MTU (local host) connections? Carlo Wood
2001-11-07  3:52 ` David S. Miller
2001-11-07 15:23   ` Carlo Wood
2001-11-09 16:04   ` Entrope
     [not found] <87snbstzfz.fsf@sanosuke.troilus.org>
     [not found] ` <20011107041140.A12198@alinoe.com>
     [not found]   ` <20011107042153.A13705@alinoe.com>
2001-11-07  3:35     ` Carlo Wood

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).