* Re: PROBLEM: select() on TCP socket sleeps for 1 tick even if data available
From: Michael Lindner @ 2001-01-20 6:54 UTC
To: Dan Maas, linux-kernel, Chris Wedgwood

Dan Maas wrote:
>
> > OK, if this is the case, how do I alter the scheduling class?
>
> man sched_setscheduler
>
> Set SCHED_FIFO or SCHED_RR; you'll need to be root to do this AFAIK.
>
> I do agree though, Linux's scheduler (for SCHED_OTHER processes) is much
> less "ruthless" than, say, the NT scheduler. It's based on slowly-decaying
> priorities; unlike in other systems, a high-priority process won't
> necessarily pre-empt a low-priority process immediately after waking up.
>
> Another, less drastic thing to try is simply to increase the number of timer
> interrupts per second (and thus the frequency at which scheduling decisions
> are made) - see "#define HZ" in include/asm/param.h (you'll need to
> recompile the kernel and all modules after changing it). The default for
> Intel systems is 100, but I routinely tweak it to 1000. You could probably
> go even higher without ill effects.

Kernel mods are less drastic than a system call? :^)

Reading the documentation for sched_setscheduler, it's unclear whether it
would even suffice - it seems to deal with preempting other processes. I
am not trying to get into the run queue ahead of other processes, the
machine is sitting there IDLE and my process is still not getting to run
for a full clock tick.

You know, there's one other possibility, and that's if the data that is
being sent isn't actually arriving until the next clock tick, which
means the delay is in the appearance of sent data, not in select().
Given that the two processes are on the same machine, I would expect a
send() on a TCP socket to deliver the data to its destination faster
than that, however.

--
Mike Lindner
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
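
For reference, the sched_setscheduler route discussed in the message above could look roughly like the sketch below. It is only an illustration: make_realtime() is a made-up helper name, and picking the lowest priority the policy allows is an arbitrary choice. The call normally needs root, so for an ordinary user it is expected to fail with EPERM. As noted above, it may not help at all when the machine is otherwise idle.

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>

/*
 * make_realtime() is a hypothetical helper name, not something from the
 * thread. It switches the calling process to `policy` (SCHED_FIFO or
 * SCHED_RR) at the lowest priority that policy allows.
 * Returns 0 on success, -1 on failure with errno set
 * (EPERM when the caller lacks the privilege).
 */
int make_realtime(int policy)
{
    struct sched_param sp;
    int prio = sched_get_priority_min(policy);

    if (prio < 0)
        return -1;                      /* unknown policy, errno is set */

    sp.sched_priority = prio;
    return sched_setscheduler(0, policy, &sp);  /* pid 0 = this process */
}
```

A caller would typically try make_realtime(SCHED_FIFO) once at startup and carry on with the default scheduling class if it returns -1.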
* Re: PROBLEM: select() on TCP socket sleeps for 1 tick even if data available
From: Chris Wedgwood @ 2001-01-20 7:07 UTC
To: Michael Lindner
Cc: Dan Maas, linux-kernel

On Sat, Jan 20, 2001 at 01:54:23AM -0500, Michael Lindner wrote:

    You know, there's one other possibility, and that's if the data
    that is being sent isn't actually arriving until the next clock
    tick, which means the delay is in the appearance of sent data,
    not in select(). Given that the two processes are on the same
    machine, I would expect a send() on a TCP socket to deliver the
    data to its destination faster than that, however.

You can measure this latency; and it's indeed very low (lmbench gives
28 usecs on one of my machines).

If process A blocks waiting for data, and process B sleeps after
writing this data intended to wake process A, it should wake almost
immediately.

If you don't see this I would suspect an application bug -- can you
use strace or some such and confirm this is not the case?

  --cw
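
A rough way to get a feel for the wake-up latency mentioned above is to time one-byte round trips between two processes. The sketch below is not lmbench and will not reproduce its figures: it uses an AF_UNIX socketpair instead of a TCP connection, and roundtrip_usec() is a hypothetical helper name chosen for this illustration.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: average round-trip time in microseconds for
 * `rounds` one-byte ping-pongs between this process and a forked child
 * over an AF_UNIX socketpair. Returns a negative value on error.
 */
double roundtrip_usec(int rounds)
{
    int sv[2];
    char c = '.';
    struct timeval then, now;
    pid_t pid;
    int i, ok = 1;

    if (rounds <= 0 || socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1.0;

    pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1.0;
    }
    if (pid == 0) {
        /* Child: echo each byte back until the parent closes its end. */
        close(sv[0]);
        while (read(sv[1], &c, 1) == 1 && write(sv[1], &c, 1) == 1)
            ;
        _exit(0);
    }

    /* Parent: send a byte, block until the echo arrives, repeat. */
    close(sv[1]);
    gettimeofday(&then, NULL);
    for (i = 0; i < rounds && ok; i++)
        ok = write(sv[0], &c, 1) == 1 && read(sv[0], &c, 1) == 1;
    gettimeofday(&now, NULL);

    close(sv[0]);               /* EOF makes the child leave its loop */
    waitpid(pid, NULL, 0);
    if (!ok)
        return -1.0;

    return ((now.tv_sec - then.tv_sec) * 1e6 +
            (now.tv_usec - then.tv_usec)) / rounds;
}
```

Each round trip includes two wake-ups (child, then parent), so a result far above a few hundred microseconds on an idle machine would point at something other than raw socket latency.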
* Re: PROBLEM: select() on TCP socket sleeps for 1 tick even if data available
From: Michael Lindner @ 2001-01-20 7:46 UTC
To: Chris Wedgwood
Cc: Dan Maas, linux-kernel

Chris Wedgwood wrote:
>
> You can measure this latency; and it's indeed very low (lmbench gives
> 28 usecs on one of my machines).
>
> If you don't see this I would suspect an application bug -- can you
> use strace or some such and confirm this is not the case?

OK, two new data points (thanks for staying with me here):

1. The problem only occurs when traffic is travelling over DIFFERENT
sockets (i.e. A->B->C->D... or A->B->A but using a separate socket for
traffic in each direction).

2. I wrote a very ugly program (attached) to reproduce the problem. Lest
you think ill of me, most of this isn't actual code I wrote (the actual
program that first reproduced the problem was in C++). Just run

    sockperf localhost 54321 54322
    sockperf localhost 54322 54321 1

to see it in action.

--
Mike Lindner

[-- Attachment #2: sockperf.c --]
[-- Type: text/plain, Size: 5007 bytes --]

#include <fcntl.h>
#include <memory.h>
#include <netdb.h>
#include <netinet/in.h>
#include <signal.h>
#include <stdio.h>
#include <sys/select.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <varargs.h>
#include <netinet/in.h>
#include <netdb.h>
#include <errno.h>
#include <arpa/inet.h>
#include <sys/time.h>
#include <unistd.h>

#ifndef INADDR_NONE
#define INADDR_NONE ~0
#endif

void errexit(format, va_alist)
char *format;
va_dcl
{
    va_list args;

    va_start(args);
    vfprintf(stderr, format, args);
    va_end(args);
    exit(1);
}

/*
 * passivesock - allocate & bind a server socket using TCP or UDP
 */
int passivesock( service, protocol, qlen )
char *service;   /* service associated with the desired port */
char *protocol;  /* name of protocol to use ("tcp" or "udp") */
int qlen;        /* maximum length of the server request queue */
{
    struct servent *pse;
    struct protoent *ppe;
    struct sockaddr_in sin;
    int s, type;
    int one = 1;

    bzero((char *) & sin, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = INADDR_ANY;

    /* Map service name to port number */
    if ( pse = getservbyname(service, protocol) )
        sin.sin_port = htons(ntohs((u_short)pse->s_port));
    else if ( (sin.sin_port = htons((u_short)atoi(service))) == 0 )
        errexit("can't get \"%s\" service entry\n", service);

    /* Map protocol name to protocol number */
    if ( (ppe = getprotobyname(protocol)) == 0)
        errexit("can't get \"%s\" protocol entry\n", protocol);

    /* Use protocol to choose a socket type */
    if (strcmp(protocol, "udp") == 0)
        type = SOCK_DGRAM;
    else
        type = SOCK_STREAM;

    /* Allocate a socket */
    s = socket(PF_INET, type, ppe->p_proto);
    if (s < 0 )
        errexit("can't create socket: %s\n", strerror(errno));

    setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));

    /* Bind the socket */
    if (bind(s, (struct sockaddr *) & sin, sizeof(sin)) < 0)
        errexit("can't bind to %s port: %s\n", service, strerror(errno));
    if (type == SOCK_STREAM && listen(s, qlen) < 0)
        errexit("can't listen on %s port: %s\n", service, strerror(errno));
    return s;
}

int connectsock(host, service, protocol)
char *host;
char *service;
char *protocol;
{
    struct hostent *phe;
    struct servent *pse;
    struct protoent *ppe;
    struct sockaddr_in sin;
    int s, type;

    memset(&sin, 0, sizeof(sin));

    if (pse = getservbyname(service, protocol))
        sin.sin_port = pse->s_port;
    else if ((sin.sin_port = htons((u_short) atoi(service))) == 0) {
        fprintf(stderr, "can't get '%s' service entry\n", service);
        exit(1);
    }

    if (phe = gethostbyname(host))
        memcpy((char *) &sin.sin_addr, phe->h_addr, phe->h_length);
    else if ((sin.sin_addr.s_addr = inet_addr(host)) == INADDR_NONE) {
        fprintf(stderr, "can't get '%s' host entry\n", host);
        exit(1);
    }

    /*
    if (ppe = getprotobyname(protocol)) {
        fprintf(stderr, "can't get '%s' protocol entry\n", protocol);
        exit(1);
    }
    if (strcmp(protocol, "udp") == 0)
        type = SOCK_DGRAM;
    else
        type = SOCK_STREAM;
    */

    sin.sin_family = AF_INET;
    s = socket(AF_INET, SOCK_STREAM, 6);
    if (s < 0) {
        perror("can't create socket\n");
        exit(1);
    }
    if (connect(s, (struct sockaddr *) &sin, sizeof(sin)) < 0) {
        perror("can't connect to socket");
        exit(1);
    }
    return s;
}

void pingpong(int r, int s, int ping)
{
    struct timeval then;
    struct timeval now;
    fd_set fds;
    fd_set readfds;
    int pings = 0;

    FD_ZERO(&fds);
    FD_SET(r, &fds);
    gettimeofday(&then, 0);
    if (ping) {
        send(s, ".", 1, 0);
        pings++;
    }
    readfds = fds;
    while (select(r+1, &readfds, 0, 0, 0) > 0) {
        if (FD_ISSET(r, &readfds)) {
            char buf[1];
            int n = read(r, buf, sizeof(buf));
            if (n <= 0) {
                break;
            } else {
                if (pings++ < 1000) {
                    send(s, ".", 1, 0);
                } else {
                    break;
                }
            }
        } else {
            fprintf(stderr, "fd not set!\n");
        }
        readfds = fds;
    }
    gettimeofday(&now, 0);
    fprintf(stderr, "elapsed time for 1000 pingpongs is %g\n", now.tv_sec - then.tv_sec + (now.tv_usec - then.tv_usec) / 1000000.0);
    fprintf(stderr, "closing %d\n", r);
    close(r);
    fprintf(stderr, "closing %d\n", s);
    close(s);
}

main(argc, argv)
int argc;
char **argv;
{
    char buf[1024];
    int n;
    int s;
    int f;

    if (argc < 3) {
        errexit("usage: %s host port1 port2 [initiate]\n", argv[0]);
    }
    signal(SIGPIPE, SIG_IGN);
    f = passivesock(argv[2], "tcp", 2);
    if (f < 0) {
        errexit("listen failed: %s\n", strerror(errno));
    }
    for ( ; ; ) {
        struct sockaddr_in fsin;
        int alen = sizeof(fsin);

        if (argc < 5) {
            int r = accept(f, (struct sockaddr *) &fsin, &alen);
            int s = connectsock(argv[1], argv[3], "tcp");
            if (r < 0)
                errexit("accept failed: %s\n", strerror(errno));
            if (s < 0)
                errexit("connect failed: %s\n", strerror(errno));
            pingpong(r, s, 0);
        } else {
            int s = connectsock(argv[1], argv[3], "tcp");
            int r = accept(f, (struct sockaddr *) &fsin, &alen);
            if (r < 0)
                errexit("accept failed: %s\n", strerror(errno));
            if (s < 0)
                errexit("connect failed: %s\n", strerror(errno));
            pingpong(r, s, 1);
            break;
        }
    }
    return 0;
}

* Re: PROBLEM: select() on TCP socket sleeps for 1 tick even if data available
From: Edgar Toernig @ 2001-01-20 21:58 UTC
To: Michael Lindner
Cc: Chris Wedgwood, Dan Maas, linux-kernel

Michael Lindner wrote:
>[...]
>         send(s, ".", 1, 0);
>[...]
>     while (select(r+1, &readfds, 0, 0, 0) > 0) {
>[...]
>[select returns only after about 1 HZ]

Ever heard of nagle? (If not, there's a long thread about it on
the mailing list *g*)

It's not the select that waits. It's a delay in the tcp send
path waiting for more data. Try disabling it:

    int f=1;
    setsockopt(s, SOL_TCP, TCP_NODELAY, &f, sizeof(f));

Ciao, ET.
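
The setsockopt() call suggested above can be wrapped with error checking and read back with getsockopt() to confirm that it took effect. This is a minimal sketch: disable_nagle() and nagle_disabled() are hypothetical helper names, and IPPROTO_TCP is used in place of SOL_TCP (on Linux the two have the same value; IPPROTO_TCP is the more portable spelling).

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: turn off Nagle's algorithm on a TCP socket.
 * Returns 0 on success, -1 on failure with errno set by setsockopt().
 */
int disable_nagle(int fd)
{
    int on = 1;

    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}

/*
 * Hypothetical helper: report whether TCP_NODELAY is currently set.
 * Returns 1 if set, 0 if not, -1 on error.
 */
int nagle_disabled(int fd)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &val, &len) < 0)
        return -1;
    return val != 0;
}
```

In a program like the attached sockperf.c, the option would presumably be set on the socket used for send(), that is, the one returned by connectsock().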
* Re: PROBLEM: select() on TCP socket sleeps for 1 tick even if data available
From: Dan Maas @ 2001-01-21 0:35 UTC
To: Edgar Toernig, Michael Lindner
Cc: Chris Wedgwood, linux-kernel

> It's not the select that waits. It's a delay in the tcp send
> path waiting for more data. Try disabling it:
>
> int f=1;
> setsockopt(s, SOL_TCP, TCP_NODELAY, &f, sizeof(f));

Bingo! With this fix, 2.2.18 performance becomes almost identical to 2.4.0
performance. I assume 2.4.0 disables Nagle by default on local
connections...

Dan
* Re: PROBLEM: select() on TCP socket sleeps for 1 tick even if data available
From: Chris Wedgwood @ 2001-01-21 0:34 UTC
To: Dan Maas
Cc: Edgar Toernig, Michael Lindner, linux-kernel

On Sat, Jan 20, 2001 at 07:35:12PM -0500, Dan Maas wrote:

    Bingo! With this fix, 2.2.18 performance becomes almost identical
    to 2.4.0 performance. I assume 2.4.0 disables Nagle by default on
    local connections...

2.4.x has a smarter nagle algorithm.

  --cw
* Re: PROBLEM: select() on TCP socket sleeps for 1 tick even if data available
From: Michael Lindner @ 2001-01-21 1:22 UTC
To: Chris Wedgwood
Cc: Dan Maas, Edgar Toernig, linux-kernel

Chris Wedgwood wrote:
>
> On Sat, Jan 20, 2001 at 07:35:12PM -0500, Dan Maas wrote:
>
>     Bingo! With this fix, 2.2.18 performance becomes almost identical to 2.4.0
>     performance. I assume 2.4.0 disables Nagle by default on local
>     connections...
>
> 2.4.x has a smarter nagle algorithm.

Thanks again for all the help, guys...

Haven't installed 2.4 yet, but I tried the setsockopt route.
Performance is better, but the two processes together never total more
than 50% of the CPU (i.e. the thing is still schedule-bound, not
compute-bound, as it is on other platforms), and throughput is only up
to 800 sends/sec. Better than the 100/sec. I was getting, but still a
far cry from the identical box running Windows, where performance is
8K/sec.

...and I still don't understand why the identical program, but using one
socket instead of 2 sockets, IS CPU bound, and gets on the order of
10K/sec. on the same HW. Diffs to produce 10K/sec. 1 socket version from
my previous sample follow...

--
Mike Lindner

diff sockperf.c sockperf1.c
163c163
<                 if (pings++ < 1000) {
---
>                 if (pings++ < 10000) {
177c177
<     fprintf(stderr, "elapsed time for 1000 pingpongs is %g\n", now.tv_sec - then.tv_sec + (now.tv_usec - then.tv_usec) / 1000000.0);
---
>     fprintf(stderr, "elapsed time for 10000 pingpongs is %g\n", now.tv_sec - then.tv_sec + (now.tv_usec - then.tv_usec) / 1000000.0);
205c205
<             int s = connectsock(argv[1], argv[3], "tcp");
---
>             int s = r;
214c214
<             int r = accept(f, (struct sockaddr *) &fsin, &alen);
---
>             int r = s;
* RE: PROBLEM: select() on TCP socket sleeps for 1 tick even if data available
From: David Schwartz @ 2001-01-21 1:29 UTC
To: Michael Lindner
Cc: linux-kernel

> ...and I still don't understand why the identical program, but using one
> socket instead of 2 sockets, IS CPU bound, and gets on the order of
> 10K/sec. on the same HW. Diffs to produce 10K/sec. 1 socket version from
> my previous sample follow...

It's really this simple -- this isn't what TCP is intended for.

DS
* Re: PROBLEM: select() on TCP socket sleeps for 1 tick even if data available
From: Michael Lindner @ 2001-01-21 3:20 UTC
To: Chris Wedgwood
Cc: Dan Maas, Edgar Toernig, linux-kernel

OK, 2.4.0 kernel installed, and a new set of numbers:

test             kernel      ping-pongs/s. @ total CPU util   w/TCP_NODELAY
sample (2 skts)  2.2.18      100 @ 0.1%                       800 @ 1%
sample (1 skt)   2.2.18      8000 @ 100%                      8000 @ 50%
real app         2.2.18      100 @ 0.1%                       800 @ 1%

sample (2 skts)  2.4.0       8000 @ 50%                       8000 @ 50%
sample (1 skt)   2.4.0       10000 @ 50%                      10000 @ 50%
real app         2.4.0       1200 @ 50%                       1200 @ 50%

real app         Windows 2K  4000 @ 100%

The two points that still seem strange to me are:

1. The 1 socket case is still 25% faster than the 2 socket case in 2.4.0
(in 2.2.18 the 1 socket case was 10x faster).

2. Linux never devotes more than 50% of the CPU (average over a long
run) to the two processes (25% to each process, with the rest of the
time idle).

I'd really love to show that Linux is a viable platform for our SW, and
I think it would be doable if I could figure out how to get the other
50% of my CPU involved. An "strace -rT" of the real app on 2.4.0 looks
like this for each ping/pong.

     0.052371 send(7, "\0\0\0 \177\0\0\1\3243\0\0\0\2\4\236\216\341\0\0\v\277"..., 32, 0) = 32 <0.000529>
     0.000882 rt_sigprocmask(SIG_BLOCK, ~[], [RT_0], 8) = 0 <0.000021>
     0.000242 rt_sigprocmask(SIG_SETMASK, [RT_0], NULL, 8) = 0 <0.000021>
     0.000173 select(8, [3 4 6 7], NULL, NULL, NULL) = 1 (in [6]) <0.000047>
     0.000328 read(6, "\0\0\0 ", 4) = 4 <0.000031>
     0.000179 read(6, "\177\0\0\1\3242\0\0\0\2\4\236\216\341\0\0\7\327\177\0\0"..., 28) = 28 <0.000075>

--
Mike Lindner
* Re: PROBLEM: select() on TCP socket sleeps for 1 tick even if data available
From: Stephen D. Williams @ 2001-04-09 14:54 UTC
To: Michael Lindner
Cc: Chris Wedgwood, Dan Maas, Edgar Toernig, linux-kernel

An old thread, but important to get these fundamental performance
numbers up there:

2.4.2 on an 800mhz PIII Sceptre laptop w/ 512MB ram:

elapsed time for 100000 pingpongs is 3.81327
100000/3.81256
~26229.09541095746689888159
10000/.379912
~26321.88506812103855629724

26300 compares to 8000/sec. quite well ;-) You didn't give specs for
your test machine unfortunately.

Since this tests both 'sides' of an application communication, it
indicates a 'null transaction' rate of twice that.

This was typical cpu usage on a triple run of 10000:
CPU states: 7.2% user, 92.7% system, 0.0% nice, 0.0% idle

sdw

Michael Lindner wrote:
>
> OK, 2.4.0 kernel installed, and a new set of numbers:
>
> test             kernel      ping-pongs/s. @ total CPU util   w/TCP_NODELAY
> sample (2 skts)  2.2.18      100 @ 0.1%                       800 @ 1%
> sample (1 skt)   2.2.18      8000 @ 100%                      8000 @ 50%
> real app         2.2.18      100 @ 0.1%                       800 @ 1%
>
> sample (2 skts)  2.4.0       8000 @ 50%                       8000 @ 50%
> sample (1 skt)   2.4.0       10000 @ 50%                      10000 @ 50%
> real app         2.4.0       1200 @ 50%                       1200 @ 50%
>
> real app         Windows 2K  4000 @ 100%
>
> The two points that still seem strange to me are:
>
> 1. The 1 socket case is still 25% faster than the 2 socket case in 2.4.0
> (in 2.2.18 the 1 socket case was 10x faster).
>
> 2. Linux never devotes more than 50% of the CPU (average over a long
> run) to the two processes (25% to each process, with the rest of the
> time idle).
>
> I'd really love to show that Linux is a viable platform for our SW, and
> I think it would be doable if I could figure out how to get the other
> 50% of my CPU involved. An "strace -rT" of the real app on 2.4.0 looks
> like this for each ping/pong.
>
>      0.052371 send(7, "\0\0\0 \177\0\0\1\3243\0\0\0\2\4\236\216\341\0\0\v\277"..., 32, 0) = 32 <0.000529>
>      0.000882 rt_sigprocmask(SIG_BLOCK, ~[], [RT_0], 8) = 0 <0.000021>
>      0.000242 rt_sigprocmask(SIG_SETMASK, [RT_0], NULL, 8) = 0 <0.000021>
>      0.000173 select(8, [3 4 6 7], NULL, NULL, NULL) = 1 (in [6]) <0.000047>
>      0.000328 read(6, "\0\0\0 ", 4) = 4 <0.000031>
>      0.000179 read(6, "\177\0\0\1\3242\0\0\0\2\4\236\216\341\0\0\7\327\177\0\0"..., 28) = 28 <0.000075>
>
> --
> Mike Lindner

--
sdw@lig.net http://sdw.st
Stephen D. Williams
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax
Dec2000
* Re: PROBLEM: select() on TCP socket sleeps for 1 tick even if data available
From: James Antill @ 2001-04-09 19:16 UTC
To: Stephen D. Williams
Cc: Michael Lindner, Chris Wedgwood, Dan Maas, Edgar Toernig, linux-kernel

"Stephen D. Williams" <sdw@lig.net> writes:

> An old thread, but important to get these fundamental performance
> numbers up there:
>
> 2.4.2 on an 800mhz PIII Sceptre laptop w/ 512MB ram:
>
> elapsed time for 100000 pingpongs is 3.81327
> 100000/3.81256
> ~26229.09541095746689888159
> 10000/.379912
> ~26321.88506812103855629724
>
> 26300 compares to 8000/sec. quite well ;-) You didn't give specs for
> your test machine unfortunately.
>
> Since this tests both 'sides' of an application communication, it
> indicates a 'null transaction' rate of twice that.
>
> This was typical cpu usage on a triple run of 10000:
> CPU states: 7.2% user, 92.7% system, 0.0% nice, 0.0% idle

 I seemed to miss the original post, so I can't really comment on the
tests. However...

> Michael Lindner wrote:
> >
> > OK, 2.4.0 kernel installed, and a new set of numbers:
> >
> > test             kernel      ping-pongs/s. @ total CPU util   w/TCP_NODELAY
> > sample (2 skts)  2.2.18      100 @ 0.1%                       800 @ 1%
> > sample (1 skt)   2.2.18      8000 @ 100%                      8000 @ 50%
> > real app         2.2.18      100 @ 0.1%                       800 @ 1%
> >
> > sample (2 skts)  2.4.0       8000 @ 50%                       8000 @ 50%
> > sample (1 skt)   2.4.0       10000 @ 50%                      10000 @ 50%
> > real app         2.4.0       1200 @ 50%                       1200 @ 50%
> >
> > real app         Windows 2K  4000 @ 100%
> >
> > The two points that still seem strange to me are:
> >
> > 1. The 1 socket case is still 25% faster than the 2 socket case in 2.4.0
> > (in 2.2.18 the 1 socket case was 10x faster).
> >
> > 2. Linux never devotes more than 50% of the CPU (average over a long
> > run) to the two processes (25% to each process, with the rest of the
> > time idle).
> >
> > I'd really love to show that Linux is a viable platform for our SW, and
> > I think it would be doable if I could figure out how to get the other
> > 50% of my CPU involved. An "strace -rT" of the real app on 2.4.0 looks
> > like this for each ping/pong.
> >
> >      0.052371 send(7, "\0\0\0 \177\0\0\1\3243\0\0\0\2\4\236\216\341\0\0\v\277"..., 32, 0) = 32 <0.000529>
> >      0.000882 rt_sigprocmask(SIG_BLOCK, ~[], [RT_0], 8) = 0 <0.000021>
> >      0.000242 rt_sigprocmask(SIG_SETMASK, [RT_0], NULL, 8) = 0 <0.000021>
> >      0.000173 select(8, [3 4 6 7], NULL, NULL, NULL) = 1 (in [6]) <0.000047>
> >      0.000328 read(6, "\0\0\0 ", 4) = 4 <0.000031>
> >      0.000179 read(6, "\177\0\0\1\3242\0\0\0\2\4\236\216\341\0\0\7\327\177\0\0"..., 28) = 28 <0.000075>

 The strace here shows select() with an infinite timeout, your
numbers will be much better if you do (pseudo code)...

    struct timeval zerotime;

    zerotime.tv_sec = 0;
    zerotime.tv_usec = 0;

    if (!(ret = select( ... , &zerotime)))
      ret = select( ... , NULL);

...basically you completely miss the function call for __pollwait()
inside poll_wait (include/linux/poll.h in the linux sources, with
__pollwait being in fs/select.c).

--
# James Antill -- james@and.org
:0:
* ^From: .*james@and\.org
/dev/null
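
The pseudo code in the message above leaves the select() arguments elided. One way to fill them in for the single-descriptor read case is sketched below. wait_readable() is a hypothetical helper name, and watching exactly one descriptor is an assumption made to keep the example short; the strace in the thread shows the real application passing several descriptors.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait until `fd` is readable.
 * First polls with a zero timeout; only if nothing is pending does it
 * fall back to a blocking select(). Returns select()'s result:
 * 1 when readable, -1 on error.
 */
int wait_readable(int fd)
{
    fd_set rfds;
    struct timeval zerotime;
    int ret;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    zerotime.tv_sec = 0;
    zerotime.tv_usec = 0;

    /* Fast path: returns immediately whether or not data is waiting. */
    ret = select(fd + 1, &rfds, NULL, NULL, &zerotime);
    if (ret != 0)
        return ret;

    /* Nothing pending: select() cleared the set, so rebuild it and block. */
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    return select(fd + 1, &rfds, NULL, NULL, NULL);
}
```

Whether the extra zero-timeout call pays off depends on how often data is already waiting when the loop comes around, which is exactly what the rest of the thread goes on to measure.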
* Re: PROBLEM: select() on TCP socket sleeps for 1 tick even if data available
From: Stephen D. Williams @ 2001-04-10 18:29 UTC
To: James Antill
Cc: Michael Lindner, Chris Wedgwood, Dan Maas, Edgar Toernig, linux-kernel

James Antill wrote:
>
> "Stephen D. Williams" <sdw@lig.net> writes:
>
> > An old thread, but important to get these fundamental performance
> > numbers up there:
> >
> > 2.4.2 on an 800mhz PIII Sceptre laptop w/ 512MB ram:
> >
> > elapsed time for 100000 pingpongs is 3.81327
> > 100000/3.81256
> > ~26229.09541095746689888159
> > 10000/.379912
> > ~26321.88506812103855629724
...
> I seemed to miss the original post, so I can't really comment on the
> tests. However...

It was a thread in January, but just ran across it looking for
something else. See below for results.

> > Michael Lindner wrote:
...
> > >      0.052371 send(7, "\0\0\0 \177\0\0\1\3243\0\0\0\2\4\236\216\341\0\0\v\277"..., 32, 0) = 32 <0.000529>
> > >      0.000882 rt_sigprocmask(SIG_BLOCK, ~[], [RT_0], 8) = 0 <0.000021>
> > >      0.000242 rt_sigprocmask(SIG_SETMASK, [RT_0], NULL, 8) = 0 <0.000021>
> > >      0.000173 select(8, [3 4 6 7], NULL, NULL, NULL) = 1 (in [6]) <0.000047>
> > >      0.000328 read(6, "\0\0\0 ", 4) = 4 <0.000031>
> > >      0.000179 read(6, "\177\0\0\1\3242\0\0\0\2\4\236\216\341\0\0\7\327\177\0\0"..., 28) = 28 <0.000075>
>
> The strace here shows select() with an infinite timeout, your
> numbers will be much better if you do (pseudo code)...
>
>     struct timeval zerotime;
>
>     zerotime.tv_sec = 0;
>     zerotime.tv_usec = 0;
>
>     if (!(ret = select( ... , &zerotime)))
>       ret = select( ... , NULL);
>
> ...basically you completely miss the function call for __pollwait()
> inside poll_wait (include/linux/poll.h in the linux sources, with
> __pollwait being in fs/select.c).

Apparently the extra system call overhead outweighs any benefit. In any
case, what you suggest would be better done in the kernel anyway. The
time went from 3.7 to 4.4 seconds per 100000.

>
> --
> # James Antill -- james@and.org
> :0:
> * ^From: .*james@and\.org
> /dev/null

--
sdw@lig.net http://sdw.st
Stephen D. Williams
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax
Dec2000
* Re: PROBLEM: select() on TCP socket sleeps for 1 tick even if data available
From: James Antill @ 2001-04-10 20:25 UTC
To: Stephen D. Williams
Cc: Michael Lindner, Chris Wedgwood, Dan Maas, Edgar Toernig, linux-kernel

"Stephen D. Williams" <sdw@lig.net> writes:

> James Antill wrote:
> >
> > I seemed to miss the original post, so I can't really comment on the
> > tests. However...
>
> It was a thread in January, but just ran across it looking for
> something else. See below for results.

 Ahh, ok.

> > > Michael Lindner wrote:
> ...
> > > > <0.000021>
> > > > 0.000173 select(8, [3 4 6 7], NULL, NULL, NULL) = 1 (in [6]) <0.000047>
> >
> > The strace here shows select() with an infinite timeout, your
> > numbers will be much better if you do (pseudo code)...

[snip ... ]

> > ...basically you completely miss the function call for __pollwait()
> > inside poll_wait (include/linux/poll.h in the linux sources, with
> > __pollwait being in fs/select.c).
>
> Apparently the extra system call overhead outweighs any benefit.

 There shouldn't be any "extra" system calls in the fast path. If data
is waiting then you do one call to poll() either way, if not then you
are wasting time blocking so it doesn't matter what you do.

> In any
> case, what you suggest would be better done in the kernel anyway.

 Possibly, however when this has come up before the kernel people have
said it's hard to do in kernel space.

> The
> time went from 3.7 to 4.4 seconds per 100000.

 Ok here's a quick test that I've done. This passes data between 2
processes. Obviously you can't compare this to your code or Michael's,
however...

 The results with USE_DOUBLE_POLL on are...

% time ./pingpong
./pingpong 0.15s user 0.89s system 48% cpu 2.147 total
% time ./pingpong
./pingpong 0.19s user 0.91s system 45% cpu 2.422 total
% time ./pingpong
./pingpong 0.10s user 1.02s system 49% cpu 2.282 total

 The results with USE_DOUBLE_POLL off are...

% time ./pingpong
./pingpong 0.24s user 1.07s system 50% cpu 2.614 total
% time ./pingpong
./pingpong 0.21s user 1.00s system 44% cpu 2.695 total
% time ./pingpong
./pingpong 0.21s user 1.13s system 50% cpu 2.667 total

 Don't forget that the poll here is done with _1_ fd. Most real
programs have more, and so benefit more.

 I also did the TRY_NO_POLL, as I was pretty sure what the results
would be, that gives...

% time ./pingpong
./pingpong 0.03s user 0.41s system 50% cpu 0.874 total
% time ./pingpong
./pingpong 0.06s user 0.44s system 58% cpu 0.855 total
% time ./pingpong
./pingpong 0.07s user 0.35s system 51% cpu 0.820 total

[-- Attachment #2: pingpong.c --]
[-- Type: application/octet-stream, Size: 2356 bytes --]

#define _GNU_SOURCE 1

#include <stdlib.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/poll.h>
#include <errno.h>
#include <netinet/ip.h>

#define USE_DOUBLE_POLL 1
#define USE_TRY_NO_POLL 0

#define DIE() die(__LINE__)
static void die(int line)
{
  char *buf = NULL;

  asprintf(&buf, "%d:", line);
  if (buf)
    perror(buf);

  exit (EXIT_FAILURE);
}

#define EV_WRITE 1
#define EV_READ 2
#define EV_POLL_WRITE 3
#define EV_POLL_READ 4

#if USE_TRY_NO_POLL
# define EV_AFTER_READ EV_WRITE
# define EV_AFTER_WRITE EV_READ
#else
# define EV_AFTER_READ EV_POLL_WRITE
# define EV_AFTER_WRITE EV_POLL_READ
#endif

#define EV_IS_WRITE(x) ((x) & 1)

static int do_event(int fd, int event)
{
  char buf[1];
  int ret = 0;

  switch (event)
  {
    case EV_WRITE:
      if ((ret = write(fd, "a", 1)) == -1)
      {
        if (errno != EAGAIN)
          DIE();
        event = EV_POLL_WRITE;
      }
      else
        event = EV_AFTER_WRITE;
      break;

    case EV_READ:
      if ((ret = read(fd, buf, 1)) == -1)
      {
        if (errno == EPIPE)
          exit (EXIT_SUCCESS);
        if (errno != EAGAIN)
          DIE();
        event = EV_POLL_READ;
      }
      else
        event = EV_AFTER_READ;
      break;

    case EV_POLL_WRITE:
    {
      struct pollfd p;

      p.fd = fd;
      p.events = POLLOUT;
      p.revents = 0;

      if (!USE_DOUBLE_POLL || !(ret = poll(&p, 1, 0)))
        ret = poll(&p, 1, -1);
      if (ret != 1)
        DIE();
      event = EV_WRITE;
    }
    break;

    case EV_POLL_READ:
    {
      struct pollfd p;

      p.fd = fd;
      p.events = POLLIN;
      p.revents = 0;

      if (!USE_DOUBLE_POLL || !(ret = poll(&p, 1, 0)))
        ret = poll(&p, 1, -1);
      if (ret != 1)
        DIE();
      event = EV_READ;
    }
    break;
  }

  return (event);
}

static void go_parent(int fd)
{
  int event = EV_WRITE;
  unsigned int count = 100000;

  while (count)
  {
    int write_event = EV_IS_WRITE(event);

    event = do_event(fd, event);
    if (!!write_event != !!EV_IS_WRITE(event))
      --count;
  }
}

static int go_chld(int fd)
{
  int event = EV_WRITE;

  while (1)
    event = do_event(fd, event);
}

int main(void)
{
  int fds[2];
  pid_t chld = 0;

  if (socketpair(PF_LOCAL, SOCK_STREAM, IPPROTO_IP, fds) == -1)
    DIE();

  if ((chld = fork()) == -1)
    DIE();

  if (chld)
    go_parent(fds[0]);
  else
    go_chld(fds[1]);

  exit (EXIT_SUCCESS);
}

--
# James Antill -- james@and.org
:0:
* ^From: .*james@and\.org
/dev/null
* Re: PROBLEM: select() on TCP socket sleeps for 1 tick even if data available
2001-04-10 20:25 ` James Antill
@ 2001-04-11 21:03 ` Stephen D. Williams
2001-04-12 0:09 ` James Antill
0 siblings, 1 reply; 28+ messages in thread
From: Stephen D. Williams @ 2001-04-11 21:03 UTC (permalink / raw)
To: James Antill
Cc: Michael Lindner, Chris Wedgwood, Dan Maas, Edgar Toernig, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 962 bytes --]

James Antill wrote:
...
> > The
> > time went from 3.7 to 4.4 seconds per 100000.
>
> Ok here's a quick test that I've done. This passes data between 2
> processes. Obviously you can't compare this to your code or Michael's,
> however...

I've attached my version of his code with your suggested change.
Possibly I didn't do it correctly.

> The results with USE_DOUBLE_POLL on are...
>
> % time ./pingpong
> ./pingpong 0.15s user 0.89s system 48% cpu 2.147 total
> % time ./pingpong
> ./pingpong 0.19s user 0.91s system 45% cpu 2.422 total
> % time ./pingpong
> ./pingpong 0.10s user 1.02s system 49% cpu 2.282 total
>
> The results with USE_DOUBLE_POLL off are...
>
> % time ./pingpong
> ./pingpong 0.24s user 1.07s system 50% cpu 2.614 total
...

sdw
--
sdw@lig.net http://sdw.st
Stephen D. Williams
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax Dec2000

[-- Attachment #2: sockperf.c --]
[-- Type: text/plain, Size: 7448 bytes --]

#include <fcntl.h>
#include <memory.h>
#include <netdb.h>
#include <netinet/in.h>
#include <signal.h>
#include <stdio.h>
#include <sys/select.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <varargs.h>
#include <netinet/in.h>
#include <netdb.h>
#include <errno.h>
#include <arpa/inet.h>
#include <sys/time.h>
#include <unistd.h>
#include <netinet/tcp.h>

#ifndef INADDR_NONE
#define INADDR_NONE ~0
#endif

void errexit(format, va_alist)
char *format;
va_dcl
{
  va_list args;

  va_start(args);
  vfprintf(stderr, format, args);
  va_end(args);
  exit(1);
}

/*
 * passivesock - allocate & bind a server socket using TCP or UDP
 */
int passivesock( service, protocol, qlen )
char *service;  /* service associated with the desired port */
char *protocol; /* name of protocol to use ("tcp" or "udp") */
int qlen;       /* maximum length of the server request queue */
{
  struct servent *pse;
  struct protoent *ppe;
  struct sockaddr_in sin;
  int s, type;
  int one = 1;
  int f=1;

  bzero((char *) & sin, sizeof(sin));
  sin.sin_family = AF_INET;
  sin.sin_addr.s_addr = INADDR_ANY;

  /* Map service name to port number */
  if ( pse = getservbyname(service, protocol) )
    sin.sin_port = htons(ntohs((u_short)pse->s_port));
  else if ( (sin.sin_port = htons((u_short)atoi(service))) == 0 )
    errexit("can't get \"%s\" service entry\n", service);

  /* Map protocol name to protocol number */
  if ( (ppe = getprotobyname(protocol)) == 0)
    errexit("can't get \"%s\" protocol entry\n", protocol);

  /* Use protocol to choose a socket type */
  if (strcmp(protocol, "udp") == 0)
    type = SOCK_DGRAM;
  else
    type = SOCK_STREAM;

  /* Allocate a socket */
  s = socket(PF_INET, type, ppe->p_proto);
  if (s < 0 )
    errexit("can't create socket: %s\n", strerror(errno));
  setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
  setsockopt(s, SOL_TCP, TCP_NODELAY, &f, sizeof(f));

  /* Bind the socket */
  if (bind(s, (struct sockaddr *) & sin, sizeof(sin)) < 0)
    errexit("can't bind to %s port: %s\n", service, strerror(errno));
  if (type == SOCK_STREAM && listen(s, qlen) < 0)
    errexit("can't listen on %s port: %s\n", service, strerror(errno));
  return s;
}

int connectsock(host, service, protocol)
char *host;
char *service;
char *protocol;
{
  struct hostent *phe;
  struct servent *pse;
  struct protoent *ppe;
  struct sockaddr_in sin;
  int s, type;
  int f=1;

  memset(&sin, 0, sizeof(sin));
  if (pse = getservbyname(service, protocol))
    sin.sin_port = pse->s_port;
  else if ((sin.sin_port = htons((u_short) atoi(service))) == 0) {
    fprintf(stderr, "can't get '%s' service entry\n", service);
    exit(1);
  }
  if (phe = gethostbyname(host))
    memcpy((char *) &sin.sin_addr, phe->h_addr, phe->h_length);
  else if ((sin.sin_addr.s_addr = inet_addr(host)) == INADDR_NONE) {
    fprintf(stderr, "can't get '%s' host entry\n", host);
    exit(1);
  }
  /*
  if (ppe = getprotobyname(protocol)) {
    fprintf(stderr, "can't get '%s' protocol entry\n", protocol);
    exit(1);
  }
  if (strcmp(protocol, "udp") == 0)
    type = SOCK_DGRAM;
  else
    type = SOCK_STREAM;
  */
  sin.sin_family = AF_INET;
  s = socket(AF_INET, SOCK_STREAM, 6);
  setsockopt(s, SOL_TCP, TCP_NODELAY, &f, sizeof(f));
  if (s < 0) {
    perror("can't create socket\n");
    exit(1);
  }
  if (connect(s, (struct sockaddr *) &sin, sizeof(sin)) < 0) {
    perror("can't connect to socket");
    exit(1);
  }
  return s;
}

void pingpong(int r, int s, int ping) {
  struct timeval then;
  struct timeval now;
  fd_set fds;
  fd_set readfds;
  int pings = 0;
  struct timeval zerotime;
  int ret;

  FD_ZERO(&fds);
  FD_SET(r, &fds);
  gettimeofday(&then, 0);
  if (ping) {
    send(s, ".", 1, 0);
    pings++;
  }
  readfds = fds;
  zerotime.tv_sec = 0;
  zerotime.tv_usec = 0;
  //   if (!(ret = select( ... , &zerotime)))
  //     ret = select( ... , NULL);
  //  while ((readfds=fds, ret = select(r+1, &readfds, 0, 0, 0)) ) {
  while ((ret = select(r+1, &readfds, 0, 0, &zerotime)) ||
         (readfds=fds, ret = select(r+1, &readfds, 0, 0, 0)) ) {
    if (FD_ISSET(r, &readfds)) {
      char buf[1];
      int n = read(r, buf, sizeof(buf));
      if (n <= 0) {
        break;
      } else {
        if (pings++ < 100000) {
          send(s, ".", 1, 0);
        } else {
          break;
        }
      }
    } else {
      fprintf(stderr, "fd not set!\n");
    }
    readfds = fds;
  }
  gettimeofday(&now, 0);
  fprintf(stderr, "elapsed time for 100000 pingpongs is %g\n",
          now.tv_sec - then.tv_sec + (now.tv_usec - then.tv_usec) / 1000000.0);
  fprintf(stderr, "closing %d\n", r);
  close(r);
  fprintf(stderr, "closing %d\n", s);
  close(s);
}

main(argc, argv)
int argc;
char **argv;
{
  char buf[1024];
  int n;
  int s;
  int f;

  if (argc < 3) {
    errexit("usage: %s host port1 port2 [initiate]\n", argv[0]);
  }
  signal(SIGPIPE, SIG_IGN);
  f = passivesock(argv[2], "tcp", 2);
  if (f < 0) {
    errexit("listen failed: %s\n", strerror(errno));
  }
  for ( ; ; ) {
    struct sockaddr_in fsin;
    int alen = sizeof(fsin);
    if (argc < 5) {
      int r = accept(f, (struct sockaddr *) &fsin, &alen);
      int s = connectsock(argv[1], argv[3], "tcp");
      if (r < 0)
        errexit("accept failed: %s\n", strerror(errno));
      if (s < 0)
        errexit("connect failed: %s\n", strerror(errno));
      pingpong(r, s, 0);
    } else {
      int s = connectsock(argv[1], argv[3], "tcp");
      int r = accept(f, (struct sockaddr *) &fsin, &alen);
      if (r < 0)
        errexit("accept failed: %s\n", strerror(errno));
      if (s < 0)
        errexit("connect failed: %s\n", strerror(errno));
      pingpong(r, s, 1);
      break;
    }
  }
  return 0;
}
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: PROBLEM: select() on TCP socket sleeps for 1 tick even if data available
2001-04-11 21:03 ` Stephen D. Williams
@ 2001-04-12 0:09 ` James Antill
0 siblings, 0 replies; 28+ messages in thread
From: James Antill @ 2001-04-12 0:09 UTC (permalink / raw)
To: Stephen D. Williams
Cc: Michael Lindner, Chris Wedgwood, Dan Maas, Edgar Toernig, linux-kernel

"Stephen D. Williams" <sdw@lig.net> writes:

> James Antill wrote:
> ...
> > > The
> > > time went from 3.7 to 4.4 seconds per 100000.
> >
> > Ok here's a quick test that I've done. This passes data between 2
> > processes. Obviously you can't compare this to your code or Michael's,
> > however...
>
> I've attached my version of his code with your suggested change.
> Possibly I didn't do it correctly.

It's not a code thing, but I think you are measuring the wrong thing
(at least in relation to the original question). Given the below
diff...

--- sdw-sockperf.c-orig Wed Apr 11 18:30:28 2001
+++ sdw-sockperf.c Wed Apr 11 18:33:09 2001
@@ -17,6 +17,7 @@
 #include <unistd.h>
 #include <netinet/tcp.h>

+#define USE_DOUBLE_SELECT 0

 #ifndef INADDR_NONE
 #define INADDR_NONE ~0
@@ -147,7 +148,8 @@
   int pings = 0;
   struct timeval zerotime;
   int ret;
-
+  unsigned int misses = 0;
+
   FD_ZERO(&fds);
   FD_SET(r, &fds);
   gettimeofday(&then, 0);
@@ -163,8 +165,10 @@
   //   if (!(ret = select( ... , &zerotime)))
   //     ret = select( ... , NULL);
   //  while ((readfds=fds, ret = select(r+1, &readfds, 0, 0, 0)) ) {
-  while ((ret = select(r+1, &readfds, 0, 0, &zerotime)) ||
-         (readfds=fds, ret = select(r+1, &readfds, 0, 0, 0)) ) {
+  while ((USE_DOUBLE_SELECT &&
+          (ret = select(r+1, &readfds, 0, 0, &zerotime))) ||
+         (++misses &&
+          (readfds=fds, ret = select(r+1, &readfds, 0, 0, 0)) )) {
     if (FD_ISSET(r, &readfds)) {
       char buf[1];
       int n = read(r, buf, sizeof(buf));
@@ -186,6 +190,8 @@
     readfds = fds;
   }
   gettimeofday(&now, 0);
+  fprintf(stderr, "USE_DOUBLE_SELECT=%d\n", USE_DOUBLE_SELECT);
+  fprintf(stderr, "misses=%u\n", misses);
   fprintf(stderr, "elapsed time for 100000 pingpongs is %g\n",
           now.tv_sec - then.tv_sec + (now.tv_usec - then.tv_usec) / 1000000.0);
   fprintf(stderr, "closing %d\n", r);

...I get consistently better results for "localhost 45644 45644 a" with
USE_DOUBLE_SELECT=1; worth noting is that misses == 0 was always true.
However if I have 2 programs, one doing "localhost 45642 45643" and one
doing "localhost 45643 45642 a" then I get better results for
USE_DOUBLE_SELECT=0[1] and misses is 80-90 thousand (i.e. it has to do 2
select calls 80-90% of the time).

Please note that the original question was, select/poll does a small
schedule if you specify a timeout and that's bad. However in the two
process case you _need_ the schedule, because there isn't any data there
yet. So again given the original assumption that data is available on one
of the fd's then doing the double select is better, but if it isn't then
you're wasting time no matter what you do.

As to why my test code got good results even though it uses 2
processes, I used PF_LOCAL/AF_LOCAL sockets not PF_INET/AF_INET and those
are fast enough at transferring the data that you don't need the schedule
(misses == 0, if you add similar code to above).

[1] This is on a real computer, on a 486 with 8Meg of RAM I still get
better results with USE_DOUBLE_SELECT=1, and there are still 80% misses
(no idea why).

--
# James Antill -- james@and.org
:0:
* ^From: .*james@and\.org
/dev/null
^ permalink raw reply [flat|nested] 28+ messages in thread
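The "double select" idea discussed above can be written compactly with poll(). The following is only a minimal sketch, not code from the thread: the names wait_readable and demo_misses are hypothetical, and the miss counter mirrors the one added in the diff. It tries a zero-timeout poll() first and falls back to a blocking poll() only when nothing is ready yet.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/*
 * Wait for an event on fd.  A zero-timeout poll() is tried first; only
 * if nothing is ready do we count a "miss" and block in a second poll().
 * Returns 1 when poll() reports an event on fd, -1 otherwise.
 * (wait_readable and the misses counter are illustrative names.)
 */
int wait_readable(int fd, unsigned int *misses)
{
    struct pollfd p;
    int ret;

    assert(misses != NULL);

    p.fd = fd;
    p.events = POLLIN;
    p.revents = 0;

    ret = poll(&p, 1, 0);          /* non-blocking check */
    if (ret == 0) {
        ++*misses;                 /* data was not there yet */
        ret = poll(&p, 1, -1);     /* block until something arrives */
    }
    return (ret == 1) ? 1 : -1;
}

/*
 * Data is already queued in a pipe before waiting, so the first poll()
 * succeeds.  Returns the number of misses, or -1 on error.
 */
int demo_misses(void)
{
    int fds[2];
    unsigned int misses = 0;
    int ok;

    if (pipe(fds) < 0)
        return -1;
    if (write(fds[1], "x", 1) != 1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    ok = wait_readable(fds[0], &misses);
    close(fds[0]);
    close(fds[1]);
    return (ok == 1) ? (int)misses : -1;
}
```

When data is already queued the second call never happens; when it is not, every wait costs two system calls, which is the trade-off described in the message above.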
* Re: PROBLEM: select() on TCP socket sleeps for 1 tick even if data available
2001-01-21 0:34 ` Chris Wedgwood
2001-01-21 1:22 ` Michael Lindner
2001-01-21 3:20 ` Michael Lindner
@ 2001-01-24 20:31 ` Boris Dragovic
2 siblings, 0 replies; 28+ messages in thread
From: Boris Dragovic @ 2001-01-24 20:31 UTC (permalink / raw)
To: Chris Wedgwood; +Cc: linux-kernel

can someone explain what is nagle or pinpoint explanation :)

lynx

On Sun, 21 Jan 2001, Chris Wedgwood wrote:

> On Sat, Jan 20, 2001 at 07:35:12PM -0500, Dan Maas wrote:
>
>     Bingo! With this fix, 2.2.18 performance becomes almost identical to 2.4.0
>     performance. I assume 2.4.0 disables Nagle by default on local
>     connections...
>
> 2.4.x has a smarter nagle algorithm.
>
>
>
>   --cw
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> Please read the FAQ at http://www.tux.org/lkml/
>
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 28+ messages in thread
[parent not found: <3A694357.1A7C6AAC@att.net>]
* Re: PROBLEM: select() on TCP socket sleeps for 1 tick even if data available
[not found] ` <3A694357.1A7C6AAC@att.net>
@ 2001-01-20 9:41 ` Dan Maas
2001-01-20 17:26 ` Michael Lindner
0 siblings, 1 reply; 28+ messages in thread
From: Dan Maas @ 2001-01-20 9:41 UTC (permalink / raw)
To: Michael Lindner, Chris Wedgwood; +Cc: linux-kernel

What kernel have you been using? I have reproduced your problem on a
standard 2.2.18 kernel (elapsed time ~10sec). However, using a 2.4.0 kernel
with HZ=1000, I see a 100x improvement (elapsed time ~0.1 sec; note that
increasing HZ alone should only give a 10x improvement). Perhaps the
scheduler was fixed in 2.4.0?

2.2.18 very definitely has some scheduling anomalies. In your benchmark,
select() or poll() takes 10ms, as can be observed with strace -T. Skipping
the select() and blocking in read() gives the same behavior. This leads me
to believe the scheduler is at fault, and not select(), poll(), or read().

When run without strace, 2.4.0 appears to have no problems with your
benchmark. Elapsed time is 0.1 sec -- this may be the full potential of my
machine (PII/450). Removing select() and blocking in read() results in a
further improvement, to 0.07 sec.

Strace disturbs the behavior of 2.4.0 in strange ways. Running the benchmark
under strace with 2.4.0 causes the scheduler delays to return -- ~1ms delays
appear in select() or write(). This is confusing - it appears that context
switches can happen inside write() as well as select(), a result I don't
understand at all (the socket buffers never completely fill since you only
write 1000 bytes to each one).

Other notes: poll() behaves same as select(). Using the SCHED_FIFO class and
mlockall() has no effect on this benchmark. Setting the sockets non-blocking
also has no effect.

I wish I had the Linux Trace Toolkit handy; it would give a much better idea
of what's going on than strace...

Dan
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: PROBLEM: select() on TCP socket sleeps for 1 tick even if data available
2001-01-20 9:41 ` Dan Maas
@ 2001-01-20 17:26 ` Michael Lindner
0 siblings, 0 replies; 28+ messages in thread
From: Michael Lindner @ 2001-01-20 17:26 UTC (permalink / raw)
To: Dan Maas; +Cc: Chris Wedgwood, linux-kernel

Dan Maas wrote:
>
> What kernel have you been using? I have reproduced your problem on a
> standard 2.2.18 kernel (elapsed time ~10sec). However, using a 2.4.0 kernel
> with HZ=1000, I see a 100x improvement (elapsed time ~0.1 sec; note that
> increasing HZ alone should only give a 10x improvement). Perhaps the
> scheduler was fixed in 2.4.0?

Sounds like a good reason for me to upgrade - I am running 2.2.18 now.
If it's fixed in 2.4.0, then I'm happy (although I'm usually leery of
installing ANYTHING that ends in ".0", Linux has never been anything
less than stable). It sounds like there are some other anomalies this
tickles that might bear looking into, though.

Thanks for all the help, and again, my apologies for posting a lame-o
test program with the original report - had I taken the time to make
a test program that REALLY exercised the problem as described, I would
have saved you all a lot of time.

--
Mike Lindner
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: PROBLEM: select() on TCP socket sleeps for 1 tick even if data available
@ 2001-01-24 23:56 Bernd Eckenfels
0 siblings, 0 replies; 28+ messages in thread
From: Bernd Eckenfels @ 2001-01-24 23:56 UTC (permalink / raw)
To: linux-kernel
> can someone explain what is nagle or pinpoint explanation :)
Nagle's algorithm "waits" before sending small packets until more data is
available, because sending bigger packets has less overhead.
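For a latency-sensitive program that writes tiny messages, the usual way to opt out of this batching is the TCP_NODELAY socket option, which sockperf.c elsewhere in this thread sets via SOL_TCP. A minimal sketch follows, assuming a Linux/POSIX sockets environment; the helper names disable_nagle, nagle_disabled and demo_nodelay are made up for illustration, and IPPROTO_TCP is used as the portable spelling of the option level.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Disable Nagle's algorithm on a TCP socket.  Returns 0 on success, -1 on error. */
int disable_nagle(int fd)
{
    int one = 1;

    assert(fd >= 0);
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
}

/* Read the option back: 1 if Nagle is disabled, 0 if enabled, -1 on error. */
int nagle_disabled(int fd)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &val, &len) < 0)
        return -1;
    return val != 0;
}

/* Create a TCP socket, turn Nagle off, report the resulting state, close it. */
int demo_nodelay(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int state;

    if (fd < 0)
        return -1;
    if (disable_nagle(fd) < 0) {
        close(fd);
        return -1;
    }
    state = nagle_disabled(fd);
    close(fd);
    return state;
}
```

The option has to be set on each socket individually, which is why sockperf.c sets it in both passivesock() and connectsock().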
greetings
Bernd
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: PROBLEM: select() on TCP socket sleeps for 1 tick even if data available
@ 2001-01-20 10:53 Bernd Eckenfels
0 siblings, 0 replies; 28+ messages in thread
From: Bernd Eckenfels @ 2001-01-20 10:53 UTC (permalink / raw)
To: linux-kernel
In article <3A68F855.6F16F152@att.net> you wrote:
> My problem is that if data is NOT available when select()
> starts, but becomes available immediately afterwards, select()
> doesn't wake up immediately, but sleeps for 1/100 second.
It does not sleep for 1/100 second; it will put the process in the run queue,
and of course the process needs to wait for the currently scheduled process to
finish its time slice. This happens only every tick.
If there is no process in the run queue, maybe this can be done faster (already
is done faster?)
Greetings
Bernd
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 28+ messages in thread
* PROBLEM: select() on TCP socket sleeps for 1 tick even if data available
@ 2001-01-19 20:47 Michael Lindner
2001-01-19 23:20 ` David Schwartz
2001-01-19 23:31 ` Chris Wedgwood
0 siblings, 2 replies; 28+ messages in thread
From: Michael Lindner @ 2001-01-19 20:47 UTC (permalink / raw)
To: linux-kernel

[1.] select() sleeps for 1 tick even if data available

[2.] Full description of the problem/report:
If select() is waiting for data to become available on a TCP socket FD, and
data becomes available, it doesn't return until the next clock tick. This
produces large latencies when passing data between processes several times,
since each transaction (which requires microseconds) does not occur until the
next clock tick, limiting the entire throughput of the system to 100
transactions/second.

[3.] Keywords: select, socket, networking

[4.] Kernel version (from /proc/version): 2.2.18

[5.] Output of Oops.. message (not applicable)

[6.] A small shell script or example program which triggers the problem (if possible)

#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* this program should take 1 second to complete, but takes 10 */
/* yes i know, it doesn't do any I/O, but the behavior is the same as if it did */
main()
{
	for (int i = 0; i < 1000; i++) {
		struct timeval to;
		to.tv_sec = 0;
		to.tv_usec = 1000;
		select(0, 0, 0, 0, &to);
	}
	return 0;
}

[7.] Environment
Red Hat 7.0 with kernel upgraded to 2.2.18

[7.1.] Software (add the output of the ver_linux script here)
ver_linux did not appear to work. However, here's its output.

Linux mlindner-ras.sonusnet.com 2.2.18 #8 Wed Jan 3 01:40:29 EST 2001 i586 unknown
Kernel modules found
Gnu C 2.96
Binutils 2.10.0.18
Linux C Library ..
ldd: missing file arguments
Try `ldd --help' for more information.
ls: /usr/lib/libg++.so: No such file or directory
Procps 2.0.7
Mount 2.10m
Net-tools (2000-05-21)
Kbd [option...]
Sh-utils 2.0
Sh-utils Parker.
Sh-utils
Sh-utils Inc.
Sh-utils NO
Sh-utils PURPOSE.

[7.2.] Processor information (from /proc/cpuinfo):
processor : 0
vendor_id : AuthenticAMD
cpu family : 5
model : 8
model name : AMD-K6(tm) 3D processor
stepping : 12
cpu MHz : 451.033
cache size : 64 KB
fdiv_bug : no
hlt_bug : no
sep_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr mce cx8 sep mtrr pge mmx 3dnow
bogomips : 897.84

[7.3.] Module information (from /proc/modules):
N/A

[7.4.] SCSI information (from /proc/scsi/scsi)
N/A

[7.5.] Other information that might be relevant to the problem
(please look in /proc and include all information that you
think to be relevant):

[X.] Other notes, patches, fixes, workarounds:

--
Mike Lindner
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 28+ messages in thread
* RE: PROBLEM: select() on TCP socket sleeps for 1 tick even if data available
2001-01-19 20:47 Michael Lindner
@ 2001-01-19 23:20 ` David Schwartz
2001-01-20 2:30 ` Michael Lindner
2001-01-19 23:31 ` Chris Wedgwood
1 sibling, 1 reply; 28+ messages in thread
From: David Schwartz @ 2001-01-19 23:20 UTC (permalink / raw)
To: Michael Lindner, linux-kernel

> If select() is waiting for data to become available on a
> TCP socket FD, and
> data becomes available, it doesn't return until the next clock tick.

If your application has scheduling requirements, you need to communicate
them to the scheduler.

> #include <sys/time.h>
> #include <sys/types.h>
> #include <unistd.h>
>
> /* this program should take 1 second to complete, but takes 10 */
> /* yes i know, it doesn't do any I/O, but the behavior is the
> same as if it did */
> main()
> {
>	for (int i = 0; i < 1000; i++) {
>		struct timeval to;
>		to.tv_sec = 0;
>		to.tv_usec = 1000;
>		select(0, 0, 0, 0, &to);
>	}
>	return 0;
> }

This program doesn't demonstrate anything except that Linux's sleep time is
granular. This shouldn't be news to anyone. If you don't force a reschedule,
everything works the way it's supposed to:

main()
{
	int i, j, pipes[2];
	struct timeval tv;
	fd_set rd;

	pipe(pipes);
	write(pipes[1], "foo", 4);
	for(i=0; i<1000; i++)
	{
		FD_ZERO(&rd);
		FD_SET(pipes[0], &rd);
		tv.tv_sec=0;
		tv.tv_usec=1000;
		j=select(10, &rd, NULL, NULL, &tv);
		if(j!=1) printf("oops\n");
	}
}

DS
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: PROBLEM: select() on TCP socket sleeps for 1 tick even if data available
2001-01-19 23:20 ` David Schwartz
@ 2001-01-20 2:30 ` Michael Lindner
2001-01-20 3:27 ` David Schwartz
2001-01-20 12:26 ` Martin MaD Douda
0 siblings, 2 replies; 28+ messages in thread
From: Michael Lindner @ 2001-01-20 2:30 UTC (permalink / raw)
To: David Schwartz; +Cc: linux-kernel, Chris Wedgwood

Thanks CW and DS for the prompt replies. However, although each
addressed the (flawed) example I included, neither addressed the
problem described in the text.

I wrote:
> > If select() is waiting for data to become available on a
> > TCP socket FD, and
> > data becomes available, it doesn't return until the next clock tick.

David Schwartz wrote:
> This program doesn't demonstrate anything except that Linux's sleep time is
> granular. This shouldn't be news to anyone. If you don't force a reschedule,
> everything works the way it's supposed to:

The sample program you included doesn't show anything other
than that select() doesn't sleep at all if there's already
data available when select() starts. That was not my claim
either.

My problem is that if data is NOT available when select()
starts, but becomes available immediately afterwards, select()
doesn't wake up immediately, but sleeps for 1/100 second.

In other words, select doesn't wake up immediately when
data becomes available, but on the next clock tick. This is
not the experience I've had with any other OS I've used,
and is a source of great latency in my application. Since
I am passing data from one process to another, and that
data is generated as a result of data received via a select(),
the next delivery occurs a clock tick later, with the machine
mostly idle.

It can be argued that there's no law governing the latency of
select() waking up, and that my application is expecting
too much. Yet, it runs on other UNIXes and Windows, and I'd
like to be able to get the same high performance out of Linux.

P.S. Chris Wedgwood writes:
"The time passed to select is a _minimum_ "

but the man page for select says:
"timeout is an upper bound on the amount of time elapsed
before select returns."

who is right?

--
Mike Lindner
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 28+ messages in thread
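The case described in the message above (data is not yet available when select() starts, and arrives shortly afterwards) could be exercised with a small ping-pong test. What follows is a hedged sketch, not the poster's program: pingpong_seconds is a hypothetical name, and it uses an AF_UNIX socketpair between a parent and a forked child instead of the TCP sockets from the report, so absolute numbers will not match a TCP loopback test.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Bounce one byte between parent and child `rounds` times.  The parent
 * calls select() with no timeout before each read, so the echo has
 * usually not arrived yet when select() starts.
 * Returns elapsed seconds, or -1.0 on error.
 */
double pingpong_seconds(int rounds)
{
    int sv[2];
    struct timeval then, now;
    pid_t child;
    char c = '.';
    int i;

    assert(rounds > 0);

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1.0;

    child = fork();
    if (child < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1.0;
    }
    if (child == 0) {
        /* child: echo every byte back until the parent closes its end */
        close(sv[0]);
        while (read(sv[1], &c, 1) == 1)
            if (write(sv[1], &c, 1) != 1)
                break;
        _exit(0);
    }

    close(sv[1]);
    gettimeofday(&then, NULL);
    for (i = 0; i < rounds; i++) {
        fd_set rd;

        if (write(sv[0], &c, 1) != 1)
            break;
        FD_ZERO(&rd);
        FD_SET(sv[0], &rd);
        /* no timeout: block until the echo arrives */
        if (select(sv[0] + 1, &rd, NULL, NULL, NULL) != 1)
            break;
        if (read(sv[0], &c, 1) != 1)
            break;
    }
    gettimeofday(&now, NULL);

    close(sv[0]);               /* lets the child's read() return 0 */
    waitpid(child, NULL, 0);

    if (i != rounds)
        return -1.0;
    return (now.tv_sec - then.tv_sec) +
           (now.tv_usec - then.tv_usec) / 1000000.0;
}
```

Dividing the result by the number of rounds gives the average round-trip time. A value near one timer tick per round would correspond to the 100 transactions/second ceiling described in the original report.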
* RE: PROBLEM: select() on TCP socket sleeps for 1 tick even if data available
2001-01-20 2:30 ` Michael Lindner
@ 2001-01-20 3:27 ` David Schwartz
2001-01-20 4:37 ` Michael Lindner
2001-01-20 12:26 ` Martin MaD Douda
1 sibling, 1 reply; 28+ messages in thread
From: David Schwartz @ 2001-01-20 3:27 UTC (permalink / raw)
To: Michael Lindner; +Cc: linux-kernel, Chris Wedgwood

> Thanks CW and DS for the prompt replies. However, although each
> addressed the (flawed) example I included, neither addressed the
> problem described in the text.
>
> I wrote:
> > > If select() is waiting for data to become available on a
> > > TCP socket FD, and
> > > data becomes available, it doesn't return until the next clock tick.
>
> David Schwartz wrote:
> > This program doesn't demonstrate anything except that
> > Linux's sleep time is
> > granular. This shouldn't be news to anyone. If you don't force
> > a reschedule,
> > everything works the way it's supposed to:

> The sample program you included doesn't show anything other
> than that select() doesn't sleep at all if there's already
> data available when select() starts. That was not my claim
> either.

Correct. Select doesn't sleep unless it has to.

> My problem is that if data is NOT available when select()
> starts, but becomes available immediately afterwards, select()
> doesn't wake up immediately, but sleeps for 1/100 second.

How can you tell when select wakes up the process? What you are seeing has
nothing whatsoever to do with select and simply has to do with the fact that
the kernel does not give the CPU to a process the second that process may
want it.

> In other words, select doesn't wake up immediately when
> data becomes available, but on the next clock tick.

Right. The process becomes eligible to run -- it's no longer blocked. The
scheduler then schedules it whenever it feels like it.

> This is
> not the experience I've had with any other OS I've used,
> and is a source of great latency in my application. Since
> I am passing data from one process to another, and that
> data is generated as a result of data received via a select(),
> the next delivery occurs a clock tick later, with the machine
> mostly idle.

If you have scheduling latency requirements, you MUST communicate them to
the scheduler. If your process had an altered scheduling class, then you
would be right -- it should get the CPU immediately. Otherwise, there's no
reason for the scheduler to give that process the CPU immediately.

> It can be argued that there's no law governing the latency of
> select() waking up, and that my application is expecting
> too much. Yet, it runs on other UNIXes and Windows, and I'd
> like to be able to get the same high performance out of Linux.

Then tell that to the scheduler.

> P.S. Chris Wedgwood writes:
> "The time passed to select is a _minimum_ "
>
> but the man page for select says:
> "timeout is an upper bound on the amount of time elapsed
> before select returns."
>
> who is right?

This should read "before select returns the process to the list of runnable
processes".

DS
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: PROBLEM: select() on TCP socket sleeps for 1 tick even if data available
2001-01-20 3:27 ` David Schwartz
@ 2001-01-20 4:37 ` Michael Lindner
0 siblings, 0 replies; 28+ messages in thread
From: Michael Lindner @ 2001-01-20 4:37 UTC (permalink / raw)
To: David Schwartz; +Cc: linux-kernel, Chris Wedgwood

David Schwartz wrote:
>
> How can you tell when select wakes up the process? What you are seeing has
> nothing whatsoever to do with select and simply has to do with the fact that
> the kernel does not give the CPU to a process the second that process may
> want it.

I guess I can't. But on an idle machine, I would expect a process that
becomes runnable would be run immediately, not on the next clock tick.
strace reports that each select() is taking 0.009xxx seconds of real
time, and the system's CPU load (as reported by top) is under 1%.

...

> If you have scheduling latency requirements, you MUST communicate them to
> the scheduler. If your process had an altered scheduling class, then you
> would be right -- it should get the CPU immediately. Otherwise, there's no
> reason for the scheduler to give that process the CPU immediately.

OK, if this is the case, how do I alter the scheduling class?

--
Mike Lindner
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 28+ messages in thread
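As a hedged illustration of the question above: on Linux the scheduling class is normally changed with sched_setscheduler(2). The sketch below assumes a glibc system; request_fifo is a hypothetical helper name, and the call needs suitable privileges (historically root). Note that Dan Maas reports elsewhere in this thread that SCHED_FIFO made no difference to this particular benchmark.

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <string.h>
#include <sys/types.h>

/*
 * Try to move process `pid` (0 means the calling process) into the
 * SCHED_FIFO real-time class at the lowest real-time priority.
 * Returns 0 on success, or the errno value on failure
 * (EPERM when the caller lacks the required privilege).
 */
int request_fifo(pid_t pid)
{
    struct sched_param sp;
    int prio;

    assert(pid >= 0);

    prio = sched_get_priority_min(SCHED_FIFO);
    if (prio < 0)
        return errno;

    memset(&sp, 0, sizeof(sp));
    sp.sched_priority = prio;
    if (sched_setscheduler(pid, SCHED_FIFO, &sp) < 0)
        return errno;
    return 0;
}
```

A caller would typically invoke request_fifo(0) once at startup and fall back to normal scheduling when it returns EPERM. A SCHED_FIFO process that never blocks can starve everything else on the CPU, so this is best limited to code that spends most of its time waiting in select() or read().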
* Re: PROBLEM: select() on TCP socket sleeps for 1 tick even if data available
2001-01-20 2:30 ` Michael Lindner
2001-01-20 3:27 ` David Schwartz
@ 2001-01-20 12:26 ` Martin MaD Douda
2001-01-20 11:39 ` Bjorn Wesen
1 sibling, 1 reply; 28+ messages in thread
From: Martin MaD Douda @ 2001-01-20 12:26 UTC (permalink / raw)
To: Michael Lindner; +Cc: linux-kernel

On Fri, 19 Jan 2001, Michael Lindner wrote:

> data is generated as a result of data received via a select(),
> the next delivery occurs a clock tick later, with the machine
> mostly idle.
  ^^^^^^^^^^^
The machine is in fact not idle - there is a task running - idle task.
Could the problem be that scheduler does not preempt this task to run
something more useful?

Symptoms seem to show this.

Martin
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: PROBLEM: select() on TCP socket sleeps for 1 tick even if data available
2001-01-20 12:26 ` Martin MaD Douda
@ 2001-01-20 11:39 ` Bjorn Wesen
0 siblings, 0 replies; 28+ messages in thread
From: Bjorn Wesen @ 2001-01-20 11:39 UTC (permalink / raw)
To: Martin MaD Douda; +Cc: Michael Lindner, linux-kernel

On Sat, 20 Jan 2001, Martin MaD Douda wrote:
> On Fri, 19 Jan 2001, Michael Lindner wrote:
> > data is generated as a result of data received via a select(),
> > the next delivery occurs a clock tick later, with the machine
> > mostly idle.
>
> The machine is in fact not idle - there is a task running - idle task.
> Could the problem be that scheduler does not preempt this task to run
> something more useful?

Normally, the "idle task" (task[0]) does this pseudo-code:

while(1) {
	if(need_resched)
		schedule();
}

to minimize latency out of idle

so if that actually is running it should not be a problem (unless
need_resched is not set by the wakeup calls)

Perhaps the kapm-idled kernel thread is killing your latency, you could
try disabling APM and APM-making-idle-calls especially. Also check ps aux
and see if anything else is taking your idle CPU %.

-BW
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: PROBLEM: select() on TCP socket sleeps for 1 tick even if data available
2001-01-19 20:47 Michael Lindner
2001-01-19 23:20 ` David Schwartz
@ 2001-01-19 23:31 ` Chris Wedgwood
1 sibling, 0 replies; 28+ messages in thread
From: Chris Wedgwood @ 2001-01-19 23:31 UTC (permalink / raw)
To: Michael Lindner; +Cc: linux-kernel

    main()
    {
            for (int i = 0; i < 1000; i++) {
                    struct timeval to;
                    to.tv_sec = 0;
                    to.tv_usec = 1000;
                    select(0, 0, 0, 0, &to);
            }
            return 0;
    }

ia32 with HZ=100 means sleep will sleep for 10ms each time, no less.

The time passed to select is a _minimum_; the kernel makes no guarantee
it will be less, but assures it will be more (10ms is the default
timer interval for the intel platform, others (e.g. alpha) differ)

run your code with strace -t to see what i mean

  --cw
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 28+ messages in thread
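The arithmetic behind the 10 second figure in the original report can be sketched as follows. This is a simplified model of rounding a timeout up to whole timer ticks, not the kernel's actual code (which may add a further tick), and the helper names timeout_ticks and estimated_total_ms are made up for illustration.

```c
#include <assert.h>

/* Round a timeout in microseconds up to a whole number of timer ticks. */
long timeout_ticks(long usec, long hz)
{
    long usec_per_tick;

    assert(hz > 0 && usec >= 0);
    usec_per_tick = 1000000L / hz;
    return (usec + usec_per_tick - 1) / usec_per_tick;
}

/*
 * Estimated wall-clock time, in milliseconds, for `calls` back-to-back
 * sleeps of `usec` microseconds each when timeouts are tick-granular.
 */
long estimated_total_ms(long calls, long usec, long hz)
{
    return calls * timeout_ticks(usec, hz) * (1000L / hz);
}
```

With HZ=100 a 1000 microsecond timeout rounds up to one 10 ms tick, so 1000 calls come to about 10 seconds, as reported. With HZ=1000 the same loop comes to about 1 second, which is the 10x improvement Dan Maas attributes to raising HZ alone.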
end of thread, other threads:[~2001-04-12 0:11 UTC | newest]
Thread overview: 28+ messages
[not found] <fa.nc2eokv.1dj8r80@ifi.uio.no>
[not found] ` <fa.dcei62v.1s5scos@ifi.uio.no>
[not found] ` <015e01c082ac$4bf9c5e0$0701a8c0@morph>
2001-01-20 6:54 ` PROBLEM: select() on TCP socket sleeps for 1 tick even if data available Michael Lindner
2001-01-20 7:07 ` Chris Wedgwood
2001-01-20 7:46 ` Michael Lindner
2001-01-20 21:58 ` Edgar Toernig
2001-01-21 0:35 ` Dan Maas
2001-01-21 0:34 ` Chris Wedgwood
2001-01-21 1:22 ` Michael Lindner
2001-01-21 1:29 ` David Schwartz
2001-01-21 3:20 ` Michael Lindner
2001-04-09 14:54 ` Stephen D. Williams
2001-04-09 19:16 ` James Antill
2001-04-10 18:29 ` Stephen D. Williams
2001-04-10 20:25 ` James Antill
2001-04-11 21:03 ` Stephen D. Williams
2001-04-12 0:09 ` James Antill
2001-01-24 20:31 ` Boris Dragovic
[not found] ` <3A694357.1A7C6AAC@att.net>
2001-01-20 9:41 ` Dan Maas
2001-01-20 17:26 ` Michael Lindner
2001-01-24 23:56 Bernd Eckenfels
-- strict thread matches above, loose matches on Subject: below --
2001-01-20 10:53 Bernd Eckenfels
2001-01-19 20:47 Michael Lindner
2001-01-19 23:20 ` David Schwartz
2001-01-20 2:30 ` Michael Lindner
2001-01-20 3:27 ` David Schwartz
2001-01-20 4:37 ` Michael Lindner
2001-01-20 12:26 ` Martin MaD Douda
2001-01-20 11:39 ` Bjorn Wesen
2001-01-19 23:31 ` Chris Wedgwood