linux-kernel.vger.kernel.org archive mirror
From: "David Schwartz" <davids@webmaster.com>
To: "Entrope" <entrope@gamesnet.net>
Cc: "Davide Libenzi" <davidel@xmailserver.org>,
	"Eric Varsanyi" <e0206@foo21.com>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>
Subject: RE: [Patch][RFC] epoll and half closed TCP connections
Date: Sun, 13 Jul 2003 23:14:13 -0700
Message-ID: <MDEHLPKNGKAHNMBLJOLKGEHGEFAA.davids@webmaster.com>
In-Reply-To: <877k6m6l2k.fsf@sanosuke.troilus.org>


> Your argument is bogus.  My first-hand experience is with IRC servers,
> which customarily have thousands of connections at once, with a very
> few percent active in a given check.  The scaling problem is not with
> the length of waiting or how poll() is optimized -- it is with the
> overhead *inherent* to processing poll().  Common IRC servers spend
> 100% of CPU when using poll() for only a few thousand clients.  Those
> same servers, using FreeBSD's kqueue()/kevent() API, use well under
> 10% of the CPU.

	My argument is not bogus, you just don't understand it. Two algorithms can
scale at the same order and yet one can perform much better than the other.
Poll, for example, could use 10% of the CPU at 100 users and 100% at 1000
users, while kqueue/kevent could use 0.01% at 100 users and 0.2% at 1000
users. With these numbers, poll is the more scalable of the two (its CPU
usage went up by a factor of 10 while kqueue's went up by a factor of 20),
yet kqueue will outperform poll.
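
The arithmetic in that example can be checked with a short sketch. The CPU
percentages are the hypothetical figures from the paragraph above, not
measurements, and the helper name growth_factor is made up for illustration.

```python
def growth_factor(cpu_small: float, cpu_large: float) -> float:
    """Ratio of CPU use at the larger user count to CPU use at the smaller one."""
    return cpu_large / cpu_small


# Hypothetical CPU percentages from the example, keyed by user count.
poll_cpu = {100: 10.0, 1000: 100.0}
kqueue_cpu = {100: 0.01, 1000: 0.2}

poll_growth = growth_factor(poll_cpu[100], poll_cpu[1000])
kqueue_growth = growth_factor(kqueue_cpu[100], kqueue_cpu[1000])

# Relative growth (the "scalability" being argued about) versus absolute cost.
print(f"poll:   grows x{poll_growth:.0f}, uses {poll_cpu[1000]}% at 1000 users")
print(f"kqueue: grows x{kqueue_growth:.0f}, uses {kqueue_cpu[1000]}% at 1000 users")
```

In these made-up numbers poll has the smaller growth factor while kqueue has
the smaller absolute cost at both sizes, which is the distinction the
paragraph draws.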

	I am specifically talking about scalability, in the compsci sense. I grant
that epoll will use less CPU than poll in every configuration.

> Yes, the amount of time spent doing useful work increases as the
> poll() load increases -- but the time wasted setting up and checking
> activity for poll() is something you can never reclaim, and which only
> goes up as your CPU gets faster.  epoll() makes you pay the cost of
> updating the interest list only when the list changes; poll() makes
> you pay the cost every time you call it.

	I know what epoll *is*.

> Empirically, four of the five biggest IRC networks run server software
> that prefers kqueue() on FreeBSD.  kqueue() did not cause them to be
> large, but using kqueue() addresses specific concerns.  On the network
> I can speak for, we look forward to having epoll() on Linux for the
> same reason.

	Wonderful, now please show me where the error in my argument is.

> > 	My experience has been that this is a huge problem with Linux but
> > not with any other OS. It can be solved in user-space with some other
> > penalties by an adaptive sleep before each call to 'poll' and polling
> > with a zero timeout (thus avoiding the wait queue pain). But all the
> > deficiencies in the 'poll' implementation in the world won't show
> > anything except that 'poll' is badly implemented.
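
A minimal user-space sketch of the workaround quoted above (sleep first, then
poll with a zero timeout) might look like the following. It uses Python's
select.poll rather than the C call, and the function name, the sleep bounds,
and the halve/double adaptation rule are all assumptions made for
illustration; the quoted text does not say how the sleep should adapt.

```python
import os
import select
import time


def poll_without_blocking(poller, sleep_s, min_sleep=0.0005, max_sleep=0.05):
    """Sleep, then poll with a zero timeout so the poll call never blocks.

    Returns (events, next_sleep_s). The halve/double rule is an
    illustrative assumption, not something the message specifies.
    """
    time.sleep(sleep_s)
    events = poller.poll(0)  # timeout of 0 ms: report ready fds and return at once
    if events:
        next_sleep = max(min_sleep, sleep_s / 2)  # busy: check again sooner
    else:
        next_sleep = min(max_sleep, sleep_s * 2)  # idle: back off
    return events, next_sleep


# Demo on a pipe: nothing is readable until a byte is written.
read_fd, write_fd = os.pipe()
poller = select.poll()
poller.register(read_fd, select.POLLIN)

events, sleep_s = poll_without_blocking(poller, 0.001)
print("idle:", events, "next sleep:", sleep_s)

os.write(write_fd, b"x")
events, sleep_s = poll_without_blocking(poller, sleep_s)
print("ready:", events, "next sleep:", sleep_s)

os.close(read_fd)
os.close(write_fd)
```

One likely cost of this approach is added latency: an event that arrives
during the sleep is not seen until the sleep ends.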

> Your experience must be unique, because many people have seen poll()'s
> inactive-client overhead cause CPU wastage problems on non-Linux OSes
> (for me, FreeBSD and Solaris).

	References please? And again, artificial cases where the number of active
descriptors was held constant while the number of inactive descriptors was
increased do not count.
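
One possible reading of why that benchmark shape matters is sketched below as
a toy cost model. Everything in it is an assumption made for illustration:
the cost functions, the descriptor counts, and the 1% ready ratio are not
taken from this thread and are not measurements of any kernel.

```python
def poll_like_cost(total_fds: int, ready_fds: int) -> int:
    """Toy cost of one call: every descriptor handed in is examined."""
    return total_fds


def epoll_like_cost(total_fds: int, ready_fds: int) -> int:
    """Toy cost of one call: only ready descriptors are reported."""
    return ready_fds


def growth(cost, small, large):
    """Factor by which the toy cost grows between two (total, ready) loads."""
    return cost(*large) / cost(*small)


# Benchmark shape the message excludes: ready count fixed at 10.
fixed = ((1_000, 10), (10_000, 10))
# Alternative shape: 1% of descriptors ready at every size.
proportional = ((1_000, 10), (10_000, 100))

for name, (small, large) in (("fixed", fixed), ("proportional", proportional)):
    print(name,
          "poll-like x%g" % growth(poll_like_cost, small, large),
          "epoll-like x%g" % growth(epoll_like_cost, small, large))
```

In this model the two interfaces diverge in growth rate only when the ready
count is pinned; when activity rises with the connection count, both grow by
the same factor even though the epoll-like cost stays far lower in absolute
terms.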

> poll() may be badly implemented on Linux or not, but it shares a
> design flaw with select(): that the application must specify the list
> of FDs for each system call, no matter how few change per call.  That
> is the design flaw that epoll() addresses.

	I know that. What does that have to do with anything? Are you even reading
what I'm writing?

> If you truly believe that
> poll()'s implementation is so flawed, please provide an improved
> implementation.

	It's trivial to make the optimizations that my post very obviously
suggests. One would be to defer creating the wait queue until it's clear we
need to wait. The problem is that these optimizations would harm the
low-load and no-load cases, and my understanding from the last time this was
discussed is that changes that worsen the 'most common' case will be refused
even if they improve the 'high load' case.

> To put it another way, all the optimizations in the world for a 'poll'
> implementation won't sustain it unless you understand the flaw in its
> specification.  The specification requires inefficient use of CPU for
> very common situations.

	Fine, but show me how that flaw impacts scalability. Please read what I
said again: I understand that 'epoll' is superior to 'poll'. I'm
specifically disputing whether or not 'epoll' has a specific mathematical
characteristic.

	DS


