linux-kernel.vger.kernel.org archive mirror
From: "David Schwartz" <davids@webmaster.com>
To: "Davide Libenzi" <davidel@xmailserver.org>
Cc: "Eric Varsanyi" <e0206@foo21.com>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>
Subject: RE: [Patch][RFC] epoll and half closed TCP connections
Date: Sun, 13 Jul 2003 16:05:38 -0700
Message-ID: <MDEHLPKNGKAHNMBLJOLKGEFKEFAA.davids@webmaster.com>
In-Reply-To: <Pine.LNX.4.55.0307131334380.15022@bigblue.dev.mcafeelabs.com>


> Let's look at what the poll code does :
>
> 1) It has to allocate the kernel buffer for events
>
> 2) It has to copy it from userspace
>
> 3) It has to allocate wait queue buffer calling get_free_page (possibly
> 	multiple times when we talk about decent fds numbers)
>
> 4) It has to loop calling N times f_op->poll() that in turn will add into
> 	the wait queue getting/releasing IRQ locks
>
> 5) Loop another M loop to copy events to userspace
>
> 6) Call kfree() for all blocks allocated
>
> 7) Call poll_freewait() that will go with another N loop to unregister
> 	poll waits, that in turn will do another N IRQ locks

	This is really just due to bad coding in 'poll', or more precisely coding
that is very bad for this case. For example, why is it allocating a wait queue buffer if the
odds that it will need to wait are basically zero? Why is it adding file
descriptors to the wait queue before it has determined that it needs to
wait?

	As load increases, more and more calls to 'poll' require no waiting. Yet
'poll' is heavily optimized for the 'no or low load' case. That's why 'poll'
doesn't scale on Linux.
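
To make the ordering argument concrete, here is a minimal user-space sketch of
"check first, register waiters only if nothing is ready". It is a toy model,
not kernel code: `struct toy_file`, `toy_poll_check_first` and the
`registrations` counter are hypothetical names invented for illustration, and
the "wait queue" is just a flag.

```c
#include <stddef.h>

/* Hypothetical stand-in for a pollable file; not a kernel structure. */
struct toy_file {
    int ready;       /* nonzero if an event is already pending */
    int registered;  /* nonzero once we sit on its (imaginary) wait queue */
};

/*
 * Pass 1: scan for pending events without touching any wait queue.
 * Pass 2: only if nothing was ready, register on every file.
 * Returns the number of ready files; *registrations counts the
 * wait-queue insertions that were actually performed.
 */
static size_t toy_poll_check_first(struct toy_file *files, size_t n,
                                   size_t *registrations)
{
    size_t ready = 0;
    *registrations = 0;

    for (size_t i = 0; i < n; i++)
        if (files[i].ready)
            ready++;

    if (ready > 0)
        return ready;  /* loaded case: no registration work at all */

    for (size_t i = 0; i < n; i++) {
        files[i].registered = 1;
        (*registrations)++;
    }
    return 0;  /* a real implementation would now sleep until woken */
}
```

In this model the registration cost is paid only on the idle path, which is
the case where the caller was going to sleep anyway.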

> Yes, of course. The time spent inside poll/select becomes a PITA when you
> start dealing with huge number of fds. And this is kernel time. This does
> not obviously mean that if epoll is 10 times faster than poll under load,
> and you switch your app on epoll, it'll be ten times faster. It means that
> the kernel time spent inside poll will be 1/10. And many of the operations
> done by poll require IRQ locks and this increase the time the kernel
> spend with disabled IRQs, that is never a good thing.

	My experience has been that this is a huge problem with Linux but not with
any other OS. It can be solved in user-space with some other penalties by
an adaptive sleep before each call to 'poll' and polling with a zero timeout
(thus avoiding the wait queue pain). But all the deficiencies in the 'poll'
implementation in the world won't show anything except that 'poll' is badly
implemented.
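
The user-space workaround described above could look roughly like the
following sketch. The helper names (`next_sleep_ns`, `poll_nowait_adaptive`)
and the back-off constants are assumptions chosen for illustration, not values
from any real application; only `poll(2)` and `nanosleep(2)` are real
interfaces.

```c
#define _POSIX_C_SOURCE 200809L
#include <poll.h>
#include <time.h>

#define INITIAL_SLEEP_NS 100000L    /* 100 us, arbitrary starting point */
#define MAX_SLEEP_NS     10000000L  /* 10 ms, arbitrary upper bound */

/* Back off while idle; drop straight to zero as soon as there is work. */
static long next_sleep_ns(long current_ns, int nready)
{
    if (nready > 0)
        return 0;
    if (current_ns == 0)
        return INITIAL_SLEEP_NS;
    long doubled = current_ns * 2;
    return doubled > MAX_SLEEP_NS ? MAX_SLEEP_NS : doubled;
}

/*
 * Sleep for the current adaptive interval, then poll with a zero
 * timeout so the call never blocks inside poll itself.
 * Returns whatever poll() returned and updates *sleep_ns on success.
 */
static int poll_nowait_adaptive(struct pollfd *fds, nfds_t nfds,
                                long *sleep_ns)
{
    if (*sleep_ns > 0) {
        struct timespec ts = { .tv_sec = 0, .tv_nsec = *sleep_ns };
        nanosleep(&ts, NULL);
    }
    int nready = poll(fds, nfds, 0);
    if (nready >= 0)
        *sleep_ns = next_sleep_ns(*sleep_ns, nready);
    return nready;
}
```

A caller would keep one `long sleep_ns = 0;` per event loop and call
`poll_nowait_adaptive` in place of a blocking `poll`. The penalty is added
latency of up to one sleep interval when an event arrives while the loop is
idle.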

> - Davide

	DS



