All of lore.kernel.org
 help / color / mirror / Atom feed
From: linux@horizon.com
To: davem@davemloft.net, linux@horizon.com
Cc: linux-kernel@vger.kernel.org, mingo@elte.hu
Subject: Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3
Date: 22 Feb 2007 18:52:04 -0500	[thread overview]
Message-ID: <20070222235204.28947.qmail@science.horizon.com> (raw)
In-Reply-To: <20070222.054012.71091934.davem@davemloft.net>

> It's brilliant for disk I/O, not for networking for which
> blocking is the norm not the exception.
> 
> So people will have to likely do something like divide their
> applications into handling for I/O to files and I/O to networking.
> So beautiful. :-)
>
> Nobody has proposed anything yet which scales well and handles both
> cases.

The truly brilliant thing about the whole "create a thread on blocking"
is that you immediately make *every* system call asynchronous-capable,
including the thousands of obscure ioctls, without having to boil the
ocean rewriting 5/6 of the kernel from implicit (stack-based) to
explicit state machines.

You're right that it doesn't solve everything, but it's a big step
forward while keeping a reasonably clean interface.


Now, we have some portions of the kernel (to be precise, those that
currently support poll() and select()) that are written as explicit
state machines and can block on a much smaller context structure.

In truth, the division you assume above isn't so terrible.
My applications are *already* written like that.  It's just "poll()
until I accumulate a whole request, then fork a thread to handle it."

The only way to avoid allocating a kernel stack is to have the entire
handling code path, including the return to user space, written in
explicit state machine style.  (Once you get to user space, you can have
a threading library there if you like.) All the flaming about different
ways to implement completion notification is precisely because not much
is known about the best way to do it; there aren't a lot of applications
that work that way.

(Certainly that's because it wasn't possible before, but it's clearly
an area that requires research, so not committing to an implementation
is A Good Thing.)

But once that is solved, and "system call complete" can be reported
without returning to a user-space thread (which is basically an alternate
system call submission interface, *independent* of the fibril/threadlet
non-blocking implementation), then you can find the hot paths in the
kernel and special-case them to avoid creating a whole thread.

To use a networking analogy, this is a cleanly layered protocol design,
with an optimized fast path *implementation* that blurs the boundaries.


As for the overhead of threading, there are basically three parts:
1) System call (user/kernel boundary crossing) costs.  These depend only
   on the total number of system calls and not on the number of threads
   making them.  They can be mitigated *if necessary* with a syslet-like
   "macro syscall" mechanism to increase the work per boundary crossing.

   The only place threading might increase these numbers is thread
   synchronization, and futexes already solve that pretty well.

2) Register and stack swapping.  These (and associated cache issues)
   are basically unavoidable, and are the bare minimum that longjmp()
   does.  Nothing thread-based is going to reduce this.  (Actually,
   the kernel can do better than user space because it can do lazy FPU
   state swapping.)

3) MMU context switch costs.  These are the big ones, particular on x86
   without TLB context IDs.  However, these fall into a few categories:
   - Mandatory switches because the entire application is blocked.
     I don't see how this can be avoided; these are the cases where
     even a user-space longjmp-based thread library would context
     switch.
   - Context switches between threads in an application.  The Linux
     kernel already optimizes out the MMU context switch in this case,
     and the scheduler already knows that such context switches are
     cheaper and preferred.

   The one further optimization that's possible is if you have a system
   call that (in a common case) blocks multiple times *without accessing
   user memory*.  This is not a read() or write(), but could be
   something like fsync() or ftruncate().  In this case, you could
   temporarily mark the thread as a "kernel thread" that can run in any
   MMU context, and then fix it explicitly when you unmark it on the
   return path.

I can see the space overhead of 1:1 threading, but I really don't think
there's much time overhead.

  reply	other threads:[~2007-02-23  0:15 UTC|newest]

Thread overview: 277+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-02-22 12:27 [patch 00/13] Syslets, "Threadlets", generic AIO support, v3 linux
2007-02-22 13:40 ` David Miller
2007-02-22 23:52   ` linux [this message]
     [not found] <efa6f5910703021120ia0b8d18ne0c806eb2e73da4c@mail.gmail.com>
     [not found] ` <20070303103427.GC22557@2ka.mipt.ru>
2007-03-03 10:58   ` Ihar `Philips` Filipau
2007-03-03 11:03     ` Evgeniy Polyakov
2007-03-03 11:51       ` Ihar `Philips` Filipau
2007-03-03 21:12     ` Ray Lee
2007-03-03 22:14       ` Ihar `Philips` Filipau
2007-03-03 23:54         ` Ray Lee
2007-03-04  9:33       ` Michael K. Edwards
  -- strict thread matches above, loose matches on Subject: below --
2007-02-27 14:15 Al Boldi
2007-02-27 14:15 Al Boldi
2007-02-27 19:22 ` Theodore Tso
2007-03-01 10:21 ` Pavel Machek
2007-02-25 19:11 Al Boldi
2007-02-25 19:10 Al Boldi
2007-02-21 21:13 Ingo Molnar
2007-02-21 22:46 ` Michael K. Edwards
2007-02-21 22:57   ` Ingo Molnar
2007-02-22  0:53     ` Michael K. Edwards
2007-02-22  1:33       ` Michael K. Edwards
2007-02-22  6:51       ` Ingo Molnar
2007-02-22  8:59         ` Michael K. Edwards
2007-02-21 23:03   ` Ingo Molnar
2007-02-21 23:24   ` Ingo Molnar
2007-02-22  0:55     ` Michael K. Edwards
2007-02-21 23:31   ` Ingo Molnar
2007-02-21 23:46     ` Ulrich Drepper
2007-02-22  7:40       ` Ingo Molnar
2007-02-22 11:31         ` Evgeniy Polyakov
2007-02-22 11:52           ` Arjan van de Ven
2007-02-22 12:39             ` Evgeniy Polyakov
2007-02-22 13:41               ` David Miller
2007-02-22 14:31                 ` Ingo Molnar
2007-02-22 14:47                   ` David Miller
2007-02-22 15:02                     ` Evgeniy Polyakov
2007-02-22 15:15                     ` Ingo Molnar
2007-02-22 15:29                       ` Ingo Molnar
2007-02-22 17:17                       ` David Miller
2007-02-23 11:12                         ` Ingo Molnar
2007-02-22 20:13                     ` Davide Libenzi
2007-02-22 21:30                     ` Zach Brown
2007-02-22 14:59                   ` Ingo Molnar
2007-02-22 21:42                   ` Michael K. Edwards
2007-02-22 14:53                 ` Avi Kivity
2007-02-22 12:59           ` Ingo Molnar
2007-02-22 13:32             ` Evgeniy Polyakov
2007-02-22 19:46               ` Davide Libenzi
2007-02-23 12:15                 ` Evgeniy Polyakov
2007-02-23 17:43                   ` Davide Libenzi
2007-02-23 18:01                     ` Evgeniy Polyakov
2007-02-23 20:43                       ` Davide Libenzi
2007-02-23 11:51               ` Ingo Molnar
2007-02-23 12:22                 ` Evgeniy Polyakov
2007-02-23 12:41                   ` Evgeniy Polyakov
2007-02-25 17:45                   ` Ingo Molnar
2007-02-25 18:09                     ` Evgeniy Polyakov
2007-02-25 19:04                       ` Ingo Molnar
2007-02-25 19:42                         ` Evgeniy Polyakov
2007-02-25 20:38                           ` Ingo Molnar
2007-02-26 12:39                           ` Ingo Molnar
2007-02-26 14:05                             ` Evgeniy Polyakov
2007-02-26 14:15                               ` Ingo Molnar
2007-02-26 16:55                                 ` Evgeniy Polyakov
2007-02-26 20:35                                   ` Ingo Molnar
2007-02-26 22:06                                     ` Bill Huey
2007-02-27 10:09                                     ` Evgeniy Polyakov
2007-02-27 17:13                                     ` Pavel Machek
2007-02-27  2:18                                   ` Davide Libenzi
2007-02-27 10:13                                     ` Evgeniy Polyakov
2007-02-27 16:01                                       ` Davide Libenzi
2007-02-27 16:21                                         ` Evgeniy Polyakov
2007-02-27 16:58                                           ` Eric Dumazet
2007-02-27 17:06                                             ` Evgeniy Polyakov
2007-02-27 19:20                                           ` Davide Libenzi
2007-02-26 19:47                           ` Davide Libenzi
2007-02-25 23:14                         ` Michael K. Edwards
2007-02-22 14:17             ` Suparna Bhattacharya
2007-02-22 14:36               ` Ingo Molnar
2007-02-23 14:23                 ` Suparna Bhattacharya
2007-02-22 21:24             ` Michael K. Edwards
2007-02-23  0:30               ` Alan
2007-02-23  2:47                 ` Michael K. Edwards
2007-02-23  8:31                   ` Michael K. Edwards
2007-02-23 10:22                   ` Ingo Molnar
2007-02-23 12:37                   ` Alan
2007-02-23 23:49                     ` Michael K. Edwards
2007-02-24  1:08                       ` Alan
2007-02-24  0:51                         ` Michael K. Edwards
2007-02-24  2:17                           ` Michael K. Edwards
2007-02-24  3:25                           ` Michael K. Edwards
2007-02-23 12:17               ` Ingo Molnar
2007-02-24 19:52                 ` Michael K. Edwards
2007-02-24 21:04                   ` Davide Libenzi
2007-02-24 23:01                     ` Michael K. Edwards
2007-02-25 22:44           ` Linus Torvalds
2007-02-26 13:11             ` Ingo Molnar
2007-02-26 17:37               ` Evgeniy Polyakov
2007-02-26 18:19                 ` Arjan van de Ven
2007-02-26 18:38                   ` Evgeniy Polyakov
2007-02-26 18:56                     ` Chris Friesen
2007-02-26 19:20                       ` Evgeniy Polyakov
2007-02-26 17:28             ` Evgeniy Polyakov
2007-02-26 17:57               ` Linus Torvalds
2007-02-26 18:32                 ` Evgeniy Polyakov
2007-02-26 19:22                   ` Linus Torvalds
2007-02-26 19:30                     ` Evgeniy Polyakov
2007-02-26 20:04                       ` Linus Torvalds
2007-02-27  8:09                         ` Evgeniy Polyakov
2007-02-26 19:54                 ` Ingo Molnar
2007-02-27 10:28                   ` Evgeniy Polyakov
2007-02-27 11:52                     ` Theodore Tso
2007-02-27 12:11                       ` Evgeniy Polyakov
2007-02-27 12:13                         ` Ingo Molnar
2007-02-27 12:40                           ` Evgeniy Polyakov
2007-02-28 16:14                         ` Pavel Machek
2007-03-01  8:18                           ` Evgeniy Polyakov
2007-03-01  9:26                             ` Pavel Machek
2007-03-01  9:47                               ` Evgeniy Polyakov
2007-03-01  9:54                                 ` Ingo Molnar
2007-03-01 10:59                                   ` Evgeniy Polyakov
2007-03-01 11:00                                     ` Ingo Molnar
2007-03-01 11:16                                       ` Evgeniy Polyakov
2007-03-01 11:27                                         ` Ingo Molnar
2007-03-01 11:36                                           ` Evgeniy Polyakov
2007-03-01 11:41                                         ` Ingo Molnar
2007-03-01 11:47                                           ` Ingo Molnar
2007-03-01 12:10                                             ` Evgeniy Polyakov
2007-03-01 12:43                                               ` Ingo Molnar
2007-03-01 13:01                                                 ` Evgeniy Polyakov
2007-03-01 13:11                                                   ` Ingo Molnar
2007-03-01 13:30                                                     ` Evgeniy Polyakov
2007-03-01 14:19                                                       ` Eric Dumazet
2007-03-01 14:16                                                         ` Ingo Molnar
2007-03-01 14:31                                                           ` Eric Dumazet
2007-03-01 14:27                                                             ` Ingo Molnar
2007-03-01 14:54                                                           ` Evgeniy Polyakov
2007-03-01 15:09                                                             ` Ingo Molnar
2007-03-01 15:36                                                               ` Evgeniy Polyakov
2007-03-02 10:57                                                                 ` Ingo Molnar
2007-03-02 11:48                                                                   ` Evgeniy Polyakov
2007-03-02 17:32                                                                   ` Davide Libenzi
2007-03-02 19:39                                                                     ` Ingo Molnar
2007-03-02 20:18                                                                       ` Davide Libenzi
2007-03-02 20:29                                                                         ` Ingo Molnar
2007-03-02 20:53                                                                           ` Davide Libenzi
2007-03-02 21:21                                                                             ` Michael K. Edwards
2007-03-02 21:43                                                                             ` Nicholas Miell
2007-03-03  0:52                                                                               ` Davide Libenzi
2007-03-03  1:36                                                                                 ` Nicholas Miell
2007-03-03  1:48                                                                                   ` Benjamin LaHaise
2007-03-03  2:19                                                                                   ` Davide Libenzi
2007-03-03  7:19                                                                                     ` Ingo Molnar
2007-03-03  7:20                                                                                       ` Ingo Molnar
2007-03-03  9:02                                                                                       ` Davide Libenzi
2007-03-01 19:31                                                             ` Davide Libenzi
2007-03-02  8:10                                                               ` Evgeniy Polyakov
2007-03-02 17:13                                                                 ` Davide Libenzi
2007-03-02 19:13                                                                   ` Davide Libenzi
2007-03-03 10:06                                                                   ` Evgeniy Polyakov
2007-03-03 18:46                                                                     ` Davide Libenzi
2007-03-03 20:31                                                                       ` Evgeniy Polyakov
2007-03-03 21:57                                                                         ` Davide Libenzi
2007-03-03 22:10                                                                           ` Davide Libenzi
2007-03-04 16:23                                                                           ` Kirk Kuchov
2007-03-04 17:46                                                                             ` Kyle Moffett
2007-03-05  5:23                                                                               ` Michael K. Edwards
2007-03-04 21:17                                                                             ` Davide Libenzi
2007-03-04 22:49                                                                               ` Kirk Kuchov
2007-03-04 22:57                                                                                 ` Davide Libenzi
2007-03-05  4:46                                                                                 ` Magnus Naeslund(k)
2007-03-06  8:19                                                                                 ` Pavel Machek
2007-03-07 12:02                                                                                   ` Kirk Kuchov
2007-03-07 17:26                                                                                     ` Linus Torvalds
2007-03-07 17:39                                                                                     ` Ingo Molnar
2007-03-07 18:21                                                                                       ` Kirk Kuchov
2007-03-07 18:24                                                                                         ` Jens Axboe
2007-03-07 18:32                                                                                         ` Evgeniy Polyakov
2007-03-01 12:01                                           ` Evgeniy Polyakov
2007-03-01 11:14                                     ` Eric Dumazet
2007-03-01 11:20                                       ` Evgeniy Polyakov
2007-03-01 11:28                                         ` Eric Dumazet
2007-03-01 11:47                                           ` Evgeniy Polyakov
2007-03-01 13:12                                             ` Eric Dumazet
2007-03-01 14:43                                               ` Evgeniy Polyakov
2007-03-01 14:47                                                 ` Ingo Molnar
2007-03-01 15:23                                                   ` Evgeniy Polyakov
2007-03-01 15:32                                                     ` Eric Dumazet
2007-03-01 15:41                                                       ` Eric Dumazet
2007-03-01 15:51                                                         ` Evgeniy Polyakov
2007-03-01 15:47                                                       ` Evgeniy Polyakov
2007-03-01 19:47                                                       ` Davide Libenzi
2007-03-01 14:57                                                 ` Evgeniy Polyakov
2007-03-01 12:19                                           ` Evgeniy Polyakov
2007-03-01 12:34                                     ` Ingo Molnar
2007-03-01 13:26                                       ` Evgeniy Polyakov
2007-03-01 13:32                                         ` Ingo Molnar
2007-03-01 14:24                                           ` Evgeniy Polyakov
2007-03-01 16:56                                     ` David Lang
2007-03-01 17:56                                       ` Evgeniy Polyakov
2007-03-01 18:41                                         ` David Lang
2007-03-01 19:19                                   ` Davide Libenzi
2007-03-01 10:11                                 ` Pavel Machek
2007-03-01 10:19                                   ` Ingo Molnar
2007-03-01 11:18                                   ` Evgeniy Polyakov
2007-03-02 10:27                                     ` Pavel Machek
2007-03-02 10:37                                       ` Evgeniy Polyakov
2007-03-02 10:56                                         ` Ingo Molnar
2007-03-02 11:08                                           ` Evgeniy Polyakov
2007-03-02 17:28                                             ` Davide Libenzi
2007-03-03 10:27                                               ` Evgeniy Polyakov
2007-03-01 19:24                             ` Johann Borck
2007-03-01 19:37                               ` David Lang
2007-03-01 20:34                                 ` Johann Borck
2007-02-27 12:34                       ` Ingo Molnar
2007-02-27 13:14                         ` Evgeniy Polyakov
2007-02-27 13:32                         ` Avi Kivity
2007-02-28  3:03                       ` Michael K. Edwards
2007-02-28  8:02                         ` Evgeniy Polyakov
2007-02-28 17:01                           ` Michael K. Edwards
2007-02-28 16:38                         ` Phillip Susi
2007-02-22 19:38         ` Davide Libenzi
2007-02-28  9:45           ` Ingo Molnar
2007-02-28 16:17             ` Davide Libenzi
2007-02-28 16:42               ` Linus Torvalds
2007-02-28 17:26                 ` Ingo Molnar
2007-02-28 18:22                 ` Davide Libenzi
2007-02-28 18:42                   ` Linus Torvalds
2007-02-28 18:50                     ` Davide Libenzi
2007-02-28 19:03                       ` Chris Friesen
2007-02-28 19:42                         ` Davide Libenzi
2007-03-01  8:38                           ` Evgeniy Polyakov
2007-03-01  9:28                             ` Evgeniy Polyakov
2007-02-28 23:12                 ` Ingo Molnar
2007-03-01  1:33                   ` Andrea Arcangeli
2007-03-01  9:15                     ` Evgeniy Polyakov
2007-03-01 21:27                   ` Linus Torvalds
2007-02-28 20:21               ` Ingo Molnar
2007-02-28 21:09                 ` Davide Libenzi
2007-02-28 21:23                   ` Ingo Molnar
2007-02-28 21:46                     ` Davide Libenzi
2007-02-28 22:22                       ` Ingo Molnar
2007-02-28 22:47                         ` Davide Libenzi
2007-02-22 21:23         ` Zach Brown
2007-02-22 21:32           ` Benjamin LaHaise
2007-02-22 21:44             ` Zach Brown
2007-02-22  1:04     ` Michael K. Edwards
2007-02-22  7:00       ` Ingo Molnar
2007-02-22  9:29         ` Michael K. Edwards
2007-02-22 10:01 ` Suparna Bhattacharya
2007-02-22 11:20   ` Ingo Molnar
2007-02-24 18:34 ` Evgeniy Polyakov
2007-02-25 17:23   ` Ingo Molnar
2007-02-25 17:44     ` Evgeniy Polyakov
2007-02-25 17:54       ` Ingo Molnar
2007-02-25 18:21         ` Evgeniy Polyakov
2007-02-25 18:22           ` Ingo Molnar
2007-02-25 18:37             ` Evgeniy Polyakov
2007-02-25 18:34               ` Ingo Molnar
2007-02-25 20:01                 ` Frederik Deweerdt
2007-02-25 19:21               ` Ingo Molnar
     [not found]                 ` <20070225194645.GB1353@2ka.mipt.ru>
     [not found]                   ` <20070225195308.GC15681@elte.hu>
     [not found]                     ` <Pine.LNX.4.64.0702251232350.6011@alien.or.mcafeemobile.com>
2007-02-26  8:16                       ` Ingo Molnar
2007-02-26  9:25                         ` Evgeniy Polyakov
2007-02-26  9:55                           ` Ingo Molnar
2007-02-26 10:31                             ` Ingo Molnar
2007-02-26 10:43                               ` Evgeniy Polyakov
2007-02-26 20:02                               ` Davide Libenzi
2007-02-26 10:33                             ` Evgeniy Polyakov
2007-02-26 10:35                               ` Ingo Molnar
2007-02-26 10:47                                 ` Evgeniy Polyakov
2007-02-26 12:51                                   ` Ingo Molnar
2007-02-26 16:46                                     ` Evgeniy Polyakov
2007-02-27  6:24                                       ` Ingo Molnar
2007-02-27 10:41                                         ` Evgeniy Polyakov
2007-02-27 10:49                                           ` Ingo Molnar
2007-02-25 18:25           ` Evgeniy Polyakov
2007-02-25 18:24             ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070222235204.28947.qmail@science.horizon.com \
    --to=linux@horizon.com \
    --cc=davem@davemloft.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.