linux-kernel.vger.kernel.org archive mirror
* Asynchronous io
@ 2001-04-12 15:40 CJ
  2001-04-12 16:22 ` Bart Trojanowski
  0 siblings, 1 reply; 5+ messages in thread
From: CJ @ 2001-04-12 15:40 UTC (permalink / raw)
  To: Linux Kernel

//Linux really needs a clean basis for asynchronous and
//unbuffered I/O libraries.  Something that does for select()
//and aio_* polling what clone() did for fork/threads.  This
//might be a start; and it is just a file, much like a
//pipe or socket.

//Suppose we add /dev/qio with 64-byte sectors, as follows:

struct qio{            //64 byte i/o request
    u16 flags;          //0.0 request block variant, SEEK_SET...
    u16 verb;           //0.2 open,close,read,mmap,sync,write,
                        //    ioctl
                        //    mallocIO&read,write&freeIO,
                        //    mallocIO,freeIO
                        //    autothread might be an ioctl()
    u16 errno;          //0.4 per request status
    u16 completehow;    //0.6 queue,AST,pipe,SIGIO,SIGIO||delete ok
    u64 offset;         //1 
    u32 length;         //2.0 bytes requested
    u32 timeout;        //2.4 in ms or us?
    u32 transferred;    //3.0 bytes
    u32 qiohandle;      //3.4 for cancel or polling
    void* handle;       //4 (open & close might write)
    void* buffer;       //5
    void* callback;     //6 optimize special cases w/ completehow
    void* callparam;    //7 
};                      //all fields are read xor write

//Writing to the device would schedule I/O; reading would reap
//completions.  A bad write would return the byte offset of the
//rejected sector if detected synchronously, and a multi-sector
//write would be truncated at the first bad sector.  Accepted
//writes would be buffered in the kernel.

//Each open creates a new queue, and completions for each write
//are reaped from the same queue.  Any number of threads can
//read or write a queue.
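
//For illustration only: a hypothetical userspace sketch of the
//submit/reap cycle above.  The device name, the QIO_* constants,
//and the queued-completion semantics are assumptions of this
//proposal, not an existing interface.

#include <fcntl.h>              /* open, O_RDWR */
#include <unistd.h>             /* read, write */
#include <stdio.h>
/* struct qio and the u16/u32/u64 typedefs as declared above */

#define QIO_READ           2    /* placeholder verb code */
#define QIO_COMPLETE_QUEUE 0    /* placeholder: reap via read() */

void demo(int datafd, void *buf, unsigned len)
{
    int qfd = open("/dev/qio", O_RDWR);     /* one queue per open */

    struct qio req = {0};
    req.verb        = QIO_READ;
    req.completehow = QIO_COMPLETE_QUEUE;
    req.handle      = (void *)(long)datafd; /* an already-open file */
    req.buffer      = buf;
    req.length      = len;
    req.offset      = 0;

    /* submit: each 64-byte sector written schedules one request */
    if (write(qfd, &req, sizeof req) != sizeof req)
        return;                             /* rejected synchronously */

    /* reap: blocks until a completion is queued on this open */
    struct qio done;
    if (read(qfd, &done, sizeof done) == sizeof done)
        printf("transferred %u bytes\n", done.transferred);
}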

//Some cases might be simplified by kernel-processed completions,
//such as VMS AST emulation, or putting results in a pipe.  Hence
//completehow, which might use callback and callparam.

//timeout?  
//canceling i/o?  
//Sun aio emulation?  
//VMS qio emulation?  
//MS IOCP emulation?
//malloc()&free() safe across threads?
//Should O_DIRECT error unless buffers are properly aligned, etc.?


* Re: Asynchronous io
  2001-04-12 15:40 Asynchronous io CJ
@ 2001-04-12 16:22 ` Bart Trojanowski
  0 siblings, 0 replies; 5+ messages in thread
From: Bart Trojanowski @ 2001-04-12 16:22 UTC (permalink / raw)
  To: CJ; +Cc: Linux Kernel

Hi CJ,
  you should really read the thread titled "Linux's implementation of
poll() not scalable?" in the LKML archives; here is a link:
http://www.uwsg.iu.edu/hypermail/linux/kernel/0010.3/0003.html
There are many problems with a /dev/something interface for events, and
they are all described in that thread.

  I have been working on an approach suggested by Linus to get rid of
the performance hit of select() and poll().  I have a working model for
TCP sockets (as that is what I wanted to speed up - a TCP-based proxy).
My implementation is still in alpha but is available here:
  http://www.jukie.net/~bart/kernel/fdevent/

  Now, before anyone gets too excited... I spoke with Linus about this
and he suggested that I speak with Ben LaHaise, who is working on async
IO using some modifications to the wait queues.  I have sent him mail
but have not heard back from Ben - I guess he must be as busy as the
rest of us, with a full mailbox of messages he has no time to reply
to. :)

  My implementation introduces two new system calls: bind_event and
get_events (as described in Linus's email above).  The project is still
at an alpha stage, so I don't have any benchmarks yet.  I am working on
this in my own time, so progress is moving at a slow pace...
unfortunately.
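
  For the curious, usage looks roughly like the sketch below.  The
prototypes follow the shape Linus outlined in that thread; the EV_READ
mask, the event_loop() wrapper, and the handler are placeholders of
mine, not the final interface:

#include <sys/time.h>                   /* struct timeval */

#define EV_READ 1                       /* placeholder mask bit */

struct event {
        int           fd;               /* descriptor of interest */
        unsigned long mask;             /* requested events       */
        void         *opaque;           /* cookie, returned as-is */
};

int bind_event(int fd, struct event *ev);       /* register once */
int get_events(struct event *evs, int max,      /* reap ready    */
               struct timeval *tmout);

void event_loop(int sock, void *conn, void (*handle)(void *))
{
        struct event ev = { sock, EV_READ, conn };
        bind_event(sock, &ev);          /* no per-loop rescan of fds */

        struct event ready[64];
        for (;;) {
                int n = get_events(ready, 64, NULL);  /* block here */
                for (int i = 0; i < n; i++)
                        handle(ready[i].opaque);
        }
}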

Regards,
Bart.

On Thu, 12 Apr 2001, CJ wrote:

> //Linux really needs a clean basis for asynchronous and
> //unbuffered I/O libraries.
>
> [rest of the /dev/qio proposal snipped; see the message above]

-- 
	WebSig: http://www.jukie.net/~bart/sig/





* Re: Asynchronous IO
  2001-04-13  8:45 Asynchronous IO Dan Maas
  2001-04-14  2:00 ` Christopher Smith
@ 2001-04-19 18:19 ` Stephen C. Tweedie
  1 sibling, 0 replies; 5+ messages in thread
From: Stephen C. Tweedie @ 2001-04-19 18:19 UTC (permalink / raw)
  To: Dan Maas; +Cc: linux-kernel, cj, bart, Stephen Tweedie

Hi,

On Fri, Apr 13, 2001 at 04:45:07AM -0400, Dan Maas wrote:
> IIRC the problem with implementing asynchronous *disk* I/O in Linux today is
> that the filesystem code assumes synchronous I/O operations that block the
> whole process/thread. So implementing "real" asynch I/O (without the
> overhead of creating a process context for each operation) would require
> re-writing the filesystems as non-blocking state machines. Last I heard this
> was a long-term goal, but nobody's done the work yet

SGI and Ben LaHaise both have kernel async IO functionality working,
and Ingo Molnar's Tux code has support for doing certain filesystem
lookup operations asynchronously too.  

--Stephen


* Re: Asynchronous IO
  2001-04-13  8:45 Asynchronous IO Dan Maas
@ 2001-04-14  2:00 ` Christopher Smith
  2001-04-19 18:19 ` Stephen C. Tweedie
  1 sibling, 0 replies; 5+ messages in thread
From: Christopher Smith @ 2001-04-14  2:00 UTC (permalink / raw)
  To: Dan Maas, linux-kernel; +Cc: cj, bart

--On Friday, April 13, 2001 04:45:07 -0400 Dan Maas <dmaas@dcine.com> wrote:
> IIRC the problem with implementing asynchronous *disk* I/O in Linux today
> is that the filesystem code assumes synchronous I/O operations that block
> the whole process/thread. So implementing "real" asynch I/O (without the
> overhead of creating a process context for each operation) would require
> re-writing the filesystems as non-blocking state machines. Last I heard
> this was a long-term goal, but nobody's done the work yet (aside from
> maybe the SGI folks with XFS?). Or maybe I don't know what I'm talking
> about...

If the FS supports generic read then this is not a problem; this is
what SGI's KAIO does, as does Bart's work.

> Bart, glad to hear you are working on an event interface, sounds cool! One
> feature that I really, really, *really* want to see implemented is the
> ability to block on a set of any "waitable kernel objects" with one
> syscall - not just file descriptors, but also SysV semaphores and message
> queues, UNIX signals and child processes, file locks, pthreads condition
> variables, asynch disk I/O completions, etc. I am dying for a clean way to
> accomplish this that doesn't require more than one thread... (Win32 and
> FreeBSD kick our butts here with MsgWaitForMultipleObjects() and
> kevent()...) IMHO cleaning up this API deficiency is just as important as
> optimizing the extreme case of socket I/O with zillions of file
> descriptors...

Actually, sigwaitinfo() has zero problem waiting on multiple signals.
If you are using real-time signals, each signal can carry a pointer to
the relevant object, so even if you're only blocking on a single signal
you can receive info about several objects.
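
For example, here is a minimal sketch.  The sender side (sigqueue(), or
an aio completion with SIGEV_SIGNAL attaching the pointer) and struct
conn are assumed:

#include <signal.h>
#include <stdio.h>

struct conn { int fd; };                /* assumed per-object state */

void reap_events(void)
{
        sigset_t set;
        siginfo_t info;

        sigemptyset(&set);
        sigaddset(&set, SIGRTMIN);
        sigprocmask(SIG_BLOCK, &set, NULL);  /* block before waiting */

        for (;;) {
                if (sigwaitinfo(&set, &info) < 0)
                        continue;            /* e.g. EINTR */
                /* si_value carries the pointer the sender attached */
                struct conn *c = info.si_value.sival_ptr;
                printf("event on fd %d\n", c->fd);
        }
}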

<insert thread about how signals suck here>

--Chris


* Re: Asynchronous IO
@ 2001-04-13  8:45 Dan Maas
  2001-04-14  2:00 ` Christopher Smith
  2001-04-19 18:19 ` Stephen C. Tweedie
  0 siblings, 2 replies; 5+ messages in thread
From: Dan Maas @ 2001-04-13  8:45 UTC (permalink / raw)
  To: linux-kernel; +Cc: cj, bart

IIRC the problem with implementing asynchronous *disk* I/O in Linux today is
that the filesystem code assumes synchronous I/O operations that block the
whole process/thread. So implementing "real" asynch I/O (without the
overhead of creating a process context for each operation) would require
re-writing the filesystems as non-blocking state machines. Last I heard this
was a long-term goal, but nobody's done the work yet (aside from maybe the
SGI folks with XFS?). Or maybe I don't know what I'm talking about...

Bart, glad to hear you are working on an event interface, sounds cool! One
feature that I really, really, *really* want to see implemented is the
ability to block on a set of any "waitable kernel objects" with one
syscall - not just file descriptors, but also SysV semaphores and message
queues, UNIX signals and child processes, file locks, pthreads condition
variables, asynch disk I/O completions, etc. I am dying for a clean way to
accomplish this that doesn't require more than one thread... (Win32 and
FreeBSD kick our butts here with MsgWaitForMultipleObjects() and
kevent()...) IMHO cleaning up this API deficiency is just as important as
optimizing the extreme case of socket I/O with zillions of file
descriptors...
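
For the curious, here is roughly what such a mixed wait looks like with
FreeBSD's kevent() - one call watching a socket and a signal at once
(the sock descriptor is assumed to be open already):

#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <signal.h>
#include <stdio.h>

void wait_mixed(int sock)
{
        int kq = kqueue();
        struct kevent chg[2], ev[2];

        /* heterogeneous objects in one changelist: an fd and a signal */
        EV_SET(&chg[0], sock, EVFILT_READ, EV_ADD, 0, 0, NULL);
        EV_SET(&chg[1], SIGUSR1, EVFILT_SIGNAL, EV_ADD, 0, 0, NULL);
        signal(SIGUSR1, SIG_IGN);      /* kqueue still sees it */

        int n = kevent(kq, chg, 2, ev, 2, NULL);  /* block on either */
        for (int i = 0; i < n; i++)
                printf("%s ready\n",
                       ev[i].filter == EVFILT_SIGNAL ? "signal"
                                                     : "socket");
}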

Regards,
Dan


