From: Linus Torvalds <>
To: Ingo Oeser <>
Cc: Kernel Mailing List <>
Subject: Re: Make pipe data structure be a circular list of pages, rather
Date: Fri, 14 Jan 2005 14:44:04 -0800 (PST)
Message-ID: <>
In-Reply-To: <>

On Fri, 14 Jan 2005, Ingo Oeser wrote:
> Data sink/source is simple, indeed. You just implemented a buffering
> between a drivers output filp, if I understand you correctly.

Yes and no. It is indeed just buffering.

The thing I find most intriguing about it, and the reason I think it's
important, is that while it's "just" buffering, it is _standard_ buffering.
That's where it gets interesting. Everybody needs buffers, and they are
generally so easy to implement that there's no point in having much of a
"buffer library".

But if the standard way of buffering allows you to do interesting stuff
_between_ two things, then that's where it gets interesting: the
combinations it allows.

> Now both directions together and it gets a bit more difficult.
> Driver writers basically reimplement fs/pipe.c over and over again.

But bidirectional is just two of those things. 

> Always the same basic operations for that needing the same structure
> to handle it.


> Maybe the solution here is just using ONE fd and read/write to it.

No. Use two totally separate fd's, and make it _cheap_ to move between 
them. That's what "splice()" gives you - basically a very low-cost way to 
move between two uni-directional things. No "memcpy()", because memcpy is 
expensive for large streams of data.
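[Editor's note: a user-space sketch of that model, written against the
splice(2) syscall as it later shipped in Linux 2.6.17. The helper name
and the 64 KiB chunk size are illustrative choices, not anything from
the mail itself.]

```c
/* Sketch: move data from one fd to another through a pipe using
 * splice(2) -- no user-space memcpy() of the payload; page
 * references are moved into and out of the pipe's ring. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

static ssize_t splice_copy(int in_fd, int out_fd)
{
    int p[2];                 /* the pipe sitting between the two fds */
    ssize_t n, total = 0;

    if (pipe(p) < 0)
        return -1;

    /* in_fd -> pipe: pages are referenced into the pipe, not copied */
    while ((n = splice(in_fd, NULL, p[1], NULL, 65536, SPLICE_F_MOVE)) > 0) {
        ssize_t left = n;
        /* pipe -> out_fd: drain everything we just spliced in */
        while (left > 0) {
            ssize_t m = splice(p[0], NULL, out_fd, NULL, left,
                               SPLICE_F_MOVE);
            if (m <= 0)
                goto fail;
            left -= m;
        }
        total += n;
    }
    close(p[0]);
    close(p[1]);
    return n < 0 ? -1 : total;
fail:
    close(p[0]);
    close(p[1]);
    return -1;
}
```

Note that at least one side of each splice() call must be a pipe, which
is exactly why the pipe acts as the standard buffer between the two fds.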

> > > We also don't have wire_fds(), which would wire up two fds by
> > > connecting the underlying file pointers with each other and closing the
> > > fds.
> >
> > But that is _exactly_ what splice() would do.
> >
> > So you could have the above open of "/dev/mpeginput", and then you just
> > sit and splice the result into a file or whatever.
> >
> > See?
> But you are still pushing everything through the page cache.
> I would like to see "fop->connect(producer, consumer)"

No no. There's no buffer cache involved. There is just "buffers".

If you end up reading from a regular file (or writing to one), then yes,
the buffers end up being picked up from the buffer cache. But that's by no
means required. The buffers can be just anonymous pages (like the ones a
regular "write()" to a pipe generates), or they could be DMA pages
allocated for the data by a device driver. Or they could be the page that
contains a "skb" from a networking device. 
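[Editor's note: the "pipe as a circular list of buffers, each knowing
where its page came from" idea is easy to model in plain user-space C.
All names below are illustrative; they mirror, but are not, the
struct pipe_buffer machinery that later appeared in the kernel.]

```c
/* Model: a pipe is a small circular ring of buffer descriptors.
 * Each descriptor points at a page-sized chunk plus a release
 * callback saying how to give the page back (anonymous memory,
 * page cache, driver DMA area, skb...). Producers and consumers
 * move descriptors, never the data itself. */
#include <stddef.h>

#define RING_SIZE 16    /* must be a power of two */

struct pbuf {
    void *page;                     /* the data page (owner-defined) */
    unsigned int offset, len;       /* valid bytes within the page */
    void (*release)(struct pbuf *); /* how to return the page */
};

struct pipe_ring {
    struct pbuf bufs[RING_SIZE];
    unsigned int head, tail;        /* producer/consumer counters */
};

/* producer: hand a page (not a copy of its data) to the pipe */
static int ring_push(struct pipe_ring *r, struct pbuf b)
{
    if (r->head - r->tail == RING_SIZE)
        return -1;                  /* ring full */
    r->bufs[r->head++ % RING_SIZE] = b;
    return 0;
}

/* consumer: take ownership of the next page descriptor */
static int ring_pop(struct pipe_ring *r, struct pbuf *out)
{
    if (r->head == r->tail)
        return -1;                  /* ring empty */
    *out = r->bufs[r->tail++ % RING_SIZE];
    return 0;
}
```

The point of the release callback is that the consumer never needs to
know whether the page was anonymous, page-cache, or DMA memory.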

I really think that splice() is what you want, and what you call "wire()".

> Imagine 3 chips. One is a encoder/decoder chip with 8 channels, the other
> 2 chips are a video DAC/ADC and an audio DAC/ADC with 4 channels each.
> These chips can be wired directly by programming a wire matrix (much like a 
> dumb routing table). But you can also receive/send via all of these chips
> to/from harddisk for recording/playback. 
> So you have to implement the drivers for these chips to provide one filp per 
> channel and one minor per chip.
> If I can do this with splice, you've got me and I'm really looking forward to 
> your first commits/patches, since this itch is scratching me since long ;-)

Yes, I believe that we're talking about the same thing. What you can do in 
my vision is:

 - create a pipe for feeding the audio decoder chip. This is just the 
   sound driver interface to a pipe, and it's the "device_open()" code I 
   gave as an example in my previous email, except going the other way (ie 
   the writer is the 'fd', and the driver is the "reader" of the data).

   You'd do this with a simple 'fd = open("/dev/xxxx", O_WRONLY)',
   together with some ioctl (if necessary) to set up the actual
   _parameters_ for the piped device.

 - do a "splice()" from a file to the pipe. A splice from a regular file 
   is really nothing but a page cache lookup, and moving those page cache 
   pages to the pipe.

 - tell the receiving fd to start processing it (again, this might be an
   ioctl - some devices may need directions on how to interpret the data,
   or it might be implicit in the fact that the pipe got woken up by being
   written to).

Going the other way (receiving data from a hardware device) is all the 
same thing - and there "tee()" may be useful, since it would allow the 
received data to be dup'ed to two different sinks.
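[Editor's note: the tee() described here also became a real syscall in
Linux 2.6.17. A minimal sketch of duplicating one received stream so
two sinks can each splice it; the helper name is illustrative, and both
fds must be pipe ends.]

```c
/* Sketch: duplicate the data sitting in one pipe into a second pipe
 * with tee(2), WITHOUT consuming it from the source. Only page
 * references are copied between the pipes; the payload stays put,
 * so two different sinks can each drain the same stream. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

static ssize_t dup_to_second_sink(int src_rd, int dup_wr, size_t len)
{
    /* SPLICE_F_NONBLOCK: don't sleep if the second pipe is full;
     * returns the number of bytes duplicated */
    return tee(src_rd, dup_wr, len, SPLICE_F_NONBLOCK);
}
```

After the call, src_rd still holds the original data for the first
sink, and dup_wr's pipe holds a reference-counted duplicate for the
second.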


Thread overview: 20+ messages
2005-01-08  8:25 Make pipe data structure be a circular list of pages, rather linux
2005-01-08 18:41 ` Linus Torvalds
2005-01-08 21:47   ` Alan Cox
2005-01-13 21:46   ` Ingo Oeser
2005-01-13 22:32     ` Linus Torvalds
2005-01-14 21:03       ` Ingo Oeser
2005-01-14 21:29         ` Linus Torvalds
2005-01-14 22:12           ` Ingo Oeser
2005-01-14 22:44             ` Linus Torvalds [this message]
2005-01-14 23:34               ` Ingo Oeser
2005-01-15  0:16                 ` Linus Torvalds
2005-01-16  2:59                   ` Linus Torvalds
2005-01-17 16:03                     ` Ingo Oeser
2005-01-19 21:12                     ` Make pipe data structure be a circular list of pages, rather than linux
2005-01-20  2:06                       ` Robert White
2005-01-15 23:42 Make pipe data structure be a circular list of pages, rather linux
2005-01-15 22:55 ` Alan Cox
2005-01-16  0:12   ` Linus Torvalds
2005-01-16  2:02     ` Miquel van Smoorenburg
2005-01-16  2:06     ` Jeff Garzik
