Date: Fri, 14 Jan 2005 16:16:00 -0800 (PST)
From: Linus Torvalds
To: Ingo Oeser
cc: linux@horizon.com, Kernel Mailing List
Subject: Re: Make pipe data structure be a circular list of pages, rather
In-Reply-To: <200501150034.31880.ioe-lkml@axxeo.de>
Message-ID:
References: <20050108082535.24141.qmail@science.horizon.com> <200501142312.50861.ioe-lkml@axxeo.de> <200501150034.31880.ioe-lkml@axxeo.de>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, 15 Jan 2005, Ingo Oeser wrote:
>
> Now imagine this (ASCII art in monospace font):
>
> [ chip A ] ------(1)------ [ chip B ]    [ CPU ]   [memory]   [disk]
>     |                          |            |         |         |
>     +-----------(2)------------+----(2)-----+---(2)---+---------+
>
> Yes, I understand and support your vision. Now I would like to use
> path (1), the direct one, for this. Possible?

Yes, if I understand correctly what you're thinking.

I'm just assuming (for the sake of simplicity) that "A" generates all the
data, and "B" is the one that uses it. If both A and B can generate/use
data, it still works, but you'd need two pipes per endpoint, simply
because pipes are fundamentally uni-directional.

For this particular example, I'll also assume that whatever buffers chips
A/B use are "mappable" to the CPU too: they could either be regular memory
pages (that A/B DMA to/from) or IO buffers on the card itself that can be
mapped into the CPU address space and memcpy'ed.

That second simplification is something that at least my current example
code (which I don't think I've sent to you) kind of assumes. It's not
really _forced_ onto the design, but it avoids the need for a separate
"accessor" function to copy things around.

[ If you cannot map them into CPU space, then each "struct pipe_buffer"
  op structure would need operations to copy them to/from kernel memory
  and to copy them to/from user memory, so the second simplification
  basically avoids having to have four extra "accessor" functions. ]

What you'd do to get your topology above is:

 - create a "pipe" for both A and B (let's call them "fdA" and "fdB"
   respectively).

 - the driver is responsible for creating the "struct pipe_buffer" for any
   data generated by chip A. If it's regular memory, that's a page
   allocation and the appropriate DMA setup; if it's IO memory on the
   card, you'd need to generate fake "struct page *" entries and a mapping
   function for them.

 - if you want to send the data that A generates both to fdB _and_ write
   it to disk (file X), you'd also need one anonymous pipe (fdC) and an
   fdX opened for writing to the file, and then in your process you'd
   effectively just do:

	for (;;) {
		n = tee(fdA, fdB, fdC);
		splice(fdC, fdX, n);
	}

   and you'd get what I think you are after.

The reason you need the "fdC" is simply that the "tee()" operation would
only work on pipes: you cannot duplicate anything other than a pipe buffer
with "tee()", so you couldn't do a direct "tee(fdA, fdB, fdX)" in my
world.
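As a rough illustration, a user-space version of that loop might look like
the sketch below. It assumes the hypothetical three-argument tee() and
splice() wrappers exactly as used in the loop above (not the system calls
Linux later gained, which have different signatures and semantics), and
"/dev/chipA" and "/dev/chipB" are made-up names for the pipes the two
drivers would expose:

	/*
	 * Illustrative sketch only: "tee" and "splice" here are the
	 * hypothetical interfaces described in this mail, declared as
	 * assumed wrappers rather than real system calls.
	 */
	#include <fcntl.h>
	#include <stdio.h>
	#include <stdlib.h>
	#include <unistd.h>

	/* hypothetical wrappers, assumed to exist for this sketch */
	extern ssize_t tee(int fd_in, int fd_out1, int fd_out2);
	extern ssize_t splice(int fd_in, int fd_out, size_t len);

	int main(void)
	{
		int pipefd[2];		/* the anonymous pipe ("fdC") */
		int fdA, fdB, fdX;
		ssize_t n;

		fdA = open("/dev/chipA", O_RDONLY);	/* pipe fed by chip A's driver */
		fdB = open("/dev/chipB", O_WRONLY);	/* pipe drained by chip B's driver */
		fdX = open("output.dat", O_WRONLY | O_CREAT | O_TRUNC, 0644);
		if (fdA < 0 || fdB < 0 || fdX < 0 || pipe(pipefd) < 0) {
			perror("setup");
			return EXIT_FAILURE;
		}

		for (;;) {
			/* duplicate chip A's buffers into both fdB and the
			   write side of the anonymous pipe */
			n = tee(fdA, fdB, pipefd[1]);
			if (n <= 0)
				break;		/* 0 = EOF, < 0 = error */

			/* drain the read side of the anonymous pipe to the
			   file, allowing for partial results */
			while (n > 0) {
				ssize_t done = splice(pipefd[0], fdX, n);
				if (done <= 0) {
					perror("splice");
					return EXIT_FAILURE;
				}
				n -= done;
			}
		}
		return 0;
	}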
(The tee()/splice() loop above is _extremely_ simplified: in real life
you'd have to take care of partial results from "splice()", error
conditions and so on, of course.)

NOTE! My first cut will _assume_ that all buffers are in RAM, so it
wouldn't support the notion of on-card memory. On-card memory does
complicate things - not just the accessor functions, but also the notion
of how to "splice()" (or "tee()") from one to the other. If you assume
buffers-in-RAM, then all pointers are the same, and a tee/splice really
just moves a pointer around. But if the pointer can have magic meaning
that depends on the source it came from, then splicing such a pointer to
another device must inevitably imply at least the possibility of some
kind of memory copy.

So I don't think you'll get _exactly_ what you want for a while, since
you'd have to go through in-memory buffers. But there's no huge conceptual
problem (just enough implementation issues to make it inconvenient) with
the concept of actually keeping the buffer on an external controller.

		Linus
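As a purely illustrative sketch of why buffers-in-RAM make tee/splice so
cheap, consider the invented, simplified types below (they are not the
kernel's actual pipe structures): duplicating a buffer is just a
descriptor copy plus a reference-count bump, and the data itself is never
touched.

	#include <stddef.h>

	/* Invented, simplified types -- not the kernel's actual structures. */
	struct page;				/* opaque stand-in for a page of RAM */

	struct pipe_buffer {
		struct page *page;		/* where the data lives ... */
		unsigned int offset, len;	/* ... and how much of it */
		unsigned int *refcount;		/* shared count on the page */
	};

	struct pipe_ring {
		struct pipe_buffer bufs[16];	/* circular list of buffers */
		unsigned int head;
	};

	/*
	 * "tee" one buffer into another pipe: when the buffer is ordinary
	 * RAM, duplicating it is just copying the descriptor and taking
	 * another reference on the page -- no memcpy of the payload.
	 */
	static void tee_one_buffer(struct pipe_ring *dst,
				   const struct pipe_buffer *src)
	{
		struct pipe_buffer *slot = &dst->bufs[dst->head++ % 16];

		*slot = *src;			/* pointer + offset/len only */
		++*slot->refcount;		/* both pipes now share the page */
	}

A buffer that exists only as on-card memory has no page the destination
can share like this, which is why splicing it toward another device
implies at least the possibility of a copy, as noted above.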