linux-kernel.vger.kernel.org archive mirror
* Pipes and fd question. Large amounts of data.
@ 2005-01-30  9:15 Oded Shimon
       [not found] ` <200501300941.45554.miles@milessabin.com>
  2005-01-30 19:41 ` Miquel van Smoorenburg
  0 siblings, 2 replies; 5+ messages in thread
From: Oded Shimon @ 2005-01-30  9:15 UTC (permalink / raw)
  To: linux-kernel

A Unix C programming question. Has to do mostly with pipes, so I am hoping I 
am asking in the right place.

I have a rather unique situation. I have 2 programs, neither of which I have 
control over.
Program A writes into two FIFOs.
Program B reads from two FIFOs.

My program is the middle step.

The problem - neither program is aware of the other, and each accesses the 
FIFOs at its own free will. They will also block until whatever data transfer 
they started is complete.

Meaning, if I were to use the direct approach and have no middle step, the 
programs would be thrown into a deadlock instantly, as one program will be 
writing into FIFO 1 while the other is reading from FIFO 2.

The amount of data is very large - GBs in total, moving at anywhere from 
10MB a second to possibly as much as 300MB a second. So efficiency in context 
switching is very important.

Programs A & B both write and read in large chunks, usually 300KB.

So far, my solution is using select() and non-blocking pipes. I also used 
large buffers (20MB). In my measurements, in the worst case a program 
writes/reads 6MB before switching to the other FIFO, so 20MB is safe enough.

I have implemented this, but it has a major disadvantage - every write() 
only writes 4KB at a time, never more, because of how non-blocking pipes 
work. At 20,000 context switches a second, this method barely reaches 10MB a 
second, if not less.

Blocking pipes have an advantage - they can move large chunks at a time. They 
have a more serious disadvantage, though - the amount of data you ask to be 
written/read IS the amount of data that will be written or read, and the call 
will block until that much data has moved. I cannot know beforehand exactly 
how much data the programs want, so this could easily fall into a deadlock.

Ideally, I could do this:
my program:  write(20MB);
program B:   read(300KB);
my program:  write() returns with return value 300,000

I was unable to find anything like this solution, or similar.
No combination of blocking/non-blocking fds gives this behaviour, nor does 
any system call I could find.
I am looking for alternative/better suggestions.

- ods15.


* Re: Pipes and fd question. Large amounts of data.
       [not found] ` <200501300941.45554.miles@milessabin.com>
@ 2005-01-30 10:48   ` Oded Shimon
  2005-01-31 15:02     ` Chris Friesen
  0 siblings, 1 reply; 5+ messages in thread
From: Oded Shimon @ 2005-01-30 10:48 UTC (permalink / raw)
  To: Miles Sabin; +Cc: linux-kernel

On Sunday 30 January 2005 11:41, Miles wrote:
> My suggestion would be to perform blocking writes in a separate thread
> for each of the two written-to fds. You can still use select/poll for
> the read side ... tho' once you're using threading on the write side it
> might be more straightforward to use threading on the read side as
> well. Bear in mind that if you do that you'll need to dedicate threads
> to _each_ of the four fds, because each of them could block
> independently while progress is required on one or more of the others.
>
> I'd say that this was one of the rare cases where a solution using
> threads is not only superior to one using event-driven IO, but actually
> required.

Yeah, I reached just about the same conclusion. At first I thought only 2 
threads were necessary, one for each data flow, but I realized a deadlock 
could happen just as well there too. Making a 4-thread implementation I 
trust is gonna be hard... I'd better get working. :)

Thanks for the reply,
- ods15


* Re: Pipes and fd question. Large amounts of data.
  2005-01-30  9:15 Pipes and fd question. Large amounts of data Oded Shimon
       [not found] ` <200501300941.45554.miles@milessabin.com>
@ 2005-01-30 19:41 ` Miquel van Smoorenburg
  1 sibling, 0 replies; 5+ messages in thread
From: Miquel van Smoorenburg @ 2005-01-30 19:41 UTC (permalink / raw)
  To: linux-kernel

In article <200501301115.59532.ods15@ods15.dyndns.org>,
Oded Shimon  <ods15@ods15.dyndns.org> wrote:
>I have implemented this, but it has a major disadvantage - every write() 
>only writes 4KB at a time, never more, because of how non-blocking pipes 
>work. At 20,000 context switches a second, this method barely reaches 
>10MB a second, if not less.

If you're using pipe(), you might want to try socketpair()
instead. You can setsockopt() SO_RCVBUF and SO_SNDBUF to
large values if you want.

Mike.



* Re: Pipes and fd question. Large amounts of data.
  2005-01-30 10:48   ` Oded Shimon
@ 2005-01-31 15:02     ` Chris Friesen
  2005-01-31 15:14       ` Oded Shimon
  0 siblings, 1 reply; 5+ messages in thread
From: Chris Friesen @ 2005-01-31 15:02 UTC (permalink / raw)
  To: Oded Shimon; +Cc: Miles Sabin, linux-kernel

Oded Shimon wrote:
> On Sunday 30 January 2005 11:41, Miles wrote:

>>I'd say that this was one of the rare cases where a solution using
>>threads is not only superior to one using event-driven IO, but actually
>>required.

> Yeah, I reached just about the same conclusion. At first I thought only 2 
> threads were necessary, one for each data flow, but I realized a deadlock 
> could happen just as well there too. Making a 4-thread implementation I 
> trust is gonna be hard... I'd better get working. :)

Your other option would be to use processes with shared memory (either 
SysV shm or memory-mapped files).  This gets you the speed of shared 
memory, but also the reliability of not sharing your entire address 
space.

If you use NPTL, your locking should be quick as well.  If not, you can 
always roll your own futex-based locking.

Chris


* Re: Pipes and fd question. Large amounts of data.
  2005-01-31 15:02     ` Chris Friesen
@ 2005-01-31 15:14       ` Oded Shimon
  0 siblings, 0 replies; 5+ messages in thread
From: Oded Shimon @ 2005-01-31 15:14 UTC (permalink / raw)
  To: Chris Friesen; +Cc: Miles Sabin, linux-kernel

On Monday 31 January 2005 17:02, Chris Friesen wrote:
> Your other option would be to use processes with shared memory (either
> sysV or memory-mapped files).  This gets you the speed of shared memory
> maps, but also lets you get the reliability of not sharing your entire
> memory space.
>
> If you use NPTL, your locking should be quick as well.  If not, you can
> always roll your own futex-based locking.

To be honest, most of that was gibberish to me (NPTL, futex, SysV...). Most of 
my experience with system calls is with pipes and files; I know very little 
about these other things...
Either way, you are a bit late - just half an hour ago I completed my 
program, and it works. :) I finished the pthread-instead-of-select() 
implementation pretty quickly (now I understand why lazy programmers use 
threads... heh); what took me so long was trouble with the 2 other programs - 
I had to refine their command line params carefully...
(BTW, the 2 other programs are MPlayer and MEncoder, and my job was 
transferring video AND audio between them.)

Thank you for the reply,
- ods15

