linux-kernel.vger.kernel.org archive mirror
* Async IO using threads
From: Reza Roboubi @ 2002-02-27 10:55 UTC
  To: LK

SUMMARY:

Basically, I'm trying to do async IO with a high-priority SCHED_FIFO
thread reading the disk while another, lower-priority thread does the
"real" work.  But I can't get _nearly_ enough out of the CPU while the
other thread reads the disk.  It is just intolerably inefficient, and I
_hope_ that I am making a mistake.
Any ideas on how this should work are appreciated.
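
In outline, the setup is roughly this (a simplified sketch for
illustration only, NOT the actual async.c; error handling omitted, and
SCHED_FIFO needs root):

#include <fcntl.h>
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

static volatile unsigned long counter;  /* the "real" work done so far */
static volatile int done;

static void set_fifo(int prio)
{
    struct sched_param sp;
    sp.sched_priority = prio;
    pthread_setschedparam(pthread_self(), SCHED_FIFO, &sp);
}

static void *reader(void *arg)          /* high-priority disk reader */
{
    static char buf[64 * 1024];
    int fd = open((char *)arg, O_RDONLY);
    long left = 50L << 20;              /* read ~50MB, then stop */
    int n;

    set_fifo(10);
    while (left > 0 && (n = read(fd, buf, sizeof buf)) > 0)
        left -= n;                      /* each read blocks in the kernel */
    close(fd);
    done = 1;
    return NULL;
}

int main(int argc, char **argv)
{
    pthread_t t;

    pthread_create(&t, NULL, reader, argv[1]);
    set_fifo(1);                        /* low-priority worker */
    while (!done)                       /* this should keep counting    */
        counter++;                      /* while the reader is on disk  */
    pthread_join(t, NULL);
    printf("useful CPU work %lu\n", counter);
    return 0;
}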

MORE INFO (only if you must have it):

I read much of the async IO / kio discussion on the LK mailing list.
In the end, Linus concluded that threading _is_ the way to go for now
(2001, I believe).

First, I have kernel 2.2.16 (Red Hat 6.2).  If this has been corrected
in 2.4, then please let me know, but I think not.

On my system, "raw" read()ing a large chunk of the /dev/hda5 partition
shows that reading a page (4k) takes about 230000 clock "ticks", which
is the CPU effort required for about 23 context switches.  So I figure
that if the disk generates the "IO available" interrupt once every 4k
chunk (this might be the bad assumption), then Linux has plenty of time
to do several switches between the interrupt handler, the high-priority
SCHED_FIFO process, and the low-priority SCHED_FIFO process, and still
have time for plenty of useful work at the user level, and time to get
back to handle the IO request.  During this read(), I should be able to
use at _least_ 50% of my CPU.  But I get much less than 10 percent!!
Why??
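
(For concreteness: a measurement along these lines, using the CPU cycle
counter, produces such per-page numbers.  This is an illustrative
sketch, not the actual async.c code:)

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

static inline unsigned long long rdtsc(void)
{
    unsigned long long t;
    __asm__ __volatile__("rdtsc" : "=A" (t));   /* x86 only */
    return t;
}

int main(void)
{
    char buf[4096];                     /* one page */
    int fd = open("/dev/hda5", O_RDONLY);
    unsigned long long t0 = rdtsc();

    read(fd, buf, sizeof buf);          /* raw read of one 4k page */
    printf("ticks for one page: %llu\n", rdtsc() - t0);
    close(fd);
    return 0;
}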

If there is anything that should be done to the kernel, please let me
know, as I'd certainly be very willing to help.  How exactly _does_
this scheduling-and-IO thing work?  Is there some "jiffy" that _must_
expire before Linux switches and lets my other thread do useful work?
If so, then how do you shorten it?  Or is it that my IDE disk is very
lousy?  Then what are the parameters I should consider in an IDE disk,
and how do I tell what I have??  Or is this simply a real, still-open
Linux bug?  (Hard to believe.)

Or maybe my test code is faulty (also unlikely).

(Test code at http://www.linisoft.com/test/async.c .)

Please reply to me directly.

Thanks in advance for any insight.

-- 
Reza

* Re: Async IO using threads
From: Masoud Sharbiani @ 2002-02-27 17:37 UTC
  To: Reza Roboubi, linux-kernel

Hello,
I tried your program on my system (P3 800MHz / 256MB RAM, IDE hard
drive with UDMA enabled, 2.4.17-rmap12f) with minor changes: I used a
file instead of a raw device.  After creating the file (64 megabytes)
and flushing the read cache (by writing another huge file with dd on
the same filesystem), this is what happened:
A normal read test (for speed measurements).
[root@masouds1 bsd]# time cat mytest > /dev/null

real    0m1.771s
user    0m0.020s
sys     0m0.280s
---
So, 1.7 sec. total time to read data from file.
Now, I flushed cache again and ran your test program:
[root@masouds1 bsd]# ./async
useful CPU work         1 at time(secs, micro-secs) 1014831058 173783
useful CPU work     80848 at time(secs, micro-secs) 1014831059 776664
useful CPU work   1216070 at time(secs, micro-secs) 1014831069 786353

Between lines 2 and 3 of the output, your program sleeps 10 seconds.
That works out to 121607 counts per second at full speed.  Now, when
the reader thread and the worker thread are both running, you get about
80000 counts in 1.6 seconds, where full speed would give
1.6 * 121607 = 194570.  That is roughly 41% of the CPU.
And remember that a lot of time is consumed copying 64 megabytes of
data to the user buffer (let alone the kernel moving it around, and the
context switches).
So I believe there isn't a bug in recent versions of the Linux kernel,
unless I'm way off track!
Can you run the same test I did and report the results here?
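The steps, roughly (file names and sizes here are only examples):

[root@masouds1 bsd]# dd if=/dev/zero of=mytest bs=1024k count=64
[root@masouds1 bsd]# dd if=/dev/zero of=flushme bs=1024k count=256
[root@masouds1 bsd]# time cat mytest > /dev/null
[root@masouds1 bsd]# dd if=/dev/zero of=flushme2 bs=1024k count=256
[root@masouds1 bsd]# ./async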

Masoud
PS: make sure you are not running your IDE drive in PIO mode.
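You can check and change that with hdparm (the device name is just an
example; see hdparm(8) before poking at a live disk):

[root@masouds1 bsd]# hdparm -d -m /dev/hda      # current using_dma / multcount
[root@masouds1 bsd]# hdparm -i /dev/hda         # what the drive supports
[root@masouds1 bsd]# hdparm -d1 -m16 /dev/hda   # enable DMA, 16-sector reads
[root@masouds1 bsd]# hdparm -t /dev/hda         # time buffered disk reads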





* Re: Async IO using threads
From: Reza Roboubi @ 2002-02-28 17:29 UTC
  To: Masoud Sharbiani; +Cc: linux-kernel

Masoud,

First, let me thank you for reading and running my test code on your
system.  I greatly appreciate that.  It's so good to see a man with
hard numbers as opposed to just "speeches."  Your response has been
extremely helpful.

> PS: make sure you are not running your IDE drive in PIO mode.

This one-line tip of yours was probably more helpful to me than many
hours of heart-bleeding M$ support can be to some people.

You reminded me that long ago, due to system instability, I had turned
off some of my BIOS features.  They could have caused the kernel to set
my hda parameters conservatively.  It turns out that not only was DMA
off, but "multiple read" was also set to one.  The DMA, though, made
the major difference:

> a raw device. After creating the file (64 megabytes) and flushing the
> read cache (by writing another huge file with dd on the same
> filesystem), this is what happened:
> A normal read test (for speed measurements).
> [root@masouds1 bsd]# time cat mytest > /dev/null
> 
> real    0m1.771s
> user    0m0.020s
> sys     0m0.280s
> ---

I got:

[root in ~]$ time cat /scratch0/big  > /dev/null
0.53user 3.12system 0:09.35elapsed 39%CPU (0avgtext+0avgdata
0maxresident)k
0inputs+0outputs (25214major+14minor)pagefaults 0swaps

This is probably much better than I had before (big = 102 MB, so
roughly 11 MB/s).


> So, 1.7 sec. total time to read data from file.
> Now, I flushed cache again and ran your test program:
> [root@masouds1 bsd]# ./async
> useful CPU work         1 at time(secs, micro-secs) 1014831058 173783
> useful CPU work     80848 at time(secs, micro-secs) 1014831059 776664
> useful CPU work   1216070 at time(secs, micro-secs) 1014831069 786353
> 

I get:
[root in /home/reza/backup/tmpwork/tests/linux_timings]$ ./async.out 
useful CPU work         1 at time(secs, micro-secs) 1014905754 12224
useful CPU work    240204 at time(secs, micro-secs) 1014905758 8111
useful CPU work   1082083 at time(secs, micro-secs) 1014905768 15236

(using the raw device, NOT the cache)

This is 63% efficiency.  It is beautiful.  Note that this answers my
basic question, which I had known all along anyway: making my code more
complex to take advantage of multi-threading is most certainly worth it.

Now, I did the test again, this time using FIFOs for the "real work".
This is less efficient, and gives about 45% of the CPU back during
another thread's read(2).

Intuition suggests that this can still be better, because I also did
tests of memcpy and thread context switching under Linux, and Linux is
very efficient in these areas (my machine can do roughly 400k context
switches in the 4 seconds it took to read that ~50MB chunk; see the
test above).  That appears to be excellent performance (on the
microsecond scale, anyway).  And one might figure that the CPU does not
need 55% of its power sustaining a few inter-thread context switches
and copies during read(large_chunk).  But my tests are small chunks of
code; when things get large, as they are in the kernel, I can see
constant factors like TLB updates and such adding up.
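
(For reference, a context-switch rate of that order can be measured
with the classic pipe ping-pong test.  This is a sketch, not my exact
test code: two processes bounce one byte back and forth, so each round
trip forces two context switches; time it with time(1) and divide.)

#include <unistd.h>

int main(void)
{
    int ab[2], ba[2];                   /* parent->child, child->parent */
    char c = 'x';
    int i, rounds = 200000;             /* ~400k switches in total */

    pipe(ab);
    pipe(ba);
    if (fork() == 0) {                  /* child: echo the byte back */
        for (i = 0; i < rounds; i++) {
            read(ab[0], &c, 1);
            write(ba[1], &c, 1);
        }
        _exit(0);
    }
    for (i = 0; i < rounds; i++) {      /* parent: send, then wait */
        write(ab[1], &c, 1);
        read(ba[0], &c, 1);
    }
    return 0;
}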

I can see how valuable it would be to put aside some time and study the
IDE driver source, and the kernel in general.  At least when one wants
something specific, as the Google servers do, one can probably find
ways to tailor the kernel and get more out of it, maybe much more for
something really specific.  No WONDER Google would choose Linux.  It is
impossible to customize any closed-source OS that way.

In any case, for any further questions in this regard, the source code
and the LK archives should be my reference.  But your great help got me
beautifully into the order of magnitude I wanted.

Thank you so much, again.

-- 
Reza

* Re: Async IO using threads
From: Reza Roboubi @ 2002-03-01  0:43 UTC
  To: Masoud Sharbiani; +Cc: linux-kernel

By the way, I mentioned that I rewrote the test to do the "useful work"
using FIFOs, and that gave about 45% of the CPU back during the read()
operation.

Just in case anyone wants that test, it is on the web site with the
other test:

http://www.linisoft.com/test/asyncf.c  //async using fifo
http://www.linisoft.com/test/async.c  //async using __asm__(lock)
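
(In case the comment is cryptic: the __asm__ "lock" is just an atomic
increment of the shared work counter.  Schematically, and only as a
guess at the shape rather than a paste from async.c, it amounts to
something like this on i386:)

#include <stdio.h>

static volatile unsigned long counter;      /* shared work counter */

static inline void count_work(void)         /* atomic counter++ */
{
    __asm__ __volatile__("lock; incl %0" : "+m" (counter));
}

int main(void)
{
    count_work();
    printf("%lu\n", counter);               /* prints 1 */
    return 0;
}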
-- 
Reza
