linux-kernel.vger.kernel.org archive mirror
* O_DIRECT! or O_DIRECT?
@ 2001-07-03 20:34 Samium Gromoff
  2001-07-03 20:38 ` kernel
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Samium Gromoff @ 2001-07-03 20:34 UTC
  To: linux-kernel

        Hi folks, some time ago I saw a post on lkml
    from >< regarding the implementation of O_DIRECT.
    What caught my attention is the fact that *nobody*
    reacted to this post. It seems to me that nobody was
    happy enough about it to say "oh yes! at last!"

        This is interesting, because one real advantage
    of O_DIRECT is those greased-weasel-fast 15-20 Mb/s
    file copies, the ones that make windoze users look
    down on us as lesser beings.

        I understand, though, that this approach scales
    badly under multithreaded loads, which are
    especially important in server environments, the place
    linux initially grew from, and that is why it hasn't
    been implemented already.

        One more problem I see here, and I think it is an
    *extremely* important one, is that open( ... ,
    BLA_BLA_BLA | O_DIRECT) is a thing some people may
    overuse. I mean that implementing O_DIRECT
    in cp(1) wins the prize, but in the case of, say,
    find(1) it is definitely not a wise move. The problem
    may be described as "poisoning" software with this
    godblessed O_DIRECT, to the point where 70% of the code
    on an average machine uses it, thus *completely*
    killing the advantages of buffered access, and
    suddenly *bang!*: the overall performance is dead.

        But the worst thing is that the process of
    poisoning is completely uncontrollable: every
    stupid dude can think that His shitty piece of Code
    is Especially Important, and that in his case O_DIRECT
    is perfectly suitable. And if His code is
    somehow performance critical, then most likely O_DIRECT
    really will improve his Code's benchmarks, and that
    makes things really awful, leading to a hell of a large
    crowd of pig-happy dudes thinking their useless code
    is life critical, and thus dooming linux.

        Maybe I'm as stupid as these potential dudes, and
    painting things in too dark colors, but O_DIRECT,
    I think, is a dangerous thing to play with.

        That is why, I think, Linus, as far as I can
    recall, wasn't happy with it at all.

        Maybe I'm missing the whole point, and so I want to
    hear what other people have to say about it.


Cheers,

  Samium Gromoff


* Re: O_DIRECT! or O_DIRECT?
  2001-07-03 20:34 O_DIRECT! or O_DIRECT? Samium Gromoff
@ 2001-07-03 20:38 ` kernel
  2001-07-03 21:12 ` Anton Altaparmakov
  2001-07-04 17:52 ` Stephen C. Tweedie
  2 siblings, 0 replies; 8+ messages in thread
From: kernel @ 2001-07-03 20:38 UTC
  To: Samium Gromoff; +Cc: linux-kernel

On Wed, 4 Jul 2001, Samium Gromoff wrote: 

>         Maybe I'm missing the whole point, and so I want to
>     hear what other people have to say about it.

Several of us are working on it.

		-ben



* Re: O_DIRECT! or O_DIRECT?
  2001-07-03 20:34 O_DIRECT! or O_DIRECT? Samium Gromoff
  2001-07-03 20:38 ` kernel
@ 2001-07-03 21:12 ` Anton Altaparmakov
  2001-07-04 17:52 ` Stephen C. Tweedie
  2 siblings, 0 replies; 8+ messages in thread
From: Anton Altaparmakov @ 2001-07-03 21:12 UTC
  To: Samium Gromoff; +Cc: linux-kernel

At 21:34 03/07/2001, Samium Gromoff wrote:
[snip]
>        One more problem I see here, and I think it is an
>     *extremely* important one, is that open( ... ,
>     BLA_BLA_BLA | O_DIRECT) is a thing some people may
>     overuse. I mean that implementing O_DIRECT
>     in cp(1) wins the prize, but in the case of, say,

Why should it? It is very well possible that the file(s) being copied have 
been accessed beforehand and hence are already in the page/buffer cache. 
Using O_DIRECT would not only completely bypass the page/buffer cache but 
it would also cause the cache to be flushed (if dirty) and the cache 
buffers/pages invalidated (otherwise you lose coherency). This is going to 
be _slower_ than not using O_DIRECT.

>     find(1) it is definitely not a wise move. The problem
>     may be described as "poisoning" software with this
>     godblessed O_DIRECT, to the point where 70% of the code
>     on an average machine uses it, thus *completely*
>     killing the advantages of buffered access, and
>     suddenly *bang!*: the overall performance is dead.

Er. Using O_DIRECT means you are doing _unbuffered_ access. - Maybe I am 
misunderstanding your comments, but it seems to me you have the whole 
concept of O_DIRECT the wrong way round.

>        But the worst thing is that the process of
>     poisoning is completely uncontrollable: every
>     stupid dude can think that His shitty piece of Code
>     is Especially Important, and that in his case O_DIRECT
>     is perfectly suitable. And if His code is
>     somehow performance critical, then most likely O_DIRECT
>     really will improve his Code's benchmarks, and that
>     makes things really awful, leading to a hell of a large
>     crowd of pig-happy dudes thinking their useless code
>     is life critical, and thus dooming linux.

O_DIRECT _decreases_ performance drastically in most cases. So nobody in 
their right mind would use it for normal applications. - The people who 
would use it and would actually experience a speed _increase_ would be 
programmers of large databases which perform their own caching in user 
space (thus making the normal fs level caching unnecessary, and in fact, 
worse than the unbuffered case) and programmers of multi media streaming 
applications (e.g. video/audio streaming including DVD playback[1] for 
example) which know that A) the data is not in the cache and B) the data 
will never be accessed again in the near future so caching the data is not 
only pointless but causes actually useful (other, unrelated) data present 
in the cache to be displaced out of the cache.
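
A minimal sketch of the streaming case described above, assuming a Linux
system: the file is read once, front to back, with O_DIRECT so that the data
is not kept in the page cache. The function name stream_once, the 4096-byte
alignment (DIO_ALIGN) and the fallback to buffered reads when O_DIRECT is
rejected are illustrative assumptions, not something taken from this thread.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes O_DIRECT in <fcntl.h> on Linux */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0             /* flag not exposed: sketch degrades to buffered IO */
#endif

#define DIO_ALIGN 4096         /* assumed alignment; the real value depends on device and fs */

/*
 * Read `path` once from start to end in `chunk`-byte reads, adding every
 * byte into *sum as a stand-in for real processing.  Tries O_DIRECT first
 * and falls back to buffered reads if the kernel rejects it.
 * Returns the number of bytes read, or -1 on error.
 */
static long stream_once(const char *path, size_t chunk, unsigned long *sum)
{
    assert(sum != NULL && chunk > 0 && chunk % DIO_ALIGN == 0);

    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0 && errno == EINVAL)          /* fs refuses O_DIRECT at open time */
        fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    void *mem = NULL;
    if (posix_memalign(&mem, DIO_ALIGN, chunk) != 0) {
        close(fd);
        return -1;
    }
    unsigned char *buf = mem;
    int may_be_direct = (O_DIRECT != 0);
    long total = 0;
    *sum = 0;

    for (;;) {
        ssize_t n = read(fd, buf, chunk);
        if (n < 0 && errno == EINVAL && may_be_direct) {
            /* Rejected at read time: clear O_DIRECT and retry buffered. */
            int fl = fcntl(fd, F_GETFL);
            if (fl < 0 || fcntl(fd, F_SETFL, fl & ~O_DIRECT) < 0) {
                total = -1;
                break;
            }
            may_be_direct = 0;
            continue;
        }
        if (n < 0) {
            total = -1;
            break;
        }
        if (n == 0)                         /* end of file */
            break;
        for (ssize_t i = 0; i < n; i++)
            *sum += buf[i];
        total += n;
    }
    free(mem);
    close(fd);
    return total;
}
```

A caller would pass a fairly large chunk, for example
stream_once(path, 1 << 20, &sum), since each direct read is issued
synchronously.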

>        Maybe I'm as stupid as these potential dudes, and
>     painting things in too dark colors, but O_DIRECT,
>     I think, is a dangerous thing to play with.

It is indeed. It is only useful in very special circumstances as described 
above. Using it in "normal" applications is stupid and will lead to 
degradation of performance of the application using it.

>        Maybe I'm missing the whole point, and so I want to
>     hear what other people have to say about it.

I think you do... I hope I managed to explain what O_DIRECT actually is above.

Shame you didn't attend the Linux Developers Conference (in Manchester) 
last weekend as Andrea Arcangeli gave a very nice talk explaining O_DIRECT 
in depth.

Best regards,

         Anton

[1] Actually DVD players make use of raw i/o to access the DVD disk device 
as a whole, thus bypassing file system code altogether, which is even 
faster, but if you were to copy a DVD to your hard drive then O_DIRECT 
would give you the described benefits.


-- 
   "Nothing succeeds like success." - Alexandre Dumas
-- 
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Linux NTFS Maintainer / WWW: http://linux-ntfs.sf.net/
ICQ: 8561279 / WWW: http://www-stu.christs.cam.ac.uk/~aia21/



* Re: O_DIRECT! or O_DIRECT?
  2001-07-03 20:34 O_DIRECT! or O_DIRECT? Samium Gromoff
  2001-07-03 20:38 ` kernel
  2001-07-03 21:12 ` Anton Altaparmakov
@ 2001-07-04 17:52 ` Stephen C. Tweedie
  2001-07-04 18:27   ` Miquel van Smoorenburg
  2 siblings, 1 reply; 8+ messages in thread
From: Stephen C. Tweedie @ 2001-07-04 17:52 UTC
  To: Samium Gromoff; +Cc: linux-kernel, Stephen Tweedie

Hi,

On Wed, Jul 04, 2001 at 12:34:35AM +0400, Samium Gromoff wrote:
> 
>         This is interesting, because one real advantage
>     of O_DIRECT is those greased-weasel-fast 15-20 Mb/s
>     file copies, the ones that make windoze users look
>     down on us as lesser beings.

Not true.

O_DIRECT does not speed up sequential file accesses.  If anything, it
may well slow them down, especially for writes.  What O_DIRECT does is
twofold --- it guarantees physical IO to the disk (so that you know
for sure that the data is on disk for writes, or that the data on disk
is readable for reads); and it avoids the memory and CPU overhead of
keeping any cached copy of the data.

But because O_DIRECT is completely synchronous, it's not possible for
the kernel to implement its normal readahead and writebehind IO
clustering for direct IO.  If you use the normal approach of writing
4k at a time to an O_DIRECT file, things may well be *massively*
slower than usual because the kernel is sending individual 4k IOs to
the disk, and because it is waiting for each IO to complete before the
application provides the next one.

On the contrary, buffered writes allow the kernel to batch those 4k
writes into large disk IOs, perhaps 100k or more; and the kernel can
maintain a queue of more than one such IO, so that once the first IO
completes the next one is immediately ready to be sent out.

For these reasons, buffered IO is often faster than O_DIRECT for pure
sequential access.  The downside is its greater CPU cost and the fact
that it pollutes the cache (which, in turn, causes even _more_ CPU
overhead when the VM is forced to start reclaiming old cache data to
make room for new blocks.)
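
To make the point about IO size concrete, here is a hedged sketch of an
O_DIRECT writer that issues one large, aligned write per call rather than
4k at a time. The name write_in_chunks, the 4096-byte alignment (DIO_ALIGN)
and the fallback to buffered IO when O_DIRECT is rejected are assumptions
made for illustration; the code measures nothing and only shows the shape
of the calls.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes O_DIRECT in <fcntl.h> on Linux */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0             /* flag not exposed: sketch degrades to buffered IO */
#endif

#define DIO_ALIGN 4096         /* assumed alignment; the real value depends on device and fs */

/*
 * Write `total` bytes of filler to `path`, one `chunk`-sized write at a time.
 * Tries O_DIRECT first and falls back to buffered IO if it is rejected.
 * Returns bytes written, or -1 on error.  On success *direct is 1 if
 * O_DIRECT was still in effect at the end, 0 otherwise.
 */
static long write_in_chunks(const char *path, size_t total, size_t chunk,
                            int *direct)
{
    assert(chunk > 0 && chunk % DIO_ALIGN == 0 && total % chunk == 0);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0600);
    if (fd < 0 && errno == EINVAL)          /* fs refuses O_DIRECT at open time */
        fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    int fl = fcntl(fd, F_GETFL);
    *direct = O_DIRECT != 0 && fl >= 0 && (fl & O_DIRECT) != 0;

    void *buf = NULL;
    if (posix_memalign(&buf, DIO_ALIGN, chunk) != 0) {
        close(fd);
        return -1;
    }
    memset(buf, 'x', chunk);

    long done = 0;
    while ((size_t)done < total) {
        /* With O_DIRECT each write() waits for the disk before returning. */
        ssize_t n = write(fd, buf, chunk);
        if (n < 0 && errno == EINVAL && *direct) {
            /* Rejected at write time: clear O_DIRECT and retry buffered. */
            if (fcntl(fd, F_SETFL, fl & ~O_DIRECT) < 0) {
                done = -1;
                break;
            }
            *direct = 0;
            continue;
        }
        if (n != (ssize_t)chunk) {          /* the sketch treats short writes as errors */
            done = -1;
            break;
        }
        done += n;
    }
    free(buf);
    if (close(fd) != 0)
        done = -1;
    return done;
}
```

Calling write_in_chunks(path, 1 << 20, 4096, &d) and then
write_in_chunks(path, 1 << 20, 256 * 1024, &d) under a timer is one way to
compare many small synchronous IOs against a few large ones on a given disk.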

O_DIRECT is great for cases like multimedia (where you want to
maximise CPU available to the application and where you know in
advance that the data is unlikely to fit in cache) and databases
(where the application is caching things already and extra copies in
memory are just a waste of memory).  It is not an automatic win for
all applications.

Cheers,
 Stephen


* Re: O_DIRECT! or O_DIRECT?
  2001-07-04 17:52 ` Stephen C. Tweedie
@ 2001-07-04 18:27   ` Miquel van Smoorenburg
  2001-07-04 18:34     ` Stephen C. Tweedie
  0 siblings, 1 reply; 8+ messages in thread
From: Miquel van Smoorenburg @ 2001-07-04 18:27 UTC
  To: linux-kernel

In article <20010704185230.F28793@redhat.com>,
Stephen C. Tweedie <sct@redhat.com> wrote:
>For these reasons, buffered IO is often faster than O_DIRECT for pure
>sequential access.  The downside is its greater CPU cost and the fact
>that it pollutes the cache (which, in turn, causes even _more_ CPU
>overhead when the VM is forced to start reclaiming old cache data to
>make room for new blocks.)

Any chance of something like O_SEQUENTIAL (like madvise(MADV_SEQUENTIAL))?

Mike.



* Re: O_DIRECT! or O_DIRECT?
  2001-07-04 18:27   ` Miquel van Smoorenburg
@ 2001-07-04 18:34     ` Stephen C. Tweedie
  2001-07-04 20:23       ` Miquel van Smoorenburg
  0 siblings, 1 reply; 8+ messages in thread
From: Stephen C. Tweedie @ 2001-07-04 18:34 UTC
  To: Miquel van Smoorenburg; +Cc: linux-kernel, Stephen Tweedie

Hi,

On Wed, Jul 04, 2001 at 06:27:13PM +0000, Miquel van Smoorenburg wrote:
> In article <20010704185230.F28793@redhat.com>,
> Stephen C. Tweedie <sct@redhat.com> wrote:
> >For these reasons, buffered IO is often faster than O_DIRECT for pure
> >sequential access.  The downside is its greater CPU cost and the fact
> >that it pollutes the cache (which, in turn, causes even _more_ CPU
> >overhead when the VM is forced to start reclaiming old cache data to
> >make room for new blocks.)
> 
> Any chance of something like O_SEQUENTIAL (like madvise(MADV_SEQUENTIAL))?

What for?  The kernel already optimises readahead and writebehind for
sequential files.

If you want to provide specific extra hints to the kernel, then things
like O_UNCACHE might be more appropriate to instruct the kernel to
explicitly remove the cached page after IO completes (to avoid the VM
overhead of maintaining useless cache).  That would provide a definite
improvement over normal IO for large multimedia-style files or for
huge copies.  But what part of the normal handling of sequential files
would O_SEQUENTIAL change?  Good handling of sequential files should
be the default, not an explicitly-requested feature.
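
O_UNCACHE is only a proposed name here, so it cannot be shown directly. As a
rough approximation of "remove the cached page after IO completes", the
sketch below uses posix_fadvise() with POSIX_FADV_DONTNEED, an interface not
mentioned in this thread; treating it as a stand-in for O_UNCACHE is an
assumption, and the helper name write_then_drop_cache is hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* keeps posix_fadvise() visible under strict -std= modes */
#endif
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Write `len` bytes to `fd` through the normal buffered path, then hint
 * that the kernel may drop the cached pages of the whole file.
 * Returns 0 on success, -1 on failure.
 */
static int write_then_drop_cache(int fd, const void *data, size_t len)
{
    assert(fd >= 0 && (data != NULL || len == 0));

    const char *p = data;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    /* Dirty pages cannot be dropped, so push them to disk first. */
    if (fdatasync(fd) != 0)
        return -1;
    /* Offset 0 with length 0 covers the whole file; the call returns an
     * error number directly instead of setting errno. */
    return posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED) == 0 ? 0 : -1;
}
```

Unlike O_DIRECT, the data still passes through the page cache on its way to
disk, so readahead and writebehind keep working; the hint only asks for the
pages to be released afterwards.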

Cheers, 
 Stephen


* Re: O_DIRECT! or O_DIRECT?
  2001-07-04 18:34     ` Stephen C. Tweedie
@ 2001-07-04 20:23       ` Miquel van Smoorenburg
  2001-07-05 15:06         ` Stephen C. Tweedie
  0 siblings, 1 reply; 8+ messages in thread
From: Miquel van Smoorenburg @ 2001-07-04 20:23 UTC
  To: linux-kernel

In article <20010704193402.A6403@redhat.com>,
Stephen C. Tweedie <sct@redhat.com> wrote:
>Hi,
>
>On Wed, Jul 04, 2001 at 06:27:13PM +0000, Miquel van Smoorenburg wrote:
>> 
>> Any chance of something like O_SEQUENTIAL (like madvise(MADV_SEQUENTIAL))?
>
>What for?  The kernel already optimises readahead and writebehind for
>sequential files.

Yes, but I really do mean like in madvise().

>If you want to provide specific extra hints to the kernel, then things
>like O_UNCACHE might be more appropriate to instruct the kernel to
>explicitly remove the cached page after IO completes (to avoid the VM
>overhead of maintaining useless cache).  That would provide a definite
>improvement over normal IO for large multimedia-style files or for
>huge copies.  But what part of the normal handling of sequential files
>would O_SEQUENTIAL change?  Good handling of sequential files should
>be the default, not an explicitly-requested feature.

exactly what I meant, since that is what MADV_SEQUENTIAL seems to do:

linux/mm/filemap.c:

 *  MADV_SEQUENTIAL - pages in the given range will probably be accessed
 *              once, so they can be aggressively read ahead, and
 *              can be freed soon after they are accessed.

/*
 * Read-ahead and flush behind for MADV_SEQUENTIAL areas.  Since we are
 * sure this is sequential access, we don't need a flexible read-ahead
 * window size -- we can always use a large fixed size window.
 */
static void nopage_sequential_readahead(struct vm_area_struct * vma,

O_SEQUENTIAL perhaps is the wrong name.

I'd like to see this so I can run tar to back up a machine during the
day (if tar used this flag, of course) without performance going
down the drain because of cache pollution.
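
For reference, a minimal sketch of what a backup-style reader can do with
the existing mmap interface: map the file, mark the mapping
MADV_SEQUENTIAL, and touch every byte once. The function name
sum_file_sequential is hypothetical, and the byte sum only stands in for
whatever the real program would do with the data.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* keeps madvise() visible under strict -std= modes */
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Sum every byte of `path` through a read-only mapping that is advised as
 * sequential.  Returns 0 on success, -1 on failure.
 */
static int sum_file_sequential(const char *path, unsigned long *sum)
{
    assert(sum != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    *sum = 0;
    if (st.st_size == 0) {                  /* mmap rejects a zero length */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    unsigned char *map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                              /* the mapping stays valid after close */
    if (map == MAP_FAILED)
        return -1;

    /* Purely advisory: a failure here does not affect correctness. */
    (void)madvise(map, len, MADV_SEQUENTIAL);

    for (size_t i = 0; i < len; i++)
        *sum += map[i];

    return munmap(map, len) == 0 ? 0 : -1;
}
```

This only covers programs that read through mmap(); a tar that uses plain
read() gets no such hint, which is the gap an open() flag would fill.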

Mike.



* Re: O_DIRECT! or O_DIRECT?
  2001-07-04 20:23       ` Miquel van Smoorenburg
@ 2001-07-05 15:06         ` Stephen C. Tweedie
  0 siblings, 0 replies; 8+ messages in thread
From: Stephen C. Tweedie @ 2001-07-05 15:06 UTC
  To: Miquel van Smoorenburg; +Cc: linux-kernel, Stephen Tweedie

Hi,

On Wed, Jul 04, 2001 at 08:23:10PM +0000, Miquel van Smoorenburg wrote:

> >huge copies.  But what part of the normal handling of sequential files
> >would O_SEQUENTIAL change?  Good handling of sequential files should
> >be the default, not an explicitly-requested feature.
> 
> exactly what I meant, since that is what MADV_SEQUENTIAL seems to do:
> 
> linux/mm/filemap.c:
> 
>  *  MADV_SEQUENTIAL - pages in the given range will probably be accessed
>  *              once, so they can be aggressively read ahead, and
>  *              can be freed soon after they are accessed.

We already have "drop-behind" for sequential reads --- we lower the
priority of recently read-in pages so that if they don't get accessed
again, they can be reclaimed.  This should be, and is, part of the
default kernel behaviour for such things.

The trouble is that you still need the VM to go around and clean up
those pages if you need the memory for something else.  There's a big
difference between "can be freed" and "are forcibly freed".  O_DIRECT
behaves like the latter: the memory is automatically reclaimed after
use so it results in no memory pressure at all, whereas the
MADV_SEQUENTIAL type of behaviour just allows the VM to reclaim those
pages on demand --- the VM still has to do the work.

Cheers,
 Stephen
