From: Bill Davidsen <davidsen@tmr.com>
To: Linus Torvalds <torvalds@osdl.org>
Cc: Chris Mason <chris.mason@oracle.com>,
dean gaudet <dean@arctic.org>, Viktor <vvp01@inbox.ru>,
Aubrey <aubreylee@gmail.com>, Hua Zhong <hzhong@gmail.com>,
Hugh Dickins <hugh@veritas.com>,
linux-kernel@vger.kernel.org, hch@infradead.org,
kenneth.w.chen@intel.com, akpm@osdl.org
Subject: Re: O_DIRECT question
Date: Sat, 13 Jan 2007 15:07:06 -0500 [thread overview]
Message-ID: <45A93BEA.6040601@tmr.com> (raw)
In-Reply-To: <Pine.LNX.4.64.0701121611370.3470@woody.osdl.org>
Linus Torvalds wrote:
>
> On Sat, 13 Jan 2007, Michael Tokarev wrote:
>> (No, really - this load isn't entirely synthetic. It's a typical database
>> workload - random I/O all over, on a large file. If it can, it combines
>> several I/Os into one, by requesting more than a single block at a time,
>> but overall it is random.)
>
> My point is that you can get basically ALL THE SAME GOOD BEHAVIOUR without
> having all the BAD behaviour that O_DIRECT adds.
>
> For example, just the requirement that O_DIRECT can never create a file
> mapping, and can never interact with ftruncate would actually make
> O_DIRECT a lot more palatable to me. Together with just the requirement
> that an O_DIRECT open would literally disallow any non-O_DIRECT accesses,
> and flush the page cache entirely, would make all the aliases go away.
>
> At that point, O_DIRECT would be a way of saying "we're going to do
> uncached accesses to this pre-allocated file". Which is a half-way
> sensible thing to do.
But it's not necessary, it would break existing programs, would be
incompatible with other o/s like AIX, BSD, Solaris. And it doesn't
provide the legitimate use for O_DIRECT in avoiding cache pollution when
writing a LARGE file.
>
> But what O_DIRECT does right now is _not_ really sensible, and the
> O_DIRECT propeller-heads seem to have some problem even admitting that
> there _is_ a problem, because they don't care.
You say that as if it were a failing. Currently if you mix access via
O_DIRECT and non-DIRECT you can get unexpected results. You can screw
yourself, mangle your data, or have no problems at all if you avoid
trying to access the same bytes in multiple ways. There are lots of ways
to get or write stale data, not all involve O_DIRECT in any way, and the
people actually using O_DIRECT now are managing very well.
I don't regard it as a system failing that I am allowed to shoot myself
in the foot, it's one of the benefits of Linux over Windows. Using
O_DIRECT now is like being your own lawyer, room for both creativity and
serious error. But what's there appears portable, which is important as
well.
I do have one thought, WRT reading uninitialized disk data. I would hope
that sparse files are handled right, and that when doing a write with
O_DIRECT the metadata is not updated until the write is done.
>
> A lot of DB people seem to simply not care about security or anything
> else.anything else. I'm trying to tell you that quoting numbers is
> pointless, when simply the CORRECTNESS of O_DIRECT is very much in doubt.
The guiding POSIX standard appears dead, and major DB programs which
work on Linux run on AIX, Solaris, and BSD. That sounds like a good
level of compatibility. I'm not sure what more correctness you would
want beyond a proposed standard and common practice. It's tricky to use,
like many other neat features.
I xonfess I have abused O_DIRECT by opening a file with O_DIRECT,
fdopen()ing it for C, supplying my own large aligned buffer, and using
that with an otherwise unmodified large program which uses fprintf().
That worked on all of the major UNIX variants as well.
--
bill davidsen <davidsen@tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979
next prev parent reply other threads:[~2007-01-13 20:06 UTC|newest]
Thread overview: 130+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-01-11 2:57 O_DIRECT question Aubrey
2007-01-11 3:05 ` Linus Torvalds
2007-01-11 3:15 ` Linus Torvalds
2007-01-11 6:09 ` Nick Piggin
2007-01-11 15:50 ` Linus Torvalds
2007-01-11 16:19 ` Aubrey
2007-01-16 3:41 ` Jörn Engel
2007-01-11 16:23 ` bert hubert
2007-01-11 16:52 ` Xavier Bestel
2007-01-11 17:04 ` Linus Torvalds
2007-01-11 18:41 ` Trond Myklebust
2007-01-11 19:00 ` Linus Torvalds
2007-01-11 19:49 ` Trond Myklebust
2007-01-12 17:03 ` Viktor
2007-01-20 16:19 ` Denis Vlasenko
2007-01-22 15:52 ` Phillip Susi
2007-01-11 5:50 ` Aubrey
2007-01-11 6:06 ` Andrew Morton
2007-01-11 6:45 ` Aubrey
2007-01-11 6:57 ` Andrew Morton
2007-01-11 7:05 ` Nick Piggin
2007-01-11 7:54 ` Aubrey
2007-01-11 8:05 ` Roy Huang
2007-01-11 16:45 ` Linus Torvalds
2007-01-17 4:29 ` Aubrey Li
2007-01-12 2:12 ` Aubrey
2007-01-12 2:47 ` Nick Piggin
2007-01-12 3:59 ` Roy Huang
2007-01-11 8:12 ` Nick Piggin
2007-01-11 8:49 ` Roy Huang
2007-01-11 9:09 ` Nick Piggin
2007-01-12 2:48 ` Bill Davidsen
2007-01-12 4:30 ` Nick Piggin
2007-01-12 4:46 ` Linus Torvalds
2007-01-12 4:56 ` Nick Piggin
2007-01-12 4:58 ` Nick Piggin
2007-01-12 5:18 ` Linus Torvalds
2007-01-12 5:22 ` Aubrey
2007-01-12 14:59 ` Bill Davidsen
2007-01-13 4:51 ` Nick Piggin
2007-01-11 6:16 ` Alexander Shishkin
2007-01-11 6:57 ` Aubrey
2007-01-11 12:13 ` Viktor
2007-01-11 15:53 ` Phillip Susi
2007-01-11 16:20 ` Linus Torvalds
2007-01-11 17:13 ` Michael Tokarev
2007-01-11 23:01 ` Phillip Susi
2007-01-11 23:06 ` Hua Zhong
2007-01-12 15:21 ` Phillip Susi
2007-01-20 16:36 ` Denis Vlasenko
2007-01-20 20:55 ` Michael Tokarev
2007-01-20 23:05 ` Denis Vlasenko
2007-01-21 12:09 ` Michael Tokarev
2007-01-21 20:02 ` Denis Vlasenko
2007-01-22 16:17 ` Phillip Susi
2007-01-24 21:15 ` Denis Vlasenko
2007-01-25 15:44 ` Phillip Susi
2007-01-25 17:38 ` Denis Vlasenko
2007-01-25 19:28 ` Phillip Susi
2007-01-25 19:52 ` Denis Vlasenko
2007-01-25 20:03 ` Phillip Susi
2007-01-25 20:45 ` Michael Tokarev
2007-01-25 21:11 ` Denis Vlasenko
2007-01-26 16:02 ` Mark Lord
2007-01-26 16:52 ` Viktor
2007-01-26 16:58 ` Phillip Susi
2007-01-26 17:05 ` Phillip Susi
2007-01-26 23:16 ` Denis Vlasenko
2007-02-06 20:39 ` Pavel Machek
2007-01-26 18:23 ` Bill Davidsen
2007-01-26 23:35 ` Denis Vlasenko
2007-01-28 15:18 ` Bill Davidsen
2007-01-28 17:03 ` Denis Vlasenko
2007-01-29 15:43 ` Phillip Susi
2007-01-29 17:00 ` Andrea Arcangeli
2007-01-30 0:05 ` Denis Vlasenko
[not found] ` <45BE7D99.70200@cfl.rr.com>
[not found] ` <20070130023056.GN8030@opteron.random>
[not found] ` <45BF65E3.6070102@cfl.rr.com>
[not found] ` <20070130164806.GQ8030@opteron.random>
2007-01-30 18:50 ` Phillip Susi
2007-01-30 19:57 ` Andrea Arcangeli
2007-01-30 20:06 ` Andrea Arcangeli
2007-01-30 23:07 ` Phillip Susi
2007-01-31 2:28 ` Andrea Arcangeli
2007-01-31 9:37 ` Michael Tokarev
2007-01-26 15:53 ` Bill Davidsen
2007-01-11 17:42 ` Alan
2007-01-11 18:00 ` Linus Torvalds
2007-01-12 7:57 ` dean gaudet
2007-01-12 15:27 ` Phillip Susi
2007-01-12 18:06 ` Linus Torvalds
2007-01-12 20:23 ` Chris Mason
2007-01-12 20:46 ` Michael Tokarev
2007-01-12 20:52 ` Michael Tokarev
2007-01-12 21:03 ` Michael Tokarev
2007-01-12 21:17 ` Linus Torvalds
2007-01-12 21:54 ` Michael Tokarev
2007-01-12 22:09 ` Linus Torvalds
2007-01-12 22:26 ` Michael Tokarev
2007-01-12 22:35 ` Erik Andersen
2007-01-12 22:47 ` Andrew Morton
2007-01-14 9:11 ` Nate Diller
2007-01-20 16:45 ` Denis Vlasenko
2007-01-22 1:47 ` Andrea Arcangeli
2007-01-13 20:07 ` Bill Davidsen [this message]
2007-01-13 20:27 ` Michael Tokarev
2007-01-14 15:39 ` Bill Davidsen
2007-01-12 21:39 ` Disk Cache, Was: " Zan Lynx
2007-01-12 22:10 ` Michael Tokarev
2007-01-15 12:11 ` Helge Hafting
2007-01-12 16:59 ` Viktor
2007-01-11 12:45 ` Erik Mouw
2007-01-11 4:51 ` Andrew Morton
2007-01-11 5:06 ` Gerrit Huizenga
2007-01-11 16:09 ` Badari Pulavarty
2007-01-11 12:34 ` linux-os (Dick Johnson)
2007-01-11 13:06 ` Martin Mares
2007-01-11 14:15 ` Jens Axboe
2007-01-12 2:13 ` Bill Davidsen
2007-01-17 14:27 Alex Tomas
2007-01-22 15:59 Al Boldi
[not found] <7BYkO-5OV-17@gated-at.bofh.it>
[not found] ` <7BYul-6gz-5@gated-at.bofh.it>
[not found] ` <7C18X-1zo-5@gated-at.bofh.it>
[not found] ` <7C1iw-22q-7@gated-at.bofh.it>
[not found] ` <7C1Vb-2Ny-3@gated-at.bofh.it>
[not found] ` <7C256-2ZR-27@gated-at.bofh.it>
[not found] ` <7C2eE-3rT-15@gated-at.bofh.it>
[not found] ` <7C31d-4qb-11@gated-at.bofh.it>
[not found] ` <7C3kj-55E-9@gated-at.bofh.it>
2007-01-11 13:20 ` Bodo Eggert
[not found] ` <7C74B-2A4-23@gated-at.bofh.it>
[not found] ` <7CaYA-mT-19@gated-at.bofh.it>
[not found] ` <7Cpuz-64X-1@gated-at.bofh.it>
[not found] ` <7Cz0T-4PH-17@gated-at.bofh.it>
[not found] ` <7CBcl-86B-9@gated-at.bofh.it>
[not found] ` <7CBvH-52-9@gated-at.bofh.it>
[not found] ` <7CBFn-hw-1@gated-at.bofh.it>
[not found] ` <7CBP1-KI-3@gated-at.bofh.it>
[not found] ` <7CBYG-WK-3@gated-at.bofh.it>
2007-01-13 16:53 ` Bodo Eggert
2007-01-13 19:30 ` Bill Davidsen
2007-01-14 18:51 ` Bodo Eggert
[not found] ` <7CXmz-88G-29@gated-at.bofh.it>
[not found] ` <7CXFR-8vZ-15@gated-at.bofh.it>
[not found] ` <7DfMP-2ak-19@gated-at.bofh.it>
2007-01-14 19:39 ` Bodo Eggert
[not found] ` <7DyYK-6lE-3@gated-at.bofh.it>
2007-01-16 20:26 ` Bodo Eggert
2007-01-17 5:55 ` Arjan van de Ven
2007-01-17 22:36 ` Bodo Eggert
[not found] ` <7HkaQ-2Nb-9@gated-at.bofh.it>
[not found] ` <7HDZP-Pv-1@gated-at.bofh.it>
[not found] ` <7HIPV-8kp-35@gated-at.bofh.it>
2007-01-27 14:01 ` Bodo Eggert
2007-01-27 14:14 ` Denis Vlasenko
2007-01-28 15:30 ` Bill Davidsen
2007-01-28 17:18 ` Denis Vlasenko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=45A93BEA.6040601@tmr.com \
--to=davidsen@tmr.com \
--cc=akpm@osdl.org \
--cc=aubreylee@gmail.com \
--cc=chris.mason@oracle.com \
--cc=dean@arctic.org \
--cc=hch@infradead.org \
--cc=hugh@veritas.com \
--cc=hzhong@gmail.com \
--cc=kenneth.w.chen@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=torvalds@osdl.org \
--cc=vvp01@inbox.ru \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.