From: Benjamin LaHaise <bcrl@kvack.org>
To: "Michael K. Edwards" <medwards.linux@gmail.com>
Cc: Eric Dumazet <dada1@cosmosbay.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: sys_write() racy for multi-threaded append?
Date: Fri, 9 Mar 2007 09:59:20 -0500
Message-ID: <20070309145920.GJ6209@kvack.org>
In-Reply-To: <f2b55d220703090419w755d42d0mea4f220e3caaa59a@mail.gmail.com>
On Fri, Mar 09, 2007 at 04:19:55AM -0800, Michael K. Edwards wrote:
> On 3/8/07, Benjamin LaHaise <bcrl@kvack.org> wrote:
> >Any number of things can cause a short write to occur, and rewinding the
> >file position after the fact is just as bad. A sane app has to either
> >serialise the writes itself or use a thread safe API like pwrite().
>
> Not on a pipe/FIFO. Short writes there are flat out verboten by
> 1003.1 unless O_NONBLOCK is set. (Not that f_pos is interesting on a
> pipe except as a "bytes sent" indicator -- and in the multi-threaded
> scenario, if you do the speculative update that I'm suggesting, you
> can't 100% trust it unless you ensure that you are not in
> mid-read/write in some other thread at the moment you sample f_pos.
> But that doesn't make it useless.)
Writes to a pipe/FIFO are atomic, so long as they are no larger than PIPE_BUF,
while f_pos on a pipe is undefined -- what exactly is the issue here?
The semantics you're assuming are not defined by POSIX. Heck, even the
man page for one of the *BSDs states "Some devices are incapable of
seeking. The value of the pointer associated with such a device is
undefined." What part of undefined is problematic?
> As to what a "sane app" has to do: it's just not that unusual to write
> application code that treats a short read/write as a catastrophic
> error, especially when the fd is of a type that is known never to
> produce a short read/write unless something is drastically wrong. For
> instance, I bomb on short write in audio applications where the driver
> is known to block until enough bytes have been read/written, period.
> When switching from reading a stream of audio frames from thread A to
> reading them from thread B, I may be willing to omit app
> serialization, because I can tolerate an imperfect hand-off in which
> thread A steals one last frame after thread B has started reading --
> as long as the fd doesn't get screwed up. There is no reason for the
> generic sys_read code to leave a race open in which the same frame is
> read by both threads and a hardware buffer overrun results later.
I hope I don't have to run any of your software. Short writes can and do
happen for a variety of reasons: signals, memory allocation failures,
quota being exceeded... These are all error conditions the kernel has to
provide well-defined semantics for, as well-behaved applications will try
to handle them gracefully.
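As an illustration of handling short writes gracefully, here is a sketch of
the usual retry loop. The name write_all is hypothetical, and the loop makes
no attempt to serialise against other threads writing to the same fd; it only
ensures that one caller's buffer is either fully written or an error is
reported.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: write all len bytes of buf to fd.
 * Retries after short writes and after EINTR.
 * Returns 0 on success, -1 on failure with errno set by write().
 */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data was written */
            return -1;          /* real error: ENOSPC, EDQUOT, EIO, ... */
        }
        p += n;                 /* short write: advance and try the rest */
        len -= (size_t)n;
    }
    return 0;
}
```

A caller that needs the whole record to land contiguously still has to hold
its own lock around write_all(), since another thread may write between two
iterations of the loop.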
> In short, I'm not proposing that the kernel perfectly serialize
> concurrent reads and writes to arbitrary fd types. I'm proposing that
> it not do something blatantly stupid and easily avoided in generic
> code that makes it impossible for any fd type to guarantee that, after
> 10 successful pipelined 100-byte reads or writes, f_pos will have
> advanced by 1000.
The semantics you're looking for are defined for regular files with
O_APPEND. Anything else is asking for synchronization that other
applications do not require and do not desire.
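A small sketch of what O_APPEND buys on a regular file, under the assumption
of a local POSIX filesystem. Two independent descriptors, each with its own
file position, write five bytes to the same file. The function name
two_writers_size and the scratch file path are invented for the example.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical demo: open path twice, write 5 bytes through each
 * descriptor, and return the resulting file size (-1 on error).
 * Pass O_APPEND in extra_flags to get append semantics, or 0 to
 * see the two writers overwrite each other at offset 0.
 * The scratch file is removed before returning.
 */
off_t two_writers_size(const char *path, int extra_flags)
{
    int flags = O_WRONLY | O_CREAT | extra_flags;
    off_t size = -1;
    struct stat st;

    int a = open(path, flags | O_TRUNC, 0600);
    if (a < 0)
        return -1;

    int b = open(path, flags, 0600);
    if (b < 0) {
        close(a);
        return -1;
    }

    /* Each descriptor starts with its own file position of 0. */
    if (write(a, "aaaaa", 5) == 5 && write(b, "bbbbb", 5) == 5 &&
        fstat(a, &st) == 0)
        size = st.st_size;

    close(a);
    close(b);
    unlink(path);
    return size;
}
```

With O_APPEND each write lands at the current end of file, so the result is
10 bytes; without it the second write starts at offset 0 and the file ends up
5 bytes long.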
-ben
--
"Time is of no importance, Mr. President, only life is important."
Don't Email: <zyntrop@kvack.org>.