From: Benjamin LaHaise <bcrl@kvack.org>
To: "Michael K. Edwards" <medwards.linux@gmail.com>
Cc: Eric Dumazet <dada1@cosmosbay.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: sys_write() racy for multi-threaded append?
Date: Fri, 9 Mar 2007 09:59:20 -0500
Message-ID: <20070309145920.GJ6209@kvack.org>
In-Reply-To: <f2b55d220703090419w755d42d0mea4f220e3caaa59a@mail.gmail.com>

On Fri, Mar 09, 2007 at 04:19:55AM -0800, Michael K. Edwards wrote:
> On 3/8/07, Benjamin LaHaise <bcrl@kvack.org> wrote:
> >Any number of things can cause a short write to occur, and rewinding the
> >file position after the fact is just as bad.  A sane app has to either
> >serialise the writes itself or use a thread safe API like pwrite().
> 
> Not on a pipe/FIFO.  Short writes there are flat out verboten by
> 1003.1 unless O_NONBLOCK is set.  (Not that f_pos is interesting on a
> pipe except as a "bytes sent" indicator  -- and in the multi-threaded
> scenario, if you do the speculative update that I'm suggesting, you
> can't 100% trust it unless you ensure that you are not in
> mid-read/write in some other thread at the moment you sample f_pos.
> But that doesn't make it useless.)

Writes to a pipe/FIFO are atomic so long as they are no larger than PIPE_BUF,
while f_pos on a pipe is undefined -- what exactly is the issue here?
The semantics you're assuming are not defined by POSIX.  Heck, even the man
page for one of the *BSDs states "Some devices are incapable of seeking.
The value of the pointer associated with such a device is undefined."
What part of undefined is problematic?

> As to what a "sane app" has to do: it's just not that unusual to write
> application code that treats a short read/write as a catastrophic
> error, especially when the fd is of a type that is known never to
> produce a short read/write unless something is drastically wrong.  For
> instance, I bomb on short write in audio applications where the driver
> is known to block until enough bytes have been read/written, period.
> When switching from reading a stream of audio frames from thread A to
> reading them from thread B, I may be willing to omit app
> serialization, because I can tolerate an imperfect hand-off in which
> thread A steals one last frame after thread B has started reading --
> as long as the fd doesn't get screwed up.  There is no reason for the
> generic sys_read code to leave a race open in which the same frame is
> read by both threads and a hardware buffer overrun results later.

I hope I don't have to run any of your software.  Short writes can and do
happen for a variety of reasons: signals, memory allocation failures,
quota being exceeded....  These are all error conditions the kernel has to
provide well-defined semantics for, as well-behaved applications will try
to handle them gracefully.
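
As an illustration of handling short writes gracefully, the usual pattern is
a retry loop.  This is a minimal sketch; write_all() is a hypothetical name,
and the loop is not atomic across threads, so callers sharing an fd would
still need their own serialisation or an explicit-offset call like pwrite().

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: keep calling write() until all len bytes are
 * written or a real error occurs.  Returns 0 on success, -1 on error
 * (errno is left as set by write()).
 */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data: retry */
            return -1;          /* e.g. ENOSPC, EDQUOT, EIO, EBADF */
        }

        /* Short write: advance past what was accepted and go again. */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```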

> In short, I'm not proposing that the kernel perfectly serialize
> concurrent reads and writes to arbitrary fd types.  I'm proposing that
> it not do something blatantly stupid and easily avoided in generic
> code that makes it impossible for any fd type to guarantee that, after
> 10 successful pipelined 100-byte reads or writes, f_pos will have
> advanced by 1000.

The semantics you're looking for are defined for regular files with 
O_APPEND.  Anything else is asking for synchronization that other 
applications do not require and do not desire.
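
For the regular-file case, a minimal sketch of the O_APPEND approach follows.
The helper name open_for_append() and the 0644 mode are assumptions made for
illustration.  With O_APPEND set, each write() is positioned at the current
end of file as part of the write itself, so independent writers do not
overwrite each other's data.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: open (creating if needed) a regular file so that
 * every write() lands at the current end of file, whatever f_pos says.
 * Returns a file descriptor, or -1 with errno set.
 */
int open_for_append(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
}
```

Two descriptors opened this way on the same file each have their own file
position, yet interleaved writes through them still accumulate rather than
clobber one another, because the position is ignored in favour of the end
of file.  A short write remains possible, so the return value of write()
still needs to be checked.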

		-ben
-- 
"Time is of no importance, Mr. President, only life is important."
Don't Email: <zyntrop@kvack.org>.

