linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Michael K. Edwards" <medwards.linux@gmail.com>
To: "Benjamin LaHaise" <bcrl@kvack.org>
Cc: "Eric Dumazet" <dada1@cosmosbay.com>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>
Subject: Re: sys_write() racy for multi-threaded append?
Date: Fri, 9 Mar 2007 04:19:55 -0800	[thread overview]
Message-ID: <f2b55d220703090419w755d42d0mea4f220e3caaa59a@mail.gmail.com> (raw)
In-Reply-To: <20070309013405.GI6209@kvack.org>

On 3/8/07, Benjamin LaHaise <bcrl@kvack.org> wrote:
> Any number of things can cause a short write to occur, and rewinding the
> file position after the fact is just as bad.  A sane app has to either
> serialise the writes itself or use a thread safe API like pwrite().

Not on a pipe/FIFO.  Short writes there are flat out verboten by
1003.1 unless O_NONBLOCK is set.  (Not that f_pos is interesting on a
pipe except as a "bytes sent" indicator  -- and in the multi-threaded
scenario, if you do the speculative update that I'm suggesting, you
can't 100% trust it unless you ensure that you are not in
mid-read/write in some other thread at the moment you sample f_pos.
But that doesn't make it useless.)

As to what a "sane app" has to do: it's just not that unusual to write
application code that treats a short read/write as a catastrophic
error, especially when the fd is of a type that is known never to
produce a short read/write unless something is drastically wrong.  For
instance, I bomb on short write in audio applications where the driver
is known to block until enough bytes have been read/written, period.
When switching from reading a stream of audio frames from thread A to
reading them from thread B, I may be willing to omit app
serialization, because I can tolerate an imperfect hand-off in which
thread A steals one last frame after thread B has started reading --
as long as the fd doesn't get screwed up.  There is no reason for the
generic sys_read code to leave a race open in which the same frame is
read by both threads and a hardware buffer overrun results later.

In short, I'm not proposing that the kernel perfectly serialize
concurrent reads and writes to arbitrary fd types.  I'm proposing that
it not do something blatantly stupid and easily avoided in generic
code that makes it impossible for any fd type to guarantee that, after
10 successful pipelined 100-byte reads or writes, f_pos will have
advanced by 1000.

Cheers,
- Michael

  reply	other threads:[~2007-03-09 12:20 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-03-08 23:08 sys_write() racy for multi-threaded append? Michael K. Edwards
2007-03-08 23:43 ` Eric Dumazet
2007-03-08 23:57   ` Michael K. Edwards
2007-03-09  0:15     ` Eric Dumazet
2007-03-09  0:45       ` Michael K. Edwards
2007-03-09  1:34         ` Benjamin LaHaise
2007-03-09 12:19           ` Michael K. Edwards [this message]
2007-03-09 13:44             ` Eric Dumazet
2007-03-09 14:10             ` Alan Cox
2007-03-09 14:59             ` Benjamin LaHaise
2007-03-10  6:43               ` Michael K. Edwards
2007-03-09  5:53         ` Eric Dumazet
2007-03-09 11:52           ` Michael K. Edwards
2007-03-09  0:43 ` Alan Cox
     [not found] <7WzUo-1zl-21@gated-at.bofh.it>
     [not found] ` <7WAx2-2pg-21@gated-at.bofh.it>
     [not found]   ` <7WAGF-2Bx-9@gated-at.bofh.it>
     [not found]     ` <7WB07-3g5-33@gated-at.bofh.it>
     [not found]       ` <7WBt7-3SZ-23@gated-at.bofh.it>
2007-03-12  7:53         ` Bodo Eggert
2007-03-12 16:26           ` Michael K. Edwards
2007-03-12 18:48             ` Bodo Eggert
2007-03-13  0:46               ` Michael K. Edwards
2007-03-13  2:24                 ` Alan Cox
2007-03-13  7:25                   ` Michael K. Edwards
2007-03-13  7:42                     ` David Miller
2007-03-13 16:24                       ` Michael K. Edwards
2007-03-13 17:59                         ` Michael K. Edwards
2007-03-13 19:09                           ` Christoph Hellwig
2007-03-13 23:40                             ` Michael K. Edwards
2007-03-14  0:09                               ` Michael K. Edwards
2007-03-13 13:15                     ` Alan Cox
2007-03-14 20:09                       ` Michael K. Edwards
2007-03-16 16:43                         ` Frank Ch. Eigler
2007-03-16 17:25                         ` Alan Cox
2007-03-13 14:00                   ` David M. Lloyd

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f2b55d220703090419w755d42d0mea4f220e3caaa59a@mail.gmail.com \
    --to=medwards.linux@gmail.com \
    --cc=bcrl@kvack.org \
    --cc=dada1@cosmosbay.com \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).