* do_loop_readv_writev() not as described for drivers implementing only write()?
@ 2010-03-02 12:55 Mike McTernan
  2010-03-03 12:11 ` Daniel Baluta
  0 siblings, 1 reply; 4+ messages in thread
From: Mike McTernan @ 2010-03-02 12:55 UTC
  To: linux-kernel

Hi,

I'm using writev() with an old FPGA driver which only implements
write(), not aio_write().  I'm expecting the behaviour described in the
man page for writev():

"The  data  transfers  performed by readv() and writev() are atomic: the
data written by writev() is written as a single block that is not
intermingled with output from writes in other processes (but see pipe(7)
for an exception); analogously, readv() is guaranteed to read a
contiguous block of data from the file, regardless of read operations
performed in other threads or processes that have file descriptors
referring to the same open file description (see open(2))."

I appear to be observing intermingling of individual iovec entries that
are being written to the same fd from different threads i.e. each call
to writev() isn't producing a contiguous block to be output.  This is at
odds with the man page description.

Looking into the kernel sources (from around 2.6.28 to 2.6.33), the
driver doesn't implement aio_write(), so vfs_writev() gets handled by
do_loop_readv_writev() as a series of discrete calls to the driver's
write().
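As an illustration of that fallback, here is a minimal userspace model,
assuming the loop simply issues one write() per iovec entry and stops on an
error or a short write.  It is a sketch, not the kernel source, and
loop_writev() is a hypothetical name:

```c
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Userspace model (hypothetical name) of a per-segment fallback: every
 * iovec entry becomes its own write() call.  No lock is held across the
 * loop, so another writer can get in between two segments.
 */
static ssize_t loop_writev(int fd, const struct iovec *iov, int count)
{
    ssize_t total = 0;

    for (int i = 0; i < count; i++) {
        if (iov[i].iov_len == 0)
            continue;

        ssize_t n = write(fd, iov[i].iov_base, iov[i].iov_len);
        if (n < 0)
            return total ? total : -1;  /* report any progress made so far */

        total += n;
        if ((size_t)n != iov[i].iov_len)
            break;                      /* short write: stop here */
    }
    return total;
}
```

With two threads calling this on the same fd, the segments of one call can
be separated by segments of the other, which matches the symptom above.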

I can't see where any locking is applied to ensure each iovec is handled
serially without 'intermingling', which would awkwardly have to be
outside the driver in this case.
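Until the driver does its own locking, one userspace stop-gap is to
serialise the calls with a mutex.  The sketch below assumes every writer is
a thread in the same process and goes through the wrapper; locked_writev()
and writev_lock are hypothetical names, and this does nothing for writers in
other processes:

```c
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/uio.h>

/* One process-wide lock shared by every caller of locked_writev(). */
static pthread_mutex_t writev_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Hold the mutex for the whole writev() call, so that the per-segment
 * write() calls of two threads cannot be interleaved.
 */
static ssize_t locked_writev(int fd, const struct iovec *iov, int count)
{
    pthread_mutex_lock(&writev_lock);
    ssize_t n = writev(fd, iov, count);
    int saved_errno = errno;            /* keep writev()'s errno intact */
    pthread_mutex_unlock(&writev_lock);
    errno = saved_errno;
    return n;
}
```

A single lock is the simplest choice; a lock per device would allow
parallel writes to different devices at the cost of more bookkeeping.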

Hunting around I found various good articles on writev() and the aio
stuff e.g.

  http://lwn.net/Articles/170954/
  http://lwn.net/Articles/24366/

But nowhere can I find whether it is expected behaviour that
writev()/readv() for a driver which only implements write()/read() is
actually non-atomic.  Lots of sources state that these calls are atomic,
though.

Have I overlooked some good docs or some locking hidden in the vfs
handling?

As an aside, I'm working to update the driver to provide aio_write() so
it can provide its own locking, such that userspace sees the atomic
behaviour the man page describes.

Kind Regards,

Mike



* Re: do_loop_readv_writev() not as described for drivers implementing  only write()?
  2010-03-02 12:55 do_loop_readv_writev() not as described for drivers implementing only write()? Mike McTernan
@ 2010-03-03 12:11 ` Daniel Baluta
  2010-03-03 13:51   ` Mike McTernan
  2010-03-04  1:46   ` Ulrich Drepper
  0 siblings, 2 replies; 4+ messages in thread
From: Daniel Baluta @ 2010-03-03 12:11 UTC
  To: Mike McTernan; +Cc: linux-kernel

On Tue, Mar 2, 2010 at 2:55 PM, Mike McTernan <mmcternan@airvana.com> wrote:
> Hi,
>
> I'm using writev() with an old FPGA driver which only implements
> write(), not aio_write().  I'm expecting the behaviour described in the
> man page for writev():
>
> "The  data  transfers  performed by readv() and writev() are atomic: the
> data written by writev() is written as a single block that is not
> intermingled with output from writes in other processes (but see pipe(7)
> for an exception); analogously, readv() is guaranteed to read a
> contiguous block of data from the file, regardless of read operations
> performed in other threads or processes that have file descriptors
> referring to the same open file description (see open(2))."
>
> I appear to be observing intermingling of individual iovec entries that
> are being written to the same fd from different threads i.e. each call
> to writev() isn't producing a contiguous block to be output.  This is at
> odds with the man page description.
>
> Looking into the kernel sources (from around 2.6.28 to 2.6.33), the
> driver doesn't implement aio_write(), so vfs_writev() gets handled by
> do_loop_readv_writev() as a series of discrete calls to the driver's
> write().
>
> I can't see where any locking is applied to ensure each iovec is handled
> serially without 'intermingling', which would awkwardly have to be
> outside the driver in this case.

I'm also interested in this topic. Can anyone help?
I would say that the libc enforces atomicity, but I have to dig deeper for
a real answer :).

thanks,
Daniel.


* RE: do_loop_readv_writev() not as described for drivers implementing only write()?
  2010-03-03 12:11 ` Daniel Baluta
@ 2010-03-03 13:51   ` Mike McTernan
  2010-03-04  1:46   ` Ulrich Drepper
  1 sibling, 0 replies; 4+ messages in thread
From: Mike McTernan @ 2010-03-03 13:51 UTC
  To: Daniel Baluta; +Cc: linux-kernel

Hi,

> I would say that the libc enforces atomicity, but I have to dig deeper for
> a real answer :).

If libc did do it, it would have to be fancy since readv() and writev()
are documented as being atomic between processes, as well as threads.

Still, all things being possible, I visually checked the glibc I'm
using. 

glibc-2.8/sysdeps/unix/sysv/linux/writev.c:

ssize_t
__libc_writev (fd, vector, count)
     int fd;
     const struct iovec *vector;
     int count;
{
  if (SINGLE_THREAD_P)
    return do_writev (fd, vector, count);

  int oldtype = LIBC_CANCEL_ASYNC ();

  ssize_t result = do_writev (fd, vector, count);

  LIBC_CANCEL_RESET (oldtype);

  return result;
}

So we end up in do_writev(), which is defined in the same file:

static ssize_t
do_writev (int fd, const struct iovec *vector, int count)
{
  ssize_t bytes_written;

  bytes_written = INLINE_SYSCALL (writev, 3, fd, CHECK_N (vector, count),
                                  count);

  if (bytes_written >= 0 || errno != EINVAL || count <= UIO_FASTIOV)
    return bytes_written;

  return __atomic_writev_replacement (fd, vector, count);
}

So assuming the syscall to writev() returns >= 0 (which is true for my
driver), I can't see locking being provided here unless
LIBC_CANCEL_ASYNC is doing something spectacular which I've overlooked.

The other observation from this is that the kernel could detect a case
where writev() is not supported by a driver (i.e. write() is implemented
but aio_write() is not) and return -ENOSYS instead of calling
do_loop_readv_writev().  This would allow the libc to use its
replacement function - __atomic_writev_replacement() in this case.

How writev() correctly works is still a mystery to me at this point.

Regards,

Mike


* Re: do_loop_readv_writev() not as described for drivers implementing  only write()?
  2010-03-03 12:11 ` Daniel Baluta
  2010-03-03 13:51   ` Mike McTernan
@ 2010-03-04  1:46   ` Ulrich Drepper
  1 sibling, 0 replies; 4+ messages in thread
From: Ulrich Drepper @ 2010-03-04  1:46 UTC
  To: Daniel Baluta; +Cc: Mike McTernan, linux-kernel

On Wed, Mar 3, 2010 at 04:11, Daniel Baluta <daniel.baluta@gmail.com> wrote:
> I would say that the libc enforces atomicity,

Not at all.  Any such implementation would unconditionally have to use
file locking and that's only a convention, not a requirement.  The
libc only tries to work around a missing writev syscall.
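To illustrate why file locking is only a convention, here is a minimal
sketch of a writer that takes an advisory flock() lock around writev().
It assumes every cooperating process opens the file itself and uses the
same wrapper; flocked_writev() is a hypothetical name.  A writer that skips
the lock is not stopped, and flock() locks belong to the open file
description, so threads or forked children sharing one fd do not exclude
each other this way:

```c
#include <errno.h>
#include <sys/file.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Take an exclusive advisory lock for the duration of the writev() call.
 * Only writers that also call flock() on their own open of the file are
 * held off; nothing forces other writers to cooperate.
 */
static ssize_t flocked_writev(int fd, const struct iovec *iov, int count)
{
    if (flock(fd, LOCK_EX) < 0)
        return -1;                      /* errno set by flock() */

    ssize_t n = writev(fd, iov, count);
    int saved_errno = errno;            /* keep writev()'s errno intact */
    flock(fd, LOCK_UN);
    errno = saved_errno;
    return n;
}
```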

