* do_loop_readv_writev() not as described for drivers implementing only write()?
From: Mike McTernan @ 2010-03-02 12:55 UTC (permalink / raw)
To: linux-kernel
Hi,
I'm using writev() with an old FPGA driver which only implements
write(), not aio_write(). I'm expecting the behaviour described in the
man page for writev():
"The data transfers performed by readv() and writev() are atomic: the
data written by writev() is written as a single block that is not
intermingled with output from writes in other processes (but see pipe(7)
for an exception); analogously, readv() is guaranteed to read a
contiguous block of data from the file, regardless of read operations
performed in other threads or processes that have file descriptors
referring to the same open file description (see open(2))."
I appear to be observing intermingling of individual iovec entries that
are being written to the same fd from different threads i.e. each call
to writev() isn't producing a contiguous block to be output. This is at
odds with the man page description.
Looking into the kernel sources (from around 2.6.28 to 2.6.33), the
driver doesn't implement aio_write(), so vfs_writev() gets handled by
do_loop_readv_writev() as a series of discrete calls to the driver's
write().
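A minimal userspace model of that per-segment loop, written only to illustrate the behaviour described above. It is not the kernel source; loop_writev_model() is a hypothetical name, and the real function calls the driver's write() method directly rather than the write() syscall.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Userspace model (hypothetical name) of a per-segment fallback loop:
 * each iovec entry becomes its own write() call. No lock is held across
 * the loop, so another writer can get in between two segments.
 */
static ssize_t loop_writev_model(int fd, const struct iovec *iov, int iovcnt)
{
    ssize_t total = 0;

    assert(iov != NULL || iovcnt == 0);

    for (int i = 0; i < iovcnt; i++) {
        ssize_t nr = write(fd, iov[i].iov_base, iov[i].iov_len);

        if (nr < 0) {
            /* Report the error only if nothing was written yet. */
            if (total == 0)
                total = nr;
            break;
        }
        total += nr;
        /* A short write ends the loop early. */
        if ((size_t)nr != iov[i].iov_len)
            break;
    }
    return total;
}
```

Two threads running this loop on the same fd can each complete one segment before the other issues its next one, which would match the interleaving observed.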
I can't see where any locking is applied to ensure each iovec is handled
serially without 'intermingling', which would awkwardly have to be
outside the driver in this case.
Hunting around I found various good articles on writev() and the aio
stuff e.g.
http://lwn.net/Articles/170954/
http://lwn.net/Articles/24366/
But nowhere can I find whether it is expected behaviour that
writev()/readv() on a driver which only implements write()/read() is
actually non-atomic. Lots of sources state the atomicity of these
calls, though.
Have I overlooked some good docs or some locking hidden in the vfs
handling?
As an aside, I'm working to update the driver to provide aio_write() so
that it can provide its own locking for userspace callers.
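Until the driver does its own locking, threads of a single process can serialise their calls in userspace. A minimal sketch, assuming POSIX threads; dev_write_lock and locked_writev() are hypothetical names, not part of any library. It gives no protection against writers in other processes.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical process-wide lock guarding writes to one device fd. */
static pthread_mutex_t dev_write_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Serialise writev() calls made by threads of this process, so the
 * segments of one call cannot interleave with those of another call
 * that also goes through this wrapper.
 */
static ssize_t locked_writev(int fd, const struct iovec *iov, int iovcnt)
{
    ssize_t ret;
    int rc;

    rc = pthread_mutex_lock(&dev_write_lock);
    assert(rc == 0);
    (void)rc;

    ret = writev(fd, iov, iovcnt);

    rc = pthread_mutex_unlock(&dev_write_lock);
    assert(rc == 0);
    (void)rc;

    return ret;
}
```

Every writer to the fd has to use the wrapper for this to help; a bare writev() or write() elsewhere bypasses the lock.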
Kind Regards,
Mike
* Re: do_loop_readv_writev() not as described for drivers implementing only write()?
From: Daniel Baluta @ 2010-03-03 12:11 UTC (permalink / raw)
To: Mike McTernan; +Cc: linux-kernel
On Tue, Mar 2, 2010 at 2:55 PM, Mike McTernan <mmcternan@airvana.com> wrote:
> Hi,
>
> I'm using writev() with an old FPGA driver which only implements
> write(), not aio_write(). I'm expecting the behaviour described in the
> man page for writev():
>
> "The data transfers performed by readv() and writev() are atomic: the
> data written by writev() is written as a single block that is not
> intermingled with output from writes in other processes (but see pipe(7)
> for an exception); analogously, readv() is guaranteed to read a
> contiguous block of data from the file, regardless of read operations
> performed in other threads or processes that have file descriptors
> referring to the same open file description (see open(2))."
>
> I appear to be observing intermingling of individual iovec entries that
> are being written to the same fd from different threads i.e. each call
> to writev() isn't producing a contiguous block to be output. This is at
> odds with the man page description.
>
> Looking into the kernel sources (from around 2.6.28 to 2.6.33), the
> driver doesn't implement aio_write(), so vfs_writev() gets handled by
> do_loop_readv_writev() as a series of discrete calls to the driver's
> write().
>
> I can't see where any locking is applied to ensure each iovec is handled
> serially without 'intermingling', which would awkwardly have to be
> outside the driver in this case.
I'm also interested in this topic. Can anyone help?
I would say that the libc enforces atomicity, but I have to dig deeper for
a real answer :).
thanks,
Daniel.
* RE: do_loop_readv_writev() not as described for drivers implementing only write()?
From: Mike McTernan @ 2010-03-03 13:51 UTC (permalink / raw)
To: Daniel Baluta; +Cc: linux-kernel
Hi,
> I would say that the libc enforces atomicity, but I have to dig deeper
> for a real answer :).
If libc did do it, it would have to be fancy since readv() and writev()
are documented as being atomic between processes, as well as threads.
Still, all things being possible, I visually checked the glibc I'm
using.
glibc-2.8/sysdeps/unix/sysv/linux/writev.c:
ssize_t
__libc_writev (fd, vector, count)
     int fd;
     const struct iovec *vector;
     int count;
{
  if (SINGLE_THREAD_P)
    return do_writev (fd, vector, count);

  int oldtype = LIBC_CANCEL_ASYNC ();

  ssize_t result = do_writev (fd, vector, count);

  LIBC_CANCEL_RESET (oldtype);

  return result;
}
So we end up in do_writev(), which is defined in the same file:
static ssize_t
do_writev (int fd, const struct iovec *vector, int count)
{
  ssize_t bytes_written;

  bytes_written = INLINE_SYSCALL (writev, 3, fd, CHECK_N (vector, count),
                                  count);

  if (bytes_written >= 0 || errno != EINVAL || count <= UIO_FASTIOV)
    return bytes_written;

  return __atomic_writev_replacement (fd, vector, count);
}
So assuming the syscall to writev() returns >= 0 (which is true for my
driver), I can't see locking being provided here unless
LIBC_CANCEL_ASYNC is doing something spectacular which I've overlooked.
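To make the condition in do_writev() easier to read, here is the same test restated as a standalone predicate. This is only a sketch: takes_replacement_path() and SKETCH_UIO_FASTIOV are hypothetical names, and the value 8 is an assumption based on the Linux headers rather than something shown in the code quoted above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>

/* Assumed value of UIO_FASTIOV; check the headers of the system in use. */
#define SKETCH_UIO_FASTIOV 8

/*
 * True only when the syscall failed, the error was EINVAL, and the
 * vector had more than SKETCH_UIO_FASTIOV entries. In every other
 * case the syscall's own result is returned unchanged.
 */
static bool takes_replacement_path(ssize_t ret, int err, int count)
{
    assert(count >= 0);
    return !(ret >= 0 || err != EINVAL || count <= SKETCH_UIO_FASTIOV);
}
```

A successful syscall, such as the one into this driver, never reaches the replacement function, whatever the vector length.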
The other observation from this is that the kernel could detect a case
where writev() is not supported by a driver (i.e. write() is implemented
but aio_write() is not) and return -ENOSYS instead of calling
do_loop_readv_writev(). This would allow the libc to use its
replacement function - __atomic_writev_replacement() in this case.
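For reference, a gather-then-write fallback of this kind can be sketched in userspace as below. It assumes the replacement works by copying every segment into one buffer and issuing a single write(); gather_write() is a hypothetical name and this is not glibc's code. Note that, as quoted above, do_writev() takes the replacement path on EINVAL, so a kernel returning -ENOSYS would also need a matching change in libc.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Copy all segments into one contiguous buffer and hand it to the
 * driver in a single write() call (hypothetical helper).
 */
static ssize_t gather_write(int fd, const struct iovec *iov, int iovcnt)
{
    size_t total = 0;
    for (int i = 0; i < iovcnt; i++)
        total += iov[i].iov_len;

    if (total == 0)
        return 0;

    char *buf = malloc(total);
    if (buf == NULL)
        return -1;              /* errno is left as set by malloc() */

    size_t off = 0;
    for (int i = 0; i < iovcnt; i++) {
        if (iov[i].iov_len == 0)
            continue;           /* skip empty segments */
        memcpy(buf + off, iov[i].iov_base, iov[i].iov_len);
        off += iov[i].iov_len;
    }
    assert(off == total);

    ssize_t ret = write(fd, buf, total);

    /* Keep write()'s errno intact across free(). */
    int saved = errno;
    free(buf);
    errno = saved;
    return ret;
}
```

Whether a single write() reaches the device as one contiguous block still depends on the driver's write() implementation, and a short write is still possible.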
How writev() correctly works is still a mystery to me at this point.
Regards,
Mike
* Re: do_loop_readv_writev() not as described for drivers implementing only write()?
From: Ulrich Drepper @ 2010-03-04 1:46 UTC (permalink / raw)
To: Daniel Baluta; +Cc: Mike McTernan, linux-kernel
On Wed, Mar 3, 2010 at 04:11, Daniel Baluta <daniel.baluta@gmail.com> wrote:
> I would say that the libc enforces atomicity,
Not at all. Any such implementation would unconditionally have to use
file locking and that's only a convention, not a requirement. The
libc only tries to work around a missing writev syscall.