linux-kernel.vger.kernel.org archive mirror
* Spurious -EIO when reading a file being written with O_DIRECT?
From: Oleg Drokin @ 2003-08-06 11:08 UTC
  To: linux-kernel

Hello!

   A problem was reported to us: if a file is being written in direct I/O mode
   and read at the same time (in "normal/buffered" mode), the reading
   process gets -EIO when it is near the end of the file.

   Initially I thought this was a reiserfs-only problem and dug in that
   direction, but it turned out that reiserfs does everything correctly
   and the VFS itself seems to be racy. My current suspicion is this sequence:

     1. the direct I/O process calls get_block(), which extends the file
        <schedule>
     2. the reading process gets the buffer and submits I/O, then waits for
        the page to become uptodate
        <schedule>
     3. the direct I/O process unmaps the buffer's metadata

   As a result, the page never becomes uptodate and we get -EIO from
   do_generic_file_read.

   If I take i_sem around the call to do_generic_file_read in generic_file_read
   (in 2.4.21-pre10), that of course helps (this is not a correct fix, just a
   demonstration that some VFS race is in place).

   The same problem can be observed on ext2 in both 2.4.21-pre10 and 2.6.0-test2.

   Attached is the test_directio.c program. Compile it and run it with some
   filename as its argument, then immediately start "tail" on the same filename.
   On 2.4 you get an almost immediate I/O error from tail; on 2.6 you get the
   same I/O error only after some more waiting.

   Is this something known and expected (or maybe somebody has a fix already? ;) )?

Bye,
    Oleg
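
The reading side of the reproduction ("tail" on the same file) can be
approximated by a small buffered reader. This is only a sketch under stated
assumptions: the helper names drain_to_eof() and follow_file(), the 64 KiB
chunk size and the 1 ms polling interval are illustrative and do not come
from the original report. If the race described above is hit, it would
presumably show up here as read() failing with errno set to EIO.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/*
 * Read from fd with ordinary buffered read() until the current end of file.
 * Returns the number of bytes consumed, or -1 with errno set if read() fails.
 */
static long long drain_to_eof(int fd)
{
	static char chunk[64 * 1024];	/* assumed chunk size */
	long long total = 0;

	for (;;) {
		ssize_t n = read(fd, chunk, sizeof(chunk));

		if (n > 0)
			total += n;
		else if (n == 0)
			return total;	/* reached the current end of file */
		else if (errno != EINTR)
			return -1;	/* a real failure, e.g. EIO */
	}
}

/*
 * Hypothetical helper: follow `path` for at most `rounds` passes, polling
 * for growth between passes. Returns 0 if nothing failed, otherwise the
 * errno of the failing open() or read().
 */
static int follow_file(const char *path, int rounds)
{
	struct timespec pause = { 0, 1000000 };	/* 1 ms, assumed interval */
	int fd = open(path, O_RDONLY);		/* buffered: no O_DIRECT here */
	int i;

	if (fd == -1)
		return errno;

	for (i = 0; i < rounds; i++) {
		if (drain_to_eof(fd) < 0) {
			int err = errno;	/* save before other calls */

			fprintf(stderr, "read: %s\n", strerror(err));
			close(fd);
			return err;
		}
		nanosleep(&pause, NULL);	/* let the writer extend the file */
	}
	close(fd);
	return 0;
}
```

Calling follow_file() with the filename given to the writer and a large
round count plays the role of "tail"; a nonzero return value is the errno of
the first failure.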

[-- Attachment #2: test_directio.c --]

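#define _GNU_SOURCE		/* exposes O_DIRECT in <fcntl.h> on Linux/glibc */
#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>

#define BLOCK_SIZE 4096

/* Fallback for old headers only; the value is ARCH DEPENDENT, fix it up for your arch. */
#ifndef O_DIRECT
#warning O_DIRECT not provided by <fcntl.h>; using an ARCH DEPENDENT value
#define O_DIRECT 040000
#endif

static char buf[128 * BLOCK_SIZE];

int main(int argc, char **argv)
{
	int fd;
	/* Round the buffer start up to a BLOCK_SIZE boundary for direct I/O. */
	char *aligned_buf = (char *)(((uintptr_t)buf + (BLOCK_SIZE - 1)) &
				     ~(uintptr_t)(BLOCK_SIZE - 1));
	/* Up to one block is lost to the rounding, so write one block less. */
	ssize_t aligned_size = sizeof(buf) - BLOCK_SIZE;

	if (argc < 2) {
		fprintf(stderr, "usage: %s <filename>\n", argv[0]);
		return 1;
	}

	/* O_CREAT requires an explicit mode argument. */
	fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644);
	if (fd == -1) {
		perror("open");
		return 1;
	}

	/* Keep extending the file with direct I/O writes until one fails. */
	for (;;) {
		if (write(fd, aligned_buf, aligned_size) != aligned_size) {
			perror("write");
			return 1;
		}
	}
}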
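
The manual pointer rounding in test_directio.c gives up one block of the
static buffer. A possible alternative, sketched here and not part of the
original test program, is to let posix_memalign() hand back an aligned
buffer directly. The names alloc_dio_buffer(), is_dio_aligned() and
DIO_ALIGN are hypothetical, and 4096 is assumed to be a sufficient alignment
because it matches BLOCK_SIZE in the attachment.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DIO_ALIGN 4096	/* assumed alignment; matches BLOCK_SIZE above */

/*
 * Hypothetical helper: allocate `size` bytes starting on a DIO_ALIGN
 * boundary. Returns NULL on failure; release the buffer with free().
 */
static void *alloc_dio_buffer(size_t size)
{
	void *p = NULL;

	if (posix_memalign(&p, DIO_ALIGN, size) != 0)
		return NULL;
	memset(p, 0, size);	/* do not write uninitialised heap to disk */
	return p;
}

/* Nonzero if p starts on a DIO_ALIGN boundary. */
static int is_dio_aligned(const void *p)
{
	return ((uintptr_t)p & (DIO_ALIGN - 1)) == 0;
}
```

With a buffer from alloc_dio_buffer(128 * DIO_ALIGN), the write loop could
pass the full 128 blocks to write() instead of 127.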


* Re: Spurious -EIO when reading a file being written with O_DIRECT?
From: Andrew Morton @ 2003-08-06 21:42 UTC
  To: Oleg Drokin; +Cc: linux-kernel

Oleg Drokin <green@namesys.com> wrote:
>
>    A problem was reported to us: if a file is being written in direct I/O mode
>    and read at the same time (in "normal/buffered" mode), the reading
>    process gets -EIO when it is near the end of the file.
>
>    Initially I thought this was a reiserfs-only problem and dug in that
>    direction, but it turned out that reiserfs does everything correctly
>    and the VFS itself seems to be racy. My current suspicion is this sequence:
>
>      1. the direct I/O process calls get_block(), which extends the file
>         <schedule>
>      2. the reading process gets the buffer and submits I/O, then waits for
>         the page to become uptodate
>         <schedule>
>      3. the direct I/O process unmaps the buffer's metadata
>
>    As a result, the page never becomes uptodate and we get -EIO from
>    do_generic_file_read.
>
>    If I take i_sem around the call to do_generic_file_read in generic_file_read
>    (in 2.4.21-pre10), that of course helps (this is not a correct fix, just a
>    demonstration that some VFS race is in place).
>
>    The same problem can be observed on ext2 in both 2.4.21-pre10 and 2.6.0-test2.
>
>    Attached is the test_directio.c program. Compile it and run it with some
>    filename as its argument, then immediately start "tail" on the same filename.
>    On 2.4 you get an almost immediate I/O error from tail; on 2.6 you get the
>    same I/O error only after some more waiting.
>
>    Is this something known and expected (or maybe somebody has a fix already? ;) )?

Test a current 2.4 kernel - it has lots of redone O_DIRECT-vs-buffered
locking.

A 2.6 forward-port of that was done by Badari but I lost it and need to
find it again.




* Re: Spurious -EIO when reading a file being written with O_DIRECT?
From: Oleg Drokin @ 2003-08-07  5:38 UTC
  To: Andrew Morton; +Cc: linux-kernel

Hello!

On Wed, Aug 06, 2003 at 02:42:06PM -0700, Andrew Morton wrote:
> Test a current 2.4 kernel - it has lots of redone O_DIRECT-vs-buffered
> locking.

Stupid me.
I meant to say that I tested with 2.4.22-pre10, which is pretty current.
There were no changes to the buffer code between 2.4.22-pre10 and 2.4.22-rc1.

> A 2.6 forward-port of that was done by Badari but I lost it and need to
> find it again.

Since it did not help the 2.4.22 code, I don't think 2.6.0 will benefit from it either.
The test case is really simple, and anyone can reproduce the problem easily.

Bye,
    Oleg
