linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Suparna Bhattacharya <suparna@in.ibm.com>
To: Daniel McNeil <daniel@osdl.org>
Cc: Janet Morgan <janetmor@us.ibm.com>,
	Badari Pulavarty <pbadari@us.ibm.com>,
	"linux-aio@kvack.org" <linux-aio@kvack.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@osdl.org>
Subject: Re: [PATCH linux-2.6.0-test10-mm1] dio-read-race-fix
Date: Tue, 30 Dec 2003 10:23:34 +0530	[thread overview]
Message-ID: <20031230045334.GA3484@in.ibm.com> (raw)
In-Reply-To: <1071190292.1937.13.camel@ibm-c.pdx.osdl.net>

On Thu, Dec 11, 2003 at 04:51:33PM -0800, Daniel McNeil wrote:
> I've done more testing with added debug code.
> 
> It looks like the filemap_write_and_wait() call is returning
> with data that has not made it disk.
> 
> I added code to filemap_write_and_wait() to check if
> mapping->dirty_pages is not empty after calling filemap_fdatawrite()
> and filemap_fdatawait() and retry.  Even with the retry, the test still
> sees uninitialized data when running tests overnight (I added a printk
> so I know the retry is happening).  There are pages left on the
> dirty_pages list even after the write and wait.   

There are two calls to filemap_write_and_wait() for a DIO read
-- do you know in which if these cases you see dirty pages after
the write and wait ? The first is called without i_sem protection,
so it is possible for pages to be dirtied by a parallel buffered
write which is OK because this is just an extra/superfluous call 
when it comes to DIO reads. The second call, however happens with i_sem 
held and is used to guarantee that there are no exposures, so if 
you are seeing remant dirty pages in this case it would be something 
to worry about.

Regards
Suparna

> 
> I've added a bunch more debug code and am currently running the test
> again to see if I can find out what is going on.
> 
> I'll send more email when I know more.
> 
> Daniel
> 
> 
> 
> On Mon, 2003-12-08 at 10:23, Daniel McNeil wrote:
> > My patch did not fix the problem. :(
> > 
> > The tests over the weekend reported bad data.
> > 
> > The sampling of the i_size after dropping i_sem still looks wrong
> > to me.  I will keep looking and see if I can find another problem.
> > 
> > 
> > Daniel
> > 
> > On Fri, 2003-12-05 at 17:29, Daniel McNeil wrote:
> > > Janet,
> > > 
> > > I think I found the problem that is causing the dio reads to get
> > > unintialized data with buffer writes.  i_size is sampled after
> > > i_sem is dropped and after the i/o have completed.  i_size could
> > > have changed in the mean time.  This patch samples i_size before
> > > dropping i_sem.  I'll leave the test running over the weekend to
> > > verify.
> > > 
> > > Daniel
> > > 
> > > On Tue, 2003-12-02 at 18:33, Janet Morgan wrote:
> > > > Hi Suparna, Daniel,
> > > > 
> > > > Just wanted to let you know that I seem to be hitting a failure when I
> > > > run the testcase (attached) that Stephen Tweedie wrote to expose the DIO
> > > > races (the testcase issues buffered writes and dio reads).  I'm trying to
> > > > debug the failure, but I know it will be slow going for me.
> > > > 
> > > > I hit the problem when I run multiple instances of the test against
> > > > the same filesystem (I do specify a unique file for each instance of
> > > > the test).   I normally run about 6 instances of the test, e.g.,
> > > > "direct_read_under foo1", "direct_read_under foo2", etc.   The test
> > > > runs in an infinite loop until a data mis-compare is detected.
> > > > I have not been able to reproduce the failure when I restrict each
> > > > instance of the test to its own filesystem.
> > > > 
> > > > I've tried running with the combinations below and I eventually
> > > > get a failure in each case.   I assume all but the last combo includes
> > > > the critical aio/dio fixes for the associated base.
> > > > 
> > > > Combinations tested:
> > > > 
> > > >    Daniel's latest:
> > > >        2.6.0-test9-mm5 +
> > > >        aio-dio-fallback-bio_count-race.patch +
> > > >        direct-io-memleak-fix.patch
> > > > 
> > > >    Suparna's latest:
> > > >        2.6.0-test9-mm5 +
> > > >        Suparna's current patches:
> > > >            
> > > > http://marc.theaimsgroup.com/?l=linux-aio&m=106983304420570&w=2  +
> > > >            
> > > > http://marc.theaimsgroup.com/?l=linux-kernel&m=106904658121299&w=2  +
> > > >        direct-io-memleak-fix.patch
> > > >  
> > > >     stock linux-2.6.0-test9 plus:
> > > >       aio-refcnt.patch  +
> > > >       dio-aio-fixes.patch +
> > > >       dio-aio-fixes-fixes.patch +
> > > >       O_DIRECT-race-fixes-rollup.patch 
> > > > 
> > > >     stock linux-2.6.0-test11 plus:
> > > >       O_DIRECT-race-fixes-rollup.patch
> > > >  
> > > > Thanks,
> > > > -Janet
> > > > 
> > > > ______________________________________________________________________
> > > > 
> > > > #define _XOPEN_SOURCE 600
> > > > #define _GNU_SOURCE
> > > > 
> > > > #include <unistd.h>
> > > > #include <stdlib.h>
> > > > #include <stdio.h>
> > > > #include <string.h>
> > > > #include <errno.h>
> > > > #include <sys/fcntl.h>
> > > > #include <sys/mman.h>
> > > > #include <sys/wait.h>
> > > > 
> > > > #define BIGSIZE 128*1024*1024
> > > > #define READSIZE 32*1024*1024
> > > > #define WRITESIZE 32*1024*1024
> > > > 
> > > > int pagesize;
> > > > char *iobuf;
> > > > 
> > > > void assert(const char *what, int assertion)
> > > > {
> > > > 	if (assertion)
> > > > 		return;
> > > > 	perror(what);
> > > > 	exit(1);
> > > > }
> > > > 
> > > > void do_buffered_writes(int fd, int pattern)
> > > > {
> > > > 	int rc;
> > > > 	int offset;
> > > > 	
> > > > 	memset(iobuf, pattern, WRITESIZE);
> > > > 	for (offset = 0; offset+WRITESIZE <= BIGSIZE; offset += WRITESIZE) {
> > > > 		rc = pwrite(fd, iobuf, WRITESIZE, offset);
> > > > 		assert("pwrite", rc >= 0);
> > > > 		if (rc != WRITESIZE) {
> > > > 			fprintf(stderr, "short write (%d out of %d)\n",
> > > > 				rc, WRITESIZE);
> > > > 			exit(1);
> > > > 		}
> > > > 		fsync(fd);
> > > > 	}
> > > > }
> > > > 
> > > > int do_direct_reads(char *filename)
> > > > {
> > > > 	int fd;
> > > > 	int offset;
> > > > 	int rc, i;
> > > > 	int *p;
> > > > 	
> > > > 	fd = open(filename, O_DIRECT|O_RDONLY, 0);
> > > > 	assert("open", fd >= 0);
> > > > 
> > > > 	for (offset = 0; offset+READSIZE <= BIGSIZE; offset += READSIZE) {
> > > > 		rc = pread(fd, iobuf, READSIZE, offset);
> > > > 		assert("pread", rc >= 0);
> > > > 		if (rc != READSIZE) {
> > > > 			fprintf(stderr, "short read (%d out of %d)\n",
> > > > 				rc, READSIZE);
> > > > 			exit(1);
> > > > 		}
> > > > 		for (i=0, p=(int *)iobuf; i<READSIZE; i+=4) {
> > > > 			if (*p) {
> > > > 				fprintf(stderr,
> > > > 					"Found data (%08x) at offset %d+%d\n",
> > > > 					*p, offset, i);
> > > > 				return 1;
> > > > 			}
> > > > 			p++;
> > > > 		}
> > > > 	}
> > > > 	return 0;
> > > > }
> > > > 
> > > > int main(int argc, char *argv[])
> > > > {
> > > > 	char *filename;
> > > > 	int fd;
> > > > 	int pid;
> > > > 	int err;
> > > > 	int pass = 0;
> > > > 	int bufsize;
> > > > 	
> > > > 	if (argc != 2) {
> > > > 		fprintf(stderr, "Needs a filename as an argument.\n");
> > > > 		exit(1);
> > > > 	}
> > > > 	
> > > > 	filename = argv[1];
> > > > 	
> > > > 	pagesize = getpagesize();
> > > > 	bufsize = READSIZE;
> > > > 	if (WRITESIZE > READSIZE)
> > > > 		bufsize = WRITESIZE;
> > > > 	err = posix_memalign((void**) &iobuf, pagesize, bufsize);
> > > > 	if (err) {
> > > > 		fprintf(stderr, "Error allocating %d aligned bytes.\n", bufsize);
> > > > 		exit(1);
> > > > 	}
> > > > 	
> > > > 	fd = open(filename, O_CREAT|O_TRUNC|O_RDWR, 0666);
> > > > 	assert("open", fd >= 0);
> > > > 	
> > > > 	do {
> > > > 		printf("Pass %d...\n", ++pass);
> > > > 		
> > > > 		assert("ftruncate", ftruncate(fd, BIGSIZE) == 0);
> > > > 		fsync(fd);
> > > > 
> > > > 		pid = fork();
> > > > 		assert("fork", pid >= 0);
> > > > 		
> > > > 		if (!pid) {
> > > > 			do_buffered_writes(fd, 0);
> > > > 			exit(0);
> > > > 		}
> > > > 		
> > > > 		err = do_direct_reads(filename);
> > > > 
> > > > 		wait4(pid, NULL, WNOHANG, 0);
> > > > 		
> > > > 		if (err) 
> > > > 			break;
> > > > 
> > > > 		/* Fill the file with a known pattern so that the blocks
> > > > 		 * on disk can be detected if they become exposed. */
> > > > 		do_buffered_writes(fd, 1);
> > > > 		fsync(fd);
> > > > 
> > > > 		assert("ftruncate", ftruncate(fd, 0) == 0);
> > > > 		fsync(fd);
> > > > 	} while (1);
> > > > 	
> > > > 	return err;
> > > > }
> > > > 
> > 
> > -
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/
> 

-- 
Suparna Bhattacharya (suparna@in.ibm.com)
Linux Technology Center
IBM Software Lab, India


  parent reply	other threads:[~2003-12-30  4:48 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <3FCD4B66.8090905@us.ibm.com>
2003-12-06  1:29 ` [PATCH linux-2.6.0-test10-mm1] dio-read-race-fix Daniel McNeil
2003-12-08 18:23   ` Daniel McNeil
2003-12-12  0:51     ` Daniel McNeil
2003-12-17  1:25       ` [PATCH linux-2.6.0-test10-mm1] filemap_fdatawait.patch Daniel McNeil
2003-12-17  2:03         ` Andrew Morton
2003-12-17 19:25           ` Daniel McNeil
2003-12-17 20:17             ` Janet Morgan
2003-12-31  9:18           ` Suparna Bhattacharya
2003-12-31  9:35             ` Andrew Morton
2003-12-31  9:55               ` Suparna Bhattacharya
2003-12-31  9:59                 ` Andrew Morton
2003-12-31 10:09                   ` Suparna Bhattacharya
2003-12-31 10:10                     ` Andrew Morton
2003-12-31 10:48                       ` Suparna Bhattacharya
2003-12-31 10:53                         ` Andrew Morton
2003-12-31 10:54                           ` Andrew Morton
2003-12-31 11:17                             ` Andrew Morton
2003-12-31 22:34                               ` [PATCH linux-2.6.1-rc1-mm1] filemap_fdatawait.patch Daniel McNeil
2003-12-31 22:41                                 ` [PATCH linux-2.6.1-rc1-mm1] aiodio_fallback_bio_count.patch Daniel McNeil
2003-12-31 23:46                                   ` Andrew Morton
2004-01-02  5:14                                     ` Suparna Bhattacharya
2004-01-02  7:46                                       ` Andrew Morton
2004-01-05  3:55                                         ` Suparna Bhattacharya
2004-01-05  5:06                                           ` Andrew Morton
2004-01-05  5:28                                             ` Suparna Bhattacharya
2004-01-05  5:28                                               ` Andrew Morton
2004-01-05  6:06                                                 ` Suparna Bhattacharya
2004-01-05  6:14                                                 ` Lincoln Dale
2003-12-31 22:47                                 ` [PATCH linux-2.6.1-rc1-mm1] dio_isize.patch Daniel McNeil
2003-12-31 23:42                                 ` [PATCH linux-2.6.1-rc1-mm1] filemap_fdatawait.patch Andrew Morton
2004-01-02  4:20                                   ` Suparna Bhattacharya
2004-01-02  4:36                                     ` Andrew Morton
2004-01-02  5:50                               ` [PATCH linux-2.6.0-test10-mm1] filemap_fdatawait.patch Suparna Bhattacharya
2004-01-02  7:31                                 ` Andrew Morton
2004-01-05 13:49                                 ` Marcelo Tosatti
2004-01-05 20:27                                   ` Andrew Morton
2004-03-29 15:44                                 ` Marcelo Tosatti
2004-01-11 23:14                               ` Janet Morgan
2004-01-11 23:44                                 ` Andrew Morton
2004-01-12 18:00                                   ` filemap_fdatawait.patch Daniel McNeil
2004-01-12 19:39                                   ` [PATCH linux-2.6.0-test10-mm1] filemap_fdatawait.patch Janet Morgan
2004-01-12 19:46                                     ` Daniel McNeil
2004-01-13  4:12                                 ` Janet Morgan
2003-12-30  4:53       ` Suparna Bhattacharya [this message]
2003-12-31  0:29         ` [PATCH linux-2.6.0-test10-mm1] dio-read-race-fix Daniel McNeil
2003-12-31  6:09           ` Suparna Bhattacharya
2004-01-08 23:55             ` Daniel McNeil
2004-01-09  3:55               ` Suparna Bhattacharya
2004-02-05  1:39                 ` [PATCH 2.6.2-rc3-mm1] DIO read race fix Daniel McNeil
2004-02-05  1:54                   ` Badari Pulavarty
2004-02-05  2:07                   ` Andrew Morton
2004-02-05  2:54                     ` Janet Morgan
2004-02-05  3:19                       ` Andrew Morton
2004-02-05  3:43                         ` Suparna Bhattacharya
2004-02-05  5:33                   ` Andrew Morton
2004-02-05 17:52                     ` Daniel McNeil
2004-02-05 18:53                     ` Badari Pulavarty
2004-03-29 15:41           ` [PATCH linux-2.6.0-test10-mm1] dio-read-race-fix Suparna Bhattacharya

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20031230045334.GA3484@in.ibm.com \
    --to=suparna@in.ibm.com \
    --cc=akpm@osdl.org \
    --cc=daniel@osdl.org \
    --cc=janetmor@us.ibm.com \
    --cc=linux-aio@kvack.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbadari@us.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).