linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Daniel McNeil <daniel@osdl.org>
To: Janet Morgan <janetmor@us.ibm.com>
Cc: Suparna Bhattacharya <suparna@in.ibm.com>,
	Badari Pulavarty <pbadari@us.ibm.com>,
	"linux-aio@kvack.org" <linux-aio@kvack.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@osdl.org>
Subject: Re: [PATCH linux-2.6.0-test10-mm1] filemap_fdatawait.patch
Date: 16 Dec 2003 17:25:14 -0800	[thread overview]
Message-ID: <1071624314.1826.12.camel@ibm-c.pdx.osdl.net> (raw)
In-Reply-To: <1071190292.1937.13.camel@ibm-c.pdx.osdl.net>

[-- Attachment #1: Type: text/plain, Size: 8469 bytes --]

I have found something else that might be causing the problem.
filemap_fdatawait() skips pages that are not marked PG_writeback.
However, when a page is going to be written, PG_dirty is cleared
before PG_writeback is set (while the PG_locked is set).  So it
looks like filemap_fdatawait() can see a page just before it is
going to be written and not wait for it.  Here is a patch that
makes filemap_fdatawait() wait for locked pages as well to make
sure it does not missed any pages.

I'm running the test on kernel with this patch over night to see
if it fixes the problem.

Thoughts?

Daniel

On Thu, 2003-12-11 at 16:51, Daniel McNeil wrote:
> I've done more testing with added debug code.
> 
> It looks like the filemap_write_and_wait() call is returning
> with data that has not made it disk.
> 
> I added code to filemap_write_and_wait() to check if
> mapping->dirty_pages is not empty after calling filemap_fdatawrite()
> and filemap_fdatawait() and retry.  Even with the retry, the test still
> sees uninitialized data when running tests overnight (I added a printk
> so I know the retry is happening).  There are pages left on the
> dirty_pages list even after the write and wait.   
> 
> I've added a bunch more debug code and am currently running the test
> again to see if I can find out what is going on.
> 
> I'll send more email when I know more.
> 
> Daniel
> 
> 
> 
> On Mon, 2003-12-08 at 10:23, Daniel McNeil wrote:
> > My patch did not fix the problem. :(
> > 
> > The tests over the weekend reported bad data.
> > 
> > The sampling of the i_size after dropping i_sem still looks wrong
> > to me.  I will keep looking and see if I can find another problem.
> > 
> > 
> > Daniel
> > 
> > On Fri, 2003-12-05 at 17:29, Daniel McNeil wrote:
> > > Janet,
> > > 
> > > I think I found the problem that is causing the dio reads to get
> > > unintialized data with buffer writes.  i_size is sampled after
> > > i_sem is dropped and after the i/o have completed.  i_size could
> > > have changed in the mean time.  This patch samples i_size before
> > > dropping i_sem.  I'll leave the test running over the weekend to
> > > verify.
> > > 
> > > Daniel
> > > 
> > > On Tue, 2003-12-02 at 18:33, Janet Morgan wrote:
> > > > Hi Suparna, Daniel,
> > > > 
> > > > Just wanted to let you know that I seem to be hitting a failure when I
> > > > run the testcase (attached) that Stephen Tweedie wrote to expose the DIO
> > > > races (the testcase issues buffered writes and dio reads).  I'm trying to
> > > > debug the failure, but I know it will be slow going for me.
> > > > 
> > > > I hit the problem when I run multiple instances of the test against
> > > > the same filesystem (I do specify a unique file for each instance of
> > > > the test).   I normally run about 6 instances of the test, e.g.,
> > > > "direct_read_under foo1", "direct_read_under foo2", etc.   The test
> > > > runs in an infinite loop until a data mis-compare is detected.
> > > > I have not been able to reproduce the failure when I restrict each
> > > > instance of the test to its own filesystem.
> > > > 
> > > > I've tried running with the combinations below and I eventually
> > > > get a failure in each case.   I assume all but the last combo includes
> > > > the critical aio/dio fixes for the associated base.
> > > > 
> > > > Combinations tested:
> > > > 
> > > >    Daniel's latest:
> > > >        2.6.0-test9-mm5 +
> > > >        aio-dio-fallback-bio_count-race.patch +
> > > >        direct-io-memleak-fix.patch
> > > > 
> > > >    Suparna's latest:
> > > >        2.6.0-test9-mm5 +
> > > >        Suparna's current patches:
> > > >            
> > > > http://marc.theaimsgroup.com/?l=linux-aio&m=106983304420570&w=2  +
> > > >            
> > > > http://marc.theaimsgroup.com/?l=linux-kernel&m=106904658121299&w=2  +
> > > >        direct-io-memleak-fix.patch
> > > >  
> > > >     stock linux-2.6.0-test9 plus:
> > > >       aio-refcnt.patch  +
> > > >       dio-aio-fixes.patch +
> > > >       dio-aio-fixes-fixes.patch +
> > > >       O_DIRECT-race-fixes-rollup.patch 
> > > > 
> > > >     stock linux-2.6.0-test11 plus:
> > > >       O_DIRECT-race-fixes-rollup.patch
> > > >  
> > > > Thanks,
> > > > -Janet
> > > > 
> > > > ______________________________________________________________________
> > > > 
> > > > #define _XOPEN_SOURCE 600
> > > > #define _GNU_SOURCE
> > > > 
> > > > #include <unistd.h>
> > > > #include <stdlib.h>
> > > > #include <stdio.h>
> > > > #include <string.h>
> > > > #include <errno.h>
> > > > #include <sys/fcntl.h>
> > > > #include <sys/mman.h>
> > > > #include <sys/wait.h>
> > > > 
> > > > #define BIGSIZE 128*1024*1024
> > > > #define READSIZE 32*1024*1024
> > > > #define WRITESIZE 32*1024*1024
> > > > 
> > > > int pagesize;
> > > > char *iobuf;
> > > > 
> > > > void assert(const char *what, int assertion)
> > > > {
> > > > 	if (assertion)
> > > > 		return;
> > > > 	perror(what);
> > > > 	exit(1);
> > > > }
> > > > 
> > > > void do_buffered_writes(int fd, int pattern)
> > > > {
> > > > 	int rc;
> > > > 	int offset;
> > > > 	
> > > > 	memset(iobuf, pattern, WRITESIZE);
> > > > 	for (offset = 0; offset+WRITESIZE <= BIGSIZE; offset += WRITESIZE) {
> > > > 		rc = pwrite(fd, iobuf, WRITESIZE, offset);
> > > > 		assert("pwrite", rc >= 0);
> > > > 		if (rc != WRITESIZE) {
> > > > 			fprintf(stderr, "short write (%d out of %d)\n",
> > > > 				rc, WRITESIZE);
> > > > 			exit(1);
> > > > 		}
> > > > 		fsync(fd);
> > > > 	}
> > > > }
> > > > 
> > > > int do_direct_reads(char *filename)
> > > > {
> > > > 	int fd;
> > > > 	int offset;
> > > > 	int rc, i;
> > > > 	int *p;
> > > > 	
> > > > 	fd = open(filename, O_DIRECT|O_RDONLY, 0);
> > > > 	assert("open", fd >= 0);
> > > > 
> > > > 	for (offset = 0; offset+READSIZE <= BIGSIZE; offset += READSIZE) {
> > > > 		rc = pread(fd, iobuf, READSIZE, offset);
> > > > 		assert("pread", rc >= 0);
> > > > 		if (rc != READSIZE) {
> > > > 			fprintf(stderr, "short read (%d out of %d)\n",
> > > > 				rc, READSIZE);
> > > > 			exit(1);
> > > > 		}
> > > > 		for (i=0, p=(int *)iobuf; i<READSIZE; i+=4) {
> > > > 			if (*p) {
> > > > 				fprintf(stderr,
> > > > 					"Found data (%08x) at offset %d+%d\n",
> > > > 					*p, offset, i);
> > > > 				return 1;
> > > > 			}
> > > > 			p++;
> > > > 		}
> > > > 	}
> > > > 	return 0;
> > > > }
> > > > 
> > > > int main(int argc, char *argv[])
> > > > {
> > > > 	char *filename;
> > > > 	int fd;
> > > > 	int pid;
> > > > 	int err;
> > > > 	int pass = 0;
> > > > 	int bufsize;
> > > > 	
> > > > 	if (argc != 2) {
> > > > 		fprintf(stderr, "Needs a filename as an argument.\n");
> > > > 		exit(1);
> > > > 	}
> > > > 	
> > > > 	filename = argv[1];
> > > > 	
> > > > 	pagesize = getpagesize();
> > > > 	bufsize = READSIZE;
> > > > 	if (WRITESIZE > READSIZE)
> > > > 		bufsize = WRITESIZE;
> > > > 	err = posix_memalign((void**) &iobuf, pagesize, bufsize);
> > > > 	if (err) {
> > > > 		fprintf(stderr, "Error allocating %d aligned bytes.\n", bufsize);
> > > > 		exit(1);
> > > > 	}
> > > > 	
> > > > 	fd = open(filename, O_CREAT|O_TRUNC|O_RDWR, 0666);
> > > > 	assert("open", fd >= 0);
> > > > 	
> > > > 	do {
> > > > 		printf("Pass %d...\n", ++pass);
> > > > 		
> > > > 		assert("ftruncate", ftruncate(fd, BIGSIZE) == 0);
> > > > 		fsync(fd);
> > > > 
> > > > 		pid = fork();
> > > > 		assert("fork", pid >= 0);
> > > > 		
> > > > 		if (!pid) {
> > > > 			do_buffered_writes(fd, 0);
> > > > 			exit(0);
> > > > 		}
> > > > 		
> > > > 		err = do_direct_reads(filename);
> > > > 
> > > > 		wait4(pid, NULL, WNOHANG, 0);
> > > > 		
> > > > 		if (err) 
> > > > 			break;
> > > > 
> > > > 		/* Fill the file with a known pattern so that the blocks
> > > > 		 * on disk can be detected if they become exposed. */
> > > > 		do_buffered_writes(fd, 1);
> > > > 		fsync(fd);
> > > > 
> > > > 		assert("ftruncate", ftruncate(fd, 0) == 0);
> > > > 		fsync(fd);
> > > > 	} while (1);
> > > > 	
> > > > 	return err;
> > > > }
> > > > 
> > 
> > -
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

[-- Attachment #2: linux-2.6.0-test10-mm1.fdatawait.patch --]
[-- Type: text/plain, Size: 839 bytes --]

--- linux-2.6.0-test10-mm1.ddm/mm/filemap.c	2003-12-05 17:14:57.000000000 -0800
+++ linux-2.6.0-test10-mm1.ddm.DIO/mm/filemap.c	2003-12-16 17:03:02.801786218 -0800
@@ -188,6 +188,20 @@ restart:
 		struct page *page;
 
 		page = list_entry(mapping->locked_pages.next,struct page,list);
+		/*
+		 * If the page is locked, it might be in process of being 
+		 * setup for writeback but without PG_writeback set 
+		 * and with PG_dirty cleared.
+		 * (PG_dirty is cleared BEFORE PG_writeback is set)
+		 * So, wait for the PG_locked to clear, then start over.
+		 */
+		if (PageLocked(page)) {
+			page_cache_get(page);
+			spin_unlock(&mapping->page_lock);
+			wait_on_page_locked(page);
+			page_cache_release(page);
+			goto restart;
+		}
 		list_del(&page->list);
 		if (PageDirty(page))
 			list_add(&page->list, &mapping->dirty_pages);

  reply	other threads:[~2003-12-17  1:25 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <3FCD4B66.8090905@us.ibm.com>
2003-12-06  1:29 ` [PATCH linux-2.6.0-test10-mm1] dio-read-race-fix Daniel McNeil
2003-12-08 18:23   ` Daniel McNeil
2003-12-12  0:51     ` Daniel McNeil
2003-12-17  1:25       ` Daniel McNeil [this message]
2003-12-17  2:03         ` [PATCH linux-2.6.0-test10-mm1] filemap_fdatawait.patch Andrew Morton
2003-12-17 19:25           ` Daniel McNeil
2003-12-17 20:17             ` Janet Morgan
2003-12-31  9:18           ` Suparna Bhattacharya
2003-12-31  9:35             ` Andrew Morton
2003-12-31  9:55               ` Suparna Bhattacharya
2003-12-31  9:59                 ` Andrew Morton
2003-12-31 10:09                   ` Suparna Bhattacharya
2003-12-31 10:10                     ` Andrew Morton
2003-12-31 10:48                       ` Suparna Bhattacharya
2003-12-31 10:53                         ` Andrew Morton
2003-12-31 10:54                           ` Andrew Morton
2003-12-31 11:17                             ` Andrew Morton
2003-12-31 22:34                               ` [PATCH linux-2.6.1-rc1-mm1] filemap_fdatawait.patch Daniel McNeil
2003-12-31 22:41                                 ` [PATCH linux-2.6.1-rc1-mm1] aiodio_fallback_bio_count.patch Daniel McNeil
2003-12-31 23:46                                   ` Andrew Morton
2004-01-02  5:14                                     ` Suparna Bhattacharya
2004-01-02  7:46                                       ` Andrew Morton
2004-01-05  3:55                                         ` Suparna Bhattacharya
2004-01-05  5:06                                           ` Andrew Morton
2004-01-05  5:28                                             ` Suparna Bhattacharya
2004-01-05  5:28                                               ` Andrew Morton
2004-01-05  6:06                                                 ` Suparna Bhattacharya
2004-01-05  6:14                                                 ` Lincoln Dale
2003-12-31 22:47                                 ` [PATCH linux-2.6.1-rc1-mm1] dio_isize.patch Daniel McNeil
2003-12-31 23:42                                 ` [PATCH linux-2.6.1-rc1-mm1] filemap_fdatawait.patch Andrew Morton
2004-01-02  4:20                                   ` Suparna Bhattacharya
2004-01-02  4:36                                     ` Andrew Morton
2004-01-02  5:50                               ` [PATCH linux-2.6.0-test10-mm1] filemap_fdatawait.patch Suparna Bhattacharya
2004-01-02  7:31                                 ` Andrew Morton
2004-01-05 13:49                                 ` Marcelo Tosatti
2004-01-05 20:27                                   ` Andrew Morton
2004-03-29 15:44                                 ` Marcelo Tosatti
2004-01-11 23:14                               ` Janet Morgan
2004-01-11 23:44                                 ` Andrew Morton
2004-01-12 18:00                                   ` filemap_fdatawait.patch Daniel McNeil
2004-01-12 19:39                                   ` [PATCH linux-2.6.0-test10-mm1] filemap_fdatawait.patch Janet Morgan
2004-01-12 19:46                                     ` Daniel McNeil
2004-01-13  4:12                                 ` Janet Morgan
2003-12-30  4:53       ` [PATCH linux-2.6.0-test10-mm1] dio-read-race-fix Suparna Bhattacharya
2003-12-31  0:29         ` Daniel McNeil
2003-12-31  6:09           ` Suparna Bhattacharya
2004-01-08 23:55             ` Daniel McNeil
2004-01-09  3:55               ` Suparna Bhattacharya
2004-02-05  1:39                 ` [PATCH 2.6.2-rc3-mm1] DIO read race fix Daniel McNeil
2004-02-05  1:54                   ` Badari Pulavarty
2004-02-05  2:07                   ` Andrew Morton
2004-02-05  2:54                     ` Janet Morgan
2004-02-05  3:19                       ` Andrew Morton
2004-02-05  3:43                         ` Suparna Bhattacharya
2004-02-05  5:33                   ` Andrew Morton
2004-02-05 17:52                     ` Daniel McNeil
2004-02-05 18:53                     ` Badari Pulavarty
2004-03-29 15:41           ` [PATCH linux-2.6.0-test10-mm1] dio-read-race-fix Suparna Bhattacharya

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1071624314.1826.12.camel@ibm-c.pdx.osdl.net \
    --to=daniel@osdl.org \
    --cc=akpm@osdl.org \
    --cc=janetmor@us.ibm.com \
    --cc=linux-aio@kvack.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbadari@us.ibm.com \
    --cc=suparna@in.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).