All of lore.kernel.org
 help / color / mirror / Atom feed
* AIO and DIO testing on 2.6.0-test7-mm1
@ 2003-10-17 23:12 Daniel McNeil
  2003-10-18  0:24 ` Andrew Morton
  2003-10-20 14:27 ` Suparna Bhattacharya
  0 siblings, 2 replies; 19+ messages in thread
From: Daniel McNeil @ 2003-10-17 23:12 UTC (permalink / raw)
  To: Andrew Morton, Suparna Bhattacharya; +Cc: linux-aio, Linux Kernel Mailing List

I've been doing some testing on 2.6.0-test7-mm1 with O_DIRECT and
AIO.  I wrote some tests to check the buffered i/o verses O_DIRECT
i/o races and O_DIRECT verses truncate.

I still had apply suparna's direct-io.c patch to prevent oopses.
Suparna, you said the patch was not the complete patch, do you have
the complete patch?

I wrote some simple tests to create 2 processes with one doing
O_DIRECT i/o (of zeros) and the other doing buffered i/o and checking
that the buffered i/o process never saw non-zero data which was
posssible before the i_alloc_sem and other changes.  The test I
wrote are:
	dio_append - use O_DIRECT to append while doing buffered reads
 	dio_sparse - write O_DIRECT to holes while doing buffered reads
	dio_truncate - do O_DIRECT reads while truncating
	aiodio_append - same as dio_append but with AIO
	aiodio_sparse - same as dio_sparse but with AIO

These and a simple README are available here
	http://developer.osdl.org/daniel/AIO/TESTS/

The good news was I  never got any non-zero data or oops on test7-mm1
(with the direct-io patch).

Unfortunately, when I ran these on test7, I also did not get any
non-zero data.  On AIO test7 still gives me oopses:

Slab corruption: start=e7d9573c, expend=e7d957db, problemat=e7d95774
Last user: [<c018b612>](__aio_put_req+0x97/0x185)
Data: ********************************************************00 00 89 02 00 00
00 00 ***********************************************************************************************A5
Next: 71 F0 2C .12 B6 18 C0 71 F0 2C .********************
slab error in check_poison_obj(): cache `kiocb': object was modified after freeing
Call Trace:
 [<c0148c95>] check_poison_obj+0x106/0x18f
 [<c0148ed4>] slab_destroy+0x1b6/0x1be
 [<c014bf2f>] reap_timer_fnc+0x254/0x326
 [<c014bcdb>] reap_timer_fnc+0x0/0x326
 [<c012d80d>] run_timer_softirq+0xed/0x226
 [<c0128dcd>] do_softirq+0xc9/0xcb
 [<c011a165>] smp_apic_timer_interrupt+0xcd/0x135
 [<c0108029>] default_idle+0x0/0x32
 [<c010b0b6>] apic_timer_interrupt+0x1a/0x20
 [<c0108029>] default_idle+0x0/0x32
 [<c0108056>] default_idle+0x2d/0x32
 [<c01080d4>] cpu_idle+0x3a/0x43
 [<c0125050>] printk+0x1b4/0x258


I guess this expected because it does not have the ref counting and
other AIO fixes.

I'm planning on doing more testing and write some AIO verses truncate
tests.

If you have any ideas on how to better test the AIO and O_DIRET changes
in -mm, just let me know.

Daniel


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: AIO and DIO testing on 2.6.0-test7-mm1
  2003-10-17 23:12 AIO and DIO testing on 2.6.0-test7-mm1 Daniel McNeil
@ 2003-10-18  0:24 ` Andrew Morton
  2003-10-20 14:27 ` Suparna Bhattacharya
  1 sibling, 0 replies; 19+ messages in thread
From: Andrew Morton @ 2003-10-18  0:24 UTC (permalink / raw)
  To: Daniel McNeil; +Cc: suparna, linux-aio, linux-kernel

Daniel McNeil <daniel@osdl.org> wrote:
>
> On AIO test7 still gives me oopses:

Oh crap, that means I need to rebase those two patches on top of -linus,
not on top of -linus+AIOstuff.

I shall do that, thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: AIO and DIO testing on 2.6.0-test7-mm1
  2003-10-17 23:12 AIO and DIO testing on 2.6.0-test7-mm1 Daniel McNeil
  2003-10-18  0:24 ` Andrew Morton
@ 2003-10-20 14:27 ` Suparna Bhattacharya
  2003-10-20 23:47   ` Daniel McNeil
  1 sibling, 1 reply; 19+ messages in thread
From: Suparna Bhattacharya @ 2003-10-20 14:27 UTC (permalink / raw)
  To: Daniel McNeil; +Cc: Andrew Morton, linux-aio, Linux Kernel Mailing List

On Fri, Oct 17, 2003 at 04:12:58PM -0700, Daniel McNeil wrote:
> I've been doing some testing on 2.6.0-test7-mm1 with O_DIRECT and
> AIO.  I wrote some tests to check the buffered i/o verses O_DIRECT
> i/o races and O_DIRECT verses truncate.

Sounds very useful - thanks for doing this !

> 
> I still had apply suparna's direct-io.c patch to prevent oopses.
> Suparna, you said the patch was not the complete patch, do you have
> the complete patch?

Not yet.
A complete patch would need to address one more case that's rather
tricky to solve -- the case where a single dio request covers an 
allocated region followed by a hole.

Besides that there is a pending bug to address -- i.e to avoid
dropping i_sem during the actual i/o in the case where we are
extending the file (an intervening buffered write could extend
i_size, exposing uninitialized blocks). For AIO-DIO this means 
forcing the i/o to be synchronous for this case (as also for 
the case where we are overwriting an allocated region followed 
by a hole). Until we can use i/o barriers.

I was playing with a retry-based implementation for AIO-DIO,
where the first (tricky) case above becomes simple to handle ...
But didn't get a chance to work much on this during the last 
few days. I actually do have a patch, but there are occasional 
hangs with a lot of AIO-DIO writes to an ext3 filesystem in 
particular, that I can't explain as yet.

Do your testcases cover any of these cases already ?

Regards
Suparna

> 
> I wrote some simple tests to create 2 processes with one doing
> O_DIRECT i/o (of zeros) and the other doing buffered i/o and checking
> that the buffered i/o process never saw non-zero data which was
> posssible before the i_alloc_sem and other changes.  The test I
> wrote are:
> 	dio_append - use O_DIRECT to append while doing buffered reads
>  	dio_sparse - write O_DIRECT to holes while doing buffered reads
> 	dio_truncate - do O_DIRECT reads while truncating
> 	aiodio_append - same as dio_append but with AIO
> 	aiodio_sparse - same as dio_sparse but with AIO
> 
> These and a simple README are available here
> 	http://developer.osdl.org/daniel/AIO/TESTS/
> 
> The good news was I  never got any non-zero data or oops on test7-mm1
> (with the direct-io patch).
> 
> Unfortunately, when I ran these on test7, I also did not get any
> non-zero data.  On AIO test7 still gives me oopses:
> 
> Slab corruption: start=e7d9573c, expend=e7d957db, problemat=e7d95774
> Last user: [<c018b612>](__aio_put_req+0x97/0x185)
> Data: ********************************************************00 00 89 02 00 00
> 00 00 ***********************************************************************************************A5
> Next: 71 F0 2C .12 B6 18 C0 71 F0 2C .********************
> slab error in check_poison_obj(): cache `kiocb': object was modified after freeing
> Call Trace:
>  [<c0148c95>] check_poison_obj+0x106/0x18f
>  [<c0148ed4>] slab_destroy+0x1b6/0x1be
>  [<c014bf2f>] reap_timer_fnc+0x254/0x326
>  [<c014bcdb>] reap_timer_fnc+0x0/0x326
>  [<c012d80d>] run_timer_softirq+0xed/0x226
>  [<c0128dcd>] do_softirq+0xc9/0xcb
>  [<c011a165>] smp_apic_timer_interrupt+0xcd/0x135
>  [<c0108029>] default_idle+0x0/0x32
>  [<c010b0b6>] apic_timer_interrupt+0x1a/0x20
>  [<c0108029>] default_idle+0x0/0x32
>  [<c0108056>] default_idle+0x2d/0x32
>  [<c01080d4>] cpu_idle+0x3a/0x43
>  [<c0125050>] printk+0x1b4/0x258
> 
> 
> I guess this expected because it does not have the ref counting and
> other AIO fixes.
> 
> I'm planning on doing more testing and write some AIO verses truncate
> tests.
> 
> If you have any ideas on how to better test the AIO and O_DIRET changes
> in -mm, just let me know.
> 
> Daniel
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-aio' in
> the body to majordomo@kvack.org.  For more info on Linux AIO,
> see: http://www.kvack.org/aio/
> Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>

-- 
Suparna Bhattacharya (suparna@in.ibm.com)
Linux Technology Center
IBM Software Labs, India


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: AIO and DIO testing on 2.6.0-test7-mm1
  2003-10-20 14:27 ` Suparna Bhattacharya
@ 2003-10-20 23:47   ` Daniel McNeil
  2003-10-21 12:11     ` Patch for Retry based AIO-DIO (Was AIO and DIO testing on 2.6.0-test7-mm1) Suparna Bhattacharya
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel McNeil @ 2003-10-20 23:47 UTC (permalink / raw)
  To: Suparna Bhattacharya; +Cc: Andrew Morton, linux-aio, Linux Kernel Mailing List

On Mon, 2003-10-20 at 07:27, Suparna Bhattacharya wrote:
> On Fri, Oct 17, 2003 at 04:12:58PM -0700, Daniel McNeil wrote:
> > I've been doing some testing on 2.6.0-test7-mm1 with O_DIRECT and
> > AIO.  I wrote some tests to check the buffered i/o verses O_DIRECT
> > i/o races and O_DIRECT verses truncate.
> 
> Sounds very useful - thanks for doing this !
> 
> > 
> > I still had apply suparna's direct-io.c patch to prevent oopses.
> > Suparna, you said the patch was not the complete patch, do you have
> > the complete patch?
> 
> Not yet.
> A complete patch would need to address one more case that's rather
> tricky to solve -- the case where a single dio request covers an 
> allocated region followed by a hole.
> 
> Besides that there is a pending bug to address -- i.e to avoid
> dropping i_sem during the actual i/o in the case where we are
> extending the file (an intervening buffered write could extend
> i_size, exposing uninitialized blocks). For AIO-DIO this means 
> forcing the i/o to be synchronous for this case (as also for 
> the case where we are overwriting an allocated region followed 
> by a hole). Until we can use i/o barriers.
> 
> I was playing with a retry-based implementation for AIO-DIO,
> where the first (tricky) case above becomes simple to handle ...
> But didn't get a chance to work much on this during the last 
> few days. I actually do have a patch, but there are occasional 
> hangs with a lot of AIO-DIO writes to an ext3 filesystem in 
> particular, that I can't explain as yet.
> 
> Do your testcases cover any of these cases already ?
> 
> Regards
> Suparna
> 

I was hoping that my test cases would hit some of these potential AIO
verses buffered write (growing the file and filling in holes) races.
My tests always write zeroes and holes should always return zeroes, so
the check makes sure that the data is always zero. I am planning on
updating the tests to have more readers and maybe more writers to
check if we ever see uninitialized data.  I am starting to test on
2.6-test8 and will let you know how it goes.

Daniel


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Patch for Retry based AIO-DIO (Was AIO and DIO testing on 2.6.0-test7-mm1)
  2003-10-20 23:47   ` Daniel McNeil
@ 2003-10-21 12:11     ` Suparna Bhattacharya
  2003-10-23  0:40       ` Daniel McNeil
  0 siblings, 1 reply; 19+ messages in thread
From: Suparna Bhattacharya @ 2003-10-21 12:11 UTC (permalink / raw)
  To: Daniel McNeil
  Cc: Andrew Morton, linux-aio, Linux Kernel Mailing List, pbadari

On Mon, Oct 20, 2003 at 04:47:53PM -0700, Daniel McNeil wrote:
> On Mon, 2003-10-20 at 07:27, Suparna Bhattacharya wrote:
> > On Fri, Oct 17, 2003 at 04:12:58PM -0700, Daniel McNeil wrote:
> > > 
> > > I still had apply suparna's direct-io.c patch to prevent oopses.
> > > Suparna, you said the patch was not the complete patch, do you have
> > > the complete patch?
> > 
> > Not yet.
> > A complete patch would need to address one more case that's rather
> > tricky to solve -- the case where a single dio request covers an 
> > allocated region followed by a hole.
> > 
> > Besides that there is a pending bug to address -- i.e to avoid
> > dropping i_sem during the actual i/o in the case where we are
> > extending the file (an intervening buffered write could extend
> > i_size, exposing uninitialized blocks). For AIO-DIO this means 
> > forcing the i/o to be synchronous for this case (as also for 
> > the case where we are overwriting an allocated region followed 
> > by a hole). Until we can use i/o barriers.
> > 
> > I was playing with a retry-based implementation for AIO-DIO,
> > where the first (tricky) case above becomes simple to handle ...
> > But didn't get a chance to work much on this during the last 
> > few days. I actually do have a patch, but there are occasional 
> > hangs with a lot of AIO-DIO writes to an ext3 filesystem in 
> > particular, that I can't explain as yet.
> > 

OK, here is the patch I was talking about, in case you get a 
chance to experiment with it. (you'll need to reverse the 
earlier incomplete fix for overwriting holes if you apply
this).

It converts AIO-DIO to be based on the retry model, and addresses
the following known remaining loopholes in the current DIO
code:

1) For DIO file extends, we can't drop i_sem during the actual
   i/o  -- intermediate writes could extend i_size exposing
   unwritten blocks
2) For AIO-DIO extends, we are updating i_size before i/o 
   completes --- could expose unwritten blocks to intermediate
   reads
3) For AIO-DIO writes to holes, finished_one_bio causes 
   aio_complete() (and dio end_io) to be called before 
   falling back to buffered i/o. 
4) For AIO-DIO writes to allocated blocks followed by a hole,
   we've submitted BIOs for some of the i/o before we come
   across the hole and then we fallback to buffered i/o
   without waiting for that i/o to complete.

This patch avoids dropping i_sem in between during synchronous 
writes altogether. I had initially skipped dropping i_sem in the
case of file extends only, but observed some hangs with ext3
... couldn't be sure if it was really OK to drop i_sem before
completing the ext3_stop_journal. In the AIO case i_sem is
released during the i/o when the request overwrites existing
blocks only.

The second thing it does is to wait for i/o to complete even
for AIO, if any part of the request involves overwriting a hole 
or extending the file (takes care of 1, 2 and 4). Using the
retry model makes it possible to switch from async to sync
wait if appropriate _after_ the i/o is issued.


Regards
Suparna

diffstat aio-dio-retry.patch

 fs/aio.c            |    9 ++
 fs/direct-io.c      |  177 ++++++++++++++++++++++++++--------------------------
 include/linux/aio.h |    2 
 mm/filemap.c        |    4 -
 4 files changed, 100 insertions(+), 92 deletions(-)


diff -urp linux-2.6.0-test6-mm4/fs/aio.c 260t6mm4/fs/aio.c
--- linux-2.6.0-test6-mm4/fs/aio.c	Tue Oct 21 19:20:50 2003
+++ 260t6mm4/fs/aio.c	Tue Oct 21 19:09:26 2003
@@ -408,6 +408,7 @@ static struct kiocb *__aio_get_req(struc
 	req->ki_cancel = NULL;
 	req->ki_retry = NULL;
 	req->ki_user_obj = NULL;
+	req->ki_private = NULL;
 	INIT_LIST_HEAD(&req->ki_run_list);
 
 	/* Check if the completion queue has enough free space to
@@ -700,8 +701,12 @@ static ssize_t aio_run_iocb(struct kiocb
 	iocb->ki_retry = NULL;
 	spin_unlock_irq(&ctx->ctx_lock);
 
-	/* Quit retrying if the i/o has been cancelled */
-	if (kiocbIsCancelled(iocb)) {
+	/* 
+	 * Quit retrying if the i/o has been cancelled unless
+	 * there is some private iocb state that is still being 
+	 * referenced
+	 */
+	if (kiocbIsCancelled(iocb) && !iocb->ki_private) {
 		ret = -EINTR;
 		aio_complete(iocb, ret, 0);
 		/* must not access the iocb after this */
diff -urp linux-2.6.0-test6-mm4/fs/direct-io.c 260t6mm4/fs/direct-io.c
--- linux-2.6.0-test6-mm4/fs/direct-io.c	Tue Oct 21 18:58:52 2003
+++ 260t6mm4/fs/direct-io.c	Tue Oct 21 19:14:55 2003
@@ -118,11 +118,10 @@ struct dio {
 	atomic_t bios_in_flight;	/* nr bios in flight */
 	spinlock_t bio_list_lock;	/* protects bio_list */
 	struct bio *bio_list;		/* singly linked via bi_private */
-	struct task_struct *waiter;	/* waiting task (NULL if none) */
+	wait_queue_head_t wqh;		/* to wait on in-flight bios */
 
 	/* AIO related stuff */
 	struct kiocb *iocb;		/* kiocb */
-	int is_async;			/* is IO async ? */
 	int result;			/* IO result */
 };
 
@@ -218,33 +217,12 @@ static void dio_complete(struct dio *dio
  * Called when a BIO has been processed.  If the count goes to zero then IO is
  * complete and we can signal this to the AIO layer.
  */
-static void finished_one_bio(struct dio *dio)
+static inline void finished_one_bio(struct dio *dio)
 {
-	if (atomic_dec_and_test(&dio->bio_count)) {
-		if (dio->is_async) {
-			dio_complete(dio, dio->block_in_file << dio->blkbits,
-					dio->result);
-			aio_complete(dio->iocb, dio->result, 0);
-			kfree(dio);
-		}
-	}
+	atomic_dec(&dio->bio_count);
 }
 
 static int dio_bio_complete(struct dio *dio, struct bio *bio);
-/*
- * Asynchronous IO callback. 
- */
-static int dio_bio_end_aio(struct bio *bio, unsigned int bytes_done, int error)
-{
-	struct dio *dio = bio->bi_private;
-
-	if (bio->bi_size)
-		return 1;
-
-	/* cleanup the bio */
-	dio_bio_complete(dio, bio);
-	return 0;
-}
 
 /*
  * The BIO completion handler simply queues the BIO up for the process-context
@@ -265,8 +243,8 @@ static int dio_bio_end_io(struct bio *bi
 	bio->bi_private = dio->bio_list;
 	dio->bio_list = bio;
 	atomic_dec(&dio->bios_in_flight);
-	if (dio->waiter && atomic_read(&dio->bios_in_flight) == 0)
-		wake_up_process(dio->waiter);
+	if (atomic_read(&dio->bios_in_flight) == 0)
+		wake_up(&dio->wqh);
 	spin_unlock_irqrestore(&dio->bio_list_lock, flags);
 	return 0;
 }
@@ -283,10 +261,7 @@ dio_bio_alloc(struct dio *dio, struct bl
 
 	bio->bi_bdev = bdev;
 	bio->bi_sector = first_sector;
-	if (dio->is_async)
-		bio->bi_end_io = dio_bio_end_aio;
-	else
-		bio->bi_end_io = dio_bio_end_io;
+	bio->bi_end_io = dio_bio_end_io;
 
 	dio->bio = bio;
 	return 0;
@@ -304,8 +279,6 @@ static void dio_bio_submit(struct dio *d
 	bio->bi_private = dio;
 	atomic_inc(&dio->bio_count);
 	atomic_inc(&dio->bios_in_flight);
-	if (dio->is_async && dio->rw == READ)
-		bio_set_pages_dirty(bio);
 	submit_bio(dio->rw, bio);
 
 	dio->bio = NULL;
@@ -324,23 +297,29 @@ static void dio_cleanup(struct dio *dio)
 /*
  * Wait for the next BIO to complete.  Remove it and return it.
  */
-static struct bio *dio_await_one(struct dio *dio)
+static struct bio *dio_await_one(struct dio *dio, wait_queue_t *wait)
 {
 	unsigned long flags;
 	struct bio *bio;
 
 	spin_lock_irqsave(&dio->bio_list_lock, flags);
-	while (dio->bio_list == NULL) {
-		set_current_state(TASK_UNINTERRUPTIBLE);
-		if (dio->bio_list == NULL) {
-			dio->waiter = current;
+	if (dio->bio_list == NULL) {
+		DEFINE_WAIT(local_wait);
+
+		if (!wait)
+			wait = &local_wait;
+		do {	
+			prepare_to_wait(&dio->wqh, wait, TASK_UNINTERRUPTIBLE);
+			if (dio->bio_list != NULL)
+				break;
 			spin_unlock_irqrestore(&dio->bio_list_lock, flags);
 			blk_run_queues();
+			if (!is_sync_wait(wait))
+				return ERR_PTR(-EIOCBRETRY);
 			io_schedule();
 			spin_lock_irqsave(&dio->bio_list_lock, flags);
-			dio->waiter = NULL;
-		}
-		set_current_state(TASK_RUNNING);
+		} while (dio->bio_list == NULL);
+		finish_wait(&dio->wqh, wait);
 	}
 	bio = dio->bio_list;
 	dio->bio_list = bio->bi_private;
@@ -360,26 +339,25 @@ static int dio_bio_complete(struct dio *
 	if (!uptodate)
 		dio->result = -EIO;
 
-	if (dio->is_async && dio->rw == READ) {
-		bio_check_pages_dirty(bio);	/* transfers ownership */
-	} else {
-		for (page_no = 0; page_no < bio->bi_vcnt; page_no++) {
-			struct page *page = bvec[page_no].bv_page;
+	for (page_no = 0; page_no < bio->bi_vcnt; page_no++) {
+		struct page *page = bvec[page_no].bv_page;
 
-			if (dio->rw == READ)
-				set_page_dirty_lock(page);
-			page_cache_release(page);
-		}
-		bio_put(bio);
+		if (dio->rw == READ)
+			set_page_dirty_lock(page);
+		page_cache_release(page);
 	}
+	bio_put(bio);
+
 	finished_one_bio(dio);
 	return uptodate ? 0 : -EIO;
 }
 
 /*
  * Wait on and process all in-flight BIOs.
+ * Complete and free the dio once all are done.
  */
-static int dio_await_completion(struct dio *dio)
+static int dio_await_completion(struct dio *dio, loff_t offset, 
+	wait_queue_t *wait)
 {
 	int ret = 0;
 
@@ -387,13 +365,25 @@ static int dio_await_completion(struct d
 		dio_bio_submit(dio);
 
 	while (atomic_read(&dio->bio_count)) {
-		struct bio *bio = dio_await_one(dio);
+		struct bio *bio = dio_await_one(dio, wait);
 		int ret2;
 
+		if (IS_ERR(bio)) {
+			ret = PTR_ERR(bio);
+			return ret;
+		}
 		ret2 = dio_bio_complete(dio, bio);
 		if (ret == 0)
 			ret = ret2;
 	}
+	if (ret == 0)
+		ret = dio->page_errors;
+	if ((ret == 0) && dio->result)
+		ret = dio->result;
+	dio_complete(dio, offset, ret);
+	dio->iocb->ki_private = NULL;
+	kfree(dio);
+
 	return ret;
 }
 
@@ -875,8 +865,12 @@ direct_io_worker(int rw, struct kiocb *i
 	int ret = 0;
 	int ret2;
 	size_t bytes;
+	loff_t end = offset;
+	struct dio *old_dio = iocb->ki_private;
+	wait_queue_t *io_wait = current->io_wait;
 
-	dio->is_async = !is_sync_kiocb(iocb);
+	if (dio == old_dio) /* i/o has already been issued for this dio */
+		goto do_wait;
 
 	dio->bio = NULL;
 	dio->inode = inode;
@@ -899,6 +893,7 @@ direct_io_worker(int rw, struct kiocb *i
 	dio->page_errors = 0;
 	dio->result = 0;
 	dio->iocb = iocb;
+	iocb->ki_private = dio;
 
 	/*
 	 * BIO completion state.
@@ -912,7 +907,7 @@ direct_io_worker(int rw, struct kiocb *i
 	atomic_set(&dio->bios_in_flight, 0);
 	spin_lock_init(&dio->bio_list_lock);
 	dio->bio_list = NULL;
-	dio->waiter = NULL;
+	init_waitqueue_head(&dio->wqh);
 
 	dio->pages_in_io = 0;
 	for (seg = 0; seg < nr_segs; seg++) 
@@ -920,7 +915,7 @@ direct_io_worker(int rw, struct kiocb *i
 
 	for (seg = 0; seg < nr_segs; seg++) {
 		user_addr = (unsigned long)iov[seg].iov_base;
-		bytes = iov[seg].iov_len;
+		end += bytes = iov[seg].iov_len;
 
 		/* Index into the first page of the first block */
 		dio->first_block_in_page = (user_addr & ~PAGE_MASK) >> blkbits;
@@ -968,41 +963,50 @@ direct_io_worker(int rw, struct kiocb *i
 		dio_bio_submit(dio);
 
 	/*
-	 * All new block allocations have been performed.  We can let i_sem
-	 * go now.
+	 * If this is a file extend we have to hold i_sem throughout 
+	 * till the data hits the disk and i_size is updated, to avoid 
+	 * exposing uninitialized blocks. 
+	 *
+	 * For AIO file extends we just take the easy way out and wait 
+	 * synchronously in this situation; typically databases
+	 * would have pre-allocated the file before issuing AIO.
+	 * We may have to change this if there are real world 
+	 * users of file extending AIO who are affected.
+	 *
+	 * Secondly if we ran into a hole after submitting i/o for 
+	 * some of the blocks, we wait for issued i/o to complete 
+	 * before falling back to buffered i/o.
 	 */
-	if (dio->needs_locking)
+	if ((end > i_size_read(dio->inode)) || ret) {
+		/* switch to synchronous wait even if we are in AIO */
+		io_wait = NULL;
+	}
+	/*
+	 * All block lookups have been performed.  We can let i_sem
+	 * go now if this is a read request.
+	 */
+	if ((rw == READ) && dio->needs_locking)
 		up(&dio->inode->i_sem);
 
 	/*
 	 * OK, all BIOs are submitted, so we can decrement bio_count to truly
 	 * reflect the number of to-be-processed BIOs.
 	 */
-	if (dio->is_async) {
-		if (ret == 0)
-			ret = dio->result;	/* Bytes written */
-		finished_one_bio(dio);		/* This can free the dio */
-		blk_run_queues();
-	} else {
-		finished_one_bio(dio);
-		ret2 = dio_await_completion(dio);
-		if (ret == 0)
-			ret = ret2;
-		if (ret == 0)
-			ret = dio->page_errors;
-		if (ret == 0 && dio->result) {
-			loff_t i_size = i_size_read(inode);
+	finished_one_bio(dio);
 
-			ret = dio->result;
-			/*
-			 * Adjust the return value if the read crossed a
-			 * non-block-aligned EOF.
-			 */
-			if (rw == READ && (offset + ret > i_size))
-				ret = i_size - offset;
-		}
-		dio_complete(dio, offset, ret);
-		kfree(dio);
+do_wait:
+	ret2 = dio_await_completion(dio, offset, io_wait);
+	if (ret == 0)
+		ret = ret2;
+	if (ret == 0) {
+		loff_t i_size = i_size_read(inode);
+
+		/*
+		 * Adjust the return value if the read crossed a
+		 * non-block-aligned EOF.
+		 */
+		if (rw == READ && (offset + ret > i_size))
+			ret = i_size - offset;
 	}
 	return ret;
 }
@@ -1028,7 +1032,7 @@
 	unsigned bdev_blkbits = 0;
 	unsigned blocksize_mask = (1 << blkbits) - 1;
 	ssize_t retval = -EINVAL;
-	struct dio *dio;
+	struct dio *dio = iocb->ki_private;
 	int needs_locking;
 
 	if (bdev)
@@ -1055,6 +1061,8 @@ __blockdev_direct_IO(int rw, struct kioc
 		}
 	}
 
+	if (dio)
+		goto have_dio;	
 	dio = kmalloc(sizeof(*dio), GFP_KERNEL);
 	retval = -ENOMEM;
 	if (!dio)
@@ -1081,10 +1089,9 @@ __blockdev_direct_IO(int rw, struct kioc
 	}
 	dio->needs_locking = needs_locking;
 
+have_dio:
 	retval = direct_io_worker(rw, iocb, inode, iov, offset,
 				nr_segs, blkbits, get_blocks, end_io, dio);
-	if (needs_locking && rw == WRITE)
-		down(&inode->i_sem);
 out:
 	return retval;
 }
diff -urp linux-2.6.0-test6-mm4/include/linux/aio.h 260t6mm4/include/linux/aio.h
--- linux-2.6.0-test6-mm4/include/linux/aio.h	Tue Oct 21 18:58:55 2003
+++ 260t6mm4/include/linux/aio.h	Tue Oct 14 21:36:43 2003
@@ -74,6 +74,7 @@ struct kiocb {
 	char 			*ki_buf;	/* remaining iocb->aio_buf */
 	size_t			ki_left; 	/* remaining bytes */
 	wait_queue_t		ki_wait;
+	void			*ki_private;
 	long			ki_retried; 	/* just for testing */
 	long			ki_kicked; 	/* just for testing */
 	long			ki_queued; 	/* just for testing */
@@ -94,6 +95,7 @@ struct kiocb {
 		(x)->ki_user_obj = tsk;			\
 		(x)->ki_user_data = 0;			\
 		init_wait((&(x)->ki_wait));		\
+		(x)->ki_private = NULL;			\
 	} while (0)
 
 #define AIO_RING_MAGIC			0xa10a10a1
diff -urp linux-2.6.0-test6-mm4/mm/filemap.c 260t6mm4/mm/filemap.c
--- linux-2.6.0-test6-mm4/mm/filemap.c	Tue Oct 21 18:58:56 2003
+++ 260t6mm4/mm/filemap.c	Tue Oct 21 19:16:20 2003
@@ -926,8 +926,6 @@ __generic_file_aio_read(struct kiocb *io
 		if (pos < size) {
 			retval = generic_file_direct_IO(READ, iocb,
 						iov, pos, nr_segs);
-			if (retval >= 0 && !is_sync_kiocb(iocb))
-				retval = -EIOCBQUEUED;
 			if (retval > 0)
 				*ppos = pos + retval;
 		}
@@ -1892,8 +1890,6 @@ __generic_file_aio_write_nolock(struct k
 		 */
 		if (written >= 0 && file->f_flags & O_SYNC)
 			status = generic_osync_inode(inode, mapping, OSYNC_METADATA);
-		if (written >= 0 && !is_sync_kiocb(iocb))
-			written = -EIOCBQUEUED;
 		if (written != -ENOTBLK)
 			goto out_status;
 		/*

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Patch for Retry based AIO-DIO (Was AIO and DIO testing on 2.6.0-test7-mm1)
  2003-10-21 12:11     ` Patch for Retry based AIO-DIO (Was AIO and DIO testing on 2.6.0-test7-mm1) Suparna Bhattacharya
@ 2003-10-23  0:40       ` Daniel McNeil
  2003-10-23 10:49         ` Suparna Bhattacharya
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel McNeil @ 2003-10-23  0:40 UTC (permalink / raw)
  To: Suparna Bhattacharya
  Cc: Andrew Morton, linux-aio, Linux Kernel Mailing List, Badari Pulavarty

Suparna and Andrew,

I've been doing more testing using the test programs I wrote to 
try and hit the AIO verses buffered read race conditions.

I tested 2.6.0-test8, 2.6.0-test8-mm1+(your first incomplete fix) and
2.6.0-test8-mm1+aio-dio-retry patch.  I used my test programs
(http://developer.osdl.org/daniel/AIO/TESTS/) by doing:

Run "dirty" program which allocates and writes 0xaa to a file and then
	frees the space.
Run "dio_sparse" or "aiodio-sparse - which creates "file", truncates it
	up to 64MB and then writes zeros into the holes (using DIO or
	AIO+DIO).  At same time, a forked child is reading the file
	looking for non-zero data.
rm "file"

On 2.6.0-test8
==============
I hit the race condition and see uninitialized data:
~/AIO/TESTS/dio_sparse
non zero buffer at buf[4] => 0xaaaaaaaa,aaaaaaaa,aaaaaaaa,aaaaaaaa
non-zero read at offset 24182785

 ~/AIO/TESTS/aiodio_sparse
non zero buffer at buf[4] -> 0xaaaaaaaa,aaaaaaaa,aaaaaaaa,aaaaaaaa
non-zero read at offset 8323062


On 2.6.0-test8-mm1+1st-direct-io-aio_complete patch and
   2.6.0-test8-mm1+aio-dio-retry patch

I never see uninitialized data.

Reading over your description and looking over the code, I'm pretty
sure I see the races you are talking about but had some questions:

1) DIO file extends.  So are you saying that in direct_io_worker()
   we were dropping i_sem and then waiting for the i/o to complete,
   but since i_sem was dropped there could be another write exposing
   the uninitialized data still in process of being written?
   
   If yes, then wouldn't the intermediate write be responsible for
   zeroing all data between the last isize and the new isize?

2) aio-dio extends

   Are you saying that in __generic_file_aio_write_nolock()
   that generic_file_direct_IO() returns the number of bytes
   that are being AIO written but not complete and we were updating
   i_size write after that?  (just trying to understand the code).


I'm still looking at the code changes and trying to make sure my
tests can hit race condititions.

Thanks,
Daniel

On Tue, 2003-10-21 at 05:11, Suparna Bhattacharya wrote:
> On Mon, Oct 20, 2003 at 04:47:53PM -0700, Daniel McNeil wrote:
> > On Mon, 2003-10-20 at 07:27, Suparna Bhattacharya wrote:
> > > On Fri, Oct 17, 2003 at 04:12:58PM -0700, Daniel McNeil wrote:
> > > > 
> > > > I still had apply suparna's direct-io.c patch to prevent oopses.
> > > > Suparna, you said the patch was not the complete patch, do you have
> > > > the complete patch?
> > > 
> > > Not yet.
> > > A complete patch would need to address one more case that's rather
> > > tricky to solve -- the case where a single dio request covers an 
> > > allocated region followed by a hole.
> > > 
> > > Besides that there is a pending bug to address -- i.e to avoid
> > > dropping i_sem during the actual i/o in the case where we are
> > > extending the file (an intervening buffered write could extend
> > > i_size, exposing uninitialized blocks). For AIO-DIO this means 
> > > forcing the i/o to be synchronous for this case (as also for 
> > > the case where we are overwriting an allocated region followed 
> > > by a hole). Until we can use i/o barriers.
> > > 
> > > I was playing with a retry-based implementation for AIO-DIO,
> > > where the first (tricky) case above becomes simple to handle ...
> > > But didn't get a chance to work much on this during the last 
> > > few days. I actually do have a patch, but there are occasional 
> > > hangs with a lot of AIO-DIO writes to an ext3 filesystem in 
> > > particular, that I can't explain as yet.
> > > 
> 
> OK, here is the patch I was talking about, in case you get a 
> chance to experiment with it. (you'll need to reverse the 
> earlier incomplete fix for overwriting holes if you apply
> this).
> 
> It converts AIO-DIO to be based on the retry model, and addresses
> the following known remaining loopholes in the current DIO
> code:
> 
> 1) For DIO file extends, we can't drop i_sem during the actual
>    i/o  -- intermediate writes could extend i_size exposing
>    unwritten blocks
> 2) For AIO-DIO extends, we are updating i_size before i/o 
>    completes --- could expose unwritten blocks to intermediate
>    reads
> 3) For AIO-DIO writes to holes, finished_one_bio causes 
>    aio_complete() (and dio end_io) to be called before 
>    falling back to buffered i/o. 
> 4) For AIO-DIO writes to allocated blocks followed by a hole,
>    we've submitted BIOs for some of the i/o before we come
>    across the hole and then we fallback to buffered i/o
>    without waiting for that i/o to complete.
> 
> This patch avoids dropping i_sem in between during synchronous 
> writes altogether. I had initially skipped dropping i_sem in the
> case of file extends only, but observed some hangs with ext3
> ... couldn't be sure if it was really OK to drop i_sem before
> completing the ext3_stop_journal. In the AIO case i_sem is
> released during the i/o when the request overwrites existing
> blocks only.
> 
> The second thing it does is to wait for i/o to complete even
> for AIO, if any part of the request involves overwriting a hole 
> or extending the file (takes care of 1, 2 and 4). Using the
> retry model makes it possible to switch from async to sync
> wait if appropriate _after_ the i/o is issued.
> 
> 
> Regards
> Suparna
> 
> diffstat aio-dio-retry.patch
> 
>  fs/aio.c            |    9 ++
>  fs/direct-io.c      |  177 ++++++++++++++++++++++++++--------------------------
>  include/linux/aio.h |    2 
>  mm/filemap.c        |    4 -
>  4 files changed, 100 insertions(+), 92 deletions(-)
> 
> 
> diff -urp linux-2.6.0-test6-mm4/fs/aio.c 260t6mm4/fs/aio.c
> --- linux-2.6.0-test6-mm4/fs/aio.c	Tue Oct 21 19:20:50 2003
> +++ 260t6mm4/fs/aio.c	Tue Oct 21 19:09:26 2003
> @@ -408,6 +408,7 @@ static struct kiocb *__aio_get_req(struc
>  	req->ki_cancel = NULL;
>  	req->ki_retry = NULL;
>  	req->ki_user_obj = NULL;
> +	req->ki_private = NULL;
>  	INIT_LIST_HEAD(&req->ki_run_list);
>  
>  	/* Check if the completion queue has enough free space to
> @@ -700,8 +701,12 @@ static ssize_t aio_run_iocb(struct kiocb
>  	iocb->ki_retry = NULL;
>  	spin_unlock_irq(&ctx->ctx_lock);
>  
> -	/* Quit retrying if the i/o has been cancelled */
> -	if (kiocbIsCancelled(iocb)) {
> +	/* 
> +	 * Quit retrying if the i/o has been cancelled unless
> +	 * there is some private iocb state that is still being 
> +	 * referenced
> +	 */
> +	if (kiocbIsCancelled(iocb) && !iocb->ki_private) {
>  		ret = -EINTR;
>  		aio_complete(iocb, ret, 0);
>  		/* must not access the iocb after this */
> diff -urp linux-2.6.0-test6-mm4/fs/direct-io.c 260t6mm4/fs/direct-io.c
> --- linux-2.6.0-test6-mm4/fs/direct-io.c	Tue Oct 21 18:58:52 2003
> +++ 260t6mm4/fs/direct-io.c	Tue Oct 21 19:14:55 2003
> @@ -118,11 +118,10 @@ struct dio {
>  	atomic_t bios_in_flight;	/* nr bios in flight */
>  	spinlock_t bio_list_lock;	/* protects bio_list */
>  	struct bio *bio_list;		/* singly linked via bi_private */
> -	struct task_struct *waiter;	/* waiting task (NULL if none) */
> +	wait_queue_head_t wqh;		/* to wait on in-flight bios */
>  
>  	/* AIO related stuff */
>  	struct kiocb *iocb;		/* kiocb */
> -	int is_async;			/* is IO async ? */
>  	int result;			/* IO result */
>  };
>  
> @@ -218,33 +217,12 @@ static void dio_complete(struct dio *dio
>   * Called when a BIO has been processed.  If the count goes to zero then IO is
>   * complete and we can signal this to the AIO layer.
>   */
> -static void finished_one_bio(struct dio *dio)
> +static inline void finished_one_bio(struct dio *dio)
>  {
> -	if (atomic_dec_and_test(&dio->bio_count)) {
> -		if (dio->is_async) {
> -			dio_complete(dio, dio->block_in_file << dio->blkbits,
> -					dio->result);
> -			aio_complete(dio->iocb, dio->result, 0);
> -			kfree(dio);
> -		}
> -	}
> +	atomic_dec(&dio->bio_count);
>  }
>  
>  static int dio_bio_complete(struct dio *dio, struct bio *bio);
> -/*
> - * Asynchronous IO callback. 
> - */
> -static int dio_bio_end_aio(struct bio *bio, unsigned int bytes_done, int error)
> -{
> -	struct dio *dio = bio->bi_private;
> -
> -	if (bio->bi_size)
> -		return 1;
> -
> -	/* cleanup the bio */
> -	dio_bio_complete(dio, bio);
> -	return 0;
> -}
>  
>  /*
>   * The BIO completion handler simply queues the BIO up for the process-context
> @@ -265,8 +243,8 @@ static int dio_bio_end_io(struct bio *bi
>  	bio->bi_private = dio->bio_list;
>  	dio->bio_list = bio;
>  	atomic_dec(&dio->bios_in_flight);
> -	if (dio->waiter && atomic_read(&dio->bios_in_flight) == 0)
> -		wake_up_process(dio->waiter);
> +	if (atomic_read(&dio->bios_in_flight) == 0)
> +		wake_up(&dio->wqh);
>  	spin_unlock_irqrestore(&dio->bio_list_lock, flags);
>  	return 0;
>  }
> @@ -283,10 +261,7 @@ dio_bio_alloc(struct dio *dio, struct bl
>  
>  	bio->bi_bdev = bdev;
>  	bio->bi_sector = first_sector;
> -	if (dio->is_async)
> -		bio->bi_end_io = dio_bio_end_aio;
> -	else
> -		bio->bi_end_io = dio_bio_end_io;
> +	bio->bi_end_io = dio_bio_end_io;
>  
>  	dio->bio = bio;
>  	return 0;
> @@ -304,8 +279,6 @@ static void dio_bio_submit(struct dio *d
>  	bio->bi_private = dio;
>  	atomic_inc(&dio->bio_count);
>  	atomic_inc(&dio->bios_in_flight);
> -	if (dio->is_async && dio->rw == READ)
> -		bio_set_pages_dirty(bio);
>  	submit_bio(dio->rw, bio);
>  
>  	dio->bio = NULL;
> @@ -324,23 +297,29 @@ static void dio_cleanup(struct dio *dio)
>  /*
>   * Wait for the next BIO to complete.  Remove it and return it.
>   */
> -static struct bio *dio_await_one(struct dio *dio)
> +static struct bio *dio_await_one(struct dio *dio, wait_queue_t *wait)
>  {
>  	unsigned long flags;
>  	struct bio *bio;
>  
>  	spin_lock_irqsave(&dio->bio_list_lock, flags);
> -	while (dio->bio_list == NULL) {
> -		set_current_state(TASK_UNINTERRUPTIBLE);
> -		if (dio->bio_list == NULL) {
> -			dio->waiter = current;
> +	if (dio->bio_list == NULL) {
> +		DEFINE_WAIT(local_wait);
> +
> +		if (!wait)
> +			wait = &local_wait;
> +		do {	
> +			prepare_to_wait(&dio->wqh, wait, TASK_UNINTERRUPTIBLE);
> +			if (dio->bio_list != NULL)
> +				break;
>  			spin_unlock_irqrestore(&dio->bio_list_lock, flags);
>  			blk_run_queues();
> +			if (!is_sync_wait(wait))
> +				return ERR_PTR(-EIOCBRETRY);
>  			io_schedule();
>  			spin_lock_irqsave(&dio->bio_list_lock, flags);
> -			dio->waiter = NULL;
> -		}
> -		set_current_state(TASK_RUNNING);
> +		} while (dio->bio_list == NULL);
> +		finish_wait(&dio->wqh, wait);
>  	}
>  	bio = dio->bio_list;
>  	dio->bio_list = bio->bi_private;
> @@ -360,26 +339,25 @@ static int dio_bio_complete(struct dio *
>  	if (!uptodate)
>  		dio->result = -EIO;
>  
> -	if (dio->is_async && dio->rw == READ) {
> -		bio_check_pages_dirty(bio);	/* transfers ownership */
> -	} else {
> -		for (page_no = 0; page_no < bio->bi_vcnt; page_no++) {
> -			struct page *page = bvec[page_no].bv_page;
> +	for (page_no = 0; page_no < bio->bi_vcnt; page_no++) {
> +		struct page *page = bvec[page_no].bv_page;
>  
> -			if (dio->rw == READ)
> -				set_page_dirty_lock(page);
> -			page_cache_release(page);
> -		}
> -		bio_put(bio);
> +		if (dio->rw == READ)
> +			set_page_dirty_lock(page);
> +		page_cache_release(page);
>  	}
> +	bio_put(bio);
> +
>  	finished_one_bio(dio);
>  	return uptodate ? 0 : -EIO;
>  }
>  
>  /*
>   * Wait on and process all in-flight BIOs.
> + * Complete and free the dio once all are done.
>   */
> -static int dio_await_completion(struct dio *dio)
> +static int dio_await_completion(struct dio *dio, loff_t offset, 
> +	wait_queue_t *wait)
>  {
>  	int ret = 0;
>  
> @@ -387,13 +365,25 @@ static int dio_await_completion(struct d
>  		dio_bio_submit(dio);
>  
>  	while (atomic_read(&dio->bio_count)) {
> -		struct bio *bio = dio_await_one(dio);
> +		struct bio *bio = dio_await_one(dio, wait);
>  		int ret2;
>  
> +		if (IS_ERR(bio)) {
> +			ret = PTR_ERR(bio);
> +			return ret;
> +		}
>  		ret2 = dio_bio_complete(dio, bio);
>  		if (ret == 0)
>  			ret = ret2;
>  	}
> +	if (ret == 0)
> +		ret = dio->page_errors;
> +	if ((ret == 0) && dio->result)
> +		ret = dio->result;
> +	dio_complete(dio, offset, ret);
> +	dio->iocb->ki_private = NULL;
> +	kfree(dio);
> +
>  	return ret;
>  }
>  
> @@ -875,8 +865,12 @@ direct_io_worker(int rw, struct kiocb *i
>  	int ret = 0;
>  	int ret2;
>  	size_t bytes;
> +	loff_t end = offset;
> +	struct dio *old_dio = iocb->ki_private;
> +	wait_queue_t *io_wait = current->io_wait;
>  
> -	dio->is_async = !is_sync_kiocb(iocb);
> +	if (dio == old_dio) /* i/o has already been issued for this dio */
> +		goto do_wait;
>  
>  	dio->bio = NULL;
>  	dio->inode = inode;
> @@ -899,6 +893,7 @@ direct_io_worker(int rw, struct kiocb *i
>  	dio->page_errors = 0;
>  	dio->result = 0;
>  	dio->iocb = iocb;
> +	iocb->ki_private = dio;
>  
>  	/*
>  	 * BIO completion state.
> @@ -912,7 +907,7 @@ direct_io_worker(int rw, struct kiocb *i
>  	atomic_set(&dio->bios_in_flight, 0);
>  	spin_lock_init(&dio->bio_list_lock);
>  	dio->bio_list = NULL;
> -	dio->waiter = NULL;
> +	init_waitqueue_head(&dio->wqh);
>  
>  	dio->pages_in_io = 0;
>  	for (seg = 0; seg < nr_segs; seg++) 
> @@ -920,7 +915,7 @@ direct_io_worker(int rw, struct kiocb *i
>  
>  	for (seg = 0; seg < nr_segs; seg++) {
>  		user_addr = (unsigned long)iov[seg].iov_base;
> -		bytes = iov[seg].iov_len;
> +		end += bytes = iov[seg].iov_len;
>  
>  		/* Index into the first page of the first block */
>  		dio->first_block_in_page = (user_addr & ~PAGE_MASK) >> blkbits;
> @@ -968,41 +963,50 @@ direct_io_worker(int rw, struct kiocb *i
>  		dio_bio_submit(dio);
>  
>  	/*
> -	 * All new block allocations have been performed.  We can let i_sem
> -	 * go now.
> +	 * If this is a file extend we have to hold i_sem throughout 
> +	 * till the data hits the disk and i_size is updated, to avoid 
> +	 * exposing uninitialized blocks. 
> +	 *
> +	 * For AIO file extends we just take the easy way out and wait 
> +	 * synchronously in this situation; typically databases
> +	 * would have pre-allocated the file before issuing AIO.
> +	 * We may have to change this if there are real world 
> +	 * users of file extending AIO who are affected.
> +	 *
> +	 * Secondly if we ran into a hole after submitting i/o for 
> +	 * some of the blocks, we wait for issued i/o to complete 
> +	 * before falling back to buffered i/o.
>  	 */
> -	if (dio->needs_locking)
> +	if ((end > i_size_read(dio->inode)) || ret) {
> +		/* switch to synchronous wait even if we are in AIO */
> +		io_wait = NULL;
> +	}
> +	/*
> +	 * All block lookups have been performed.  We can let i_sem
> +	 * go now if this is a read request.
> +	 */
> +	if ((rw == READ) && dio->needs_locking)
>  		up(&dio->inode->i_sem);
>  
>  	/*
>  	 * OK, all BIOs are submitted, so we can decrement bio_count to truly
>  	 * reflect the number of to-be-processed BIOs.
>  	 */
> -	if (dio->is_async) {
> -		if (ret == 0)
> -			ret = dio->result;	/* Bytes written */
> -		finished_one_bio(dio);		/* This can free the dio */
> -		blk_run_queues();
> -	} else {
> -		finished_one_bio(dio);
> -		ret2 = dio_await_completion(dio);
> -		if (ret == 0)
> -			ret = ret2;
> -		if (ret == 0)
> -			ret = dio->page_errors;
> -		if (ret == 0 && dio->result) {
> -			loff_t i_size = i_size_read(inode);
> +	finished_one_bio(dio);
>  
> -			ret = dio->result;
> -			/*
> -			 * Adjust the return value if the read crossed a
> -			 * non-block-aligned EOF.
> -			 */
> -			if (rw == READ && (offset + ret > i_size))
> -				ret = i_size - offset;
> -		}
> -		dio_complete(dio, offset, ret);
> -		kfree(dio);
> +do_wait:
> +	ret2 = dio_await_completion(dio, offset, io_wait);
> +	if (ret == 0)
> +		ret = ret2;
> +	if (ret == 0) {
> +		loff_t i_size = i_size_read(inode);
> +
> +		/*
> +		 * Adjust the return value if the read crossed a
> +		 * non-block-aligned EOF.
> +		 */
> +		if (rw == READ && (offset + ret > i_size))
> +			ret = i_size - offset;
>  	}
>  	return ret;
>  }
> @@ -1028,7 +1032,7 @@
>  	unsigned bdev_blkbits = 0;
>  	unsigned blocksize_mask = (1 << blkbits) - 1;
>  	ssize_t retval = -EINVAL;
> -	struct dio *dio;
> +	struct dio *dio = iocb->ki_private;
>  	int needs_locking;
>  
>  	if (bdev)
> @@ -1055,6 +1061,8 @@ __blockdev_direct_IO(int rw, struct kioc
>  		}
>  	}
>  
> +	if (dio)
> +		goto have_dio;	
>  	dio = kmalloc(sizeof(*dio), GFP_KERNEL);
>  	retval = -ENOMEM;
>  	if (!dio)
> @@ -1081,10 +1089,9 @@ __blockdev_direct_IO(int rw, struct kioc
>  	}
>  	dio->needs_locking = needs_locking;
>  
> +have_dio:
>  	retval = direct_io_worker(rw, iocb, inode, iov, offset,
>  				nr_segs, blkbits, get_blocks, end_io, dio);
> -	if (needs_locking && rw == WRITE)
> -		down(&inode->i_sem);
>  out:
>  	return retval;
>  }
> diff -urp linux-2.6.0-test6-mm4/include/linux/aio.h 260t6mm4/include/linux/aio.h
> --- linux-2.6.0-test6-mm4/include/linux/aio.h	Tue Oct 21 18:58:55 2003
> +++ 260t6mm4/include/linux/aio.h	Tue Oct 14 21:36:43 2003
> @@ -74,6 +74,7 @@ struct kiocb {
>  	char 			*ki_buf;	/* remaining iocb->aio_buf */
>  	size_t			ki_left; 	/* remaining bytes */
>  	wait_queue_t		ki_wait;
> +	void			*ki_private;
>  	long			ki_retried; 	/* just for testing */
>  	long			ki_kicked; 	/* just for testing */
>  	long			ki_queued; 	/* just for testing */
> @@ -94,6 +95,7 @@ struct kiocb {
>  		(x)->ki_user_obj = tsk;			\
>  		(x)->ki_user_data = 0;			\
>  		init_wait((&(x)->ki_wait));		\
> +		(x)->ki_private = NULL;			\
>  	} while (0)
>  
>  #define AIO_RING_MAGIC			0xa10a10a1
> diff -urp linux-2.6.0-test6-mm4/mm/filemap.c 260t6mm4/mm/filemap.c
> --- linux-2.6.0-test6-mm4/mm/filemap.c	Tue Oct 21 18:58:56 2003
> +++ 260t6mm4/mm/filemap.c	Tue Oct 21 19:16:20 2003
> @@ -926,8 +926,6 @@ __generic_file_aio_read(struct kiocb *io
>  		if (pos < size) {
>  			retval = generic_file_direct_IO(READ, iocb,
>  						iov, pos, nr_segs);
> -			if (retval >= 0 && !is_sync_kiocb(iocb))
> -				retval = -EIOCBQUEUED;
>  			if (retval > 0)
>  				*ppos = pos + retval;
>  		}
> @@ -1892,8 +1890,6 @@ __generic_file_aio_write_nolock(struct k
>  		 */
>  		if (written >= 0 && file->f_flags & O_SYNC)
>  			status = generic_osync_inode(inode, mapping, OSYNC_METADATA);
> -		if (written >= 0 && !is_sync_kiocb(iocb))
> -			written = -EIOCBQUEUED;
>  		if (written != -ENOTBLK)
>  			goto out_status;
>  		/*


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Patch for Retry based AIO-DIO (Was AIO and DIO testing on 2.6.0-test7-mm1)
  2003-10-23  0:40       ` Daniel McNeil
@ 2003-10-23 10:49         ` Suparna Bhattacharya
  2003-10-23 13:50           ` Suparna Bhattacharya
  0 siblings, 1 reply; 19+ messages in thread
From: Suparna Bhattacharya @ 2003-10-23 10:49 UTC (permalink / raw)
  To: Daniel McNeil
  Cc: Andrew Morton, linux-aio, Linux Kernel Mailing List, Badari Pulavarty

On Wed, Oct 22, 2003 at 05:40:32PM -0700, Daniel McNeil wrote:
> Suparna and Andrew,
> 
> I've been doing more testing using the test programs I wrote to 
> try and hit the AIO verses buffered read race conditions.
> 
> I tested 2.6.0-test8, 2.6.0-test8-mm1+(your first incomplete fix) and
> 2.6.0-test8-mm1+aio-dio-retry patch.  I used my test programs
> (http://developer.osdl.org/daniel/AIO/TESTS/) by doing:
> 
> Run "dirty" program which allocates and writes 0xaa to a file and then
> 	frees the space.
> Run "dio_sparse" or "aiodio-sparse - which creates "file", truncates it
> 	up to 64MB and then writes zeros into the holes (using DIO or
> 	AIO+DIO).  At same time, a forked child is reading the file
> 	looking for non-zero data.
> rm "file"
> 
> On 2.6.0-test8
> ==============
> I hit the race condition and see uninitialized data:
> ~/AIO/TESTS/dio_sparse
> non zero buffer at buf[4] => 0xaaaaaaaa,aaaaaaaa,aaaaaaaa,aaaaaaaa
> non-zero read at offset 24182785
> 
>  ~/AIO/TESTS/aiodio_sparse
> non zero buffer at buf[4] -> 0xaaaaaaaa,aaaaaaaa,aaaaaaaa,aaaaaaaa
> non-zero read at offset 8323062
> 
> 
> On 2.6.0-test8-mm1+1st-direct-io-aio_complete patch and
>    2.6.0-test8-mm1+aio-dio-retry patch
> 
> I never see uninitialized data.

That's good news.

You seem to be able to run test8-mm1 just fine; I have been
running into strange oops on syscall return for io_getevents :(
- haven't seen this before.
What library and header files are you using for libaio ? Do you have
4G-4G turned on in your build ?

> 
> Reading over your description and looking over the code, I'm pretty
> sure I see the races you are talking about but had some questions:
> 
> 1) DIO file extends.  So are you saying that in direct_io_worker()
>    we were dropping i_sem and then waiting for the i/o to complete,
>    but since i_sem was dropped there could be another write exposing
>    the uninitialized data still in process of being written?
>    
>    If yes, then wouldn't the intermediate write be responsible for
>    zeroing all data between the last isize and the new isize?

I think it would have to initialize all the data involved in the 
write but could just leave a hole from the last isize to the 
start of this write. It doesn't expect uninstantiated blocks to 
exist in that hole, so shouldn't have to worry about zeroing 
them.

> 
> 2) aio-dio extends
> 
>    Are you saying that in __generic_file_aio_write_nolock()
>    that generic_file_direct_IO() returns the number of bytes
>    that are being AIO written but not complete and we were updating
>    i_size write after that?  (just trying to understand the code).

Yes. And that leaving uninitialised data exposed.

Regards
Suparna

-
Suparna Bhattacharya (suparna@in.ibm.com)
Linux Technology Center
IBM Software Labs, India


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Patch for Retry based AIO-DIO (Was AIO and DIO testing on 2.6.0-test7-mm1)
  2003-10-23 10:49         ` Suparna Bhattacharya
@ 2003-10-23 13:50           ` Suparna Bhattacharya
  2003-10-23 22:59             ` Andrew Morton
  0 siblings, 1 reply; 19+ messages in thread
From: Suparna Bhattacharya @ 2003-10-23 13:50 UTC (permalink / raw)
  To: Daniel McNeil
  Cc: Andrew Morton, linux-aio, Linux Kernel Mailing List, Badari Pulavarty

On Thu, Oct 23, 2003 at 04:19:23PM +0530, Suparna Bhattacharya wrote:
> On Wed, Oct 22, 2003 at 05:40:32PM -0700, Daniel McNeil wrote:
> > Suparna and Andrew,
> > 
> > I've been doing more testing using the test programs I wrote to 
> > try and hit the AIO verses buffered read race conditions.
> > 
> > I tested 2.6.0-test8, 2.6.0-test8-mm1+(your first incomplete fix) and
> > 2.6.0-test8-mm1+aio-dio-retry patch.  I used my test programs
> > (http://developer.osdl.org/daniel/AIO/TESTS/) by doing:
> > 
> > Run "dirty" program which allocates and writes 0xaa to a file and then
> > 	frees the space.
> > Run "dio_sparse" or "aiodio-sparse - which creates "file", truncates it
> > 	up to 64MB and then writes zeros into the holes (using DIO or
> > 	AIO+DIO).  At same time, a forked child is reading the file
> > 	looking for non-zero data.
> > rm "file"
> > 
> > On 2.6.0-test8
> > ==============
> > I hit the race condition and see uninitialized data:
> > ~/AIO/TESTS/dio_sparse
> > non zero buffer at buf[4] => 0xaaaaaaaa,aaaaaaaa,aaaaaaaa,aaaaaaaa
> > non-zero read at offset 24182785
> > 
> >  ~/AIO/TESTS/aiodio_sparse
> > non zero buffer at buf[4] -> 0xaaaaaaaa,aaaaaaaa,aaaaaaaa,aaaaaaaa
> > non-zero read at offset 8323062
> > 
> > 
> > On 2.6.0-test8-mm1+1st-direct-io-aio_complete patch and
> >    2.6.0-test8-mm1+aio-dio-retry patch
> > 
> > I never see uninitialized data.
> 
> That's good news.
> 
> You seem to be able to run test8-mm1 just fine; I have been
> running into strange oops on syscall return for io_getevents :(
> - haven't seen this before.
> What library and header files are you using for libaio ? Do you have
> 4G-4G turned on in your build ?

It turns out that backing out gcc-Os.patch (on RH 9) or switching 
to a system with an older compiler version made those errors go away.

Regards
Suparna

-- 
Suparna Bhattacharya (suparna@in.ibm.com)
Linux Technology Center
IBM Software Labs, India


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Patch for Retry based AIO-DIO (Was AIO and DIO testing on 2.6.0-test7-mm1)
  2003-10-23 13:50           ` Suparna Bhattacharya
@ 2003-10-23 22:59             ` Andrew Morton
  2003-10-23 23:20               ` Patch for Retry based AIO-DIO (Was AIO and DIO testing on2.6.0-test7-mm1) Adrian Bunk
  2003-10-23 23:22               ` Patch for Retry based AIO-DIO (Was AIO and DIO testing on 2.6.0-test7-mm1) Dan Kegel
  0 siblings, 2 replies; 19+ messages in thread
From: Andrew Morton @ 2003-10-23 22:59 UTC (permalink / raw)
  To: suparna; +Cc: daniel, linux-aio, linux-kernel, pbadari, Adrian Bunk

Suparna Bhattacharya <suparna@in.ibm.com> wrote:
>
> It turns out that backing out gcc-Os.patch (on RH 9) or switching 
> to a system with an older compiler version made those errors go away.

Ho hum, so we have our answer.

Adrian, how do you feel about slotting this under CONFIG_EMBEDDED?


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Patch for Retry based AIO-DIO (Was AIO and DIO testing on2.6.0-test7-mm1)
  2003-10-23 22:59             ` Andrew Morton
@ 2003-10-23 23:20               ` Adrian Bunk
  2003-10-23 23:25                 ` Andrew Morton
  2003-10-23 23:22               ` Patch for Retry based AIO-DIO (Was AIO and DIO testing on 2.6.0-test7-mm1) Dan Kegel
  1 sibling, 1 reply; 19+ messages in thread
From: Adrian Bunk @ 2003-10-23 23:20 UTC (permalink / raw)
  To: Andrew Morton; +Cc: suparna, daniel, linux-aio, linux-kernel, pbadari

On Thu, Oct 23, 2003 at 03:59:37PM -0700, Andrew Morton wrote:
> Suparna Bhattacharya <suparna@in.ibm.com> wrote:
> >
> > It turns out that backing out gcc-Os.patch (on RH 9) or switching 
> > to a system with an older compiler version made those errors go away.
> 
> Ho hum, so we have our answer.
> 
> Adrian, how do you feel about slotting this under CONFIG_EMBEDDED?

That was in the first version of the patch, but Christoph Hellwig asked 
to drop the EMBEDDED.

The version I sent to you contained a dependency on EXPERIMENTAL, to 
indicate that -Os is less tested, and the help text says
  If unsure, say N.

IMHO this should be enough, so that only people who know what they are 
doing use this option.

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Patch for Retry based AIO-DIO (Was AIO and DIO testing on 2.6.0-test7-mm1)
  2003-10-23 22:59             ` Andrew Morton
  2003-10-23 23:20               ` Patch for Retry based AIO-DIO (Was AIO and DIO testing on2.6.0-test7-mm1) Adrian Bunk
@ 2003-10-23 23:22               ` Dan Kegel
  1 sibling, 0 replies; 19+ messages in thread
From: Dan Kegel @ 2003-10-23 23:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: suparna, daniel, linux-aio, linux-kernel, pbadari, Adrian Bunk

Andrew Morton wrote:
> Suparna Bhattacharya <suparna@in.ibm.com> wrote:
> 
>>It turns out that backing out gcc-Os.patch (on RH 9) or switching 
>>to a system with an older compiler version made those errors go away.
> 
> 
> Ho hum, so we have our answer.
> 
> Adrian, how do you feel about slotting this under CONFIG_EMBEDDED?

I dunno about Adrian, but I'd rest easier either way
if we knew which optimization in -Os causes the problem,
and whether this is a compiler bug.

http://gcc.gnu.org/onlinedocs/gcc-3.3.1/gcc/Optimize-Options.html#Optimize%20Options
has a nice list of the optimizations performed by -O2, but
the list for -Os is a bit fuzzy, darn it.
- Dan

p.s. For the curious, the url of that patch seems to be
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.0-test8/2.6.0-test8-mm1/broken-out/gcc-Os.patch

-- 
Dan Kegel
http://www.kegel.com
http://counter.li.org/cgi-bin/runscript/display-person.cgi?user=78045


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Patch for Retry based AIO-DIO (Was AIO and DIO testing on2.6.0-test7-mm1)
  2003-10-23 23:20               ` Patch for Retry based AIO-DIO (Was AIO and DIO testing on2.6.0-test7-mm1) Adrian Bunk
@ 2003-10-23 23:25                 ` Andrew Morton
  2003-10-23 23:37                   ` Patch for Retry based AIO-DIO (Was AIO and DIO testingon2.6.0-test7-mm1) Adrian Bunk
  0 siblings, 1 reply; 19+ messages in thread
From: Andrew Morton @ 2003-10-23 23:25 UTC (permalink / raw)
  To: Adrian Bunk; +Cc: suparna, daniel, linux-aio, linux-kernel, pbadari

Adrian Bunk <bunk@fs.tum.de> wrote:
>
> On Thu, Oct 23, 2003 at 03:59:37PM -0700, Andrew Morton wrote:
> > Suparna Bhattacharya <suparna@in.ibm.com> wrote:
> > >
> > > It turns out that backing out gcc-Os.patch (on RH 9) or switching 
> > > to a system with an older compiler version made those errors go away.
> > 
> > Ho hum, so we have our answer.
> > 
> > Adrian, how do you feel about slotting this under CONFIG_EMBEDDED?
> 
> That was in the first version of the patch, but Christoph Hellwig asked 
> to drop the EMBEDDED.

That was before we knew that it craps out when compiled on RH9.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Patch for Retry based AIO-DIO (Was AIO and DIO testingon2.6.0-test7-mm1)
  2003-10-23 23:25                 ` Andrew Morton
@ 2003-10-23 23:37                   ` Adrian Bunk
  2003-10-23 23:46                     ` Andrew Morton
  0 siblings, 1 reply; 19+ messages in thread
From: Adrian Bunk @ 2003-10-23 23:37 UTC (permalink / raw)
  To: Andrew Morton; +Cc: suparna, daniel, linux-aio, linux-kernel, pbadari

On Thu, Oct 23, 2003 at 04:25:39PM -0700, Andrew Morton wrote:
> Adrian Bunk <bunk@fs.tum.de> wrote:
> >
> > On Thu, Oct 23, 2003 at 03:59:37PM -0700, Andrew Morton wrote:
> > > Suparna Bhattacharya <suparna@in.ibm.com> wrote:
> > > >
> > > > It turns out that backing out gcc-Os.patch (on RH 9) or switching 
> > > > to a system with an older compiler version made those errors go away.
> > > 
> > > Ho hum, so we have our answer.
> > > 
> > > Adrian, how do you feel about slotting this under CONFIG_EMBEDDED?
> > 
> > That was in the first version of the patch, but Christoph Hellwig asked 
> > to drop the EMBEDDED.
> 
> That was before we knew that it craps out when compiled on RH9.

And one of the arguments I had for making it dependent on EXPERIMENTAL 
instead of enabling it unconditionally was:

- it's less tested (there might be miscompilations in some part of the
  kernel with some supported compilers)

-Os has it's benefits on some systems, so it shouldn't be totally hidden 
under EMBEDDED. OTOH, it's less tested, so only people who know what 
they are doing should use it (EXPERIMENTAL plus help text).

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Patch for Retry based AIO-DIO (Was AIO and DIO testingon2.6.0-test7-mm1)
  2003-10-23 23:37                   ` Patch for Retry based AIO-DIO (Was AIO and DIO testingon2.6.0-test7-mm1) Adrian Bunk
@ 2003-10-23 23:46                     ` Andrew Morton
  2003-10-24  0:34                       ` Patch for Retry based AIO-DIO (Was AIO and DIOtestingon2.6.0-test7-mm1) Adrian Bunk
  2003-10-26 11:57                       ` Adrian Bunk
  0 siblings, 2 replies; 19+ messages in thread
From: Andrew Morton @ 2003-10-23 23:46 UTC (permalink / raw)
  To: Adrian Bunk; +Cc: suparna, daniel, linux-aio, linux-kernel, pbadari

Adrian Bunk <bunk@fs.tum.de> wrote:
>
> > That was before we knew that it craps out when compiled on RH9.
> 
> And one of the arguments I had for making it dependent on EXPERIMENTAL 
> instead of enabling it unconditionally was:
> 
> - it's less tested (there might be miscompilations in some part of the
>   kernel with some supported compilers)

That's why I enabled it unconditionally.

> -Os has it's benefits on some systems, so it shouldn't be totally hidden 
> under EMBEDDED. OTOH, it's less tested, so only people who know what 
> they are doing should use it (EXPERIMENTAL plus help text).

It causes kernels to crash on a major linux distribution.  We need to
either make it very hard to turn on, or just forget it altogether.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Patch for Retry based AIO-DIO (Was AIO and DIOtestingon2.6.0-test7-mm1)
  2003-10-23 23:46                     ` Andrew Morton
@ 2003-10-24  0:34                       ` Adrian Bunk
  2003-10-26 11:57                       ` Adrian Bunk
  1 sibling, 0 replies; 19+ messages in thread
From: Adrian Bunk @ 2003-10-24  0:34 UTC (permalink / raw)
  To: Andrew Morton; +Cc: suparna, daniel, linux-aio, linux-kernel, pbadari

On Thu, Oct 23, 2003 at 04:46:38PM -0700, Andrew Morton wrote:
>...
> > -Os has it's benefits on some systems, so it shouldn't be totally hidden 
> > under EMBEDDED. OTOH, it's less tested, so only people who know what 
> > they are doing should use it (EXPERIMENTAL plus help text).
> 
> It causes kernels to crash on a major linux distribution.  We need to
> either make it very hard to turn on, or just forget it altogether.

Looking at the source RPM, it seems e.g. gcc bug #8746 "gcc miscompiles
Linux kernel ppa driver on x86" (fixed in gcc 3.2.3) is also unfixed in
the gcc in RedHat 9.0 ...

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Patch for Retry based AIO-DIO (Was AIO and DIOtestingon2.6.0-test7-mm1)
  2003-10-23 23:46                     ` Andrew Morton
  2003-10-24  0:34                       ` Patch for Retry based AIO-DIO (Was AIO and DIOtestingon2.6.0-test7-mm1) Adrian Bunk
@ 2003-10-26 11:57                       ` Adrian Bunk
  2003-10-26 12:08                         ` Russell King
  1 sibling, 1 reply; 19+ messages in thread
From: Adrian Bunk @ 2003-10-26 11:57 UTC (permalink / raw)
  To: Andrew Morton; +Cc: suparna, daniel, linux-aio, linux-kernel, pbadari

On Thu, Oct 23, 2003 at 04:46:38PM -0700, Andrew Morton wrote:
>...
> > -Os has it's benefits on some systems, so it shouldn't be totally hidden 
> > under EMBEDDED. OTOH, it's less tested, so only people who know what 
> > they are doing should use it (EXPERIMENTAL plus help text).
> 
> It causes kernels to crash on a major linux distribution.  We need to
> either make it very hard to turn on, or just forget it altogether.

The -Os patch with a dependency on EMBEDDED is below.

diffstat output:

 arch/arm/Makefile   |    2 --
 arch/h8300/Kconfig  |    4 ++++
 arch/h8300/Makefile |    2 +-
 Makefile            |    8 +++++++-
 init/Kconfig        |   10 ++++++++++
 5 files changed, 22 insertions(+), 4 deletions(-)


cu
Adrian


--- linux-2.6.0-test9/init/Kconfig.old	2003-10-26 12:48:21.000000000 +0100
+++ linux-2.6.0-test9/init/Kconfig	2003-10-26 12:49:56.000000000 +0100
@@ -196,6 +196,16 @@
 
 source "drivers/block/Kconfig.iosched"
 
+config CC_OPTIMIZE_FOR_SIZE
+	bool "Optimize for size" if EMBEDDED
+	default y if ARM || H8300
+	default n
+	help
+	  Enabling this option will pass "-Os" instead of "-O2" to gcc
+	  resulting in a smaller kernel.
+
+	  If unsure, say N.
+
 endmenu		# General setup
 
 
--- linux-2.6.0-test9/Makefile.old	2003-10-26 12:48:15.000000000 +0100
+++ linux-2.6.0-test9/Makefile	2003-10-26 12:48:32.000000000 +0100
@@ -275,7 +275,7 @@
 CPPFLAGS        := -D__KERNEL__ -Iinclude \
 		   $(if $(KBUILD_SRC),-Iinclude2 -I$(srctree)/include)
 
-CFLAGS 		:= -Wall -Wstrict-prototypes -Wno-trigraphs -O2 \
+CFLAGS 		:= -Wall -Wstrict-prototypes -Wno-trigraphs \
 	  	   -fno-strict-aliasing -fno-common
 AFLAGS		:= -D__ASSEMBLY__
 
@@ -431,6 +431,12 @@
 # ---------------------------------------------------------------------------
 
 
+ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
+CFLAGS		+= -Os
+else
+CFLAGS		+= -O2
+endif
+
 ifndef CONFIG_FRAME_POINTER
 CFLAGS		+= -fomit-frame-pointer
 endif

--- linux-2.6.0-test5-mm4/arch/arm/Makefile.old	2003-09-25 14:38:18.000000000 +0200
+++ linux-2.6.0-test5-mm4/arch/arm/Makefile	2003-09-25 14:40:47.000000000 +0200
@@ -14,8 +14,6 @@
 GZFLAGS		:=-9
 #CFLAGS		+=-pipe
 
-CFLAGS		:=$(CFLAGS:-O2=-Os)
-
 ifeq ($(CONFIG_FRAME_POINTER),y)
 CFLAGS		+=-fno-omit-frame-pointer -mapcs -mno-sched-prolog
 endif
--- linux-2.6.0-test5-mm4/arch/h8300/Kconfig.old	2003-09-25 14:43:27.000000000 +0200
+++ linux-2.6.0-test5-mm4/arch/h8300/Kconfig	2003-09-25 14:43:44.000000000 +0200
@@ -5,6 +5,10 @@
 
 mainmenu "uClinux/h8300 (w/o MMU) Kernel Configuration"
 
+config H8300
+	bool
+	default y
+
 config MMU
 	bool
 	default n
--- linux-2.6.0-test5-mm4/arch/h8300/Makefile.old	2003-09-25 14:38:18.000000000 +0200
+++ linux-2.6.0-test5-mm4/arch/h8300/Makefile	2003-09-25 14:38:24.000000000 +0200
@@ -34,7 +34,7 @@
 ldflags-$(CONFIG_CPU_H8S)	:= -mh8300self
 
 CFLAGS += $(cflags-y)
-CFLAGS += -mint32 -fno-builtin -Os
+CFLAGS += -mint32 -fno-builtin
 CFLAGS += -g
 CFLAGS += -D__linux__
 CFLAGS += -DUTS_SYSNAME=\"uClinux\"


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Patch for Retry based AIO-DIO (Was AIO and DIOtestingon2.6.0-test7-mm1)
  2003-10-26 11:57                       ` Adrian Bunk
@ 2003-10-26 12:08                         ` Russell King
  2003-10-26 12:17                           ` Adrian Bunk
  0 siblings, 1 reply; 19+ messages in thread
From: Russell King @ 2003-10-26 12:08 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Andrew Morton, suparna, daniel, linux-aio, linux-kernel, pbadari

On Sun, Oct 26, 2003 at 12:57:19PM +0100, Adrian Bunk wrote:
> On Thu, Oct 23, 2003 at 04:46:38PM -0700, Andrew Morton wrote:
> >...
> > > -Os has it's benefits on some systems, so it shouldn't be totally hidden 
> > > under EMBEDDED. OTOH, it's less tested, so only people who know what 
> > > they are doing should use it (EXPERIMENTAL plus help text).
> > 
> > It causes kernels to crash on a major linux distribution.  We need to
> > either make it very hard to turn on, or just forget it altogether.
> 
> The -Os patch with a dependency on EMBEDDED is below.
> 
> diffstat output:
> 
>  arch/arm/Makefile   |    2 --
>  arch/h8300/Kconfig  |    4 ++++
>  arch/h8300/Makefile |    2 +-
>  Makefile            |    8 +++++++-
>  init/Kconfig        |   10 ++++++++++
>  5 files changed, 22 insertions(+), 4 deletions(-)

So now ARM goes back to not building with -Os.

Maybe the right answer is to completely forget about making this a
configuration-time option for 2.6?

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Patch for Retry based AIO-DIO (Was AIO and DIOtestingon2.6.0-test7-mm1)
  2003-10-26 12:08                         ` Russell King
@ 2003-10-26 12:17                           ` Adrian Bunk
  2003-10-26 13:00                             ` Russell King
  0 siblings, 1 reply; 19+ messages in thread
From: Adrian Bunk @ 2003-10-26 12:17 UTC (permalink / raw)
  To: Andrew Morton, suparna, daniel, linux-aio, linux-kernel, pbadari

On Sun, Oct 26, 2003 at 12:08:08PM +0000, Russell King wrote:
> On Sun, Oct 26, 2003 at 12:57:19PM +0100, Adrian Bunk wrote:
> > On Thu, Oct 23, 2003 at 04:46:38PM -0700, Andrew Morton wrote:
> > >...
> > > > -Os has it's benefits on some systems, so it shouldn't be totally hidden 
> > > > under EMBEDDED. OTOH, it's less tested, so only people who know what 
> > > > they are doing should use it (EXPERIMENTAL plus help text).
> > > 
> > > It causes kernels to crash on a major linux distribution.  We need to
> > > either make it very hard to turn on, or just forget it altogether.
> > 
> > The -Os patch with a dependency on EMBEDDED is below.
> > 
> > diffstat output:
> > 
> >  arch/arm/Makefile   |    2 --
> >  arch/h8300/Kconfig  |    4 ++++
> >  arch/h8300/Makefile |    2 +-
> >  Makefile            |    8 +++++++-
> >  init/Kconfig        |   10 ++++++++++
> >  5 files changed, 22 insertions(+), 4 deletions(-)
> 
> So now ARM goes back to not building with -Os.
>...

With this patch ARM should default to -Os.

Could you recheck whether it really doesn't work?

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Patch for Retry based AIO-DIO (Was AIO and DIOtestingon2.6.0-test7-mm1)
  2003-10-26 12:17                           ` Adrian Bunk
@ 2003-10-26 13:00                             ` Russell King
  0 siblings, 0 replies; 19+ messages in thread
From: Russell King @ 2003-10-26 13:00 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Andrew Morton, suparna, daniel, linux-aio, linux-kernel, pbadari

On Sun, Oct 26, 2003 at 01:17:42PM +0100, Adrian Bunk wrote:
> With this patch ARM should default to -Os.
> 
> Could you recheck whether it really doesn't work?

Ok, I agree.  Didn't read the patch correctly.

(Still building test9 here so actually checking is completely out of
the question.)

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2003-10-26 13:01 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-10-17 23:12 AIO and DIO testing on 2.6.0-test7-mm1 Daniel McNeil
2003-10-18  0:24 ` Andrew Morton
2003-10-20 14:27 ` Suparna Bhattacharya
2003-10-20 23:47   ` Daniel McNeil
2003-10-21 12:11     ` Patch for Retry based AIO-DIO (Was AIO and DIO testing on 2.6.0-test7-mm1) Suparna Bhattacharya
2003-10-23  0:40       ` Daniel McNeil
2003-10-23 10:49         ` Suparna Bhattacharya
2003-10-23 13:50           ` Suparna Bhattacharya
2003-10-23 22:59             ` Andrew Morton
2003-10-23 23:20               ` Patch for Retry based AIO-DIO (Was AIO and DIO testing on2.6.0-test7-mm1) Adrian Bunk
2003-10-23 23:25                 ` Andrew Morton
2003-10-23 23:37                   ` Patch for Retry based AIO-DIO (Was AIO and DIO testingon2.6.0-test7-mm1) Adrian Bunk
2003-10-23 23:46                     ` Andrew Morton
2003-10-24  0:34                       ` Patch for Retry based AIO-DIO (Was AIO and DIOtestingon2.6.0-test7-mm1) Adrian Bunk
2003-10-26 11:57                       ` Adrian Bunk
2003-10-26 12:08                         ` Russell King
2003-10-26 12:17                           ` Adrian Bunk
2003-10-26 13:00                             ` Russell King
2003-10-23 23:22               ` Patch for Retry based AIO-DIO (Was AIO and DIO testing on 2.6.0-test7-mm1) Dan Kegel

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.