Suparna,

Yes, your patch did help. I originally had CONFIG_DEBUG_SLAB=y, which
was helping me see problems because the freed dio was getting
poisoned. I also tested with CONFIG_DEBUG_PAGEALLOC=y, which is very
good at catching these.

I updated your AIO fallback patch and your AIO race patch, and added
my fix for the bio_count decrement. This patch has all three fixes
and it is working for me.

I fixed the bio_count race by changing bio_list_lock into bio_lock
and using that for all the bio fields. I changed bio_count and
bios_in_flight from atomics into ints. They are now protected by the
bio_lock. In finished_one_bio(), I fixed the race by leaving the
bio_count at 1 until after the dio_complete() and then doing the
bio_count decrement and wakeup while holding the bio_lock.

Take a look, give it a try, and let me know what you think. I've
tested this on my 2-way and so far all my tests have passed. I have
more testing to do, but this is working better.

Thanks,

Daniel

On Mon, 2003-11-24 at 01:42, Suparna Bhattacharya wrote:
> On Tue, Nov 18, 2003 at 03:47:53PM -0800, Daniel McNeil wrote:
> > Suparna,
> >
> > I was unable to reproduce the hang in io_submit() without your patch.
> > I ran aiocp with 1k i/o size constantly for 2 hours and it never hung.
> >
> > I re-ran with your patch with both as-iosched and deadline and both
> > hung in io_submit(). aiocp would run a few times, but I put the
> > aiocp in a while loop and it hung on the 1st or 2nd time. It
> > did get most of the way through copying the file before hanging.
> > This is on a 2-proc to ide disks running ext3.
> >
>
> Found one race ... not sure if its the one causing the hangs
> you see. The attached patch is not a complete fix (there is one
> other race to close), but it would be interesting to see if
> this makes any difference for you.
>
> Regards
> Suparna
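
For reference, the bio_count ordering described at the top of this message can be modelled in user space roughly as follows. This is only a sketch and not the patch itself: the names struct dio_model, dio_model_init(), dio_complete_model(), finished_one_bio_model() and dio_wait_model() are made up for illustration, a pthread mutex stands in for the bio_lock spinlock, and a condition variable stands in for waking the waiting task. The fallback and aio completion paths are left out.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the dio fields discussed above. */
struct dio_model {
    pthread_mutex_t bio_lock;  /* models bio_lock; protects bio_count */
    pthread_cond_t  wait;      /* models waking the waiting task */
    int  bio_count;            /* plain int, only touched under bio_lock */
    bool completed;            /* set by dio_complete_model() */
};

static void dio_model_init(struct dio_model *dio, int nr_refs)
{
    pthread_mutex_init(&dio->bio_lock, NULL);
    pthread_cond_init(&dio->wait, NULL);
    dio->bio_count = nr_refs;
    dio->completed = false;
}

/* Stand-in for dio_complete(); the real function does much more. */
static void dio_complete_model(struct dio_model *dio)
{
    dio->completed = true;
}

/* Called once per finished bio. */
static void finished_one_bio_model(struct dio_model *dio)
{
    pthread_mutex_lock(&dio->bio_lock);
    if (dio->bio_count == 1) {
        /*
         * Last reference. Leave bio_count at 1 while completing so a
         * waiter cannot see 0 and free the dio underneath us.
         */
        pthread_mutex_unlock(&dio->bio_lock);
        dio_complete_model(dio);

        /* Decrement and wake up together, holding the lock. */
        pthread_mutex_lock(&dio->bio_lock);
        dio->bio_count--;
        pthread_cond_broadcast(&dio->wait);
        pthread_mutex_unlock(&dio->bio_lock);
        return;
    }
    dio->bio_count--;
    pthread_mutex_unlock(&dio->bio_lock);
}

/* Waiter side: returns only after the final decrement has happened. */
static void dio_wait_model(struct dio_model *dio)
{
    pthread_mutex_lock(&dio->bio_lock);
    while (dio->bio_count > 0)
        pthread_cond_wait(&dio->wait, &dio->bio_lock);
    pthread_mutex_unlock(&dio->bio_lock);
}
```

The point of the ordering is that the waiter can only observe bio_count reaching 0 after the completion work is done, so it cannot free the dio while the completing side is still using it.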