I have (finally) found the problem that causes DIO reads racing with buffered writes to see uninitialized data on ext3 file systems (which is what I have been testing on).

The problem is caused by the changes to __block_write_full_page() and a race with journaling:

    journal_commit_transaction() -> ll_rw_block() -> submit_bh()

ll_rw_block() locks the buffer, clears buffer dirty and calls submit_bh().

A racing __block_write_full_page() (from ext3_ordered_writepage()) would see that buffer_dirty() is not set, because ll_rw_block() already cleared it while the i/o is still in flight, so it would not call submit_bh(). It would SetPageWriteback() and unlock_page(), then see that no i/o was submitted and call end_page_writeback() (with the i/o still in flight).

This would allow the DIO code to issue the DIO read while buffer writes are still in flight. The i/o can be reordered by i/o scheduling, and the DIO can complete BEFORE the writebacks complete. Thus the DIO sees the old uninitialized data.

Here is a quick hack that fixes it, but I am not sure if this is the proper long-term fix. Thoughts?

Daniel
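
To make the interleaving described above easier to follow, here is a toy user-space model of it. This is NOT the quick hack mentioned above and it is not kernel code: every toy_* name is hypothetical, the flags only stand in for buffer_dirty(), buffer_locked() and PageWriteback(), and the wait_for_locked option is just one assumed way of closing the window in this model (roughly what a wait_on_buffer() would do), not a claim about the right fix.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins only; these are not the kernel's structures. */
struct toy_buffer {
    bool dirty;   /* models buffer_dirty() */
    bool locked;  /* models buffer_locked(), i.e. i/o in flight */
};

struct toy_page {
    bool writeback;  /* models PageWriteback() */
};

/* Models ll_rw_block(): lock the buffer, clear dirty, submit the i/o. */
static void toy_journal_submit(struct toy_buffer *bh)
{
    bh->locked = true;
    bh->dirty = false;
    /* submit_bh() would go here; completion arrives some time later */
}

/* Models i/o completion: the buffer is unlocked. */
static void toy_io_complete(struct toy_buffer *bh)
{
    bh->locked = false;
}

/*
 * Models the writepage path described above and returns how many
 * buffers it submitted. wait_for_locked is an ASSUMED mitigation:
 * wait for an in-flight buffer before ending page writeback.
 */
static int toy_writepage(struct toy_page *page, struct toy_buffer *bh,
                         bool wait_for_locked)
{
    int submitted = 0;

    page->writeback = true;           /* SetPageWriteback() */
    if (bh->dirty) {                  /* false if the journal got here first */
        bh->locked = true;
        bh->dirty = false;
        submitted++;
    }
    if (submitted == 0) {
        if (wait_for_locked && bh->locked)
            toy_io_complete(bh);      /* models waiting for the i/o to finish */
        page->writeback = false;      /* end_page_writeback() */
    }
    return submitted;
}

/* A DIO read can see stale data if the page looks clean but i/o is in flight. */
static bool toy_dio_may_see_stale(const struct toy_page *page,
                                  const struct toy_buffer *bh)
{
    return !page->writeback && bh->locked;
}

/* Runs the interleaving: journal submit first, then the racing writepage. */
static bool toy_race_window_open(bool wait_for_locked)
{
    struct toy_buffer bh = { .dirty = true, .locked = false };
    struct toy_page page = { .writeback = false };

    toy_journal_submit(&bh);
    int submitted = toy_writepage(&page, &bh, wait_for_locked);
    assert(submitted == 0);           /* writepage found nothing dirty */
    return toy_dio_may_see_stale(&page, &bh);
}
```

In this model toy_race_window_open(false) reports the window as open (page writeback already ended while the buffer is still locked), and toy_race_window_open(true) reports it as closed. It says nothing about cost or correctness of such a wait in the real writepage path.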