From: Janet Morgan <janetmor@us.ibm.com>
To: Andrew Morton <akpm@osdl.org>
Cc: Daniel McNeil <daniel@osdl.org>,
pbadari@us.ibm.com, linux-aio@kvack.org,
linux-kernel@vger.kernel.org, suparna@in.ibm.com
Subject: Re: [PATCH 2.6.2-rc3-mm1] DIO read race fix
Date: Wed, 04 Feb 2004 18:54:54 -0800
Message-ID: <4021B07E.8030700@us.ibm.com>
In-Reply-To: <20040204180754.28801410.akpm@osdl.org>
Andrew Morton wrote:
>Daniel McNeil <daniel@osdl.org> wrote:
>
>
>>I have found (finally) the problem causing DIO reads racing with
>>buffered writes to see uninitialized data on ext3 file systems
>>(which is what I have been testing on).
>>
>>
>
>What kernel? If -mm, is this the only remaining buffered-vs-direct
>problem?
>
>
>
I think there was consensus on two other patches along the way:
http://marc.theaimsgroup.com/?l=linux-kernel&m=107286971311559&w=2
http://marc.theaimsgroup.com/?l=linux-aio&m=107291089712224&w=2
-Janet
>>The problem is caused by the changes to __block_write_full_page()
>>and a race with journaling:
>>
>>journal_commit_transaction() -> ll_rw_block() -> submit_bh()
>>
>>ll_rw_block() locks the buffer, clears buffer dirty and calls
>>submit_bh()
>>
>>A racing __block_write_full_page() (from ext3_ordered_writepage())
>>
>> would see that buffer_dirty() is not set because the i/o
>> is still in flight, so it would not do a submit_bh()
>>
>> It would SetPageWriteback() and unlock_page() and then
>> see that no i/o was submitted and call end_page_writeback()
>> (with the i/o still in flight).
>>
>>This would allow the DIO code to issue the DIO read while buffer writes
>>are still in flight. The i/o can be reordered by i/o scheduling and
>>the DIO can complete BEFORE the writebacks complete. Thus the DIO
>>sees the old uninitialized data.
>>
>>
>
>ow. How'd you work this out?
>
>
>
>>Here is a quick hack that fixes it, but I am not sure if this is the
>>proper long-term fix.
>>
>>
>
>The problem is that ext3 and the VFS are using different paradigms. VFS is
>all page-based, but ext3 is all block-based. One or the other needs to do
>something nasty.
>
>One approach would be to change the JBD write_out_data_locked loop to use
>block_write_full_page(bh->b_page) instead of buffer_head operations. But
>that could get hairy with blocksize < PAGE_SIZE.
>
>Thanks for working this out. Let me ponder it for a bit.
>
>
>