* Device-backed loop broken in 2.6.0-test2?
From: Thomas Themel @ 2003-08-06 22:40 UTC
To: linux-kernel

Hi,

it seems that device backed loopback is broken in the 2.6.0-test2
series. I've noticed the error while testing cryptoloop, but it still
appears reliably when using plain loop without encryption.

I set up a loopback device on an IDE partition

	losetup /dev/loop0 /dev/hda6

and create an ext3 filesystem on it. Then, when trying to fill it with
data, it works for a while until errors of the form

Buffer I/O error on device loop0, logical block 377367
Buffer I/O error on device loop0, logical block 377380
Buffer I/O error on device loop0, logical block 377419
Buffer I/O error on device loop0, logical block 378937
Buffer I/O error on device loop0, logical block 378983
Buffer I/O error on device loop0, logical block 380008
Buffer I/O error on device loop0, logical block 380009

start to appear in the kernel log. This does not affect the writes,
however, and only manifests later when the filesystem breaks or data in
files is corrupted.

ciao,
-- 
[*Thomas Themel*]  I read what some of you folks here write and all I can
[extended contact] say is that I hope you are inside the fireballs when the
[info provided in] freedom fighters take out the Great Satan.
[*message header*]  - Tim May on cypherpunks

* Re: Device-backed loop broken in 2.6.0-test2?
From: Andrew Morton @ 2003-08-07 0:40 UTC
To: Thomas Themel
Cc: linux-kernel

Thomas Themel <themel@iwoars.net> wrote:
>
> it seems that device backed loopback is broken in the 2.6.0-test2 series.

doh.

We're currently setting PF_READAHEAD across a call into the page allocator.
We end up calling writepage() with PF_READAHEAD set and the block layer
aborts the writes, resulting in corrupted data.

It only seems to bite with loop-on-blockdev for some reason.

And add a warning in ll_rw_block() to catch any more occurrences.


 drivers/block/ll_rw_blk.c |    8 +++++++-
 mm/readahead.c            |   22 +++++++++++-----------
 2 files changed, 18 insertions(+), 12 deletions(-)

diff -puN mm/readahead.c~PF_READAHEAD-loop-fix mm/readahead.c
--- 25/mm/readahead.c~PF_READAHEAD-loop-fix	2003-08-06 16:59:29.000000000 -0700
+++ 25-akpm/mm/readahead.c	2003-08-06 16:59:29.000000000 -0700
@@ -202,9 +202,9 @@ out:
  *
  * Returns the number of pages which actually had IO started against them.
  */
-static inline int
+static int
 __do_page_cache_readahead(struct address_space *mapping, struct file *filp,
-			unsigned long offset, unsigned long nr_to_read)
+		unsigned long offset, unsigned long nr_to_read, int pf_readahead)
 {
 	struct inode *inode = mapping->host;
 	struct page *page;
@@ -249,8 +249,11 @@ __do_page_cache_readahead(struct address
 	 * uptodate then the caller will launch readpage again, and
 	 * will then handle the error.
 	 */
-	if (ret)
+	if (ret) {
+		current->flags |= pf_readahead;
 		read_pages(mapping, filp, &page_pool, ret);
+		current->flags &= ~pf_readahead;
+	}
 	BUG_ON(!list_empty(&page_pool));
 out:
 	return ret;
@@ -275,8 +278,8 @@ int force_page_cache_readahead(struct ad
 		if (this_chunk > nr_to_read)
 			this_chunk = nr_to_read;
 
-		err = __do_page_cache_readahead(mapping, filp,
-						offset, this_chunk);
+		err = __do_page_cache_readahead(mapping, filp, offset,
+						this_chunk, 0);
 		if (err < 0) {
 			ret = err;
 			break;
@@ -300,12 +303,9 @@ int do_page_cache_readahead(struct addre
 {
 	int ret = 0;
 
-	if (!bdi_read_congested(mapping->backing_dev_info)) {
-		current->flags |= PF_READAHEAD;
-		ret = __do_page_cache_readahead(mapping, filp,
-						offset, nr_to_read);
-		current->flags &= ~PF_READAHEAD;
-	}
+	if (!bdi_read_congested(mapping->backing_dev_info))
+		ret = __do_page_cache_readahead(mapping, filp, offset,
+						nr_to_read, PF_READAHEAD);
 	return ret;
 }
 
diff -puN drivers/block/ll_rw_blk.c~PF_READAHEAD-loop-fix drivers/block/ll_rw_blk.c
--- 25/drivers/block/ll_rw_blk.c~PF_READAHEAD-loop-fix	2003-08-06 16:59:29.000000000 -0700
+++ 25-akpm/drivers/block/ll_rw_blk.c	2003-08-06 17:40:27.000000000 -0700
@@ -1847,7 +1847,13 @@ static int __make_request(request_queue_
 
 	barrier = test_bit(BIO_RW_BARRIER, &bio->bi_rw);
 
-	ra = bio_flagged(bio, BIO_RW_AHEAD) || current->flags & PF_READAHEAD;
+	ra = bio_flagged(bio, BIO_RW_AHEAD);
+	if (current->flags & PF_READAHEAD) {
+		if (rw == WRITE)
+			WARN_ON(1);
+		else
+			ra = 1;
+	}
 again:
 	insert_here = NULL;
 
_

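The failure mode described in the message above can be modelled in user space. The following is only a toy sketch, not kernel code: `task_flags` stands in for `current->flags`, the value of `PF_READAHEAD` is made up, and `submit()`, `alloc_page_under_pressure()`, `readahead_wide_scope()` and `readahead_narrow_scope()` are hypothetical names. It also assumes the simplest possible policy, in which every write submitted while the flag is set is dropped; the real block layer is more selective.

```c
#include <assert.h>
#include <stdbool.h>

#define PF_READAHEAD 0x1u	/* stand-in bit, not the kernel's value */

enum rw { READ_REQ, WRITE_REQ };

static unsigned int task_flags;	/* models current->flags */
static int writes_done;		/* writes the toy block layer completed */
static int writes_dropped;	/* writes discarded as "only readahead" */

/* Toy block layer: I/O issued while PF_READAHEAD is set is optional. */
static void submit(enum rw rw)
{
	bool ra = (task_flags & PF_READAHEAD) != 0;

	if (rw != WRITE_REQ)
		return;
	if (ra)
		writes_dropped++;	/* the data never reaches the device */
	else
		writes_done++;
}

/* Toy page allocator: under memory pressure it writes back a dirty page. */
static void alloc_page_under_pressure(void)
{
	submit(WRITE_REQ);
}

/* Toy read_pages(): only this I/O is genuinely speculative. */
static void read_pages(void)
{
	submit(READ_REQ);
}

/* Pre-patch scoping: the flag also covers the page allocation. */
static void readahead_wide_scope(void)
{
	assert(!(task_flags & PF_READAHEAD));
	task_flags |= PF_READAHEAD;
	alloc_page_under_pressure();
	read_pages();
	task_flags &= ~PF_READAHEAD;
}

/* Patched scoping: the flag is set only around read_pages(). */
static void readahead_narrow_scope(void)
{
	assert(!(task_flags & PF_READAHEAD));
	alloc_page_under_pressure();
	task_flags |= PF_READAHEAD;
	read_pages();
	task_flags &= ~PF_READAHEAD;
}
```

In this model a call to `readahead_wide_scope()` increments `writes_dropped`, while `readahead_narrow_scope()` increments `writes_done`, which mirrors why the patch moves the flag manipulation down to the `read_pages()` call.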
* Re: Device-backed loop broken in 2.6.0-test2?
From: Thomas Themel @ 2003-08-07 7:23 UTC
To: Andrew Morton
Cc: linux-kernel

Hi,
Andrew Morton (akpm@osdl.org) wrote on 2003-08-07:
> Thomas Themel <themel@iwoars.net> wrote:
> > it seems that device backed loopback is broken in the 2.6.0-test2 series.
> doh.

Patch applied, and it at least withstood the initial restoration of the
8 GB of data onto it, which I never managed with the unpatched version.

Thanks!

ciao,
-- 
[*Thomas Themel*]  Great Goddess Discordia, Holy Mother Eris,
[extended contact] Joy of the Universe, Laughter of Space, Grant
[info provided in] us Life, Light, Love and Liberty and make the
[*message header*] bloody magick work!

* Re: Device-backed loop broken in 2.6.0-test2?
From: Valdis.Kletnieks @ 2003-08-07 16:07 UTC
To: Andrew Morton
Cc: Thomas Themel, linux-kernel

On Wed, 06 Aug 2003 17:40:43 PDT, Andrew Morton said:

> We're currently setting PF_READAHEAD across a call into the page allocator.
> We end up calling writepage() with PF_READAHEAD set and the block layer
> aborts the writes, resulting in corrupted data.
>
> It only seems to bite with loop-on-blockdev for some reason.

For what it's worth, I've been seeing these same symptoms on ext3 on an
LVM partition - so it's not *just* loop, it appears to be any filesystem
that interposes a mapping layer. Hmm.. wonder if this explains the
failures on RAID that somebody was reporting, too....

/Valdis (who is off to apply the patch that Andrew attached, which
doesn't appear to be in -mm5)...

* Re: Device-backed loop broken in 2.6.0-test2?
From: Valdis.Kletnieks @ 2003-08-07 16:24 UTC
To: Andrew Morton
Cc: Thomas Themel, linux-kernel

On Thu, 07 Aug 2003 12:07:32 EDT, Valdis.Kletnieks@vt.edu said:

> /Valdis (who is off to apply the patch that Andrew attached, which doesn't appear to
> be in -mm5)...

Passing curious.. the first 3 hunks of the patch aren't in -mm5, the
last 2 (or variants thereof) are.... of course I hit 'send' before
checking past the first 3 hunks.. ;)

Are the first 3 superfluous, or did -mm5 get half a patch?

* Re: Device-backed loop broken in 2.6.0-test2?
From: Andrew Morton @ 2003-08-07 16:29 UTC
To: Valdis.Kletnieks
Cc: themel, linux-kernel

Valdis.Kletnieks@vt.edu wrote:
>
> /Valdis (who is off to apply the patch that Andrew attached, which doesn't appear to
> be in -mm5)...

mm5 fixed it differently.

* cryptoloop data corruption (was Re: Device-backed loop broken in 2.6.0-test2?)
From: Thomas Themel @ 2003-08-09 20:48 UTC
To: Andrew Morton
Cc: linux-kernel

Andrew Morton (akpm@osdl.org) wrote on 2003-08-07:
> Thomas Themel <themel@iwoars.net> wrote:
> > it seems that device backed loopback is broken in the 2.6.0-test2 series.
>
> doh.

Hm, it seems that this patch doesn't apply to 2.6.0-test3, so I assume
that the 'other fix' from -mm5 is included?

I still get data corruption on cryptoloop, but now it is a bit more
subtle... One bit of every byte at multiples of 0x200 is flipped,
starting with the one at 0x1000.

See this for a short example (xxd output of file before and after copy
to cryptoloop):

--- good.xxd	2003-08-09 22:33:21.000000000 +0200
+++ b0rk.xxd	2003-08-09 22:32:59.000000000 +0200
@@ -256,3 +256,3 @@
 0000ff0: ffff ffff ffff ffff ffff ffff ffff ffff  ................
-0001000: ffff ffff ffff ffff ffff ffff ffff ffff  ................
+0001000: f7ff ffff ffff ffff ffff ffff ffff ffff  ................
 0001010: ffff ffff ffff ffff ffff ffff ffff ffff  ................
@@ -288,3 +288,3 @@
 00011f0: 0ae0 004b 0000 0000 0000 0960 0000 0000  ...K.......`....
-0001200: 0001 2c00 0000 0000 0025 8000 0000 ffff  ..,......%......
+0001200: 0801 2c00 0000 0000 0025 8000 0000 ffff  ..,......%......
 0001210: ffff ffff ffff ffff ffff ffff ffff ffff  ................
@@ -320,3 +320,3 @@
 00013f0: ffff ffff ffff ffff ffff ffff ffff ffff  ................
-0001400: ffff ffff ffff ffff ffff ffff ffff ffff  ................
+0001400: f7ff ffff ffff ffff ffff ffff ffff ffff  ................
 0001410: ffff ffff ffff ffff ffff ffff ffff ffff  ................
@@ -352,3 +352,3 @@
 00015f0: ffff ffff ffff ffff ffff ffff ffff ffff  ................
-0001600: ffff ffff ffff ffff ffff ffff ffff ffff  ................

Any ideas what's causing this? The files are ext3 on an AES cryptoloop
backed by an IDE partition.

ciao,
-- 
[*Thomas Themel*]  US law prohibits boycotting Israel
[extended contact]
[info provided in] <http://news.bbc.co.uk/2/hi/business/2403303.stm>
[*message header*] <http://www.bxa.doc.gov/AntiboycottCompliance/Default.htm>

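In the three complete before/after pairs of the xxd diff above, the changed byte differs by the same mask each time: 0xff ^ 0xf7 and 0x00 ^ 0x08 both give 0x08. A comparison like that can be automated with a small helper such as the sketch below. The names `SECTOR_SIZE`, `struct diff_summary` and `summarize_diff()` are hypothetical, and treating the 0x200 stride as a 512-byte sector boundary is an assumption drawn only from the offsets in the report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 0x200u	/* 512 bytes: the stride seen in the report */

struct diff_summary {
	size_t differing_bytes;	/* bytes where the two buffers disagree */
	size_t at_sector_start;	/* of those, offsets that are multiples of 0x200 */
	uint8_t xor_mask;	/* OR of good[i] ^ bad[i] over all differing bytes */
};

/* Compare two equally sized buffers and summarise where and how they differ. */
static struct diff_summary summarize_diff(const uint8_t *good,
					  const uint8_t *bad, size_t len)
{
	struct diff_summary s = { 0, 0, 0 };
	size_t i;

	assert(good != NULL && bad != NULL);

	for (i = 0; i < len; i++) {
		uint8_t x = good[i] ^ bad[i];

		if (x == 0)
			continue;
		s.differing_bytes++;
		if (i % SECTOR_SIZE == 0)
			s.at_sector_start++;
		s.xor_mask |= x;
	}
	return s;
}
```

If `differing_bytes` equals `at_sector_start` and `xor_mask` has a single bit set, the damage matches the pattern described in the message: one flipped bit in the first byte of each 0x200-byte block.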