From: Jan Kara <jack@suse.cz>
To: Theodore Ts'o <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>, Dmitry Monakhov <dmonakhov@openvz.org>,
	linux-ext4@vger.kernel.org
Subject: Re: REGRESSION: [PATCH 04/12] ext4: Disable merging of uninitialized extents
Date: Thu, 14 Feb 2013 17:11:16 +0100
Message-ID: <20130214161116.GB31269@quack.suse.cz>
In-Reply-To: <20130209171015.GC8091@thunk.org>

On Sat 09-02-13 12:10:15, Ted Tso wrote:
> On Fri, Jan 18, 2013 at 01:00:38PM +0100, Jan Kara wrote:
> > Merging of uninitialized extents creates all sorts of interesting race
> > possibilities when writeback / DIO races with fallocate. Thus
> > ext4_convert_unwritten_extents_endio() has to deal with a case where
> > extent to be converted needs to be split out first. That isn't nice
> > for two reasons:
> > 
> > 1) It may need allocation of extent tree block so ENOSPC is possible.
> > 2) It complicates end_io handling code
> > 
> > So we disable merging of uninitialized extents which allows us to simplify
> > the code. Extents will get merged after they are converted to initialized
> > ones.
> > 
> > Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
> > Signed-off-by: Jan Kara <jack@suse.cz>
> 
> Sorry for not noticing this earlier, but this patch is causing a
> regression.  It is leading to test 113 failing when dioread_nolock is
> used:
> 
> 113	[   47.619363] EXT4-fs error (device vdb): ext4_convert_unwritten_extents_endio:3411: inode #10951: comm kworker/u:0: Written extent modified before IO finished: extent logical block 1024, len 1024; IO logical block 1024, len 127
> [   47.619363] 
> [   47.623239] EXT4-fs warning (device vdb): ext4_convert_unwritten_extents:4522: inode #10951: block 1024: len 127: ext4_ext_map_blocks returned -5
> [   47.628975] EXT4-fs (vdb): failed to convert unwritten extents to written extents -- potential data loss!  (inode 10951, offset 4194304, size 520192, error -5)
> 
> 
> As a result, I am considering whether or not I should drop the
> following patches from the ext4 tree:
> 
> e63dd9c ext4: disable merging of uninitialized extents
> de39534 ext4: remove unnecessary wait for extent conversion in ext4_fallocate()
> 37bf0a8 ext4: ext4_split_extent should take care of extent zeroout
> 
> I know that these patches fix other potential races which cause data
> loss, but they've been around for a while, and in practice seem to be
> relatively rarely hit.
  OK, so I've debugged this. It wasn't actually that hard. The problem is
that mpage_da_map_and_submit() prepares an extent with e.g. 256 blocks, but
later we submit a shorter bio, e.g. because it cannot carry that many pages.
So ->end_io is called only for the first 128 blocks or so. I had spotted this
problem before; it just didn't come to mind when Ted sent me this bug
report.
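
To make the mismatch concrete, here is a rough userspace sketch of the kind
of check that fires. It is not the actual ext4 code: struct blk_range and
io_matches_extent() are hypothetical names used only for illustration, and
the real check in ext4_convert_unwritten_extents_endio() may differ in detail.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified range: first logical block and length in blocks. */
struct blk_range {
	uint32_t lblk;
	uint32_t len;
};

/*
 * Return true when the completed IO covers exactly the unwritten extent
 * found in the tree. With merging disabled the extent is never expected
 * to be larger than the IO, so a mismatch is reported as an error.
 */
static bool io_matches_extent(struct blk_range extent, struct blk_range io)
{
	return extent.lblk == io.lblk && extent.len == io.len;
}
```

With the numbers from Ted's log (extent at logical block 1024 with len 1024,
IO at logical block 1024 with len 127) such a check fails, because the bio
that completed covers only the beginning of the prepared extent.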

The fix isn't trivial. We need to be able to attach multiple
bios to one io_end structure and start the conversion only once they have
all finished. I actually have patches for this in the second part of my
patch set. So for this merge window I'd just do what Dmitry suggested (the
warning can be triggered trivially by a sequential write with
dioread_nolock, so it definitely has to be hidden by default). And I'll go
off to finish that second part of my patch set so that it can get to Ted's
tree as soon as possible.
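
The idea of several bios sharing one io_end could look roughly like the
reference-counting sketch below. This is a minimal userspace illustration
under my own assumptions, not the patches mentioned above: struct
io_end_sketch and the io_end_sketch_*() helpers are hypothetical names, and
the "conversion" is just a flag standing in for the real extent conversion.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an io_end: one per prepared extent. */
struct io_end_sketch {
	atomic_int refs;	/* submitter's reference + one per attached bio */
	bool converted;		/* set once the whole extent has been converted */
};

/* The submitter starts out holding one reference of its own. */
static void io_end_sketch_init(struct io_end_sketch *io)
{
	atomic_init(&io->refs, 1);
	io->converted = false;
}

/* Take a reference for each bio attached to the io_end, before submission. */
static void io_end_sketch_get(struct io_end_sketch *io)
{
	atomic_fetch_add(&io->refs, 1);
}

/*
 * Drop a reference: called from each bio's completion and once by the
 * submitter after all bios are attached. Only the final put starts the
 * conversion, so it always covers the full extent.
 * Returns true if this call was the final put.
 */
static bool io_end_sketch_put(struct io_end_sketch *io)
{
	if (atomic_fetch_sub(&io->refs, 1) != 1)
		return false;
	io->converted = true;	/* placeholder for converting the extent */
	return true;
}
```

The submitter's own reference is there so that an early bio completion cannot
trigger the conversion while later bios for the same extent are still being
attached.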

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR
