From: Brian Foster <bfoster@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH 01/14] xfs: fix sub-page blocksize data integrity writes
Date: Mon, 20 May 2013 14:02:59 -0400
Message-ID: <519A6553.4090801@redhat.com>
In-Reply-To: <1369007481-15185-2-git-send-email-david@fromorbit.com>

On 05/19/2013 07:51 PM, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> FSX on 512 byte block size filesystems has been failing for some
> time with corrupted data. The fault dates back to the change in
> the writeback data integrity algorithm that uses a mark-and-sweep
> approach to avoid data writeback livelocks.
> 
> Unfortunately, a side effect of this mark-and-sweep approach is that
> each page will only be written once for a data integrity sync, and
> there is a condition in writeback in XFS where a page may require
> two writeback attempts to be fully written. As a result of the high
> level change, we now only get a partial page writeback during the
> integrity sync because the first pass through writeback clears the
> mark left on the page index to tell writeback that the page needs
> writeback....
> 
> The cause is writing a partial page in the clustering code. This can
> happen when a mapping boundary falls in the middle of a page - we
> end up writing back the first part of the page that the mapping
> covers, but then never revisit the page to have the remainder mapped
> and written.
> 
> The fix is simple - if the mapping boundary falls inside a page,
> then simply abort clustering without touching the page. This means
> that the next ->writepage entry that write_cache_pages() will make
> is the page we aborted on, and xfs_vm_writepage() will map all
> sections of the page correctly. This behaviour is also optimal for
> non-data integrity writes, as it results in contiguous sequential
> writeback of the file rather than missing small holes and having to
> write them as "random" writes in a future pass.
> 
> With this fix, all the fsx tests in xfstests now pass on a 512 byte
> block size filesystem on a 4k page machine.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---

Looks good to me.

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  fs/xfs/xfs_aops.c |   19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
> index 2b2691b..f04eceb 100644
> --- a/fs/xfs/xfs_aops.c
> +++ b/fs/xfs/xfs_aops.c
> @@ -725,6 +725,25 @@ xfs_convert_page(
>  			(xfs_off_t)(page->index + 1) << PAGE_CACHE_SHIFT,
>  			i_size_read(inode));
>  
> +	/*
> +	 * If the current map does not span the entire page we are about to try
> +	 * to write, then give up. The only way we can write a page that spans
> +	 * multiple mappings in a single writeback iteration is via the
> +	 * xfs_vm_writepage() function. Data integrity writeback requires the
> +	 * entire page to be written in a single attempt, otherwise the part of
> +	 * the page we don't write here doesn't get written as part of the data
> +	 * integrity sync.
> +	 *
> +	 * For normal writeback, we also don't attempt to write partial pages
> +	 * here as it simply means that write_cache_pages() will see it under
> +	 * writeback and ignore the page until some point in the future, at
> +	 * which time this will be the only page in the file that needs
> +	 * writeback. Hence for more optimal IO patterns, we should always
> +	 * avoid partial page writeback due to multiple mappings on a page here.
> +	 */
> +	if (!xfs_imap_valid(inode, imap, end_offset))
> +		goto fail_unlock_page;
> +
>  	len = 1 << inode->i_blkbits;
>  	p_offset = min_t(unsigned long, end_offset & (PAGE_CACHE_SIZE - 1),
>  					PAGE_CACHE_SIZE);
> 
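
For readers following the check added above, here is a minimal userspace
sketch of the idea, not the kernel code. It assumes the test asks whether
the block containing end_offset falls inside the current mapping. All
demo_* names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an extent mapping, in filesystem blocks. */
struct demo_map {
	uint64_t start_block;	/* first file block covered */
	uint64_t block_count;	/* number of blocks covered */
};

/* True if the block containing byte 'offset' lies inside the mapping. */
static bool demo_map_covers(const struct demo_map *map, uint64_t offset,
			    unsigned int blkbits)
{
	uint64_t block = offset >> blkbits;

	return block >= map->start_block &&
	       block < map->start_block + map->block_count;
}

/* Byte offset just past the page, clamped to the file size. */
static uint64_t demo_page_end(uint64_t page_index, unsigned int page_shift,
			      uint64_t file_size)
{
	uint64_t end = (page_index + 1) << page_shift;

	return end < file_size ? end : file_size;
}

/*
 * Decide whether clustering may write this page. If the mapping does not
 * reach the end of the page, return false so the caller leaves the page
 * untouched for a later full-page writeback attempt.
 */
static bool demo_can_cluster_page(const struct demo_map *map,
				  uint64_t page_index,
				  unsigned int page_shift,
				  unsigned int blkbits, uint64_t file_size)
{
	uint64_t end_offset;

	/* The sketch only models block sizes up to the page size. */
	assert(blkbits <= page_shift);

	end_offset = demo_page_end(page_index, page_shift, file_size);
	return demo_map_covers(map, end_offset, blkbits);
}
```

With 512 byte blocks (blkbits 9) and 4k pages (page_shift 12), page 1
holds blocks 8-15. A mapping of blocks 0-11 ends in the middle of that
page, so demo_can_cluster_page() rejects it and the page is left alone.
In this sketch the test looks only at the block containing end_offset,
so a mapping that stops exactly at the page boundary is rejected as well.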

