All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH 03/27] xfs: use write_cache_pages for writeback clustering
Date: Fri, 1 Jul 2011 12:22:48 +1000	[thread overview]
Message-ID: <20110701022248.GM561@dastard> (raw)
In-Reply-To: <20110629140336.950805096@bombadil.infradead.org>

On Wed, Jun 29, 2011 at 10:01:12AM -0400, Christoph Hellwig wrote:
> Instead of implementing our own writeback clustering use write_cache_pages
> to do it for us.  This means the guts of the current writepage implementation
> become a new helper used both for implementing ->writepage and as a callback
> to write_cache_pages for ->writepages.  A new struct xfs_writeback_ctx
> is used to track block mapping state and the ioend chain over multiple
> invocation of it.
> 
> The advantage over the old code is that we avoid a double pagevec lookup,
> and a more efficient handling of extent boundaries inside a page for
> small blocksize filesystems, as well as having less XFS specific code.

It's not more efficient right now, due to a little bug:

> @@ -973,36 +821,38 @@ xfs_vm_writepage(
>  		 * buffers covering holes here.
>  		 */
>  		if (!buffer_mapped(bh) && buffer_uptodate(bh)) {
> -			imap_valid = 0;
> +			ctx->imap_valid = 0;
>  			continue;
>  		}
>  
>  		if (buffer_unwritten(bh)) {
>  			if (type != IO_UNWRITTEN) {
>  				type = IO_UNWRITTEN;
> -				imap_valid = 0;
> +				ctx->imap_valid = 0;
>  			}
>  		} else if (buffer_delay(bh)) {
>  			if (type != IO_DELALLOC) {
>  				type = IO_DELALLOC;
> -				imap_valid = 0;
> +				ctx->imap_valid = 0;
>  			}
>  		} else if (buffer_uptodate(bh)) {
>  			if (type != IO_OVERWRITE) {
>  				type = IO_OVERWRITE;
> -				imap_valid = 0;
> +				ctx->imap_valid = 0;
>  			}
>  		} else {
>  			if (PageUptodate(page)) {
>  				ASSERT(buffer_mapped(bh));
> -				imap_valid = 0;
> +				ctx->imap_valid = 0;
>  			}
>  			continue;
>  		}

This piece of logic checks is the type of buffer has changed from the
previous buffer. This used to work just fine, but now "type" is
local to the __xfs_vm_writepage() function, while the imap life
spanѕ multiple calls to the __xfs_vm_writepage() function. Hence
type is reinitialised to IO_OVERWRITE on every page that written,
and so for delalloc we are invalidating the imap and looking it up
again on every page. Traces show this sort of behaviour:

           <...>-514   [000] 689640.881953: xfs_writepage:        dev 253:16 ino 0x552248 pgoff 0xf7000 size 0xa00000 offset 0 delalloc 1 unwritten 0
           <...>-514   [000] 689640.881954: xfs_ilock:            dev 253:16 ino 0x552248 flags ILOCK_SHARED caller xfs_map_blocks
           <...>-514   [000] 689640.881954: xfs_iunlock:          dev 253:16 ino 0x552248 flags ILOCK_SHARED caller xfs_map_blocks
           <...>-514   [000] 689640.881954: xfs_map_blocks_found: dev 253:16 ino 0x552248 size 0x0 new_size 0x0 offset 0xf7000 count 1024 type  startoff 0x0 startblock 6297609 blockcount 0x2800
           <...>-514   [000] 689640.881956: xfs_writepage:        dev 253:16 ino 0x552248 pgoff 0xf8000 size 0xa00000 offset 0 delalloc 1 unwritten 0
           <...>-514   [000] 689640.881957: xfs_ilock:            dev 253:16 ino 0x552248 flags ILOCK_SHARED caller xfs_map_blocks
           <...>-514   [000] 689640.881957: xfs_iunlock:          dev 253:16 ino 0x552248 flags ILOCK_SHARED caller xfs_map_blocks
           <...>-514   [000] 689640.881957: xfs_map_blocks_found: dev 253:16 ino 0x552248 size 0x0 new_size 0x0 offset 0xf8000 count 1024 type  startoff 0x0 startblock 6297609 blockcount 0x2800
           <...>-514   [000] 689640.881960: xfs_writepage:        dev 253:16 ino 0x552248 pgoff 0xf9000 size 0xa00000 offset 0 delalloc 1 unwritten 0
           <...>-514   [000] 689640.881960: xfs_ilock:            dev 253:16 ino 0x552248 flags ILOCK_SHARED caller xfs_map_blocks
           <...>-514   [000] 689640.881961: xfs_iunlock:          dev 253:16 ino 0x552248 flags ILOCK_SHARED caller xfs_map_blocks
           <...>-514   [000] 689640.881961: xfs_map_blocks_found: dev 253:16 ino 0x552248 size 0x0 new_size 0x0 offset 0xf9000 count 1024 type  startoff 0x0 startblock 6297609 blockcount 0x2800

IOWs, the type field also needs to be moved into the writepage
context structure so that we don't keep doing needless extent map
lookups.

With the following patch, the trace output now looks like this for
delalloc writeback:

           <...>-12623 [000] 694093.594883: xfs_writepage:        dev 253:16 ino 0x2300a5 pgoff 0x505000 size 0xa00000 offset 0 delalloc 1 unwritten 0
           <...>-12623 [000] 694093.594884: xfs_writepage:        dev 253:16 ino 0x2300a5 pgoff 0x506000 size 0xa00000 offset 0 delalloc 1 unwritten 0
           <...>-12623 [000] 694093.594884: xfs_writepage:        dev 253:16 ino 0x2300a5 pgoff 0x507000 size 0xa00000 offset 0 delalloc 1 unwritten 0
           <...>-12623 [000] 694093.594885: xfs_writepage:        dev 253:16 ino 0x2300a5 pgoff 0x508000 size 0xa00000 offset 0 delalloc 1 unwritten 0
           <...>-12623 [000] 694093.594885: xfs_writepage:        dev 253:16 ino 0x2300a5 pgoff 0x509000 size 0xa00000 offset 0 delalloc 1 unwritten 0
           <...>-12623 [000] 694093.594886: xfs_writepage:        dev 253:16 ino 0x2300a5 pgoff 0x50a000 size 0xa00000 offset 0 delalloc 1 unwritten 0
           <...>-12623 [000] 694093.594887: xfs_writepage:        dev 253:16 ino 0x2300a5 pgoff 0x50b000 size 0xa00000 offset 0 delalloc 1 unwritten 0
           <...>-12623 [000] 694093.594888: xfs_writepage:        dev 253:16 ino 0x2300a5 pgoff 0x50c000 size 0xa00000 offset 0 delalloc 1 unwritten 0


i.e. there mapping lookup is no longer occurring for every page.

As a side effect, the failure case I'm seeing with test 180 has gone
from 5-10 files with the wrong size to >200 files with the wrong
size with this patch, so clearly there is something wrong with file
size updates getting to disk that this patch set makes worse.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

xfs: io type needs to be part of the writepage context

From: Dave Chinner <dchinner@redhat.com>

If we don't pass the IO type we are mapping with the writeage
context, then the imap is recalculated on every delalloc page that
is passed to _xfs_vm_writepage(). This defeats the purpose of having
a cached imap between calls and increases the overhead of delalloc
writeback significantly.

Fix this by moving the io type into the writepage context structure
so that it moves with the cached imap through the stack.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/linux-2.6/xfs_aops.c |   30 ++++++++++++++++++------------
 1 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_aops.c b/fs/xfs/linux-2.6/xfs_aops.c
index 73dac4b..25b63cd 100644
--- a/fs/xfs/linux-2.6/xfs_aops.c
+++ b/fs/xfs/linux-2.6/xfs_aops.c
@@ -40,6 +40,7 @@
 
 struct xfs_writeback_ctx {
 	unsigned int		imap_valid;
+	unsigned int		io_type;
 	struct xfs_bmbt_irec	imap;
 	struct xfs_ioend	*iohead;
 	struct xfs_ioend	*ioend;
@@ -804,7 +805,6 @@ __xfs_vm_writepage(
 
 	bh = head = page_buffers(page);
 	offset = page_offset(page);
-	type = IO_OVERWRITE;
 
 	do {
 		int new_ioend = 0;
@@ -826,18 +826,18 @@ __xfs_vm_writepage(
 		}
 
 		if (buffer_unwritten(bh)) {
-			if (type != IO_UNWRITTEN) {
-				type = IO_UNWRITTEN;
+			if (ctx->io_type != IO_UNWRITTEN) {
+				ctx->io_type = IO_UNWRITTEN;
 				ctx->imap_valid = 0;
 			}
 		} else if (buffer_delay(bh)) {
-			if (type != IO_DELALLOC) {
-				type = IO_DELALLOC;
+			if (ctx->io_type != IO_DELALLOC) {
+				ctx->io_type = IO_DELALLOC;
 				ctx->imap_valid = 0;
 			}
 		} else if (buffer_uptodate(bh)) {
-			if (type != IO_OVERWRITE) {
-				type = IO_OVERWRITE;
+			if (ctx->io_type != IO_OVERWRITE) {
+				ctx->io_type = IO_OVERWRITE;
 				ctx->imap_valid = 0;
 			}
 		} else {
@@ -862,7 +862,8 @@ __xfs_vm_writepage(
 			 * time.
 			 */
 			new_ioend = 1;
-			err = xfs_map_blocks(inode, offset, &ctx->imap, type);
+			err = xfs_map_blocks(inode, offset, &ctx->imap,
+					     ctx->io_type);
 			if (err)
 				goto error;
 			ctx->imap_valid =
@@ -870,11 +871,12 @@ __xfs_vm_writepage(
 		}
 		if (ctx->imap_valid) {
 			lock_buffer(bh);
-			if (type != IO_OVERWRITE) {
+			if (ctx->io_type != IO_OVERWRITE) {
 				xfs_map_at_offset(inode, bh, &ctx->imap,
 						  offset);
 			}
-			xfs_add_to_ioend(ctx, inode, bh, offset, type, new_ioend);
+			xfs_add_to_ioend(ctx, inode, bh, offset, ctx->io_type,
+					 new_ioend);
 			count++;
 		}
 	} while (offset += len, ((bh = bh->b_this_page) != head));
@@ -902,7 +904,9 @@ xfs_vm_writepage(
 	struct page		*page,
 	struct writeback_control *wbc)
 {
-	struct xfs_writeback_ctx ctx = { };
+	struct xfs_writeback_ctx ctx = {
+		.io_type = IO_OVERWRITE,
+	};
 	int ret;
 
 	/*
@@ -939,7 +943,9 @@ xfs_vm_writepages(
 	struct address_space	*mapping,
 	struct writeback_control *wbc)
 {
-	struct xfs_writeback_ctx ctx = { };
+	struct xfs_writeback_ctx ctx = {
+		.io_type = IO_OVERWRITE,
+	};
 	int ret;
 
 	xfs_iflags_clear(XFS_I(mapping->host), XFS_ITRUNCATED);

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  parent reply	other threads:[~2011-07-01  2:22 UTC|newest]

Thread overview: 100+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-06-29 14:01 [PATCH 00/27] patch queue for Linux 3.1 Christoph Hellwig
2011-06-29 14:01 ` [PATCH 01/27] xfs: PF_FSTRANS should never be set in ->writepage Christoph Hellwig
2011-06-30  1:34   ` Dave Chinner
2011-06-29 14:01 ` [PATCH 02/27] xfs: remove the unused ilock_nowait codepath in writepage Christoph Hellwig
2011-06-30  0:15   ` Dave Chinner
2011-06-30  1:26     ` Dave Chinner
2011-06-30  6:55     ` Christoph Hellwig
2011-06-29 14:01 ` [PATCH 03/27] xfs: use write_cache_pages for writeback clustering Christoph Hellwig
2011-06-30  2:00   ` Dave Chinner
2011-06-30  2:48     ` Dave Chinner
2011-06-30  6:57     ` Christoph Hellwig
2011-07-01  2:22   ` Dave Chinner [this message]
2011-07-01  4:18     ` Dave Chinner
2011-07-01  8:59       ` Christoph Hellwig
2011-07-01  9:20         ` Dave Chinner
2011-07-01  9:33       ` Christoph Hellwig
2011-07-01  9:33         ` Christoph Hellwig
2011-07-01 14:59         ` Mel Gorman
2011-07-01 14:59           ` Mel Gorman
2011-07-01 15:15           ` Christoph Hellwig
2011-07-01 15:15             ` Christoph Hellwig
2011-07-02  2:42           ` Dave Chinner
2011-07-02  2:42             ` Dave Chinner
2011-07-05 14:10             ` Mel Gorman
2011-07-05 14:10               ` Mel Gorman
2011-07-05 15:55               ` Dave Chinner
2011-07-05 15:55                 ` Dave Chinner
2011-07-11 10:26             ` Christoph Hellwig
2011-07-11 10:26               ` Christoph Hellwig
2011-07-01 15:41         ` Wu Fengguang
2011-07-01 15:41           ` Wu Fengguang
2011-07-04  3:25           ` Dave Chinner
2011-07-04  3:25             ` Dave Chinner
2011-07-05 14:34             ` Mel Gorman
2011-07-05 14:34               ` Mel Gorman
2011-07-06  1:23               ` Dave Chinner
2011-07-06  1:23                 ` Dave Chinner
2011-07-11 11:10               ` Christoph Hellwig
2011-07-11 11:10                 ` Christoph Hellwig
2011-07-06  4:53             ` Wu Fengguang
2011-07-06  4:53               ` Wu Fengguang
2011-07-06  6:47               ` Minchan Kim
2011-07-06  6:47                 ` Minchan Kim
2011-07-06  7:17               ` Dave Chinner
2011-07-06  7:17                 ` Dave Chinner
2011-07-06 15:12             ` Johannes Weiner
2011-07-06 15:12               ` Johannes Weiner
2011-07-08  9:54               ` Dave Chinner
2011-07-08  9:54                 ` Dave Chinner
2011-07-11 17:20                 ` Johannes Weiner
2011-07-11 17:20                   ` Johannes Weiner
2011-07-11 17:24                   ` Christoph Hellwig
2011-07-11 17:24                     ` Christoph Hellwig
2011-07-11 19:09                   ` Rik van Riel
2011-07-11 19:09                     ` Rik van Riel
2011-07-01  8:51     ` Christoph Hellwig
2011-06-29 14:01 ` [PATCH 04/27] xfs: cleanup xfs_add_to_ioend Christoph Hellwig
2011-06-29 22:13   ` Alex Elder
2011-06-30  2:00   ` Dave Chinner
2011-06-29 14:01 ` [PATCH 05/27] xfs: work around bogus gcc warning in xfs_allocbt_init_cursor Christoph Hellwig
2011-06-29 22:13   ` Alex Elder
2011-06-29 14:01 ` [PATCH 06/27] xfs: split xfs_setattr Christoph Hellwig
2011-06-29 22:13   ` Alex Elder
2011-06-30  7:03     ` Christoph Hellwig
2011-06-30 12:28       ` Alex Elder
2011-06-30  2:11   ` Dave Chinner
2011-06-29 14:01 ` [PATCH 08/27] xfs: kill xfs_itruncate_start Christoph Hellwig
2011-06-29 22:13   ` Alex Elder
2011-06-29 14:01 ` [PATCH 09/27] xfs: split xfs_itruncate_finish Christoph Hellwig
2011-06-30  2:44   ` Dave Chinner
2011-06-30  7:18     ` Christoph Hellwig
2011-06-29 14:01 ` [PATCH 10/27] xfs: improve sync behaviour in the fact of aggressive dirtying Christoph Hellwig
2011-06-30  2:52   ` Dave Chinner
2011-06-29 14:01 ` [PATCH 11/27] xfs: fix filesystsem freeze race in xfs_trans_alloc Christoph Hellwig
2011-06-30  2:59   ` Dave Chinner
2011-06-29 14:01 ` [PATCH 12/27] xfs: remove i_transp Christoph Hellwig
2011-06-30  3:00   ` Dave Chinner
2011-06-29 14:01 ` [PATCH 13/27] xfs: factor out xfs_dir2_leaf_find_entry Christoph Hellwig
2011-06-30  6:11   ` Dave Chinner
2011-06-30  7:34     ` Christoph Hellwig
2011-06-29 14:01 ` [PATCH 14/27] xfs: cleanup shortform directory inode number handling Christoph Hellwig
2011-06-30  6:35   ` Dave Chinner
2011-06-30  7:39     ` Christoph Hellwig
2011-06-29 14:01 ` [PATCH 15/27] xfs: kill struct xfs_dir2_sf Christoph Hellwig
2011-06-30  7:04   ` Dave Chinner
2011-06-30  7:09     ` Christoph Hellwig
2011-06-29 14:01 ` [PATCH 16/27] xfs: cleanup the defintion of struct xfs_dir2_sf_entry Christoph Hellwig
2011-06-29 14:01 ` [PATCH 17/27] xfs: avoid usage of struct xfs_dir2_block Christoph Hellwig
2011-06-29 14:01 ` [PATCH 18/27] xfs: kill " Christoph Hellwig
2011-06-29 14:01 ` [PATCH 19/27] xfs: avoid usage of struct xfs_dir2_data Christoph Hellwig
2011-06-29 14:01 ` [PATCH 20/27] xfs: kill " Christoph Hellwig
2011-06-29 14:01 ` [PATCH 21/27] xfs: cleanup the defintion of struct xfs_dir2_data_entry Christoph Hellwig
2011-06-29 14:01 ` [PATCH 22/27] xfs: cleanup struct xfs_dir2_leaf Christoph Hellwig
2011-06-29 14:01 ` [PATCH 23/27] xfs: remove the unused xfs_bufhash structure Christoph Hellwig
2011-06-29 14:01 ` [PATCH 24/27] xfs: clean up buffer locking helpers Christoph Hellwig
2011-06-29 14:01 ` [PATCH 25/27] xfs: return the buffer locked from xfs_buf_get_uncached Christoph Hellwig
2011-06-29 14:01 ` [PATCH 26/27] xfs: cleanup I/O-related buffer flags Christoph Hellwig
2011-06-29 14:01 ` [PATCH 27/27] xfs: avoid a few disk cache flushes Christoph Hellwig
2011-06-30  6:36 ` [PATCH 00/27] patch queue for Linux 3.1 Dave Chinner
2011-06-30  6:50   ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110701022248.GM561@dastard \
    --to=david@fromorbit.com \
    --cc=hch@infradead.org \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.