Date: Thu, 30 Jun 2011 12:00:13 +1000
From: Dave Chinner
To: Christoph Hellwig
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH 03/27] xfs: use write_cache_pages for writeback clustering
Message-ID: <20110630020013.GX561@dastard>
References: <20110629140109.003209430@bombadil.infradead.org> <20110629140336.950805096@bombadil.infradead.org>
In-Reply-To: <20110629140336.950805096@bombadil.infradead.org>

On Wed, Jun 29, 2011 at 10:01:12AM -0400, Christoph Hellwig wrote:
> Instead of implementing our own writeback clustering, use write_cache_pages
> to do it for us. This means the guts of the current writepage implementation
> become a new helper used both for implementing ->writepage and as a callback
> to write_cache_pages for ->writepages. A new struct xfs_writeback_ctx
> is used to track block mapping state and the ioend chain over multiple
> invocations of it.
>
> The advantage over the old code is that we avoid a double pagevec lookup,
> and get more efficient handling of extent boundaries inside a page for
> small blocksize filesystems, as well as having less XFS-specific code.

Yes, it should be more efficient, but I can't actually measure any
noticeable CPU usage difference at 800MB/s writeback. The profiles change
shape around the changed code, but overall CPU usage does not change. I
think this is because the second pagevec lookup is pretty much free: the
radix tree is already hot in cache when we do the second lookup...

> The downside is that we don't do writeback clustering when called from
> kswapd anymore, but that is a case that should be avoided anyway. Note
> that we still convert the whole delalloc range from ->writepage, so
> the on-disk allocation pattern is not affected.

All the more reason to ensure the mm subsystem doesn't do this....

.....

>  error:
> -	if (iohead)
> -		xfs_cancel_ioend(iohead);
> -
> -	if (err == -EAGAIN)
> -		goto redirty;
> -

Should this EAGAIN handling be dealt with in the patch that removes
non-blocking mode?

> +STATIC int
>  xfs_vm_writepages(
>  	struct address_space	*mapping,
>  	struct writeback_control *wbc)
>  {
> +	struct xfs_writeback_ctx ctx = { };
> +	int			ret;
> +
>  	xfs_iflags_clear(XFS_I(mapping->host), XFS_ITRUNCATED);
> -	return generic_writepages(mapping, wbc);
> +
> +	ret = write_cache_pages(mapping, wbc, __xfs_vm_writepage, &ctx);
> +
> +	if (ctx.iohead) {
> +		if (ret)
> +			xfs_cancel_ioend(ctx.iohead);
> +		else
> +			xfs_submit_ioend(wbc, ctx.iohead);
> +	}

I think this error handling does not work. If we have put pages into the
ioend (i.e. successful ->writepage calls) and a later ->writepage call
fails, all the pages already under writeback (i.e. those attached to the
ioend) remain in that state: they are never written back (and so never
move to the clean state) and never redirtied (and so are never written
again later).

xfs_cancel_ioend() was only ever called for the first page sent down to
->writepage, and on error that page was redirtied separately. Hence it
doesn't handle this case at all, because the case never occurs in the
existing code.

I'd suggest that regardless of whether an error is returned here, the
existence of ctx.iohead indicates a valid ioend that needs to be
submitted....
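Completely untested sketch of what I mean, reusing the names from the
quoted hunk: submit the ioend whenever it exists and just propagate any
error from write_cache_pages():

STATIC int
xfs_vm_writepages(
	struct address_space	*mapping,
	struct writeback_control *wbc)
{
	struct xfs_writeback_ctx ctx = { };
	int			ret;

	xfs_iflags_clear(XFS_I(mapping->host), XFS_ITRUNCATED);

	ret = write_cache_pages(mapping, wbc, __xfs_vm_writepage, &ctx);

	/*
	 * Pages attached to ctx.iohead have already been marked as under
	 * writeback, so submit the ioend even if write_cache_pages()
	 * failed part way through; cancelling it here would leave those
	 * pages stuck in the writeback state forever.
	 */
	if (ctx.iohead)
		xfs_submit_ioend(wbc, ctx.iohead);

	return ret;
}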
Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs