Date: Tue, 15 May 2018 11:11:47 +1000
From: Dave Chinner <david@fromorbit.com>
To: Andreas Gruenbacher <agruenba@redhat.com>
Cc: cluster-devel@redhat.com, Christoph Hellwig, linux-fsdevel@vger.kernel.org, Dave Chinner
Subject: Re: [PATCH v4 06/11] iomap: Add write_{begin,end} iomap operations
Message-ID: <20180515011147.GF23861@dastard>
References: <20180514153624.29598-1-agruenba@redhat.com> <20180514153624.29598-7-agruenba@redhat.com>
In-Reply-To: <20180514153624.29598-7-agruenba@redhat.com>

On Mon, May 14, 2018 at 05:36:19PM +0200, Andreas Gruenbacher wrote:
> Add write_begin and write_end operations to struct iomap_ops to provide
> a way of overriding the default behavior of iomap_write_begin and
> iomap_write_end. This is needed for implementing data journaling: in
> the data journaling case, pages are written into the journal before
> being written back to their proper on-disk locations.

I'm not sure adding a per-page filesystem callout abstraction is what
we want in the iomap code. The commit message does not explain why
gfs2 needs per-page callouts, nor why the per-page work cannot be
done or avoided by running code in the ->iomap_begin callout (e.g. by
unstuffing the inode so nothing needs to be done per-page).

My main concern is that the callouts to the filesystem in iomap are -
by design - per IO, not per page. The whole point of using iomap was
to get away from doing per-page filesystem operations on multi-page
IOs - it's highly inefficient when we only need to call into the
filesystem on a per-extent basis, and we simplified the code a *lot*
by avoiding such behaviours.

And to that point: this change has serious conflicts with the next
step for the iomap infrastructure: Christoph's recent "remove
bufferheads from iomap" series. Apart from the obvious code
conflicts, there's a fairly major architectural/functional conflict.

https://marc.info/?l=linux-fsdevel&m=152585212000411&w=2

That is, Christoph's patchset pushes us further away from filesystems
doing their own per-page processing of page state. It converts the
iomap IO path to store its own per-page state tracking information in
page->private, greatly reducing memory overhead (16 bytes per 4k page
instead of 104 bytes per bufferhead) and streamlining the code.

This, however, essentially makes the buffered IO path through the
iomap infrastructure incompatible with existing bufferhead-based
filesystems. Despite this, we can still provide limited support for
buffered writes on bufferhead-based filesystems through the iomap
infrastructure, but I only see that as a mechanism for filesystems
moving away from dependencies on bufferheads. It would require a
different refactoring of the iomap code - I'd much prefer to keep
bufferhead-dependent IO paths completely separate (e.g. via a new
actor function) to minimise impact and dependencies on the internal
bufferhead-free iomap IO path....

Let's see what Christoph thinks first, though....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
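
For reference, a rough sketch of the interface being discussed
(approximating include/linux/iomap.h around v4.17; the
write_begin/write_end prototypes are inferred from the commit message
above, not quoted from the patch): the existing callouts are made once
per extent mapping, while the proposed hooks would be called for every
page the buffered write actor touches.

/*
 * Existing callouts: invoked once per extent mapping covering the IO
 * range, regardless of how many pages that mapping spans.
 */
struct iomap_ops {
	int (*iomap_begin)(struct inode *inode, loff_t pos, loff_t length,
			   unsigned flags, struct iomap *iomap);
	int (*iomap_end)(struct inode *inode, loff_t pos, loff_t length,
			 ssize_t written, unsigned flags,
			 struct iomap *iomap);

	/*
	 * Proposed hooks (prototypes approximated): invoked from the
	 * buffered write actor once per page, mirroring the internal
	 * iomap_write_begin()/iomap_write_end() helpers.
	 */
	int (*write_begin)(struct inode *inode, loff_t pos, unsigned len,
			   unsigned flags, struct page **pagep,
			   struct iomap *iomap);
	int (*write_end)(struct inode *inode, loff_t pos, unsigned len,
			 unsigned copied, struct page *page,
			 struct iomap *iomap);
};

With the existing interface, a large buffered write costs one
->iomap_begin/->iomap_end pair per extent; the proposed hooks would add
two filesystem calls for every page on top of that, which is the
per-page overhead the objection above is about.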