All of lore.kernel.org
 help / color / mirror / Atom feed
From: Anton Altaparmakov <aia21@cam.ac.uk>
To: Nick Piggin <npiggin@suse.de>
Cc: Christoph Hellwig <hch@infradead.org>,
	Linux Filesystems <linux-fsdevel@vger.kernel.org>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [patch 2/3] fs: introduce perform_write aop
Date: Fri, 9 Mar 2007 22:01:11 +0000	[thread overview]
Message-ID: <4079137B-C8FE-402B-9B75-478EC6A3A5A4@cam.ac.uk> (raw)
In-Reply-To: <20070309125242.GB15325@wotan.suse.de>


On 9 Mar 2007, at 12:52, Nick Piggin wrote:

> Hi Christoph,
>
> On Fri, Mar 09, 2007 at 10:39:13AM +0000, Christoph Hellwig wrote:
>> Hi Nick,
>>
>> sorry for my later reply, this has been on my to answer list for  
>> the last
>> month and I only managed to get back to it now.
>
> No worries, I haven't had much time to work on it since then anyway.
> Thanks for taking a look.
>
>> On Thu, Feb 08, 2007 at 02:07:36PM +0100, Nick Piggin wrote:
>>> as a single call to copy a given amount of userdata at the given  
>>> offset. This
>>> is more flexible, because the implementation can determine how to  
>>> best handle
>>> errors, or multi-page ranges (eg. it may use a gang lookup), and  
>>> only requires
>>> one call into the fs.
>>
>> I really like this idea, especially for avoiding to call into the  
>> allocator
>> for every block.

Indeed.  FWIW my NTFS driver does not use the generic file write  
helper function and instead has its own code that does the allocation  
first and then does the writing a page at a time much the same as  
generic file write does so depending on what this new interface ends  
up looking like exactly I may well be able to switch the NTFS driver  
to use it instead of doing everything by myself and duplicating a ton  
of code from the VFS...

Best regards,

	Anton

>> Have you contacted the reiser4 folks whether this would
>> superceed their batch_write op completely?
>
> I haven't yet, although that's been on my todo list when I get the API
> into a more final state.
>
> batch_write seems quite similar, however theirs is still page  
> based, and
> a bit crufty, IMO. I found it to be really clean to just pass down  
> offsets,
> but that may be a matter for debate.
>
> What they _do_ have is a write actor function that will do the data  
> copy.
> This could be one possible way to get rid of ->prepare_write and
> ->commit_write, but I haven't tried that yet, because I don't like  
> adding
> more redirection and complexity if possible...
>
>>> One problem with this interface is that it cannot be used to  
>>> write into the
>>> filesystem by any means other than already-initialised buffers  
>>> via iovecs. So
>>> prepare/commit have to stay around for non-user data...
>>
>> Actually I think that's a a good thing to a certain extent.  It  
>> reminds
>> us that all other users are horrible abuse of the interface.  I'd  
>> even
>> go so far as to make batch_write a callback that the filesystem  
>> passes
>> to generic_file_aio_write to make clear it's not a generic thing but
>> a helper.  (It's not a generic thing because it's the upper layer  
>> writing
>> into the pagecache, not a pagecache to fs below operation).
>
> OK, if you think that's reasonable, then that is one hurdle out of  
> the way ;)
>
>> The still leaves open on how to get rid of ->prepare_write and - 
>> >commit_write
>> compltely, and for that we'll probably need ->kernel_read and - 
>> >kernel_write
>> file operations.  But that's a step you shouldn't consider yet  
>> when doing
>> this work.
>
> I had a couple of possibilities for that. First is passing in a  
> write actor
> (eg. defaulting to the normal iovec usercopy), but as I said I  
> consider this
> more like fixing the problem with brute force (ie. just making the  
> interface
> more complex). Maybe as a last resort, though.
>
> Another thing that would be much nicer from _my_ point of view  
> would be to
> just make all kernel users set up their data in an iovec, and use  
> the normal
> call with KERNEL_DS. Unfortunately, this is not the expected way  
> for a lot
> of code to work, and it might require extra copying of the data.
>
>
>>> Another thing is that it seems to be less able to be implemented  
>>> in generic,
>>> reusable code. It should be possible to introduce a new 2-op  
>>> interface (or
>>> maybe just a new error handler op) which can be used correctly in  
>>> generic code.
>>
>> We should be able to find a nice abstraction for this, see my next  
>> mails.
>>
>>> +	/*
>>> +	 * perform_write replaces prepare and commit_write callbacks.
>>> +	 */
>>
>> This is a rather useless comment :)  Better remove it and add a  
>> proper
>> descriptions to Documentation/filesystems/vfs.txt and
>> Documentation/filesystems/Locking
>
> Will do. Thanks!

-- 
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer, http://www.linux-ntfs.org/



  reply	other threads:[~2007-03-09 22:01 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-02-08 13:07 [rfc][patch 0/3] a faster buffered write deadlock fix? Nick Piggin
2007-02-08 13:07 ` [patch 1/3] fs: add an iovec iterator Nick Piggin
2007-02-08 19:49   ` Christoph Hellwig
2007-02-09  1:46     ` Nick Piggin
2007-02-09  2:03       ` Nate Diller
2007-02-09  3:31         ` Nick Piggin
2007-02-09 17:28           ` Zach Brown
2007-03-09 10:40         ` Christoph Hellwig
2007-02-08 23:04   ` Mark Fasheh
2007-02-08 13:07 ` [patch 2/3] fs: introduce perform_write aop Nick Piggin
2007-03-09 10:39   ` Christoph Hellwig
2007-03-09 12:52     ` Nick Piggin
2007-03-09 22:01       ` Anton Altaparmakov [this message]
2007-03-09 23:33     ` Mark Fasheh
2007-03-10  9:25       ` Christoph Hellwig
2007-03-12  2:13         ` Mark Fasheh
2007-03-14 13:30         ` Nick Piggin
2007-03-14 15:17           ` Christoph Hellwig
2007-02-08 13:07 ` [patch 3/3] ext2: use " Nick Piggin
2007-02-08 14:47   ` Dmitriy Monakhov
2007-02-09 19:14   ` Andrew Morton
2007-02-09 19:45     ` Andrew Morton
2007-02-10  1:34       ` Nick Piggin
2007-02-10  1:50         ` Andrew Morton
2007-02-09  0:38 ` [rfc][patch 0/3] a faster buffered write deadlock fix? Mark Fasheh
2007-02-09  2:04   ` Nick Piggin
2007-02-09  8:41 ` Andrew Morton
2007-02-09  9:54   ` Nick Piggin
2007-02-09 10:09     ` Andrew Morton
2007-02-09 10:32       ` Nick Piggin
2007-02-09 10:52         ` Andrew Morton
2007-02-09 11:31           ` Nick Piggin
2007-02-09 11:46             ` Andrew Morton
2007-02-09 12:11               ` Nick Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4079137B-C8FE-402B-9B75-478EC6A3A5A4@cam.ac.uk \
    --to=aia21@cam.ac.uk \
    --cc=akpm@linux-foundation.org \
    --cc=hch@infradead.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=npiggin@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.