linux-kernel.vger.kernel.org archive mirror
From: Andreas Dilger <adilger@dilger.ca>
To: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>, Josef Bacik <josef@redhat.com>,
	linux-kernel@vger.kernel.org, linux-btrfs@vger.kernel.org,
	linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	xfs@oss.sgi.com, cmm@us.ibm.com, cluster-devel@redhat.com,
	ocfs2-devel@oss.oracle.com
Subject: Re: [PATCH 1/6] fs: add hole punching to fallocate
Date: Wed, 17 Nov 2010 03:19:49 -0600
Message-ID: <7CF30C2C-44CE-4DFC-BB8B-92A207E4052A@dilger.ca>
In-Reply-To: <20101117021150.GL22876@dastard>

On 2010-11-16, at 20:11, Dave Chinner wrote:
> On Tue, Nov 16, 2010 at 06:22:47PM -0600, Andreas Dilger wrote:
>> IMHO, it makes more sense for consistency and "get what users
>> expect" that these be treated as flags.  Some users will want
>> KEEP_SIZE, but in other cases it may make sense that a hole punch
>> at the end of a file should shrink the file (i.e. the opposite of
>> an append).
> 
> What's wrong with ftruncate() for this?

It makes the API usage from applications more consistent.  It would be inconvenient, for example, if applications had to use a different system call if they were writing in the middle of the file vs. at the end, wouldn't it?

Similarly, consider multiple threads appending and punching (let's assume non-overlapping regions, for sanity, as in a producer/consumer model that punches out completed records).  Using ftruncate() to remove the last record and shrink the file would then require locking the whole file from userspace (unlike the append, where the kernel does this), or risk discarding unprocessed data beyond the record that was punched out.
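
To make the race concrete, here is a toy model in C.  It is only a sketch: a file is reduced to its size, and file_model, model_append, model_truncate and model_punch are made-up names, not kernel or libc interfaces.  The punch variant assumes the size-changing semantics argued for in this message (shrink only when the punched range reaches EOF).

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: a file is just its size. This is not kernel code. */
struct file_model {
    uint64_t size;
};

/* Append len bytes at the current end of the file. */
static void model_append(struct file_model *f, uint64_t len)
{
    f->size += len;
}

/* ftruncate(): set the size unconditionally, dropping everything past it. */
static void model_truncate(struct file_model *f, uint64_t new_size)
{
    f->size = new_size;
}

/* Assumed size-changing punch: shrink only if the range reaches EOF. */
static void model_punch(struct file_model *f, uint64_t offset, uint64_t len)
{
    assert(len > 0);
    if (offset < f->size && len >= f->size - offset)
        f->size = offset;
}

/*
 * The consumer wants to retire the record at [80, 100), which was the last
 * record when it looked. A producer appends 20 bytes before the consumer's
 * call is made. Both functions return the final file size.
 */
static uint64_t race_with_truncate(void)
{
    struct file_model f = { .size = 100 };
    uint64_t record_start = 80;         /* decided while size was 100 */

    model_append(&f, 20);               /* producer gets in first: size 120 */
    model_truncate(&f, record_start);   /* the record at [100, 120) is lost */
    return f.size;
}

static uint64_t race_with_punch(void)
{
    struct file_model f = { .size = 100 };

    model_append(&f, 20);               /* producer gets in first: size 120 */
    model_punch(&f, 80, 20);            /* range no longer reaches EOF */
    return f.size;
}
```

In this model race_with_truncate() ends with a size of 80, so the freshly appended record is gone, while race_with_punch() leaves the size at 120 because the punched range stops short of the new EOF.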

> There's plenty of open questions about the interface if we allow
> hole punching to change the file size. e.g. where do we set the EOF
> (offset or offset+len)?

I would think it natural that the new size is the start of the region, like an "anti-write" (where write sets the size at the end of the added bytes).

>  What do we do with the rest of the blocks that are now beyond EOF?
> We weren't asked to punch them out, so do we leave them behind?

I definitely think they should be left as is.  If they were in the punched-out range, they would be deallocated, and if they are beyond EOF they will remain as they are - we didn't ask to remove them unless the punched-out range went to ~0ULL (which would make it equivalent to an ftruncate()).

> What if we are leaving written blocks beyond EOF - does any filesystem other than XFS support that (i.e. are we introducing different behaviour on different filesystems)?

I'm not sure I understand what a "written block beyond EOF" means.  How can there be data beyond EOF?  I think the KEEP_SIZE flag is only relevant if the punch spans EOF, like the opposite of a write that spans EOF.  If KEEP_SIZE is set, the size is left unchanged; if it is unset and the punch spans EOF, the file size is reduced.  If the punch is not at EOF it doesn't change the file size, just like a write that is not at EOF.
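
The size rule described so far can be written down as a small pure function.  This is a sketch of the proposal under discussion, not of any merged kernel behaviour; size_after_punch is a hypothetical name, and leaving the size alone when the offset is at or beyond EOF is an assumption, since that case is still an open question in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model: file size after punching [offset, offset + len)
 * out of a file that is currently `size` bytes long.
 */
static uint64_t size_after_punch(uint64_t size, uint64_t offset,
                                 uint64_t len, bool keep_size)
{
    assert(len > 0);

    /* KEEP_SIZE set: the size never changes. */
    if (keep_size)
        return size;

    /* Assumption: a punch starting at or beyond EOF leaves the size alone. */
    if (offset >= size)
        return size;

    /*
     * The range reaches or crosses EOF: the new size is the start of the
     * range, like an "anti-write". Written as a subtraction so that
     * offset + len cannot overflow.
     */
    if (len >= size - offset)
        return offset;

    /* Punch entirely inside the file: size unchanged. */
    return size;
}
```

For a 100-byte file, punching [10, 20) leaves the size at 100, punching [80, 120) gives 80 without KEEP_SIZE and 100 with it, and a punch from offset 50 with len = UINT64_MAX - 50 gives 50, the same size ftruncate(fd, 50) would produce.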

> And what happens if the offset is beyond EOF? Do we extend the file, and if so why wouldn't you just use ftruncate() instead?

Even if the effects were the same, it makes sense because applications may be using fallocate(PUNCH_HOLE) to punch out records, and having them special-case the use of ftruncate() to get certain semantics at the end of the file adds needless complexity.
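
As a rough illustration of the record-punching pattern, the helper below punches out one fixed-size record with a single fallocate() call.  It is a sketch under assumptions: punch_record and the fixed record_size layout are made up for the example, and PUNCH_HOLE above is shorthand for the FALLOC_FL_PUNCH_HOLE flag from <linux/falloc.h>.  The fallocate(2) man page for the interface as it was later released requires FALLOC_FL_PUNCH_HOLE to be combined with FALLOC_FL_KEEP_SIZE, so the sketch passes both and never changes the file size.

```c
#define _GNU_SOURCE             /* for fallocate() in <fcntl.h> */
#include <assert.h>
#include <fcntl.h>
#include <linux/falloc.h>       /* FALLOC_FL_PUNCH_HOLE, FALLOC_FL_KEEP_SIZE */
#include <sys/types.h>

/*
 * Hypothetical helper: deallocate record number `index` in a file made of
 * fixed-size records. Returns 0 on success, or -1 with errno set (for
 * example EOPNOTSUPP if the filesystem cannot punch holes).
 */
static int punch_record(int fd, off_t index, off_t record_size)
{
    assert(index >= 0 && record_size > 0);

    /* The same call is used for a record in the middle or at the end. */
    return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                     index * record_size, record_size);
}
```

A call such as punch_record(fd, 3, 4096) asks the filesystem to deallocate bytes [12288, 16384); on success, reads from that range return zeros.  The caller does not need a separate ftruncate() path for the last record, which is the consistency argument made above.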

Cheers, Andreas





