All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Mike Snitzer <snitzer@redhat.com>
Cc: Brian Foster <bfoster@redhat.com>,
	xfs@oss.sgi.com, linux-block@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, dm-devel@redhat.com
Subject: Re: [RFC PATCH] block: wire blkdev_fallocate() to block_device_operations' reserve_space
Date: Tue, 12 Apr 2016 17:12:43 -0700	[thread overview]
Message-ID: <20160413001243.GA25280@birch.djwong.org> (raw)
In-Reply-To: <20160412210426.GA1845@redhat.com>

On Tue, Apr 12, 2016 at 05:04:27PM -0400, Mike Snitzer wrote:
> On Tue, Apr 12 2016 at  4:39pm -0400,
> Darrick J. Wong <darrick.wong@oracle.com> wrote:
> 
> > On Tue, Apr 12, 2016 at 04:04:59PM -0400, Mike Snitzer wrote:
> > > diff --git a/fs/block_dev.c b/fs/block_dev.c
> > > index 5a2c3ab..b34c07b 100644
> > > --- a/fs/block_dev.c
> > > +++ b/fs/block_dev.c
> > > @@ -1801,17 +1801,13 @@ long blkdev_fallocate(struct file *file, int mode, loff_t start, loff_t len)
> > >  	struct request_queue *q = bdev_get_queue(bdev);
> > >  	struct address_space *mapping;
> > >  	loff_t end = start + len - 1;
> > > -	loff_t bs_mask, isize;
> > > +	loff_t isize;
> > >  	int error;
> > >  
> > >  	/* We only support zero range and punch hole. */
> > >  	if (mode & ~BLKDEV_FALLOC_FL_SUPPORTED)
> > >  		return -EOPNOTSUPP;
> > >  
> > > -	/* We haven't a primitive for "ensure space exists" right now. */
> > > -	if (!(mode & ~FALLOC_FL_KEEP_SIZE))
> > > -		return -EOPNOTSUPP;
> > > -
> > >  	/* Only punch if the device can do zeroing discard. */
> > >  	if ((mode & FALLOC_FL_PUNCH_HOLE) &&
> > >  	    (!blk_queue_discard(q) || !q->limits.discard_zeroes_data))
> > > @@ -1829,9 +1825,12 @@ long blkdev_fallocate(struct file *file, int mode, loff_t start, loff_t len)
> > >  			return -EINVAL;
> > >  	}
> > >  
> > > -	/* Don't allow IO that isn't aligned to logical block size */
> > > -	bs_mask = bdev_logical_block_size(bdev) - 1;
> > > -	if ((start | len) & bs_mask)
> > > +	/*
> > > +	 * Don't allow IO that isn't aligned to minimum IO size (io_min)
> > > +	 * - for normal device's io_min is usually logical block size
> > > +	 * - but for more exotic devices (e.g. DM thinp) it may be larger
> > > +	 */
> > > +	if ((start | len) % bdev_io_min(bdev))

I started by noticing the 64-bit division.  However, in researching alignment
requirements for fallocate, I noticed that nothing says that we can return
-EINVAL for unaligned offset/len for allocate or punch.  For file allocations
ext4 and xfs simply enlarge the range so that the ends are aligned to the
logical block size; for punch they both shrink the range to deallocate until
the ends are aligned, and write zeroes to the partial blocks.

At least for user-visible fallocate we should do likewise, but for the internal
blkdev_ helpers I think it makes more sense to check lbs alignment and let the
lower level driver reject the IO if min_io alignment is a hard requirement.
Documentation/block/queue-sysfs.txt says that the min_io is the smallest
/preferred/ size.

But, before that, I'll push out some new fallocate patches for -rc3.

> > >  		return -EINVAL;
> > 
> > Noted.  Will update the original patch.
> 
> BTW, I just noticed your "block: require write_same and discard requests
> align to logical block size" -- doesn't look right.

What happens if we pass a request to thinp that isn't aligned to
minimum_io_size?  Does it reject the command?

> But maybe I'm just too hyper-focused on DM thinp's needs (which would
> much prefer these checks be done in terms of minimum_io_size, rather
> than logical_block_size, and _not_ assuming power-of-2 math will work).
> 
> But at least for discard: your lbs-based check is fine; since we have
> discard_granularity to cover thinp's more specific requirements.

--D

WARNING: multiple messages have this Message-ID (diff)
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Mike Snitzer <snitzer@redhat.com>
Cc: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Brian Foster <bfoster@redhat.com>,
	dm-devel@redhat.com, xfs@oss.sgi.com
Subject: Re: [RFC PATCH] block: wire blkdev_fallocate() to block_device_operations' reserve_space
Date: Tue, 12 Apr 2016 17:12:43 -0700	[thread overview]
Message-ID: <20160413001243.GA25280@birch.djwong.org> (raw)
In-Reply-To: <20160412210426.GA1845@redhat.com>

On Tue, Apr 12, 2016 at 05:04:27PM -0400, Mike Snitzer wrote:
> On Tue, Apr 12 2016 at  4:39pm -0400,
> Darrick J. Wong <darrick.wong@oracle.com> wrote:
> 
> > On Tue, Apr 12, 2016 at 04:04:59PM -0400, Mike Snitzer wrote:
> > > diff --git a/fs/block_dev.c b/fs/block_dev.c
> > > index 5a2c3ab..b34c07b 100644
> > > --- a/fs/block_dev.c
> > > +++ b/fs/block_dev.c
> > > @@ -1801,17 +1801,13 @@ long blkdev_fallocate(struct file *file, int mode, loff_t start, loff_t len)
> > >  	struct request_queue *q = bdev_get_queue(bdev);
> > >  	struct address_space *mapping;
> > >  	loff_t end = start + len - 1;
> > > -	loff_t bs_mask, isize;
> > > +	loff_t isize;
> > >  	int error;
> > >  
> > >  	/* We only support zero range and punch hole. */
> > >  	if (mode & ~BLKDEV_FALLOC_FL_SUPPORTED)
> > >  		return -EOPNOTSUPP;
> > >  
> > > -	/* We haven't a primitive for "ensure space exists" right now. */
> > > -	if (!(mode & ~FALLOC_FL_KEEP_SIZE))
> > > -		return -EOPNOTSUPP;
> > > -
> > >  	/* Only punch if the device can do zeroing discard. */
> > >  	if ((mode & FALLOC_FL_PUNCH_HOLE) &&
> > >  	    (!blk_queue_discard(q) || !q->limits.discard_zeroes_data))
> > > @@ -1829,9 +1825,12 @@ long blkdev_fallocate(struct file *file, int mode, loff_t start, loff_t len)
> > >  			return -EINVAL;
> > >  	}
> > >  
> > > -	/* Don't allow IO that isn't aligned to logical block size */
> > > -	bs_mask = bdev_logical_block_size(bdev) - 1;
> > > -	if ((start | len) & bs_mask)
> > > +	/*
> > > +	 * Don't allow IO that isn't aligned to minimum IO size (io_min)
> > > +	 * - for normal device's io_min is usually logical block size
> > > +	 * - but for more exotic devices (e.g. DM thinp) it may be larger
> > > +	 */
> > > +	if ((start | len) % bdev_io_min(bdev))

I started by noticing the 64-bit division.  However, in researching alignment
requirements for fallocate, I noticed that nothing says that we can return
-EINVAL for unaligned offset/len for allocate or punch.  For file allocations
ext4 and xfs simply enlarge the range so that the ends are aligned to the
logical block size; for punch they both shrink the range to deallocate until
the ends are aligned, and write zeroes to the partial blocks.

At least for user-visible fallocate we should do likewise, but for the internal
blkdev_ helpers I think it makes more sense to check lbs alignment and let the
lower level driver reject the IO if min_io alignment is a hard requirement.
Documentation/block/queue-sysfs.txt says that the min_io is the smallest
/preferred/ size.

But, before that, I'll push out some new fallocate patches for -rc3.

> > >  		return -EINVAL;
> > 
> > Noted.  Will update the original patch.
> 
> BTW, I just noticed your "block: require write_same and discard requests
> align to logical block size" -- doesn't look right.

What happens if we pass a request to thinp that isn't aligned to
minimum_io_size?  Does it reject the command?

> But maybe I'm just too hyper-focused on DM thinp's needs (which would
> much prefer these checks be done in terms of minimum_io_size, rather
> than logical_block_size, and _not_ assuming power-of-2 math will work).
> 
> But at least for discard: your lbs-based check is fine; since we have
> discard_granularity to cover thinp's more specific requirements.

--D

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  reply	other threads:[~2016-04-13  0:12 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-12 16:42 [RFC v2 PATCH 00/10] dm-thin/xfs: prototype a block reservation allocation model Brian Foster
2016-04-12 16:42 ` Brian Foster
2016-04-12 16:42 ` [RFC v2 PATCH 01/10] xfs: refactor xfs_reserve_blocks() to handle ENOSPC correctly Brian Foster
2016-04-12 16:42   ` Brian Foster
2016-04-12 16:42 ` [RFC v2 PATCH 02/10] xfs: replace xfs_mod_fdblocks() bool param with flags Brian Foster
2016-04-12 16:42   ` Brian Foster
2016-04-12 16:42 ` [RFC v2 PATCH 03/10] block: add block_device_operations methods to set and get reserved space Brian Foster
2016-04-12 16:42   ` Brian Foster
2016-04-14  0:32   ` Dave Chinner
2016-04-14  0:32     ` Dave Chinner
2016-04-12 16:42 ` [RFC v2 PATCH 04/10] dm: add " Brian Foster
2016-04-12 16:42   ` Brian Foster
2016-04-12 16:42 ` [RFC v2 PATCH 05/10] dm thin: " Brian Foster
2016-04-12 16:42   ` Brian Foster
2016-04-13 17:44   ` Darrick J. Wong
2016-04-13 17:44     ` Darrick J. Wong
2016-04-13 18:33     ` Brian Foster
2016-04-13 18:33       ` Brian Foster
2016-04-13 20:41       ` Brian Foster
2016-04-13 20:41         ` Brian Foster
2016-04-13 21:01         ` Darrick J. Wong
2016-04-13 21:01           ` Darrick J. Wong
2016-04-14 15:10         ` Mike Snitzer
2016-04-14 15:10           ` Mike Snitzer
2016-04-14 16:23           ` Brian Foster
2016-04-14 16:23             ` Brian Foster
2016-04-14 20:18             ` Mike Snitzer
2016-04-14 20:18               ` Mike Snitzer
2016-04-15 11:48               ` Brian Foster
2016-04-15 11:48                 ` Brian Foster
2016-04-12 16:42 ` [RFC v2 PATCH 06/10] xfs: thin block device reservation mechanism Brian Foster
2016-04-12 16:42   ` Brian Foster
2016-04-12 16:42 ` [RFC v2 PATCH 07/10] xfs: adopt a reserved allocation model on dm-thin devices Brian Foster
2016-04-12 16:42   ` Brian Foster
2016-04-12 16:42 ` [RFC v2 PATCH 08/10] xfs: handle bdev reservation ENOSPC correctly from XFS reserved pool Brian Foster
2016-04-12 16:42   ` Brian Foster
2016-04-12 16:42 ` [RFC v2 PATCH 09/10] xfs: support no block reservation transaction mode Brian Foster
2016-04-12 16:42   ` Brian Foster
2016-04-12 16:42 ` [RFC v2 PATCH 10/10] xfs: use contiguous bdev reservation for file preallocation Brian Foster
2016-04-12 16:42   ` Brian Foster
2016-04-12 20:04 ` [RFC PATCH] block: wire blkdev_fallocate() to block_device_operations' reserve_space Mike Snitzer
2016-04-12 20:04   ` Mike Snitzer
2016-04-12 20:39   ` Darrick J. Wong
2016-04-12 20:39     ` Darrick J. Wong
2016-04-12 20:46     ` Mike Snitzer
2016-04-12 20:46       ` Mike Snitzer
2016-04-12 22:25       ` Darrick J. Wong
2016-04-12 22:25         ` Darrick J. Wong
2016-04-12 21:04     ` Mike Snitzer
2016-04-12 21:04       ` Mike Snitzer
2016-04-13  0:12       ` Darrick J. Wong [this message]
2016-04-13  0:12         ` Darrick J. Wong
2016-04-14 15:18         ` Mike Snitzer
2016-04-14 15:18           ` Mike Snitzer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160413001243.GA25280@birch.djwong.org \
    --to=darrick.wong@oracle.com \
    --cc=bfoster@redhat.com \
    --cc=dm-devel@redhat.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=snitzer@redhat.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.