All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] nilfs2: add support for FITRIM ioctl
@ 2014-02-17 22:39 Andreas Rohner
       [not found] ` <cover.1392676472.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Andreas Rohner @ 2014-02-17 22:39 UTC (permalink / raw)
  To: linux-nilfs-u79uwXL29TY76Z2rM5mHXA; +Cc: Andreas Rohner

Hi,

This patch adds support for the FITRIM ioctl, which allows user space 
tools like fstrim to issue TRIM/DISCARD requests to the underlying 
device. It takes a fstrim_range structure as a parameter and for every 
clean segment in the specified range the function blkdev_issue_discard 
is called. The range is truncated to sector boundaries.

Best regards,
Andreas Rohner

---
v1->v2 (based on review by Ryusuke Konishi)
 * Remove upper limit of minlen
 * Add check for minlen
 * Round range to sector boundary instead of segment boundary
 * Fix minor bug
 * Use kmap_atomic instead of kmap
 * Move input checks to ioctl.c
 * Use nilfs_sufile_segment_usages_in_block()
--
Andreas Rohner (2):
  nilfs2: add nilfs_sufile_trim_fs to trim clean segs
  nilfs2: add FITRIM ioctl support for nilfs2

 fs/nilfs2/ioctl.c  |  52 +++++++++++++++++++
 fs/nilfs2/sufile.c | 144 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/nilfs2/sufile.h |   1 +
 3 files changed, 197 insertions(+)

-- 
1.9.0

--
To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/2] nilfs2: add nilfs_sufile_trim_fs to trim clean segs
       [not found] ` <cover.1392676472.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
@ 2014-02-17 22:39   ` Andreas Rohner
       [not found]     ` <0ba40f9a0504b2a228e0714032f82556e03b927e.1392676472.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
  2014-02-17 22:39   ` [PATCH 2/2] nilfs2: add FITRIM ioctl support for nilfs2 Andreas Rohner
  1 sibling, 1 reply; 7+ messages in thread
From: Andreas Rohner @ 2014-02-17 22:39 UTC (permalink / raw)
  To: linux-nilfs-u79uwXL29TY76Z2rM5mHXA; +Cc: Andreas Rohner

This patch adds the nilfs_sufile_trim_fs function, which takes a
fstrim_range structure and calls blkdev_issue_discard for every
clean segment in the specified range. The range is truncated to sector
boundaries.

Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
---
 fs/nilfs2/sufile.c | 144 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/nilfs2/sufile.h |   1 +
 2 files changed, 145 insertions(+)

diff --git a/fs/nilfs2/sufile.c b/fs/nilfs2/sufile.c
index 3127e9f..3605cc9 100644
--- a/fs/nilfs2/sufile.c
+++ b/fs/nilfs2/sufile.c
@@ -870,6 +870,150 @@ ssize_t nilfs_sufile_get_suinfo(struct inode *sufile, __u64 segnum, void *buf,
 }
 
 /**
+ * nilfs_sufile_trim_fs() - trim ioctl handle function
+ * @sufile: inode of segment usage file
+ * @range: fstrim_range structure
+ *
+ * start:	First Byte to trim
+ * len:		number of Bytes to trim from start
+ * minlen:	minimum extent length in Bytes
+ *
+ * Decription: nilfs_sufile_trim_fs goes through all segments containing bytes
+ * from start to start+len. start is rounded up to the next sector boundary
+ * and start+len is rounded down. For each clean segment blkdev_issue_discard
+ * function is invoked to trim it.
+ *
+ * Return Value: On success, 0 is returned or negative error code, otherwise.
+ */
+int nilfs_sufile_trim_fs(struct inode *sufile, struct fstrim_range *range)
+{
+	struct the_nilfs *nilfs = sufile->i_sb->s_fs_info;
+	struct buffer_head *su_bh;
+	struct nilfs_segment_usage *su;
+	void *kaddr;
+	size_t n, i, susz = NILFS_MDT(sufile)->mi_entry_size;
+	sector_t seg_start, seg_end, real_start, real_end,
+				start = 0, nblocks = 0;
+	u64 segnum, end, minlen, trimmed = 0;
+	int ret = 0;
+	unsigned int sect_size, sects_per_block;
+
+	sect_size = bdev_logical_block_size(nilfs->ns_bdev);
+	sects_per_block = (1 << nilfs->ns_blocksize_bits) / sect_size;
+	real_start = (range->start + sect_size - 1) / sect_size;
+	real_end = (range->start + range->len) / sect_size;
+	segnum = nilfs_get_segnum_of_block(nilfs, real_start / sects_per_block);
+	end = nilfs_get_segnum_of_block(nilfs, ((real_end + sects_per_block - 1)
+			/ sects_per_block) + nilfs->ns_blocks_per_segment - 1);
+	minlen = range->minlen / sect_size;
+
+
+	if (end > nilfs->ns_nsegments)
+		end = nilfs->ns_nsegments;
+	if (segnum >= nilfs->ns_nsegments || end <= segnum)
+		goto out;
+
+	down_read(&NILFS_MDT(sufile)->mi_sem);
+
+	while (segnum < end) {
+		n = nilfs_sufile_segment_usages_in_block(sufile, segnum,
+				end - 1);
+
+		ret = nilfs_sufile_get_segment_usage_block(sufile, segnum, 0,
+							   &su_bh);
+		if (ret < 0) {
+			if (ret != -ENOENT)
+				goto out_sem;
+			/* hole */
+			segnum += n;
+			continue;
+		}
+
+		kaddr = kmap_atomic(su_bh->b_page);
+		su = nilfs_sufile_block_get_segment_usage(sufile, segnum,
+				su_bh, kaddr);
+		for (i = 0; i < n; ++i, ++segnum, su = (void *)su + susz) {
+			if (!nilfs_segment_usage_clean(su))
+				continue;
+
+			nilfs_get_segment_range(nilfs, segnum, &seg_start,
+						&seg_end);
+
+			if (!nblocks) {
+				/* start new extent */
+				start = seg_start;
+				nblocks = seg_end - seg_start + 1;
+				continue;
+			}
+
+			if (start + nblocks == seg_start) {
+				/* add to previous extent */
+				nblocks += seg_end - seg_start + 1;
+				continue;
+			}
+
+			/* discard previous extent */
+			start *= sects_per_block;
+			nblocks *= sects_per_block;
+			if (start < real_start) {
+				nblocks -= real_start - start;
+				start = real_start;
+			}
+			if (start + nblocks > real_end)
+				nblocks = real_end - start;
+
+			if (nblocks >= minlen) {
+				kunmap_atomic(kaddr);
+
+				ret = blkdev_issue_discard(nilfs->ns_bdev,
+						start, nblocks, GFP_NOFS, 0);
+				if (ret < 0) {
+					put_bh(su_bh);
+					goto out_sem;
+				}
+
+				trimmed += nblocks;
+				kaddr = kmap_atomic(su_bh->b_page);
+				su = nilfs_sufile_block_get_segment_usage(
+					sufile, segnum, su_bh, kaddr);
+			}
+
+			/* start new extent */
+			start = seg_start;
+			nblocks = seg_end - seg_start + 1;
+		}
+		kunmap_atomic(kaddr);
+		put_bh(su_bh);
+	}
+
+
+	if (nblocks) {
+		/* discard last extent */
+		start *= sects_per_block;
+		nblocks *= sects_per_block;
+		if (start < real_start) {
+			nblocks -= real_start - start;
+			start = real_start;
+		}
+		if (start + nblocks > real_end)
+			nblocks = real_end - start;
+
+		if (nblocks >= minlen) {
+			ret = blkdev_issue_discard(nilfs->ns_bdev, start,
+					nblocks, GFP_NOFS, 0);
+			if (!ret)
+				trimmed += nblocks;
+		}
+	}
+
+out_sem:
+	up_read(&NILFS_MDT(sufile)->mi_sem);
+out:
+	range->len = trimmed * sect_size;
+	return ret;
+}
+
+/**
  * nilfs_sufile_read - read or get sufile inode
  * @sb: super block instance
  * @susize: size of a segment usage entry
diff --git a/fs/nilfs2/sufile.h b/fs/nilfs2/sufile.h
index e84bc5b..2434abd 100644
--- a/fs/nilfs2/sufile.h
+++ b/fs/nilfs2/sufile.h
@@ -65,6 +65,7 @@ void nilfs_sufile_do_set_error(struct inode *, __u64, struct buffer_head *,
 int nilfs_sufile_resize(struct inode *sufile, __u64 newnsegs);
 int nilfs_sufile_read(struct super_block *sb, size_t susize,
 		      struct nilfs_inode *raw_inode, struct inode **inodep);
+int nilfs_sufile_trim_fs(struct inode *sufile, struct fstrim_range *range);
 
 /**
  * nilfs_sufile_scrap - make a segment garbage
-- 
1.9.0

--
To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/2] nilfs2: add FITRIM ioctl support for nilfs2
       [not found] ` <cover.1392676472.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
  2014-02-17 22:39   ` [PATCH 1/2] nilfs2: add nilfs_sufile_trim_fs to trim clean segs Andreas Rohner
@ 2014-02-17 22:39   ` Andreas Rohner
  1 sibling, 0 replies; 7+ messages in thread
From: Andreas Rohner @ 2014-02-17 22:39 UTC (permalink / raw)
  To: linux-nilfs-u79uwXL29TY76Z2rM5mHXA; +Cc: Andreas Rohner

This patch adds support for the FITRIM ioctl, which enables user space
tools to issue TRIM/DISCARD requests to the underlying device. Every
clean segment within the specified range will be discarded.

Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
---
 fs/nilfs2/ioctl.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)

diff --git a/fs/nilfs2/ioctl.c b/fs/nilfs2/ioctl.c
index 2b34021..cf21dbd 100644
--- a/fs/nilfs2/ioctl.c
+++ b/fs/nilfs2/ioctl.c
@@ -1072,6 +1072,55 @@ out:
 }
 
 /**
+ * nilfs_ioctl_trim_fs() - trim ioctl handle function
+ * @inode: inode object
+ * @argp: pointer on argument from userspace
+ *
+ * Decription: nilfs_ioctl_trim_fs is the FITRIM ioctl handle function. It
+ * checks the arguments from userspace and calls nilfs_sufile_trim_fs, which
+ * performs the actual trim operation.
+ *
+ * Return Value: On success, 0 is returned or negative error code, otherwise.
+ */
+static int nilfs_ioctl_trim_fs(struct inode *inode, void __user *argp)
+{
+	struct the_nilfs *nilfs = inode->i_sb->s_fs_info;
+	struct request_queue *q = bdev_get_queue(nilfs->ns_bdev);
+	struct fstrim_range range;
+	int ret;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	if (!blk_queue_discard(q))
+		return -EOPNOTSUPP;
+
+	if (copy_from_user(&range, argp, sizeof(range)))
+		return -EFAULT;
+
+	range.minlen = max_t(unsigned int, (unsigned int)range.minlen,
+			q->limits.discard_granularity);
+
+	if (range.len < nilfs->ns_blocksize ||
+		range.start >= (nilfs->ns_nsegments *
+			nilfs->ns_blocks_per_segment) <<
+			nilfs->ns_blocksize_bits)
+		return -EINVAL;
+
+	down_read(&nilfs->ns_segctor_sem);
+	ret = nilfs_sufile_trim_fs(nilfs->ns_sufile, &range);
+	up_read(&nilfs->ns_segctor_sem);
+
+	if (ret < 0)
+		return ret;
+
+	if (copy_to_user(argp, &range, sizeof(range)))
+		return -EFAULT;
+
+	return 0;
+}
+
+/**
  * nilfs_ioctl_set_alloc_range - limit range of segments to be allocated
  * @inode: inode object
  * @argp: pointer on argument from userspace
@@ -1205,6 +1254,8 @@ long nilfs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
 		return nilfs_ioctl_resize(inode, filp, argp);
 	case NILFS_IOCTL_SET_ALLOC_RANGE:
 		return nilfs_ioctl_set_alloc_range(inode, argp);
+	case FITRIM:
+		return nilfs_ioctl_trim_fs(inode, argp);
 	default:
 		return -ENOTTY;
 	}
@@ -1235,6 +1286,7 @@ long nilfs_compat_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
 	case NILFS_IOCTL_SYNC:
 	case NILFS_IOCTL_RESIZE:
 	case NILFS_IOCTL_SET_ALLOC_RANGE:
+	case FITRIM:
 		break;
 	default:
 		return -ENOIOCTLCMD;
-- 
1.9.0

--
To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] nilfs2: add nilfs_sufile_trim_fs to trim clean segs
       [not found]     ` <0ba40f9a0504b2a228e0714032f82556e03b927e.1392676472.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
@ 2014-02-18 18:37       ` Ryusuke Konishi
       [not found]         ` <20140219.033733.321498615.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Ryusuke Konishi @ 2014-02-18 18:37 UTC (permalink / raw)
  To: Andreas Rohner; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA

On Mon, 17 Feb 2014 23:39:51 +0100, Andreas Rohner wrote:
> This patch adds the nilfs_sufile_trim_fs function, which takes a
> fstrim_range structure and calls blkdev_issue_discard for every
> clean segment in the specified range. The range is truncated to sector
> boundaries.
> 
> Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
> ---
>  fs/nilfs2/sufile.c | 144 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>  fs/nilfs2/sufile.h |   1 +
>  2 files changed, 145 insertions(+)
> 
> diff --git a/fs/nilfs2/sufile.c b/fs/nilfs2/sufile.c
> index 3127e9f..3605cc9 100644
> --- a/fs/nilfs2/sufile.c
> +++ b/fs/nilfs2/sufile.c
> @@ -870,6 +870,150 @@ ssize_t nilfs_sufile_get_suinfo(struct inode *sufile, __u64 segnum, void *buf,
>  }
>  
>  /**
> + * nilfs_sufile_trim_fs() - trim ioctl handle function
> + * @sufile: inode of segment usage file
> + * @range: fstrim_range structure
> + *
> + * start:	First Byte to trim
> + * len:		number of Bytes to trim from start
> + * minlen:	minimum extent length in Bytes
> + *
> + * Decription: nilfs_sufile_trim_fs goes through all segments containing bytes
> + * from start to start+len. start is rounded up to the next sector boundary
> + * and start+len is rounded down. For each clean segment blkdev_issue_discard
> + * function is invoked to trim it.
> + *
> + * Return Value: On success, 0 is returned or negative error code, otherwise.
> + */
> +int nilfs_sufile_trim_fs(struct inode *sufile, struct fstrim_range *range)
> +{
> +	struct the_nilfs *nilfs = sufile->i_sb->s_fs_info;
> +	struct buffer_head *su_bh;
> +	struct nilfs_segment_usage *su;
> +	void *kaddr;
> +	size_t n, i, susz = NILFS_MDT(sufile)->mi_entry_size;
> +	sector_t seg_start, seg_end, real_start, real_end,
> +				start = 0, nblocks = 0;
> +	u64 segnum, end, minlen, trimmed = 0;
> +	int ret = 0;
> +	unsigned int sect_size, sects_per_block;
> +
> +	sect_size = bdev_logical_block_size(nilfs->ns_bdev);
> +	sects_per_block = (1 << nilfs->ns_blocksize_bits) / sect_size;

> +	real_start = (range->start + sect_size - 1) / sect_size;
> +	real_end = (range->start + range->len) / sect_size;

Why not use start_sect, end_sect instead of real_start, real_end?
real_{start,end} are not intuitive to me.

We need to use do_div() for these divisions, and DIV_ROUND_UP_ULL()
macro should be applied to round up the start sector.

Moreover, we have to avoid overflow in "range->start + range->len".
Actually, range->len is usually set to UULONG_MAX.

So, these will be as follows:

	u64 len = range->len;

	...

	do_div(len, sect_size);
	if (!len)
		goto out;

	...
	start_sect = DIV_ROUND_UP_ULL(range->start, sect_size);
	end_sect = start_sect + len - 1;  /* this end_sect is inclusive */

Note that, we also should care about large range->start to avoid
overflow in substitution to start_sect (sector_t) since sector_t may
be 32-bit.  We should check it before the division.

Here, I recant my earlier comment.  We should do the following check
in this function to clarify that the overflow issue is handled
properly.

	u64 max_byte =
		((u64)nilfs->ns_nsegments * nilfs->ns_blocks_per_segments)
			<< nilfs->ns_blocksize_bits;

	...
	if (range.len < nilfs->ns_blocksize || range.start >= max_byte)
		return -EINVAL;
	...
	(divisions)

> +	segnum = nilfs_get_segnum_of_block(nilfs, real_start / sects_per_block);
> +	end = nilfs_get_segnum_of_block(nilfs, ((real_end + sects_per_block - 1)
> +			/ sects_per_block) + nilfs->ns_blocks_per_segment - 1);

It would be better to use the following intermediate variables to
improve readability of these calculations.

And, these calculations need sector_div() and DIV_ROUND_UP_SECTOR_T()
macro:

	start_block = start_sect;
	sector_div(start_block, sects_per_block);

	end_block = DIV_ROUND_UP_SECTOR_T(end_sect, sects_per_block);

	segnum = nilfs_get_segnum_of_block(nilfs, start_block);
	end = nilfs_get_segnum_of_block(
			nilfs, end_block + nilfs->ns_blocks_per_segment - 1);

> +	minlen = range->minlen / sect_size;

And, this one needs do_div():

	minlen = range->minlen;
	do_div(minlen, sect_size);

> +
> +
> +	if (end > nilfs->ns_nsegments)
> +		end = nilfs->ns_nsegments;
> +	if (segnum >= nilfs->ns_nsegments || end <= segnum)
> +		goto out;
> +
> +	down_read(&NILFS_MDT(sufile)->mi_sem);
> +
> +	while (segnum < end) {
> +		n = nilfs_sufile_segment_usages_in_block(sufile, segnum,
> +				end - 1);
> +
> +		ret = nilfs_sufile_get_segment_usage_block(sufile, segnum, 0,
> +							   &su_bh);
> +		if (ret < 0) {
> +			if (ret != -ENOENT)
> +				goto out_sem;
> +			/* hole */
> +			segnum += n;
> +			continue;
> +		}
> +
> +		kaddr = kmap_atomic(su_bh->b_page);
> +		su = nilfs_sufile_block_get_segment_usage(sufile, segnum,
> +				su_bh, kaddr);
> +		for (i = 0; i < n; ++i, ++segnum, su = (void *)su + susz) {
> +			if (!nilfs_segment_usage_clean(su))
> +				continue;
> +
> +			nilfs_get_segment_range(nilfs, segnum, &seg_start,
> +						&seg_end);
> +
> +			if (!nblocks) {
> +				/* start new extent */
> +				start = seg_start;
> +				nblocks = seg_end - seg_start + 1;
> +				continue;
> +			}
> +
> +			if (start + nblocks == seg_start) {
> +				/* add to previous extent */
> +				nblocks += seg_end - seg_start + 1;
> +				continue;
> +			}
> +
> +			/* discard previous extent */
> +			start *= sects_per_block;
> +			nblocks *= sects_per_block;
> +			if (start < real_start) {
> +				nblocks -= real_start - start;
> +				start = real_start;
> +			}

> +			if (start + nblocks > real_end)
> +				nblocks = real_end - start;

Why do you need this adjustment during discarding "previous" extent ?

> +			if (nblocks >= minlen) {
> +				kunmap_atomic(kaddr);
> +
> +				ret = blkdev_issue_discard(nilfs->ns_bdev,
> +						start, nblocks, GFP_NOFS, 0);
> +				if (ret < 0) {
> +					put_bh(su_bh);
> +					goto out_sem;
> +				}
> +
> +				trimmed += nblocks;
> +				kaddr = kmap_atomic(su_bh->b_page);
> +				su = nilfs_sufile_block_get_segment_usage(
> +					sufile, segnum, su_bh, kaddr);
> +			}
> +
> +			/* start new extent */
> +			start = seg_start;
> +			nblocks = seg_end - seg_start + 1;
> +		}
> +		kunmap_atomic(kaddr);
> +		put_bh(su_bh);
> +	}
> +
> +
> +	if (nblocks) {
> +		/* discard last extent */
> +		start *= sects_per_block;
> +		nblocks *= sects_per_block;
> +		if (start < real_start) {
> +			nblocks -= real_start - start;
> +			start = real_start;
> +		}
> +		if (start + nblocks > real_end)
> +			nblocks = real_end - start;
> +
> +		if (nblocks >= minlen) {
> +			ret = blkdev_issue_discard(nilfs->ns_bdev, start,
> +					nblocks, GFP_NOFS, 0);
> +			if (!ret)
> +				trimmed += nblocks;
> +		}
> +	}
> +
> +out_sem:
> +	up_read(&NILFS_MDT(sufile)->mi_sem);
> +out:
> +	range->len = trimmed * sect_size;
> +	return ret;
> +}
> +
> +/**
>   * nilfs_sufile_read - read or get sufile inode
>   * @sb: super block instance
>   * @susize: size of a segment usage entry
> diff --git a/fs/nilfs2/sufile.h b/fs/nilfs2/sufile.h
> index e84bc5b..2434abd 100644
> --- a/fs/nilfs2/sufile.h
> +++ b/fs/nilfs2/sufile.h
> @@ -65,6 +65,7 @@ void nilfs_sufile_do_set_error(struct inode *, __u64, struct buffer_head *,
>  int nilfs_sufile_resize(struct inode *sufile, __u64 newnsegs);
>  int nilfs_sufile_read(struct super_block *sb, size_t susize,
>  		      struct nilfs_inode *raw_inode, struct inode **inodep);
> +int nilfs_sufile_trim_fs(struct inode *sufile, struct fstrim_range *range);
>  
>  /**
>   * nilfs_sufile_scrap - make a segment garbage
> -- 
> 1.9.0

Please try to compile this patch both for 32-bit kernel and 64-bit
kernel to test if the patch is architecture independent.


Regards,
Ryusuke Konishi
--
To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] nilfs2: add nilfs_sufile_trim_fs to trim clean segs
       [not found]         ` <20140219.033733.321498615.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
@ 2014-02-19  6:36           ` Andreas Rohner
       [not found]             ` <5304510A.8080006-hi6Y0CQ0nG0@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Andreas Rohner @ 2014-02-19  6:36 UTC (permalink / raw)
  To: Ryusuke Konishi; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA

On 2014-02-18 19:37, Ryusuke Konishi wrote:
> On Mon, 17 Feb 2014 23:39:51 +0100, Andreas Rohner wrote:
>> This patch adds the nilfs_sufile_trim_fs function, which takes a
>> fstrim_range structure and calls blkdev_issue_discard for every
>> clean segment in the specified range. The range is truncated to sector
>> boundaries.
>>
>> Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
>> ---
>>  fs/nilfs2/sufile.c | 144 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  fs/nilfs2/sufile.h |   1 +
>>  2 files changed, 145 insertions(+)
>>
>> diff --git a/fs/nilfs2/sufile.c b/fs/nilfs2/sufile.c
>> index 3127e9f..3605cc9 100644
>> --- a/fs/nilfs2/sufile.c
>> +++ b/fs/nilfs2/sufile.c
>> @@ -870,6 +870,150 @@ ssize_t nilfs_sufile_get_suinfo(struct inode *sufile, __u64 segnum, void *buf,
>>  }
>>  
>>  /**
>> + * nilfs_sufile_trim_fs() - trim ioctl handle function
>> + * @sufile: inode of segment usage file
>> + * @range: fstrim_range structure
>> + *
>> + * start:	First Byte to trim
>> + * len:		number of Bytes to trim from start
>> + * minlen:	minimum extent length in Bytes
>> + *
>> + * Decription: nilfs_sufile_trim_fs goes through all segments containing bytes
>> + * from start to start+len. start is rounded up to the next sector boundary
>> + * and start+len is rounded down. For each clean segment blkdev_issue_discard
>> + * function is invoked to trim it.
>> + *
>> + * Return Value: On success, 0 is returned or negative error code, otherwise.
>> + */
>> +int nilfs_sufile_trim_fs(struct inode *sufile, struct fstrim_range *range)
>> +{
>> +	struct the_nilfs *nilfs = sufile->i_sb->s_fs_info;
>> +	struct buffer_head *su_bh;
>> +	struct nilfs_segment_usage *su;
>> +	void *kaddr;
>> +	size_t n, i, susz = NILFS_MDT(sufile)->mi_entry_size;
>> +	sector_t seg_start, seg_end, real_start, real_end,
>> +				start = 0, nblocks = 0;
>> +	u64 segnum, end, minlen, trimmed = 0;
>> +	int ret = 0;
>> +	unsigned int sect_size, sects_per_block;
>> +
>> +	sect_size = bdev_logical_block_size(nilfs->ns_bdev);
>> +	sects_per_block = (1 << nilfs->ns_blocksize_bits) / sect_size;
> 
>> +	real_start = (range->start + sect_size - 1) / sect_size;
>> +	real_end = (range->start + range->len) / sect_size;
> 
> Why not use start_sect, end_sect instead of real_start, real_end?
> real_{start,end} are not intuitive to me.

Yes that looks better.

> We need to use do_div() for these divisions, and DIV_ROUND_UP_ULL()
> macro should be applied to round up the start sector.
> 
> Moreover, we have to avoid overflow in "range->start + range->len".
> Actually, range->len is usually set to UULONG_MAX.

Ah yes I forgot to test that case.

> So, these will be as follows:
> 
> 	u64 len = range->len;
> 
> 	...
> 
> 	do_div(len, sect_size);
> 	if (!len)
> 		goto out;
> 
> 	...
> 	start_sect = DIV_ROUND_UP_ULL(range->start, sect_size);
> 	end_sect = start_sect + len - 1;  /* this end_sect is inclusive */

I don't get why this has to be inclusive. To me this seems to be a
matter of taste. I think it is much easier to reason about this stuff
and more "natural", if start_sect is inclusive and end_sect is
exclusive. Then segnum is inclusive and end is exclusive. It is just
like with pointer arithmetic.

> Note that, we also should care about large range->start to avoid
> overflow in substitution to start_sect (sector_t) since sector_t may
> be 32-bit.  We should check it before the division.
> 
> Here, I recant my earlier comment.  We should do the following check
> in this function to clarify that the overflow issue is handled
> properly.

Ok.

> 	u64 max_byte =
> 		((u64)nilfs->ns_nsegments * nilfs->ns_blocks_per_segments)
> 			<< nilfs->ns_blocksize_bits;
> 
> 	...
> 	if (range.len < nilfs->ns_blocksize || range.start >= max_byte)
> 		return -EINVAL;
> 	...
> 	(divisions)
> 
>> +	segnum = nilfs_get_segnum_of_block(nilfs, real_start / sects_per_block);
>> +	end = nilfs_get_segnum_of_block(nilfs, ((real_end + sects_per_block - 1)
>> +			/ sects_per_block) + nilfs->ns_blocks_per_segment - 1);
> 
> It would be better to use the following intermediate variables to
> improve readability of these calculations.

Ok.

> And, these calculations need sector_div() and DIV_ROUND_UP_SECTOR_T()
> macro:
> 
> 	start_block = start_sect;
> 	sector_div(start_block, sects_per_block);
> 
> 	end_block = DIV_ROUND_UP_SECTOR_T(end_sect, sects_per_block);
> 
> 	segnum = nilfs_get_segnum_of_block(nilfs, start_block);
> 	end = nilfs_get_segnum_of_block(
> 			nilfs, end_block + nilfs->ns_blocks_per_segment - 1);
> 
>> +	minlen = range->minlen / sect_size;
> 
> And, this one needs do_div():
> 
> 	minlen = range->minlen;
> 	do_div(minlen, sect_size);
> 
>> +
>> +
>> +	if (end > nilfs->ns_nsegments)
>> +		end = nilfs->ns_nsegments;
>> +	if (segnum >= nilfs->ns_nsegments || end <= segnum)
>> +		goto out;
>> +
>> +	down_read(&NILFS_MDT(sufile)->mi_sem);
>> +
>> +	while (segnum < end) {
>> +		n = nilfs_sufile_segment_usages_in_block(sufile, segnum,
>> +				end - 1);
>> +
>> +		ret = nilfs_sufile_get_segment_usage_block(sufile, segnum, 0,
>> +							   &su_bh);
>> +		if (ret < 0) {
>> +			if (ret != -ENOENT)
>> +				goto out_sem;
>> +			/* hole */
>> +			segnum += n;
>> +			continue;
>> +		}
>> +
>> +		kaddr = kmap_atomic(su_bh->b_page);
>> +		su = nilfs_sufile_block_get_segment_usage(sufile, segnum,
>> +				su_bh, kaddr);
>> +		for (i = 0; i < n; ++i, ++segnum, su = (void *)su + susz) {
>> +			if (!nilfs_segment_usage_clean(su))
>> +				continue;
>> +
>> +			nilfs_get_segment_range(nilfs, segnum, &seg_start,
>> +						&seg_end);
>> +
>> +			if (!nblocks) {
>> +				/* start new extent */
>> +				start = seg_start;
>> +				nblocks = seg_end - seg_start + 1;
>> +				continue;
>> +			}
>> +
>> +			if (start + nblocks == seg_start) {
>> +				/* add to previous extent */
>> +				nblocks += seg_end - seg_start + 1;
>> +				continue;
>> +			}
>> +
>> +			/* discard previous extent */
>> +			start *= sects_per_block;
>> +			nblocks *= sects_per_block;
>> +			if (start < real_start) {
>> +				nblocks -= real_start - start;
>> +				start = real_start;
>> +			}
> 
>> +			if (start + nblocks > real_end)
>> +				nblocks = real_end - start;
> 
> Why do you need this adjustment during discarding "previous" extent ?

You are right I don't need it.

>> +			if (nblocks >= minlen) {
>> +				kunmap_atomic(kaddr);
>> +
>> +				ret = blkdev_issue_discard(nilfs->ns_bdev,
>> +						start, nblocks, GFP_NOFS, 0);
>> +				if (ret < 0) {
>> +					put_bh(su_bh);
>> +					goto out_sem;
>> +				}
>> +
>> +				trimmed += nblocks;
>> +				kaddr = kmap_atomic(su_bh->b_page);
>> +				su = nilfs_sufile_block_get_segment_usage(
>> +					sufile, segnum, su_bh, kaddr);
>> +			}
>> +
>> +			/* start new extent */
>> +			start = seg_start;
>> +			nblocks = seg_end - seg_start + 1;
>> +		}
>> +		kunmap_atomic(kaddr);
>> +		put_bh(su_bh);
>> +	}
>> +
>> +
>> +	if (nblocks) {
>> +		/* discard last extent */
>> +		start *= sects_per_block;
>> +		nblocks *= sects_per_block;
>> +		if (start < real_start) {
>> +			nblocks -= real_start - start;
>> +			start = real_start;
>> +		}
>> +		if (start + nblocks > real_end)
>> +			nblocks = real_end - start;
>> +
>> +		if (nblocks >= minlen) {
>> +			ret = blkdev_issue_discard(nilfs->ns_bdev, start,
>> +					nblocks, GFP_NOFS, 0);
>> +			if (!ret)
>> +				trimmed += nblocks;
>> +		}
>> +	}
>> +
>> +out_sem:
>> +	up_read(&NILFS_MDT(sufile)->mi_sem);
>> +out:
>> +	range->len = trimmed * sect_size;
>> +	return ret;
>> +}
>> +
>> +/**
>>   * nilfs_sufile_read - read or get sufile inode
>>   * @sb: super block instance
>>   * @susize: size of a segment usage entry
>> diff --git a/fs/nilfs2/sufile.h b/fs/nilfs2/sufile.h
>> index e84bc5b..2434abd 100644
>> --- a/fs/nilfs2/sufile.h
>> +++ b/fs/nilfs2/sufile.h
>> @@ -65,6 +65,7 @@ void nilfs_sufile_do_set_error(struct inode *, __u64, struct buffer_head *,
>>  int nilfs_sufile_resize(struct inode *sufile, __u64 newnsegs);
>>  int nilfs_sufile_read(struct super_block *sb, size_t susize,
>>  		      struct nilfs_inode *raw_inode, struct inode **inodep);
>> +int nilfs_sufile_trim_fs(struct inode *sufile, struct fstrim_range *range);
>>  
>>  /**
>>   * nilfs_sufile_scrap - make a segment garbage
>> -- 
>> 1.9.0
> 
> Please try to compile this patch both for 32-bit kernel and 64-bit
> kernel to test if the patch is architecture independent.

Ok.

With all the proper division macros it gets very complicated. I think it
would simplify things if we just truncate to block size instead of
sector size. Then we could use simple bit shifts. It would look
something like this:

	if (range->len < nilfs->ns_blocksize ||
			range->start >= max_byte)
		return -EINVAL;
	/* sector_t could be 32 bit */
	if (range->len > max_byte)
		range->len = max_byte;

	sects_per_block = (1 << nilfs->ns_blocksize_bits) /
			bdev_logical_block_size(nilfs->ns_bdev);

	start_block = (range->start + nilfs->ns_blocksize - 1) >>
			nilfs->ns_blocksize_bits;
	end_block = start_block + (range->len >>
				nilfs->ns_blocksize_bits);

	segnum = nilfs_get_segnum_of_block(nilfs, start_block);
	end = nilfs_get_segnum_of_block(nilfs, end_block +
			nilfs->ns_blocks_per_segment - 1);

	minlen = range->minlen >> nilfs->ns_blocksize_bits;


What do you think?

Regards,
Andreas Rohner
--
To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] nilfs2: add nilfs_sufile_trim_fs to trim clean segs
       [not found]             ` <5304510A.8080006-hi6Y0CQ0nG0@public.gmane.org>
@ 2014-02-20  8:12               ` Ryusuke Konishi
  0 siblings, 0 replies; 7+ messages in thread
From: Ryusuke Konishi @ 2014-02-20  8:12 UTC (permalink / raw)
  To: Andreas Rohner; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA

On Wed, 19 Feb 2014 07:36:58 +0100, Andreas Rohner wrote:
> On 2014-02-18 19:37, Ryusuke Konishi wrote:
>> On Mon, 17 Feb 2014 23:39:51 +0100, Andreas Rohner wrote:
>> 
>> Please try to compile this patch both for 32-bit kernel and 64-bit
>> kernel to test if the patch is architecture independent.
> 
> Ok.
> 
> With all the proper division macros it gets very complicated. I think it
> would simplify things if we just truncate to block size instead of
> sector size. Then we could use simple bit shifts. It would look
> something like this:
> 
> 	if (range->len < nilfs->ns_blocksize ||
> 			range->start >= max_byte)
> 		return -EINVAL;
> 	/* sector_t could be 32 bit */
> 	if (range->len > max_byte)
> 		range->len = max_byte;
> 
> 	sects_per_block = (1 << nilfs->ns_blocksize_bits) /
> 			bdev_logical_block_size(nilfs->ns_bdev);
> 
> 	start_block = (range->start + nilfs->ns_blocksize - 1) >>
> 			nilfs->ns_blocksize_bits;
> 	end_block = start_block + (range->len >>
> 				nilfs->ns_blocksize_bits);
> 
> 	segnum = nilfs_get_segnum_of_block(nilfs, start_block);
> 	end = nilfs_get_segnum_of_block(nilfs, end_block +
> 			nilfs->ns_blocks_per_segment - 1);
> 
> 	minlen = range->minlen >> nilfs->ns_blocksize_bits;
> 
> 
> What do you think?

Yes, sector-based calculations looks a bit complicated.  Your idea,
adjusting positions to fs-block alignment, is reasonable.  I agree
with the approach.

>> So, these will be as follows:
>> 
>> 	u64 len = range->len;
>> 
>> 	...
>> 
>> 	do_div(len, sect_size);
>> 	if (!len)
>> 		goto out;
>> 
>> 	...
>> 	start_sect = DIV_ROUND_UP_ULL(range->start, sect_size);
>> 	end_sect = start_sect + len - 1;  /* this end_sect is inclusive */
> 
> I don't get why this has to be inclusive. To me this seems to be a
> matter of taste. I think it is much easier to reason about this stuff
> and more "natural", if start_sect is inclusive and end_sect is
> exclusive. Then segnum is inclusive and end is exclusive. It is just
> like with pointer arithmetic.

Yes, it is implementation matter (taste).  The above comment just
notice that the result of the calculation is inclusive which differs
from the original meaning.  What I hope there, is that usage of
variables (e.g. the edge is inclusive or exclusive) is taken care to
avoid confusion.

Regards,
Ryusuke Konishi
--
To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 0/2] nilfs2: add support for FITRIM ioctl
@ 2014-02-15 13:34 Andreas Rohner
  0 siblings, 0 replies; 7+ messages in thread
From: Andreas Rohner @ 2014-02-15 13:34 UTC (permalink / raw)
  To: linux-nilfs-u79uwXL29TY76Z2rM5mHXA; +Cc: Andreas Rohner

Hi,

This patch adds support for the FITRIM ioctl, which allows user space 
tools like fstrim to issue TRIM/DISCARD requests to the underlying 
device. It takes a fstrim_range structure as a parameter and for every 
clean segment in the specified range the function blkdev_issue_discard 
is called.

Best regards,
Andreas Rohner

---
Andreas Rohner (2):
  nilfs2: add nilfs_sufile_trim_fs to trim clean segs
  nilfs2: add FITRIM ioctl support for nilfs2

 fs/nilfs2/ioctl.c  |  46 +++++++++++++++++++++
 fs/nilfs2/sufile.c | 114 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/nilfs2/sufile.h |   1 +
 3 files changed, 161 insertions(+)

-- 
1.8.5.4

--
To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2014-02-20  8:12 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-02-17 22:39 [PATCH 0/2] nilfs2: add support for FITRIM ioctl Andreas Rohner
     [not found] ` <cover.1392676472.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2014-02-17 22:39   ` [PATCH 1/2] nilfs2: add nilfs_sufile_trim_fs to trim clean segs Andreas Rohner
     [not found]     ` <0ba40f9a0504b2a228e0714032f82556e03b927e.1392676472.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2014-02-18 18:37       ` Ryusuke Konishi
     [not found]         ` <20140219.033733.321498615.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2014-02-19  6:36           ` Andreas Rohner
     [not found]             ` <5304510A.8080006-hi6Y0CQ0nG0@public.gmane.org>
2014-02-20  8:12               ` Ryusuke Konishi
2014-02-17 22:39   ` [PATCH 2/2] nilfs2: add FITRIM ioctl support for nilfs2 Andreas Rohner
  -- strict thread matches above, loose matches on Subject: below --
2014-02-15 13:34 [PATCH 0/2] nilfs2: add support for FITRIM ioctl Andreas Rohner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.