linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] block: loop: switch to VFS ITER_BVEC
@ 2015-03-22  8:14 Ming Lei
  2015-03-22  8:14 ` [PATCH 1/3] block: loop: use kmap(page) instead of page_address(page) Ming Lei
                   ` (3 more replies)
  0 siblings, 4 replies; 20+ messages in thread
From: Ming Lei @ 2015-03-22  8:14 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel; +Cc: Christoph Hellwig, Al Viro, Maxim Patlasov

This patchset uses VFS ITER_BVEC for reading/writing
loop backing file, and basically it is a rewriting of
Christoph Hellwig's post in [1].

With this change, loop code becomes much cleaner than before,
and it is also a simplication for reading/writing backing file
inside kernel.

Another benifit is that one extra page copy is avoided for
READ in case of none_transfer. 

This patchset passes xfstest test(ext4, -g auto) over loop block
devices. 

The patch 3/3 depends a bit on Christoph's kiocb split patches
which have been in -next for a while, so tree dependency(not build
depend, just behaviour depend) should be considered if it is to
be merged to 4.0. 

The next step is to add AIO/O_DIRECT support for reading/writing
backing file against this patchset if no one objects, so that double
cache can be avoided.

[1], loop: convert to vfs_bvec_write
http://marc.info/?l=linux-fsdevel&m=142159370107007&w=2

[2], kiocb split patchset
http://marc.info/?t=142238145800004&r=1&w=2

Thanks,
Ming Lei


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/3] block: loop: use kmap(page) instead of page_address(page)
  2015-03-22  8:14 [PATCH 0/3] block: loop: switch to VFS ITER_BVEC Ming Lei
@ 2015-03-22  8:14 ` Ming Lei
  2015-03-24 10:29   ` Christoph Hellwig
  2015-03-22  8:14 ` [PATCH 2/3] block: loop: unify interface type of lo_send and lo_receive Ming Lei
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 20+ messages in thread
From: Ming Lei @ 2015-03-22  8:14 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel
  Cc: Christoph Hellwig, Al Viro, Maxim Patlasov, Ming Lei

The raw page allocated in lo_send() can be a highmem page,
so kmap(page) should be used for read/write on the page.

Also the patch removes kmap()/kunmap() in lo_send() because
it isn't needed at all.

Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 drivers/block/loop.c |   11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index d1f168b..965f117 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -275,10 +275,13 @@ static int do_lo_send_write(struct loop_device *lo, struct bio_vec *bvec,
 {
 	int ret = lo_do_transfer(lo, WRITE, page, 0, bvec->bv_page,
 			bvec->bv_offset, bvec->bv_len, pos >> 9);
-	if (likely(!ret))
-		return __do_lo_send_write(lo->lo_backing_file,
-				page_address(page), bvec->bv_len,
+	if (likely(!ret)) {
+		ret =  __do_lo_send_write(lo->lo_backing_file,
+				kmap(page), bvec->bv_len,
 				pos);
+		kunmap(page);
+		return ret;
+	}
 	printk_ratelimited(KERN_ERR "loop: Transfer error at byte offset %llu, "
 			"length %i.\n", (unsigned long long)pos, bvec->bv_len);
 	if (ret > 0)
@@ -299,7 +302,6 @@ static int lo_send(struct loop_device *lo, struct request *rq, loff_t pos)
 		page = alloc_page(GFP_NOIO | __GFP_HIGHMEM);
 		if (unlikely(!page))
 			goto fail;
-		kmap(page);
 		do_lo_send = do_lo_send_write;
 	} else {
 		do_lo_send = do_lo_send_direct_write;
@@ -312,7 +314,6 @@ static int lo_send(struct loop_device *lo, struct request *rq, loff_t pos)
 		pos += bvec.bv_len;
 	}
 	if (page) {
-		kunmap(page);
 		__free_page(page);
 	}
 out:
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 2/3] block: loop: unify interface type of lo_send and lo_receive
  2015-03-22  8:14 [PATCH 0/3] block: loop: switch to VFS ITER_BVEC Ming Lei
  2015-03-22  8:14 ` [PATCH 1/3] block: loop: use kmap(page) instead of page_address(page) Ming Lei
@ 2015-03-22  8:14 ` Ming Lei
  2015-03-24 10:30   ` Christoph Hellwig
  2015-03-22  8:14 ` [PATCH 3/3] block: loop: use vfs ITER_BVEC to read/write backing file Ming Lei
  2015-03-24 10:32 ` [PATCH 0/3] block: loop: switch to VFS ITER_BVEC Christoph Hellwig
  3 siblings, 1 reply; 20+ messages in thread
From: Ming Lei @ 2015-03-22  8:14 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel
  Cc: Christoph Hellwig, Al Viro, Maxim Patlasov, Ming Lei

Preparing for conversion to vfs iterator based read/write, also
pass 'loop_cmd' instead of 'request' so that rw common stuff
can be put into loop_cmd easily.

Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 drivers/block/loop.c |   29 +++++++++++++++++++++++------
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 965f117..c082cf7 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -289,7 +289,7 @@ static int do_lo_send_write(struct loop_device *lo, struct bio_vec *bvec,
 	return ret;
 }
 
-static int lo_send(struct loop_device *lo, struct request *rq, loff_t pos)
+static int lo_send(struct loop_device *lo, struct loop_cmd *cmd, loff_t pos)
 {
 	int (*do_lo_send)(struct loop_device *, struct bio_vec *, loff_t,
 			struct page *page);
@@ -297,6 +297,7 @@ static int lo_send(struct loop_device *lo, struct request *rq, loff_t pos)
 	struct req_iterator iter;
 	struct page *page = NULL;
 	int ret = 0;
+	struct request *rq = cmd->rq;
 
 	if (lo->transfer != transfer_none) {
 		page = alloc_page(GFP_NOIO | __GFP_HIGHMEM);
@@ -394,11 +395,13 @@ do_lo_receive(struct loop_device *lo,
 }
 
 static int
-lo_receive(struct loop_device *lo, struct request *rq, int bsize, loff_t pos)
+lo_receive(struct loop_device *lo, struct loop_cmd *cmd, loff_t pos)
 {
 	struct bio_vec bvec;
 	struct req_iterator iter;
 	ssize_t s;
+	struct request *rq = cmd->rq;
+	int	bsize = lo->lo_blocksize;
 
 	rq_for_each_segment(bvec, rq, iter) {
 		s = do_lo_receive(lo, &bvec, bsize, pos);
@@ -451,10 +454,24 @@ static int lo_req_flush(struct loop_device *lo, struct request *rq)
 	return ret;
 }
 
-static int do_req_filebacked(struct loop_device *lo, struct request *rq)
+static inline int lo_rw(struct loop_device *lo, struct loop_cmd *cmd,
+		loff_t pos, int rw)
+{
+	int ret;
+
+	if (rw == READ)
+		ret = lo_receive(lo, cmd, pos);
+	else
+		ret = lo_send(lo, cmd, pos);
+
+	return ret;
+}
+
+static int do_req_filebacked(struct loop_device *lo, struct loop_cmd *cmd)
 {
 	loff_t pos;
 	int ret;
+	struct request *rq = cmd->rq;
 
 	pos = ((loff_t) blk_rq_pos(rq) << 9) + lo->lo_offset;
 
@@ -464,9 +481,9 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq)
 		else if (rq->cmd_flags & REQ_DISCARD)
 			ret = lo_discard(lo, rq, pos);
 		else
-			ret = lo_send(lo, rq, pos);
+			ret = lo_rw(lo, cmd, pos, WRITE);
 	} else
-		ret = lo_receive(lo, rq, lo->lo_blocksize, pos);
+		ret = lo_rw(lo, cmd, pos, READ);
 
 	return ret;
 }
@@ -1514,7 +1531,7 @@ static void loop_handle_cmd(struct loop_cmd *cmd)
 	if (write && (lo->lo_flags & LO_FLAGS_READ_ONLY))
 		goto failed;
 
-	ret = do_req_filebacked(lo, cmd->rq);
+	ret = do_req_filebacked(lo, cmd);
 
  failed:
 	if (ret)
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 3/3] block: loop: use vfs ITER_BVEC to read/write backing file
  2015-03-22  8:14 [PATCH 0/3] block: loop: switch to VFS ITER_BVEC Ming Lei
  2015-03-22  8:14 ` [PATCH 1/3] block: loop: use kmap(page) instead of page_address(page) Ming Lei
  2015-03-22  8:14 ` [PATCH 2/3] block: loop: unify interface type of lo_send and lo_receive Ming Lei
@ 2015-03-22  8:14 ` Ming Lei
  2015-03-24 10:31   ` Christoph Hellwig
  2015-03-24 10:32 ` [PATCH 0/3] block: loop: switch to VFS ITER_BVEC Christoph Hellwig
  3 siblings, 1 reply; 20+ messages in thread
From: Ming Lei @ 2015-03-22  8:14 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel
  Cc: Christoph Hellwig, Al Viro, Maxim Patlasov, Ming Lei

Now loop code gets simplified a lot, and becomes more clean.

Also one extra page copy is avoided for READ in case of none
transfer.

Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 drivers/block/loop.c |  285 ++++++++++++++++++++++----------------------------
 drivers/block/loop.h |    3 +
 2 files changed, 127 insertions(+), 161 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index c082cf7..f3c470a 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -75,6 +75,7 @@
 #include <linux/sysfs.h>
 #include <linux/miscdevice.h>
 #include <linux/falloc.h>
+#include <linux/uio.h>
 #include "loop.h"
 
 #include <asm/uaccess.h>
@@ -87,26 +88,51 @@ static int part_shift;
 
 static struct workqueue_struct *loop_wq;
 
-/*
- * Transfer functions
- */
-static int transfer_none(struct loop_device *lo, int cmd,
-			 struct page *raw_page, unsigned raw_off,
-			 struct page *loop_page, unsigned loop_off,
-			 int size, sector_t real_block)
+struct ibvec_rw_data {
+	struct bio_vec *bvec;
+	unsigned long nr_segs;
+	size_t count;
+	int rw;
+	loff_t pos;
+};
+
+static ssize_t vfs_ibvec_rw(struct loop_device *lo, struct ibvec_rw_data *data)
 {
-	char *raw_buf = kmap_atomic(raw_page) + raw_off;
-	char *loop_buf = kmap_atomic(loop_page) + loop_off;
+	struct iov_iter iter;
+	struct file *file = lo->lo_backing_file;
+
+	iov_iter_bvec(&iter, ITER_BVEC | data->rw, data->bvec,
+		      data->nr_segs, data->count);
 
-	if (cmd == READ)
-		memcpy(loop_buf, raw_buf, size);
+	if (data->rw == READ)
+		return vfs_iter_read(file, &iter, &data->pos);
 	else
-		memcpy(raw_buf, loop_buf, size);
+		return vfs_iter_write(file, &iter, &data->pos);
+}
 
-	kunmap_atomic(loop_buf);
-	kunmap_atomic(raw_buf);
-	cond_resched();
-	return 0;
+static ssize_t vfs_rw(struct loop_device *lo, struct ibvec_rw_data *data)
+{
+	char *buf;
+	struct bio_vec *bvec;
+	struct file *file;
+	int ret;
+
+	if (lo->vfs_rw_iter)
+		return vfs_ibvec_rw(lo, data);
+
+	/* fallback to vfs_read and vfs_write */
+	BUG_ON(data->nr_segs != 1);
+
+	file = lo->lo_backing_file;
+	bvec = data->bvec;
+	buf = kmap(bvec->bv_page) + bvec->bv_offset;
+
+	if (data->rw == READ)
+		ret = vfs_read(file, buf, bvec->bv_len, &data->pos);
+	else
+		ret = vfs_write(file, buf, bvec->bv_len, &data->pos);
+	kunmap(bvec->bv_page);
+	return ret;
 }
 
 static int transfer_xor(struct loop_device *lo, int cmd,
@@ -147,7 +173,6 @@ static int xor_init(struct loop_device *lo, const struct loop_info64 *info)
 
 static struct loop_func_table none_funcs = {
 	.number = LO_CRYPT_NONE,
-	.transfer = transfer_none,
 }; 	
 
 static struct loop_func_table xor_funcs = {
@@ -214,74 +239,45 @@ lo_do_transfer(struct loop_device *lo, int cmd,
 	       struct page *lpage, unsigned loffs,
 	       int size, sector_t rblock)
 {
-	if (unlikely(!lo->transfer))
-		return 0;
-
 	return lo->transfer(lo, cmd, rpage, roffs, lpage, loffs, size, rblock);
 }
 
 /**
- * __do_lo_send_write - helper for writing data to a loop device
- *
- * This helper just factors out common code between do_lo_send_direct_write()
- * and do_lo_send_write().
- */
-static int __do_lo_send_write(struct file *file,
-		u8 *buf, const int len, loff_t pos)
-{
-	ssize_t bw;
-	mm_segment_t old_fs = get_fs();
-
-	file_start_write(file);
-	set_fs(get_ds());
-	bw = file->f_op->write(file, buf, len, &pos);
-	set_fs(old_fs);
-	file_end_write(file);
-	if (likely(bw == len))
-		return 0;
-	printk_ratelimited(KERN_ERR "loop: Write error at byte offset %llu, length %i.\n",
-			(unsigned long long)pos, len);
-	if (bw >= 0)
-		bw = -EIO;
-	return bw;
-}
-
-/**
- * do_lo_send_direct_write - helper for writing data to a loop device
+ * do_lo_send_write - helper for writing data to a loop device
  *
- * This is the fast, non-transforming version that does not need double
- * buffering.
  */
-static int do_lo_send_direct_write(struct loop_device *lo,
-		struct bio_vec *bvec, loff_t pos, struct page *page)
+static ssize_t do_lo_send_write(struct loop_device *lo,
+				struct loop_cmd *cmd,
+				struct bio_vec *bvec, loff_t pos)
 {
-	ssize_t bw = __do_lo_send_write(lo->lo_backing_file,
-			kmap(bvec->bv_page) + bvec->bv_offset,
-			bvec->bv_len, pos);
-	kunmap(bvec->bv_page);
-	cond_resched();
-	return bw;
-}
+	ssize_t ret;
+	struct ibvec_rw_data data;
+	struct bio_vec  r_bvec = *bvec;
+	struct page *r_page = cmd->trans_page;
+
+	if (r_page != NULL) {
+		ret = lo_do_transfer(lo, WRITE, r_page, 0,
+				     bvec->bv_page, bvec->bv_offset,
+				     bvec->bv_len, pos >> 9);
+		if (unlikely(ret))
+			goto fail;
 
-/**
- * do_lo_send_write - helper for writing data to a loop device
- *
- * This is the slow, transforming version that needs to double buffer the
- * data as it cannot do the transformations in place without having direct
- * access to the destination pages of the backing file.
- */
-static int do_lo_send_write(struct loop_device *lo, struct bio_vec *bvec,
-		loff_t pos, struct page *page)
-{
-	int ret = lo_do_transfer(lo, WRITE, page, 0, bvec->bv_page,
-			bvec->bv_offset, bvec->bv_len, pos >> 9);
-	if (likely(!ret)) {
-		ret =  __do_lo_send_write(lo->lo_backing_file,
-				kmap(page), bvec->bv_len,
-				pos);
-		kunmap(page);
-		return ret;
+		r_bvec.bv_page = r_page;
+		r_bvec.bv_offset = 0;
 	}
+
+	data.bvec = &r_bvec;
+	data.count = bvec->bv_len;
+	data.pos = pos;
+	data.nr_segs = 1;
+	data.rw = WRITE;
+
+	ret = vfs_rw(lo, &data);
+	if (ret < 0)
+		goto fail;
+	return ret;
+
+ fail:
 	printk_ratelimited(KERN_ERR "loop: Transfer error at byte offset %llu, "
 			"length %i.\n", (unsigned long long)pos, bvec->bv_len);
 	if (ret > 0)
@@ -289,108 +285,64 @@ static int do_lo_send_write(struct loop_device *lo, struct bio_vec *bvec,
 	return ret;
 }
 
-static int lo_send(struct loop_device *lo, struct loop_cmd *cmd, loff_t pos)
+static ssize_t lo_send(struct loop_device *lo, struct loop_cmd *cmd,
+		       loff_t pos)
 {
-	int (*do_lo_send)(struct loop_device *, struct bio_vec *, loff_t,
-			struct page *page);
 	struct bio_vec bvec;
 	struct req_iterator iter;
-	struct page *page = NULL;
 	int ret = 0;
 	struct request *rq = cmd->rq;
 
-	if (lo->transfer != transfer_none) {
-		page = alloc_page(GFP_NOIO | __GFP_HIGHMEM);
-		if (unlikely(!page))
-			goto fail;
-		do_lo_send = do_lo_send_write;
-	} else {
-		do_lo_send = do_lo_send_direct_write;
-	}
-
 	rq_for_each_segment(bvec, rq, iter) {
-		ret = do_lo_send(lo, &bvec, pos, page);
+		ret = do_lo_send_write(lo, cmd, &bvec, pos);
 		if (ret < 0)
 			break;
 		pos += bvec.bv_len;
 	}
-	if (page) {
-		__free_page(page);
-	}
-out:
-	return ret;
-fail:
-	printk_ratelimited(KERN_ERR "loop: Failed to allocate temporary page for write.\n");
-	ret = -ENOMEM;
-	goto out;
-}
-
-struct lo_read_data {
-	struct loop_device *lo;
-	struct page *page;
-	unsigned offset;
-	int bsize;
-};
-
-static int
-lo_splice_actor(struct pipe_inode_info *pipe, struct pipe_buffer *buf,
-		struct splice_desc *sd)
-{
-	struct lo_read_data *p = sd->u.data;
-	struct loop_device *lo = p->lo;
-	struct page *page = buf->page;
-	sector_t IV;
-	int size;
-
-	IV = ((sector_t) page->index << (PAGE_CACHE_SHIFT - 9)) +
-							(buf->offset >> 9);
-	size = sd->len;
-	if (size > p->bsize)
-		size = p->bsize;
-
-	if (lo_do_transfer(lo, READ, page, buf->offset, p->page, p->offset, size, IV)) {
-		printk_ratelimited(KERN_ERR "loop: transfer error block %ld\n",
-		       page->index);
-		size = -EINVAL;
-	}
 
-	flush_dcache_page(p->page);
-
-	if (size > 0)
-		p->offset += size;
-
-	return size;
-}
-
-static int
-lo_direct_splice_actor(struct pipe_inode_info *pipe, struct splice_desc *sd)
-{
-	return __splice_from_pipe(pipe, sd, lo_splice_actor);
+	if (ret > 0)
+		ret = 0;
+	return ret;
 }
 
 static ssize_t
-do_lo_receive(struct loop_device *lo,
-	      struct bio_vec *bvec, int bsize, loff_t pos)
+do_lo_receive(struct loop_device *lo, struct loop_cmd *cmd,
+	      struct bio_vec *bvec, loff_t pos)
 {
-	struct lo_read_data cookie;
-	struct splice_desc sd;
-	struct file *file;
 	ssize_t retval;
+	struct ibvec_rw_data data;
+	struct bio_vec  r_bvec = *bvec;
+	bool trans = false;
+
+	if (cmd->trans_page != NULL) {
+		r_bvec.bv_page = cmd->trans_page;
+		r_bvec.bv_offset = 0;
+		trans = true;
+	}
 
-	cookie.lo = lo;
-	cookie.page = bvec->bv_page;
-	cookie.offset = bvec->bv_offset;
-	cookie.bsize = bsize;
-
-	sd.len = 0;
-	sd.total_len = bvec->bv_len;
-	sd.flags = 0;
-	sd.pos = pos;
-	sd.u.data = &cookie;
+	data.bvec = &r_bvec;
+	data.count = r_bvec.bv_len;
+	data.pos = pos;
+	data.nr_segs = 1;
+	data.rw = READ;
 
-	file = lo->lo_backing_file;
-	retval = splice_direct_to_actor(file, &sd, lo_direct_splice_actor);
+	retval = vfs_rw(lo, &data);
+	if (retval < 0)
+		goto out;
 
+	if (trans) {
+		retval = lo_do_transfer(lo, READ, r_bvec.bv_page, 0,
+				bvec->bv_page, bvec->bv_offset, retval,
+				pos >> 9);
+		if (retval < 0)
+			goto out;
+		flush_dcache_page(bvec->bv_page);
+	}
+out:
+	if (retval < 0)
+		printk_ratelimited(KERN_ERR "loop: transfer error block "
+				   "%lld ret=%ld\n", pos >> 9,
+				   (long)retval);
 	return retval;
 }
 
@@ -401,10 +353,9 @@ lo_receive(struct loop_device *lo, struct loop_cmd *cmd, loff_t pos)
 	struct req_iterator iter;
 	ssize_t s;
 	struct request *rq = cmd->rq;
-	int	bsize = lo->lo_blocksize;
 
 	rq_for_each_segment(bvec, rq, iter) {
-		s = do_lo_receive(lo, &bvec, bsize, pos);
+		s = do_lo_receive(lo, cmd, &bvec, pos);
 		if (s < 0)
 			return s;
 
@@ -458,12 +409,22 @@ static inline int lo_rw(struct loop_device *lo, struct loop_cmd *cmd,
 		loff_t pos, int rw)
 {
 	int ret;
+	struct page *page = NULL;
+
+	if (lo->transfer != NULL) {
+		page = alloc_page(GFP_NOIO | __GFP_HIGHMEM);
+		if (unlikely(!page))
+			return -ENOMEM;
+	}
+	cmd->trans_page = page;
 
 	if (rw == READ)
 		ret = lo_receive(lo, cmd, pos);
 	else
 		ret = lo_send(lo, cmd, pos);
 
+	if (cmd->trans_page)
+		__free_page(cmd->trans_page);
 	return ret;
 }
 
@@ -804,7 +765,7 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
 	lo->lo_device = bdev;
 	lo->lo_flags = lo_flags;
 	lo->lo_backing_file = file;
-	lo->transfer = transfer_none;
+	lo->transfer = NULL;
 	lo->ioctl = NULL;
 	lo->lo_sizelimit = 0;
 	lo->old_gfp_mask = mapping_gfp_mask(mapping);
@@ -813,6 +774,8 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
 	if (!(lo_flags & LO_FLAGS_READ_ONLY) && file->f_op->fsync)
 		blk_queue_flush(lo->lo_queue, REQ_FLUSH);
 
+	lo->vfs_rw_iter = file->f_op->read_iter && file->f_op->write_iter;
+
 	set_capacity(lo->lo_disk, size);
 	bd_set_size(bdev, size << 9);
 	loop_sysfs_init(lo);
diff --git a/drivers/block/loop.h b/drivers/block/loop.h
index 301c27f..981c8b9 100644
--- a/drivers/block/loop.h
+++ b/drivers/block/loop.h
@@ -63,12 +63,15 @@ struct loop_device {
 	struct request_queue	*lo_queue;
 	struct blk_mq_tag_set	tag_set;
 	struct gendisk		*lo_disk;
+
+	bool			vfs_rw_iter;
 };
 
 struct loop_cmd {
 	struct work_struct read_work;
 	struct request *rq;
 	struct list_head list;
+	struct page *trans_page;	/* only for encrypted transfer */
 };
 
 /* Support for loadable transfer modules */
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/3] block: loop: use kmap(page) instead of page_address(page)
  2015-03-22  8:14 ` [PATCH 1/3] block: loop: use kmap(page) instead of page_address(page) Ming Lei
@ 2015-03-24 10:29   ` Christoph Hellwig
  2015-03-24 10:49     ` Ming Lei
  0 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2015-03-24 10:29 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-kernel, Christoph Hellwig, Al Viro, Maxim Patlasov

On Sun, Mar 22, 2015 at 04:14:52PM +0800, Ming Lei wrote:
> The raw page allocated in lo_send() can be a highmem page,
> so kmap(page) should be used for read/write on the page.
> 
> Also the patch removes kmap()/kunmap() in lo_send() because
> it isn't needed at all.

The description seems odd, what it seems to do is to shift the
kmap from lo_send into do_lo_send_write.  This does not seem
very useful as it now means a kmap per segment instead of just
a single one per I/O.  What would make more sense is to just
not allocate a highmem page.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/3] block: loop: unify interface type of lo_send and lo_receive
  2015-03-22  8:14 ` [PATCH 2/3] block: loop: unify interface type of lo_send and lo_receive Ming Lei
@ 2015-03-24 10:30   ` Christoph Hellwig
  2015-03-24 10:55     ` Ming Lei
  0 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2015-03-24 10:30 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-kernel, Christoph Hellwig, Al Viro, Maxim Patlasov

This just adds code for no obvious benefit..

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] block: loop: use vfs ITER_BVEC to read/write backing file
  2015-03-22  8:14 ` [PATCH 3/3] block: loop: use vfs ITER_BVEC to read/write backing file Ming Lei
@ 2015-03-24 10:31   ` Christoph Hellwig
  2015-03-24 11:01     ` Ming Lei
  0 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2015-03-24 10:31 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-kernel, Christoph Hellwig, Al Viro, Maxim Patlasov

On Sun, Mar 22, 2015 at 04:14:54PM +0800, Ming Lei wrote:
> Now loop code gets simplified a lot, and becomes more clean.
> 
> Also one extra page copy is avoided for READ in case of none
> transfer.

I really don't like the mess with the wrappers to pretent that read/write
work the same, and the ibvec_rw_data structure doesn't exactly help
readability either.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/3] block: loop: switch to VFS ITER_BVEC
  2015-03-22  8:14 [PATCH 0/3] block: loop: switch to VFS ITER_BVEC Ming Lei
                   ` (2 preceding siblings ...)
  2015-03-22  8:14 ` [PATCH 3/3] block: loop: use vfs ITER_BVEC to read/write backing file Ming Lei
@ 2015-03-24 10:32 ` Christoph Hellwig
  2015-03-24 10:53   ` Ming Lei
  3 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2015-03-24 10:32 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-kernel, Christoph Hellwig, Al Viro, Maxim Patlasov

On Sun, Mar 22, 2015 at 04:14:51PM +0800, Ming Lei wrote:
> This patchset uses VFS ITER_BVEC for reading/writing
> loop backing file, and basically it is a rewriting of
> Christoph Hellwig's post in [1].

What is the benefit over my older version?  From all I can see the
code is a lot more complicated without any obvious benefit.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/3] block: loop: use kmap(page) instead of page_address(page)
  2015-03-24 10:29   ` Christoph Hellwig
@ 2015-03-24 10:49     ` Ming Lei
  2015-03-24 11:09       ` Christoph Hellwig
  0 siblings, 1 reply; 20+ messages in thread
From: Ming Lei @ 2015-03-24 10:49 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Linux Kernel Mailing List, Al Viro, Maxim Patlasov

On Tue, Mar 24, 2015 at 6:29 PM, Christoph Hellwig <hch@lst.de> wrote:
> On Sun, Mar 22, 2015 at 04:14:52PM +0800, Ming Lei wrote:
>> The raw page allocated in lo_send() can be a highmem page,
>> so kmap(page) should be used for read/write on the page.
>>
>> Also the patch removes kmap()/kunmap() in lo_send() because
>> it isn't needed at all.
>
> The description seems odd, what it seems to do is to shift the
> kmap from lo_send into do_lo_send_write.  This does not seem

Looks I ignore the fact which kmap() can setup page_address(),
so this change isn't needed, please ignore the patch.

> very useful as it now means a kmap per segment instead of just
> a single one per I/O.  What would make more sense is to just
> not allocate a highmem page.

On 32bit system, this still may cause pressure on page allocation, since
requests are handled concurrently now after using workqueue.

But it is a good idea to keep kmap() per segment, and will do that in v2.

Thanks,
Ming Lei

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/3] block: loop: switch to VFS ITER_BVEC
  2015-03-24 10:32 ` [PATCH 0/3] block: loop: switch to VFS ITER_BVEC Christoph Hellwig
@ 2015-03-24 10:53   ` Ming Lei
  2015-03-24 18:01     ` Christoph Hellwig
  0 siblings, 1 reply; 20+ messages in thread
From: Ming Lei @ 2015-03-24 10:53 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Linux Kernel Mailing List, Al Viro, Maxim Patlasov

On Tue, Mar 24, 2015 at 6:32 PM, Christoph Hellwig <hch@lst.de> wrote:
> On Sun, Mar 22, 2015 at 04:14:51PM +0800, Ming Lei wrote:
>> This patchset uses VFS ITER_BVEC for reading/writing
>> loop backing file, and basically it is a rewriting of
>> Christoph Hellwig's post in [1].
>
> What is the benefit over my older version?  From all I can see the
> code is a lot more complicated without any obvious benefit.

You old code can't be applied at all because it isn't against
latest loop change.

If you have better one, please just ignore this patchset.

Thanks,

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/3] block: loop: unify interface type of lo_send and lo_receive
  2015-03-24 10:30   ` Christoph Hellwig
@ 2015-03-24 10:55     ` Ming Lei
  0 siblings, 0 replies; 20+ messages in thread
From: Ming Lei @ 2015-03-24 10:55 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Linux Kernel Mailing List, Al Viro, Maxim Patlasov

On Tue, Mar 24, 2015 at 6:30 PM, Christoph Hellwig <hch@lst.de> wrote:
> This just adds code for no obvious benefit..

If you see the patch 3, you will find lo_rw() can handle transfer page
for both read and write, so there is benefit.

Thanks,

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] block: loop: use vfs ITER_BVEC to read/write backing file
  2015-03-24 10:31   ` Christoph Hellwig
@ 2015-03-24 11:01     ` Ming Lei
  0 siblings, 0 replies; 20+ messages in thread
From: Ming Lei @ 2015-03-24 11:01 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Linux Kernel Mailing List, Al Viro, Maxim Patlasov

On Tue, Mar 24, 2015 at 6:31 PM, Christoph Hellwig <hch@lst.de> wrote:
> On Sun, Mar 22, 2015 at 04:14:54PM +0800, Ming Lei wrote:
>> Now loop code gets simplified a lot, and becomes more clean.
>>
>> Also one extra page copy is avoided for READ in case of none
>> transfer.
>
> I really don't like the mess with the wrappers to pretent that read/write
> work the same,

Could you comment on code about the mess?

Actually this patch tried to move common things between read/write
out of read() and write() compared before.

> and the ibvec_rw_data structure doesn't exactly help
> readability either.

This can be removed if you think lots of parameters are better.

Thanks,

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/3] block: loop: use kmap(page) instead of page_address(page)
  2015-03-24 10:49     ` Ming Lei
@ 2015-03-24 11:09       ` Christoph Hellwig
  2015-03-24 11:24         ` Ming Lei
  0 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2015-03-24 11:09 UTC (permalink / raw)
  To: Ming Lei
  Cc: Christoph Hellwig, Jens Axboe, Linux Kernel Mailing List,
	Al Viro, Maxim Patlasov

On Tue, Mar 24, 2015 at 06:49:00PM +0800, Ming Lei wrote:
> On 32bit system, this still may cause pressure on page allocation, since
> requests are handled concurrently now after using workqueue.

kmap slots are from a much smaller resource pool than non-highmem
allocations, so if a page needs to stay kmaped from allocation time until
freeing it it should just be allocated as a kernel mapped page.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/3] block: loop: use kmap(page) instead of page_address(page)
  2015-03-24 11:09       ` Christoph Hellwig
@ 2015-03-24 11:24         ` Ming Lei
  0 siblings, 0 replies; 20+ messages in thread
From: Ming Lei @ 2015-03-24 11:24 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Linux Kernel Mailing List, Al Viro, Maxim Patlasov

On Tue, Mar 24, 2015 at 7:09 PM, Christoph Hellwig <hch@lst.de> wrote:
> On Tue, Mar 24, 2015 at 06:49:00PM +0800, Ming Lei wrote:
>> On 32bit system, this still may cause pressure on page allocation, since
>> requests are handled concurrently now after using workqueue.
>
> kmap slots are from a much smaller resource pool than non-highmem
> allocations, so if a page needs to stay kmaped from allocation time until
> freeing it it should just be allocated as a kernel mapped page.

Yeah, that makes sense to use non-highmem.

Thanks

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/3] block: loop: switch to VFS ITER_BVEC
  2015-03-24 10:53   ` Ming Lei
@ 2015-03-24 18:01     ` Christoph Hellwig
  2015-03-25  7:23       ` Ming Lei
  0 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2015-03-24 18:01 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, Linux Kernel Mailing List, Al Viro, Maxim Patlasov

On Tue, Mar 24, 2015 at 06:53:07PM +0800, Ming Lei wrote:
> If you have better one, please just ignore this patchset.

Here is a straight forward port of my old one.  It's got light testing but
still could use a better description and a split up:

---
>From a45b5053db5908954d340b05ef8a23e1ca43010a Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Fri, 16 Jan 2015 15:06:03 +0100
Subject: loop: convert to vfs_iter_read/write

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/loop.c | 291 +++++++++++++++++++++------------------------------
 1 file changed, 119 insertions(+), 172 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index d1f168b..68974b2 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -87,28 +87,6 @@ static int part_shift;
 
 static struct workqueue_struct *loop_wq;
 
-/*
- * Transfer functions
- */
-static int transfer_none(struct loop_device *lo, int cmd,
-			 struct page *raw_page, unsigned raw_off,
-			 struct page *loop_page, unsigned loop_off,
-			 int size, sector_t real_block)
-{
-	char *raw_buf = kmap_atomic(raw_page) + raw_off;
-	char *loop_buf = kmap_atomic(loop_page) + loop_off;
-
-	if (cmd == READ)
-		memcpy(loop_buf, raw_buf, size);
-	else
-		memcpy(raw_buf, loop_buf, size);
-
-	kunmap_atomic(loop_buf);
-	kunmap_atomic(raw_buf);
-	cond_resched();
-	return 0;
-}
-
 static int transfer_xor(struct loop_device *lo, int cmd,
 			struct page *raw_page, unsigned raw_off,
 			struct page *loop_page, unsigned loop_off,
@@ -147,7 +125,6 @@ static int xor_init(struct loop_device *lo, const struct loop_info64 *info)
 
 static struct loop_func_table none_funcs = {
 	.number = LO_CRYPT_NONE,
-	.transfer = transfer_none,
 }; 	
 
 static struct loop_func_table xor_funcs = {
@@ -214,206 +191,169 @@ lo_do_transfer(struct loop_device *lo, int cmd,
 	       struct page *lpage, unsigned loffs,
 	       int size, sector_t rblock)
 {
-	if (unlikely(!lo->transfer))
+	int ret;
+
+	ret = lo->transfer(lo, cmd, rpage, roffs, lpage, loffs, size, rblock);
+	if (likely(!ret))
 		return 0;
 
-	return lo->transfer(lo, cmd, rpage, roffs, lpage, loffs, size, rblock);
+	printk_ratelimited(KERN_ERR
+		"loop: Transfer error at byte offset %llu, length %i.\n",
+		(unsigned long long)rblock << 9, size);
+	return ret;
 }
 
-/**
- * __do_lo_send_write - helper for writing data to a loop device
- *
- * This helper just factors out common code between do_lo_send_direct_write()
- * and do_lo_send_write().
- */
-static int __do_lo_send_write(struct file *file,
-		u8 *buf, const int len, loff_t pos)
+static int lo_write_bvec(struct file *file, struct bio_vec *bvec, loff_t *ppos)
 {
+	struct iov_iter i;
 	ssize_t bw;
-	mm_segment_t old_fs = get_fs();
+
+	iov_iter_bvec(&i, ITER_BVEC, bvec, 1, bvec->bv_len);
 
 	file_start_write(file);
-	set_fs(get_ds());
-	bw = file->f_op->write(file, buf, len, &pos);
-	set_fs(old_fs);
+	bw = vfs_iter_write(file, &i, ppos);
 	file_end_write(file);
-	if (likely(bw == len))
+
+	if (likely(bw ==  bvec->bv_len))
 		return 0;
-	printk_ratelimited(KERN_ERR "loop: Write error at byte offset %llu, length %i.\n",
-			(unsigned long long)pos, len);
+
+	printk_ratelimited(KERN_ERR
+		"loop: Write error at byte offset %llu, length %i.\n",
+		(unsigned long long)*ppos, bvec->bv_len);
 	if (bw >= 0)
 		bw = -EIO;
 	return bw;
 }
 
-/**
- * do_lo_send_direct_write - helper for writing data to a loop device
- *
- * This is the fast, non-transforming version that does not need double
- * buffering.
- */
-static int do_lo_send_direct_write(struct loop_device *lo,
-		struct bio_vec *bvec, loff_t pos, struct page *page)
+static int lo_write_simple(struct loop_device *lo, struct request *rq,
+		loff_t pos)
 {
-	ssize_t bw = __do_lo_send_write(lo->lo_backing_file,
-			kmap(bvec->bv_page) + bvec->bv_offset,
-			bvec->bv_len, pos);
-	kunmap(bvec->bv_page);
-	cond_resched();
-	return bw;
+	struct bio_vec bvec;
+	struct req_iterator iter;
+	int ret = 0;
+
+	rq_for_each_segment(bvec, rq, iter) {
+		ret = lo_write_bvec(lo->lo_backing_file, &bvec, &pos);
+		if (ret < 0)
+			break;
+		cond_resched();
+	}
+
+	return ret;
 }
 
-/**
- * do_lo_send_write - helper for writing data to a loop device
- *
+/*
  * This is the slow, transforming version that needs to double buffer the
  * data as it cannot do the transformations in place without having direct
  * access to the destination pages of the backing file.
  */
-static int do_lo_send_write(struct loop_device *lo, struct bio_vec *bvec,
-		loff_t pos, struct page *page)
-{
-	int ret = lo_do_transfer(lo, WRITE, page, 0, bvec->bv_page,
-			bvec->bv_offset, bvec->bv_len, pos >> 9);
-	if (likely(!ret))
-		return __do_lo_send_write(lo->lo_backing_file,
-				page_address(page), bvec->bv_len,
-				pos);
-	printk_ratelimited(KERN_ERR "loop: Transfer error at byte offset %llu, "
-			"length %i.\n", (unsigned long long)pos, bvec->bv_len);
-	if (ret > 0)
-		ret = -EIO;
-	return ret;
-}
-
-static int lo_send(struct loop_device *lo, struct request *rq, loff_t pos)
+static int lo_write_transfer(struct loop_device *lo, struct request *rq,
+		loff_t pos)
 {
-	int (*do_lo_send)(struct loop_device *, struct bio_vec *, loff_t,
-			struct page *page);
-	struct bio_vec bvec;
+	struct bio_vec bvec, b;
 	struct req_iterator iter;
-	struct page *page = NULL;
+	struct page *page;
 	int ret = 0;
 
-	if (lo->transfer != transfer_none) {
-		page = alloc_page(GFP_NOIO | __GFP_HIGHMEM);
-		if (unlikely(!page))
-			goto fail;
-		kmap(page);
-		do_lo_send = do_lo_send_write;
-	} else {
-		do_lo_send = do_lo_send_direct_write;
-	}
+	page = alloc_page(GFP_NOIO);
+	if (unlikely(!page))
+		return -ENOMEM;
 
 	rq_for_each_segment(bvec, rq, iter) {
-		ret = do_lo_send(lo, &bvec, pos, page);
+		ret = lo_do_transfer(lo, WRITE, page, 0, bvec.bv_page,
+			bvec.bv_offset, bvec.bv_len, pos >> 9);
+		if (unlikely(ret))
+			break;
+
+		b.bv_page = page;
+		b.bv_offset = 0;
+		b.bv_len = bvec.bv_len;
+		ret = lo_write_bvec(lo->lo_backing_file, &b, &pos);
 		if (ret < 0)
 			break;
-		pos += bvec.bv_len;
 	}
-	if (page) {
-		kunmap(page);
-		__free_page(page);
-	}
-out:
+
+	__free_page(page);
 	return ret;
-fail:
-	printk_ratelimited(KERN_ERR "loop: Failed to allocate temporary page for write.\n");
-	ret = -ENOMEM;
-	goto out;
 }
 
-struct lo_read_data {
-	struct loop_device *lo;
-	struct page *page;
-	unsigned offset;
-	int bsize;
-};
+static int lo_read_simple(struct loop_device *lo, struct request *rq,
+		loff_t pos)
+{
+	struct bio_vec bvec;
+	struct req_iterator iter;
+	struct iov_iter i;
+	ssize_t len;
 
-static int
-lo_splice_actor(struct pipe_inode_info *pipe, struct pipe_buffer *buf,
-		struct splice_desc *sd)
-{
-	struct lo_read_data *p = sd->u.data;
-	struct loop_device *lo = p->lo;
-	struct page *page = buf->page;
-	sector_t IV;
-	int size;
-
-	IV = ((sector_t) page->index << (PAGE_CACHE_SHIFT - 9)) +
-							(buf->offset >> 9);
-	size = sd->len;
-	if (size > p->bsize)
-		size = p->bsize;
-
-	if (lo_do_transfer(lo, READ, page, buf->offset, p->page, p->offset, size, IV)) {
-		printk_ratelimited(KERN_ERR "loop: transfer error block %ld\n",
-		       page->index);
-		size = -EINVAL;
-	}
+	rq_for_each_segment(bvec, rq, iter) {
+		iov_iter_bvec(&i, ITER_BVEC, &bvec, 1, bvec.bv_len);
+		len = vfs_iter_read(lo->lo_backing_file, &i, &pos);
+		if (len < 0)
+			return len;
 
-	flush_dcache_page(p->page);
+		flush_dcache_page(bvec.bv_page);
 
-	if (size > 0)
-		p->offset += size;
+		if (len != bvec.bv_len) {
+			struct bio *bio;
 
-	return size;
-}
+			__rq_for_each_bio(bio, rq)
+				zero_fill_bio(bio);
+			break;
+		}
+		cond_resched();
+	}
 
-static int
-lo_direct_splice_actor(struct pipe_inode_info *pipe, struct splice_desc *sd)
-{
-	return __splice_from_pipe(pipe, sd, lo_splice_actor);
+	return 0;
 }
 
-static ssize_t
-do_lo_receive(struct loop_device *lo,
-	      struct bio_vec *bvec, int bsize, loff_t pos)
+static int lo_read_transfer(struct loop_device *lo, struct request *rq,
+		loff_t pos)
 {
-	struct lo_read_data cookie;
-	struct splice_desc sd;
-	struct file *file;
-	ssize_t retval;
-
-	cookie.lo = lo;
-	cookie.page = bvec->bv_page;
-	cookie.offset = bvec->bv_offset;
-	cookie.bsize = bsize;
+	struct bio_vec bvec, b;
+	struct req_iterator iter;
+	struct iov_iter i;
+	struct page *page;
+	ssize_t len;
+	int ret = 0;
 
-	sd.len = 0;
-	sd.total_len = bvec->bv_len;
-	sd.flags = 0;
-	sd.pos = pos;
-	sd.u.data = &cookie;
+	page = alloc_page(GFP_NOIO);
+	if (unlikely(!page))
+		return -ENOMEM;
 
-	file = lo->lo_backing_file;
-	retval = splice_direct_to_actor(file, &sd, lo_direct_splice_actor);
+	rq_for_each_segment(bvec, rq, iter) {
+		loff_t offset = pos;
 
-	return retval;
-}
+		b.bv_page = page;
+		b.bv_offset = 0;
+		b.bv_len = bvec.bv_len;
 
-static int
-lo_receive(struct loop_device *lo, struct request *rq, int bsize, loff_t pos)
-{
-	struct bio_vec bvec;
-	struct req_iterator iter;
-	ssize_t s;
+		iov_iter_bvec(&i, ITER_BVEC, &b, 1, b.bv_len);
+		len = vfs_iter_read(lo->lo_backing_file, &i, &pos);
+		if (len < 0) {
+			ret = len;
+			goto out_free_page;
+		}
 
-	rq_for_each_segment(bvec, rq, iter) {
-		s = do_lo_receive(lo, &bvec, bsize, pos);
-		if (s < 0)
-			return s;
+		ret = lo_do_transfer(lo, READ, page, 0, bvec.bv_page,
+			bvec.bv_offset, len, offset >> 9);
+		if (ret)
+			goto out_free_page;
+	
+		flush_dcache_page(bvec.bv_page);
 
-		if (s != bvec.bv_len) {
+		if (len != bvec.bv_len) {
 			struct bio *bio;
 
 			__rq_for_each_bio(bio, rq)
 				zero_fill_bio(bio);
 			break;
 		}
-		pos += bvec.bv_len;
 	}
-	return 0;
+
+	ret = 0;
+out_free_page:
+	__free_page(page);
+	return ret;
 }
 
 static int lo_discard(struct loop_device *lo, struct request *rq, loff_t pos)
@@ -462,10 +402,17 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq)
 			ret = lo_req_flush(lo, rq);
 		else if (rq->cmd_flags & REQ_DISCARD)
 			ret = lo_discard(lo, rq, pos);
+		else if (lo->transfer)
+			ret = lo_write_transfer(lo, rq, pos);
 		else
-			ret = lo_send(lo, rq, pos);
-	} else
-		ret = lo_receive(lo, rq, lo->lo_blocksize, pos);
+			ret = lo_write_simple(lo, rq, pos);
+
+	} else {
+		if (lo->transfer)
+			ret = lo_read_transfer(lo, rq, pos);
+		else
+			ret = lo_read_simple(lo, rq, pos);
+	}
 
 	return ret;
 }
@@ -786,7 +733,7 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
 	lo->lo_device = bdev;
 	lo->lo_flags = lo_flags;
 	lo->lo_backing_file = file;
-	lo->transfer = transfer_none;
+	lo->transfer = NULL;
 	lo->ioctl = NULL;
 	lo->lo_sizelimit = 0;
 	lo->old_gfp_mask = mapping_gfp_mask(mapping);
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/3] block: loop: switch to VFS ITER_BVEC
  2015-03-24 18:01     ` Christoph Hellwig
@ 2015-03-25  7:23       ` Ming Lei
  2015-03-25 15:27         ` Christoph Hellwig
  0 siblings, 1 reply; 20+ messages in thread
From: Ming Lei @ 2015-03-25  7:23 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Linux Kernel Mailing List, Al Viro, Maxim Patlasov

On Wed, Mar 25, 2015 at 2:01 AM, Christoph Hellwig <hch@lst.de> wrote:
> On Tue, Mar 24, 2015 at 06:53:07PM +0800, Ming Lei wrote:
>> If you have better one, please just ignore this patchset.
>
> Here is a straight forward port of my old one.  It's got light testing but
> still could use a better description and a split up:
>
> ---
> From a45b5053db5908954d340b05ef8a23e1ca43010a Mon Sep 17 00:00:00 2001
> From: Christoph Hellwig <hch@lst.de>
> Date: Fri, 16 Jan 2015 15:06:03 +0100
> Subject: loop: convert to vfs_iter_read/write
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  drivers/block/loop.c | 291 +++++++++++++++++++++------------------------------
>  1 file changed, 119 insertions(+), 172 deletions(-)
>
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index d1f168b..68974b2 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -87,28 +87,6 @@ static int part_shift;
>
>  static struct workqueue_struct *loop_wq;
>
> -/*
> - * Transfer functions
> - */
> -static int transfer_none(struct loop_device *lo, int cmd,
> -                        struct page *raw_page, unsigned raw_off,
> -                        struct page *loop_page, unsigned loop_off,
> -                        int size, sector_t real_block)
> -{
> -       char *raw_buf = kmap_atomic(raw_page) + raw_off;
> -       char *loop_buf = kmap_atomic(loop_page) + loop_off;
> -
> -       if (cmd == READ)
> -               memcpy(loop_buf, raw_buf, size);
> -       else
> -               memcpy(raw_buf, loop_buf, size);
> -
> -       kunmap_atomic(loop_buf);
> -       kunmap_atomic(raw_buf);
> -       cond_resched();
> -       return 0;
> -}
> -
>  static int transfer_xor(struct loop_device *lo, int cmd,
>                         struct page *raw_page, unsigned raw_off,
>                         struct page *loop_page, unsigned loop_off,
> @@ -147,7 +125,6 @@ static int xor_init(struct loop_device *lo, const struct loop_info64 *info)
>
>  static struct loop_func_table none_funcs = {
>         .number = LO_CRYPT_NONE,
> -       .transfer = transfer_none,
>  };
>
>  static struct loop_func_table xor_funcs = {
> @@ -214,206 +191,169 @@ lo_do_transfer(struct loop_device *lo, int cmd,
>                struct page *lpage, unsigned loffs,
>                int size, sector_t rblock)
>  {
> -       if (unlikely(!lo->transfer))
> +       int ret;
> +
> +       ret = lo->transfer(lo, cmd, rpage, roffs, lpage, loffs, size, rblock);
> +       if (likely(!ret))
>                 return 0;
>
> -       return lo->transfer(lo, cmd, rpage, roffs, lpage, loffs, size, rblock);
> +       printk_ratelimited(KERN_ERR
> +               "loop: Transfer error at byte offset %llu, length %i.\n",
> +               (unsigned long long)rblock << 9, size);
> +       return ret;
>  }
>
> -/**
> - * __do_lo_send_write - helper for writing data to a loop device
> - *
> - * This helper just factors out common code between do_lo_send_direct_write()
> - * and do_lo_send_write().
> - */
> -static int __do_lo_send_write(struct file *file,
> -               u8 *buf, const int len, loff_t pos)
> +static int lo_write_bvec(struct file *file, struct bio_vec *bvec, loff_t *ppos)
>  {
> +       struct iov_iter i;
>         ssize_t bw;
> -       mm_segment_t old_fs = get_fs();
> +
> +       iov_iter_bvec(&i, ITER_BVEC, bvec, 1, bvec->bv_len);
>
>         file_start_write(file);
> -       set_fs(get_ds());
> -       bw = file->f_op->write(file, buf, len, &pos);
> -       set_fs(old_fs);
> +       bw = vfs_iter_write(file, &i, ppos);

This patch moves to support ->read_iter/->write_iter only, which
might cause regression for backing file without defining read/write
iter callback.

>         file_end_write(file);
> -       if (likely(bw == len))
> +
> +       if (likely(bw ==  bvec->bv_len))
>                 return 0;
> -       printk_ratelimited(KERN_ERR "loop: Write error at byte offset %llu, length %i.\n",
> -                       (unsigned long long)pos, len);
> +
> +       printk_ratelimited(KERN_ERR
> +               "loop: Write error at byte offset %llu, length %i.\n",
> +               (unsigned long long)*ppos, bvec->bv_len);
>         if (bw >= 0)
>                 bw = -EIO;
>         return bw;
>  }
>
> -/**
> - * do_lo_send_direct_write - helper for writing data to a loop device
> - *
> - * This is the fast, non-transforming version that does not need double
> - * buffering.
> - */
> -static int do_lo_send_direct_write(struct loop_device *lo,
> -               struct bio_vec *bvec, loff_t pos, struct page *page)
> +static int lo_write_simple(struct loop_device *lo, struct request *rq,
> +               loff_t pos)
>  {
> -       ssize_t bw = __do_lo_send_write(lo->lo_backing_file,
> -                       kmap(bvec->bv_page) + bvec->bv_offset,
> -                       bvec->bv_len, pos);
> -       kunmap(bvec->bv_page);
> -       cond_resched();
> -       return bw;
> +       struct bio_vec bvec;
> +       struct req_iterator iter;
> +       int ret = 0;
> +
> +       rq_for_each_segment(bvec, rq, iter) {
> +               ret = lo_write_bvec(lo->lo_backing_file, &bvec, &pos);
> +               if (ret < 0)
> +                       break;
> +               cond_resched();
> +       }
> +
> +       return ret;
>  }
>
> -/**
> - * do_lo_send_write - helper for writing data to a loop device
> - *
> +/*
>   * This is the slow, transforming version that needs to double buffer the
>   * data as it cannot do the transformations in place without having direct
>   * access to the destination pages of the backing file.
>   */
> -static int do_lo_send_write(struct loop_device *lo, struct bio_vec *bvec,
> -               loff_t pos, struct page *page)
> -{
> -       int ret = lo_do_transfer(lo, WRITE, page, 0, bvec->bv_page,
> -                       bvec->bv_offset, bvec->bv_len, pos >> 9);
> -       if (likely(!ret))
> -               return __do_lo_send_write(lo->lo_backing_file,
> -                               page_address(page), bvec->bv_len,
> -                               pos);
> -       printk_ratelimited(KERN_ERR "loop: Transfer error at byte offset %llu, "
> -                       "length %i.\n", (unsigned long long)pos, bvec->bv_len);
> -       if (ret > 0)
> -               ret = -EIO;
> -       return ret;
> -}
> -
> -static int lo_send(struct loop_device *lo, struct request *rq, loff_t pos)
> +static int lo_write_transfer(struct loop_device *lo, struct request *rq,
> +               loff_t pos)
>  {
> -       int (*do_lo_send)(struct loop_device *, struct bio_vec *, loff_t,
> -                       struct page *page);
> -       struct bio_vec bvec;
> +       struct bio_vec bvec, b;
>         struct req_iterator iter;
> -       struct page *page = NULL;
> +       struct page *page;
>         int ret = 0;
>
> -       if (lo->transfer != transfer_none) {
> -               page = alloc_page(GFP_NOIO | __GFP_HIGHMEM);
> -               if (unlikely(!page))
> -                       goto fail;
> -               kmap(page);
> -               do_lo_send = do_lo_send_write;
> -       } else {
> -               do_lo_send = do_lo_send_direct_write;
> -       }
> +       page = alloc_page(GFP_NOIO);
> +       if (unlikely(!page))
> +               return -ENOMEM;
>
>         rq_for_each_segment(bvec, rq, iter) {
> -               ret = do_lo_send(lo, &bvec, pos, page);
> +               ret = lo_do_transfer(lo, WRITE, page, 0, bvec.bv_page,
> +                       bvec.bv_offset, bvec.bv_len, pos >> 9);
> +               if (unlikely(ret))
> +                       break;
> +
> +               b.bv_page = page;
> +               b.bv_offset = 0;
> +               b.bv_len = bvec.bv_len;
> +               ret = lo_write_bvec(lo->lo_backing_file, &b, &pos);
>                 if (ret < 0)
>                         break;
> -               pos += bvec.bv_len;
>         }
> -       if (page) {
> -               kunmap(page);
> -               __free_page(page);
> -       }
> -out:
> +
> +       __free_page(page);
>         return ret;
> -fail:
> -       printk_ratelimited(KERN_ERR "loop: Failed to allocate temporary page for write.\n");
> -       ret = -ENOMEM;
> -       goto out;
>  }
>
> -struct lo_read_data {
> -       struct loop_device *lo;
> -       struct page *page;
> -       unsigned offset;
> -       int bsize;
> -};
> +static int lo_read_simple(struct loop_device *lo, struct request *rq,
> +               loff_t pos)
> +{
> +       struct bio_vec bvec;
> +       struct req_iterator iter;
> +       struct iov_iter i;
> +       ssize_t len;
>
> -static int
> -lo_splice_actor(struct pipe_inode_info *pipe, struct pipe_buffer *buf,
> -               struct splice_desc *sd)
> -{
> -       struct lo_read_data *p = sd->u.data;
> -       struct loop_device *lo = p->lo;
> -       struct page *page = buf->page;
> -       sector_t IV;
> -       int size;
> -
> -       IV = ((sector_t) page->index << (PAGE_CACHE_SHIFT - 9)) +
> -                                                       (buf->offset >> 9);
> -       size = sd->len;
> -       if (size > p->bsize)
> -               size = p->bsize;
> -
> -       if (lo_do_transfer(lo, READ, page, buf->offset, p->page, p->offset, size, IV)) {
> -               printk_ratelimited(KERN_ERR "loop: transfer error block %ld\n",
> -                      page->index);
> -               size = -EINVAL;
> -       }
> +       rq_for_each_segment(bvec, rq, iter) {
> +               iov_iter_bvec(&i, ITER_BVEC, &bvec, 1, bvec.bv_len);
> +               len = vfs_iter_read(lo->lo_backing_file, &i, &pos);
> +               if (len < 0)
> +                       return len;
>
> -       flush_dcache_page(p->page);
> +               flush_dcache_page(bvec.bv_page);
>
> -       if (size > 0)
> -               p->offset += size;
> +               if (len != bvec.bv_len) {
> +                       struct bio *bio;
>
> -       return size;
> -}
> +                       __rq_for_each_bio(bio, rq)
> +                               zero_fill_bio(bio);
> +                       break;
> +               }
> +               cond_resched();
> +       }
>
> -static int
> -lo_direct_splice_actor(struct pipe_inode_info *pipe, struct splice_desc *sd)
> -{
> -       return __splice_from_pipe(pipe, sd, lo_splice_actor);
> +       return 0;
>  }
>
> -static ssize_t
> -do_lo_receive(struct loop_device *lo,
> -             struct bio_vec *bvec, int bsize, loff_t pos)
> +static int lo_read_transfer(struct loop_device *lo, struct request *rq,
> +               loff_t pos)
>  {
> -       struct lo_read_data cookie;
> -       struct splice_desc sd;
> -       struct file *file;
> -       ssize_t retval;
> -
> -       cookie.lo = lo;
> -       cookie.page = bvec->bv_page;
> -       cookie.offset = bvec->bv_offset;
> -       cookie.bsize = bsize;
> +       struct bio_vec bvec, b;
> +       struct req_iterator iter;
> +       struct iov_iter i;
> +       struct page *page;
> +       ssize_t len;
> +       int ret = 0;
>
> -       sd.len = 0;
> -       sd.total_len = bvec->bv_len;
> -       sd.flags = 0;
> -       sd.pos = pos;
> -       sd.u.data = &cookie;
> +       page = alloc_page(GFP_NOIO);
> +       if (unlikely(!page))
> +               return -ENOMEM;

The above page allocation is one code duplication.

>
> -       file = lo->lo_backing_file;
> -       retval = splice_direct_to_actor(file, &sd, lo_direct_splice_actor);
> +       rq_for_each_segment(bvec, rq, iter) {
> +               loff_t offset = pos;
>
> -       return retval;
> -}
> +               b.bv_page = page;
> +               b.bv_offset = 0;
> +               b.bv_len = bvec.bv_len;
>
> -static int
> -lo_receive(struct loop_device *lo, struct request *rq, int bsize, loff_t pos)
> -{
> -       struct bio_vec bvec;
> -       struct req_iterator iter;
> -       ssize_t s;
> +               iov_iter_bvec(&i, ITER_BVEC, &b, 1, b.bv_len);
> +               len = vfs_iter_read(lo->lo_backing_file, &i, &pos);
> +               if (len < 0) {
> +                       ret = len;
> +                       goto out_free_page;
> +               }
>
> -       rq_for_each_segment(bvec, rq, iter) {
> -               s = do_lo_receive(lo, &bvec, bsize, pos);
> -               if (s < 0)
> -                       return s;
> +               ret = lo_do_transfer(lo, READ, page, 0, bvec.bv_page,
> +                       bvec.bv_offset, len, offset >> 9);
> +               if (ret)
> +                       goto out_free_page;
> +
> +               flush_dcache_page(bvec.bv_page);
>
> -               if (s != bvec.bv_len) {
> +               if (len != bvec.bv_len) {
>                         struct bio *bio;
>
>                         __rq_for_each_bio(bio, rq)
>                                 zero_fill_bio(bio);
>                         break;
>                 }

both flush_fdcache_page and zero_fill_bio are code duplication too.

> -               pos += bvec.bv_len;
>         }
> -       return 0;
> +
> +       ret = 0;
> +out_free_page:
> +       __free_page(page);
> +       return ret;
>  }

Free page is code duplication too.

>
>  static int lo_discard(struct loop_device *lo, struct request *rq, loff_t pos)
> @@ -462,10 +402,17 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq)
>                         ret = lo_req_flush(lo, rq);
>                 else if (rq->cmd_flags & REQ_DISCARD)
>                         ret = lo_discard(lo, rq, pos);
> +               else if (lo->transfer)
> +                       ret = lo_write_transfer(lo, rq, pos);
>                 else
> -                       ret = lo_send(lo, rq, pos);
> -       } else
> -               ret = lo_receive(lo, rq, lo->lo_blocksize, pos);
> +                       ret = lo_write_simple(lo, rq, pos);
> +
> +       } else {
> +               if (lo->transfer)
> +                       ret = lo_read_transfer(lo, rq, pos);
> +               else
> +                       ret = lo_read_simple(lo, rq, pos);
> +       }

IMO, lo_read/write_transfer and lo_read/write_simple can be
merged to avoid code duplication, since the logic for handling
lo->transfer in read/write is quite simple.

>
>         return ret;
>  }
> @@ -786,7 +733,7 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
>         lo->lo_device = bdev;
>         lo->lo_flags = lo_flags;
>         lo->lo_backing_file = file;
> -       lo->transfer = transfer_none;
> +       lo->transfer = NULL;
>         lo->ioctl = NULL;
>         lo->lo_sizelimit = 0;
>         lo->old_gfp_mask = mapping_gfp_mask(mapping);
> --
> 1.9.1
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/3] block: loop: switch to VFS ITER_BVEC
  2015-03-25  7:23       ` Ming Lei
@ 2015-03-25 15:27         ` Christoph Hellwig
  2015-03-31 22:22           ` Al Viro
  2015-04-04  5:20           ` Al Viro
  0 siblings, 2 replies; 20+ messages in thread
From: Christoph Hellwig @ 2015-03-25 15:27 UTC (permalink / raw)
  To: Ming Lei
  Cc: Christoph Hellwig, Jens Axboe, Linux Kernel Mailing List,
	Al Viro, Maxim Patlasov

On Wed, Mar 25, 2015 at 03:23:48PM +0800, Ming Lei wrote:
> > -       mm_segment_t old_fs = get_fs();
> > +
> > +       iov_iter_bvec(&i, ITER_BVEC, bvec, 1, bvec->bv_len);
> >
> >         file_start_write(file);
> > -       set_fs(get_ds());
> > -       bw = file->f_op->write(file, buf, len, &pos);
> > -       set_fs(old_fs);
> > +       bw = vfs_iter_write(file, &i, ppos);
> 
> This patch moves to support ->read_iter/->write_iter only, which
> might cause regression for backing file without defining read/write
> iter callback.

->read_iter/->write_iter is the main fs I/O path - by the time this is
ready ->aio_read/->aio_write should be gone.

> > +       page = alloc_page(GFP_NOIO);
> > +       if (unlikely(!page))
> > +               return -ENOMEM;
> 
> The above page allocation is one code duplication.

A very trivial one, isn't it? :)

> IMO, lo_read/write_transfer and lo_read/write_simple can be
> merged to avoid code duplication, since the logic for handling
> lo->transfer in read/write is quite simple.

If you have a patch that merges them while making the code smaller and
simpler you're welcome.  I wasn't really good enough to come up with
a way that would be an improvement.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/3] block: loop: switch to VFS ITER_BVEC
  2015-03-25 15:27         ` Christoph Hellwig
@ 2015-03-31 22:22           ` Al Viro
  2015-04-04  5:20           ` Al Viro
  1 sibling, 0 replies; 20+ messages in thread
From: Al Viro @ 2015-03-31 22:22 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Ming Lei, Jens Axboe, Linux Kernel Mailing List, Maxim Patlasov

On Wed, Mar 25, 2015 at 04:27:48PM +0100, Christoph Hellwig wrote:
> On Wed, Mar 25, 2015 at 03:23:48PM +0800, Ming Lei wrote:
> > > -       mm_segment_t old_fs = get_fs();
> > > +
> > > +       iov_iter_bvec(&i, ITER_BVEC, bvec, 1, bvec->bv_len);
> > >
> > >         file_start_write(file);
> > > -       set_fs(get_ds());
> > > -       bw = file->f_op->write(file, buf, len, &pos);
> > > -       set_fs(old_fs);
> > > +       bw = vfs_iter_write(file, &i, ppos);
> > 
> > This patch moves to support ->read_iter/->write_iter only, which
> > might cause regression for backing file without defining read/write
> > iter callback.
> 
> ->read_iter/->write_iter is the main fs I/O path - by the time this is
> ready ->aio_read/->aio_write should be gone.

Umm...  FWIW, it's not about ->aio_write - that conversion is practically
complete; nothing with ->aio_write != NULL is plausible for backing store
of /dev/loop anyway.  Regression in question is about the files that
have ->write(), but no ->write_iter().  Most of that won't be usable as
backing store, but there are some exceptions:
	* ncpfs regular files.  Ought to be switched to {read,write}_iter -
it's a pretty mechanical work, I'll do that.  Only marginally useful,
but what the hell - why not do it right?
	* coda.  Ought to switch to ->..._iter() and demand _its_ backing
store to have those.  Trivial, will do.
	* 9p.  Cthulhu-awful mess.  The trouble is, we'd need to support
ITER_BVEC ones down in p9_client_write().  I'm actually trying to do that
right now; if it works, this regression will become a non-issue.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/3] block: loop: switch to VFS ITER_BVEC
  2015-03-25 15:27         ` Christoph Hellwig
  2015-03-31 22:22           ` Al Viro
@ 2015-04-04  5:20           ` Al Viro
  2015-04-07 16:23             ` Christoph Hellwig
  1 sibling, 1 reply; 20+ messages in thread
From: Al Viro @ 2015-04-04  5:20 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Ming Lei, Jens Axboe, Linux Kernel Mailing List, Maxim Patlasov

On Wed, Mar 25, 2015 at 04:27:48PM +0100, Christoph Hellwig wrote:
> On Wed, Mar 25, 2015 at 03:23:48PM +0800, Ming Lei wrote:
> > > -       mm_segment_t old_fs = get_fs();
> > > +
> > > +       iov_iter_bvec(&i, ITER_BVEC, bvec, 1, bvec->bv_len);
> > >
> > >         file_start_write(file);
> > > -       set_fs(get_ds());
> > > -       bw = file->f_op->write(file, buf, len, &pos);
> > > -       set_fs(old_fs);
> > > +       bw = vfs_iter_write(file, &i, ppos);
> > 
> > This patch moves to support ->read_iter/->write_iter only, which
> > might cause regression for backing file without defining read/write
> > iter callback.
> 
> ->read_iter/->write_iter is the main fs I/O path - by the time this is
> ready ->aio_read/->aio_write should be gone.

	So they are.  Folks, could you base /dev/loop stuff on current
vfs.git#for-next?  And yes, everything that might serve as backing file
for /dev/loop has ->write_iter() now.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/3] block: loop: switch to VFS ITER_BVEC
  2015-04-04  5:20           ` Al Viro
@ 2015-04-07 16:23             ` Christoph Hellwig
  0 siblings, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2015-04-07 16:23 UTC (permalink / raw)
  To: Al Viro
  Cc: Christoph Hellwig, Ming Lei, Jens Axboe,
	Linux Kernel Mailing List, Maxim Patlasov

On Sat, Apr 04, 2015 at 06:20:05AM +0100, Al Viro wrote:
> 	So they are.  Folks, could you base /dev/loop stuff on current
> vfs.git#for-next?  And yes, everything that might serve as backing file
> for /dev/loop has ->write_iter() now.

Rebased patch below:

---
>From d977f676d9e82b69555d8cd0c76dadb2b2bdc0b7 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Fri, 16 Jan 2015 15:06:03 +0100
Subject: loop: convert to use ITER_BVEC reads/writes

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/loop.c | 290 +++++++++++++++++++++------------------------------
 1 file changed, 118 insertions(+), 172 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index c4fd1e4..61d8009 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -88,28 +88,6 @@ static int part_shift;
 
 static struct workqueue_struct *loop_wq;
 
-/*
- * Transfer functions
- */
-static int transfer_none(struct loop_device *lo, int cmd,
-			 struct page *raw_page, unsigned raw_off,
-			 struct page *loop_page, unsigned loop_off,
-			 int size, sector_t real_block)
-{
-	char *raw_buf = kmap_atomic(raw_page) + raw_off;
-	char *loop_buf = kmap_atomic(loop_page) + loop_off;
-
-	if (cmd == READ)
-		memcpy(loop_buf, raw_buf, size);
-	else
-		memcpy(raw_buf, loop_buf, size);
-
-	kunmap_atomic(loop_buf);
-	kunmap_atomic(raw_buf);
-	cond_resched();
-	return 0;
-}
-
 static int transfer_xor(struct loop_device *lo, int cmd,
 			struct page *raw_page, unsigned raw_off,
 			struct page *loop_page, unsigned loop_off,
@@ -148,7 +126,6 @@ static int xor_init(struct loop_device *lo, const struct loop_info64 *info)
 
 static struct loop_func_table none_funcs = {
 	.number = LO_CRYPT_NONE,
-	.transfer = transfer_none,
 }; 	
 
 static struct loop_func_table xor_funcs = {
@@ -215,207 +192,169 @@ lo_do_transfer(struct loop_device *lo, int cmd,
 	       struct page *lpage, unsigned loffs,
 	       int size, sector_t rblock)
 {
-	if (unlikely(!lo->transfer))
+	int ret;
+
+	ret = lo->transfer(lo, cmd, rpage, roffs, lpage, loffs, size, rblock);
+	if (likely(!ret))
 		return 0;
 
-	return lo->transfer(lo, cmd, rpage, roffs, lpage, loffs, size, rblock);
+	printk_ratelimited(KERN_ERR
+		"loop: Transfer error at byte offset %llu, length %i.\n",
+		(unsigned long long)rblock << 9, size);
+	return ret;
 }
 
-/**
- * __do_lo_send_write - helper for writing data to a loop device
- *
- * This helper just factors out common code between do_lo_send_direct_write()
- * and do_lo_send_write().
- */
-static int __do_lo_send_write(struct file *file,
-		u8 *buf, const int len, loff_t pos)
+static int lo_write_bvec(struct file *file, struct bio_vec *bvec, loff_t *ppos)
 {
-	struct kvec kvec = {.iov_base = buf, .iov_len = len};
-	struct iov_iter from;
+	struct iov_iter i;
 	ssize_t bw;
 
-	iov_iter_kvec(&from, ITER_KVEC | WRITE, &kvec, 1, len);
+	iov_iter_bvec(&i, ITER_BVEC, bvec, 1, bvec->bv_len);
 
 	file_start_write(file);
-	bw = vfs_iter_write(file, &from, &pos);
+	bw = vfs_iter_write(file, &i, ppos);
 	file_end_write(file);
-	if (likely(bw == len))
+
+	if (likely(bw ==  bvec->bv_len))
 		return 0;
-	printk_ratelimited(KERN_ERR "loop: Write error at byte offset %llu, length %i.\n",
-			(unsigned long long)pos, len);
+
+	printk_ratelimited(KERN_ERR
+		"loop: Write error at byte offset %llu, length %i.\n",
+		(unsigned long long)*ppos, bvec->bv_len);
 	if (bw >= 0)
 		bw = -EIO;
 	return bw;
 }
 
-/**
- * do_lo_send_direct_write - helper for writing data to a loop device
- *
- * This is the fast, non-transforming version that does not need double
- * buffering.
- */
-static int do_lo_send_direct_write(struct loop_device *lo,
-		struct bio_vec *bvec, loff_t pos, struct page *page)
+static int lo_write_simple(struct loop_device *lo, struct request *rq,
+		loff_t pos)
 {
-	ssize_t bw = __do_lo_send_write(lo->lo_backing_file,
-			kmap(bvec->bv_page) + bvec->bv_offset,
-			bvec->bv_len, pos);
-	kunmap(bvec->bv_page);
-	cond_resched();
-	return bw;
+	struct bio_vec bvec;
+	struct req_iterator iter;
+	int ret = 0;
+
+	rq_for_each_segment(bvec, rq, iter) {
+		ret = lo_write_bvec(lo->lo_backing_file, &bvec, &pos);
+		if (ret < 0)
+			break;
+		cond_resched();
+	}
+
+	return ret;
 }
 
-/**
- * do_lo_send_write - helper for writing data to a loop device
- *
+/*
  * This is the slow, transforming version that needs to double buffer the
  * data as it cannot do the transformations in place without having direct
  * access to the destination pages of the backing file.
  */
-static int do_lo_send_write(struct loop_device *lo, struct bio_vec *bvec,
-		loff_t pos, struct page *page)
+static int lo_write_transfer(struct loop_device *lo, struct request *rq,
+		loff_t pos)
 {
-	int ret = lo_do_transfer(lo, WRITE, page, 0, bvec->bv_page,
-			bvec->bv_offset, bvec->bv_len, pos >> 9);
-	if (likely(!ret))
-		return __do_lo_send_write(lo->lo_backing_file,
-				page_address(page), bvec->bv_len,
-				pos);
-	printk_ratelimited(KERN_ERR "loop: Transfer error at byte offset %llu, "
-			"length %i.\n", (unsigned long long)pos, bvec->bv_len);
-	if (ret > 0)
-		ret = -EIO;
-	return ret;
-}
-
-static int lo_send(struct loop_device *lo, struct request *rq, loff_t pos)
-{
-	int (*do_lo_send)(struct loop_device *, struct bio_vec *, loff_t,
-			struct page *page);
-	struct bio_vec bvec;
+	struct bio_vec bvec, b;
 	struct req_iterator iter;
-	struct page *page = NULL;
+	struct page *page;
 	int ret = 0;
 
-	if (lo->transfer != transfer_none) {
-		page = alloc_page(GFP_NOIO | __GFP_HIGHMEM);
-		if (unlikely(!page))
-			goto fail;
-		kmap(page);
-		do_lo_send = do_lo_send_write;
-	} else {
-		do_lo_send = do_lo_send_direct_write;
-	}
+	page = alloc_page(GFP_NOIO);
+	if (unlikely(!page))
+		return -ENOMEM;
 
 	rq_for_each_segment(bvec, rq, iter) {
-		ret = do_lo_send(lo, &bvec, pos, page);
+		ret = lo_do_transfer(lo, WRITE, page, 0, bvec.bv_page,
+			bvec.bv_offset, bvec.bv_len, pos >> 9);
+		if (unlikely(ret))
+			break;
+
+		b.bv_page = page;
+		b.bv_offset = 0;
+		b.bv_len = bvec.bv_len;
+		ret = lo_write_bvec(lo->lo_backing_file, &b, &pos);
 		if (ret < 0)
 			break;
-		pos += bvec.bv_len;
 	}
-	if (page) {
-		kunmap(page);
-		__free_page(page);
-	}
-out:
+
+	__free_page(page);
 	return ret;
-fail:
-	printk_ratelimited(KERN_ERR "loop: Failed to allocate temporary page for write.\n");
-	ret = -ENOMEM;
-	goto out;
 }
 
-struct lo_read_data {
-	struct loop_device *lo;
-	struct page *page;
-	unsigned offset;
-	int bsize;
-};
+static int lo_read_simple(struct loop_device *lo, struct request *rq,
+		loff_t pos)
+{
+	struct bio_vec bvec;
+	struct req_iterator iter;
+	struct iov_iter i;
+	ssize_t len;
 
-static int
-lo_splice_actor(struct pipe_inode_info *pipe, struct pipe_buffer *buf,
-		struct splice_desc *sd)
-{
-	struct lo_read_data *p = sd->u.data;
-	struct loop_device *lo = p->lo;
-	struct page *page = buf->page;
-	sector_t IV;
-	int size;
-
-	IV = ((sector_t) page->index << (PAGE_CACHE_SHIFT - 9)) +
-							(buf->offset >> 9);
-	size = sd->len;
-	if (size > p->bsize)
-		size = p->bsize;
-
-	if (lo_do_transfer(lo, READ, page, buf->offset, p->page, p->offset, size, IV)) {
-		printk_ratelimited(KERN_ERR "loop: transfer error block %ld\n",
-		       page->index);
-		size = -EINVAL;
-	}
+	rq_for_each_segment(bvec, rq, iter) {
+		iov_iter_bvec(&i, ITER_BVEC, &bvec, 1, bvec.bv_len);
+		len = vfs_iter_read(lo->lo_backing_file, &i, &pos);
+		if (len < 0)
+			return len;
 
-	flush_dcache_page(p->page);
+		flush_dcache_page(bvec.bv_page);
 
-	if (size > 0)
-		p->offset += size;
+		if (len != bvec.bv_len) {
+			struct bio *bio;
 
-	return size;
-}
+			__rq_for_each_bio(bio, rq)
+				zero_fill_bio(bio);
+			break;
+		}
+		cond_resched();
+	}
 
-static int
-lo_direct_splice_actor(struct pipe_inode_info *pipe, struct splice_desc *sd)
-{
-	return __splice_from_pipe(pipe, sd, lo_splice_actor);
+	return 0;
 }
 
-static ssize_t
-do_lo_receive(struct loop_device *lo,
-	      struct bio_vec *bvec, int bsize, loff_t pos)
+static int lo_read_transfer(struct loop_device *lo, struct request *rq,
+		loff_t pos)
 {
-	struct lo_read_data cookie;
-	struct splice_desc sd;
-	struct file *file;
-	ssize_t retval;
-
-	cookie.lo = lo;
-	cookie.page = bvec->bv_page;
-	cookie.offset = bvec->bv_offset;
-	cookie.bsize = bsize;
+	struct bio_vec bvec, b;
+	struct req_iterator iter;
+	struct iov_iter i;
+	struct page *page;
+	ssize_t len;
+	int ret = 0;
 
-	sd.len = 0;
-	sd.total_len = bvec->bv_len;
-	sd.flags = 0;
-	sd.pos = pos;
-	sd.u.data = &cookie;
+	page = alloc_page(GFP_NOIO);
+	if (unlikely(!page))
+		return -ENOMEM;
 
-	file = lo->lo_backing_file;
-	retval = splice_direct_to_actor(file, &sd, lo_direct_splice_actor);
+	rq_for_each_segment(bvec, rq, iter) {
+		loff_t offset = pos;
 
-	return retval;
-}
+		b.bv_page = page;
+		b.bv_offset = 0;
+		b.bv_len = bvec.bv_len;
 
-static int
-lo_receive(struct loop_device *lo, struct request *rq, int bsize, loff_t pos)
-{
-	struct bio_vec bvec;
-	struct req_iterator iter;
-	ssize_t s;
+		iov_iter_bvec(&i, ITER_BVEC, &b, 1, b.bv_len);
+		len = vfs_iter_read(lo->lo_backing_file, &i, &pos);
+		if (len < 0) {
+			ret = len;
+			goto out_free_page;
+		}
 
-	rq_for_each_segment(bvec, rq, iter) {
-		s = do_lo_receive(lo, &bvec, bsize, pos);
-		if (s < 0)
-			return s;
+		ret = lo_do_transfer(lo, READ, page, 0, bvec.bv_page,
+			bvec.bv_offset, len, offset >> 9);
+		if (ret)
+			goto out_free_page;
+	
+		flush_dcache_page(bvec.bv_page);
 
-		if (s != bvec.bv_len) {
+		if (len != bvec.bv_len) {
 			struct bio *bio;
 
 			__rq_for_each_bio(bio, rq)
 				zero_fill_bio(bio);
 			break;
 		}
-		pos += bvec.bv_len;
 	}
-	return 0;
+
+	ret = 0;
+out_free_page:
+	__free_page(page);
+	return ret;
 }
 
 static int lo_discard(struct loop_device *lo, struct request *rq, loff_t pos)
@@ -464,10 +403,17 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq)
 			ret = lo_req_flush(lo, rq);
 		else if (rq->cmd_flags & REQ_DISCARD)
 			ret = lo_discard(lo, rq, pos);
+		else if (lo->transfer)
+			ret = lo_write_transfer(lo, rq, pos);
 		else
-			ret = lo_send(lo, rq, pos);
-	} else
-		ret = lo_receive(lo, rq, lo->lo_blocksize, pos);
+			ret = lo_write_simple(lo, rq, pos);
+
+	} else {
+		if (lo->transfer)
+			ret = lo_read_transfer(lo, rq, pos);
+		else
+			ret = lo_read_simple(lo, rq, pos);
+	}
 
 	return ret;
 }
@@ -788,7 +734,7 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
 	lo->lo_device = bdev;
 	lo->lo_flags = lo_flags;
 	lo->lo_backing_file = file;
-	lo->transfer = transfer_none;
+	lo->transfer = NULL;
 	lo->ioctl = NULL;
 	lo->lo_sizelimit = 0;
 	lo->old_gfp_mask = mapping_gfp_mask(mapping);
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2015-04-07 16:23 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-22  8:14 [PATCH 0/3] block: loop: switch to VFS ITER_BVEC Ming Lei
2015-03-22  8:14 ` [PATCH 1/3] block: loop: use kmap(page) instead of page_address(page) Ming Lei
2015-03-24 10:29   ` Christoph Hellwig
2015-03-24 10:49     ` Ming Lei
2015-03-24 11:09       ` Christoph Hellwig
2015-03-24 11:24         ` Ming Lei
2015-03-22  8:14 ` [PATCH 2/3] block: loop: unify interface type of lo_send and lo_receive Ming Lei
2015-03-24 10:30   ` Christoph Hellwig
2015-03-24 10:55     ` Ming Lei
2015-03-22  8:14 ` [PATCH 3/3] block: loop: use vfs ITER_BVEC to read/write backing file Ming Lei
2015-03-24 10:31   ` Christoph Hellwig
2015-03-24 11:01     ` Ming Lei
2015-03-24 10:32 ` [PATCH 0/3] block: loop: switch to VFS ITER_BVEC Christoph Hellwig
2015-03-24 10:53   ` Ming Lei
2015-03-24 18:01     ` Christoph Hellwig
2015-03-25  7:23       ` Ming Lei
2015-03-25 15:27         ` Christoph Hellwig
2015-03-31 22:22           ` Al Viro
2015-04-04  5:20           ` Al Viro
2015-04-07 16:23             ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).