* [PATCH] block_dev: implement readpages() to optimize sequential read
@ 2014-07-17 14:17 Akinobu Mita
  2014-07-17 18:31 ` Jeff Moyer
  0 siblings, 1 reply; 4+ messages in thread
From: Akinobu Mita @ 2014-07-17 14:17 UTC (permalink / raw)
  To: linux-kernel
  Cc: Akinobu Mita, Andrew Morton, Jens Axboe, Alexander Viro, linux-fsdevel

Sequential read from a block device is expected to be equal or faster
than from the file on a filesystem.  But it is not correct due to the
lack of effective readpages() in the address space operations for
block device.

This patch implements the readpages() operation for block devices using
mpage_readpages(), which can create multipage BIOs instead of one BIO
per page.  This reduces system CPU time consumption.

Install 1GB of RAM disk storage:

	# modprobe scsi_debug dev_size_mb=1024 delay=0

Sequential read from file on a filesystem:

	# mkfs.ext4 /dev/$DEV
	# mount /dev/$DEV /mnt
	# fio --name=t --size=512m --rw=read --filename=/mnt/file
	...
	  read : io=524288KB, bw=2133.4MB/s, iops=546133, runt=   240msec

Sequential read from a block device:

	# fio --name=t --size=512m --rw=read --filename=/dev/sdb
	...
(Without this commit)
	  read : io=524288KB, bw=1700.2MB/s, iops=435455, runt=   301msec

(With this commit)
	  read : io=524288KB, bw=2160.4MB/s, iops=553046, runt=   237msec

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
---
 fs/block_dev.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 6d72746..e2f3ad08 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -304,6 +304,12 @@ static int blkdev_readpage(struct file * file, struct page * page)
 	return block_read_full_page(page, blkdev_get_block);
 }
 
+static int blkdev_readpages(struct file *file, struct address_space *mapping,
+			struct list_head *pages, unsigned nr_pages)
+{
+	return mpage_readpages(mapping, pages, nr_pages, blkdev_get_block);
+}
+
 static int blkdev_write_begin(struct file *file, struct address_space *mapping,
 			loff_t pos, unsigned len, unsigned flags,
 			struct page **pagep, void **fsdata)
@@ -1622,6 +1628,7 @@ static int blkdev_releasepage(struct page *page, gfp_t wait)
 
 static const struct address_space_operations def_blk_aops = {
 	.readpage	= blkdev_readpage,
+	.readpages	= blkdev_readpages,
 	.writepage	= blkdev_writepage,
 	.write_begin	= blkdev_write_begin,
 	.write_end	= blkdev_write_end,
-- 
1.9.1



* Re: [PATCH] block_dev: implement readpages() to optimize sequential read
  2014-07-17 14:17 [PATCH] block_dev: implement readpages() to optimize sequential read Akinobu Mita
@ 2014-07-17 18:31 ` Jeff Moyer
  2014-07-17 23:14   ` Akinobu Mita
  0 siblings, 1 reply; 4+ messages in thread
From: Jeff Moyer @ 2014-07-17 18:31 UTC (permalink / raw)
  To: Akinobu Mita
  Cc: linux-kernel, Andrew Morton, Jens Axboe, Alexander Viro, linux-fsdevel

Akinobu Mita <akinobu.mita@gmail.com> writes:

> Sequential read from a block device is expected to be equal or faster
> than from the file on a filesystem.  But it is not correct due to the
> lack of effective readpages() in the address space operations for
> block device.

Ah, a trip down memory lane.  ;-)  Here's a thread showing issues with
the last time this was proposed (by me, incidentally):

https://lkml.org/lkml/2009/6/2/480

At the very least, we need to see numbers on a real device, and see it
booted on something with >4k page size before taking this back in.

Cheers,
Jeff


* Re: [PATCH] block_dev: implement readpages() to optimize sequential read
  2014-07-17 18:31 ` Jeff Moyer
@ 2014-07-17 23:14   ` Akinobu Mita
  2014-07-23 16:18     ` Akinobu Mita
  0 siblings, 1 reply; 4+ messages in thread
From: Akinobu Mita @ 2014-07-17 23:14 UTC (permalink / raw)
  To: Jeff Moyer; +Cc: LKML, Andrew Morton, Jens Axboe, Alexander Viro, linux-fsdevel

2014-07-18 3:31 GMT+09:00 Jeff Moyer <jmoyer@redhat.com>:
> Akinobu Mita <akinobu.mita@gmail.com> writes:
>
>> Sequential read from a block device is expected to be equal or faster
>> than from the file on a filesystem.  But it is not correct due to the
>> lack of effective readpages() in the address space operations for
>> block device.
>
> Ah, a trip down memory lane.  ;-)  Here's a thread showing issues with
> the last time this was proposed (by me, incidentally):
>
> https://lkml.org/lkml/2009/6/2/480
>
> At the very least, we need to see numbers on a real device, and see it
> booted on something with >4k page size before taking this back in.

Thanks for the information.  First, I'll try to reproduce the >4k page
size issue with qemu with the architecture which can support 64k page.


* Re: [PATCH] block_dev: implement readpages() to optimize sequential read
  2014-07-17 23:14   ` Akinobu Mita
@ 2014-07-23 16:18     ` Akinobu Mita
  0 siblings, 0 replies; 4+ messages in thread
From: Akinobu Mita @ 2014-07-23 16:18 UTC (permalink / raw)
  To: Jeff Moyer; +Cc: LKML, Andrew Morton, Jens Axboe, Alexander Viro, linux-fsdevel

2014-07-18 8:14 GMT+09:00 Akinobu Mita <akinobu.mita@gmail.com>:
> 2014-07-18 3:31 GMT+09:00 Jeff Moyer <jmoyer@redhat.com>:
>> Akinobu Mita <akinobu.mita@gmail.com> writes:
>>
>>> Sequential read from a block device is expected to be equal or faster
>>> than from the file on a filesystem.  But it is not correct due to the
>>> lack of effective readpages() in the address space operations for
>>> block device.
>>
>> Ah, a trip down memory lane.  ;-)  Here's a thread showing issues with
>> the last time this was proposed (by me, incidentally):
>>
>> https://lkml.org/lkml/2009/6/2/480
>>
>> At the very least, we need to see numbers on a real device, and see it
>> booted on something with >4k page size before taking this back in.
>
> Thanks for the information.  First, I'll try to reproduce the >4k page
> size issue with qemu with the architecture which can support 64k page.

I can reproduce the issue, which is an attempt to access beyond the end
of the device, and it turns out that it is reproducible on a 4K page
system, too.

This problem only happens when reading a partition while it is mounted
by a filesystem.  The filesystem sets the block size to a multiple of
the logical block size with set_blocksize() when it is mounted.  If the
end of the partition is not aligned to the new block size, reading the
end of the partition causes an access beyond the end of the device.

So this patch is incomplete: either a guard check is required in
mpage_readpages(), or an alternative solution is needed.
