* [PATCH 00/60] block: support multipage bvec
@ 2016-10-29  8:07 ` Ming Lei
From: Ming Lei @ 2016-10-29  8:07 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel
  Cc: linux-block, linux-fsdevel, Christoph Hellwig,
	Kirill A . Shutemov, Ming Lei, Al Viro, Andrew Morton,
	Bart Van Assche, open list:GFS2 FILE SYSTEM, Coly Li,
	Dan Williams, open list:DEVICE-MAPPER  LVM,
	open list:DRBD DRIVER, Eric Wheeler, Guoqing Jiang,
	Hannes Reinecke, Hannes Reinecke, Jiri Kosina, Joe Perches,
	Johannes Berg, Johannes Thumshirn, Keith Busch, Kent

Hi,

This patchset brings multipage bvecs into the block layer. Basic
xfstests (-a auto) over virtio-blk/virtio-scsi have been run
and no regressions were found, so it should be good enough
to show the approach now. Any comments are welcome!

1) what is multipage bvec?

Multipage bvec means that one 'struct bio_vec' can hold
multiple physically contiguous pages, instead of the
single page the Linux kernel has used for a long time.
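
As a rough illustration of the idea (not code from this patchset),
the userspace C sketch below models a bvec table in which a page
that is physically adjacent to the end of the last entry is merged
into that entry. The names sketch_bvec, sketch_add_page() and
SKETCH_PAGE_SIZE are hypothetical, and a page frame number stands
in for 'struct page *'.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this model */

/* Userspace stand-in for a bvec; a pfn replaces 'struct page *'. */
struct sketch_bvec {
	unsigned long pfn;	/* first page frame of the segment */
	unsigned int offset;	/* byte offset into the first page */
	unsigned int len;	/* total bytes, may span several pages */
};

/*
 * Append one whole page to the table. If the page directly follows
 * the end of the last entry in physical memory, grow that entry;
 * otherwise start a new one.
 * Returns the new entry count, or 0 if the table is full.
 */
static size_t sketch_add_page(struct sketch_bvec *tbl, size_t cnt,
			      size_t max, unsigned long pfn)
{
	assert(tbl != NULL);

	if (cnt > 0) {
		struct sketch_bvec *last = &tbl[cnt - 1];
		unsigned long end = (unsigned long)last->offset + last->len;

		/* Mergeable only if the entry ends on a page boundary. */
		if (end % SKETCH_PAGE_SIZE == 0 &&
		    last->pfn + end / SKETCH_PAGE_SIZE == pfn) {
			last->len += SKETCH_PAGE_SIZE;
			return cnt;
		}
	}

	if (cnt == max)
		return 0;

	tbl[cnt].pfn = pfn;
	tbl[cnt].offset = 0;
	tbl[cnt].len = SKETCH_PAGE_SIZE;
	return cnt + 1;
}
```

In this model, adding pfns 100, 101 and 102 leaves a single entry
of three pages, while a following pfn 200 starts a second entry.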

2) why is multipage bvec introduced?

Kent proposed the idea[1] first.

As systems' RAM becomes much bigger than before, and
huge pages, transparent huge pages and memory compaction
are widely used, it is now fairly easy to see physically
contiguous pages inside the fs/block stack. On the other
hand, from the block layer's view, it isn't necessary to
store intermediate pages in the bvec; it is enough to
store just the physically contiguous 'segment'.

Also, huge pages are being brought to filesystems[2], so we
can do IO a huge page at a time[3], which requires that one bio
can transfer at least one huge page at a time. It turns out that
simply changing BIO_MAX_PAGES isn't flexible[3]. Multipage bvec
fits this case very well.

With multipage bvec:

- bio size can be increased, which should in theory improve
some high-bandwidth IO cases[4].

- Inside the block layer, both bio splitting and sg mapping can
become more efficient than before by traversing just the
physically contiguous 'segments' instead of each page.

- There is a possibility of reducing the memory footprint
of bvec usage in the future.

3) how is multipage bvec implemented in this patchset?

The first 22 patches clean up direct access to the bvec table
and comment on some special cases. With this approach,
most cases are found to be safe for multipage bvec;
only fs/buffer, pktcdvd, dm-io, MD and btrfs need to be
dealt with.

Given that a little more work is involved to clean up pktcdvd,
MD and btrfs, this patchset introduces QUEUE_FLAG_NO_MP for
them, so these components can still see/use singlepage bvecs.
In the future, once the cleanup is done, the flag can be killed.

The second part (23 ~ 60) implements multipage bvec in block:

- put all tricks into the bvec/bio/rq iterators; as long as
drivers and filesystems use these standard iterators, they are
happy with multipage bvec

- bio_for_each_segment_all() changes
This helper passes a pointer to each bvec directly to the user,
so it has to be changed. Two new helpers (bio_for_each_segment_all_rd()
and bio_for_each_segment_all_wt()) are introduced.

- bio_clone() changes
By default bio_clone() still clones the new bio in the multipage
bvec way. A single-page version of bio_clone() is also introduced
for special cases where only single-page bvecs may be used
for the newly cloned bio (bio bounce, ...)
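
To make the iterator idea concrete, here is a minimal userspace
sketch (an assumption about the general technique, not the
patchset's implementation) of how a single-page view can be derived
from a multipage bvec on the fly, which is roughly what a
single-page iterator has to do. The names sketch_bvec,
sketch_sp_piece() and sketch_count_sp_pieces() are hypothetical,
and a page frame number stands in for 'struct page *'.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this model */

/* Userspace stand-in for a bvec; a pfn replaces 'struct page *'. */
struct sketch_bvec {
	unsigned long pfn;	/* first page frame of the segment */
	unsigned int offset;	/* byte offset into the first page */
	unsigned int len;	/* total bytes, may span several pages */
};

/*
 * Return the single-page piece that starts 'done' bytes into the
 * multipage bvec 'mp'. The piece never crosses a page boundary.
 */
static struct sketch_bvec sketch_sp_piece(const struct sketch_bvec *mp,
					  unsigned int done)
{
	struct sketch_bvec sp;
	unsigned int pos, room, left;

	assert(done < mp->len);

	pos = mp->offset + done;		/* bytes from start of first page */
	sp.pfn = mp->pfn + pos / SKETCH_PAGE_SIZE;
	sp.offset = pos % SKETCH_PAGE_SIZE;
	room = SKETCH_PAGE_SIZE - sp.offset;	/* bytes left in this page */
	left = mp->len - done;			/* bytes left in the bvec */
	sp.len = left < room ? left : room;
	return sp;
}

/* Walk the whole bvec page by page and count the pieces. */
static unsigned int sketch_count_sp_pieces(const struct sketch_bvec *mp)
{
	unsigned int done = 0, n = 0;

	while (done < mp->len) {
		done += sketch_sp_piece(mp, done).len;
		n++;
	}
	return n;
}
```

A caller that only understands single-page bvecs could loop over
sketch_sp_piece() in this way, while code that handles whole
segments would use the multipage entry directly.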

These patches can be found in the following git tree:

	https://github.com/ming1/linux/tree/mp-bvec-0.3-v4.9

Thanks to Christoph for looking at the early version and providing
very good suggestions, such as introducing bio_init_with_vec_table(),
removing other unnecessary helpers for cleanup, and so on.

TODO:
	- cleanup direct access to bvec table for MD & btrfs


[1], http://marc.info/?l=linux-kernel&m=141680246629547&w=2
[2], http://lwn.net/Articles/700781/
[3], http://marc.info/?t=147735447100001&r=1&w=2
[4], http://marc.info/?l=linux-mm&m=147745525801433&w=2


Ming Lei (60):
  block: bio: introduce bio_init_with_vec_table()
  block drivers: convert to bio_init_with_vec_table()
  block: drbd: remove impossible failure handling
  block: floppy: use bio_add_page()
  target: avoid to access .bi_vcnt directly
  bcache: debug: avoid to access .bi_io_vec directly
  dm: crypt: use bio_add_page()
  dm: use bvec iterator helpers to implement .get_page and .next_page
  dm: dm.c: replace 'bio->bi_vcnt == 1' with !bio_multiple_segments
  fs: logfs: convert to bio_add_page() in sync_request()
  fs: logfs: use bio_add_page() in __bdev_writeseg()
  fs: logfs: use bio_add_page() in do_erase()
  fs: logfs: remove unnecesary check
  block: drbd: comment on direct access bvec table
  block: loop: comment on direct access to bvec table
  block: pktcdvd: comment on direct access to bvec table
  kernel/power/swap.c: comment on direct access to bvec table
  mm: page_io.c: comment on direct access to bvec table
  fs/buffer: comment on direct access to bvec table
  f2fs: f2fs_read_end_io: comment on direct access to bvec table
  bcache: comment on direct access to bvec table
  block: comment on bio_alloc_pages()
  block: introduce flag QUEUE_FLAG_NO_MP
  md: set NO_MP for request queue of md
  block: pktcdvd: set NO_MP for pktcdvd request queue
  btrfs: set NO_MP for request queues behind BTRFS
  block: introduce BIO_SP_MAX_SECTORS
  block: introduce QUEUE_FLAG_SPLIT_MP
  dm: limit the max bio size as BIO_SP_MAX_SECTORS << SECTOR_SHIFT
  bcache: set flag of QUEUE_FLAG_SPLIT_MP
  block: introduce multipage/single page bvec helpers
  block: implement sp version of bvec iterator helpers
  block: introduce bio_for_each_segment_mp()
  block: introduce bio_clone_sp()
  bvec_iter: introduce BVEC_ITER_ALL_INIT
  block: bounce: avoid direct access to bvec from bio->bi_io_vec
  block: bounce: don't access bio->bi_io_vec in copy_to_high_bio_irq
  block: bounce: convert multipage bvecs into singlepage
  bcache: debug: switch to bio_clone_sp()
  blk-merge: compute bio->bi_seg_front_size efficiently
  block: blk-merge: try to make front segments in full size
  block: use bio_for_each_segment_mp() to compute segments count
  block: use bio_for_each_segment_mp() to map sg
  block: introduce bvec_for_each_sp_bvec()
  block: bio: introduce bio_for_each_segment_all_rd() and its write pair
  block: deal with dirtying pages for multipage bvec
  block: convert to bio_for_each_segment_all_rd()
  fs/mpage: convert to bio_for_each_segment_all_rd()
  fs/direct-io: convert to bio_for_each_segment_all_rd()
  ext4: convert to bio_for_each_segment_all_rd()
  xfs: convert to bio_for_each_segment_all_rd()
  logfs: convert to bio_for_each_segment_all_rd()
  gfs2: convert to bio_for_each_segment_all_rd()
  f2fs: convert to bio_for_each_segment_all_rd()
  exofs: convert to bio_for_each_segment_all_rd()
  fs: crypto: convert to bio_for_each_segment_all_rd()
  bcache: convert to bio_for_each_segment_all_rd()
  dm-crypt: convert to bio_for_each_segment_all_rd()
  fs/buffer.c: use bvec iterator to truncate the bio
  block: enable multipage bvecs

 block/bio.c                        | 104 ++++++++++++++----
 block/blk-merge.c                  | 216 +++++++++++++++++++++++++++++--------
 block/bounce.c                     |  80 ++++++++++----
 drivers/block/drbd/drbd_bitmap.c   |   1 +
 drivers/block/drbd/drbd_receiver.c |  14 +--
 drivers/block/floppy.c             |  10 +-
 drivers/block/loop.c               |   5 +
 drivers/block/pktcdvd.c            |   8 ++
 drivers/md/bcache/btree.c          |   4 +-
 drivers/md/bcache/debug.c          |  19 +++-
 drivers/md/bcache/io.c             |   4 +-
 drivers/md/bcache/journal.c        |   4 +-
 drivers/md/bcache/movinggc.c       |   7 +-
 drivers/md/bcache/super.c          |  25 +++--
 drivers/md/bcache/util.c           |   7 ++
 drivers/md/bcache/writeback.c      |   6 +-
 drivers/md/dm-bufio.c              |   4 +-
 drivers/md/dm-crypt.c              |  11 +-
 drivers/md/dm-io.c                 |  34 ++++--
 drivers/md/dm-rq.c                 |   3 +-
 drivers/md/dm.c                    |  11 +-
 drivers/md/md.c                    |  12 +++
 drivers/md/raid5.c                 |   9 +-
 drivers/nvme/target/io-cmd.c       |   4 +-
 drivers/target/target_core_pscsi.c |   8 +-
 fs/btrfs/volumes.c                 |   3 +
 fs/buffer.c                        |  24 +++--
 fs/crypto/crypto.c                 |   3 +-
 fs/direct-io.c                     |   4 +-
 fs/exofs/ore.c                     |   3 +-
 fs/exofs/ore_raid.c                |   3 +-
 fs/ext4/page-io.c                  |   3 +-
 fs/ext4/readpage.c                 |   3 +-
 fs/f2fs/data.c                     |  13 ++-
 fs/gfs2/lops.c                     |   3 +-
 fs/gfs2/meta_io.c                  |   3 +-
 fs/logfs/dev_bdev.c                | 110 +++++++------------
 fs/mpage.c                         |   3 +-
 fs/xfs/xfs_aops.c                  |   3 +-
 include/linux/bio.h                | 108 +++++++++++++++++--
 include/linux/blk_types.h          |   6 ++
 include/linux/blkdev.h             |   4 +
 include/linux/bvec.h               | 123 +++++++++++++++++++--
 kernel/power/swap.c                |   2 +
 mm/page_io.c                       |   1 +
 45 files changed, 759 insertions(+), 276 deletions(-)

-- 
2.7.4


