linux-kernel.vger.kernel.org archive mirror
* [PATCH 00/17] md: cleanup on direct access to bvec table
@ 2017-02-16 11:45 Ming Lei
  2017-02-16 11:45 ` [PATCH 01/17] block: introduce bio_segments_all() Ming Lei
                   ` (17 more replies)
  0 siblings, 18 replies; 31+ messages in thread
From: Ming Lei @ 2017-02-16 11:45 UTC (permalink / raw)
  To: Shaohua Li, Jens Axboe, linux-kernel, linux-raid, linux-block,
	Christoph Hellwig, NeilBrown
  Cc: Ming Lei

In MD's resync I/O path there are many direct accesses to the bio's
bvec table. This patchset kills most of them, and the conversion is
quite straightforward.

Once the direct accesses to the bvec table in MD are cleaned up, we
can move ahead with multipage bvec support.
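
For reference, the typical conversion looks like the sketch below. It is
an illustrative example only (not copied from any one patch); it assumes
a non-cloned resync bio whose bvec table is fully populated, and uses the
md resync constant RESYNC_PAGES:

	struct bio_vec *bvl;
	struct page *pages[RESYNC_PAGES];
	int i;

	/* old style was: pages[idx] = bio->bi_io_vec[idx].bv_page */
	bio_for_each_segment_all(bvl, bio, i)
		pages[i] = bvl->bv_page;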

Thanks,
Ming

Ming Lei (17):
  block: introduce bio_segments_all()
  block: introduce bio_remove_last_page()
  md: raid1/raid10: use bio_remove_last_page()
  md: introduce helpers for dealing with fetch/store preallocated pages
    in bio
  md: raid1/raid10: use the introduced helpers
  md: raid1/raid10: borrow .bi_error as pre-allocated page index
  md: raid1/raid10: don't use .bi_vcnt to check if all pages are added
  md: raid1: simplify r1buf_pool_free()
  md: raid1/raid10: use bio helper in *_pool_free
  md: raid1: remove direct access to bvec table in fix_sync_read_error
  md: raid1: use bio helper in process_checks()
  md: raid1: avoid direct access to bvec table in process_checks()
  md: raid1: use bio_segments_all()
  md: raid10: avoid direct access to bvec table in sync_request_write()
  md: raid10: avoid direct access to bvec table in
    fix_recovery_read_error
  md: raid10: avoid direct access to bvec table in reshape_request
  md: raid10: avoid direct access to bvec table in
    handle_reshape_read_error

 block/bio.c         | 23 ++++++++++++++
 drivers/md/md.h     | 21 +++++++++++++
 drivers/md/raid1.c  | 87 +++++++++++++++++++++++++++++++++++++----------------
 drivers/md/raid10.c | 76 ++++++++++++++++++++++++++++++++++------------
 include/linux/bio.h |  8 +++++
 5 files changed, 169 insertions(+), 46 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 01/17] block: introduce bio_segments_all()
  2017-02-16 11:45 [PATCH 00/17] md: cleanup on direct access to bvec table Ming Lei
@ 2017-02-16 11:45 ` Ming Lei
  2017-02-16 11:45 ` [PATCH 02/17] block: introduce bio_remove_last_page() Ming Lei
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 31+ messages in thread
From: Ming Lei @ 2017-02-16 11:45 UTC (permalink / raw)
  To: Shaohua Li, Jens Axboe, linux-kernel, linux-raid, linux-block,
	Christoph Hellwig, NeilBrown
  Cc: Ming Lei

Introduce this helper so that we can replace direct accesses to .bi_vcnt.
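
For example (an illustrative sketch, not part of this patch; nr_pages is
a hypothetical local), a caller that previously read .bi_vcnt directly
becomes:

	unsigned nr_pages;

	/* old: nr_pages = bio->bi_vcnt; */
	nr_pages = bio_segments_all(bio);	/* also warns on cloned bios */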

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 include/linux/bio.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/include/linux/bio.h b/include/linux/bio.h
index 8e521194f6fc..3364b3ed90e7 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -293,6 +293,13 @@ static inline void bio_get_last_bvec(struct bio *bio, struct bio_vec *bv)
 		bv->bv_len = iter.bi_bvec_done;
 }
 
+static inline unsigned bio_segments_all(struct bio *bio)
+{
+	WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED));
+
+	return bio->bi_vcnt;
+}
+
 enum bip_flags {
 	BIP_BLOCK_INTEGRITY	= 1 << 0, /* block layer owns integrity data */
 	BIP_MAPPED_INTEGRITY	= 1 << 1, /* ref tag has been remapped */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 02/17] block: introduce bio_remove_last_page()
  2017-02-16 11:45 [PATCH 00/17] md: cleanup on direct access to bvec table Ming Lei
  2017-02-16 11:45 ` [PATCH 01/17] block: introduce bio_segments_all() Ming Lei
@ 2017-02-16 11:45 ` Ming Lei
  2017-02-16 12:08   ` Johannes Thumshirn
  2017-02-16 11:45 ` [PATCH 03/17] md: raid1/raid10: use bio_remove_last_page() Ming Lei
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 31+ messages in thread
From: Ming Lei @ 2017-02-16 11:45 UTC (permalink / raw)
  To: Shaohua Li, Jens Axboe, linux-kernel, linux-raid, linux-block,
	Christoph Hellwig, NeilBrown
  Cc: Ming Lei

MD needs this helper to remove the last added page, so introduce it.
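
A sketch of the intended caller pattern, mirroring what patch 3 does in
the raid10 resync code ('biolist', 'bio', 'page' and 'len' come from that
context): when bio_add_page() fails on one bio of a set, the page already
added to its peers is rolled back so every bio stays the same length.

	if (bio_add_page(bio, page, len, 0) == 0) {
		struct bio *bio2;

		for (bio2 = biolist; bio2 && bio2 != bio; bio2 = bio2->bi_next) {
			/* remove the page that was already added to this peer */
			bio_remove_last_page(bio2);
			bio_clear_flag(bio2, BIO_SEG_VALID);
		}
	}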

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 block/bio.c         | 23 +++++++++++++++++++++++
 include/linux/bio.h |  1 +
 2 files changed, 24 insertions(+)

diff --git a/block/bio.c b/block/bio.c
index 5eec5e08417f..0ce7ffcd7939 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -837,6 +837,29 @@ int bio_add_pc_page(struct request_queue *q, struct bio *bio, struct page
 EXPORT_SYMBOL(bio_add_pc_page);
 
 /**
+ *	bio_remove_last_page	-	remove the last added page
+ *	@bio: destination bio
+ *
+ *	Attempt to remove the last added page from the bio_vec maplist.
+ */
+void bio_remove_last_page(struct bio *bio)
+{
+	/*
+	 * cloned bio must not modify vec list
+	 */
+	if (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED)))
+		return;
+
+	if (bio->bi_vcnt > 0) {
+		struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1];
+
+		bio->bi_iter.bi_size -= bv->bv_len;
+		bio->bi_vcnt--;
+	}
+}
+EXPORT_SYMBOL(bio_remove_last_page);
+
+/**
  *	bio_add_page	-	attempt to add page to bio
  *	@bio: destination bio
  *	@page: page to add
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 3364b3ed90e7..32aeb493d1fe 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -443,6 +443,7 @@ extern void bio_init(struct bio *bio, struct bio_vec *table,
 extern void bio_reset(struct bio *);
 void bio_chain(struct bio *, struct bio *);
 
+extern void bio_remove_last_page(struct bio *bio);
 extern int bio_add_page(struct bio *, struct page *, unsigned int,unsigned int);
 extern int bio_add_pc_page(struct request_queue *, struct bio *, struct page *,
 			   unsigned int, unsigned int);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 03/17] md: raid1/raid10: use bio_remove_last_page()
  2017-02-16 11:45 [PATCH 00/17] md: cleanup on direct access to bvec table Ming Lei
  2017-02-16 11:45 ` [PATCH 01/17] block: introduce bio_segments_all() Ming Lei
  2017-02-16 11:45 ` [PATCH 02/17] block: introduce bio_remove_last_page() Ming Lei
@ 2017-02-16 11:45 ` Ming Lei
  2017-02-16 11:45 ` [PATCH 04/17] md: introduce helpers for dealing with fetch/store preallocated pages in bio Ming Lei
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 31+ messages in thread
From: Ming Lei @ 2017-02-16 11:45 UTC (permalink / raw)
  To: Shaohua Li, Jens Axboe, linux-kernel, linux-raid, linux-block,
	Christoph Hellwig, NeilBrown
  Cc: Ming Lei

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/md/raid1.c  | 3 +--
 drivers/md/raid10.c | 6 ++----
 2 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 85f309836fd7..6e4e0b868ff2 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2824,8 +2824,7 @@ static sector_t raid1_sync_request(struct mddev *mddev, sector_t sector_nr,
 						if (bio->bi_end_io==NULL)
 							continue;
 						/* remove last page from this bio */
-						bio->bi_vcnt--;
-						bio->bi_iter.bi_size -= len;
+						bio_remove_last_page(bio);
 						bio_clear_flag(bio, BIO_SEG_VALID);
 					}
 					goto bio_full;
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 063c43d83b72..aa37d4c7900a 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -3447,8 +3447,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 			     bio2 && bio2 != bio;
 			     bio2 = bio2->bi_next) {
 				/* remove last page from this bio */
-				bio2->bi_vcnt--;
-				bio2->bi_iter.bi_size -= len;
+				bio_remove_last_page(bio2);
 				bio_clear_flag(bio2, BIO_SEG_VALID);
 			}
 			goto bio_full;
@@ -4538,8 +4537,7 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr,
 			     bio2 && bio2 != bio;
 			     bio2 = bio2->bi_next) {
 				/* Remove last page from this bio */
-				bio2->bi_vcnt--;
-				bio2->bi_iter.bi_size -= len;
+				bio_remove_last_page(bio2);
 				bio_clear_flag(bio2, BIO_SEG_VALID);
 			}
 			goto bio_full;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 04/17] md: introduce helpers for dealing with fetch/store preallocated pages in bio
  2017-02-16 11:45 [PATCH 00/17] md: cleanup on direct access to bvec table Ming Lei
                   ` (2 preceding siblings ...)
  2017-02-16 11:45 ` [PATCH 03/17] md: raid1/raid10: use bio_remove_last_page() Ming Lei
@ 2017-02-16 11:45 ` Ming Lei
  2017-02-16 11:45 ` [PATCH 05/17] md: raid1/raid10: use the introduced helpers Ming Lei
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 31+ messages in thread
From: Ming Lei @ 2017-02-16 11:45 UTC (permalink / raw)
  To: Shaohua Li, Jens Axboe, linux-kernel, linux-raid, linux-block,
	Christoph Hellwig, NeilBrown
  Cc: Ming Lei

Both raid1 and raid10 use the bio's bvec table to store pre-allocated
pages, then fetch them and add them to the bio.

This patch introduces two helpers for dealing with this special case,
similar to what bio_iov_iter_get_pages() does.
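
A usage sketch (essentially what patch 5 converts raid1/raid10 to): the
slot just past the filled vectors holds the next pre-allocated page,
which is fetched, added, and stored back if adding fails.

	page = mdev_get_page_from_bio(bio, bio->bi_vcnt);
	if (bio_add_page(bio, page, len, 0) == 0) {
		/* could not add the page: store it back into its slot */
		mdev_put_page_to_bio(bio, bio->bi_vcnt, page);
	}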

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/md/md.h | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/drivers/md/md.h b/drivers/md/md.h
index a86ad62079de..21897cb514af 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -708,4 +708,25 @@ static inline void mddev_clear_unsupported_flags(struct mddev *mddev,
 {
 	mddev->flags &= ~unsupported_flags;
 }
+
+/*
+ * Both raid1 and raid10 use the bio's bvec table to store the preallocated
+ * pages, so we introduce the following helpers for this kind of usage.
+ *
+ * The usage is a bit similar to bio_iov_iter_get_pages().
+ *
+ * Please make sure .bi_io_vec[idx] points to a vector that is not yet
+ * used by the bio.
+ */
+static inline struct page *mdev_get_page_from_bio(struct bio *bio, unsigned idx)
+{
+	return bio->bi_io_vec[idx].bv_page;
+}
+
+static inline void mdev_put_page_to_bio(struct bio *bio, unsigned idx,
+					struct page *page)
+{
+	bio->bi_io_vec[idx].bv_page = page;
+}
+
 #endif /* _MD_MD_H */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 05/17] md: raid1/raid10: use the introduced helpers
  2017-02-16 11:45 [PATCH 00/17] md: cleanup on direct access to bvec table Ming Lei
                   ` (3 preceding siblings ...)
  2017-02-16 11:45 ` [PATCH 04/17] md: introduce helpers for dealing with fetch/store preallocated pages in bio Ming Lei
@ 2017-02-16 11:45 ` Ming Lei
  2017-02-16 11:45 ` [PATCH 06/17] md: raid1/raid10: borrow .bi_error as pre-allocated page index Ming Lei
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 31+ messages in thread
From: Ming Lei @ 2017-02-16 11:45 UTC (permalink / raw)
  To: Shaohua Li, Jens Axboe, linux-kernel, linux-raid, linux-block,
	Christoph Hellwig, NeilBrown
  Cc: Ming Lei

This patch uses the introduced helpers to fetch the pre-allocated
pages from the bio's bvec table and store them back.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/md/raid1.c  | 4 ++--
 drivers/md/raid10.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 6e4e0b868ff2..c4791fbd69ac 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2814,10 +2814,10 @@ static sector_t raid1_sync_request(struct mddev *mddev, sector_t sector_nr,
 		for (i = 0 ; i < conf->raid_disks * 2; i++) {
 			bio = r1_bio->bios[i];
 			if (bio->bi_end_io) {
-				page = bio->bi_io_vec[bio->bi_vcnt].bv_page;
+				page = mdev_get_page_from_bio(bio, bio->bi_vcnt);
 				if (bio_add_page(bio, page, len, 0) == 0) {
 					/* stop here */
-					bio->bi_io_vec[bio->bi_vcnt].bv_page = page;
+					mdev_put_page_to_bio(bio, bio->bi_vcnt, page);
 					while (i > 0) {
 						i--;
 						bio = r1_bio->bios[i];
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index aa37d4c7900a..b7dfbca869a3 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -3437,12 +3437,12 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 			break;
 		for (bio= biolist ; bio ; bio=bio->bi_next) {
 			struct bio *bio2;
-			page = bio->bi_io_vec[bio->bi_vcnt].bv_page;
+			page = mdev_get_page_from_bio(bio, bio->bi_vcnt);
 			if (bio_add_page(bio, page, len, 0))
 				continue;
 
 			/* stop here */
-			bio->bi_io_vec[bio->bi_vcnt].bv_page = page;
+			mdev_put_page_to_bio(bio, bio->bi_vcnt, page);
 			for (bio2 = biolist;
 			     bio2 && bio2 != bio;
 			     bio2 = bio2->bi_next) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 06/17] md: raid1/raid10: borrow .bi_error as pre-allocated page index
  2017-02-16 11:45 [PATCH 00/17] md: cleanup on direct access to bvec table Ming Lei
                   ` (4 preceding siblings ...)
  2017-02-16 11:45 ` [PATCH 05/17] md: raid1/raid10: use the introduced helpers Ming Lei
@ 2017-02-16 11:45 ` Ming Lei
  2017-02-16 11:45 ` [PATCH 07/17] md: raid1/raid10: don't use .bi_vcnt to check if all pages are added Ming Lei
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 31+ messages in thread
From: Ming Lei @ 2017-02-16 11:45 UTC (permalink / raw)
  To: Shaohua Li, Jens Axboe, linux-kernel, linux-raid, linux-block,
	Christoph Hellwig, NeilBrown
  Cc: Ming Lei

Before a bio is submitted it is safe to borrow .bi_error. This patch
uses .bi_error as the index of the pre-allocated page in the bio, so
that we can avoid messing with .bi_vcnt. In particular, the old way
will no longer work once multipage bvec is introduced.
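
A condensed sketch of the borrow/restore pattern used below: while the
resync bio is being filled, .bi_error counts how many pre-allocated
pages have been added, and it is cleared again before the bio is
submitted.

	/* borrow .bi_error as the index of the next preallocated page */
	page = mdev_get_page_from_bio(bio, bio->bi_error++);
	if (bio_add_page(bio, page, len, 0) == 0)
		mdev_put_page_to_bio(bio, --bio->bi_error, page);

	/* once all pages are added, restore .bi_error's normal meaning */
	bio->bi_error = 0;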

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/md/raid1.c  | 12 ++++++++++--
 drivers/md/raid10.c | 14 ++++++++++----
 2 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index c4791fbd69ac..8904a9149671 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2811,13 +2811,14 @@ static sector_t raid1_sync_request(struct mddev *mddev, sector_t sector_nr,
 				len = sync_blocks<<9;
 		}
 
+		/* borrow .bi_error as pre-allocated page index */
 		for (i = 0 ; i < conf->raid_disks * 2; i++) {
 			bio = r1_bio->bios[i];
 			if (bio->bi_end_io) {
-				page = mdev_get_page_from_bio(bio, bio->bi_vcnt);
+				page = mdev_get_page_from_bio(bio, bio->bi_error++);
 				if (bio_add_page(bio, page, len, 0) == 0) {
 					/* stop here */
-					mdev_put_page_to_bio(bio, bio->bi_vcnt, page);
+					mdev_put_page_to_bio(bio, --bio->bi_error, page);
 					while (i > 0) {
 						i--;
 						bio = r1_bio->bios[i];
@@ -2836,6 +2837,13 @@ static sector_t raid1_sync_request(struct mddev *mddev, sector_t sector_nr,
 		sync_blocks -= (len>>9);
 	} while (r1_bio->bios[disk]->bi_vcnt < RESYNC_PAGES);
  bio_full:
+	/* return .bi_error back to bio */
+	for (i = 0 ; i < conf->raid_disks * 2; i++) {
+		bio = r1_bio->bios[i];
+		if (bio->bi_end_io)
+			bio->bi_error = 0;
+	}
+
 	r1_bio->sectors = nr_sectors;
 
 	if (mddev_is_clustered(mddev) &&
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index b7dfbca869a3..9cfc22cd1330 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -3348,7 +3348,6 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 
 			bio = r10_bio->devs[i].bio;
 			bio_reset(bio);
-			bio->bi_error = -EIO;
 			rcu_read_lock();
 			rdev = rcu_dereference(conf->mirrors[d].rdev);
 			if (rdev == NULL || test_bit(Faulty, &rdev->flags)) {
@@ -3392,7 +3391,6 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 			/* Need to set up for writing to the replacement */
 			bio = r10_bio->devs[i].repl_bio;
 			bio_reset(bio);
-			bio->bi_error = -EIO;
 
 			sector = r10_bio->devs[i].addr;
 			bio->bi_next = biolist;
@@ -3435,14 +3433,15 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 			len = (max_sector - sector_nr) << 9;
 		if (len == 0)
 			break;
+		/* borrow .bi_error as pre-allocated page index */
 		for (bio= biolist ; bio ; bio=bio->bi_next) {
 			struct bio *bio2;
-			page = mdev_get_page_from_bio(bio, bio->bi_vcnt);
+			page = mdev_get_page_from_bio(bio, bio->bi_error++);
 			if (bio_add_page(bio, page, len, 0))
 				continue;
 
 			/* stop here */
-			mdev_put_page_to_bio(bio, bio->bi_vcnt, page);
+			mdev_put_page_to_bio(bio, --bio->bi_error, page);
 			for (bio2 = biolist;
 			     bio2 && bio2 != bio;
 			     bio2 = bio2->bi_next) {
@@ -3456,6 +3455,13 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 		sector_nr += len>>9;
 	} while (biolist->bi_vcnt < RESYNC_PAGES);
  bio_full:
+	/* return .bi_error back to bio, and set resync's as -EIO */
+	for (bio= biolist ; bio ; bio=bio->bi_next)
+		if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery))
+			bio->bi_error = -EIO;
+		else
+			bio->bi_error = 0;
+
 	r10_bio->sectors = nr_sectors;
 
 	while (biolist) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 07/17] md: raid1/raid10: don't use .bi_vcnt to check if all pages are added
  2017-02-16 11:45 [PATCH 00/17] md: cleanup on direct access to bvec table Ming Lei
                   ` (5 preceding siblings ...)
  2017-02-16 11:45 ` [PATCH 06/17] md: raid1/raid10: borrow .bi_error as pre-allocated page index Ming Lei
@ 2017-02-16 11:45 ` Ming Lei
  2017-02-16 11:45 ` [PATCH 08/17] md: raid1: simplify r1buf_pool_free() Ming Lei
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 31+ messages in thread
From: Ming Lei @ 2017-02-16 11:45 UTC (permalink / raw)
  To: Shaohua Li, Jens Axboe, linux-kernel, linux-raid, linux-block,
	Christoph Hellwig, NeilBrown
  Cc: Ming Lei

Instead, use the index into the pre-allocated page buffer; this is more
explicit, because the index (.bi_error) simply records how many pages
have been added to the bio.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/md/raid1.c  | 2 +-
 drivers/md/raid10.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 8904a9149671..23a3a678a9ed 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2835,7 +2835,7 @@ static sector_t raid1_sync_request(struct mddev *mddev, sector_t sector_nr,
 		nr_sectors += len>>9;
 		sector_nr += len>>9;
 		sync_blocks -= (len>>9);
-	} while (r1_bio->bios[disk]->bi_vcnt < RESYNC_PAGES);
+	} while (r1_bio->bios[disk]->bi_error < RESYNC_PAGES);
  bio_full:
 	/* return .bi_error back to bio */
 	for (i = 0 ; i < conf->raid_disks * 2; i++) {
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 9cfc22cd1330..8262f3e1dd93 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -3453,7 +3453,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 		}
 		nr_sectors += len>>9;
 		sector_nr += len>>9;
-	} while (biolist->bi_vcnt < RESYNC_PAGES);
+	} while (biolist->bi_error < RESYNC_PAGES);
  bio_full:
 	/* return .bi_error back to bio, and set resync's as -EIO */
 	for (bio= biolist ; bio ; bio=bio->bi_next)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 08/17] md: raid1: simplify r1buf_pool_free()
  2017-02-16 11:45 [PATCH 00/17] md: cleanup on direct access to bvec table Ming Lei
                   ` (6 preceding siblings ...)
  2017-02-16 11:45 ` [PATCH 07/17] md: raid1/raid10: don't use .bi_vcnt to check if all pages are added Ming Lei
@ 2017-02-16 11:45 ` Ming Lei
  2017-02-16 11:45 ` [PATCH 09/17] md: raid1/raid10: use bio helper in *_pool_free Ming Lei
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 31+ messages in thread
From: Ming Lei @ 2017-02-16 11:45 UTC (permalink / raw)
  To: Shaohua Li, Jens Axboe, linux-kernel, linux-raid, linux-block,
	Christoph Hellwig, NeilBrown
  Cc: Ming Lei

This patch takes a reference on each page of each resync bio, which
allows r1buf_pool_free() to be simplified.

The same policy is already used in raid10's buffer pool
allocation/free.
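
The idea, as a condensed sketch of the hunks below: every bio in the
r1bio takes its own reference on the page it points at, so the free path
can drop one reference per slot unconditionally instead of comparing
page pointers against bios[0].

	/* alloc: share bios[0]'s page, but take a reference for this user */
	page = r1_bio->bios[0]->bi_io_vec[i].bv_page;
	get_page(page);
	r1_bio->bios[j]->bi_io_vec[i].bv_page = page;

	/* free: each slot owns a reference, so just put it */
	safe_put_page(r1bio->bios[j]->bi_io_vec[i].bv_page);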

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/md/raid1.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 23a3a678a9ed..ebea8615344a 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -143,9 +143,12 @@ static void * r1buf_pool_alloc(gfp_t gfp_flags, void *data)
 	/* If not user-requests, copy the page pointers to all bios */
 	if (!test_bit(MD_RECOVERY_REQUESTED, &pi->mddev->recovery)) {
 		for (i=0; i<RESYNC_PAGES ; i++)
-			for (j=1; j<pi->raid_disks; j++)
-				r1_bio->bios[j]->bi_io_vec[i].bv_page =
+			for (j=1; j<pi->raid_disks; j++) {
+				struct page *page =
 					r1_bio->bios[0]->bi_io_vec[i].bv_page;
+				get_page(page);
+				r1_bio->bios[j]->bi_io_vec[i].bv_page = page;
+			}
 	}
 
 	r1_bio->master_bio = NULL;
@@ -170,12 +173,8 @@ static void r1buf_pool_free(void *__r1_bio, void *data)
 	struct r1bio *r1bio = __r1_bio;
 
 	for (i = 0; i < RESYNC_PAGES; i++)
-		for (j = pi->raid_disks; j-- ;) {
-			if (j == 0 ||
-			    r1bio->bios[j]->bi_io_vec[i].bv_page !=
-			    r1bio->bios[0]->bi_io_vec[i].bv_page)
-				safe_put_page(r1bio->bios[j]->bi_io_vec[i].bv_page);
-		}
+		for (j = pi->raid_disks; j-- ;)
+			safe_put_page(r1bio->bios[j]->bi_io_vec[i].bv_page);
 	for (i=0 ; i < pi->raid_disks; i++)
 		bio_put(r1bio->bios[i]);
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 09/17] md: raid1/raid10: use bio helper in *_pool_free
  2017-02-16 11:45 [PATCH 00/17] md: cleanup on direct access to bvec table Ming Lei
                   ` (7 preceding siblings ...)
  2017-02-16 11:45 ` [PATCH 08/17] md: raid1: simplify r1buf_pool_free() Ming Lei
@ 2017-02-16 11:45 ` Ming Lei
  2017-02-16 11:45 ` [PATCH 10/17] md: raid1: remove direct access to bvec table in fix_sync_read_error Ming Lei
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 31+ messages in thread
From: Ming Lei @ 2017-02-16 11:45 UTC (permalink / raw)
  To: Shaohua Li, Jens Axboe, linux-kernel, linux-raid, linux-block,
	Christoph Hellwig, NeilBrown
  Cc: Ming Lei

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/md/raid1.c  | 13 ++++++++++---
 drivers/md/raid10.c | 11 +++++++----
 2 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index ebea8615344a..1dd6b2760fba 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -172,9 +172,16 @@ static void r1buf_pool_free(void *__r1_bio, void *data)
 	int i,j;
 	struct r1bio *r1bio = __r1_bio;
 
-	for (i = 0; i < RESYNC_PAGES; i++)
-		for (j = pi->raid_disks; j-- ;)
-			safe_put_page(r1bio->bios[j]->bi_io_vec[i].bv_page);
+	for (i = 0; i < pi->raid_disks; i++) {
+		struct bio_vec *bvl;
+		struct bio *bio = r1bio->bios[i];
+
+		/* make sure all pages can be freed */
+		bio->bi_vcnt = RESYNC_PAGES;
+
+		bio_for_each_segment_all(bvl, bio, j)
+			safe_put_page(bvl->bv_page);
+	}
 	for (i=0 ; i < pi->raid_disks; i++)
 		bio_put(r1bio->bios[i]);
 
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 8262f3e1dd93..5c698f3d3083 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -230,10 +230,13 @@ static void r10buf_pool_free(void *__r10_bio, void *data)
 	for (j=0; j < conf->copies; j++) {
 		struct bio *bio = r10bio->devs[j].bio;
 		if (bio) {
-			for (i = 0; i < RESYNC_PAGES; i++) {
-				safe_put_page(bio->bi_io_vec[i].bv_page);
-				bio->bi_io_vec[i].bv_page = NULL;
-			}
+			struct bio_vec *bvl;
+
+			/* make sure all pages can be freed */
+			bio->bi_vcnt = RESYNC_PAGES;
+
+			bio_for_each_segment_all(bvl, bio, i)
+				safe_put_page(bvl->bv_page);
 			bio_put(bio);
 		}
 		bio = r10bio->devs[j].repl_bio;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 10/17] md: raid1: remove direct access to bvec table in fix_sync_read_error
  2017-02-16 11:45 [PATCH 00/17] md: cleanup on direct access to bvec table Ming Lei
                   ` (8 preceding siblings ...)
  2017-02-16 11:45 ` [PATCH 09/17] md: raid1/raid10: use bio helper in *_pool_free Ming Lei
@ 2017-02-16 11:45 ` Ming Lei
  2017-02-16 11:45 ` [PATCH 11/17] md: raid1: use bio helper in process_checks() Ming Lei
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 31+ messages in thread
From: Ming Lei @ 2017-02-16 11:45 UTC (permalink / raw)
  To: Shaohua Li, Jens Axboe, linux-kernel, linux-raid, linux-block,
	Christoph Hellwig, NeilBrown
  Cc: Ming Lei

This patch uses a stack variable to hold the pages of the bio, so that
direct access to the bvec table in fix_sync_read_error() can be removed.

This 16*8-byte (128-byte) stack variable is fine for kernel thread context.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/md/raid1.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 1dd6b2760fba..02ee8542295d 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1874,6 +1874,16 @@ static int fix_sync_read_error(struct r1bio *r1_bio)
 	int sectors = r1_bio->sectors;
 	int idx = 0;
 	struct md_rdev *rdev;
+	struct bio_vec *bvl;
+	int i;
+	struct page *pages[RESYNC_PAGES];
+
+	/*
+	 * bio for read_disk is filled up, so we can use
+	 * bio_for_each_segment_all() to retrieve all pages.
+	 */
+	bio_for_each_segment_all(bvl, bio, i)
+		pages[i] = bvl->bv_page;
 
 	rdev = conf->mirrors[r1_bio->read_disk].rdev;
 	if (test_bit(FailFast, &rdev->flags)) {
@@ -1903,7 +1913,7 @@ static int fix_sync_read_error(struct r1bio *r1_bio)
 				 */
 				rdev = conf->mirrors[d].rdev;
 				if (sync_page_io(rdev, sect, s<<9,
-						 bio->bi_io_vec[idx].bv_page,
+						 pages[idx],
 						 REQ_OP_READ, 0, false)) {
 					success = 1;
 					break;
@@ -1958,7 +1968,7 @@ static int fix_sync_read_error(struct r1bio *r1_bio)
 				continue;
 			rdev = conf->mirrors[d].rdev;
 			if (r1_sync_page_io(rdev, sect, s,
-					    bio->bi_io_vec[idx].bv_page,
+					    pages[idx],
 					    WRITE) == 0) {
 				r1_bio->bios[d]->bi_end_io = NULL;
 				rdev_dec_pending(rdev, mddev);
@@ -1973,7 +1983,7 @@ static int fix_sync_read_error(struct r1bio *r1_bio)
 				continue;
 			rdev = conf->mirrors[d].rdev;
 			if (r1_sync_page_io(rdev, sect, s,
-					    bio->bi_io_vec[idx].bv_page,
+					    pages[idx],
 					    READ) != 0)
 				atomic_add(s, &rdev->corrected_errors);
 		}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 11/17] md: raid1: use bio helper in process_checks()
  2017-02-16 11:45 [PATCH 00/17] md: cleanup on direct access to bvec table Ming Lei
                   ` (9 preceding siblings ...)
  2017-02-16 11:45 ` [PATCH 10/17] md: raid1: remove direct access to bvec table in fix_sync_read_error Ming Lei
@ 2017-02-16 11:45 ` Ming Lei
  2017-02-16 11:45 ` [PATCH 12/17] md: raid1: avoid direct access to bvec table " Ming Lei
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 31+ messages in thread
From: Ming Lei @ 2017-02-16 11:45 UTC (permalink / raw)
  To: Shaohua Li, Jens Axboe, linux-kernel, linux-raid, linux-block,
	Christoph Hellwig, NeilBrown
  Cc: Ming Lei

Avoid direct access to the bvec table.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/md/raid1.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 02ee8542295d..4400fbe7ce8c 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2017,6 +2017,7 @@ static void process_checks(struct r1bio *r1_bio)
 		int j;
 		int size;
 		int error;
+		struct bio_vec *bi;
 		struct bio *b = r1_bio->bios[i];
 		if (b->bi_end_io != end_sync_read)
 			continue;
@@ -2033,9 +2034,7 @@ static void process_checks(struct r1bio *r1_bio)
 		b->bi_private = r1_bio;
 
 		size = b->bi_iter.bi_size;
-		for (j = 0; j < vcnt ; j++) {
-			struct bio_vec *bi;
-			bi = &b->bi_io_vec[j];
+		bio_for_each_segment_all(bi, b, j) {
 			bi->bv_offset = 0;
 			if (size > PAGE_SIZE)
 				bi->bv_len = PAGE_SIZE;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 12/17] md: raid1: avoid direct access to bvec table in process_checks()
  2017-02-16 11:45 [PATCH 00/17] md: cleanup on direct access to bvec table Ming Lei
                   ` (10 preceding siblings ...)
  2017-02-16 11:45 ` [PATCH 11/17] md: raid1: use bio helper in process_checks() Ming Lei
@ 2017-02-16 11:45 ` Ming Lei
  2017-02-17  8:33   ` kbuild test robot
  2017-02-16 11:45 ` [PATCH 13/17] md: raid1: use bio_segments_all() Ming Lei
                   ` (5 subsequent siblings)
  17 siblings, 1 reply; 31+ messages in thread
From: Ming Lei @ 2017-02-16 11:45 UTC (permalink / raw)
  To: Shaohua Li, Jens Axboe, linux-kernel, linux-raid, linux-block,
	Christoph Hellwig, NeilBrown
  Cc: Ming Lei

Given that process_checks() is only called in the resync path, it should
be OK to allocate three stack variables (320 bytes in total) to store
the pages from the bios.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/md/raid1.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 4400fbe7ce8c..54ec32be3277 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2010,6 +2010,9 @@ static void process_checks(struct r1bio *r1_bio)
 	int primary;
 	int i;
 	int vcnt;
+	struct bio_vec *bi;
+	int page_len[RESYNC_PAGES];
+	struct page *pbio_pages[RESYNC_PAGES], *sbio_pages[RESYNC_PAGES];
 
 	/* Fix variable parts of all bios */
 	vcnt = (r1_bio->sectors + PAGE_SIZE / 512 - 1) >> (PAGE_SHIFT - 9);
@@ -2017,7 +2020,6 @@ static void process_checks(struct r1bio *r1_bio)
 		int j;
 		int size;
 		int error;
-		struct bio_vec *bi;
 		struct bio *b = r1_bio->bios[i];
 		if (b->bi_end_io != end_sync_read)
 			continue;
@@ -2051,6 +2053,11 @@ static void process_checks(struct r1bio *r1_bio)
 			break;
 		}
 	r1_bio->read_disk = primary;
+
+	/* .bi_vcnt has been set for all read bios */
+	bio_for_each_segment_all(bi, r1_bio->bios[primary], i)
+		pbio_pages[i] = bi->bv_page;
+
 	for (i = 0; i < conf->raid_disks * 2; i++) {
 		int j;
 		struct bio *pbio = r1_bio->bios[primary];
@@ -2062,14 +2069,19 @@ static void process_checks(struct r1bio *r1_bio)
 		/* Now we can 'fixup' the error value */
 		sbio->bi_error = 0;
 
+		bio_for_each_segment_all(bi, sbio, j) {
+			sbio_pages[j] = bi->bv_page;
+			page_len[j] = bi->bv_len;
+		}
+
 		if (!error) {
 			for (j = vcnt; j-- ; ) {
 				struct page *p, *s;
-				p = pbio->bi_io_vec[j].bv_page;
-				s = sbio->bi_io_vec[j].bv_page;
+				p = pbio_pages[j];
+				s = sbio_pages[j];
 				if (memcmp(page_address(p),
 					   page_address(s),
-					   sbio->bi_io_vec[j].bv_len))
+					   page_len[j]))
 					break;
 			}
 		} else
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 13/17] md: raid1: use bio_segments_all()
  2017-02-16 11:45 [PATCH 00/17] md: cleanup on direct access to bvec table Ming Lei
                   ` (11 preceding siblings ...)
  2017-02-16 11:45 ` [PATCH 12/17] md: raid1: avoid direct access to bvec table " Ming Lei
@ 2017-02-16 11:45 ` Ming Lei
  2017-02-16 12:35   ` Johannes Thumshirn
  2017-02-16 11:45 ` [PATCH 14/17] md: raid10: avoid direct access to bvec table in sync_request_write() Ming Lei
                   ` (4 subsequent siblings)
  17 siblings, 1 reply; 31+ messages in thread
From: Ming Lei @ 2017-02-16 11:45 UTC (permalink / raw)
  To: Shaohua Li, Jens Axboe, linux-kernel, linux-raid, linux-block,
	Christoph Hellwig, NeilBrown
  Cc: Ming Lei

Use this helper instead of accessing .bi_vcnt directly.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/md/raid1.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 54ec32be3277..02cfece74981 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -998,7 +998,8 @@ static void alloc_behind_pages(struct bio *bio, struct r1bio *r1_bio)
 {
 	int i;
 	struct bio_vec *bvec;
-	struct bio_vec *bvecs = kzalloc(bio->bi_vcnt * sizeof(struct bio_vec),
+	unsigned vcnt = bio_segments_all(bio);
+	struct bio_vec *bvecs = kzalloc(vcnt * sizeof(struct bio_vec),
 					GFP_NOIO);
 	if (unlikely(!bvecs))
 		return;
@@ -1014,12 +1015,12 @@ static void alloc_behind_pages(struct bio *bio, struct r1bio *r1_bio)
 		kunmap(bvec->bv_page);
 	}
 	r1_bio->behind_bvecs = bvecs;
-	r1_bio->behind_page_count = bio->bi_vcnt;
+	r1_bio->behind_page_count = vcnt;
 	set_bit(R1BIO_BehindIO, &r1_bio->state);
 	return;
 
 do_sync_io:
-	for (i = 0; i < bio->bi_vcnt; i++)
+	for (i = 0; i < vcnt; i++)
 		if (bvecs[i].bv_page)
 			put_page(bvecs[i].bv_page);
 	kfree(bvecs);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 14/17] md: raid10: avoid direct access to bvec table in sync_request_write()
  2017-02-16 11:45 [PATCH 00/17] md: cleanup on direct access to bvec table Ming Lei
                   ` (12 preceding siblings ...)
  2017-02-16 11:45 ` [PATCH 13/17] md: raid1: use bio_segments_all() Ming Lei
@ 2017-02-16 11:45 ` Ming Lei
  2017-02-16 11:45 ` [PATCH 15/17] md: raid10: avoid direct access to bvec table in fix_recovery_read_error Ming Lei
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 31+ messages in thread
From: Ming Lei @ 2017-02-16 11:45 UTC (permalink / raw)
  To: Shaohua Li, Jens Axboe, linux-kernel, linux-raid, linux-block,
	Christoph Hellwig, NeilBrown
  Cc: Ming Lei

The cost is 256 bytes (8*16*2) of stack space; just use the bio helper
to retrieve the pages from the bio.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/md/raid10.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 5c698f3d3083..69fe2a3cef89 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -2036,6 +2036,8 @@ static void sync_request_write(struct mddev *mddev, struct r10bio *r10_bio)
 	int i, first;
 	struct bio *tbio, *fbio;
 	int vcnt;
+	struct bio_vec *bvl;
+	struct page *fbio_pages[RESYNC_PAGES], *tbio_pages[RESYNC_PAGES];
 
 	atomic_set(&r10_bio->remaining, 1);
 
@@ -2052,6 +2054,10 @@ static void sync_request_write(struct mddev *mddev, struct r10bio *r10_bio)
 	fbio->bi_iter.bi_size = r10_bio->sectors << 9;
 	fbio->bi_iter.bi_idx = 0;
 
+	/* the bio has been filled up in raid10_sync_request */
+	bio_for_each_segment_all(bvl, fbio, i)
+		fbio_pages[i] = bvl->bv_page;
+
 	vcnt = (r10_bio->sectors + (PAGE_SIZE >> 9) - 1) >> (PAGE_SHIFT - 9);
 	/* now find blocks with errors */
 	for (i=0 ; i < conf->copies ; i++) {
@@ -2072,12 +2078,17 @@ static void sync_request_write(struct mddev *mddev, struct r10bio *r10_bio)
 			 * All vec entries are PAGE_SIZE;
 			 */
 			int sectors = r10_bio->sectors;
+
+			/* the bio has been filled up in raid10_sync_request */
+			bio_for_each_segment_all(bvl, tbio, j)
+				tbio_pages[j] = bvl->bv_page;
+
 			for (j = 0; j < vcnt; j++) {
 				int len = PAGE_SIZE;
 				if (sectors < (len / 512))
 					len = sectors * 512;
-				if (memcmp(page_address(fbio->bi_io_vec[j].bv_page),
-					   page_address(tbio->bi_io_vec[j].bv_page),
+				if (memcmp(page_address(fbio_pages[j]),
+					   page_address(tbio_pages[j]),
 					   len))
 					break;
 				sectors -= len/512;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 15/17] md: raid10: avoid direct access to bvec table in fix_recovery_read_error
  2017-02-16 11:45 [PATCH 00/17] md: cleanup on direct access to bvec table Ming Lei
                   ` (13 preceding siblings ...)
  2017-02-16 11:45 ` [PATCH 14/17] md: raid10: avoid direct access to bvec table in sync_request_write() Ming Lei
@ 2017-02-16 11:45 ` Ming Lei
  2017-02-16 11:45 ` [PATCH 16/17] md: raid10: avoid direct access to bvec table in reshape_request Ming Lei
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 31+ messages in thread
From: Ming Lei @ 2017-02-16 11:45 UTC (permalink / raw)
  To: Shaohua Li, Jens Axboe, linux-kernel, linux-raid, linux-block,
	Christoph Hellwig, NeilBrown
  Cc: Ming Lei

The cost is 128 bytes (8*16) of stack space in kernel thread context;
just use the bio helper to retrieve the pages from the bio.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/md/raid10.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 69fe2a3cef89..c7d2f73565d9 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -2184,6 +2184,11 @@ static void fix_recovery_read_error(struct r10bio *r10_bio)
 	int idx = 0;
 	int dr = r10_bio->devs[0].devnum;
 	int dw = r10_bio->devs[1].devnum;
+	struct bio_vec *bvl;
+	struct page *pages[RESYNC_PAGES];
+
+	bio_for_each_segment_all(bvl, bio, idx)
+		pages[idx] = bvl->bv_page;
 
 	while (sectors) {
 		int s = sectors;
@@ -2199,7 +2204,7 @@ static void fix_recovery_read_error(struct r10bio *r10_bio)
 		ok = sync_page_io(rdev,
 				  addr,
 				  s << 9,
-				  bio->bi_io_vec[idx].bv_page,
+				  pages[idx],
 				  REQ_OP_READ, 0, false);
 		if (ok) {
 			rdev = conf->mirrors[dw].rdev;
@@ -2207,7 +2212,7 @@ static void fix_recovery_read_error(struct r10bio *r10_bio)
 			ok = sync_page_io(rdev,
 					  addr,
 					  s << 9,
-					  bio->bi_io_vec[idx].bv_page,
+					  pages[idx],
 					  REQ_OP_WRITE, 0, false);
 			if (!ok) {
 				set_bit(WriteErrorSeen, &rdev->flags);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 16/17] md: raid10: avoid direct access to bvec table in reshape_request
  2017-02-16 11:45 [PATCH 00/17] md: cleanup on direct access to bvec table Ming Lei
                   ` (14 preceding siblings ...)
  2017-02-16 11:45 ` [PATCH 15/17] md: raid10: avoid direct access to bvec table in fix_recovery_read_error Ming Lei
@ 2017-02-16 11:45 ` Ming Lei
  2017-02-16 11:45 ` [PATCH 17/17] md: raid10: avoid direct access to bvec table in handle_reshape_read_error Ming Lei
  2017-02-16 22:16 ` [PATCH 00/17] md: cleanup on direct access to bvec table Shaohua Li
  17 siblings, 0 replies; 31+ messages in thread
From: Ming Lei @ 2017-02-16 11:45 UTC (permalink / raw)
  To: Shaohua Li, Jens Axboe, linux-kernel, linux-raid, linux-block,
	Christoph Hellwig, NeilBrown
  Cc: Ming Lei

The cost is 128 bytes (8*16) of stack space in kernel thread context;
just use the bio helper to retrieve the pages from the bio.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/md/raid10.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index c7d2f73565d9..5016ce8163c3 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -4383,6 +4383,8 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr,
 	struct bio *blist;
 	struct bio *bio, *read_bio;
 	int sectors_done = 0;
+	struct bio_vec *bvl;
+	struct page *pages[RESYNC_PAGES];
 
 	if (sector_nr == 0) {
 		/* If restarting in the middle, skip the initial sectors */
@@ -4546,9 +4548,12 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr,
 
 	/* Now add as many pages as possible to all of these bios. */
 
+	bio_for_each_segment_all(bvl, r10_bio->devs[0].bio, s)
+		pages[s] = bvl->bv_page;
+
 	nr_sectors = 0;
 	for (s = 0 ; s < max_sectors; s += PAGE_SIZE >> 9) {
-		struct page *page = r10_bio->devs[0].bio->bi_io_vec[s/(PAGE_SIZE>>9)].bv_page;
+		struct page *page = pages[s / (PAGE_SIZE >> 9)];
 		int len = (max_sectors - s) << 9;
 		if (len > PAGE_SIZE)
 			len = PAGE_SIZE;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 17/17] md: raid10: avoid direct access to bvec table in handle_reshape_read_error
  2017-02-16 11:45 [PATCH 00/17] md: cleanup on direct access to bvec table Ming Lei
                   ` (15 preceding siblings ...)
  2017-02-16 11:45 ` [PATCH 16/17] md: raid10: avoid direct access to bvec table in reshape_request Ming Lei
@ 2017-02-16 11:45 ` Ming Lei
  2017-02-16 22:16 ` [PATCH 00/17] md: cleanup on direct access to bvec table Shaohua Li
  17 siblings, 0 replies; 31+ messages in thread
From: Ming Lei @ 2017-02-16 11:45 UTC (permalink / raw)
  To: Shaohua Li, Jens Axboe, linux-kernel, linux-raid, linux-block,
	Christoph Hellwig, NeilBrown
  Cc: Ming Lei

The cost is 128 bytes (8*16) of stack space in kernel thread context;
just use the bio helper to retrieve the pages from the bio.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/md/raid10.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 5016ce8163c3..d9d7c79a3bd4 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -4689,7 +4689,15 @@ static int handle_reshape_read_error(struct mddev *mddev,
 	struct r10bio *r10b = &on_stack.r10_bio;
 	int slot = 0;
 	int idx = 0;
-	struct bio_vec *bvec = r10_bio->master_bio->bi_io_vec;
+	struct bio_vec *bvl;
+	struct page *pages[RESYNC_PAGES];
+
+	/*
+	 * This bio is allocated in reshape_request(), and size
+	 * is still RESYNC_PAGES
+	 */
+	bio_for_each_segment_all(bvl, r10_bio->master_bio, idx)
+		pages[idx] = bvl->bv_page;
 
 	r10b->sector = r10_bio->sector;
 	__raid10_find_phys(&conf->prev, r10b);
@@ -4718,7 +4726,7 @@ static int handle_reshape_read_error(struct mddev *mddev,
 			success = sync_page_io(rdev,
 					       addr,
 					       s << 9,
-					       bvec[idx].bv_page,
+					       pages[idx],
 					       REQ_OP_READ, 0, false);
 			rdev_dec_pending(rdev, mddev);
 			rcu_read_lock();
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 02/17] block: introduce bio_remove_last_page()
  2017-02-16 11:45 ` [PATCH 02/17] block: introduce bio_remove_last_page() Ming Lei
@ 2017-02-16 12:08   ` Johannes Thumshirn
  2017-02-16 13:30     ` Ming Lei
  0 siblings, 1 reply; 31+ messages in thread
From: Johannes Thumshirn @ 2017-02-16 12:08 UTC (permalink / raw)
  To: Ming Lei, Shaohua Li, Jens Axboe, linux-kernel, linux-raid,
	linux-block, Christoph Hellwig, NeilBrown

On 02/16/2017 12:45 PM, Ming Lei wrote:
> MD need this helper to remove the last added page, so introduce
> it.
> 
> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
> ---
>  block/bio.c         | 23 +++++++++++++++++++++++
>  include/linux/bio.h |  1 +
>  2 files changed, 24 insertions(+)
> 
> diff --git a/block/bio.c b/block/bio.c
> index 5eec5e08417f..0ce7ffcd7939 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -837,6 +837,29 @@ int bio_add_pc_page(struct request_queue *q, struct bio *bio, struct page
>  EXPORT_SYMBOL(bio_add_pc_page);
>  
>  /**
> + *	bio_remove_last_page	-	remove the last added page
> + *	@bio: destination bio
> + *
> + *	Attempt to remove the last added page from the bio_vec maplist.
> + */
> +void bio_remove_last_page(struct bio *bio)
> +{
> +	/*
> +	 * cloned bio must not modify vec list
> +	 */
> +	if (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED)))
> +		return;
> +
> +	if (bio->bi_vcnt > 0) {

In patch 1 you introduce bio_segments_all() with the log message 'So
that we can replace the direct access to .bi_vcnt.' Here you introduce a
new direct access to it (plus the duplicated WARN_ON_ONCE()).

Maybe use the helper directly here (I admit I haven't gone through the
whole series yet, so I can't see if the change is made later).

Byte,
	Johannes

-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 13/17] md: raid1: use bio_segments_all()
  2017-02-16 11:45 ` [PATCH 13/17] md: raid1: use bio_segments_all() Ming Lei
@ 2017-02-16 12:35   ` Johannes Thumshirn
  2017-02-16 13:32     ` Ming Lei
  0 siblings, 1 reply; 31+ messages in thread
From: Johannes Thumshirn @ 2017-02-16 12:35 UTC (permalink / raw)
  To: Ming Lei, Shaohua Li, Jens Axboe, linux-kernel, linux-raid,
	linux-block, Christoph Hellwig, NeilBrown

On 02/16/2017 12:45 PM, Ming Lei wrote:
> @@ -998,7 +998,8 @@ static void alloc_behind_pages(struct bio *bio, struct r1bio *r1_bio)
>  {
>  	int i;
>  	struct bio_vec *bvec;
> -	struct bio_vec *bvecs = kzalloc(bio->bi_vcnt * sizeof(struct bio_vec),
> +	unsigned vcnt = bio_segments_all(bio);
> +	struct bio_vec *bvecs = kzalloc(vcnt * sizeof(struct bio_vec),
>  					GFP_NOIO);

Maybe use kcalloc() instead of kzalloc() with a multiplication.

Byte,
	Johannes
-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 02/17] block: introduce bio_remove_last_page()
  2017-02-16 12:08   ` Johannes Thumshirn
@ 2017-02-16 13:30     ` Ming Lei
  2017-02-16 13:40       ` Johannes Thumshirn
  0 siblings, 1 reply; 31+ messages in thread
From: Ming Lei @ 2017-02-16 13:30 UTC (permalink / raw)
  To: Johannes Thumshirn
  Cc: Shaohua Li, Jens Axboe, Linux Kernel Mailing List,
	open list:SOFTWARE RAID (Multiple Disks) SUPPORT, linux-block,
	Christoph Hellwig, NeilBrown

On Thu, Feb 16, 2017 at 8:08 PM, Johannes Thumshirn <jthumshirn@suse.de> wrote:
> On 02/16/2017 12:45 PM, Ming Lei wrote:
>> MD need this helper to remove the last added page, so introduce
>> it.
>>
>> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
>> ---
>>  block/bio.c         | 23 +++++++++++++++++++++++
>>  include/linux/bio.h |  1 +
>>  2 files changed, 24 insertions(+)
>>
>> diff --git a/block/bio.c b/block/bio.c
>> index 5eec5e08417f..0ce7ffcd7939 100644
>> --- a/block/bio.c
>> +++ b/block/bio.c
>> @@ -837,6 +837,29 @@ int bio_add_pc_page(struct request_queue *q, struct bio *bio, struct page
>>  EXPORT_SYMBOL(bio_add_pc_page);
>>
>>  /**
>> + *   bio_remove_last_page    -       remove the last added page
>> + *   @bio: destination bio
>> + *
>> + *   Attempt to remove the last added page from the bio_vec maplist.
>> + */
>> +void bio_remove_last_page(struct bio *bio)
>> +{
>> +     /*
>> +      * cloned bio must not modify vec list
>> +      */
>> +     if (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED)))
>> +             return;
>> +
>> +     if (bio->bi_vcnt > 0) {
>
> In patch 1 you introduce bio_segments_all() with the log message 'So
> that we can replace the direct access to .bi_vcnt.' Here you introduce a
> new direct access to it (plus the duplicated WARN_ON_ONCE()).
>
> Maybe use the helper directly here (I admit I haven't gone through the
> whole series yet, so I can't see if the change is made later).

Firstly MD does need one helper to remove the last added page, as you
can see there are three such uses in patch3.

Secondly both the two helpers will be changed once multipage bvec
is supported, that means we have to change MD too after multipage bvec
if just using bio_segments_all() to replace .bi_vcnt for removing
the last added page.



Thanks,
Ming Lei

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 13/17] md: raid1: use bio_segments_all()
  2017-02-16 12:35   ` Johannes Thumshirn
@ 2017-02-16 13:32     ` Ming Lei
  2017-02-16 13:34       ` Johannes Thumshirn
  0 siblings, 1 reply; 31+ messages in thread
From: Ming Lei @ 2017-02-16 13:32 UTC (permalink / raw)
  To: Johannes Thumshirn
  Cc: Shaohua Li, Jens Axboe, Linux Kernel Mailing List,
	open list:SOFTWARE RAID (Multiple Disks) SUPPORT, linux-block,
	Christoph Hellwig, NeilBrown

On Thu, Feb 16, 2017 at 8:35 PM, Johannes Thumshirn <jthumshirn@suse.de> wrote:
> On 02/16/2017 12:45 PM, Ming Lei wrote:
>> @@ -998,7 +998,8 @@ static void alloc_behind_pages(struct bio *bio, struct r1bio *r1_bio)
>>  {
>>       int i;
>>       struct bio_vec *bvec;
>> -     struct bio_vec *bvecs = kzalloc(bio->bi_vcnt * sizeof(struct bio_vec),
>> +     unsigned vcnt = bio_segments_all(bio);
>> +     struct bio_vec *bvecs = kzalloc(vcnt * sizeof(struct bio_vec),
>>                                       GFP_NOIO);
>
> Maybe use kcalloc() instead of kzalloc() with a multiplication.

That doesn't belong to this patch, which just wants to remove direct
access to .bi_vcnt.


Thanks,
Ming Lei

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 13/17] md: raid1: use bio_segments_all()
  2017-02-16 13:32     ` Ming Lei
@ 2017-02-16 13:34       ` Johannes Thumshirn
  2017-02-16 13:38         ` Ming Lei
  0 siblings, 1 reply; 31+ messages in thread
From: Johannes Thumshirn @ 2017-02-16 13:34 UTC (permalink / raw)
  To: Ming Lei
  Cc: Shaohua Li, Jens Axboe, Linux Kernel Mailing List,
	open list:SOFTWARE RAID (Multiple Disks) SUPPORT, linux-block,
	Christoph Hellwig, NeilBrown

On 02/16/2017 02:32 PM, Ming Lei wrote:
> On Thu, Feb 16, 2017 at 8:35 PM, Johannes Thumshirn <jthumshirn@suse.de> wrote:
>> On 02/16/2017 12:45 PM, Ming Lei wrote:
>>> @@ -998,7 +998,8 @@ static void alloc_behind_pages(struct bio *bio, struct r1bio *r1_bio)
>>>  {
>>>       int i;
>>>       struct bio_vec *bvec;
>>> -     struct bio_vec *bvecs = kzalloc(bio->bi_vcnt * sizeof(struct bio_vec),
>>> +     unsigned vcnt = bio_segments_all(bio);
>>> +     struct bio_vec *bvecs = kzalloc(vcnt * sizeof(struct bio_vec),
>>>                                       GFP_NOIO);
>>
>> Maybe use kcalloc() instead of kzalloc() with a multiplication.
> 
> That doesn't belong to this patch, which just wants to remove direct
> access to .bi_vcnt.

But you're touching it anyways, aren't you?


-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 13/17] md: raid1: use bio_segments_all()
  2017-02-16 13:34       ` Johannes Thumshirn
@ 2017-02-16 13:38         ` Ming Lei
  0 siblings, 0 replies; 31+ messages in thread
From: Ming Lei @ 2017-02-16 13:38 UTC (permalink / raw)
  To: Johannes Thumshirn
  Cc: Shaohua Li, Jens Axboe, Linux Kernel Mailing List,
	open list:SOFTWARE RAID (Multiple Disks) SUPPORT, linux-block,
	Christoph Hellwig, NeilBrown

On Thu, Feb 16, 2017 at 9:34 PM, Johannes Thumshirn <jthumshirn@suse.de> wrote:
> On 02/16/2017 02:32 PM, Ming Lei wrote:
>> On Thu, Feb 16, 2017 at 8:35 PM, Johannes Thumshirn <jthumshirn@suse.de> wrote:
>>> On 02/16/2017 12:45 PM, Ming Lei wrote:
>>>> @@ -998,7 +998,8 @@ static void alloc_behind_pages(struct bio *bio, struct r1bio *r1_bio)
>>>>  {
>>>>       int i;
>>>>       struct bio_vec *bvec;
>>>> -     struct bio_vec *bvecs = kzalloc(bio->bi_vcnt * sizeof(struct bio_vec),
>>>> +     unsigned vcnt = bio_segments_all(bio);
>>>> +     struct bio_vec *bvecs = kzalloc(vcnt * sizeof(struct bio_vec),
>>>>                                       GFP_NOIO);
>>>
>>> Maybe use kcalloc() instead of kzalloc() with a multiplication.
>>
>> That doesn't belong to this patch, which just wants to remove direct
>> access to .bi_vcnt.
>
> But you're touching it anyways, aren't you?

Don't you know the policy of 'do one thing, do it better' in one patch?

If you want to switch to kcalloc(), just post a patch, that is fine, but
please don't push me to do that in this patch.

Thanks,
Ming Lei

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 02/17] block: introduce bio_remove_last_page()
  2017-02-16 13:30     ` Ming Lei
@ 2017-02-16 13:40       ` Johannes Thumshirn
  2017-02-16 13:59         ` Ming Lei
  0 siblings, 1 reply; 31+ messages in thread
From: Johannes Thumshirn @ 2017-02-16 13:40 UTC (permalink / raw)
  To: Ming Lei
  Cc: Shaohua Li, Jens Axboe, Linux Kernel Mailing List,
	open list:SOFTWARE RAID (Multiple Disks) SUPPORT, linux-block,
	Christoph Hellwig, NeilBrown

On 02/16/2017 02:30 PM, Ming Lei wrote:
> On Thu, Feb 16, 2017 at 8:08 PM, Johannes Thumshirn <jthumshirn@suse.de> wrote:
>> On 02/16/2017 12:45 PM, Ming Lei wrote:
>>> MD need this helper to remove the last added page, so introduce
>>> it.
>>>
>>> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
>>> ---
>>>  block/bio.c         | 23 +++++++++++++++++++++++
>>>  include/linux/bio.h |  1 +
>>>  2 files changed, 24 insertions(+)
>>>
>>> diff --git a/block/bio.c b/block/bio.c
>>> index 5eec5e08417f..0ce7ffcd7939 100644
>>> --- a/block/bio.c
>>> +++ b/block/bio.c
>>> @@ -837,6 +837,29 @@ int bio_add_pc_page(struct request_queue *q, struct bio *bio, struct page
>>>  EXPORT_SYMBOL(bio_add_pc_page);
>>>
>>>  /**
>>> + *   bio_remove_last_page    -       remove the last added page
>>> + *   @bio: destination bio
>>> + *
>>> + *   Attempt to remove the last added page from the bio_vec maplist.
>>> + */
>>> +void bio_remove_last_page(struct bio *bio)
>>> +{
>>> +     /*
>>> +      * cloned bio must not modify vec list
>>> +      */
>>> +     if (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED)))
>>> +             return;
>>> +
>>> +     if (bio->bi_vcnt > 0) {
>>
>> In patch 1 you introduce bio_segments_all() with the log message 'So
>> that we can replace the direct access to .bi_vcnt.' Here you introduce a
>> new direct access to it (plus the duplicated WARN_ON_ONCE()).
>>
>> Maybe use the helper directly here (I admit I haven't gone through the
>> whole series yet, so I can't see if the change is made later).
> 
> Firstly MD does need one helper to remove the last added page, as you
> can see there are three such uses in patch3.
> 
> Secondly both the two helpers will be changed once multipage bvec
> is supported, that means we have to change MD too after multipage bvec
> if just using bio_segments_all() to replace .bi_vcnt for removing
> the last added page.

I'm not sure if we're talking past each other here, I assumed you'd do
something like:

void bio_remove_last_page(struct bio *bio)
{
	int vcnt = bio_segments_all(bio);

	if (bio_flagged(bio, BIO_CLONED))
		return;

	if (vcnt > 0) {
		struct bio_vec *bv = &bio->bi_io_vec[vcnt - 1];

		bio->bi_iter.bi_size -= bv->bv_len;
		bio->bi_vcnt--;
	}
}

-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 02/17] block: introduce bio_remove_last_page()
  2017-02-16 13:40       ` Johannes Thumshirn
@ 2017-02-16 13:59         ` Ming Lei
  2017-02-16 14:10           ` Johannes Thumshirn
  0 siblings, 1 reply; 31+ messages in thread
From: Ming Lei @ 2017-02-16 13:59 UTC (permalink / raw)
  To: Johannes Thumshirn
  Cc: Shaohua Li, Jens Axboe, Linux Kernel Mailing List,
	open list:SOFTWARE RAID (Multiple Disks) SUPPORT, linux-block,
	Christoph Hellwig, NeilBrown

On Thu, Feb 16, 2017 at 9:40 PM, Johannes Thumshirn <jthumshirn@suse.de> wrote:
> On 02/16/2017 02:30 PM, Ming Lei wrote:
>> On Thu, Feb 16, 2017 at 8:08 PM, Johannes Thumshirn <jthumshirn@suse.de> wrote:
>>> On 02/16/2017 12:45 PM, Ming Lei wrote:
>>>> MD need this helper to remove the last added page, so introduce
>>>> it.
>>>>
>>>> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
>>>> ---
>>>>  block/bio.c         | 23 +++++++++++++++++++++++
>>>>  include/linux/bio.h |  1 +
>>>>  2 files changed, 24 insertions(+)
>>>>
>>>> diff --git a/block/bio.c b/block/bio.c
>>>> index 5eec5e08417f..0ce7ffcd7939 100644
>>>> --- a/block/bio.c
>>>> +++ b/block/bio.c
>>>> @@ -837,6 +837,29 @@ int bio_add_pc_page(struct request_queue *q, struct bio *bio, struct page
>>>>  EXPORT_SYMBOL(bio_add_pc_page);
>>>>
>>>>  /**
>>>> + *   bio_remove_last_page    -       remove the last added page
>>>> + *   @bio: destination bio
>>>> + *
>>>> + *   Attempt to remove the last added page from the bio_vec maplist.
>>>> + */
>>>> +void bio_remove_last_page(struct bio *bio)
>>>> +{
>>>> +     /*
>>>> +      * cloned bio must not modify vec list
>>>> +      */
>>>> +     if (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED)))
>>>> +             return;
>>>> +
>>>> +     if (bio->bi_vcnt > 0) {
>>>
>>> In patch 1 you introduce bio_segments_all() with the log message 'So
>>> that we can replace the direct access to .bi_vcnt.' Here you introduce a
>>> new direct access to it (plus the duplicated WARN_ON_ONCE()).
>>>
>>> Maybe use the helper directly here (I admit I haven't gone through the
>>> whole series yet, so I can't see if the change is made later).
>>
>> Firstly, MD does need one helper to remove the last added page; as you
>> can see, there are three such uses in patch 3.
>>
>> Secondly, both of these helpers will be changed once multipage bvec is
>> supported, which means we would have to change MD again after multipage
>> bvec if we just used bio_segments_all() to replace .bi_vcnt for
>> removing the last added page.
>
> I'm not sure if we're talking past each other here; I assumed you'd do
> something like:
>
> void bio_remove_last_page(struct bio *bio)
> {
>         int vcnt = bio_segments_all(bio);
>
>         if (bio_flagged(bio, BIO_CLONED))
>                 return;
>
>         if (vcnt > 0) {
>                 struct bio_vec *bv = &bio->bi_io_vec[vcnt - 1];
>
>                 bio->bi_iter.bi_size -= bv->bv_len;
>                 bio->bi_vcnt--;
>         }
> }

What we are doing is removing the external direct access to the bvec
table in drivers or filesystems, because they may misuse it. For
example, drivers often use .bi_vcnt to get the page count in a bio,
but it actually only means the bvec count.

And the implementation in the block layer still needs to touch the
table directly, especially for the sake of efficiency, because we
understand the details there and won't misuse it.
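
As an illustration only (the helper below is made up, nothing in this
series adds it), the driver-safe way to count pages is to walk the
iterator rather than read .bi_vcnt, which is essentially what the
existing bio_segments() helper already does:

static unsigned count_pages_in_bio(struct bio *bio)
{
	struct bio_vec bv;
	struct bvec_iter iter;
	unsigned nr = 0;

	/* walk the iterator instead of trusting ->bi_vcnt */
	bio_for_each_segment(bv, bio, iter)
		nr++;

	return nr;
}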


Thanks,
Ming Lei

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 02/17] block: introduce bio_remove_last_page()
  2017-02-16 13:59         ` Ming Lei
@ 2017-02-16 14:10           ` Johannes Thumshirn
  0 siblings, 0 replies; 31+ messages in thread
From: Johannes Thumshirn @ 2017-02-16 14:10 UTC (permalink / raw)
  To: Ming Lei
  Cc: Shaohua Li, Jens Axboe, Linux Kernel Mailing List,
	open list:SOFTWARE RAID (Multiple Disks) SUPPORT, linux-block,
	Christoph Hellwig, NeilBrown

On 02/16/2017 02:59 PM, Ming Lei wrote:
> On Thu, Feb 16, 2017 at 9:40 PM, Johannes Thumshirn <jthumshirn@suse.de> wrote:
>> On 02/16/2017 02:30 PM, Ming Lei wrote:
>>> On Thu, Feb 16, 2017 at 8:08 PM, Johannes Thumshirn <jthumshirn@suse.de> wrote:
>>>> On 02/16/2017 12:45 PM, Ming Lei wrote:
>>>>> MD needs this helper to remove the last added page, so introduce
>>>>> it.
>>>>>
>>>>> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
>>>>> ---
>>>>>  block/bio.c         | 23 +++++++++++++++++++++++
>>>>>  include/linux/bio.h |  1 +
>>>>>  2 files changed, 24 insertions(+)
>>>>>
>>>>> diff --git a/block/bio.c b/block/bio.c
>>>>> index 5eec5e08417f..0ce7ffcd7939 100644
>>>>> --- a/block/bio.c
>>>>> +++ b/block/bio.c
>>>>> @@ -837,6 +837,29 @@ int bio_add_pc_page(struct request_queue *q, struct bio *bio, struct page
>>>>>  EXPORT_SYMBOL(bio_add_pc_page);
>>>>>
>>>>>  /**
>>>>> + *   bio_remove_last_page    -       remove the last added page
>>>>> + *   @bio: destination bio
>>>>> + *
>>>>> + *   Attempt to remove the last added page from the bio_vec maplist.
>>>>> + */
>>>>> +void bio_remove_last_page(struct bio *bio)
>>>>> +{
>>>>> +     /*
>>>>> +      * cloned bio must not modify vec list
>>>>> +      */
>>>>> +     if (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED)))
>>>>> +             return;
>>>>> +
>>>>> +     if (bio->bi_vcnt > 0) {
>>>>
>>>> In patch 1 you introduce bio_segments_all() with the log message 'So
>>>> that we can replace the direct access to .bi_vcnt.' Here you introduce a
>>>> new direct access to it (plus the duplicated WARN_ON_ONCE()).
>>>>
>>>> Maybe use the helper directly here (I admit I haven't gone through the
>>>> whole series yet, so I can't see if the change is made later).
>>>
>>> Firstly, MD does need one helper to remove the last added page; as you
>>> can see, there are three such uses in patch 3.
>>>
>>> Secondly, both of these helpers will be changed once multipage bvec is
>>> supported, which means we would have to change MD again after multipage
>>> bvec if we just used bio_segments_all() to replace .bi_vcnt for
>>> removing the last added page.
>>
>> I'm not sure if we're talking past each other here; I assumed you'd do
>> something like:
>>
>> void bio_remove_last_page(struct bio *bio)
>> {
>>         int vcnt = bio_segments_all(bio);
>>
>>         if (bio_flagged(bio, BIO_CLONED))
>>                 return;
>>
>>         if (vcnt > 0) {
>>                 struct bio_vec *bv = &bio->bi_io_vec[vcnt - 1];
>>
>>                 bio->bi_iter.bi_size -= bv->bv_len;
>>                 bio->bi_vcnt--;
>>         }
>> }
> 
> What we are doing is removing the external direct access to the bvec
> table in drivers or filesystems, because they may misuse it. For
> example, drivers often use .bi_vcnt to get the page count in a bio,
> but it actually only means the bvec count.
>
> And the implementation in the block layer still needs to touch the
> table directly, especially for the sake of efficiency, because we
> understand the details there and won't misuse it.

Ah OK, so bio_segments_all() is intended for drivers.
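
Purely as an illustration (the function below is made up and not part
of the series), a driver that built a bio itself and never cloned it
could then use the helper instead of an open-coded .bi_vcnt check:

static bool all_pages_added(struct bio *bio, unsigned total)
{
	/* driver-owned, non-cloned bio, so bio_segments_all() is valid */
	return bio_segments_all(bio) == total;
}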

Thanks for the clarification,
	Johannes


-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 00/17] md: cleanup on direct access to bvec table
  2017-02-16 11:45 [PATCH 00/17] md: cleanup on direct access to bvec table Ming Lei
                   ` (16 preceding siblings ...)
  2017-02-16 11:45 ` [PATCH 17/17] md: raid10: avoid direct access to bvec table in handle_reshape_read_error Ming Lei
@ 2017-02-16 22:16 ` Shaohua Li
  2017-02-17  1:25   ` Ming Lei
  17 siblings, 1 reply; 31+ messages in thread
From: Shaohua Li @ 2017-02-16 22:16 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-kernel, linux-raid, linux-block,
	Christoph Hellwig, NeilBrown

On Thu, Feb 16, 2017 at 07:45:30PM +0800, Ming Lei wrote:
> In MD's resync I/O path, there are lots of direct access to bio's
> bvec table. This patchset kills most of them, and the conversion
> is quite straightforward.

I don't like this approach. MD uses a hacky way to manage the allocated pages,
and this is the root of the problem. The patches add another hacky way to do the
management. I'd like to see explicit management of the pages: for example, add
a data structure in r1bio to manage the pages; then we can use the existing
APIs for all the stuff we need.

Thanks,
Shaohua

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 00/17] md: cleanup on direct access to bvec table
  2017-02-16 22:16 ` [PATCH 00/17] md: cleanup on direct access to bvec table Shaohua Li
@ 2017-02-17  1:25   ` Ming Lei
  2017-02-17  4:16     ` Shaohua Li
  0 siblings, 1 reply; 31+ messages in thread
From: Ming Lei @ 2017-02-17  1:25 UTC (permalink / raw)
  To: Shaohua Li
  Cc: Jens Axboe, Linux Kernel Mailing List,
	open list:SOFTWARE RAID (Multiple Disks) SUPPORT, linux-block,
	Christoph Hellwig, NeilBrown

Hi Shaohua,

On Fri, Feb 17, 2017 at 6:16 AM, Shaohua Li <shli@kernel.org> wrote:
> On Thu, Feb 16, 2017 at 07:45:30PM +0800, Ming Lei wrote:
>> In MD's resync I/O path, there are lots of direct access to bio's
>> bvec table. This patchset kills most of them, and the conversion
>> is quite straightforward.
>
> I don't like this approach. MD uses a hacky way to manage the allocated pages,
> and this is the root of the problem. The patches add another hacky way to do the

Yes, I agree, and bio_iov_iter_get_pages() uses this kind of hacky way too
actually.

> management. I'd like to see explicit management of the pages: for example, add
> a data structure in r1bio to manage the pages; then we can use the existing
> APIs for all the stuff we need.

Yeah, that is definitely clean, but we have to pay the following cost:

- allocate at least N * (128 + 4) bytes for each r1_bio/r10_bio
- N is pool_info.raid_disks for raid1, and conf->copies for raid10

If we are happy to introduce the cost, I can go this way in V1.
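
To make the (128 + 4) figure concrete: assuming 4K pages, the 64K
RESYNC_BLOCK_SIZE and a 64-bit kernel, RESYNC_PAGES is 16, so that is
16 page pointers (128 bytes) plus a 4-byte page index per member bio.
A rough sketch of such a structure (name and layout are only an
assumption here, not something this series adds):

struct resync_pages {
	int		idx;			/* next page to fill/read */
	struct page	*pages[RESYNC_PAGES];	/* pre-allocated resync pages */
};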


Thanks,
Ming Lei

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 00/17] md: cleanup on direct access to bvec table
  2017-02-17  1:25   ` Ming Lei
@ 2017-02-17  4:16     ` Shaohua Li
  0 siblings, 0 replies; 31+ messages in thread
From: Shaohua Li @ 2017-02-17  4:16 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, Linux Kernel Mailing List,
	open list:SOFTWARE RAID (Multiple Disks) SUPPORT, linux-block,
	Christoph Hellwig, NeilBrown

On Fri, Feb 17, 2017 at 09:25:27AM +0800, Ming Lei wrote:
> Hi Shaohua,
> 
> On Fri, Feb 17, 2017 at 6:16 AM, Shaohua Li <shli@kernel.org> wrote:
> > On Thu, Feb 16, 2017 at 07:45:30PM +0800, Ming Lei wrote:
> >> In MD's resync I/O path, there are lots of direct access to bio's
> >> bvec table. This patchset kills most of them, and the conversion
> >> is quite straightforward.
> >
> > I don't like this approach. MD uses a hacky way to manage the allocated pages,
> > and this is the root of the problem. The patches add another hacky way to do the
> 
> Yes, I agree, and bio_iov_iter_get_pages() uses this kind of hacky way too
> actually.
> 
> > management. I'd like to see explicit management of the pages: for example, add
> > a data structure in r1bio to manage the pages; then we can use the existing
> > APIs for all the stuff we need.
> 
> Yeah, that is definitely clean, but we have to pay the following cost:
> 
> - allocate at least N * (128 + 4) bytes for each r1_bio/r10_bio
> - N is pool_info.raid_disks for raid1, and conf->copies for raid10
> 
> If we are happy to introduce the cost, I can go this way in V1.

It's not a big deal. The number of inflight bios shouldn't be big, so the
r1_bio count isn't big either. We don't waste much.
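
For a rough sense of scale (numbers picked only as an example): raid1
over 4 disks with 32 r1_bios in flight would need 32 * 4 * (128 + 4) =
16896 bytes, i.e. well under 20KB in total.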

Thanks,
Shaohua

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 12/17] md: raid1: avoid direct access to bvec table in process_checks()
  2017-02-16 11:45 ` [PATCH 12/17] md: raid1: avoid direct access to bvec table " Ming Lei
@ 2017-02-17  8:33   ` kbuild test robot
  0 siblings, 0 replies; 31+ messages in thread
From: kbuild test robot @ 2017-02-17  8:33 UTC (permalink / raw)
  To: Ming Lei
  Cc: kbuild-all, Shaohua Li, Jens Axboe, linux-kernel, linux-raid,
	linux-block, Christoph Hellwig, NeilBrown, Ming Lei

[-- Attachment #1: Type: text/plain, Size: 6268 bytes --]

Hi Ming,

[auto build test WARNING on linus/master]
[also build test WARNING on v4.10-rc8]
[cannot apply to next-20170216]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Ming-Lei/md-cleanup-on-direct-access-to-bvec-table/20170216-210357
config: powerpc-cell_defconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=powerpc 

Note: it may well be a FALSE warning. FWIW you are at least aware of it now.
http://gcc.gnu.org/wiki/Better_Uninitialized_Warnings

All warnings (new ones prefixed by >>):

   In file included from arch/powerpc/include/asm/page.h:331:0,
                    from arch/powerpc/include/asm/thread_info.h:34,
                    from include/linux/thread_info.h:25,
                    from include/asm-generic/preempt.h:4,
                    from ./arch/powerpc/include/generated/asm/preempt.h:1,
                    from include/linux/preempt.h:59,
                    from include/linux/spinlock.h:50,
                    from include/linux/mmzone.h:7,
                    from include/linux/gfp.h:5,
                    from include/linux/slab.h:14,
                    from drivers/md/raid1.c:34:
   drivers/md/raid1.c: In function 'raid1d':
>> include/asm-generic/memory_model.h:54:52: warning: 'sbio_pages$' may be used uninitialized in this function [-Wmaybe-uninitialized]
    #define __page_to_pfn(page) (unsigned long)((page) - vmemmap)
                                                       ^
   drivers/md/raid1.c:2008:42: note: 'sbio_pages$' was declared here
     struct page *pbio_pages[RESYNC_PAGES], *sbio_pages[RESYNC_PAGES];
                                             ^~~~~~~~~~
   drivers/md/raid1.c:2075:9: warning: 'page_len$' may be used uninitialized in this function [-Wmaybe-uninitialized]
        if (memcmp(page_address(p),
            ^~~~~~~~~~~~~~~~~~~~~~~
            page_address(s),
            ~~~~~~~~~~~~~~~~
            page_len[j]))
            ~~~~~~~~~~~~
   drivers/md/raid1.c:2007:6: note: 'page_len$' was declared here
     int page_len[RESYNC_PAGES];
         ^~~~~~~~
   In file included from arch/powerpc/include/asm/page.h:331:0,
                    from arch/powerpc/include/asm/thread_info.h:34,
                    from include/linux/thread_info.h:25,
                    from include/asm-generic/preempt.h:4,
                    from ./arch/powerpc/include/generated/asm/preempt.h:1,
                    from include/linux/preempt.h:59,
                    from include/linux/spinlock.h:50,
                    from include/linux/mmzone.h:7,
                    from include/linux/gfp.h:5,
                    from include/linux/slab.h:14,
                    from drivers/md/raid1.c:34:
>> include/asm-generic/memory_model.h:54:52: warning: 'pbio_pages$' may be used uninitialized in this function [-Wmaybe-uninitialized]
    #define __page_to_pfn(page) (unsigned long)((page) - vmemmap)
                                                       ^
   drivers/md/raid1.c:2008:15: note: 'pbio_pages$' was declared here
     struct page *pbio_pages[RESYNC_PAGES], *sbio_pages[RESYNC_PAGES];
                  ^~~~~~~~~~
   drivers/md/raid1.c:1978:8: warning: 'pages$' may be used uninitialized in this function [-Wmaybe-uninitialized]
       if (r1_sync_page_io(rdev, sect, s,
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
             pages[idx],
             ~~~~~~~~~~~
             READ) != 0)
             ~~~~~
   drivers/md/raid1.c:1872:15: note: 'pages$' was declared here
     struct page *pages[RESYNC_PAGES];
                  ^~~~~

vim +54 include/asm-generic/memory_model.h

a117e66e KAMEZAWA Hiroyuki  2006-03-27  38  ({	unsigned long __pfn = (pfn);		\
c5d71243 Rafael J. Wysocki  2008-11-08  39  	unsigned long __nid = arch_pfn_to_nid(__pfn);  \
a117e66e KAMEZAWA Hiroyuki  2006-03-27  40  	NODE_DATA(__nid)->node_mem_map + arch_local_page_offset(__pfn, __nid);\
a117e66e KAMEZAWA Hiroyuki  2006-03-27  41  })
a117e66e KAMEZAWA Hiroyuki  2006-03-27  42  
67de6482 Andy Whitcroft     2006-06-23  43  #define __page_to_pfn(pg)						\
aa462abe Ian Campbell       2011-08-17  44  ({	const struct page *__pg = (pg);					\
a0140c1d KAMEZAWA Hiroyuki  2006-03-27  45  	struct pglist_data *__pgdat = NODE_DATA(page_to_nid(__pg));	\
a0140c1d KAMEZAWA Hiroyuki  2006-03-27  46  	(unsigned long)(__pg - __pgdat->node_mem_map) +			\
a0140c1d KAMEZAWA Hiroyuki  2006-03-27  47  	 __pgdat->node_start_pfn;					\
a117e66e KAMEZAWA Hiroyuki  2006-03-27  48  })
a117e66e KAMEZAWA Hiroyuki  2006-03-27  49  
8f6aac41 Christoph Lameter  2007-10-16  50  #elif defined(CONFIG_SPARSEMEM_VMEMMAP)
8f6aac41 Christoph Lameter  2007-10-16  51  
af901ca1 André Goddard Rosa 2009-11-14  52  /* memmap is virtually contiguous.  */
8f6aac41 Christoph Lameter  2007-10-16  53  #define __pfn_to_page(pfn)	(vmemmap + (pfn))
32272a26 Martin Schwidefsky 2008-12-25 @54  #define __page_to_pfn(page)	(unsigned long)((page) - vmemmap)
8f6aac41 Christoph Lameter  2007-10-16  55  
a117e66e KAMEZAWA Hiroyuki  2006-03-27  56  #elif defined(CONFIG_SPARSEMEM)
a117e66e KAMEZAWA Hiroyuki  2006-03-27  57  /*
1a49123b Zhang Yanfei       2013-10-03  58   * Note: section's mem_map is encoded to reflect its start_pfn.
a117e66e KAMEZAWA Hiroyuki  2006-03-27  59   * section[i].section_mem_map == mem_map's address - start_pfn;
a117e66e KAMEZAWA Hiroyuki  2006-03-27  60   */
67de6482 Andy Whitcroft     2006-06-23  61  #define __page_to_pfn(pg)					\
aa462abe Ian Campbell       2011-08-17  62  ({	const struct page *__pg = (pg);				\

:::::: The code at line 54 was first introduced by commit
:::::: 32272a26974d2027384fd4010cd1780fca425d94 [S390] __page_to_pfn warnings

:::::: TO: Martin Schwidefsky <schwidefsky@de.ibm.com>
:::::: CC: Martin Schwidefsky <schwidefsky@de.ibm.com>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 18602 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2017-02-17  8:34 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-16 11:45 [PATCH 00/17] md: cleanup on direct access to bvec table Ming Lei
2017-02-16 11:45 ` [PATCH 01/17] block: introduce bio_segments_all() Ming Lei
2017-02-16 11:45 ` [PATCH 02/17] block: introduce bio_remove_last_page() Ming Lei
2017-02-16 12:08   ` Johannes Thumshirn
2017-02-16 13:30     ` Ming Lei
2017-02-16 13:40       ` Johannes Thumshirn
2017-02-16 13:59         ` Ming Lei
2017-02-16 14:10           ` Johannes Thumshirn
2017-02-16 11:45 ` [PATCH 03/17] md: raid1/raid10: use bio_remove_last_page() Ming Lei
2017-02-16 11:45 ` [PATCH 04/17] md: introduce helpers for dealing with fetch/store preallocated pages in bio Ming Lei
2017-02-16 11:45 ` [PATCH 05/17] md: raid1/raid10: use the introduced helpers Ming Lei
2017-02-16 11:45 ` [PATCH 06/17] md: raid1/raid10: borrow .bi_error as pre-allocated page index Ming Lei
2017-02-16 11:45 ` [PATCH 07/17] md: raid1/raid10: don't use .bi_vcnt to check if all pages are added Ming Lei
2017-02-16 11:45 ` [PATCH 08/17] md: raid1: simplify r1buf_pool_free() Ming Lei
2017-02-16 11:45 ` [PATCH 09/17] md: raid1/raid10: use bio helper in *_pool_free Ming Lei
2017-02-16 11:45 ` [PATCH 10/17] md: raid1: remove direct access to bvec table in fix_sync_read_error Ming Lei
2017-02-16 11:45 ` [PATCH 11/17] md: raid1: use bio helper in process_checks() Ming Lei
2017-02-16 11:45 ` [PATCH 12/17] md: raid1: avoid direct access to bvec table " Ming Lei
2017-02-17  8:33   ` kbuild test robot
2017-02-16 11:45 ` [PATCH 13/17] md: raid1: use bio_segments_all() Ming Lei
2017-02-16 12:35   ` Johannes Thumshirn
2017-02-16 13:32     ` Ming Lei
2017-02-16 13:34       ` Johannes Thumshirn
2017-02-16 13:38         ` Ming Lei
2017-02-16 11:45 ` [PATCH 14/17] md: raid10: avoid direct access to bvec table in sync_request_write() Ming Lei
2017-02-16 11:45 ` [PATCH 15/17] md: raid10: avoid direct access to bvec table in fix_recovery_read_error Ming Lei
2017-02-16 11:45 ` [PATCH 16/17] md: raid10: avoid direct access to bvec table in reshape_request Ming Lei
2017-02-16 11:45 ` [PATCH 17/17] md: raid10: avoid direct access to bvec table in handle_reshape_read_error Ming Lei
2017-02-16 22:16 ` [PATCH 00/17] md: cleanup on direct access to bvec table Shaohua Li
2017-02-17  1:25   ` Ming Lei
2017-02-17  4:16     ` Shaohua Li
