From: NeilBrown <neilb@suse.com>
To: Shaohua Li <shli@kernel.org>, Jens Axboe <axboe@fb.com>,
	linux-raid@vger.kernel.org, linux-block@vger.kernel.org,
	Christoph Hellwig <hch@infradead.org>
Cc: Ming Lei <tom.leiming@gmail.com>
Subject: Re: [PATCH v3 05/14] md: raid1: don't use bio's vec table to manage resync pages
Date: Mon, 10 Jul 2017 09:09:08 +1000
Message-ID: <87mv8d5ht7.fsf@notabene.neil.brown.name>
In-Reply-To: <20170316161235.27110-6-tom.leiming@gmail.com>

On Fri, Mar 17 2017, Ming Lei wrote:

> Now we allocate one page array for managing resync pages, instead
> of using bio's vec table to do that, and the old way is very hacky
> and won't work any more if multipage bvec is enabled.
>
> The introduced cost is that we need to allocate (128 + 16) * raid_disks
> bytes per r1_bio, and it is fine because the inflight r1_bio for
> resync shouldn't be much, as pointed by Shaohua.
>
> Also the bio_reset() in raid1_sync_request() is removed because
> all bios are freshly new now and not necessary to reset any more.
>
> This patch can be thought as a cleanup too
>
> Suggested-by: Shaohua Li <shli@kernel.org>
> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
> ---
>  drivers/md/raid1.c | 94 +++++++++++++++++++++++++++++++++++++-----------------
>  1 file changed, 64 insertions(+), 30 deletions(-)
>
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index e30d89690109..0e64beb60e4d 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -80,6 +80,24 @@ static void lower_barrier(struct r1conf *conf, sector_t sector_nr);
>  #define raid1_log(md, fmt, args...)				\
>  	do { if ((md)->queue) blk_add_trace_msg((md)->queue, "raid1 " fmt, ##args); } while (0)
>  
> +/*
> + * 'strct resync_pages' stores actual pages used for doing the resync
> + *  IO, and it is per-bio, so make .bi_private points to it.
> + */
> +static inline struct resync_pages *get_resync_pages(struct bio *bio)
> +{
> +	return bio->bi_private;
> +}
> +
> +/*
> + * for resync bio, r1bio pointer can be retrieved from the per-bio
> + * 'struct resync_pages'.
> + */
> +static inline struct r1bio *get_resync_r1bio(struct bio *bio)
> +{
> +	return get_resync_pages(bio)->raid_bio;
> +}
> +
>  static void * r1bio_pool_alloc(gfp_t gfp_flags, void *data)
>  {
>  	struct pool_info *pi = data;
> @@ -107,12 +125,18 @@ static void * r1buf_pool_alloc(gfp_t gfp_flags, void *data)
>  	struct r1bio *r1_bio;
>  	struct bio *bio;
>  	int need_pages;
> -	int i, j;
> +	int j;
> +	struct resync_pages *rps;
>  
>  	r1_bio = r1bio_pool_alloc(gfp_flags, pi);
>  	if (!r1_bio)
>  		return NULL;
>  
> +	rps = kmalloc(sizeof(struct resync_pages) * pi->raid_disks,
> +		      gfp_flags);
> +	if (!rps)
> +		goto out_free_r1bio;
> +
>  	/*
>  	 * Allocate bios : 1 for reading, n-1 for writing
>  	 */
> @@ -132,22 +156,22 @@ static void * r1buf_pool_alloc(gfp_t gfp_flags, void *data)
>  		need_pages = pi->raid_disks;
>  	else
>  		need_pages = 1;
> -	for (j = 0; j < need_pages; j++) {
> +	for (j = 0; j < pi->raid_disks; j++) {
> +		struct resync_pages *rp = &rps[j];
> +
>  		bio = r1_bio->bios[j];
> -		bio->bi_vcnt = RESYNC_PAGES;
> -
> -		if (bio_alloc_pages(bio, gfp_flags))
> -			goto out_free_pages;
> -	}
> -	/* If not user-requests, copy the page pointers to all bios */
> -	if (!test_bit(MD_RECOVERY_REQUESTED, &pi->mddev->recovery)) {
> -		for (i=0; i<RESYNC_PAGES ; i++)
> -			for (j=1; j<pi->raid_disks; j++) {
> -				struct page *page =
> -					r1_bio->bios[0]->bi_io_vec[i].bv_page;
> -				get_page(page);
> -				r1_bio->bios[j]->bi_io_vec[i].bv_page = page;
> -			}
> +
> +		if (j < need_pages) {
> +			if (resync_alloc_pages(rp, gfp_flags))
> +				goto out_free_pages;
> +		} else {
> +			memcpy(rp, &rps[0], sizeof(*rp));
> +			resync_get_all_pages(rp);
> +		}
> +
> +		rp->idx = 0;

This is the only place that ->idx is initialized, in r1buf_pool_alloc().
The mempool alloc function is supposed to allocate memory, not initialize
it.

If the mempool_alloc() call cannot allocate fresh memory, it falls back
to an element held in the pool.  If that element has already been used,
->idx will no longer hold its initialized value.

In short: you need to initialise memory *after* calling
mempool_alloc(), unless you ensure it is reset to the init values before
calling mempool_free().
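
To make the failure mode concrete, here is a minimal userspace sketch,
not kernel code.  The toy_* names are hypothetical stand-ins: toy_rp
plays the role of 'struct resync_pages', and toy_pool is a one-element
simplification of a mempool that tries a fresh allocation first and
falls back to its reserve only when that fails.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for 'struct resync_pages'; only idx matters here. */
struct toy_rp {
	int idx;
};

/* Toy pool with a single reserved element, loosely modelled on a mempool. */
struct toy_pool {
	struct toy_rp *reserve;	/* preallocated element, NULL while handed out */
	int fail_alloc;		/* non-zero simulates memory pressure */
};

/* Pool alloc callback: the only place idx is set, as in the quoted patch. */
static struct toy_rp *toy_alloc_fn(void)
{
	struct toy_rp *rp = malloc(sizeof(*rp));

	if (rp)
		rp->idx = 0;
	return rp;
}

static int toy_pool_init(struct toy_pool *pool)
{
	pool->reserve = toy_alloc_fn();
	pool->fail_alloc = 0;
	return pool->reserve ? 0 : -1;
}

/* Try a fresh allocation; on failure hand out the reserve untouched. */
static struct toy_rp *toy_pool_alloc(struct toy_pool *pool)
{
	struct toy_rp *rp = pool->fail_alloc ? NULL : toy_alloc_fn();

	if (!rp) {
		rp = pool->reserve;
		pool->reserve = NULL;
	}
	return rp;
}

/* Refill the reserve if it is empty, otherwise release the element. */
static void toy_pool_free(struct toy_pool *pool, struct toy_rp *rp)
{
	if (!pool->reserve)
		pool->reserve = rp;
	else
		free(rp);
}

/*
 * Two back-to-back users under simulated memory pressure.  Returns the
 * idx value the second user sees when it starts work.
 */
static int toy_second_user_idx(int reinit_after_alloc)
{
	struct toy_pool pool;
	struct toy_rp *rp;
	int seen;

	if (toy_pool_init(&pool))
		return -1;
	pool.fail_alloc = 1;		/* force the reserve path */

	rp = toy_pool_alloc(&pool);	/* first user gets the reserve */
	assert(rp);
	rp->idx = 5;			/* first user advances the cursor */
	toy_pool_free(&pool, rp);	/* goes back into the reserve as-is */

	rp = toy_pool_alloc(&pool);	/* second user gets the same element */
	assert(rp);
	if (reinit_after_alloc)
		rp->idx = 0;		/* the fix: initialise after allocating */
	seen = rp->idx;

	toy_pool_free(&pool, rp);
	free(pool.reserve);
	return seen;
}
```

Calling toy_second_user_idx(0) returns the stale value left by the
first user, while toy_second_user_idx(1) returns 0.  Resetting idx
just before toy_pool_free() would work equally well in this sketch.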

https://bugzilla.kernel.org/show_bug.cgi?id=196307

NeilBrown
