All of lore.kernel.org
 help / color / mirror / Atom feed
From: Coly Li <colyli@suse.de>
To: axboe@kernel.dk
Cc: linux-block@vger.kernel.org, linux-bcache@vger.kernel.org,
	Coly Li <colyli@suse.de>, Hannes Reinecke <hare@suse.de>
Subject: [PATCH 18/25] bcache: handle cache prio_buckets and disk_buckets properly for bucket size > 8MB
Date: Sat, 25 Jul 2020 20:00:32 +0800	[thread overview]
Message-ID: <20200725120039.91071-19-colyli@suse.de> (raw)
In-Reply-To: <20200725120039.91071-1-colyli@suse.de>

Similar to c->uuids, struct cache's prio_buckets and disk_buckets also
have the potential memory allocation failure during cache registration
if the bucket size > 8MB.

ca->prio_buckets can be stored on cache device in multiple buckets, its
in-memory space is allocated by kzalloc() interface but normally
allocated by alloc_pages() because the size > KMALLOC_MAX_CACHE_SIZE.

So allocation of ca->prio_buckets has the MAX_ORDER restriction too. If
the bucket size > 8MB, by default the page allocator will fail because
the page order > 11 (default MAX_ORDER value). ca->prio_buckets should
also use meta_bucket_bytes(), meta_bucket_pages() to decide its memory
size and use alloc_meta_bucket_pages() to allocate pages, to avoid the
allocation failure during cache set registration when bucket size > 8MB.

ca->disk_buckets is a single bucket size memory buffer, it is used to
iterate each bucket of ca->prio_buckets, and compose the bio based on
memory of ca->disk_buckets, then write ca->disk_buckets memory to cache
disk one-by-one for each bucket of ca->prio_buckets. ca->disk_buckets
should have in-memory size exact to the meta_bucket_pages(), this is the
size that ca->prio_buckets will be stored into each on-disk bucket.

This patch fixes the above issues and handle cache's prio_buckets and
disk_buckets properly for bucket size larger than 8MB.

Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/md/bcache/bcache.h |  9 +++++----
 drivers/md/bcache/super.c  | 10 +++++-----
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index 972f1aff0f70..0ebfda284866 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -782,11 +782,12 @@ static inline unsigned int meta_bucket_bytes(struct cache_sb *sb)
 	return meta_bucket_pages(sb) << PAGE_SHIFT;
 }
 
-#define prios_per_bucket(c)				\
-	((bucket_bytes(c) - sizeof(struct prio_set)) /	\
+#define prios_per_bucket(ca)						\
+	((meta_bucket_bytes(&(ca)->sb) - sizeof(struct prio_set)) /	\
 	 sizeof(struct bucket_disk))
-#define prio_buckets(c)					\
-	DIV_ROUND_UP((size_t) (c)->sb.nbuckets, prios_per_bucket(c))
+
+#define prio_buckets(ca)						\
+	DIV_ROUND_UP((size_t) (ca)->sb.nbuckets, prios_per_bucket(ca))
 
 static inline size_t sector_to_bucket(struct cache_set *c, sector_t s)
 {
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index a19f1baa8664..1c4a4c3557f7 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -563,7 +563,7 @@ static void prio_io(struct cache *ca, uint64_t bucket, int op,
 
 	bio->bi_iter.bi_sector	= bucket * ca->sb.bucket_size;
 	bio_set_dev(bio, ca->bdev);
-	bio->bi_iter.bi_size	= bucket_bytes(ca);
+	bio->bi_iter.bi_size	= meta_bucket_bytes(&ca->sb);
 
 	bio->bi_end_io	= prio_endio;
 	bio->bi_private = ca;
@@ -621,7 +621,7 @@ int bch_prio_write(struct cache *ca, bool wait)
 
 		p->next_bucket	= ca->prio_buckets[i + 1];
 		p->magic	= pset_magic(&ca->sb);
-		p->csum		= bch_crc64(&p->magic, bucket_bytes(ca) - 8);
+		p->csum		= bch_crc64(&p->magic, meta_bucket_bytes(&ca->sb) - 8);
 
 		bucket = bch_bucket_alloc(ca, RESERVE_PRIO, wait);
 		BUG_ON(bucket == -1);
@@ -674,7 +674,7 @@ static int prio_read(struct cache *ca, uint64_t bucket)
 			prio_io(ca, bucket, REQ_OP_READ, 0);
 
 			if (p->csum !=
-			    bch_crc64(&p->magic, bucket_bytes(ca) - 8)) {
+			    bch_crc64(&p->magic, meta_bucket_bytes(&ca->sb) - 8)) {
 				pr_warn("bad csum reading priorities\n");
 				goto out;
 			}
@@ -2222,7 +2222,7 @@ void bch_cache_release(struct kobject *kobj)
 		ca->set->cache[ca->sb.nr_this_dev] = NULL;
 	}
 
-	free_pages((unsigned long) ca->disk_buckets, ilog2(bucket_pages(ca)));
+	free_pages((unsigned long) ca->disk_buckets, ilog2(meta_bucket_pages(&ca->sb)));
 	kfree(ca->prio_buckets);
 	vfree(ca->buckets);
 
@@ -2319,7 +2319,7 @@ static int cache_alloc(struct cache *ca)
 		goto err_prio_buckets_alloc;
 	}
 
-	ca->disk_buckets = alloc_bucket_pages(GFP_KERNEL, ca);
+	ca->disk_buckets = alloc_meta_bucket_pages(GFP_KERNEL, &ca->sb);
 	if (!ca->disk_buckets) {
 		err = "ca->disk_buckets alloc failed";
 		goto err_disk_buckets_alloc;
-- 
2.26.2


  parent reply	other threads:[~2020-07-25 12:03 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-25 12:00 [PATCH 00/25] bcache patches for Linux v5.9 Coly Li
2020-07-25 12:00 ` [PATCH 01/25] bcache: Fix typo in Kconfig name Coly Li
2020-07-25 12:00 ` [PATCH 02/25] bcache: allocate meta data pages as compound pages Coly Li
2020-07-25 12:00 ` [PATCH 03/25] bcache: journel: use for_each_clear_bit() to simplify the code Coly Li
2020-07-25 12:00 ` [PATCH 04/25] bcache: writeback: Remove unneeded variable i Coly Li
2020-07-25 12:00 ` [PATCH 05/25] bcache: movinggc: Use struct_size() helper in kzalloc() Coly Li
2020-07-25 12:00 ` [PATCH 06/25] bcache: Use struct_size() " Coly Li
2020-07-25 12:00 ` [PATCH 07/25] bcache: avoid nr_stripes overflow in bcache_device_init() Coly Li
2020-07-27 21:24   ` Sasha Levin
2020-07-25 12:00 ` [PATCH 08/25] bcache: fix overflow in offset_to_stripe() Coly Li
2020-07-27 21:24   ` Sasha Levin
2020-07-25 12:00 ` [PATCH 09/25] bcache: add read_super_common() to read major part of super block Coly Li
2020-07-25 12:00 ` [PATCH 10/25] bcache: add more accurate error information in read_super_common() Coly Li
2020-07-25 12:00 ` [PATCH 11/25] bcache: disassemble the big if() checks in bch_cache_set_alloc() Coly Li
2020-07-25 12:00 ` [PATCH 12/25] bcache: fix super block seq numbers comparision in register_cache_set() Coly Li
2020-07-25 12:00 ` [PATCH 13/25] bcache: increase super block version for cache device and backing device Coly Li
2020-07-25 12:00 ` [PATCH 14/25] bcache: move bucket related code into read_super_common() Coly Li
2020-07-25 12:00 ` [PATCH 15/25] bcache: struct cache_sb is only for in-memory super block now Coly Li
2020-07-25 12:00 ` [PATCH 16/25] bcache: introduce meta_bucket_pages() related helper routines Coly Li
2020-07-25 12:00 ` [PATCH 17/25] bcache: handle c->uuids properly for bucket size > 8MB Coly Li
2020-07-25 12:00 ` Coly Li [this message]
2020-07-25 12:00 ` [PATCH 19/25] bcache: handle cache set verify_ondisk " Coly Li
2020-07-25 12:00 ` [PATCH 20/25] bcache: handle btree node memory allocation " Coly Li
2020-07-25 12:00 ` [PATCH 21/25] bcache: add bucket_size_hi into struct cache_sb_disk for large bucket Coly Li
2020-07-25 12:00 ` [PATCH 22/25] bcache: add sysfs file to display feature sets information of cache set Coly Li
2020-07-25 12:00 ` [PATCH 23/25] bcache: avoid extra memory allocation from mempool c->fill_iter Coly Li
2020-07-25 12:00 ` [PATCH 24/25] bcache: avoid extra memory consumption in struct bbio for large bucket size Coly Li
2020-07-25 12:00 ` [PATCH 25/25] bcache: fix bio_{start,end}_io_acct with proper device Coly Li
2020-07-26 15:07   ` Christoph Hellwig
2020-07-25 13:39 ` [PATCH 00/25] bcache patches for Linux v5.9 Jens Axboe
2020-07-28 12:14   ` Christoph Hellwig
2020-07-28 12:40     ` Coly Li
2020-07-28 12:41       ` Christoph Hellwig
2020-07-28 15:13       ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200725120039.91071-19-colyli@suse.de \
    --to=colyli@suse.de \
    --cc=axboe@kernel.dk \
    --cc=hare@suse.de \
    --cc=linux-bcache@vger.kernel.org \
    --cc=linux-block@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.