From: Coly Li <colyli@suse.de>
To: axboe@kernel.dk
Cc: linux-block@vger.kernel.org, linux-bcache@vger.kernel.org,
	Coly Li <colyli@suse.de>, Hannes Reinecke <hare@suse.de>
Subject: [PATCH 16/25] bcache: introduce meta_bucket_pages() related helper routines
Date: Sat, 25 Jul 2020 20:00:30 +0800
Message-ID: <20200725120039.91071-17-colyli@suse.de>
In-Reply-To: <20200725120039.91071-1-colyli@suse.de>

Currently in-memory meta data such as c->uuids or c->disk_buckets is
allocated by alloc_bucket_pages(). The alloc_bucket_pages() macro calls
__get_free_pages() to allocate contiguous pages, with the order given
by ilog2(bucket_pages(c)),
 #define alloc_bucket_pages(gfp, c)                      \
     ((void *) __get_free_pages(__GFP_ZERO|gfp, ilog2(bucket_pages(c))))

The maximum order is defined by MAX_ORDER, whose default value is 11 (it
can be overridden by CONFIG_FORCE_MAX_ZONEORDER). In bcache code the
bucket size is at most 16 bits wide; this is restricted both by the
KEY_SIZE width and by the bucket_size field of struct cache_sb_disk. The
largest power-of-2 value that fits in 16 bits is (1<<15), in units of
512-byte sectors. So the maximum bucket size is (1<<24) bytes, a.k.a.
4096 pages (with 4KB pages).

When the bucket size is set to the maximum permitted value, ilog2(4096)
is 12, which exceeds the default maximum order that __get_free_pages()
can accept. The failed page allocation then fails the cache set
registration procedure and prints a kernel oops message for the
exceeded page order.
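
The arithmetic above can be checked with a small user-space sketch. It
is not kernel code: the demo_* names are hypothetical, demo_ilog2() is
only a stand-in for the kernel's ilog2(), and the 4KB page size and
MAX_ORDER of 11 are hard-coded assumptions.

```c
#include <assert.h>

#define SECTOR_SHIFT     9	/* 512-byte sectors */
#define DEMO_PAGE_SHIFT  12	/* assumption: 4KB pages */
#define DEMO_MAX_ORDER   11	/* assumption: default MAX_ORDER */

/* Integer log2 of a non-zero value; stand-in for the kernel's ilog2(). */
static unsigned int demo_ilog2(unsigned long v)
{
	unsigned int order = 0;

	assert(v != 0);
	while (v >>= 1)
		order++;
	return order;
}

/*
 * Page allocation order needed for a whole bucket of the given size
 * in sectors. The bucket must be at least one page large.
 */
static unsigned int demo_bucket_order(unsigned long bucket_sectors)
{
	unsigned long bytes = bucket_sectors << SECTOR_SHIFT;
	unsigned long pages = bytes >> DEMO_PAGE_SHIFT;

	return demo_ilog2(pages);
}
```

Under these assumptions demo_bucket_order(1UL << 15) evaluates to 12,
which is not below DEMO_MAX_ORDER, while a 512KB bucket (1024 sectors)
only needs order 7.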

This patch introduces the meta_bucket_pages(), meta_bucket_bytes(), and
alloc_meta_bucket_pages() helper routines. meta_bucket_pages() indicates
the maximum number of pages that can be allocated for a meta data
bucket, meta_bucket_bytes() indicates the corresponding maximum number
of bytes, and alloc_meta_bucket_pages() does the page allocation for a
meta bucket. Because meta_bucket_pages() chooses the smaller of the
bucket size and MAX_ORDER_NR_PAGES, it still works when MAX_ORDER is
overridden by CONFIG_FORCE_MAX_ZONEORDER.
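
To illustrate the clamping, here is a user-space model of the two
helpers. It is a sketch only: the demo_* names are hypothetical, the
functions take the bucket size in sectors instead of a struct cache_sb
pointer, and the constants assume 4KB pages, MAX_ORDER of 11, and
MAX_ORDER_NR_PAGES defined as 1 << (MAX_ORDER - 1).

```c
#include <assert.h>
#include <limits.h>

#define DEMO_PAGE_SECTORS        8u	/* assumption: 4KB page / 512B sector */
#define DEMO_PAGE_SHIFT          12	/* assumption: 4KB pages */
#define DEMO_MAX_ORDER           11	/* assumption: default MAX_ORDER */
#define DEMO_MAX_ORDER_NR_PAGES  (1u << (DEMO_MAX_ORDER - 1))

/* Largest power of two that is <= v; v must be non-zero. */
static unsigned int demo_rounddown_pow_of_two(unsigned int v)
{
	unsigned int p = 1;

	assert(v != 0);
	while (p <= v / 2)
		p <<= 1;
	return p;
}

/* Pages usable for a meta data bucket, clamped to what can be allocated. */
static unsigned int demo_meta_bucket_pages(unsigned int bucket_sectors)
{
	unsigned int max_pages, n;

	max_pages = demo_rounddown_pow_of_two(USHRT_MAX) / DEMO_PAGE_SECTORS;
	if (max_pages > DEMO_MAX_ORDER_NR_PAGES)
		max_pages = DEMO_MAX_ORDER_NR_PAGES;

	n = bucket_sectors / DEMO_PAGE_SECTORS;
	if (n > max_pages)
		n = max_pages;

	return n;
}

/* Same limit expressed in bytes. */
static unsigned int demo_meta_bucket_bytes(unsigned int bucket_sectors)
{
	return demo_meta_bucket_pages(bucket_sectors) << DEMO_PAGE_SHIFT;
}
```

With these assumed constants a 512KB bucket (1024 sectors) keeps all of
its 128 pages, while a bucket of (1<<15) sectors is clamped to 1024
pages, i.e. 4MB of usable meta data space.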

The following patches will use these helper routines to decide the
maximum number of pages that can be allocated for the different meta
data buckets. If the bucket size is larger than meta_bucket_bytes(),
bcache registration can still succeed; the space beyond
meta_bucket_bytes() inside the bucket is simply wasted. Compared with
bcache registration failing for a large bucket size, wasting some space
in meta data buckets is acceptable at this moment.

Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 drivers/md/bcache/bcache.h | 20 ++++++++++++++++++++
 drivers/md/bcache/super.c  |  3 +++
 2 files changed, 23 insertions(+)

diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index 80e3c4813fb0..972f1aff0f70 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -762,6 +762,26 @@ struct bbio {
 #define bucket_bytes(c)		((c)->sb.bucket_size << 9)
 #define block_bytes(c)		((c)->sb.block_size << 9)
 
+static inline unsigned int meta_bucket_pages(struct cache_sb *sb)
+{
+	unsigned int n, max_pages;
+
+	max_pages = min_t(unsigned int,
+			  __rounddown_pow_of_two(USHRT_MAX) / PAGE_SECTORS,
+			  MAX_ORDER_NR_PAGES);
+
+	n = sb->bucket_size / PAGE_SECTORS;
+	if (n > max_pages)
+		n = max_pages;
+
+	return n;
+}
+
+static inline unsigned int meta_bucket_bytes(struct cache_sb *sb)
+{
+	return meta_bucket_pages(sb) << PAGE_SHIFT;
+}
+
 #define prios_per_bucket(c)				\
 	((bucket_bytes(c) - sizeof(struct prio_set)) /	\
 	 sizeof(struct bucket_disk))
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 214d50903375..d0bdbbc3ff5c 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1821,6 +1821,9 @@ void bch_cache_set_unregister(struct cache_set *c)
 #define alloc_bucket_pages(gfp, c)			\
 	((void *) __get_free_pages(__GFP_ZERO|__GFP_COMP|gfp, ilog2(bucket_pages(c))))
 
+#define alloc_meta_bucket_pages(gfp, sb)		\
+	((void *) __get_free_pages(__GFP_ZERO|__GFP_COMP|gfp, ilog2(meta_bucket_pages(sb))))
+
 struct cache_set *bch_cache_set_alloc(struct cache_sb *sb)
 {
 	int iter_size;
-- 
2.26.2

