From: Naohiro Aota <naohiro.aota@wdc.com>
To: linux-btrfs@vger.kernel.org, dsterba@suse.com
Cc: Naohiro Aota <naohiro.aota@wdc.com>
Subject: [PATCH v2] btrfs: zoned: move superblock logging zone location
Date: Thu,  8 Apr 2021 17:25:28 +0900
Message-ID: <2f58edb74695825632c77349b000d31f16cb3226.1617870145.git.naohiro.aota@wdc.com>

This commit moves the location of the superblock logging zones. The new
locations of the logging zones are now determined based on fixed block
addresses instead of on fixed zone numbers.

The old placement method based on fixed zone numbers causes problems when
one needs to inspect a file system image without access to the drive zone
information. In such a case, the superblock locations cannot be reliably
determined as the zone size is unknown. By locating the superblock logging
zones using fixed addresses, we can scan a dumped file system image without
the zone information since a superblock copy will always be present at or
after the fixed location.
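
As an illustration only (not part of this patch), scanning a dumped image
from one of the fixed locations could look like the sketch below. It assumes
the usual btrfs on-disk layout: 4KB superblocks carrying the "_BHRfS_M" magic
at byte 64. The helper name find_sb() is hypothetical, and the caller is
assumed to pass a buffer that starts at the fixed offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed on-disk constants: 4KB superblock, magic at byte 64. */
#define SB_SIZE         4096u
#define SB_MAGIC_OFFSET 64u

static const char sb_magic[8] = { '_', 'B', 'H', 'R', 'f', 'S', '_', 'M' };

/*
 * Hypothetical helper: walk the buffer in 4KB steps and return the byte
 * offset (relative to buf) of the first block carrying the superblock
 * magic, or -1 if none is found.
 */
static long long find_sb(const uint8_t *buf, size_t len)
{
	size_t off;

	assert(buf != NULL);

	for (off = 0; off + SB_SIZE <= len; off += SB_SIZE) {
		if (memcmp(buf + off + SB_MAGIC_OFFSET, sb_magic,
			   sizeof(sb_magic)) == 0)
			return (long long)off;
	}

	return -1;
}
```

A real tool would also have to pick the newest copy among the ones it finds,
for example by comparing generations, which this sketch leaves out.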

This commit introduces the following three pairs of zones containing fixed
offset locations, regardless of the device zone size.

  - Primary superblock: zone starting at offset 0 and the following zone
  - First copy: zone containing offset 64GB and the following zone
  - Second copy: zone containing offset 256GB and the following zone

If a logging zone is outside of the disk capacity, we do not record the
superblock copy.
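
The mapping from these fixed offsets to zone numbers can be sketched in
userspace C as follows. This is a simplified illustration, not the kernel
code in the diff below: the names sb_log_offset(), sb_log_first_zone() and
sb_log_fits() are hypothetical, and the capacity check assumes that both
zones of a pair must lie entirely within the device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Fixed byte offsets from the list above. */
#define SB_LOG_PRIMARY_OFFSET 0ULL
#define SB_LOG_FIRST_OFFSET   (64ULL << 30)	/* 64GB */
#define SB_LOG_SECOND_OFFSET  (256ULL << 30)	/* 256GB */

/* Each superblock (primary or copy) uses two consecutive zones. */
#define NR_SB_LOG_ZONES 2

/* Hypothetical helper: fixed byte offset for a given mirror (0, 1, 2). */
static uint64_t sb_log_offset(int mirror)
{
	switch (mirror) {
	case 0: return SB_LOG_PRIMARY_OFFSET;
	case 1: return SB_LOG_FIRST_OFFSET;
	case 2: return SB_LOG_SECOND_OFFSET;
	}
	return 0;
}

/* Zone number of the zone containing the fixed offset. */
static uint64_t sb_log_first_zone(unsigned int zone_shift, int mirror)
{
	assert(zone_shift < 64);
	return sb_log_offset(mirror) >> zone_shift;
}

/* Assumption: a copy is recorded only if both of its zones exist. */
static bool sb_log_fits(uint64_t dev_bytes, unsigned int zone_shift,
			int mirror)
{
	uint64_t nr_zones = dev_bytes >> zone_shift;

	return sb_log_first_zone(zone_shift, mirror) + NR_SB_LOG_ZONES <=
	       nr_zones;
}
```

With 256MB zones (zone_shift = 28), the first copy starts at zone 256 and
the second copy at zone 1024; on a 128GB device only the primary and the
first copy would be recorded.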

The first copy is located at a much higher offset than on a regular btrfs
volume (64MB). This increase is to avoid overlapping with the log zones for
the primary superblock. This higher location is arbitrary but allows
supporting devices with very large zone sizes, up to 32GB. Such a large zone
size is unrealistic and very unlikely to ever be seen in real devices.
Currently, SMR disks have a zone size of 256MB, and we are expecting ZNS
drives to be in the 1-4GB range, so this 32GB limit gives us room to
breathe. For now, we only allow zone sizes up to 8GB, below this hard limit
of 32GB.
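
The 32GB figure follows from the primary pair occupying the first two zones:
they must end at or before the 64GB offset where the first copy begins. A
minimal sketch of that arithmetic, with a hypothetical helper name and
assuming power-of-2 zone sizes as asserted in the code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SB_LOG_FIRST_OFFSET (64ULL << 30)	/* 64GB */

/*
 * Hypothetical check: the primary pair covers [0, 2 * zone_size). It stays
 * clear of the zone containing the 64GB offset only while
 * 2 * zone_size <= 64GB, i.e. zone_size <= 32GB.
 */
static bool primary_pair_clear_of_first_copy(uint64_t zone_size)
{
	/* Zone sizes are assumed to be non-zero powers of 2. */
	assert(zone_size != 0 && (zone_size & (zone_size - 1)) == 0);

	return 2 * zone_size <= SB_LOG_FIRST_OFFSET;
}
```

The check holds for 8GB and 32GB zones and fails for the next power of 2,
64GB, where zone 1 would contain both the second primary log zone and the
64GB offset.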

The fixed location addresses are somewhat arbitrary, but are chosen with
the intent of maintaining superblock reliability even for smaller devices.
For this reason, the superblock fixed locations do not exceed 1TB.

The superblock logging zones are reserved for superblock logging and never
used for data or metadata blocks. Note that we only reserve the two zones
per primary/copy actually used for superblock logging. We do not reserve
the ranges of zones possibly containing superblocks with the largest
supported zone size (0-16GB, 64GB-80GB, 256GB-272GB).

The zones containing the fixed location offsets used to store superblocks
in a regular btrfs volume (the non-zoned case) are also reserved to avoid
confusion.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/zoned.c | 43 +++++++++++++++++++++++++++++++++++--------
 1 file changed, 35 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 1f972b75a9ab..a4b195fe08a0 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -21,9 +21,28 @@
 /* Pseudo write pointer value for conventional zone */
 #define WP_CONVENTIONAL ((u64)-2)
 
+/*
+ * Location of the first zone of superblock logging zone pairs.
+ * - Primary superblock: the zone containing offset 0 (zone 0)
+ * - First superblock copy: the zone containing offset 64G
+ * - Second superblock copy: the zone containing offset 256G
+ */
+#define BTRFS_PRIMARY_SB_LOG_ZONE 0ULL
+#define BTRFS_FIRST_SB_LOG_ZONE (64ULL * SZ_1G)
+#define BTRFS_SECOND_SB_LOG_ZONE (256ULL * SZ_1G)
+#define BTRFS_FIRST_SB_LOG_ZONE_SHIFT const_ilog2(BTRFS_FIRST_SB_LOG_ZONE)
+#define BTRFS_SECOND_SB_LOG_ZONE_SHIFT const_ilog2(BTRFS_SECOND_SB_LOG_ZONE)
+
 /* Number of superblock log zones */
 #define BTRFS_NR_SB_LOG_ZONES 2
 
+/*
+ * Maximum size of zones. Currently, SMR disks have a zone size of 256MB,
+ * and we are expecting ZNS drives to be in the 1-4GB range. We do not
+ * expect the zone size to become larger than 8GB in the near future.
+ */
+#define BTRFS_MAX_ZONE_SIZE SZ_8G
+
 static int copy_zone_info_cb(struct blk_zone *zone, unsigned int idx, void *data)
 {
 	struct blk_zone *zones = data;
@@ -111,11 +130,8 @@ static int sb_write_pointer(struct block_device *bdev, struct blk_zone *zones,
 }
 
 /*
- * The following zones are reserved as the circular buffer on ZONED btrfs.
- *  - The primary superblock: zones 0 and 1
- *  - The first copy: zones 16 and 17
- *  - The second copy: zones 1024 or zone at 256GB which is minimum, and
- *                     the following one
+ * Get the zone number of the first zone of a pair of contiguous zones used
+ * for superblock logging.
  */
 static inline u32 sb_zone_number(int shift, int mirror)
 {
@@ -123,8 +139,8 @@ static inline u32 sb_zone_number(int shift, int mirror)
 
 	switch (mirror) {
 	case 0: return 0;
-	case 1: return 16;
-	case 2: return min_t(u64, btrfs_sb_offset(mirror) >> shift, 1024);
+	case 1: return 1 << (BTRFS_FIRST_SB_LOG_ZONE_SHIFT - shift);
+	case 2: return 1 << (BTRFS_SECOND_SB_LOG_ZONE_SHIFT - shift);
 	}
 
 	return 0;
@@ -300,10 +316,21 @@ int btrfs_get_dev_zone_info(struct btrfs_device *device)
 		zone_sectors = bdev_zone_sectors(bdev);
 	}
 
-	nr_sectors = bdev_nr_sectors(bdev);
 	/* Check if it's power of 2 (see is_power_of_2) */
 	ASSERT(zone_sectors != 0 && (zone_sectors & (zone_sectors - 1)) == 0);
 	zone_info->zone_size = zone_sectors << SECTOR_SHIFT;
+
+	/* We reject devices with a zone size larger than 8GB. */
+	if (zone_info->zone_size > BTRFS_MAX_ZONE_SIZE) {
+		btrfs_err_in_rcu(fs_info,
+				 "zoned: %s: zone size %llu is too large",
+				 rcu_str_deref(device->name),
+				 zone_info->zone_size);
+		ret = -EINVAL;
+		goto out;
+	}
+
+	nr_sectors = bdev_nr_sectors(bdev);
 	zone_info->zone_size_shift = ilog2(zone_info->zone_size);
 	zone_info->max_zone_append_size =
 		(u64)queue_max_zone_append_sectors(queue) << SECTOR_SHIFT;
-- 
2.31.1

