Linux-f2fs-devel Archive on lore.kernel.org
 help / color / Atom feed
* [f2fs-dev] [PATCH v2 0/4] fsck: Check write pointers of zoned block devices
@ 2019-08-21  4:47 Shin'ichiro Kawasaki
  2019-08-21  4:47 ` [f2fs-dev] [PATCH v2 1/4] libf2fs_zoned: Introduce f2fs_report_zones() helper function Shin'ichiro Kawasaki
                   ` (4 more replies)
  0 siblings, 5 replies; 18+ messages in thread
From: Shin'ichiro Kawasaki @ 2019-08-21  4:47 UTC (permalink / raw)
  To: Jaegeuk Kim, Chao Yu, linux-f2fs-devel; +Cc: Damien Le Moal

On sudden f2fs shutdown, zoned block device status and f2fs meta data can be
inconsistent. When f2fs shutdown happens during write operations, write pointers
on the device go forward but the f2fs meta data does not reflect write pointer
progress. This inconsistency will eventually cause "Unaligned write command"
error when restarting write operation after the next mount. This error can be
observed with xfstests test case generic/388, which enforces sudden shutdown
during write operation and checks the file system recovery. Once the error
happens because of the inconsistency, the file system requires fix. However,
fsck.f2fs does not have a feature to check and fix it.

This patch series adds a new feature to fsck.f2fs to check and fix the
inconsistency. First and second patches add two functions which helps fsck to
call report zone and reset zone commands to zoned block devices. Third patch
checks write pointers of zones that current segments recorded in meta data point
to. This covers the failure symptom observed with xfstests. The last patch
checks write pointers of zones that current segments do not point to, which
covers a potential failure scenario.

This patch series depends on other patches for zoned block devices, then it
conflicts with the master branch in f2fs-tools.git as of Aug/19/2019. It can be
applied without conflict to dev and dev-test branch tips.

Changes from v1:
* Fixed build failure on dev branch

Shin'ichiro Kawasaki (4):
  libf2fs_zoned: Introduce f2fs_report_zones() helper function
  libf2fs_zoned: Introduce f2fs_reset_zone() function
  fsck.f2fs: Check write pointer consistency with current segments
  fsck.f2fs: Check write pointer consistency with valid blocks count

 fsck/fsck.c         | 161 ++++++++++++++++++++++++++++++++++++++++++++
 fsck/fsck.h         |   3 +
 fsck/main.c         |   2 +
 include/f2fs_fs.h   |   3 +
 lib/libf2fs_zoned.c |  81 ++++++++++++++++++++++
 5 files changed, 250 insertions(+)

-- 
2.21.0



_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [f2fs-dev] [PATCH v2 1/4] libf2fs_zoned: Introduce f2fs_report_zones() helper function
  2019-08-21  4:47 [f2fs-dev] [PATCH v2 0/4] fsck: Check write pointers of zoned block devices Shin'ichiro Kawasaki
@ 2019-08-21  4:47 ` Shin'ichiro Kawasaki
  2019-08-27  1:34   ` Chao Yu
  2019-08-21  4:48 ` [f2fs-dev] [PATCH v2 2/4] libf2fs_zoned: Introduce f2fs_reset_zone() function Shin'ichiro Kawasaki
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 18+ messages in thread
From: Shin'ichiro Kawasaki @ 2019-08-21  4:47 UTC (permalink / raw)
  To: Jaegeuk Kim, Chao Yu, linux-f2fs-devel; +Cc: Damien Le Moal

To prepare for write pointer consistency check by fsck, add
f2fs_report_zones() helper function which calls REPORT ZONE command to
get write pointer status. The function is added to lib/libf2fs_zoned
which gathers zoned block device related functions.

To check write pointer consistency with f2fs meta data, fsck needs to
refer both of reported zone information and f2fs super block structure
"f2fs_sb_info". However, libf2fs_zoned does not import f2fs_sb_info. To
keep f2fs_sb_info structure out of libf2fs_zoned, provide a callback
function in fsck to f2fs_report_zones() and call it for each zone.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
---
 include/f2fs_fs.h   |  2 ++
 lib/libf2fs_zoned.c | 57 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 59 insertions(+)

diff --git a/include/f2fs_fs.h b/include/f2fs_fs.h
index 0d9a036..abadd1b 100644
--- a/include/f2fs_fs.h
+++ b/include/f2fs_fs.h
@@ -1279,6 +1279,8 @@ blk_zone_cond_str(struct blk_zone *blkz)
 
 extern int f2fs_get_zoned_model(int);
 extern int f2fs_get_zone_blocks(int);
+typedef int (report_zones_cb_t)(int i, struct blk_zone *blkz, void *opaque);
+extern int f2fs_report_zones(int, report_zones_cb_t *, void *);
 extern int f2fs_check_zones(int);
 extern int f2fs_reset_zones(int);
 
diff --git a/lib/libf2fs_zoned.c b/lib/libf2fs_zoned.c
index af00b44..fc4974f 100644
--- a/lib/libf2fs_zoned.c
+++ b/lib/libf2fs_zoned.c
@@ -193,6 +193,57 @@ int f2fs_get_zone_blocks(int i)
 
 #define F2FS_REPORT_ZONES_BUFSZ	524288
 
+int f2fs_report_zones(int j, report_zones_cb_t *report_zones_cb, void *opaque)
+{
+	struct device_info *dev = c.devices + j;
+	struct blk_zone_report *rep;
+	struct blk_zone *blkz;
+	unsigned int i, n = 0;
+	u_int64_t total_sectors = (dev->total_sectors * c.sector_size) >> 9;
+	u_int64_t sector = 0;
+	int ret = -1;
+
+	rep = malloc(F2FS_REPORT_ZONES_BUFSZ);
+	if (!rep) {
+		ERR_MSG("No memory for report zones\n");
+		return -ENOMEM;
+	}
+
+	while (sector < total_sectors) {
+
+		/* Get zone info */
+		rep->sector = sector;
+		rep->nr_zones = (F2FS_REPORT_ZONES_BUFSZ - sizeof(struct blk_zone_report))
+			/ sizeof(struct blk_zone);
+
+		ret = ioctl(dev->fd, BLKREPORTZONE, rep);
+		if (ret != 0) {
+			ret = -errno;
+			ERR_MSG("ioctl BLKREPORTZONE failed\n");
+			goto out;
+		}
+
+		if (!rep->nr_zones) {
+			ret = -EIO;
+			ERR_MSG("Unexpected ioctl BLKREPORTZONE result\n");
+			goto out;
+		}
+
+		blkz = (struct blk_zone *)(rep + 1);
+		for (i = 0; i < rep->nr_zones && sector < total_sectors; i++) {
+			ret = report_zones_cb(n, blkz, opaque);
+			if (ret)
+				goto out;
+			sector = blk_zone_sector(blkz) + blk_zone_length(blkz);
+			n++;
+			blkz++;
+		}
+	}
+out:
+	free(rep);
+	return ret;
+}
+
 int f2fs_check_zones(int j)
 {
 	struct device_info *dev = c.devices + j;
@@ -372,6 +423,12 @@ out:
 
 #else
 
+int f2fs_report_zones(int j, report_zones_cb_t *report_zones_cb, void *opaque)
+{
+	ERR_MSG("%d: Zoned block devices are not supported\n", i);
+	return -1;
+}
+
 int f2fs_get_zoned_model(int i)
 {
 	struct device_info *dev = c.devices + i;
-- 
2.21.0



_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [f2fs-dev] [PATCH v2 2/4] libf2fs_zoned: Introduce f2fs_reset_zone() function
  2019-08-21  4:47 [f2fs-dev] [PATCH v2 0/4] fsck: Check write pointers of zoned block devices Shin'ichiro Kawasaki
  2019-08-21  4:47 ` [f2fs-dev] [PATCH v2 1/4] libf2fs_zoned: Introduce f2fs_report_zones() helper function Shin'ichiro Kawasaki
@ 2019-08-21  4:48 ` Shin'ichiro Kawasaki
  2019-08-27  1:36   ` Chao Yu
  2019-08-21  4:48 ` [f2fs-dev] [PATCH v2 3/4] fsck.f2fs: Check write pointer consistency with current segments Shin'ichiro Kawasaki
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 18+ messages in thread
From: Shin'ichiro Kawasaki @ 2019-08-21  4:48 UTC (permalink / raw)
  To: Jaegeuk Kim, Chao Yu, linux-f2fs-devel; +Cc: Damien Le Moal

To allow fsck to reset a zone with inconsistent write pointer, introduce
a helper function f2fs_reset_zone().

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
---
 include/f2fs_fs.h   |  1 +
 lib/libf2fs_zoned.c | 24 ++++++++++++++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/include/f2fs_fs.h b/include/f2fs_fs.h
index abadd1b..34679d8 100644
--- a/include/f2fs_fs.h
+++ b/include/f2fs_fs.h
@@ -1282,6 +1282,7 @@ extern int f2fs_get_zone_blocks(int);
 typedef int (report_zones_cb_t)(int i, struct blk_zone *blkz, void *opaque);
 extern int f2fs_report_zones(int, report_zones_cb_t *, void *);
 extern int f2fs_check_zones(int);
+int f2fs_reset_zone(struct device_info *dev, struct blk_zone *blkz);
 extern int f2fs_reset_zones(int);
 
 #define SIZE_ALIGN(val, size)	((val) + (size) - 1) / (size)
diff --git a/lib/libf2fs_zoned.c b/lib/libf2fs_zoned.c
index fc4974f..f56fa62 100644
--- a/lib/libf2fs_zoned.c
+++ b/lib/libf2fs_zoned.c
@@ -359,6 +359,24 @@ out:
 	return ret;
 }
 
+int f2fs_reset_zone(struct device_info *dev, struct blk_zone *blkz)
+{
+	struct blk_zone_range range;
+	int ret;
+
+	if (!blk_zone_seq(blkz) || blk_zone_empty(blkz))
+		return 0;
+
+	/* Non empty sequential zone: reset */
+	range.sector = blk_zone_sector(blkz);
+	range.nr_sectors = blk_zone_length(blkz);
+	ret = ioctl(dev->fd, BLKRESETZONE, &range);
+	if (ret != 0)
+		ERR_MSG("ioctl BLKRESETZONE failed\n");
+
+	return ret;
+}
+
 int f2fs_reset_zones(int j)
 {
 	struct device_info *dev = c.devices + j;
@@ -456,6 +474,12 @@ int f2fs_check_zones(int i)
 	return -1;
 }
 
+int f2fs_reset_zone(struct device_info *dev, struct blk_zone *blkz)
+{
+	ERR_MSG("%d: Zoned block devices are not supported\n", i);
+	return -1;
+}
+
 int f2fs_reset_zones(int i)
 {
 	ERR_MSG("%d: Zoned block devices are not supported\n", i);
-- 
2.21.0



_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [f2fs-dev] [PATCH v2 3/4] fsck.f2fs: Check write pointer consistency with current segments
  2019-08-21  4:47 [f2fs-dev] [PATCH v2 0/4] fsck: Check write pointers of zoned block devices Shin'ichiro Kawasaki
  2019-08-21  4:47 ` [f2fs-dev] [PATCH v2 1/4] libf2fs_zoned: Introduce f2fs_report_zones() helper function Shin'ichiro Kawasaki
  2019-08-21  4:48 ` [f2fs-dev] [PATCH v2 2/4] libf2fs_zoned: Introduce f2fs_reset_zone() function Shin'ichiro Kawasaki
@ 2019-08-21  4:48 ` Shin'ichiro Kawasaki
  2019-08-27  2:01   ` Chao Yu
  2019-08-21  4:48 ` [f2fs-dev] [PATCH v2 4/4] fsck.f2fs: Check write pointer consistency with valid blocks count Shin'ichiro Kawasaki
  2019-08-23 13:09 ` [f2fs-dev] [PATCH v2 0/4] fsck: Check write pointers of zoned block devices Chao Yu
  4 siblings, 1 reply; 18+ messages in thread
From: Shin'ichiro Kawasaki @ 2019-08-21  4:48 UTC (permalink / raw)
  To: Jaegeuk Kim, Chao Yu, linux-f2fs-devel; +Cc: Damien Le Moal

On sudden f2fs shutdown, zoned block device status and f2fs current
segment positions in meta data can be inconsistent. When f2fs shutdown
happens before write operations completes, write pointers of zoned block
devices can go further but f2fs meta data keeps current segments at
positions before the write operations. After remounting the f2fs, the
inconsistency causes write operations not at write pointers and
"Unaligned write command" error is reported. This error was observed when
xfstests test case generic/388 was run with f2fs on a zoned block device.

To avoid the error, have f2fs.fsck check consistency between each current
segment's position and the write pointer of the zone the current segment
points to. If the write pointer goes advance from the current segment,
fix the current segment position setting at same as the write pointer
position. In case the write pointer is behind the current segment, write
zero data at the write pointer position to make write pointer position at
same as the current segment.

When inconsistencies are found, turn on c.bug_on flag in fsck_verify() to
ask users to fix them or not. When inconsistencies get fixed, turn on
'force' flag in fsck_verify() to enforce fixes in following checks. This
position fix is done at the beginning of do_fsck() function so that other
checks reflect the current segment modification.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
---
 fsck/fsck.c | 133 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 fsck/fsck.h |   3 ++
 fsck/main.c |   2 +
 3 files changed, 138 insertions(+)

diff --git a/fsck/fsck.c b/fsck/fsck.c
index 8953ca1..21a06ac 100644
--- a/fsck/fsck.c
+++ b/fsck/fsck.c
@@ -2574,6 +2574,125 @@ out:
 	return cnt;
 }
 
+struct write_pointer_check_data {
+	struct f2fs_sb_info *sbi;
+	struct device_info *dev;
+};
+
+#define SECTOR_SHIFT 9
+
+static int fsck_chk_write_pointer(int i, struct blk_zone *blkz, void *opaque)
+{
+	struct write_pointer_check_data *wpd = opaque;
+	struct f2fs_sb_info *sbi = wpd->sbi;
+	struct device_info *dev = wpd->dev;
+	struct f2fs_fsck *fsck = F2FS_FSCK(sbi);
+	block_t zone_block, wp_block, wp_blkoff, cs_block, b;
+	unsigned int zone_segno, wp_segno;
+	struct seg_entry *se;
+	struct curseg_info *cs;
+	int cs_index, ret;
+	int log_sectors_per_block = sbi->log_blocksize - SECTOR_SHIFT;
+	unsigned int segs_per_zone = sbi->segs_per_sec * sbi->secs_per_zone;
+	void *zero_blk;
+
+	if (blk_zone_conv(blkz))
+		return 0;
+
+	zone_block = dev->start_blkaddr
+		+ (blk_zone_sector(blkz) >> log_sectors_per_block);
+	zone_segno = GET_SEGNO(sbi, zone_block);
+	wp_block = dev->start_blkaddr
+		+ (blk_zone_wp_sector(blkz) >> log_sectors_per_block);
+	wp_segno = GET_SEGNO(sbi, wp_block);
+	wp_blkoff = wp_block - START_BLOCK(sbi, wp_segno);
+
+	/* find the curseg which points to the zone */
+	for (cs_index = 0; cs_index < NO_CHECK_TYPE; cs_index++) {
+		cs = &SM_I(sbi)->curseg_array[cs_index];
+		if (zone_segno <= cs->segno &&
+		    cs->segno < zone_segno + segs_per_zone)
+			break;
+	}
+
+	if (cs_index >= NR_CURSEG_TYPE)
+		return 0;
+
+	/* check write pointer consistency with the curseg in the zone */
+	cs_block = START_BLOCK(sbi, cs->segno) + cs->next_blkoff;
+	if (wp_block == cs_block)
+		return 0;
+
+	if (!c.fix_on) {
+		MSG(0, "Inconsistent write pointer: "
+		    "curseg %d[0x%x,0x%x] wp[0x%x,0x%x]\n",
+		    cs_index, cs->segno, cs->next_blkoff, wp_segno, wp_blkoff);
+		fsck->chk.wp_inconsistent_zones++;
+		return 0;
+	}
+
+	/*
+	 * If the curseg is in advance from the write pointer, write zero to
+	 * move the write pointer forward to the same position as the curseg.
+	 */
+	if (wp_block < cs_block) {
+		ret = 0;
+		zero_blk = calloc(BLOCK_SZ, 1);
+		if (!zero_blk)
+			return -EINVAL;
+
+		FIX_MSG("Advance write pointer to match with curseg %d: "
+			"[0x%x,0x%x]->[0x%x,0x%x]",
+			cs_index, wp_segno, wp_blkoff,
+			cs->segno, cs->next_blkoff);
+		for (b = wp_block; b < cs_block && !ret; b++)
+			ret = dev_write_block(zero_blk, b);
+
+		fsck->chk.wp_fixed_zones++;
+		free(zero_blk);
+		return ret;
+	}
+
+	/*
+	 * If the write pointer is in advance from the curseg, modify the
+	 * curseg position to be same as the write pointer.
+	 */
+	FIX_MSG("Advance curseg %d: [0x%x,0x%x]->[0x%x,0x%x]",
+		cs_index, cs->segno, cs->next_blkoff, wp_segno, wp_blkoff);
+	se = get_seg_entry(sbi, wp_segno);
+	se->type = cs_index;
+	cs->segno = wp_segno;
+	cs->next_blkoff = wp_blkoff;
+	fsck->chk.wp_fixed_zones++;
+
+	return 0;
+}
+
+void fsck_chk_write_pointers(struct f2fs_sb_info *sbi)
+{
+	unsigned int i;
+	struct f2fs_fsck *fsck = F2FS_FSCK(sbi);
+	struct write_pointer_check_data wpd = {	sbi, NULL };
+
+	if (c.zoned_model != F2FS_ZONED_HM)
+		return;
+
+	for (i = 0; i < MAX_DEVICES; i++) {
+		if (!c.devices[i].path)
+			break;
+
+		wpd.dev = c.devices + i;
+		if (f2fs_report_zones(i, fsck_chk_write_pointer, &wpd)) {
+			printf("[FSCK] Write pointer check failed: %s\n",
+			       c.devices[i].path);
+			return;
+		}
+	}
+
+	if (fsck->chk.wp_fixed_zones && c.fix_on)
+		write_curseg_info(sbi);
+}
+
 int fsck_chk_curseg_info(struct f2fs_sb_info *sbi)
 {
 	struct curseg_info *curseg;
@@ -2624,6 +2743,20 @@ int fsck_verify(struct f2fs_sb_info *sbi)
 
 	printf("\n");
 
+	if (c.zoned_model == F2FS_ZONED_HM) {
+		printf("[FSCK] Write pointers consistency                    ");
+		if (fsck->chk.wp_inconsistent_zones == 0x0) {
+			printf(" [Ok..]\n");
+		} else {
+			printf(" [Fail] [0x%x]\n",
+			       fsck->chk.wp_inconsistent_zones);
+			c.bug_on = 1;
+		}
+
+		if (fsck->chk.wp_fixed_zones && c.fix_on)
+			force = 1;
+	}
+
 	if (c.feature & cpu_to_le32(F2FS_FEATURE_LOST_FOUND)) {
 		for (i = 0; i < fsck->nr_nat_entries; i++)
 			if (f2fs_test_bit(i, fsck->nat_area_bitmap) != 0)
diff --git a/fsck/fsck.h b/fsck/fsck.h
index d38e8de..aa3dbe7 100644
--- a/fsck/fsck.h
+++ b/fsck/fsck.h
@@ -80,6 +80,8 @@ struct f2fs_fsck {
 		u32 multi_hard_link_files;
 		u64 sit_valid_blocks;
 		u32 sit_free_segs;
+		u32 wp_fixed_zones;
+		u32 wp_inconsistent_zones;
 	} chk;
 
 	struct hard_link_node *hard_link_list_head;
@@ -156,6 +158,7 @@ int fsck_chk_inline_dentries(struct f2fs_sb_info *, struct f2fs_node *,
 		struct child_info *);
 void fsck_chk_checkpoint(struct f2fs_sb_info *sbi);
 int fsck_chk_meta(struct f2fs_sb_info *sbi);
+void fsck_chk_write_pointers(struct f2fs_sb_info *);
 int fsck_chk_curseg_info(struct f2fs_sb_info *);
 void pretty_print_filename(const u8 *raw_name, u32 len,
 			   char out[F2FS_PRINT_NAMELEN], int enc_name);
diff --git a/fsck/main.c b/fsck/main.c
index 9aca024..4b4a789 100644
--- a/fsck/main.c
+++ b/fsck/main.c
@@ -584,6 +584,8 @@ static void do_fsck(struct f2fs_sb_info *sbi)
 
 	print_cp_state(flag);
 
+	fsck_chk_write_pointers(sbi);
+
 	fsck_chk_curseg_info(sbi);
 
 	if (!c.fix_on && !c.bug_on) {
-- 
2.21.0



_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [f2fs-dev] [PATCH v2 4/4] fsck.f2fs: Check write pointer consistency with valid blocks count
  2019-08-21  4:47 [f2fs-dev] [PATCH v2 0/4] fsck: Check write pointers of zoned block devices Shin'ichiro Kawasaki
                   ` (2 preceding siblings ...)
  2019-08-21  4:48 ` [f2fs-dev] [PATCH v2 3/4] fsck.f2fs: Check write pointer consistency with current segments Shin'ichiro Kawasaki
@ 2019-08-21  4:48 ` Shin'ichiro Kawasaki
  2019-08-27  2:25   ` Chao Yu
  2019-08-23 13:09 ` [f2fs-dev] [PATCH v2 0/4] fsck: Check write pointers of zoned block devices Chao Yu
  4 siblings, 1 reply; 18+ messages in thread
From: Shin'ichiro Kawasaki @ 2019-08-21  4:48 UTC (permalink / raw)
  To: Jaegeuk Kim, Chao Yu, linux-f2fs-devel; +Cc: Damien Le Moal

When sudden f2fs shutdown happens on zoned block devices, write
pointers can be inconsistent with valid blocks counts in meta data.
The failure scenario is as follows:

- Just before a sudden shutdown, a new segment in a new zone is selected
  for a current segment. Write commands were executed to the segment.
  and the zone has a write pointer not at zone start.
- Before the write commands complete, shutdown happens. Meta data is
  not updated and still keeps zero valid blocks count for the zone.
- After next mount of the file system, the zone is selected for the next
  write target because it has zero valid blocks count. However, it has
  the write pointer not at zone start. Then "Unaligned write command"
  error happens.

To avoid this potential error path, reset write pointers if the zone
does not have a current segment, the write pointer is not at the zone
start and the zone has no valid blocks.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
---
 fsck/fsck.c | 30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/fsck/fsck.c b/fsck/fsck.c
index 21a06ac..cc9bbc0 100644
--- a/fsck/fsck.c
+++ b/fsck/fsck.c
@@ -2595,6 +2595,7 @@ static int fsck_chk_write_pointer(int i, struct blk_zone *blkz, void *opaque)
 	int log_sectors_per_block = sbi->log_blocksize - SECTOR_SHIFT;
 	unsigned int segs_per_zone = sbi->segs_per_sec * sbi->secs_per_zone;
 	void *zero_blk;
+	block_t	zone_valid_blocks = 0;
 
 	if (blk_zone_conv(blkz))
 		return 0;
@@ -2615,8 +2616,35 @@ static int fsck_chk_write_pointer(int i, struct blk_zone *blkz, void *opaque)
 			break;
 	}
 
-	if (cs_index >= NR_CURSEG_TYPE)
+	if (cs_index >= NR_CURSEG_TYPE) {
+		for (b = zone_block; b < zone_block + c.zone_blocks &&
+			     IS_VALID_BLK_ADDR(sbi, b); b += c.blks_per_seg) {
+			se = get_seg_entry(sbi, GET_SEGNO(sbi, b));
+			zone_valid_blocks += se->valid_blocks;
+		}
+		if (wp_block == zone_block || zone_valid_blocks)
+			return 0;
+
+		/*
+		 * The write pointer is not at zone start but there is no valid
+		 * block in the zone. Segments in the zone can be selected for
+		 * next write. Need to reset the write pointer to avoid
+		 * unaligned write command error.
+		 */
+		if (c.fix_on) {
+			FIX_MSG("Reset write pointer at segment 0x%x",
+				zone_segno);
+			ret = f2fs_reset_zone(dev, blkz);
+			if (ret)
+				return ret;
+			fsck->chk.wp_fixed_zones++;
+		} else {
+			MSG(0, "Inconsistent write pointer at segment 0x%x\n",
+			    zone_segno);
+			fsck->chk.wp_inconsistent_zones++;
+		}
 		return 0;
+	}
 
 	/* check write pointer consistency with the curseg in the zone */
 	cs_block = START_BLOCK(sbi, cs->segno) + cs->next_blkoff;
-- 
2.21.0



_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [f2fs-dev] [PATCH v2 0/4] fsck: Check write pointers of zoned block devices
  2019-08-21  4:47 [f2fs-dev] [PATCH v2 0/4] fsck: Check write pointers of zoned block devices Shin'ichiro Kawasaki
                   ` (3 preceding siblings ...)
  2019-08-21  4:48 ` [f2fs-dev] [PATCH v2 4/4] fsck.f2fs: Check write pointer consistency with valid blocks count Shin'ichiro Kawasaki
@ 2019-08-23 13:09 ` Chao Yu
  2019-08-26  0:10   ` Damien Le Moal
  4 siblings, 1 reply; 18+ messages in thread
From: Chao Yu @ 2019-08-23 13:09 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: Jaegeuk Kim, linux-f2fs-devel

Hi Damien,

Do you have time to take a look at this patch set, and add a reviewed-by if it
is okay to you. :)

Thanks,

On 2019-8-21 12:47, Shin'ichiro Kawasaki wrote:
> On sudden f2fs shutdown, zoned block device status and f2fs meta data can be
> inconsistent. When f2fs shutdown happens during write operations, write pointers
> on the device go forward but the f2fs meta data does not reflect write pointer
> progress. This inconsistency will eventually cause "Unaligned write command"
> error when restarting write operation after the next mount. This error can be
> observed with xfstests test case generic/388, which enforces sudden shutdown
> during write operation and checks the file system recovery. Once the error
> happens because of the inconsistency, the file system requires fix. However,
> fsck.f2fs does not have a feature to check and fix it.
> 
> This patch series adds a new feature to fsck.f2fs to check and fix the
> inconsistency. First and second patches add two functions which helps fsck to
> call report zone and reset zone commands to zoned block devices. Third patch
> checks write pointers of zones that current segments recorded in meta data point
> to. This covers the failure symptom observed with xfstests. The last patch
> checks write pointers of zones that current segments do not point to, which
> covers a potential failure scenario.
> 
> This patch series depends on other patches for zoned block devices, then it
> conflicts with the master branch in f2fs-tools.git as of Aug/19/2019. It can be
> applied without conflict to dev and dev-test branch tips.
> 
> Changes from v1:
> * Fixed build failure on dev branch
> 
> Shin'ichiro Kawasaki (4):
>   libf2fs_zoned: Introduce f2fs_report_zones() helper function
>   libf2fs_zoned: Introduce f2fs_reset_zone() function
>   fsck.f2fs: Check write pointer consistency with current segments
>   fsck.f2fs: Check write pointer consistency with valid blocks count
> 
>  fsck/fsck.c         | 161 ++++++++++++++++++++++++++++++++++++++++++++
>  fsck/fsck.h         |   3 +
>  fsck/main.c         |   2 +
>  include/f2fs_fs.h   |   3 +
>  lib/libf2fs_zoned.c |  81 ++++++++++++++++++++++
>  5 files changed, 250 insertions(+)
> 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [f2fs-dev] [PATCH v2 0/4] fsck: Check write pointers of zoned block devices
  2019-08-23 13:09 ` [f2fs-dev] [PATCH v2 0/4] fsck: Check write pointers of zoned block devices Chao Yu
@ 2019-08-26  0:10   ` Damien Le Moal
  2019-08-26  7:37     ` Chao Yu
  0 siblings, 1 reply; 18+ messages in thread
From: Damien Le Moal @ 2019-08-26  0:10 UTC (permalink / raw)
  To: Chao Yu; +Cc: Jaegeuk Kim, linux-f2fs-devel

Chao,

On 2019/08/23 22:09, Chao Yu wrote:
> Hi Damien,
> 
> Do you have time to take a look at this patch set, and add a reviewed-by if it
> is okay to you. :)

My apologies for not chiming in earlier. Shinichiro works in my group and I
asked him to work on this. We have gone through several iterations internally
and this is the latest version.

So:

Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>

> 
> Thanks,
> 
> On 2019-8-21 12:47, Shin'ichiro Kawasaki wrote:
>> On sudden f2fs shutdown, zoned block device status and f2fs meta data can be
>> inconsistent. When f2fs shutdown happens during write operations, write pointers
>> on the device go forward but the f2fs meta data does not reflect write pointer
>> progress. This inconsistency will eventually cause "Unaligned write command"
>> error when restarting write operation after the next mount. This error can be
>> observed with xfstests test case generic/388, which enforces sudden shutdown
>> during write operation and checks the file system recovery. Once the error
>> happens because of the inconsistency, the file system requires fix. However,
>> fsck.f2fs does not have a feature to check and fix it.
>>
>> This patch series adds a new feature to fsck.f2fs to check and fix the
>> inconsistency. First and second patches add two functions which helps fsck to
>> call report zone and reset zone commands to zoned block devices. Third patch
>> checks write pointers of zones that current segments recorded in meta data point
>> to. This covers the failure symptom observed with xfstests. The last patch
>> checks write pointers of zones that current segments do not point to, which
>> covers a potential failure scenario.
>>
>> This patch series depends on other patches for zoned block devices, then it
>> conflicts with the master branch in f2fs-tools.git as of Aug/19/2019. It can be
>> applied without conflict to dev and dev-test branch tips.
>>
>> Changes from v1:
>> * Fixed build failure on dev branch
>>
>> Shin'ichiro Kawasaki (4):
>>   libf2fs_zoned: Introduce f2fs_report_zones() helper function
>>   libf2fs_zoned: Introduce f2fs_reset_zone() function
>>   fsck.f2fs: Check write pointer consistency with current segments
>>   fsck.f2fs: Check write pointer consistency with valid blocks count
>>
>>  fsck/fsck.c         | 161 ++++++++++++++++++++++++++++++++++++++++++++
>>  fsck/fsck.h         |   3 +
>>  fsck/main.c         |   2 +
>>  include/f2fs_fs.h   |   3 +
>>  lib/libf2fs_zoned.c |  81 ++++++++++++++++++++++
>>  5 files changed, 250 insertions(+)
>>
> 


-- 
Damien Le Moal
Western Digital Research


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [f2fs-dev] [PATCH v2 0/4] fsck: Check write pointers of zoned block devices
  2019-08-26  0:10   ` Damien Le Moal
@ 2019-08-26  7:37     ` Chao Yu
  0 siblings, 0 replies; 18+ messages in thread
From: Chao Yu @ 2019-08-26  7:37 UTC (permalink / raw)
  To: Damien Le Moal, Chao Yu; +Cc: Jaegeuk Kim, linux-f2fs-devel

On 2019/8/26 8:10, Damien Le Moal wrote:
> Chao,
> 
> On 2019/08/23 22:09, Chao Yu wrote:
>> Hi Damien,
>>
>> Do you have time to take a look at this patch set, and add a reviewed-by if it
>> is okay to you. :)
> 
> My apologies for not chiming in earlier. Shinichiro works in my group and I

Hi Damien,

I can guess you're in the same team. ;)

> asked him to work on this. We have gone through several iterations internally
> and this is the latest version.

I do believe zoned block codes change needs your review, due to my less
experience on such device. :P

> 
> So:
> 
> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>

Thanks a lot for the review.

Thanks,

> 
>>
>> Thanks,
>>
>> On 2019-8-21 12:47, Shin'ichiro Kawasaki wrote:
>>> On sudden f2fs shutdown, zoned block device status and f2fs meta data can be
>>> inconsistent. When f2fs shutdown happens during write operations, write pointers
>>> on the device go forward but the f2fs meta data does not reflect write pointer
>>> progress. This inconsistency will eventually cause "Unaligned write command"
>>> error when restarting write operation after the next mount. This error can be
>>> observed with xfstests test case generic/388, which enforces sudden shutdown
>>> during write operation and checks the file system recovery. Once the error
>>> happens because of the inconsistency, the file system requires fix. However,
>>> fsck.f2fs does not have a feature to check and fix it.
>>>
>>> This patch series adds a new feature to fsck.f2fs to check and fix the
>>> inconsistency. First and second patches add two functions which helps fsck to
>>> call report zone and reset zone commands to zoned block devices. Third patch
>>> checks write pointers of zones that current segments recorded in meta data point
>>> to. This covers the failure symptom observed with xfstests. The last patch
>>> checks write pointers of zones that current segments do not point to, which
>>> covers a potential failure scenario.
>>>
>>> This patch series depends on other patches for zoned block devices, then it
>>> conflicts with the master branch in f2fs-tools.git as of Aug/19/2019. It can be
>>> applied without conflict to dev and dev-test branch tips.
>>>
>>> Changes from v1:
>>> * Fixed build failure on dev branch
>>>
>>> Shin'ichiro Kawasaki (4):
>>>   libf2fs_zoned: Introduce f2fs_report_zones() helper function
>>>   libf2fs_zoned: Introduce f2fs_reset_zone() function
>>>   fsck.f2fs: Check write pointer consistency with current segments
>>>   fsck.f2fs: Check write pointer consistency with valid blocks count
>>>
>>>  fsck/fsck.c         | 161 ++++++++++++++++++++++++++++++++++++++++++++
>>>  fsck/fsck.h         |   3 +
>>>  fsck/main.c         |   2 +
>>>  include/f2fs_fs.h   |   3 +
>>>  lib/libf2fs_zoned.c |  81 ++++++++++++++++++++++
>>>  5 files changed, 250 insertions(+)
>>>
>>
> 
> 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [f2fs-dev] [PATCH v2 1/4] libf2fs_zoned: Introduce f2fs_report_zones() helper function
  2019-08-21  4:47 ` [f2fs-dev] [PATCH v2 1/4] libf2fs_zoned: Introduce f2fs_report_zones() helper function Shin'ichiro Kawasaki
@ 2019-08-27  1:34   ` Chao Yu
  2019-08-28  8:32     ` Shinichiro Kawasaki
  0 siblings, 1 reply; 18+ messages in thread
From: Chao Yu @ 2019-08-27  1:34 UTC (permalink / raw)
  To: Shin'ichiro Kawasaki, Jaegeuk Kim, linux-f2fs-devel; +Cc: Damien Le Moal

On 2019/8/21 12:47, Shin'ichiro Kawasaki wrote:
> To prepare for write pointer consistency check by fsck, add
> f2fs_report_zones() helper function which calls REPORT ZONE command to
> get write pointer status. The function is added to lib/libf2fs_zoned
> which gathers zoned block device related functions.
> 
> To check write pointer consistency with f2fs meta data, fsck needs to
> refer both of reported zone information and f2fs super block structure
> "f2fs_sb_info". However, libf2fs_zoned does not import f2fs_sb_info. To
> keep f2fs_sb_info structure out of libf2fs_zoned, provide a callback
> function in fsck to f2fs_report_zones() and call it for each zone.
> 
> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
> ---
>  include/f2fs_fs.h   |  2 ++
>  lib/libf2fs_zoned.c | 57 +++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 59 insertions(+)
> 
> diff --git a/include/f2fs_fs.h b/include/f2fs_fs.h
> index 0d9a036..abadd1b 100644
> --- a/include/f2fs_fs.h
> +++ b/include/f2fs_fs.h
> @@ -1279,6 +1279,8 @@ blk_zone_cond_str(struct blk_zone *blkz)
>  
>  extern int f2fs_get_zoned_model(int);
>  extern int f2fs_get_zone_blocks(int);
> +typedef int (report_zones_cb_t)(int i, struct blk_zone *blkz, void *opaque);
> +extern int f2fs_report_zones(int, report_zones_cb_t *, void *);
>  extern int f2fs_check_zones(int);
>  extern int f2fs_reset_zones(int);
>  
> diff --git a/lib/libf2fs_zoned.c b/lib/libf2fs_zoned.c
> index af00b44..fc4974f 100644
> --- a/lib/libf2fs_zoned.c
> +++ b/lib/libf2fs_zoned.c
> @@ -193,6 +193,57 @@ int f2fs_get_zone_blocks(int i)
>  
>  #define F2FS_REPORT_ZONES_BUFSZ	524288
>  
> +int f2fs_report_zones(int j, report_zones_cb_t *report_zones_cb, void *opaque)
> +{
> +	struct device_info *dev = c.devices + j;
> +	struct blk_zone_report *rep;
> +	struct blk_zone *blkz;
> +	unsigned int i, n = 0;
> +	u_int64_t total_sectors = (dev->total_sectors * c.sector_size) >> 9;

Hi Shin'ichiro,

Could we use SECTOR_SHIFT instead?

> +	u_int64_t sector = 0;
> +	int ret = -1;
> +
> +	rep = malloc(F2FS_REPORT_ZONES_BUFSZ);
> +	if (!rep) {
> +		ERR_MSG("No memory for report zones\n");
> +		return -ENOMEM;
> +	}
> +
> +	while (sector < total_sectors) {
> +
> +		/* Get zone info */
> +		rep->sector = sector;
> +		rep->nr_zones = (F2FS_REPORT_ZONES_BUFSZ - sizeof(struct blk_zone_report))
> +			/ sizeof(struct blk_zone);
> +
> +		ret = ioctl(dev->fd, BLKREPORTZONE, rep);
> +		if (ret != 0) {
> +			ret = -errno;
> +			ERR_MSG("ioctl BLKREPORTZONE failed\n");
It's minor, it will be better to print errno here, since I didn't see we print
error no from caller.

> +			goto out;
> +		}
> +
> +		if (!rep->nr_zones) {
> +			ret = -EIO;
> +			ERR_MSG("Unexpected ioctl BLKREPORTZONE result\n");
> +			goto out;
> +		}
> +
> +		blkz = (struct blk_zone *)(rep + 1);
> +		for (i = 0; i < rep->nr_zones && sector < total_sectors; i++) {

The condition looks like that block zones in rep may across end-of-device? Will
this really happen?

So I mean will "i < rep->nr_zones" be enough?

Thanks,

> +			ret = report_zones_cb(n, blkz, opaque);
> +			if (ret)
> +				goto out;
> +			sector = blk_zone_sector(blkz) + blk_zone_length(blkz);
> +			n++;
> +			blkz++;
> +		}
> +	}
> +out:
> +	free(rep);
> +	return ret;
> +}
> +
>  int f2fs_check_zones(int j)
>  {
>  	struct device_info *dev = c.devices + j;
> @@ -372,6 +423,12 @@ out:
>  
>  #else
>  
> +int f2fs_report_zones(int j, report_zones_cb_t *report_zones_cb, void *opaque)
> +{
> +	ERR_MSG("%d: Zoned block devices are not supported\n", i);
> +	return -1;
> +}
> +
>  int f2fs_get_zoned_model(int i)
>  {
>  	struct device_info *dev = c.devices + i;
> 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [f2fs-dev] [PATCH v2 2/4] libf2fs_zoned: Introduce f2fs_reset_zone() function
  2019-08-21  4:48 ` [f2fs-dev] [PATCH v2 2/4] libf2fs_zoned: Introduce f2fs_reset_zone() function Shin'ichiro Kawasaki
@ 2019-08-27  1:36   ` Chao Yu
  0 siblings, 0 replies; 18+ messages in thread
From: Chao Yu @ 2019-08-27  1:36 UTC (permalink / raw)
  To: Shin'ichiro Kawasaki, Jaegeuk Kim, linux-f2fs-devel; +Cc: Damien Le Moal

On 2019/8/21 12:48, Shin'ichiro Kawasaki wrote:
> To allow fsck to reset a zone with inconsistent write pointer, introduce
> a helper function f2fs_reset_zone().
> 
> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

Reviewed-by: Chao Yu <yuchao0@huawei.com>

Thanks,


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [f2fs-dev] [PATCH v2 3/4] fsck.f2fs: Check write pointer consistency with current segments
  2019-08-21  4:48 ` [f2fs-dev] [PATCH v2 3/4] fsck.f2fs: Check write pointer consistency with current segments Shin'ichiro Kawasaki
@ 2019-08-27  2:01   ` Chao Yu
  2019-08-27  2:13     ` Chao Yu
  0 siblings, 1 reply; 18+ messages in thread
From: Chao Yu @ 2019-08-27  2:01 UTC (permalink / raw)
  To: Shin'ichiro Kawasaki, Jaegeuk Kim, linux-f2fs-devel; +Cc: Damien Le Moal

On 2019/8/21 12:48, Shin'ichiro Kawasaki wrote:
> On sudden f2fs shutdown, zoned block device status and f2fs current
> segment positions in meta data can be inconsistent. When f2fs shutdown
> happens before write operations completes, write pointers of zoned block
> devices can go further but f2fs meta data keeps current segments at
> positions before the write operations. After remounting the f2fs, the
> inconsistency causes write operations not at write pointers and
> "Unaligned write command" error is reported. This error was observed when
> xfstests test case generic/388 was run with f2fs on a zoned block device.
> 
> To avoid the error, have f2fs.fsck check consistency between each current
> segment's position and the write pointer of the zone the current segment
> points to. If the write pointer goes advance from the current segment,
> fix the current segment position setting at same as the write pointer
> position. In case the write pointer is behind the current segment, write
> zero data at the write pointer position to make write pointer position at
> same as the current segment.
> 
> When inconsistencies are found, turn on c.bug_on flag in fsck_verify() to
> ask users to fix them or not. When inconsistencies get fixed, turn on
> 'force' flag in fsck_verify() to enforce fixes in following checks. This
> position fix is done at the beginning of do_fsck() function so that other
> checks reflect the current segment modification.
> 
> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
> ---
>  fsck/fsck.c | 133 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  fsck/fsck.h |   3 ++
>  fsck/main.c |   2 +
>  3 files changed, 138 insertions(+)
> 
> diff --git a/fsck/fsck.c b/fsck/fsck.c
> index 8953ca1..21a06ac 100644
> --- a/fsck/fsck.c
> +++ b/fsck/fsck.c
> @@ -2574,6 +2574,125 @@ out:
>  	return cnt;
>  }
>  
> +struct write_pointer_check_data {
> +	struct f2fs_sb_info *sbi;
> +	struct device_info *dev;
> +};
> +
> +#define SECTOR_SHIFT 9
> +
> +static int fsck_chk_write_pointer(int i, struct blk_zone *blkz, void *opaque)
> +{
> +	struct write_pointer_check_data *wpd = opaque;
> +	struct f2fs_sb_info *sbi = wpd->sbi;
> +	struct device_info *dev = wpd->dev;
> +	struct f2fs_fsck *fsck = F2FS_FSCK(sbi);
> +	block_t zone_block, wp_block, wp_blkoff, cs_block, b;
> +	unsigned int zone_segno, wp_segno;
> +	struct seg_entry *se;
> +	struct curseg_info *cs;
> +	int cs_index, ret;
> +	int log_sectors_per_block = sbi->log_blocksize - SECTOR_SHIFT;
> +	unsigned int segs_per_zone = sbi->segs_per_sec * sbi->secs_per_zone;
> +	void *zero_blk;
> +
> +	if (blk_zone_conv(blkz))
> +		return 0;
> +
> +	zone_block = dev->start_blkaddr
> +		+ (blk_zone_sector(blkz) >> log_sectors_per_block);
> +	zone_segno = GET_SEGNO(sbi, zone_block);
> +	wp_block = dev->start_blkaddr
> +		+ (blk_zone_wp_sector(blkz) >> log_sectors_per_block);
> +	wp_segno = GET_SEGNO(sbi, wp_block);
> +	wp_blkoff = wp_block - START_BLOCK(sbi, wp_segno);
> +
> +	/* find the curseg which points to the zone */
> +	for (cs_index = 0; cs_index < NO_CHECK_TYPE; cs_index++) {
> +		cs = &SM_I(sbi)->curseg_array[cs_index];
> +		if (zone_segno <= cs->segno &&
> +		    cs->segno < zone_segno + segs_per_zone)
> +			break;
> +	}

Will this happen?

- write checkpoint
- curseg points zone A
- write large number of data
- curseg points zone B, write pointer > 0
- sudden power cut, curseg will be reset to zone A

zone B's write pointer won't be verified due to curseg points to zone A?

Thanks,

> +
> +	if (cs_index >= NR_CURSEG_TYPE)
> +		return 0;
> +
> +	/* check write pointer consistency with the curseg in the zone */
> +	cs_block = START_BLOCK(sbi, cs->segno) + cs->next_blkoff;
> +	if (wp_block == cs_block)
> +		return 0;
> +
> +	if (!c.fix_on) {
> +		MSG(0, "Inconsistent write pointer: "
> +		    "curseg %d[0x%x,0x%x] wp[0x%x,0x%x]\n",
> +		    cs_index, cs->segno, cs->next_blkoff, wp_segno, wp_blkoff);
> +		fsck->chk.wp_inconsistent_zones++;
> +		return 0;
> +	}
> +
> +	/*
> +	 * If the curseg is in advance from the write pointer, write zero to
> +	 * move the write pointer forward to the same position as the curseg.
> +	 */
> +	if (wp_block < cs_block) {
> +		ret = 0;
> +		zero_blk = calloc(BLOCK_SZ, 1);
> +		if (!zero_blk)
> +			return -EINVAL;
> +
> +		FIX_MSG("Advance write pointer to match with curseg %d: "
> +			"[0x%x,0x%x]->[0x%x,0x%x]",
> +			cs_index, wp_segno, wp_blkoff,
> +			cs->segno, cs->next_blkoff);
> +		for (b = wp_block; b < cs_block && !ret; b++)
> +			ret = dev_write_block(zero_blk, b);
> +
> +		fsck->chk.wp_fixed_zones++;
> +		free(zero_blk);
> +		return ret;
> +	}
> +
> +	/*
> +	 * If the write pointer is in advance from the curseg, modify the
> +	 * curseg position to be same as the write pointer.
> +	 */
> +	FIX_MSG("Advance curseg %d: [0x%x,0x%x]->[0x%x,0x%x]",
> +		cs_index, cs->segno, cs->next_blkoff, wp_segno, wp_blkoff);
> +	se = get_seg_entry(sbi, wp_segno);
> +	se->type = cs_index;
> +	cs->segno = wp_segno;
> +	cs->next_blkoff = wp_blkoff;
> +	fsck->chk.wp_fixed_zones++;
> +
> +	return 0;
> +}
> +
> +void fsck_chk_write_pointers(struct f2fs_sb_info *sbi)
> +{
> +	unsigned int i;
> +	struct f2fs_fsck *fsck = F2FS_FSCK(sbi);
> +	struct write_pointer_check_data wpd = {	sbi, NULL };
> +
> +	if (c.zoned_model != F2FS_ZONED_HM)
> +		return;
> +
> +	for (i = 0; i < MAX_DEVICES; i++) {
> +		if (!c.devices[i].path)
> +			break;
> +
> +		wpd.dev = c.devices + i;
> +		if (f2fs_report_zones(i, fsck_chk_write_pointer, &wpd)) {
> +			printf("[FSCK] Write pointer check failed: %s\n",
> +			       c.devices[i].path);
> +			return;
> +		}
> +	}
> +
> +	if (fsck->chk.wp_fixed_zones && c.fix_on)
> +		write_curseg_info(sbi);
> +}
> +
>  int fsck_chk_curseg_info(struct f2fs_sb_info *sbi)
>  {
>  	struct curseg_info *curseg;
> @@ -2624,6 +2743,20 @@ int fsck_verify(struct f2fs_sb_info *sbi)
>  
>  	printf("\n");
>  
> +	if (c.zoned_model == F2FS_ZONED_HM) {
> +		printf("[FSCK] Write pointers consistency                    ");
> +		if (fsck->chk.wp_inconsistent_zones == 0x0) {
> +			printf(" [Ok..]\n");
> +		} else {
> +			printf(" [Fail] [0x%x]\n",
> +			       fsck->chk.wp_inconsistent_zones);
> +			c.bug_on = 1;
> +		}
> +
> +		if (fsck->chk.wp_fixed_zones && c.fix_on)
> +			force = 1;
> +	}
> +
>  	if (c.feature & cpu_to_le32(F2FS_FEATURE_LOST_FOUND)) {
>  		for (i = 0; i < fsck->nr_nat_entries; i++)
>  			if (f2fs_test_bit(i, fsck->nat_area_bitmap) != 0)
> diff --git a/fsck/fsck.h b/fsck/fsck.h
> index d38e8de..aa3dbe7 100644
> --- a/fsck/fsck.h
> +++ b/fsck/fsck.h
> @@ -80,6 +80,8 @@ struct f2fs_fsck {
>  		u32 multi_hard_link_files;
>  		u64 sit_valid_blocks;
>  		u32 sit_free_segs;
> +		u32 wp_fixed_zones;
> +		u32 wp_inconsistent_zones;
>  	} chk;
>  
>  	struct hard_link_node *hard_link_list_head;
> @@ -156,6 +158,7 @@ int fsck_chk_inline_dentries(struct f2fs_sb_info *, struct f2fs_node *,
>  		struct child_info *);
>  void fsck_chk_checkpoint(struct f2fs_sb_info *sbi);
>  int fsck_chk_meta(struct f2fs_sb_info *sbi);
> +void fsck_chk_write_pointers(struct f2fs_sb_info *);
>  int fsck_chk_curseg_info(struct f2fs_sb_info *);
>  void pretty_print_filename(const u8 *raw_name, u32 len,
>  			   char out[F2FS_PRINT_NAMELEN], int enc_name);
> diff --git a/fsck/main.c b/fsck/main.c
> index 9aca024..4b4a789 100644
> --- a/fsck/main.c
> +++ b/fsck/main.c
> @@ -584,6 +584,8 @@ static void do_fsck(struct f2fs_sb_info *sbi)
>  
>  	print_cp_state(flag);
>  
> +	fsck_chk_write_pointers(sbi);
> +
>  	fsck_chk_curseg_info(sbi);
>  
>  	if (!c.fix_on && !c.bug_on) {
> 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [f2fs-dev] [PATCH v2 3/4] fsck.f2fs: Check write pointer consistency with current segments
  2019-08-27  2:01   ` Chao Yu
@ 2019-08-27  2:13     ` Chao Yu
  2019-08-29  4:41       ` Shinichiro Kawasaki
  0 siblings, 1 reply; 18+ messages in thread
From: Chao Yu @ 2019-08-27  2:13 UTC (permalink / raw)
  To: Shin'ichiro Kawasaki, Jaegeuk Kim, linux-f2fs-devel; +Cc: Damien Le Moal

On 2019/8/27 10:01, Chao Yu wrote:
> On 2019/8/21 12:48, Shin'ichiro Kawasaki wrote:
>> On sudden f2fs shutdown, zoned block device status and f2fs current
>> segment positions in meta data can be inconsistent. When f2fs shutdown
>> happens before write operations completes, write pointers of zoned block
>> devices can go further but f2fs meta data keeps current segments at
>> positions before the write operations. After remounting the f2fs, the
>> inconsistency causes write operations not at write pointers and
>> "Unaligned write command" error is reported. This error was observed when
>> xfstests test case generic/388 was run with f2fs on a zoned block device.
>>
>> To avoid the error, have f2fs.fsck check consistency between each current
>> segment's position and the write pointer of the zone the current segment
>> points to. If the write pointer goes advance from the current segment,
>> fix the current segment position setting at same as the write pointer
>> position. In case the write pointer is behind the current segment, write
>> zero data at the write pointer position to make write pointer position at
>> same as the current segment.
>>
>> When inconsistencies are found, turn on c.bug_on flag in fsck_verify() to
>> ask users to fix them or not. When inconsistencies get fixed, turn on
>> 'force' flag in fsck_verify() to enforce fixes in following checks. This
>> position fix is done at the beginning of do_fsck() function so that other
>> checks reflect the current segment modification.
>>
>> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
>> ---
>>  fsck/fsck.c | 133 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  fsck/fsck.h |   3 ++
>>  fsck/main.c |   2 +
>>  3 files changed, 138 insertions(+)
>>
>> diff --git a/fsck/fsck.c b/fsck/fsck.c
>> index 8953ca1..21a06ac 100644
>> --- a/fsck/fsck.c
>> +++ b/fsck/fsck.c
>> @@ -2574,6 +2574,125 @@ out:
>>  	return cnt;
>>  }
>>  
>> +struct write_pointer_check_data {
>> +	struct f2fs_sb_info *sbi;
>> +	struct device_info *dev;
>> +};
>> +
>> +#define SECTOR_SHIFT 9
>> +
>> +static int fsck_chk_write_pointer(int i, struct blk_zone *blkz, void *opaque)
>> +{
>> +	struct write_pointer_check_data *wpd = opaque;
>> +	struct f2fs_sb_info *sbi = wpd->sbi;
>> +	struct device_info *dev = wpd->dev;
>> +	struct f2fs_fsck *fsck = F2FS_FSCK(sbi);
>> +	block_t zone_block, wp_block, wp_blkoff, cs_block, b;
>> +	unsigned int zone_segno, wp_segno;
>> +	struct seg_entry *se;
>> +	struct curseg_info *cs;
>> +	int cs_index, ret;
>> +	int log_sectors_per_block = sbi->log_blocksize - SECTOR_SHIFT;
>> +	unsigned int segs_per_zone = sbi->segs_per_sec * sbi->secs_per_zone;
>> +	void *zero_blk;
>> +
>> +	if (blk_zone_conv(blkz))
>> +		return 0;
>> +
>> +	zone_block = dev->start_blkaddr
>> +		+ (blk_zone_sector(blkz) >> log_sectors_per_block);
>> +	zone_segno = GET_SEGNO(sbi, zone_block);
>> +	wp_block = dev->start_blkaddr
>> +		+ (blk_zone_wp_sector(blkz) >> log_sectors_per_block);
>> +	wp_segno = GET_SEGNO(sbi, wp_block);
>> +	wp_blkoff = wp_block - START_BLOCK(sbi, wp_segno);
>> +
>> +	/* find the curseg which points to the zone */
>> +	for (cs_index = 0; cs_index < NO_CHECK_TYPE; cs_index++) {
>> +		cs = &SM_I(sbi)->curseg_array[cs_index];
>> +		if (zone_segno <= cs->segno &&
>> +		    cs->segno < zone_segno + segs_per_zone)
>> +			break;
>> +	}
> 
> Will this happen?
> 
> - write checkpoint
> - curseg points zone A
> - write large number of data
> - curseg points zone B, write pointer > 0
> - sudden power cut, curseg will be reset to zone A
> 
> zone B's write pointer won't be verified due to curseg points to zone A?

IIUC, we are trying fix such condition in a separated PATCH 4/4.

Reviewed-by: Chao Yu <yuchao0@huawei.com>

Thanks


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [f2fs-dev] [PATCH v2 4/4] fsck.f2fs: Check write pointer consistency with valid blocks count
  2019-08-21  4:48 ` [f2fs-dev] [PATCH v2 4/4] fsck.f2fs: Check write pointer consistency with valid blocks count Shin'ichiro Kawasaki
@ 2019-08-27  2:25   ` Chao Yu
  2019-08-28 11:53     ` Shinichiro Kawasaki
  0 siblings, 1 reply; 18+ messages in thread
From: Chao Yu @ 2019-08-27  2:25 UTC (permalink / raw)
  To: Shin'ichiro Kawasaki, Jaegeuk Kim, linux-f2fs-devel; +Cc: Damien Le Moal

On 2019/8/21 12:48, Shin'ichiro Kawasaki wrote:
> When sudden f2fs shutdown happens on zoned block devices, write
> pointers can be inconsistent with valid blocks counts in meta data.
> The failure scenario is as follows:
> 
> - Just before a sudden shutdown, a new segment in a new zone is selected
>   for a current segment. Write commands were executed to the segment.
>   and the zone has a write pointer not at zone start.
> - Before the write commands complete, shutdown happens. Meta data is
>   not updated and still keeps zero valid blocks count for the zone.
> - After next mount of the file system, the zone is selected for the next
>   write target because it has zero valid blocks count. However, it has
>   the write pointer not at zone start. Then "Unaligned write command"
>   error happens.
> 
> To avoid this potential error path, reset write pointers if the zone
> does not have a current segment, the write pointer is not at the zone
> start and the zone has no valid blocks.
> 
> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
> ---
>  fsck/fsck.c | 30 +++++++++++++++++++++++++++++-
>  1 file changed, 29 insertions(+), 1 deletion(-)
> 
> diff --git a/fsck/fsck.c b/fsck/fsck.c
> index 21a06ac..cc9bbc0 100644
> --- a/fsck/fsck.c
> +++ b/fsck/fsck.c
> @@ -2595,6 +2595,7 @@ static int fsck_chk_write_pointer(int i, struct blk_zone *blkz, void *opaque)
>  	int log_sectors_per_block = sbi->log_blocksize - SECTOR_SHIFT;
>  	unsigned int segs_per_zone = sbi->segs_per_sec * sbi->secs_per_zone;
>  	void *zero_blk;
> +	block_t	zone_valid_blocks = 0;
>  
>  	if (blk_zone_conv(blkz))
>  		return 0;
> @@ -2615,8 +2616,35 @@ static int fsck_chk_write_pointer(int i, struct blk_zone *blkz, void *opaque)
>  			break;
>  	}
>  
> -	if (cs_index >= NR_CURSEG_TYPE)
> +	if (cs_index >= NR_CURSEG_TYPE) {
> +		for (b = zone_block; b < zone_block + c.zone_blocks &&
> +			     IS_VALID_BLK_ADDR(sbi, b); b += c.blks_per_seg) {
> +			se = get_seg_entry(sbi, GET_SEGNO(sbi, b));
> +			zone_valid_blocks += se->valid_blocks;
> +		}
> +		if (wp_block == zone_block || zone_valid_blocks)
> +			return 0;
> +
> +		/*
> +		 * The write pointer is not at zone start but there is no valid
> +		 * block in the zone. Segments in the zone can be selected for
> +		 * next write. Need to reset the write pointer to avoid
> +		 * unaligned write command error.

In SPOR (sudden power-off recovery) of kernel side, we may revalidate blocks
belong to fsynced file in such zone within range of [0, write pointer], if we
just reset zone, will we lose those data for ever?

BTW, how you think enabling f2fs kernel module to recover incorrect write
pointer of zone? Once f2fs-tools doesn't upgrade, however kernel does...

Thanks,

> +		 */
> +		if (c.fix_on) {
> +			FIX_MSG("Reset write pointer at segment 0x%x",
> +				zone_segno);
> +			ret = f2fs_reset_zone(dev, blkz);
> +			if (ret)
> +				return ret;
> +			fsck->chk.wp_fixed_zones++;
> +		} else {
> +			MSG(0, "Inconsistent write pointer at segment 0x%x\n",
> +			    zone_segno);
> +			fsck->chk.wp_inconsistent_zones++;
> +		}
>  		return 0;
> +	}
>  
>  	/* check write pointer consistency with the curseg in the zone */
>  	cs_block = START_BLOCK(sbi, cs->segno) + cs->next_blkoff;
> 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [f2fs-dev] [PATCH v2 1/4] libf2fs_zoned: Introduce f2fs_report_zones() helper function
  2019-08-27  1:34   ` Chao Yu
@ 2019-08-28  8:32     ` Shinichiro Kawasaki
  0 siblings, 0 replies; 18+ messages in thread
From: Shinichiro Kawasaki @ 2019-08-28  8:32 UTC (permalink / raw)
  To: Chao Yu; +Cc: Jaegeuk Kim, Damien Le Moal, linux-f2fs-devel

On Aug 27, 2019 / 09:34, Chao Yu wrote:
> On 2019/8/21 12:47, Shin'ichiro Kawasaki wrote:
> > To prepare for write pointer consistency check by fsck, add
> > f2fs_report_zones() helper function which calls REPORT ZONE command to
> > get write pointer status. The function is added to lib/libf2fs_zoned
> > which gathers zoned block device related functions.
> > 
> > To check write pointer consistency with f2fs meta data, fsck needs to
> > refer both of reported zone information and f2fs super block structure
> > "f2fs_sb_info". However, libf2fs_zoned does not import f2fs_sb_info. To
> > keep f2fs_sb_info structure out of libf2fs_zoned, provide a callback
> > function in fsck to f2fs_report_zones() and call it for each zone.
> > 
> > Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
> > ---
> >  include/f2fs_fs.h   |  2 ++
> >  lib/libf2fs_zoned.c | 57 +++++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 59 insertions(+)
> > 
> > diff --git a/include/f2fs_fs.h b/include/f2fs_fs.h
> > index 0d9a036..abadd1b 100644
> > --- a/include/f2fs_fs.h
> > +++ b/include/f2fs_fs.h
> > @@ -1279,6 +1279,8 @@ blk_zone_cond_str(struct blk_zone *blkz)
> >  
> >  extern int f2fs_get_zoned_model(int);
> >  extern int f2fs_get_zone_blocks(int);
> > +typedef int (report_zones_cb_t)(int i, struct blk_zone *blkz, void *opaque);
> > +extern int f2fs_report_zones(int, report_zones_cb_t *, void *);
> >  extern int f2fs_check_zones(int);
> >  extern int f2fs_reset_zones(int);
> >  
> > diff --git a/lib/libf2fs_zoned.c b/lib/libf2fs_zoned.c
> > index af00b44..fc4974f 100644
> > --- a/lib/libf2fs_zoned.c
> > +++ b/lib/libf2fs_zoned.c
> > @@ -193,6 +193,57 @@ int f2fs_get_zone_blocks(int i)
> >  
> >  #define F2FS_REPORT_ZONES_BUFSZ	524288
> >  
> > +int f2fs_report_zones(int j, report_zones_cb_t *report_zones_cb, void *opaque)
> > +{
> > +	struct device_info *dev = c.devices + j;
> > +	struct blk_zone_report *rep;
> > +	struct blk_zone *blkz;
> > +	unsigned int i, n = 0;
> > +	u_int64_t total_sectors = (dev->total_sectors * c.sector_size) >> 9;
> 
> Hi Shin'ichiro,

Hi Chao, thank you for your review.

> Could we use SECTOR_SHIFT instead?

Yes. In the third patch, I added SECTOR_SHIFT definition in fsck/fsck.c. To use
it both in lib/libf2fs_zoned.c and fsck/fsck.c, I will define SECTOR_SHIFT in
include/f2fs_fs.h.

> 
> > +	u_int64_t sector = 0;
> > +	int ret = -1;
> > +
> > +	rep = malloc(F2FS_REPORT_ZONES_BUFSZ);
> > +	if (!rep) {
> > +		ERR_MSG("No memory for report zones\n");
> > +		return -ENOMEM;
> > +	}
> > +
> > +	while (sector < total_sectors) {
> > +
> > +		/* Get zone info */
> > +		rep->sector = sector;
> > +		rep->nr_zones = (F2FS_REPORT_ZONES_BUFSZ - sizeof(struct blk_zone_report))
> > +			/ sizeof(struct blk_zone);
> > +
> > +		ret = ioctl(dev->fd, BLKREPORTZONE, rep);
> > +		if (ret != 0) {
> > +			ret = -errno;
> > +			ERR_MSG("ioctl BLKREPORTZONE failed\n");
> It's minor, it will be better to print errno here, since I didn't see we print
> error no from caller.

Thanks. Will do that.

> 
> > +			goto out;
> > +		}
> > +
> > +		if (!rep->nr_zones) {
> > +			ret = -EIO;
> > +			ERR_MSG("Unexpected ioctl BLKREPORTZONE result\n");
> > +			goto out;
> > +		}
> > +
> > +		blkz = (struct blk_zone *)(rep + 1);
> > +		for (i = 0; i < rep->nr_zones && sector < total_sectors; i++) {
> 
> The condition looks like that block zones in rep may across end-of-device? Will
> this really happen?
> 
> So I mean will "i < rep->nr_zones" be enough?

You are correct. I will remove the sector comparison in v2 series.

--
Best Regards,
Shin'ichiro Kawasaki

_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [f2fs-dev] [PATCH v2 4/4] fsck.f2fs: Check write pointer consistency with valid blocks count
  2019-08-27  2:25   ` Chao Yu
@ 2019-08-28 11:53     ` Shinichiro Kawasaki
  2019-08-29 14:42       ` Chao Yu
  0 siblings, 1 reply; 18+ messages in thread
From: Shinichiro Kawasaki @ 2019-08-28 11:53 UTC (permalink / raw)
  To: Chao Yu; +Cc: Jaegeuk Kim, Damien Le Moal, linux-f2fs-devel

On Aug 27, 2019 / 10:25, Chao Yu wrote:
> On 2019/8/21 12:48, Shin'ichiro Kawasaki wrote:
> > When sudden f2fs shutdown happens on zoned block devices, write
> > pointers can be inconsistent with valid blocks counts in meta data.
> > The failure scenario is as follows:
> > 
> > - Just before a sudden shutdown, a new segment in a new zone is selected
> >   for a current segment. Write commands were executed to the segment.
> >   and the zone has a write pointer not at zone start.
> > - Before the write commands complete, shutdown happens. Meta data is
> >   not updated and still keeps zero valid blocks count for the zone.
> > - After next mount of the file system, the zone is selected for the next
> >   write target because it has zero valid blocks count. However, it has
> >   the write pointer not at zone start. Then "Unaligned write command"
> >   error happens.
> > 
> > To avoid this potential error path, reset write pointers if the zone
> > does not have a current segment, the write pointer is not at the zone
> > start and the zone has no valid blocks.
> > 
> > Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
> > ---
> >  fsck/fsck.c | 30 +++++++++++++++++++++++++++++-
> >  1 file changed, 29 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fsck/fsck.c b/fsck/fsck.c
> > index 21a06ac..cc9bbc0 100644
> > --- a/fsck/fsck.c
> > +++ b/fsck/fsck.c
> > @@ -2595,6 +2595,7 @@ static int fsck_chk_write_pointer(int i, struct blk_zone *blkz, void *opaque)
> >  	int log_sectors_per_block = sbi->log_blocksize - SECTOR_SHIFT;
> >  	unsigned int segs_per_zone = sbi->segs_per_sec * sbi->secs_per_zone;
> >  	void *zero_blk;
> > +	block_t	zone_valid_blocks = 0;
> >  
> >  	if (blk_zone_conv(blkz))
> >  		return 0;
> > @@ -2615,8 +2616,35 @@ static int fsck_chk_write_pointer(int i, struct blk_zone *blkz, void *opaque)
> >  			break;
> >  	}
> >  
> > -	if (cs_index >= NR_CURSEG_TYPE)
> > +	if (cs_index >= NR_CURSEG_TYPE) {
> > +		for (b = zone_block; b < zone_block + c.zone_blocks &&
> > +			     IS_VALID_BLK_ADDR(sbi, b); b += c.blks_per_seg) {
> > +			se = get_seg_entry(sbi, GET_SEGNO(sbi, b));
> > +			zone_valid_blocks += se->valid_blocks;
> > +		}
> > +		if (wp_block == zone_block || zone_valid_blocks)
> > +			return 0;
> > +
> > +		/*
> > +		 * The write pointer is not at zone start but there is no valid
> > +		 * block in the zone. Segments in the zone can be selected for
> > +		 * next write. Need to reset the write pointer to avoid
> > +		 * unaligned write command error.
> 
> In SPOR (sudden power-off recovery) of kernel side, we may revalidate blocks
> belong to fsynced file in such zone within range of [0, write pointer], if we
> just reset zone, will we lose those data for ever?

Yes. This patch resets zone and the data will be lost. I walked through
fs/f2fs/recovery.c and learned that nodes with fsync mark are recovered at
remount. Such fsync recovery cannot be done after zone reset. To avoid the
data loss, I would like to drop this fourth patch at this moment.

Later on, I will consider safer approach not to reset the zone, but to set next
write target block at the write pointer. I guess this approach will need kernel
side patch to change block selection logic.

> 
> BTW, how you think enabling f2fs kernel module to recover incorrect write
> pointer of zone? Once f2fs-tools doesn't upgrade, however kernel does...

Current f2fs allows to mount zoned block devices even when they have
inconsistency with f2fs meta data. This is not good. Then I believe kernel side
needs the feature to check write pointer inconsistency at mount time and fix it.

As you indicate, fix by kernel is more handy than notice to run fsck, especially
when users do not have latest f2fs-tools. Still fix by fsck is needed when users
use the kernel without the fix feature. I think both approaches are required:
fix by kernel and fix by fsck.

--
Best Regards,
Shin'ichiro Kawasaki

_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [f2fs-dev] [PATCH v2 3/4] fsck.f2fs: Check write pointer consistency with current segments
  2019-08-27  2:13     ` Chao Yu
@ 2019-08-29  4:41       ` Shinichiro Kawasaki
  0 siblings, 0 replies; 18+ messages in thread
From: Shinichiro Kawasaki @ 2019-08-29  4:41 UTC (permalink / raw)
  To: Chao Yu; +Cc: Jaegeuk Kim, Damien Le Moal, linux-f2fs-devel

On Aug 27, 2019 / 10:13, Chao Yu wrote:
> On 2019/8/27 10:01, Chao Yu wrote:
> > On 2019/8/21 12:48, Shin'ichiro Kawasaki wrote:
> >> On sudden f2fs shutdown, zoned block device status and f2fs current
> >> segment positions in meta data can be inconsistent. When f2fs shutdown
> >> happens before write operations completes, write pointers of zoned block
> >> devices can go further but f2fs meta data keeps current segments at
> >> positions before the write operations. After remounting the f2fs, the
> >> inconsistency causes write operations not at write pointers and
> >> "Unaligned write command" error is reported. This error was observed when
> >> xfstests test case generic/388 was run with f2fs on a zoned block device.
> >>
> >> To avoid the error, have f2fs.fsck check consistency between each current
> >> segment's position and the write pointer of the zone the current segment
> >> points to. If the write pointer goes advance from the current segment,
> >> fix the current segment position setting at same as the write pointer
> >> position. In case the write pointer is behind the current segment, write
> >> zero data at the write pointer position to make write pointer position at
> >> same as the current segment.
> >>
> >> When inconsistencies are found, turn on c.bug_on flag in fsck_verify() to
> >> ask users to fix them or not. When inconsistencies get fixed, turn on
> >> 'force' flag in fsck_verify() to enforce fixes in following checks. This
> >> position fix is done at the beginning of do_fsck() function so that other
> >> checks reflect the current segment modification.
> >>
> >> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
> >> ---
> >>  fsck/fsck.c | 133 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>  fsck/fsck.h |   3 ++
> >>  fsck/main.c |   2 +
> >>  3 files changed, 138 insertions(+)
> >>
> >> diff --git a/fsck/fsck.c b/fsck/fsck.c
> >> index 8953ca1..21a06ac 100644
> >> --- a/fsck/fsck.c
> >> +++ b/fsck/fsck.c
> >> @@ -2574,6 +2574,125 @@ out:
> >>  	return cnt;
> >>  }
> >>  
> >> +struct write_pointer_check_data {
> >> +	struct f2fs_sb_info *sbi;
> >> +	struct device_info *dev;
> >> +};
> >> +
> >> +#define SECTOR_SHIFT 9
> >> +
> >> +static int fsck_chk_write_pointer(int i, struct blk_zone *blkz, void *opaque)
> >> +{
> >> +	struct write_pointer_check_data *wpd = opaque;
> >> +	struct f2fs_sb_info *sbi = wpd->sbi;
> >> +	struct device_info *dev = wpd->dev;
> >> +	struct f2fs_fsck *fsck = F2FS_FSCK(sbi);
> >> +	block_t zone_block, wp_block, wp_blkoff, cs_block, b;
> >> +	unsigned int zone_segno, wp_segno;
> >> +	struct seg_entry *se;
> >> +	struct curseg_info *cs;
> >> +	int cs_index, ret;
> >> +	int log_sectors_per_block = sbi->log_blocksize - SECTOR_SHIFT;
> >> +	unsigned int segs_per_zone = sbi->segs_per_sec * sbi->secs_per_zone;
> >> +	void *zero_blk;
> >> +
> >> +	if (blk_zone_conv(blkz))
> >> +		return 0;
> >> +
> >> +	zone_block = dev->start_blkaddr
> >> +		+ (blk_zone_sector(blkz) >> log_sectors_per_block);
> >> +	zone_segno = GET_SEGNO(sbi, zone_block);
> >> +	wp_block = dev->start_blkaddr
> >> +		+ (blk_zone_wp_sector(blkz) >> log_sectors_per_block);
> >> +	wp_segno = GET_SEGNO(sbi, wp_block);
> >> +	wp_blkoff = wp_block - START_BLOCK(sbi, wp_segno);
> >> +
> >> +	/* find the curseg which points to the zone */
> >> +	for (cs_index = 0; cs_index < NO_CHECK_TYPE; cs_index++) {
> >> +		cs = &SM_I(sbi)->curseg_array[cs_index];
> >> +		if (zone_segno <= cs->segno &&
> >> +		    cs->segno < zone_segno + segs_per_zone)
> >> +			break;
> >> +	}
> > 
> > Will this happen?
> > 
> > - write checkpoint
> > - curseg points zone A
> > - write large number of data
> > - curseg points zone B, write pointer > 0
> > - sudden power cut, curseg will be reset to zone A
> > 
> > zone B's write pointer won't be verified due to curseg points to zone A?
> 
> IIUC, we are trying fix such condition in a separated PATCH 4/4.
> 
> Reviewed-by: Chao Yu <yuchao0@huawei.com>

Yes, that's the failure scenario that PATCH 4/4 tried to address. As I
responded separately, I would like to drop PATCH 4/4 at this moment.

Will add your reviewed-by tag to this PATCH 3/4 in the next version.
Thanks!

--
Best Regards,
Shin'ichiro Kawasaki

_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [f2fs-dev] [PATCH v2 4/4] fsck.f2fs: Check write pointer consistency with valid blocks count
  2019-08-28 11:53     ` Shinichiro Kawasaki
@ 2019-08-29 14:42       ` Chao Yu
  2019-08-30  7:21         ` Shinichiro Kawasaki
  0 siblings, 1 reply; 18+ messages in thread
From: Chao Yu @ 2019-08-29 14:42 UTC (permalink / raw)
  To: Shinichiro Kawasaki, Chao Yu
  Cc: Jaegeuk Kim, Damien Le Moal, linux-f2fs-devel

On 2019-8-28 19:53, Shinichiro Kawasaki wrote:
> On Aug 27, 2019 / 10:25, Chao Yu wrote:
>> On 2019/8/21 12:48, Shin'ichiro Kawasaki wrote:
>>> When sudden f2fs shutdown happens on zoned block devices, write
>>> pointers can be inconsistent with valid blocks counts in meta data.
>>> The failure scenario is as follows:
>>>
>>> - Just before a sudden shutdown, a new segment in a new zone is selected
>>>   for a current segment. Write commands were executed to the segment.
>>>   and the zone has a write pointer not at zone start.
>>> - Before the write commands complete, shutdown happens. Meta data is
>>>   not updated and still keeps zero valid blocks count for the zone.
>>> - After next mount of the file system, the zone is selected for the next
>>>   write target because it has zero valid blocks count. However, it has
>>>   the write pointer not at zone start. Then "Unaligned write command"
>>>   error happens.
>>>
>>> To avoid this potential error path, reset write pointers if the zone
>>> does not have a current segment, the write pointer is not at the zone
>>> start and the zone has no valid blocks.
>>>
>>> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
>>> ---
>>>  fsck/fsck.c | 30 +++++++++++++++++++++++++++++-
>>>  1 file changed, 29 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/fsck/fsck.c b/fsck/fsck.c
>>> index 21a06ac..cc9bbc0 100644
>>> --- a/fsck/fsck.c
>>> +++ b/fsck/fsck.c
>>> @@ -2595,6 +2595,7 @@ static int fsck_chk_write_pointer(int i, struct blk_zone *blkz, void *opaque)
>>>  	int log_sectors_per_block = sbi->log_blocksize - SECTOR_SHIFT;
>>>  	unsigned int segs_per_zone = sbi->segs_per_sec * sbi->secs_per_zone;
>>>  	void *zero_blk;
>>> +	block_t	zone_valid_blocks = 0;
>>>  
>>>  	if (blk_zone_conv(blkz))
>>>  		return 0;
>>> @@ -2615,8 +2616,35 @@ static int fsck_chk_write_pointer(int i, struct blk_zone *blkz, void *opaque)
>>>  			break;
>>>  	}
>>>  
>>> -	if (cs_index >= NR_CURSEG_TYPE)
>>> +	if (cs_index >= NR_CURSEG_TYPE) {
>>> +		for (b = zone_block; b < zone_block + c.zone_blocks &&
>>> +			     IS_VALID_BLK_ADDR(sbi, b); b += c.blks_per_seg) {
>>> +			se = get_seg_entry(sbi, GET_SEGNO(sbi, b));
>>> +			zone_valid_blocks += se->valid_blocks;
>>> +		}
>>> +		if (wp_block == zone_block || zone_valid_blocks)
>>> +			return 0;
>>> +
>>> +		/*
>>> +		 * The write pointer is not at zone start but there is no valid
>>> +		 * block in the zone. Segments in the zone can be selected for
>>> +		 * next write. Need to reset the write pointer to avoid
>>> +		 * unaligned write command error.
>>
>> In SPOR (sudden power-off recovery) of kernel side, we may revalidate blocks
>> belong to fsynced file in such zone within range of [0, write pointer], if we
>> just reset zone, will we lose those data for ever?
> 
> Yes. This patch resets zone and the data will be lost. I walked through
> fs/f2fs/recovery.c and learned that nodes with fsync mark are recovered at
> remount. Such fsync recovery cannot be done after zone reset. To avoid the
> data loss, I would like to drop this fourth patch at this moment.
> 
> Later on, I will consider safer approach not to reset the zone, but to set next
> write target block at the write pointer. I guess this approach will need kernel
> side patch to change block selection logic.

I guess below commit can help to recognize fsynced data in unclean umounted
image, maybe we can skip invalidating those data during zone write pointer recovery.

f2fs-tools: fix to skip block allocation for fsynced data

https://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs-tools.git/commit/?h=dev-test&id=a50cfc89e56ce8c022e295bf4de619af070fabe9

> 
>>
>> BTW, how you think enabling f2fs kernel module to recover incorrect write
>> pointer of zone? Once f2fs-tools doesn't upgrade, however kernel does...
> 
> Current f2fs allows to mount zoned block devices even when they have
> inconsistency with f2fs meta data. This is not good. Then I believe kernel side
> needs the feature to check write pointer inconsistency at mount time and fix it.
> 
> As you indicate, fix by kernel is more handy than notice to run fsck, especially
> when users do not have latest f2fs-tools. Still fix by fsck is needed when users
> use the kernel without the fix feature. I think both approaches are required:
> fix by kernel and fix by fsck.

Agreed, let's try to fix in both side.

Thanks,

> 
> --
> Best Regards,
> Shin'ichiro Kawasaki
> 
> _______________________________________________
> Linux-f2fs-devel mailing list
> Linux-f2fs-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [f2fs-dev] [PATCH v2 4/4] fsck.f2fs: Check write pointer consistency with valid blocks count
  2019-08-29 14:42       ` Chao Yu
@ 2019-08-30  7:21         ` Shinichiro Kawasaki
  0 siblings, 0 replies; 18+ messages in thread
From: Shinichiro Kawasaki @ 2019-08-30  7:21 UTC (permalink / raw)
  To: Chao Yu; +Cc: Jaegeuk Kim, Damien Le Moal, linux-f2fs-devel

On Aug 29, 2019 / 22:42, Chao Yu wrote:
> On 2019-8-28 19:53, Shinichiro Kawasaki wrote:
> > On Aug 27, 2019 / 10:25, Chao Yu wrote:
> >> On 2019/8/21 12:48, Shin'ichiro Kawasaki wrote:
> >>> When sudden f2fs shutdown happens on zoned block devices, write
> >>> pointers can be inconsistent with valid blocks counts in meta data.
> >>> The failure scenario is as follows:
> >>>
> >>> - Just before a sudden shutdown, a new segment in a new zone is selected
> >>>   for a current segment. Write commands were executed to the segment.
> >>>   and the zone has a write pointer not at zone start.
> >>> - Before the write commands complete, shutdown happens. Meta data is
> >>>   not updated and still keeps zero valid blocks count for the zone.
> >>> - After next mount of the file system, the zone is selected for the next
> >>>   write target because it has zero valid blocks count. However, it has
> >>>   the write pointer not at zone start. Then "Unaligned write command"
> >>>   error happens.
> >>>
> >>> To avoid this potential error path, reset write pointers if the zone
> >>> does not have a current segment, the write pointer is not at the zone
> >>> start and the zone has no valid blocks.
> >>>
> >>> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
> >>> ---
> >>>  fsck/fsck.c | 30 +++++++++++++++++++++++++++++-
> >>>  1 file changed, 29 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/fsck/fsck.c b/fsck/fsck.c
> >>> index 21a06ac..cc9bbc0 100644
> >>> --- a/fsck/fsck.c
> >>> +++ b/fsck/fsck.c
> >>> @@ -2595,6 +2595,7 @@ static int fsck_chk_write_pointer(int i, struct blk_zone *blkz, void *opaque)
> >>>  	int log_sectors_per_block = sbi->log_blocksize - SECTOR_SHIFT;
> >>>  	unsigned int segs_per_zone = sbi->segs_per_sec * sbi->secs_per_zone;
> >>>  	void *zero_blk;
> >>> +	block_t	zone_valid_blocks = 0;
> >>>  
> >>>  	if (blk_zone_conv(blkz))
> >>>  		return 0;
> >>> @@ -2615,8 +2616,35 @@ static int fsck_chk_write_pointer(int i, struct blk_zone *blkz, void *opaque)
> >>>  			break;
> >>>  	}
> >>>  
> >>> -	if (cs_index >= NR_CURSEG_TYPE)
> >>> +	if (cs_index >= NR_CURSEG_TYPE) {
> >>> +		for (b = zone_block; b < zone_block + c.zone_blocks &&
> >>> +			     IS_VALID_BLK_ADDR(sbi, b); b += c.blks_per_seg) {
> >>> +			se = get_seg_entry(sbi, GET_SEGNO(sbi, b));
> >>> +			zone_valid_blocks += se->valid_blocks;
> >>> +		}
> >>> +		if (wp_block == zone_block || zone_valid_blocks)
> >>> +			return 0;
> >>> +
> >>> +		/*
> >>> +		 * The write pointer is not at zone start but there is no valid
> >>> +		 * block in the zone. Segments in the zone can be selected for
> >>> +		 * next write. Need to reset the write pointer to avoid
> >>> +		 * unaligned write command error.
> >>
> >> In SPOR (sudden power-off recovery) of kernel side, we may revalidate blocks
> >> belong to fsynced file in such zone within range of [0, write pointer], if we
> >> just reset zone, will we lose those data for ever?
> > 
> > Yes. This patch resets zone and the data will be lost. I walked through
> > fs/f2fs/recovery.c and learned that nodes with fsync mark are recovered at
> > remount. Such fsync recovery cannot be done after zone reset. To avoid the
> > data loss, I would like to drop this fourth patch at this moment.
> > 
> > Later on, I will consider safer approach not to reset the zone, but to set next
> > write target block at the write pointer. I guess this approach will need kernel
> > side patch to change block selection logic.
> 
> I guess below commit can help to recognize fsynced data in unclean umounted
> image, maybe we can skip invalidating those data during zone write pointer recovery.
> 
> f2fs-tools: fix to skip block allocation for fsynced data
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs-tools.git/commit/?h=dev-test&id=a50cfc89e56ce8c022e295bf4de619af070fabe9

Thank you for the idea! The find_fsync_inode() function in the commit looks
useful. Will try to create a separated patch on top of the commit.

--
Best Regards,
Shin'ichiro Kawasaki

_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, back to index

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-21  4:47 [f2fs-dev] [PATCH v2 0/4] fsck: Check write pointers of zoned block devices Shin'ichiro Kawasaki
2019-08-21  4:47 ` [f2fs-dev] [PATCH v2 1/4] libf2fs_zoned: Introduce f2fs_report_zones() helper function Shin'ichiro Kawasaki
2019-08-27  1:34   ` Chao Yu
2019-08-28  8:32     ` Shinichiro Kawasaki
2019-08-21  4:48 ` [f2fs-dev] [PATCH v2 2/4] libf2fs_zoned: Introduce f2fs_reset_zone() function Shin'ichiro Kawasaki
2019-08-27  1:36   ` Chao Yu
2019-08-21  4:48 ` [f2fs-dev] [PATCH v2 3/4] fsck.f2fs: Check write pointer consistency with current segments Shin'ichiro Kawasaki
2019-08-27  2:01   ` Chao Yu
2019-08-27  2:13     ` Chao Yu
2019-08-29  4:41       ` Shinichiro Kawasaki
2019-08-21  4:48 ` [f2fs-dev] [PATCH v2 4/4] fsck.f2fs: Check write pointer consistency with valid blocks count Shin'ichiro Kawasaki
2019-08-27  2:25   ` Chao Yu
2019-08-28 11:53     ` Shinichiro Kawasaki
2019-08-29 14:42       ` Chao Yu
2019-08-30  7:21         ` Shinichiro Kawasaki
2019-08-23 13:09 ` [f2fs-dev] [PATCH v2 0/4] fsck: Check write pointers of zoned block devices Chao Yu
2019-08-26  0:10   ` Damien Le Moal
2019-08-26  7:37     ` Chao Yu

Linux-f2fs-devel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-f2fs-devel/0 linux-f2fs-devel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-f2fs-devel linux-f2fs-devel/ https://lore.kernel.org/linux-f2fs-devel \
		linux-f2fs-devel@lists.sourceforge.net linux-f2fs-devel@archiver.kernel.org
	public-inbox-index linux-f2fs-devel


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/net.sourceforge.lists.linux-f2fs-devel


AGPL code for this site: git clone https://public-inbox.org/ public-inbox