All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jaegeuk Kim <jaegeuk.kim@samsung.com>
To: Andrey Tsyvarev <tsyvarev@ispras.ru>
Cc: linux-f2fs-devel@lists.sourceforge.net,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Alexey Khoroshilov <khoroshilov@ispras.ru>
Subject: Re: f2fs: BUG_ON() is triggered when mount valid f2fs filesystem
Date: Tue, 15 Apr 2014 20:04:24 +0900	[thread overview]
Message-ID: <1397559864.7727.5.camel@kjgkr> (raw)
In-Reply-To: <534BC29B.3020408@ispras.ru>

Hi,

Thank you for the report.
I retrieved the fault image and found out that previous garbage data
wreak such the wrong behaviors.
So, I wrote the following patch that fills one zero-block at the
checkpoint procedure.
If the underlying device supports discard, I expect that it mostly
doesn't incur any performance regression significantly.

Could you test this patch?

>From 60588ceb7277aae2a79e7f67f5217d1256720d78 Mon Sep 17 00:00:00 2001
From: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Date: Tue, 15 Apr 2014 13:57:55 +0900
Subject: [PATCH] f2fs: avoid to conduct roll-forward due to the remained
 garbage blocks

The f2fs always scans the next chain of direct node blocks.
But some garbage blocks are able to be remained due to no discard
support or
SSR triggers.
This occasionally wreaks recovering wrong inodes that were used or
BUG_ONs
due to reallocating node ids as follows.

When mount this f2fs image:
http://linuxtesting.org/downloads/f2fs_fault_image.zip
BUG_ON is triggered in f2fs driver (messages below are generated on
kernel 3.13.2; for other kernels output is similar):

kernel BUG at fs/f2fs/node.c:215!
 Call Trace:
 [<ffffffffa032ebad>] recover_inode_page+0x1fd/0x3e0 [f2fs]
 [<ffffffff811446e7>] ? __lock_page+0x67/0x70
 [<ffffffff81089990>] ? autoremove_wake_function+0x50/0x50
 [<ffffffffa0337788>] recover_fsync_data+0x1398/0x15d0 [f2fs]
 [<ffffffff812b9e5c>] ? selinux_d_instantiate+0x1c/0x20
 [<ffffffff811cb20b>] ? d_instantiate+0x5b/0x80
 [<ffffffffa0321044>] f2fs_fill_super+0xb04/0xbf0 [f2fs]
 [<ffffffff811b861e>] ? mount_bdev+0x7e/0x210
 [<ffffffff811b8769>] mount_bdev+0x1c9/0x210
 [<ffffffffa0320540>] ? validate_superblock+0x210/0x210 [f2fs]
 [<ffffffffa031cf8d>] f2fs_mount+0x1d/0x30 [f2fs]
 [<ffffffff811b9497>] mount_fs+0x47/0x1c0
 [<ffffffff81166e00>] ? __alloc_percpu+0x10/0x20
 [<ffffffff811d4032>] vfs_kern_mount+0x72/0x110
 [<ffffffff811d6763>] do_mount+0x493/0x910
 [<ffffffff811615cb>] ? strndup_user+0x5b/0x80
 [<ffffffff811d6c70>] SyS_mount+0x90/0xe0
 [<ffffffff8166f8d9>] system_call_fastpath+0x16/0x1b

Found by Linux File System Verification project (linuxtesting.org).

Reported-by: Andrey Tsyvarev <tsyvarev@ispras.ru>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
---
 fs/f2fs/checkpoint.c |  6 ++++++
 fs/f2fs/f2fs.h       |  1 +
 fs/f2fs/segment.c    | 17 +++++++++++++++--
 3 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 4aa521a..890e23d 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -762,6 +762,12 @@ static void do_checkpoint(struct f2fs_sb_info *sbi,
bool is_umount)
 	void *kaddr;
 	int i;
 
+	/*
+	 * This avoids to conduct wrong roll-forward operations and uses
+	 * metapages, so should be called prior to sync_meta_pages below.
+	 */
+	discard_next_dnode(sbi);
+
 	/* Flush all the NAT/SIT pages */
 	while (get_pages(sbi, F2FS_DIRTY_META))
 		sync_meta_pages(sbi, META, LONG_MAX);
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 2ecac83..2c5a5da 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1179,6 +1179,7 @@ int f2fs_issue_flush(struct f2fs_sb_info *);
 void invalidate_blocks(struct f2fs_sb_info *, block_t);
 void refresh_sit_entry(struct f2fs_sb_info *, block_t, block_t);
 void clear_prefree_segments(struct f2fs_sb_info *);
+void discard_next_dnode(struct f2fs_sb_info *);
 int npages_for_summary_flush(struct f2fs_sb_info *);
 void allocate_new_segments(struct f2fs_sb_info *);
 struct page *get_sum_page(struct f2fs_sb_info *, unsigned int);
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 1e264e7..9993f94 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -335,13 +335,26 @@ static void locate_dirty_segment(struct
f2fs_sb_info *sbi, unsigned int segno)
 	mutex_unlock(&dirty_i->seglist_lock);
 }
 
-static void f2fs_issue_discard(struct f2fs_sb_info *sbi,
+static int f2fs_issue_discard(struct f2fs_sb_info *sbi,
 				block_t blkstart, block_t blklen)
 {
 	sector_t start = SECTOR_FROM_BLOCK(sbi, blkstart);
 	sector_t len = SECTOR_FROM_BLOCK(sbi, blklen);
-	blkdev_issue_discard(sbi->sb->s_bdev, start, len, GFP_NOFS, 0);
 	trace_f2fs_issue_discard(sbi->sb, blkstart, blklen);
+	return blkdev_issue_discard(sbi->sb->s_bdev, start, len, GFP_NOFS, 0);
+}
+
+void discard_next_dnode(struct f2fs_sb_info *sbi)
+{
+	struct curseg_info *curseg = CURSEG_I(sbi, CURSEG_WARM_NODE);
+	block_t blkaddr = NEXT_FREE_BLKADDR(sbi, curseg);
+
+	if (f2fs_issue_discard(sbi, blkaddr, 1)) {
+		struct page *page = grab_meta_page(sbi, blkaddr);
+		/* zero-filled page */
+		set_page_dirty(page);
+		f2fs_put_page(page, 1);
+	}
 }
 
 static void add_discard_addrs(struct f2fs_sb_info *sbi,
-- 
1.8.4.474.g128a96c

-- 
Jaegeuk Kim
Samsung


WARNING: multiple messages have this Message-ID (diff)
From: Jaegeuk Kim <jaegeuk.kim@samsung.com>
To: Andrey Tsyvarev <tsyvarev@ispras.ru>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
	Alexey Khoroshilov <khoroshilov@ispras.ru>,
	linux-f2fs-devel@lists.sourceforge.net
Subject: Re: f2fs: BUG_ON() is triggered when mount valid f2fs filesystem
Date: Tue, 15 Apr 2014 20:04:24 +0900	[thread overview]
Message-ID: <1397559864.7727.5.camel@kjgkr> (raw)
In-Reply-To: <534BC29B.3020408@ispras.ru>

Hi,

Thank you for the report.
I retrieved the fault image and found out that previous garbage data
wreak such the wrong behaviors.
So, I wrote the following patch that fills one zero-block at the
checkpoint procedure.
If the underlying device supports discard, I expect that it mostly
doesn't incur any performance regression significantly.

Could you test this patch?

>From 60588ceb7277aae2a79e7f67f5217d1256720d78 Mon Sep 17 00:00:00 2001
From: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Date: Tue, 15 Apr 2014 13:57:55 +0900
Subject: [PATCH] f2fs: avoid to conduct roll-forward due to the remained
 garbage blocks

The f2fs always scans the next chain of direct node blocks.
But some garbage blocks are able to be remained due to no discard
support or
SSR triggers.
This occasionally wreaks recovering wrong inodes that were used or
BUG_ONs
due to reallocating node ids as follows.

When mount this f2fs image:
http://linuxtesting.org/downloads/f2fs_fault_image.zip
BUG_ON is triggered in f2fs driver (messages below are generated on
kernel 3.13.2; for other kernels output is similar):

kernel BUG at fs/f2fs/node.c:215!
 Call Trace:
 [<ffffffffa032ebad>] recover_inode_page+0x1fd/0x3e0 [f2fs]
 [<ffffffff811446e7>] ? __lock_page+0x67/0x70
 [<ffffffff81089990>] ? autoremove_wake_function+0x50/0x50
 [<ffffffffa0337788>] recover_fsync_data+0x1398/0x15d0 [f2fs]
 [<ffffffff812b9e5c>] ? selinux_d_instantiate+0x1c/0x20
 [<ffffffff811cb20b>] ? d_instantiate+0x5b/0x80
 [<ffffffffa0321044>] f2fs_fill_super+0xb04/0xbf0 [f2fs]
 [<ffffffff811b861e>] ? mount_bdev+0x7e/0x210
 [<ffffffff811b8769>] mount_bdev+0x1c9/0x210
 [<ffffffffa0320540>] ? validate_superblock+0x210/0x210 [f2fs]
 [<ffffffffa031cf8d>] f2fs_mount+0x1d/0x30 [f2fs]
 [<ffffffff811b9497>] mount_fs+0x47/0x1c0
 [<ffffffff81166e00>] ? __alloc_percpu+0x10/0x20
 [<ffffffff811d4032>] vfs_kern_mount+0x72/0x110
 [<ffffffff811d6763>] do_mount+0x493/0x910
 [<ffffffff811615cb>] ? strndup_user+0x5b/0x80
 [<ffffffff811d6c70>] SyS_mount+0x90/0xe0
 [<ffffffff8166f8d9>] system_call_fastpath+0x16/0x1b

Found by Linux File System Verification project (linuxtesting.org).

Reported-by: Andrey Tsyvarev <tsyvarev@ispras.ru>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
---
 fs/f2fs/checkpoint.c |  6 ++++++
 fs/f2fs/f2fs.h       |  1 +
 fs/f2fs/segment.c    | 17 +++++++++++++++--
 3 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 4aa521a..890e23d 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -762,6 +762,12 @@ static void do_checkpoint(struct f2fs_sb_info *sbi,
bool is_umount)
 	void *kaddr;
 	int i;
 
+	/*
+	 * This avoids to conduct wrong roll-forward operations and uses
+	 * metapages, so should be called prior to sync_meta_pages below.
+	 */
+	discard_next_dnode(sbi);
+
 	/* Flush all the NAT/SIT pages */
 	while (get_pages(sbi, F2FS_DIRTY_META))
 		sync_meta_pages(sbi, META, LONG_MAX);
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 2ecac83..2c5a5da 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1179,6 +1179,7 @@ int f2fs_issue_flush(struct f2fs_sb_info *);
 void invalidate_blocks(struct f2fs_sb_info *, block_t);
 void refresh_sit_entry(struct f2fs_sb_info *, block_t, block_t);
 void clear_prefree_segments(struct f2fs_sb_info *);
+void discard_next_dnode(struct f2fs_sb_info *);
 int npages_for_summary_flush(struct f2fs_sb_info *);
 void allocate_new_segments(struct f2fs_sb_info *);
 struct page *get_sum_page(struct f2fs_sb_info *, unsigned int);
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 1e264e7..9993f94 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -335,13 +335,26 @@ static void locate_dirty_segment(struct
f2fs_sb_info *sbi, unsigned int segno)
 	mutex_unlock(&dirty_i->seglist_lock);
 }
 
-static void f2fs_issue_discard(struct f2fs_sb_info *sbi,
+static int f2fs_issue_discard(struct f2fs_sb_info *sbi,
 				block_t blkstart, block_t blklen)
 {
 	sector_t start = SECTOR_FROM_BLOCK(sbi, blkstart);
 	sector_t len = SECTOR_FROM_BLOCK(sbi, blklen);
-	blkdev_issue_discard(sbi->sb->s_bdev, start, len, GFP_NOFS, 0);
 	trace_f2fs_issue_discard(sbi->sb, blkstart, blklen);
+	return blkdev_issue_discard(sbi->sb->s_bdev, start, len, GFP_NOFS, 0);
+}
+
+void discard_next_dnode(struct f2fs_sb_info *sbi)
+{
+	struct curseg_info *curseg = CURSEG_I(sbi, CURSEG_WARM_NODE);
+	block_t blkaddr = NEXT_FREE_BLKADDR(sbi, curseg);
+
+	if (f2fs_issue_discard(sbi, blkaddr, 1)) {
+		struct page *page = grab_meta_page(sbi, blkaddr);
+		/* zero-filled page */
+		set_page_dirty(page);
+		f2fs_put_page(page, 1);
+	}
 }
 
 static void add_discard_addrs(struct f2fs_sb_info *sbi,
-- 
1.8.4.474.g128a96c

-- 
Jaegeuk Kim
Samsung


------------------------------------------------------------------------------
Learn Graph Databases - Download FREE O'Reilly Book
"Graph Databases" is the definitive new guide to graph databases and their
applications. Written by three acclaimed leaders in the field,
this first edition is now available. Download your free book today!
http://p.sf.net/sfu/NeoTech

  reply	other threads:[~2014-04-15 11:06 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-02-06  5:43 f2fs: f2fs unmount hangs if f2fs_init_acl() fails during mkdir syscall Andrey Tsyvarev
2014-02-06  5:43 ` Andrey Tsyvarev
2014-02-06  6:02 ` Jaegeuk Kim
2014-02-06  6:02   ` Jaegeuk Kim
2014-02-06 12:17   ` Andrey Tsyvarev
2014-02-06 12:17     ` Andrey Tsyvarev
2014-02-07  0:49     ` Jaegeuk Kim
2014-02-07  5:12       ` [f2fs-dev] " Jaegeuk Kim
2014-02-07  5:12         ` Jaegeuk Kim
2014-02-11  8:29         ` [f2fs-dev] " Andrey Tsyvarev
2014-02-11  8:29           ` Andrey Tsyvarev
2014-02-13  8:32           ` [f2fs-dev] " Gu Zheng
2014-02-13  8:32             ` Gu Zheng
2014-02-13  9:40             ` [f2fs-dev] " Andrey Tsyvarev
2014-02-13  9:40               ` Andrey Tsyvarev
2014-02-13  9:48               ` [f2fs-dev] " Gu Zheng
2014-02-13  9:48                 ` Gu Zheng
2014-02-14  2:00                 ` [f2fs-dev] " Jaegeuk Kim
2014-02-14  1:58           ` Jaegeuk Kim
2014-02-14  1:58             ` Jaegeuk Kim
2014-04-14 11:12 ` f2fs: BUG_ON() is triggered when mount valid f2fs filesystem Andrey Tsyvarev
2014-04-15 11:04   ` Jaegeuk Kim [this message]
2014-04-15 11:04     ` Jaegeuk Kim
2014-04-16  9:11     ` Andrey Tsyvarev
2014-04-16  9:11       ` Andrey Tsyvarev
2014-04-16 23:35       ` Jaegeuk Kim
2014-04-16 23:35         ` Jaegeuk Kim
2014-04-17  1:11         ` Alexey Khoroshilov
2014-04-17  1:11           ` Alexey Khoroshilov
2014-04-17  7:45           ` Jaegeuk Kim
2014-04-17  7:45             ` Jaegeuk Kim
2014-04-18  6:04             ` Alexey Khoroshilov
2014-04-18  6:04               ` Alexey Khoroshilov
2014-04-18  6:35               ` Jaegeuk Kim
2014-04-18  6:35                 ` Jaegeuk Kim
2014-04-18  6:40               ` Gu Zheng
2014-04-18  6:40                 ` Gu Zheng
2014-07-21 10:56   ` f2fs: Possible use-after-free when umount filesystem Andrey Tsyvarev
2014-07-21 11:09     ` Fwd: " Andrey Tsyvarev
2014-07-22  2:17     ` Gu Zheng
2014-07-22  2:17       ` Gu Zheng
2014-07-22 10:04       ` Andrey Tsyvarev
2014-07-22 10:04         ` Andrey Tsyvarev
2014-07-23  2:12         ` [f2fs-dev] " Chao Yu
2014-07-23  2:12           ` Chao Yu
2014-07-23  3:39           ` [f2fs-dev] " Gu Zheng
2014-07-23  3:39             ` Gu Zheng
2014-07-24 10:14             ` Andrey Tsyvarev
2014-07-24 10:14               ` Andrey Tsyvarev
2014-07-25  3:22               ` [f2fs-dev] " Chao Yu
2014-07-25  3:22                 ` Chao Yu
2014-07-25  5:49                 ` [f2fs-dev] " Gu Zheng
2014-07-25  5:49                   ` Gu Zheng
2014-07-25 15:37                   ` [f2fs-dev] " Jaegeuk Kim
2014-07-25 15:37                     ` Jaegeuk Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1397559864.7727.5.camel@kjgkr \
    --to=jaegeuk.kim@samsung.com \
    --cc=khoroshilov@ispras.ru \
    --cc=linux-f2fs-devel@lists.sourceforge.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tsyvarev@ispras.ru \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.