linux-kernel.vger.kernel.org archive mirror
* [PATCH 1/5] f2fs: reuse nids more aggressively
@ 2015-08-18  8:46 Jaegeuk Kim
  2015-08-18  8:46 ` [PATCH 2/5] f2fs: fix to cover lock_op for update_inode_page Jaegeuk Kim
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Jaegeuk Kim @ 2015-08-18  8:46 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-f2fs-devel; +Cc: Jaegeuk Kim

If we reuse as many nids as possible, we can reduce the number of obsolete
node pages produced in the page cache.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 fs/f2fs/node.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 6e10c2a..3cc32b8 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -306,6 +306,10 @@ static void set_node_addr(struct f2fs_sb_info *sbi, struct node_info *ni,
 	if (nat_get_blkaddr(e) != NEW_ADDR && new_blkaddr == NULL_ADDR) {
 		unsigned char version = nat_get_version(e);
 		nat_set_version(e, inc_node_version(version));
+
+		/* in order to reuse the nid */
+		if (nm_i->next_scan_nid > ni->nid)
+			nm_i->next_scan_nid = ni->nid;
 	}
 
 	/* change address */
-- 
2.1.1
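
A minimal userspace sketch of the cursor adjustment in this patch, for
illustration only. The struct and function names are hypothetical stand-ins,
not kernel symbols; only next_scan_nid comes from the diff. When a node is
freed, the scan cursor is pulled back to the freed nid if it had already
moved past it, so the next free-nid scan sees that nid again.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t nid_t;

/* Hypothetical stand-in for the one f2fs_nm_info field the patch touches. */
struct nm_info_sketch {
	nid_t next_scan_nid;	/* where the next free-nid scan starts */
};

/*
 * Called when a node is freed: pull the scan cursor back so the freed
 * nid is covered by the next scan. The cursor never moves forward here.
 */
static void note_freed_nid(struct nm_info_sketch *nm_i, nid_t nid)
{
	assert(nm_i);
	if (nm_i->next_scan_nid > nid)
		nm_i->next_scan_nid = nid;
}
```

With the cursor at 100, freeing nid 40 moves it to 40; freeing nid 70
afterwards leaves it at 40.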



* [PATCH 2/5] f2fs: fix to cover lock_op for update_inode_page
  2015-08-18  8:46 [PATCH 1/5] f2fs: reuse nids more aggressively Jaegeuk Kim
@ 2015-08-18  8:46 ` Jaegeuk Kim
  2015-08-20  9:09   ` [f2fs-dev] " Chao Yu
  2015-08-18  8:46 ` [PATCH 3/5] f2fs: retry gc if one section is not successfully reclaimed Jaegeuk Kim
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Jaegeuk Kim @ 2015-08-18  8:46 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-f2fs-devel; +Cc: Jaegeuk Kim

Previously, update_inode_page was not called under f2fs_lock_op.
Call it through f2fs_write_inode instead.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 fs/f2fs/file.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 016ed3b..7faafb5 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -206,8 +206,8 @@ int f2fs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
 	}
 
 	/* if the inode is dirty, let's recover all the time */
-	if (!datasync && is_inode_flag_set(fi, FI_DIRTY_INODE)) {
-		update_inode_page(inode);
+	if (!datasync) {
+		f2fs_write_inode(inode, NULL);
 		goto go_write;
 	}
 
-- 
2.1.1
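
A rough sketch of why routing the fsync path through f2fs_write_inode closes
the gap. It assumes that f2fs_write_inode checks the dirty flag itself and
takes f2fs_lock_op around update_inode_page; that is a reading of the subject
line and of the removed FI_DIRTY_INODE test, not something this diff shows.
All types and helpers below are hypothetical userspace stand-ins.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are kernel types. */
struct sb_sketch {
	int op_lock_depth;	/* > 0 while the operation lock is held */
};

struct inode_sketch {
	struct sb_sketch *sb;
	bool dirty;
	int updates;		/* times the inode page was refreshed */
	int unlocked_updates;	/* refreshes done without the lock */
};

static void lock_op(struct sb_sketch *sb)   { sb->op_lock_depth++; }
static void unlock_op(struct sb_sketch *sb) { sb->op_lock_depth--; }

/* Models update_inode_page(): records whether the caller held the lock. */
static void update_inode_page_sketch(struct inode_sketch *inode)
{
	if (inode->sb->op_lock_depth == 0)
		inode->unlocked_updates++;
	inode->updates++;
	inode->dirty = false;
}

/* Models f2fs_write_inode(): skips clean inodes, locks around the update. */
static int write_inode_sketch(struct inode_sketch *inode)
{
	assert(inode && inode->sb);
	if (!inode->dirty)
		return 0;
	lock_op(inode->sb);
	update_inode_page_sketch(inode);
	unlock_op(inode->sb);
	return 0;
}
```

Because the dirty check lives inside the wrapper, the caller can drop its own
flag test, which matches the shape of the change to f2fs_sync_file.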



* [PATCH 3/5] f2fs: retry gc if one section is not successfully reclaimed
  2015-08-18  8:46 [PATCH 1/5] f2fs: reuse nids more aggressively Jaegeuk Kim
  2015-08-18  8:46 ` [PATCH 2/5] f2fs: fix to cover lock_op for update_inode_page Jaegeuk Kim
@ 2015-08-18  8:46 ` Jaegeuk Kim
  2015-08-18  8:46 ` [PATCH 4/5] f2fs: go out for insert_inode_locked failure Jaegeuk Kim
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: Jaegeuk Kim @ 2015-08-18  8:46 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-f2fs-devel; +Cc: Jaegeuk Kim

If FG_GC failed to reclaim one section, let's retry with another section
from the start, since we can get another good candidate.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 fs/f2fs/gc.c | 42 +++++++++++++++++++-----------------------
 1 file changed, 19 insertions(+), 23 deletions(-)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 0a5d573..c40dda3 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -391,7 +391,7 @@ static int check_valid_map(struct f2fs_sb_info *sbi,
  * On validity, copy that node with cold status, otherwise (invalid node)
  * ignore that.
  */
-static void gc_node_segment(struct f2fs_sb_info *sbi,
+static int gc_node_segment(struct f2fs_sb_info *sbi,
 		struct f2fs_summary *sum, unsigned int segno, int gc_type)
 {
 	bool initial = true;
@@ -411,7 +411,7 @@ next_step:
 
 		/* stop BG_GC if there is not enough free sections. */
 		if (gc_type == BG_GC && has_not_enough_free_secs(sbi, 0))
-			return;
+			return 0;
 
 		if (check_valid_map(sbi, segno, off) == 0)
 			continue;
@@ -461,13 +461,11 @@ next_step:
 		};
 		sync_node_pages(sbi, 0, &wbc);
 
-		/*
-		 * In the case of FG_GC, it'd be better to reclaim this victim
-		 * completely.
-		 */
-		if (get_valid_blocks(sbi, segno, 1) != 0)
-			goto next_step;
+		/* return 1 only if FG_GC successfully reclaimed one */
+		if (get_valid_blocks(sbi, segno, 1) == 0)
+			return 1;
 	}
+	return 0;
 }
 
 /*
@@ -649,7 +647,7 @@ out:
  * If the parent node is not valid or the data block address is different,
  * the victim data block is ignored.
  */
-static void gc_data_segment(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
+static int gc_data_segment(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
 		struct gc_inode_list *gc_list, unsigned int segno, int gc_type)
 {
 	struct super_block *sb = sbi->sb;
@@ -672,7 +670,7 @@ next_step:
 
 		/* stop BG_GC if there is not enough free sections. */
 		if (gc_type == BG_GC && has_not_enough_free_secs(sbi, 0))
-			return;
+			return 0;
 
 		if (check_valid_map(sbi, segno, off) == 0)
 			continue;
@@ -737,15 +735,11 @@ next_step:
 	if (gc_type == FG_GC) {
 		f2fs_submit_merged_bio(sbi, DATA, WRITE);
 
-		/*
-		 * In the case of FG_GC, it'd be better to reclaim this victim
-		 * completely.
-		 */
-		if (get_valid_blocks(sbi, segno, 1) != 0) {
-			phase = 2;
-			goto next_step;
-		}
+		/* return 1 only if FG_GC successfully reclaimed one */
+		if (get_valid_blocks(sbi, segno, 1) == 0)
+			return 1;
 	}
+	return 0;
 }
 
 static int __get_victim(struct f2fs_sb_info *sbi, unsigned int *victim,
@@ -761,12 +755,13 @@ static int __get_victim(struct f2fs_sb_info *sbi, unsigned int *victim,
 	return ret;
 }
 
-static void do_garbage_collect(struct f2fs_sb_info *sbi, unsigned int segno,
+static int do_garbage_collect(struct f2fs_sb_info *sbi, unsigned int segno,
 				struct gc_inode_list *gc_list, int gc_type)
 {
 	struct page *sum_page;
 	struct f2fs_summary_block *sum;
 	struct blk_plug plug;
+	int nfree = 0;
 
 	/* read segment summary of victim */
 	sum_page = get_sum_page(sbi, segno);
@@ -786,10 +781,11 @@ static void do_garbage_collect(struct f2fs_sb_info *sbi, unsigned int segno,
 
 	switch (GET_SUM_TYPE((&sum->footer))) {
 	case SUM_TYPE_NODE:
-		gc_node_segment(sbi, sum->entries, segno, gc_type);
+		nfree = gc_node_segment(sbi, sum->entries, segno, gc_type);
 		break;
 	case SUM_TYPE_DATA:
-		gc_data_segment(sbi, sum->entries, gc_list, segno, gc_type);
+		nfree = gc_data_segment(sbi, sum->entries, gc_list,
+							segno, gc_type);
 		break;
 	}
 	blk_finish_plug(&plug);
@@ -798,6 +794,7 @@ static void do_garbage_collect(struct f2fs_sb_info *sbi, unsigned int segno,
 	stat_inc_call_count(sbi->stat_info);
 
 	f2fs_put_page(sum_page, 0);
+	return nfree;
 }
 
 int f2fs_gc(struct f2fs_sb_info *sbi)
@@ -836,11 +833,10 @@ gc_more:
 								META_SSA);
 
 	for (i = 0; i < sbi->segs_per_sec; i++)
-		do_garbage_collect(sbi, segno + i, &gc_list, gc_type);
+		nfree += do_garbage_collect(sbi, segno + i, &gc_list, gc_type);
 
 	if (gc_type == FG_GC) {
 		sbi->cur_victim_sec = NULL_SEGNO;
-		nfree++;
 		WARN_ON(get_valid_blocks(sbi, segno, sbi->segs_per_sec));
 	}
 
-- 
2.1.1
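
A small userspace sketch of the accounting change, under stated assumptions.
Before this patch f2fs_gc incremented nfree unconditionally after a
foreground pass; afterwards each per-segment helper reports 1 only when
foreground GC left no valid blocks, and the caller sums those results. The
names below are hypothetical, and valid_after is an input that stands in for
what get_valid_blocks() would report once migration was attempted.

```c
#include <assert.h>

enum gc_type_sketch { BG_GC_SKETCH, FG_GC_SKETCH };

/*
 * Models gc_node_segment()/gc_data_segment() after the patch.
 * Returns 1 only when foreground GC emptied the segment.
 */
static int gc_segment_sketch(enum gc_type_sketch type, int valid_after)
{
	assert(valid_after >= 0);
	if (type == FG_GC_SKETCH && valid_after == 0)
		return 1;
	return 0;
}

/* Models the loop in f2fs_gc(): sum the per-segment results of a section. */
static int gc_section_sketch(enum gc_type_sketch type,
			     const int *valid_after, int segs_per_sec)
{
	int nfree = 0;

	for (int i = 0; i < segs_per_sec; i++)
		nfree += gc_segment_sketch(type, valid_after[i]);
	return nfree;
}
```

A section whose segments still hold valid blocks now contributes less to
nfree, which is what lets the caller decide to retry with another victim.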



* [PATCH 4/5] f2fs: go out for insert_inode_locked failure
  2015-08-18  8:46 [PATCH 1/5] f2fs: reuse nids more aggressively Jaegeuk Kim
  2015-08-18  8:46 ` [PATCH 2/5] f2fs: fix to cover lock_op for update_inode_page Jaegeuk Kim
  2015-08-18  8:46 ` [PATCH 3/5] f2fs: retry gc if one section is not successfully reclaimed Jaegeuk Kim
@ 2015-08-18  8:46 ` Jaegeuk Kim
  2015-08-20  9:11   ` [f2fs-dev] " Chao Yu
  2015-08-18  8:46 ` [PATCH 5/5] f2fs: check the node block address of newly allocated nid Jaegeuk Kim
  2015-08-20  9:01 ` [f2fs-dev] [PATCH 1/5] f2fs: reuse nids more aggressively Chao Yu
  4 siblings, 1 reply; 13+ messages in thread
From: Jaegeuk Kim @ 2015-08-18  8:46 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-f2fs-devel; +Cc: Jaegeuk Kim

We should not call unlock_new_inode when insert_inode_locked has failed.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 fs/f2fs/namei.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c
index 97e97c4..a680bf3 100644
--- a/fs/f2fs/namei.c
+++ b/fs/f2fs/namei.c
@@ -53,7 +53,7 @@ static struct inode *f2fs_new_inode(struct inode *dir, umode_t mode)
 	if (err) {
 		err = -EINVAL;
 		nid_free = true;
-		goto out;
+		goto fail;
 	}
 
 	/* If the directory encrypted, then we should encrypt the inode. */
@@ -75,9 +75,6 @@ static struct inode *f2fs_new_inode(struct inode *dir, umode_t mode)
 	mark_inode_dirty(inode);
 	return inode;
 
-out:
-	clear_nlink(inode);
-	unlock_new_inode(inode);
 fail:
 	trace_f2fs_new_inode(inode, err);
 	make_bad_inode(inode);
-- 
2.1.1
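
A minimal sketch of the error path after this patch, assuming (as the commit
message implies) that a failed insert leaves nothing to unlock. The struct,
the insert_ok input and the -1 return value are hypothetical simplifications;
the real function returns an error pointer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few inode bits this error path touches. */
struct inode_sketch {
	bool is_new;	/* set only when the hash insert succeeded */
	bool is_bad;	/* set by the failure path */
};

/*
 * Models the tail of f2fs_new_inode() after the patch. insert_ok stands in
 * for insert_inode_locked() returning 0.
 */
static int new_inode_sketch(struct inode_sketch *inode, bool insert_ok)
{
	assert(inode);

	if (!insert_ok)
		goto fail;	/* was "goto out", which unlocked first */

	inode->is_new = true;
	return 0;

fail:
	/* No unlock here: the failed insert never marked the inode new. */
	inode->is_bad = true;
	return -1;
}
```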



* [PATCH 5/5] f2fs: check the node block address of newly allocated nid
  2015-08-18  8:46 [PATCH 1/5] f2fs: reuse nids more aggressively Jaegeuk Kim
                   ` (2 preceding siblings ...)
  2015-08-18  8:46 ` [PATCH 4/5] f2fs: go out for insert_inode_locked failure Jaegeuk Kim
@ 2015-08-18  8:46 ` Jaegeuk Kim
  2015-08-20  9:12   ` [f2fs-dev] " Chao Yu
  2015-08-20  9:01 ` [f2fs-dev] [PATCH 1/5] f2fs: reuse nids more aggressively Chao Yu
  4 siblings, 1 reply; 13+ messages in thread
From: Jaegeuk Kim @ 2015-08-18  8:46 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-f2fs-devel; +Cc: Jaegeuk Kim

This patch adds a check of the block address of a newly allocated nid.
If a nid has already been allocated by another thread due to subtle data
races, it will result in filesystem corruption.
So, as a last chance, we need to check whether its block address was already
allocated before the nid is handed out.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 fs/f2fs/node.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 3cc32b8..6bef5a2 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1573,6 +1573,8 @@ retry:
 
 	/* We should not use stale free nids created by build_free_nids */
 	if (nm_i->fcnt && !on_build_free_nids(nm_i)) {
+		struct node_info ni;
+
 		f2fs_bug_on(sbi, list_empty(&nm_i->free_nid_list));
 		list_for_each_entry(i, &nm_i->free_nid_list, list)
 			if (i->state == NID_NEW)
@@ -1583,6 +1585,13 @@ retry:
 		i->state = NID_ALLOC;
 		nm_i->fcnt--;
 		spin_unlock(&nm_i->free_nid_list_lock);
+
+		/* check nid is allocated already */
+		get_node_info(sbi, *nid, &ni);
+		if (ni.blk_addr != NULL_ADDR) {
+			alloc_nid_done(sbi, *nid);
+			goto retry;
+		}
 		return true;
 	}
 	spin_unlock(&nm_i->free_nid_list_lock);
-- 
2.1.1
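
A userspace sketch of the check-and-retry added here, for illustration only.
The list is a plain array, blk_addr_of[] is a hypothetical table standing in
for what get_node_info() would report, and NID_GONE models alloc_nid_done()
dropping the entry from the free list. Locking is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NULL_ADDR_SKETCH 0u

enum nid_state_sketch { NID_NEW, NID_ALLOC, NID_GONE };

struct free_nid_sketch {
	uint32_t nid;
	enum nid_state_sketch state;
};

/* Returns true and stores a usable nid, or false if none is cached. */
static bool alloc_nid_sketch(struct free_nid_sketch *list, size_t n,
			     const uint32_t *blk_addr_of, uint32_t *nid)
{
	assert(list && blk_addr_of && nid);
retry:
	for (size_t i = 0; i < n; i++) {
		if (list[i].state != NID_NEW)
			continue;
		list[i].state = NID_ALLOC;
		*nid = list[i].nid;
		if (blk_addr_of[*nid] != NULL_ADDR_SKETCH) {
			/* Stale entry: the nid already maps to a block. */
			list[i].state = NID_GONE;
			goto retry;
		}
		return true;
	}
	return false;
}
```

If the cache holds nids 5 and 6 and nid 5 already has a block address, the
call discards 5 and returns 6.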



* RE: [f2fs-dev] [PATCH 1/5] f2fs: reuse nids more aggressively
  2015-08-18  8:46 [PATCH 1/5] f2fs: reuse nids more aggressively Jaegeuk Kim
                   ` (3 preceding siblings ...)
  2015-08-18  8:46 ` [PATCH 5/5] f2fs: check the node block address of newly allocated nid Jaegeuk Kim
@ 2015-08-20  9:01 ` Chao Yu
  4 siblings, 0 replies; 13+ messages in thread
From: Chao Yu @ 2015-08-20  9:01 UTC (permalink / raw)
  To: 'Jaegeuk Kim', linux-kernel, linux-fsdevel, linux-f2fs-devel

> -----Original Message-----
> From: Jaegeuk Kim [mailto:jaegeuk@kernel.org]
> Sent: Tuesday, August 18, 2015 4:46 PM
> To: linux-kernel@vger.kernel.org; linux-fsdevel@vger.kernel.org;
> linux-f2fs-devel@lists.sourceforge.net
> Cc: Jaegeuk Kim
> Subject: [f2fs-dev] [PATCH 1/5] f2fs: reuse nids more aggressively
> 
> If we can reuse nids as many as possible, we can mitigate producing obsolete
> node pages in the page cache.
> 
> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

Reviewed-by: Chao Yu <chao2.yu@samsung.com>



* RE: [f2fs-dev] [PATCH 2/5] f2fs: fix to cover lock_op for update_inode_page
  2015-08-18  8:46 ` [PATCH 2/5] f2fs: fix to cover lock_op for update_inode_page Jaegeuk Kim
@ 2015-08-20  9:09   ` Chao Yu
  0 siblings, 0 replies; 13+ messages in thread
From: Chao Yu @ 2015-08-20  9:09 UTC (permalink / raw)
  To: 'Jaegeuk Kim', linux-kernel, linux-fsdevel, linux-f2fs-devel

> -----Original Message-----
> From: Jaegeuk Kim [mailto:jaegeuk@kernel.org]
> Sent: Tuesday, August 18, 2015 4:46 PM
> To: linux-kernel@vger.kernel.org; linux-fsdevel@vger.kernel.org;
> linux-f2fs-devel@lists.sourceforge.net
> Cc: Jaegeuk Kim
> Subject: [f2fs-dev] [PATCH 2/5] f2fs: fix to cover lock_op for update_inode_page
> 
> Previously, update_inode_page is not called under f2fs_lock_op.
> Instead we should call with f2fs_write_inode.
> 
> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

Reviewed-by: Chao Yu <chao2.yu@samsung.com>



* RE: [f2fs-dev] [PATCH 4/5] f2fs: go out for insert_inode_locked failure
  2015-08-18  8:46 ` [PATCH 4/5] f2fs: go out for insert_inode_locked failure Jaegeuk Kim
@ 2015-08-20  9:11   ` Chao Yu
  0 siblings, 0 replies; 13+ messages in thread
From: Chao Yu @ 2015-08-20  9:11 UTC (permalink / raw)
  To: 'Jaegeuk Kim', linux-kernel, linux-fsdevel, linux-f2fs-devel

> -----Original Message-----
> From: Jaegeuk Kim [mailto:jaegeuk@kernel.org]
> Sent: Tuesday, August 18, 2015 4:46 PM
> To: linux-kernel@vger.kernel.org; linux-fsdevel@vger.kernel.org;
> linux-f2fs-devel@lists.sourceforge.net
> Cc: Jaegeuk Kim
> Subject: [f2fs-dev] [PATCH 4/5] f2fs: go out for insert_inode_locked failure
> 
> We should not call unlock_new_inode when insert_inode_locked failed.
> 
> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

Reviewed-by: Chao Yu <chao2.yu@samsung.com>



* RE: [f2fs-dev] [PATCH 5/5] f2fs: check the node block address of newly allocated nid
  2015-08-18  8:46 ` [PATCH 5/5] f2fs: check the node block address of newly allocated nid Jaegeuk Kim
@ 2015-08-20  9:12   ` Chao Yu
  2015-08-20 15:35     ` Jaegeuk Kim
  0 siblings, 1 reply; 13+ messages in thread
From: Chao Yu @ 2015-08-20  9:12 UTC (permalink / raw)
  To: 'Jaegeuk Kim', linux-kernel, linux-fsdevel, linux-f2fs-devel

Hi Jaegeuk,

> -----Original Message-----
> From: Jaegeuk Kim [mailto:jaegeuk@kernel.org]
> Sent: Tuesday, August 18, 2015 4:46 PM
> To: linux-kernel@vger.kernel.org; linux-fsdevel@vger.kernel.org;
> linux-f2fs-devel@lists.sourceforge.net
> Cc: Jaegeuk Kim
> Subject: [f2fs-dev] [PATCH 5/5] f2fs: check the node block address of newly allocated nid
> 
> This patch adds a routine which checks the block address of newly allocated nid.
> If an nid has already allocated by other thread due to subtle data races, it
> will result in filesystem corruption.
> So, it needs to check whether its block address was already allocated or not
> in prior to nid allocation as the last chance.
> 
> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> ---
>  fs/f2fs/node.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> index 3cc32b8..6bef5a2 100644
> --- a/fs/f2fs/node.c
> +++ b/fs/f2fs/node.c
> @@ -1573,6 +1573,8 @@ retry:
> 
>  	/* We should not use stale free nids created by build_free_nids */
>  	if (nm_i->fcnt && !on_build_free_nids(nm_i)) {
> +		struct node_info ni;
> +
>  		f2fs_bug_on(sbi, list_empty(&nm_i->free_nid_list));
>  		list_for_each_entry(i, &nm_i->free_nid_list, list)
>  			if (i->state == NID_NEW)
> @@ -1583,6 +1585,13 @@ retry:
>  		i->state = NID_ALLOC;
>  		nm_i->fcnt--;
>  		spin_unlock(&nm_i->free_nid_list_lock);
> +
> +		/* check nid is allocated already */
> +		get_node_info(sbi, *nid, &ni);
> +		if (ni.blk_addr != NULL_ADDR) {

I don't get it: why would a free nid have a non-NULL blkaddr?
Could you please explain more about this?

> +			alloc_nid_done(sbi, *nid);

Will another thread call alloc_nid_done too, causing this free nid to be
released again?

Thanks,

> +			goto retry;
> +		}
>  		return true;
>  	}
>  	spin_unlock(&nm_i->free_nid_list_lock);
> --
> 2.1.1
> 
> 
> ------------------------------------------------------------------------------
> _______________________________________________
> Linux-f2fs-devel mailing list
> Linux-f2fs-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel



* Re: [f2fs-dev] [PATCH 5/5] f2fs: check the node block address of newly allocated nid
  2015-08-20  9:12   ` [f2fs-dev] " Chao Yu
@ 2015-08-20 15:35     ` Jaegeuk Kim
  2015-08-21 12:48       ` Chao Yu
  0 siblings, 1 reply; 13+ messages in thread
From: Jaegeuk Kim @ 2015-08-20 15:35 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-kernel, linux-fsdevel, linux-f2fs-devel

On Thu, Aug 20, 2015 at 05:12:03PM +0800, Chao Yu wrote:
> Hi Jaegeuk,
> 
> > -----Original Message-----
> > From: Jaegeuk Kim [mailto:jaegeuk@kernel.org]
> > Sent: Tuesday, August 18, 2015 4:46 PM
> > To: linux-kernel@vger.kernel.org; linux-fsdevel@vger.kernel.org;
> > linux-f2fs-devel@lists.sourceforge.net
> > Cc: Jaegeuk Kim
> > Subject: [f2fs-dev] [PATCH 5/5] f2fs: check the node block address of newly allocated nid
> > 
> > This patch adds a routine which checks the block address of newly allocated nid.
> > If an nid has already allocated by other thread due to subtle data races, it
> > will result in filesystem corruption.
> > So, it needs to check whether its block address was already allocated or not
> > in prior to nid allocation as the last chance.
> > 
> > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> > ---
> >  fs/f2fs/node.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> > 
> > diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> > index 3cc32b8..6bef5a2 100644
> > --- a/fs/f2fs/node.c
> > +++ b/fs/f2fs/node.c
> > @@ -1573,6 +1573,8 @@ retry:
> > 
> >  	/* We should not use stale free nids created by build_free_nids */
> >  	if (nm_i->fcnt && !on_build_free_nids(nm_i)) {
> > +		struct node_info ni;
> > +
> >  		f2fs_bug_on(sbi, list_empty(&nm_i->free_nid_list));
> >  		list_for_each_entry(i, &nm_i->free_nid_list, list)
> >  			if (i->state == NID_NEW)
> > @@ -1583,6 +1585,13 @@ retry:
> >  		i->state = NID_ALLOC;
> >  		nm_i->fcnt--;
> >  		spin_unlock(&nm_i->free_nid_list_lock);
> > +
> > +		/* check nid is allocated already */
> > +		get_node_info(sbi, *nid, &ni);
> > +		if (ni.blk_addr != NULL_ADDR) {
> 
> I didn't get it, why free nid is with non-NULL blkaddr?
> Could you please explain more about this?

As I wrote in the description, I've been suffering from wrongly added free nids
which result in fs corruption. I suspect some race condition in
build_free_nids, but it is too subtle to pin down exactly.
So, I wrote this patch to fix that.

The concern would be the performance cost of a cold cache miss on a NAT entry.
However, I expect that it would be tolerable since get_node_info will be called
after alloc_nid later anyway.

> 
> > +			alloc_nid_done(sbi, *nid);
> 
> Will another thread call alloc_nid_done too, making this free nid being
> released again?

No, its state became NID_ALLOC, so no other thread can pick this up till
alloc_nid_done is called.

Thanks,
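
The exclusivity argument in this reply can be sketched in userspace as
follows. This is only an illustration with hypothetical names: the kernel
flips the state while holding free_nid_list_lock, whereas the sketch is
single-threaded and just shows that an entry already in the ALLOC state is
never handed out a second time.

```c
#include <assert.h>

enum nid_state_sketch { NID_NEW_SKETCH, NID_ALLOC_SKETCH };

struct free_nid_sketch {
	unsigned int nid;
	enum nid_state_sketch state;
};

/*
 * Hand out the first NEW entry and flip it to ALLOC.
 * Returns the index of the chosen entry, or -1 if none is available.
 */
static int pick_free_nid_sketch(struct free_nid_sketch *list, int n)
{
	assert(list && n >= 0);
	for (int i = 0; i < n; i++) {
		if (list[i].state == NID_NEW_SKETCH) {
			list[i].state = NID_ALLOC_SKETCH;
			return i;
		}
	}
	return -1;
}
```

Two consecutive picks return different entries, and a third pick on a
two-entry list finds nothing.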

> 
> Thanks,
> 
> > +			goto retry;
> > +		}
> >  		return true;
> >  	}
> >  	spin_unlock(&nm_i->free_nid_list_lock);
> > --
> > 2.1.1
> > 
> > 


* RE: [f2fs-dev] [PATCH 5/5] f2fs: check the node block address of newly allocated nid
  2015-08-20 15:35     ` Jaegeuk Kim
@ 2015-08-21 12:48       ` Chao Yu
  2015-08-21 14:59         ` Chao Yu
  0 siblings, 1 reply; 13+ messages in thread
From: Chao Yu @ 2015-08-21 12:48 UTC (permalink / raw)
  To: 'Jaegeuk Kim'; +Cc: linux-kernel, linux-fsdevel, linux-f2fs-devel

Hi Jaegeuk,

> -----Original Message-----
> From: Jaegeuk Kim [mailto:jaegeuk@kernel.org]
> Sent: Thursday, August 20, 2015 11:35 PM
> To: Chao Yu
> Cc: linux-kernel@vger.kernel.org; linux-fsdevel@vger.kernel.org;
> linux-f2fs-devel@lists.sourceforge.net
> Subject: Re: [f2fs-dev] [PATCH 5/5] f2fs: check the node block address of newly allocated nid
> 
> On Thu, Aug 20, 2015 at 05:12:03PM +0800, Chao Yu wrote:
> > Hi Jaegeuk,
> >
> > > -----Original Message-----
> > > From: Jaegeuk Kim [mailto:jaegeuk@kernel.org]
> > > Sent: Tuesday, August 18, 2015 4:46 PM
> > > To: linux-kernel@vger.kernel.org; linux-fsdevel@vger.kernel.org;
> > > linux-f2fs-devel@lists.sourceforge.net
> > > Cc: Jaegeuk Kim
> > > Subject: [f2fs-dev] [PATCH 5/5] f2fs: check the node block address of newly allocated nid
> > >
> > > This patch adds a routine which checks the block address of newly allocated nid.
> > > If an nid has already allocated by other thread due to subtle data races, it
> > > will result in filesystem corruption.
> > > So, it needs to check whether its block address was already allocated or not
> > > in prior to nid allocation as the last chance.
> > >
> > > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> > > ---
> > >  fs/f2fs/node.c | 9 +++++++++
> > >  1 file changed, 9 insertions(+)
> > >
> > > diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> > > index 3cc32b8..6bef5a2 100644
> > > --- a/fs/f2fs/node.c
> > > +++ b/fs/f2fs/node.c
> > > @@ -1573,6 +1573,8 @@ retry:
> > >
> > >  	/* We should not use stale free nids created by build_free_nids */
> > >  	if (nm_i->fcnt && !on_build_free_nids(nm_i)) {
> > > +		struct node_info ni;
> > > +
> > >  		f2fs_bug_on(sbi, list_empty(&nm_i->free_nid_list));
> > >  		list_for_each_entry(i, &nm_i->free_nid_list, list)
> > >  			if (i->state == NID_NEW)
> > > @@ -1583,6 +1585,13 @@ retry:
> > >  		i->state = NID_ALLOC;
> > >  		nm_i->fcnt--;
> > >  		spin_unlock(&nm_i->free_nid_list_lock);
> > > +
> > > +		/* check nid is allocated already */
> > > +		get_node_info(sbi, *nid, &ni);
> > > +		if (ni.blk_addr != NULL_ADDR) {
> >
> > I didn't get it, why free nid is with non-NULL blkaddr?
> > Could you please explain more about this?
> 
> As I wrote in the description, I've been suffering from wrongly added free nids
> which results in fs corruption. I suspect somewhat race condition in
> build_free_nids, but it is very subtle to figure out exactly.
> So, I wrote this patch to fix that.
> 
> The concern would be performance regarding to cold cache miss at an NAT entry.
> However, I expect that it would be tolerable since get_node_info will be called
> after alloc_nid later.

After investigating, I think I can reproduce this bug:

1. touch a (nid = 4) & touch b (nid = 5)
2. sync
3. rm a & rm b
 a) rm a to make next_scan_nid = 4.
 b) I changed the f2fs code so that remove_inode_page fails when
file b is being removed, so file b's nat entry is not set dirty;
4. sync
5. touch 1815 files
6. echo 3 > /proc/sys/vm/drop_caches
 this drops the clean nat entry of the inode (nid:5), which lets nid 5 pass
the blkaddr verification in add_free_nid:
	if (build) {
		/* do not add allocated nids */

7. touch c
 because there are no free nids in the cache, we try to build the cache in
 two steps:
 a) build nids by loading from nat pages;
 b) build nids by loading from curseg, and try to unload nids which have a
valid blkaddr in curseg.

 Unfortunately, our build operation is not atomic, so after step a), nid:5
 is in the free nids cache until it is removed in step b). So any free nid
 allocated between step a) and step b) is at risk of being allocated
 incorrectly.

If I'm not missing something, the root cause looks like our recent change:
allocate free nid aggressively.

Thanks,
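
The two build steps in step 7 can be sketched in userspace like this. It is
an illustration with hypothetical names and a tiny fixed nid range: step a
treats every nid without a block address in the NAT pages as free, and step b
lets the newer journal entries override that view. The sketch only shows why
the state between the two steps is unsafe to allocate from; it does not show
that the kernel actually allocates in that window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NULL_ADDR_SKETCH 0u
#define MAX_NID_SKETCH 16

struct journal_entry_sketch {
	uint32_t nid;
	uint32_t blk_addr;
};

/* Step a: every nid whose NAT-page entry has no block address looks free. */
static void scan_nat_pages(const uint32_t *nat_blk, bool *is_free)
{
	for (uint32_t nid = 0; nid < MAX_NID_SKETCH; nid++)
		is_free[nid] = (nat_blk[nid] == NULL_ADDR_SKETCH);
}

/* Step b: journal entries are newer than the NAT pages and override them. */
static void apply_journal(const struct journal_entry_sketch *j, int n,
			  bool *is_free)
{
	for (int i = 0; i < n; i++) {
		assert(j[i].nid < MAX_NID_SKETCH);
		is_free[j[i].nid] = (j[i].blk_addr == NULL_ADDR_SKETCH);
	}
}
```

With nids 4 and 5 empty in the NAT pages and a journal entry giving nid 5 a
valid address, nid 5 looks free after step a and is no longer free after
step b.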
> 
> >
> > > +			alloc_nid_done(sbi, *nid);
> >
> > Will another thread call alloc_nid_done too, making this free nid being
> > released again?
> 
> No, its state became NID_ALLOC, so no other thread can pick this up till
> alloc_nid_done is called.
> 
> Thanks,
> 
> >
> > Thanks,
> >
> > > +			goto retry;
> > > +		}
> > >  		return true;
> > >  	}
> > >  	spin_unlock(&nm_i->free_nid_list_lock);
> > > --
> > > 2.1.1
> > >
> > >



* Re: [f2fs-dev] [PATCH 5/5] f2fs: check the node block address of newly allocated nid
  2015-08-21 12:48       ` Chao Yu
@ 2015-08-21 14:59         ` Chao Yu
  2015-08-24  9:38           ` Chao Yu
  0 siblings, 1 reply; 13+ messages in thread
From: Chao Yu @ 2015-08-21 14:59 UTC (permalink / raw)
  To: Chao Yu; +Cc: Jaegeuk Kim, linux-fsdevel, linux-kernel, linux-f2fs-devel

> On Aug 21, 2015, at 8:48 PM, Chao Yu <chao2.yu@samsung.com> wrote:
> 
> Hi Jaegeuk,
> 
>> -----Original Message-----
>> From: Jaegeuk Kim [mailto:jaegeuk@kernel.org]
>> Sent: Thursday, August 20, 2015 11:35 PM
>> To: Chao Yu
>> Cc: linux-kernel@vger.kernel.org; linux-fsdevel@vger.kernel.org;
>> linux-f2fs-devel@lists.sourceforge.net
>> Subject: Re: [f2fs-dev] [PATCH 5/5] f2fs: check the node block address of newly allocated nid
>> 
>> On Thu, Aug 20, 2015 at 05:12:03PM +0800, Chao Yu wrote:
>>> Hi Jaegeuk,
>>> 
>>>> -----Original Message-----
>>>> From: Jaegeuk Kim [mailto:jaegeuk@kernel.org]
>>>> Sent: Tuesday, August 18, 2015 4:46 PM
>>>> To: linux-kernel@vger.kernel.org; linux-fsdevel@vger.kernel.org;
>>>> linux-f2fs-devel@lists.sourceforge.net
>>>> Cc: Jaegeuk Kim
>>>> Subject: [f2fs-dev] [PATCH 5/5] f2fs: check the node block address of newly allocated nid
>>>> 
>>>> This patch adds a routine which checks the block address of newly allocated nid.
>>>> If an nid has already allocated by other thread due to subtle data races, it
>>>> will result in filesystem corruption.
>>>> So, it needs to check whether its block address was already allocated or not
>>>> in prior to nid allocation as the last chance.
>>>> 
>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>>> ---
>>>> fs/f2fs/node.c | 9 +++++++++
>>>> 1 file changed, 9 insertions(+)
>>>> 
>>>> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
>>>> index 3cc32b8..6bef5a2 100644
>>>> --- a/fs/f2fs/node.c
>>>> +++ b/fs/f2fs/node.c
>>>> @@ -1573,6 +1573,8 @@ retry:
>>>> 
>>>> 	/* We should not use stale free nids created by build_free_nids */
>>>> 	if (nm_i->fcnt && !on_build_free_nids(nm_i)) {
>>>> +		struct node_info ni;
>>>> +
>>>> 		f2fs_bug_on(sbi, list_empty(&nm_i->free_nid_list));
>>>> 		list_for_each_entry(i, &nm_i->free_nid_list, list)
>>>> 			if (i->state == NID_NEW)
>>>> @@ -1583,6 +1585,13 @@ retry:
>>>> 		i->state = NID_ALLOC;
>>>> 		nm_i->fcnt--;
>>>> 		spin_unlock(&nm_i->free_nid_list_lock);
>>>> +
>>>> +		/* check nid is allocated already */
>>>> +		get_node_info(sbi, *nid, &ni);
>>>> +		if (ni.blk_addr != NULL_ADDR) {
>>> 
>>> I didn't get it, why free nid is with non-NULL blkaddr?
>>> Could you please explain more about this?
>> 
>> As I wrote in the description, I've been suffering from wrongly added free nids
>> which results in fs corruption. I suspect somewhat race condition in
>> build_free_nids, but it is very subtle to figure out exactly.
>> So, I wrote this patch to fix that.
>> 
>> The concern would be performance regarding to cold cache miss at an NAT entry.
>> However, I expect that it would be tolerable since get_node_info will be called
>> after alloc_nid later.
> 
> After investigating, I think I can reproduce this bug:
> 
> 1. touch a (nid = 4) & touch b (nid = 5)
> 2. sync
> 3. rm a & rm b
> a) rm a to make next_scan_nid = 4.
> b) I change the logical of f2fs code making remove_inode_page failed when
> file b is being removed, so file b's nat entry is not set dirty;
> 4. sync
> 5. touch 1815 files
> 6. echo 3 > /proc/sys/vm/drop_caches
> drop clean nat entry of inode (nid:5), it makes we can pass blkaddr
> verification in add_free_nid:
> 	if (build) {
> 		/* do not add allocated nids */
> 
> 7. touch c
> because there is no free nids in cache, we try to build cache by two steps:
> a) build nids by loading from nat pages;
> b) build nids by loading from curseg and try to unload nids which has valid
> blkaddr in curseg.
> 
> unfortunately, our build operation is not atomic, so after step a), nid:5

After rethinking this issue on my way home, I find that this does not seem
right, because we check the build_lock status in on_build_free_nids, so
allocation will not happen while the free nid cache is being built. I missed
that previously.

Sorry for my wrong conclusion, please ignore it. :(

I’d like to reinvestigate this issue.

Thanks,
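
The gating described in this reply follows the shape of the condition quoted
from alloc_nid, "nm_i->fcnt && !on_build_free_nids(nm_i)". A userspace sketch
with hypothetical names, where a plain flag stands in for build_lock being
held:

```c
#include <assert.h>
#include <stdbool.h>

struct nm_sketch {
	bool build_in_progress;	/* models build_lock being held */
	int fcnt;		/* number of cached free nids */
};

static bool on_build_sketch(const struct nm_sketch *nm)
{
	return nm->build_in_progress;
}

/* Fast path: take a cached nid only when no build is running. */
static bool try_alloc_cached(struct nm_sketch *nm)
{
	assert(nm);
	if (nm->fcnt > 0 && !on_build_sketch(nm)) {
		nm->fcnt--;
		return true;
	}
	return false;
}
```

While the flag is set the cache is left untouched, even if it holds free
nids; once it is cleared the next call succeeds.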

> should be in free nids cache and it should be removed in step b). So all
> free nids allocated between step a) and step b) can be risky of incorrect
> allocation.
> 
> If I'm not miss something, the root casue looks like our recent change:
> allocate free nid aggressively.
> 
> Thanks,
>> 
>>> 
>>>> +			alloc_nid_done(sbi, *nid);
>>> 
>>> Will another thread call alloc_nid_done too, making this free nid being
>>> released again?
>> 
>> No, its state became NID_ALLOC, so no other thread can pick this up till
>> alloc_nid_done is called.
>> 
>> Thanks,
>> 
>>> 
>>> Thanks,
>>> 
>>>> +			goto retry;
>>>> +		}
>>>> 		return true;
>>>> 	}
>>>> 	spin_unlock(&nm_i->free_nid_list_lock);
>>>> --
>>>> 2.1.1
>>>> 
>>>> 
> 
> 



* RE: [f2fs-dev] [PATCH 5/5] f2fs: check the node block address of newly allocated nid
  2015-08-21 14:59         ` Chao Yu
@ 2015-08-24  9:38           ` Chao Yu
  0 siblings, 0 replies; 13+ messages in thread
From: Chao Yu @ 2015-08-24  9:38 UTC (permalink / raw)
  To: 'Jaegeuk Kim'; +Cc: linux-fsdevel, linux-kernel, linux-f2fs-devel

Hi Jaegeuk,

> -----Original Message-----
> From: Chao Yu [mailto:yuchaochina@hotmail.com]
> Sent: Friday, August 21, 2015 11:00 PM
> To: Chao Yu
> Cc: Jaegeuk Kim; linux-fsdevel@vger.kernel.org; linux-kernel@vger.kernel.org;
> linux-f2fs-devel@lists.sourceforge.net
> Subject: Re: [f2fs-dev] [PATCH 5/5] f2fs: check the node block address of newly allocated nid
> 
> > On Aug 21, 2015, at 8:48 PM, Chao Yu <chao2.yu@samsung.com> wrote:
> >
> > Hi Jaegeuk,
> >
> >> -----Original Message-----
> >> From: Jaegeuk Kim [mailto:jaegeuk@kernel.org]
> >> Sent: Thursday, August 20, 2015 11:35 PM
> >> To: Chao Yu
> >> Cc: linux-kernel@vger.kernel.org; linux-fsdevel@vger.kernel.org;
> >> linux-f2fs-devel@lists.sourceforge.net
> >> Subject: Re: [f2fs-dev] [PATCH 5/5] f2fs: check the node block address of newly allocated nid
> >>
> >> On Thu, Aug 20, 2015 at 05:12:03PM +0800, Chao Yu wrote:
> >>> Hi Jaegeuk,
> >>>
> >>>> -----Original Message-----
> >>>> From: Jaegeuk Kim [mailto:jaegeuk@kernel.org]
> >>>> Sent: Tuesday, August 18, 2015 4:46 PM
> >>>> To: linux-kernel@vger.kernel.org; linux-fsdevel@vger.kernel.org;
> >>>> linux-f2fs-devel@lists.sourceforge.net
> >>>> Cc: Jaegeuk Kim
> >>>> Subject: [f2fs-dev] [PATCH 5/5] f2fs: check the node block address of newly allocated nid
> >>>>
> >>>> This patch adds a routine which checks the block address of newly allocated nid.
> >>>> If an nid has already been allocated by another thread due to subtle data
> >>>> races, it will result in filesystem corruption.
> >>>> So, as a last chance, we need to check whether its block address was already
> >>>> allocated or not prior to nid allocation.
> >>>>
> >>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> >>>> ---
> >>>> fs/f2fs/node.c | 9 +++++++++
> >>>> 1 file changed, 9 insertions(+)
> >>>>
> >>>> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> >>>> index 3cc32b8..6bef5a2 100644
> >>>> --- a/fs/f2fs/node.c
> >>>> +++ b/fs/f2fs/node.c
> >>>> @@ -1573,6 +1573,8 @@ retry:
> >>>>
> >>>> 	/* We should not use stale free nids created by build_free_nids */
> >>>> 	if (nm_i->fcnt && !on_build_free_nids(nm_i)) {
> >>>> +		struct node_info ni;
> >>>> +
> >>>> 		f2fs_bug_on(sbi, list_empty(&nm_i->free_nid_list));
> >>>> 		list_for_each_entry(i, &nm_i->free_nid_list, list)
> >>>> 			if (i->state == NID_NEW)
> >>>> @@ -1583,6 +1585,13 @@ retry:
> >>>> 		i->state = NID_ALLOC;
> >>>> 		nm_i->fcnt--;
> >>>> 		spin_unlock(&nm_i->free_nid_list_lock);
> >>>> +
> >>>> +		/* check nid is allocated already */
> >>>> +		get_node_info(sbi, *nid, &ni);
> >>>> +		if (ni.blk_addr != NULL_ADDR) {
> >>>
> >>> I didn't get it, why free nid is with non-NULL blkaddr?
> >>> Could you please explain more about this?
> >>
> >> As I wrote in the description, I've been suffering from wrongly added free nids,
> >> which result in fs corruption. I suspect some race condition in
> >> build_free_nids, but it is too subtle to pin down exactly.
> >> So, I wrote this patch to fix that.
> >>
> >> The concern would be performance, namely a cold cache miss on a NAT entry.
> >> However, I expect that it would be tolerable since get_node_info will be called
> >> after alloc_nid later anyway.
> >
> > After investigating, I think I can reproduce this bug:
> >
> > 1. touch a (nid = 4) & touch b (nid = 5)
> > 2. sync
> > 3. rm a & rm b
> > a) rm a to make next_scan_nid = 4.
> > b) I changed the f2fs code so that remove_inode_page fails when
> > file b is being removed, so file b's nat entry is not set dirty;
> > 4. sync
> > 5. touch 1815 files
> > 6. echo 3 > /proc/sys/vm/drop_caches
> > this drops the clean nat entry of the inode (nid:5), which lets us pass the
> > blkaddr verification in add_free_nid:
> > 	if (build) {
> > 		/* do not add allocated nids */
> >
> > 7. touch c
> > because there are no free nids in the cache, we try to build the cache in two steps:
> > a) build nids by loading from nat pages;
> > b) build nids by loading from curseg and try to unload nids which have a valid
> > blkaddr in curseg.
> >
> > unfortunately, our build operation is not atomic, so after step a), nid:5
> 
> After rethinking this issue on my way back home, I find that the analysis
> above does not seem right: because we check the build_lock status in
> on_build_free_nids, allocation will not happen while the free nid cache
> is being built. I missed that previously.
> 
> Sorry for my wrong conclusion, please ignore it. :(
> 
> I’d like to reinvestigate this issue.

I reinvestigated this issue and found one possible call path that reproduces
it. I have written patches to fix it; could you please help review the
following patches?

Thanks,
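
As a reading aid for the check discussed in PATCH 5/5 above, here is a minimal
user-space sketch in C. It is not the kernel code. It assumes a toy NAT array
and a toy free-nid stack; every toy_* name and TOY_MAX_NID are hypothetical,
and only NULL_ADDR (a zero block address meaning "no block assigned") mirrors
the thread. Skipping a stale entry and trying the next one is a policy chosen
for this sketch; the quoted hunk is cut off before it shows what the kernel
does in that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NULL_ADDR   0u  /* "no block assigned"; taken to be zero, as in f2fs */
#define TOY_MAX_NID 16  /* hypothetical table size for this sketch */

/* Hypothetical stand-in for the NAT: one block address per nid. */
static unsigned int toy_nat_blkaddr[TOY_MAX_NID];

/* Hypothetical free-nid cache (a stack); it may hold stale entries. */
static unsigned int toy_free_nids[TOY_MAX_NID];
static int toy_free_cnt;

/* Clear the toy NAT and empty the free-nid cache. */
void toy_reset(void)
{
	memset(toy_nat_blkaddr, 0, sizeof(toy_nat_blkaddr));
	toy_free_cnt = 0;
}

/* Record that a nid currently owns the given block address. */
void toy_set_blkaddr(unsigned int nid, unsigned int blkaddr)
{
	assert(nid < TOY_MAX_NID);
	toy_nat_blkaddr[nid] = blkaddr;
}

/* Add a nid to the free cache without checking whether it is really free. */
void toy_push_free(unsigned int nid)
{
	assert(nid < TOY_MAX_NID && toy_free_cnt < TOY_MAX_NID);
	toy_free_nids[toy_free_cnt++] = nid;
}

/*
 * Pop candidates from the free cache. A candidate whose NAT entry already
 * has a block address is stale (still in use), so it is dropped and the
 * next candidate is tried. Returns false when the cache runs dry.
 */
bool toy_alloc_nid(unsigned int *nid)
{
	while (toy_free_cnt > 0) {
		unsigned int cand = toy_free_nids[--toy_free_cnt];

		if (toy_nat_blkaddr[cand] != NULL_ADDR)
			continue;	/* last-chance check failed: skip it */
		*nid = cand;
		return true;
	}
	return false;
}
```

For example, after toy_set_blkaddr(5, 1234), toy_push_free(4) and
toy_push_free(5), a call to toy_alloc_nid() skips the stale nid 5 and hands
out nid 4.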




end of thread, other threads:[~2015-08-24  9:39 UTC | newest]

Thread overview: 13+ messages
2015-08-18  8:46 [PATCH 1/5] f2fs: reuse nids more aggressively Jaegeuk Kim
2015-08-18  8:46 ` [PATCH 2/5] f2fs: fix to cover lock_op for update_inode_page Jaegeuk Kim
2015-08-20  9:09   ` [f2fs-dev] " Chao Yu
2015-08-18  8:46 ` [PATCH 3/5] f2fs: retry gc if one section is not successfully reclaimed Jaegeuk Kim
2015-08-18  8:46 ` [PATCH 4/5] f2fs: go out for insert_inode_locked failure Jaegeuk Kim
2015-08-20  9:11   ` [f2fs-dev] " Chao Yu
2015-08-18  8:46 ` [PATCH 5/5] f2fs: check the node block address of newly allocated nid Jaegeuk Kim
2015-08-20  9:12   ` [f2fs-dev] " Chao Yu
2015-08-20 15:35     ` Jaegeuk Kim
2015-08-21 12:48       ` Chao Yu
2015-08-21 14:59         ` Chao Yu
2015-08-24  9:38           ` Chao Yu
2015-08-20  9:01 ` [f2fs-dev] [PATCH 1/5] f2fs: reuse nids more aggressively Chao Yu
