All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jaegeuk Kim <jaegeuk@kernel.org>
To: Chao Yu <yuchao0@huawei.com>
Cc: linux-kernel@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net
Subject: Re: [f2fs-dev] [PATCH 2/2 v2] f2fs: avoid f2fs_bug_on if f2fs_get_meta_page_nofail got EIO
Date: Wed, 26 Sep 2018 12:49:32 -0700	[thread overview]
Message-ID: <20180926194932.GA44119@jaegeuk-macbookpro.roam.corp.google.com> (raw)
In-Reply-To: <2996d79d-c791-1a22-8abd-ad37d48edf17@huawei.com>

On 09/26, Chao Yu wrote:
> On 2018/9/26 9:10, Chao Yu wrote:
> > On 2018/9/26 8:18, Jaegeuk Kim wrote:
> >> On 09/21, Chao Yu wrote:
> >>> On 2018/9/21 5:46, Jaegeuk Kim wrote:
> >>>> On 09/20, Chao Yu wrote:
> >>>>> On 2018/9/19 1:56, Jaegeuk Kim wrote:
> >>>>>> This patch avoids BUG_ON when f2fs_get_meta_page_nofail got EIO during
> >>>>>> xfstests/generic/475.
> >>>>>>
> >>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> >>>>>> ---
> >>>>>> Change log from v1:
> >>>>>>  - propagate errno
> >>>>>>  - drop sum_pages
> >>>>>>
> >>>>>>  fs/f2fs/checkpoint.c | 10 +++++-----
> >>>>>>  fs/f2fs/f2fs.h       |  2 +-
> >>>>>>  fs/f2fs/gc.c         |  9 +++++++++
> >>>>>>  fs/f2fs/node.c       | 32 ++++++++++++++++++++++++--------
> >>>>>>  fs/f2fs/recovery.c   |  2 ++
> >>>>>>  fs/f2fs/segment.c    |  3 +++
> >>>>>>  6 files changed, 44 insertions(+), 14 deletions(-)
> >>>>>>
> >>>>>> diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
> >>>>>> index 01e0d8f5bbbe..1ddf332ce4b2 100644
> >>>>>> --- a/fs/f2fs/checkpoint.c
> >>>>>> +++ b/fs/f2fs/checkpoint.c
> >>>>>> @@ -119,11 +119,8 @@ struct page *f2fs_get_meta_page_nofail(struct f2fs_sb_info *sbi, pgoff_t index)
> >>>>>>  		if (PTR_ERR(page) == -EIO &&
> >>>>>>  				++count <= DEFAULT_RETRY_IO_COUNT)
> >>>>>>  			goto retry;
> >>>>>> -
> >>>>>>  		f2fs_stop_checkpoint(sbi, false);
> >>>>>> -		f2fs_bug_on(sbi, 1);
> >>>>>>  	}
> >>>>>> -
> >>>>>>  	return page;
> >>>>>>  }
> >>>>>>  
> >>>>>> @@ -1512,7 +1509,10 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
> >>>>>>  	ckpt->checkpoint_ver = cpu_to_le64(++ckpt_ver);
> >>>>>>  
> >>>>>>  	/* write cached NAT/SIT entries to NAT/SIT area */
> >>>>>> -	f2fs_flush_nat_entries(sbi, cpc);
> >>>>>> +	err = f2fs_flush_nat_entries(sbi, cpc);
> >>>>>> +	if (err)
> >>>>>> +		goto stop;
> >>>>>
> >>>>> To make sure, if partial NAT pages become dirty, flush them back to device
> >>>>> outside checkpoint() is not harmful, right?
> >>>>
> >>>> I think so.
> >>>
> >>> Oh, one more case, now we use NAT #0, if partial NAT pages were
> >>> writebacked, then we start to use NAT #1, later, in another checkpoint(),
> >>> we update those NAT pages again, we will write them to NAT #0, which is
> >>> belong to last valid checkpoint(), it's may cause corrupted checkpoint in
> >>> scenario of SPO. So it's harmfull, right?
> >>
> >> We already stopped checkpoint anymore, so there'd be no way to proceed another
> >> checkpoint. Am I missing something?
> > 
> > You're right, we have called f2fs_stop_checkpoint(), so we are safe now. :)
> 
> Can we allow to writeback fs metadata (NAT/SIT/SSA) in prior to checkpoint
> to mitigate IO stress of checkpoint?

What's the point related to this patch?
Anyway, do you mean f2fs_write_meta_pages is not enough?

> 
> Maybe below conditions can be consider to trigger writebacking:
> a) dirty count of meta (NAT/SIT/SSA) exceeds threshold
> b) exceed time threshold
> 
> Thanks,
> 
> > 
> > Thanks,
> > 
> >>
> >>>
> >>> Thanks,
> >>>
> >>>>
> >>>>>
> >>>>>> +
> >>>>>>  	f2fs_flush_sit_entries(sbi, cpc);
> >>>>>>  
> >>>>>>  	/* unlock all the fs_lock[] in do_checkpoint() */
> >>>>>> @@ -1521,7 +1521,7 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
> >>>>>>  		f2fs_release_discard_addrs(sbi);
> >>>>>>  	else
> >>>>>>  		f2fs_clear_prefree_segments(sbi, cpc);
> >>>>>> -
> >>>>>> +stop:
> >>>>>>  	unblock_operations(sbi);
> >>>>>>  	stat_inc_cp_count(sbi->stat_info);
> >>>>>>  
> >>>>>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> >>>>>> index a0c868127a7c..29021dbc8f2a 100644
> >>>>>> --- a/fs/f2fs/f2fs.h
> >>>>>> +++ b/fs/f2fs/f2fs.h
> >>>>>> @@ -2895,7 +2895,7 @@ int f2fs_recover_xattr_data(struct inode *inode, struct page *page);
> >>>>>>  int f2fs_recover_inode_page(struct f2fs_sb_info *sbi, struct page *page);
> >>>>>>  int f2fs_restore_node_summary(struct f2fs_sb_info *sbi,
> >>>>>>  			unsigned int segno, struct f2fs_summary_block *sum);
> >>>>>> -void f2fs_flush_nat_entries(struct f2fs_sb_info *sbi, struct cp_control *cpc);
> >>>>>> +int f2fs_flush_nat_entries(struct f2fs_sb_info *sbi, struct cp_control *cpc);
> >>>>>>  int f2fs_build_node_manager(struct f2fs_sb_info *sbi);
> >>>>>>  void f2fs_destroy_node_manager(struct f2fs_sb_info *sbi);
> >>>>>>  int __init f2fs_create_node_manager_caches(void);
> >>>>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> >>>>>> index 4bcc8a59fdef..c7d14282dbf4 100644
> >>>>>> --- a/fs/f2fs/gc.c
> >>>>>> +++ b/fs/f2fs/gc.c
> >>>>>> @@ -1070,6 +1070,15 @@ static int do_garbage_collect(struct f2fs_sb_info *sbi,
> >>>>>>  	/* reference all summary page */
> >>>>>>  	while (segno < end_segno) {
> >>>>>>  		sum_page = f2fs_get_sum_page(sbi, segno++);
> >>>>>> +		if (IS_ERR(sum_page)) {
> >>>>>> +			end_segno = segno - 1;
> >>>>>> +			for (segno = start_segno; segno < end_segno; segno++) {
> >>>>>> +				sum_page = find_get_page(META_MAPPING(sbi),
> >>>>>> +						GET_SUM_BLOCK(sbi, segno));
> >>>>>> +				f2fs_put_page(sum_page, 0);
> >>>>>
> >>>>> find_get_page() will add one more reference on page, so we need to call
> >>>>> f2fs_put_page(sum_page, 0) twice.
> >>>>
> >>>> Done.
> >>>>
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>>> +			}
> >>>>>> +			return PTR_ERR(sum_page);
> >>>>>> +		}
> >>>>>>  		unlock_page(sum_page);
> >>>>>>  	}
> >>>>>>  
> >>>>>> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> >>>>>> index fa2381c0bc47..79b6fee354f7 100644
> >>>>>> --- a/fs/f2fs/node.c
> >>>>>> +++ b/fs/f2fs/node.c
> >>>>>> @@ -126,6 +126,8 @@ static struct page *get_next_nat_page(struct f2fs_sb_info *sbi, nid_t nid)
> >>>>>>  
> >>>>>>  	/* get current nat block page with lock */
> >>>>>>  	src_page = get_current_nat_page(sbi, nid);
> >>>>>> +	if (IS_ERR(src_page))
> >>>>>> +		return src_page;
> >>>>>>  	dst_page = f2fs_grab_meta_page(sbi, dst_off);
> >>>>>>  	f2fs_bug_on(sbi, PageDirty(src_page));
> >>>>>>  
> >>>>>> @@ -2265,15 +2267,19 @@ static int __f2fs_build_free_nids(struct f2fs_sb_info *sbi,
> >>>>>>  						nm_i->nat_block_bitmap)) {
> >>>>>>  			struct page *page = get_current_nat_page(sbi, nid);
> >>>>>>  
> >>>>>> -			ret = scan_nat_page(sbi, page, nid);
> >>>>>> -			f2fs_put_page(page, 1);
> >>>>>> +			if (IS_ERR(page)) {
> >>>>>> +				ret = PTR_ERR(page);
> >>>>>> +			} else {
> >>>>>> +				ret = scan_nat_page(sbi, page, nid);
> >>>>>> +				f2fs_put_page(page, 1);
> >>>>>> +			}
> >>>>>>  
> >>>>>>  			if (ret) {
> >>>>>>  				up_read(&nm_i->nat_tree_lock);
> >>>>>>  				f2fs_bug_on(sbi, !mount);
> >>>>>>  				f2fs_msg(sbi->sb, KERN_ERR,
> >>>>>>  					"NAT is corrupt, run fsck to fix it");
> >>>>>> -				return -EINVAL;
> >>>>>> +				return ret;
> >>>>>>  			}
> >>>>>>  		}
> >>>>>>  
> >>>>>> @@ -2700,7 +2706,7 @@ static void __update_nat_bits(struct f2fs_sb_info *sbi, nid_t start_nid,
> >>>>>>  		__clear_bit_le(nat_index, nm_i->full_nat_bits);
> >>>>>>  }
> >>>>>>  
> >>>>>> -static void __flush_nat_entry_set(struct f2fs_sb_info *sbi,
> >>>>>> +static int __flush_nat_entry_set(struct f2fs_sb_info *sbi,
> >>>>>>  		struct nat_entry_set *set, struct cp_control *cpc)
> >>>>>>  {
> >>>>>>  	struct curseg_info *curseg = CURSEG_I(sbi, CURSEG_HOT_DATA);
> >>>>>> @@ -2724,6 +2730,9 @@ static void __flush_nat_entry_set(struct f2fs_sb_info *sbi,
> >>>>>>  		down_write(&curseg->journal_rwsem);
> >>>>>>  	} else {
> >>>>>>  		page = get_next_nat_page(sbi, start_nid);
> >>>>>> +		if (IS_ERR(page))
> >>>>>> +			return PTR_ERR(page);
> >>>>>> +
> >>>>>>  		nat_blk = page_address(page);
> >>>>>>  		f2fs_bug_on(sbi, !nat_blk);
> >>>>>>  	}
> >>>>>> @@ -2769,12 +2778,13 @@ static void __flush_nat_entry_set(struct f2fs_sb_info *sbi,
> >>>>>>  		radix_tree_delete(&NM_I(sbi)->nat_set_root, set->set);
> >>>>>>  		kmem_cache_free(nat_entry_set_slab, set);
> >>>>>>  	}
> >>>>>> +	return 0;
> >>>>>>  }
> >>>>>>  
> >>>>>>  /*
> >>>>>>   * This function is called during the checkpointing process.
> >>>>>>   */
> >>>>>> -void f2fs_flush_nat_entries(struct f2fs_sb_info *sbi, struct cp_control *cpc)
> >>>>>> +int f2fs_flush_nat_entries(struct f2fs_sb_info *sbi, struct cp_control *cpc)
> >>>>>>  {
> >>>>>>  	struct f2fs_nm_info *nm_i = NM_I(sbi);
> >>>>>>  	struct curseg_info *curseg = CURSEG_I(sbi, CURSEG_HOT_DATA);
> >>>>>> @@ -2784,6 +2794,7 @@ void f2fs_flush_nat_entries(struct f2fs_sb_info *sbi, struct cp_control *cpc)
> >>>>>>  	unsigned int found;
> >>>>>>  	nid_t set_idx = 0;
> >>>>>>  	LIST_HEAD(sets);
> >>>>>> +	int err = 0;
> >>>>>>  
> >>>>>>  	/* during unmount, let's flush nat_bits before checking dirty_nat_cnt */
> >>>>>>  	if (enabled_nat_bits(sbi, cpc)) {
> >>>>>> @@ -2793,7 +2804,7 @@ void f2fs_flush_nat_entries(struct f2fs_sb_info *sbi, struct cp_control *cpc)
> >>>>>>  	}
> >>>>>>  
> >>>>>>  	if (!nm_i->dirty_nat_cnt)
> >>>>>> -		return;
> >>>>>> +		return 0;
> >>>>>>  
> >>>>>>  	down_write(&nm_i->nat_tree_lock);
> >>>>>>  
> >>>>>> @@ -2816,11 +2827,16 @@ void f2fs_flush_nat_entries(struct f2fs_sb_info *sbi, struct cp_control *cpc)
> >>>>>>  	}
> >>>>>>  
> >>>>>>  	/* flush dirty nats in nat entry set */
> >>>>>> -	list_for_each_entry_safe(set, tmp, &sets, set_list)
> >>>>>> -		__flush_nat_entry_set(sbi, set, cpc);
> >>>>>> +	list_for_each_entry_safe(set, tmp, &sets, set_list) {
> >>>>>> +		err = __flush_nat_entry_set(sbi, set, cpc);
> >>>>>> +		if (err)
> >>>>>> +			break;
> >>>>>> +	}
> >>>>>>  
> >>>>>>  	up_write(&nm_i->nat_tree_lock);
> >>>>>>  	/* Allow dirty nats by node block allocation in write_begin */
> >>>>>> +
> >>>>>> +	return err;
> >>>>>>  }
> >>>>>>  
> >>>>>>  static int __get_nat_bitmaps(struct f2fs_sb_info *sbi)
> >>>>>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> >>>>>> index 56d34193a74b..ba678efe7cad 100644
> >>>>>> --- a/fs/f2fs/recovery.c
> >>>>>> +++ b/fs/f2fs/recovery.c
> >>>>>> @@ -355,6 +355,8 @@ static int check_index_in_prev_nodes(struct f2fs_sb_info *sbi,
> >>>>>>  	}
> >>>>>>  
> >>>>>>  	sum_page = f2fs_get_sum_page(sbi, segno);
> >>>>>> +	if (IS_ERR(sum_page))
> >>>>>> +		return PTR_ERR(sum_page);
> >>>>>>  	sum_node = (struct f2fs_summary_block *)page_address(sum_page);
> >>>>>>  	sum = sum_node->entries[blkoff];
> >>>>>>  	f2fs_put_page(sum_page, 1);
> >>>>>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> >>>>>> index aa96a371aaf8..ab7386a7e9d3 100644
> >>>>>> --- a/fs/f2fs/segment.c
> >>>>>> +++ b/fs/f2fs/segment.c
> >>>>>> @@ -2487,6 +2487,7 @@ static void change_curseg(struct f2fs_sb_info *sbi, int type)
> >>>>>>  	__next_free_blkoff(sbi, curseg, 0);
> >>>>>>  
> >>>>>>  	sum_page = f2fs_get_sum_page(sbi, new_segno);
> >>>>>> +	f2fs_bug_on(sbi, IS_ERR(sum_page));
> >>>>>>  	sum_node = (struct f2fs_summary_block *)page_address(sum_page);
> >>>>>>  	memcpy(curseg->sum_blk, sum_node, SUM_ENTRY_SIZE);
> >>>>>>  	f2fs_put_page(sum_page, 1);
> >>>>>> @@ -3971,6 +3972,8 @@ static int build_sit_entries(struct f2fs_sb_info *sbi)
> >>>>>>  
> >>>>>>  			se = &sit_i->sentries[start];
> >>>>>>  			page = get_current_sit_page(sbi, start);
> >>>>>> +			if (IS_ERR(page))
> >>>>>> +				return PTR_ERR(page);
> >>>>>>  			sit_blk = (struct f2fs_sit_block *)page_address(page);
> >>>>>>  			sit = sit_blk->entries[SIT_ENTRY_OFFSET(sit_i, start)];
> >>>>>>  			f2fs_put_page(page, 1);
> >>>>>>
> >>>>
> >>>> .
> >>>>
> >>
> >> .
> >>
> > 
> > 
> > 
> > _______________________________________________
> > Linux-f2fs-devel mailing list
> > Linux-f2fs-devel@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> > 
> > .
> > 

  reply	other threads:[~2018-09-26 19:49 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-09-18  2:18 [PATCH 1/2] f2fs: report ENOENT correct in f2fs_rename Jaegeuk Kim
2018-09-18  2:18 ` [PATCH 2/2] f2fs: avoid f2fs_bug_on if f2fs_get_meta_page_nofail got EIO Jaegeuk Kim
2018-09-18 13:16   ` [f2fs-dev] " Chao Yu
2018-09-18 17:55     ` Jaegeuk Kim
2018-09-18 17:56   ` [PATCH 2/2 v2] " Jaegeuk Kim
2018-09-19  1:45     ` [f2fs-dev] " Chao Yu
2018-09-19  1:45       ` Chao Yu
2018-09-19 22:47       ` Jaegeuk Kim
2018-09-20  6:07     ` Chao Yu
2018-09-20  6:07       ` Chao Yu
2018-09-20 21:46       ` Jaegeuk Kim
2018-09-20 21:46         ` Jaegeuk Kim
2018-09-21  1:30         ` [f2fs-dev] " Chao Yu
2018-09-21  1:30           ` Chao Yu
2018-09-26  0:18           ` Jaegeuk Kim
2018-09-26  0:18             ` Jaegeuk Kim
2018-09-26  1:10             ` [f2fs-dev] " Chao Yu
2018-09-26  1:10               ` Chao Yu
2018-09-26  8:31               ` Chao Yu
2018-09-26  8:31                 ` Chao Yu
2018-09-26 19:49                 ` Jaegeuk Kim [this message]
2018-09-27  1:14                   ` Chao Yu
2018-09-27  1:14                     ` Chao Yu
2018-09-27  4:58                     ` Jaegeuk Kim
2018-09-27  5:45                       ` Chao Yu
2018-09-27  5:45                         ` Chao Yu
2018-09-20 21:48     ` [f2fs-dev] [PATCH 2/2 v3] " Jaegeuk Kim
2018-09-20 21:48       ` Jaegeuk Kim
2018-09-27 12:09       ` [f2fs-dev] " Chao Yu
2018-09-18 12:47 ` [f2fs-dev] [PATCH 1/2] f2fs: report ENOENT correct in f2fs_rename Chao Yu
2018-09-18 17:59   ` Jaegeuk Kim
2018-09-18 17:59     ` Jaegeuk Kim
2018-09-19  1:48     ` [f2fs-dev] " Chao Yu
2018-09-19  1:48       ` Chao Yu
2018-09-19  0:47 ` Sahitya Tummala
2018-09-19 22:50 ` [PATCH 1/2 v2] " Jaegeuk Kim
2018-09-20  6:14   ` [f2fs-dev] " Chao Yu
2018-09-20  6:14     ` Chao Yu
2018-09-20 21:35   ` [f2fs-dev] [PATCH 1/2 v3] " Jaegeuk Kim
2018-09-21  1:31     ` Chao Yu
2018-09-21  1:31       ` Chao Yu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180926194932.GA44119@jaegeuk-macbookpro.roam.corp.google.com \
    --to=jaegeuk@kernel.org \
    --cc=linux-f2fs-devel@lists.sourceforge.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=yuchao0@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.