From: Jaegeuk Kim <jaegeuk@kernel.org>
To: Chao Yu <chao@kernel.org>
Cc: Yangtao Li <frank.li@vivo.com>,
	linux-kernel@vger.kernel.org,
	linux-f2fs-devel@lists.sourceforge.net
Subject: Re: [f2fs-dev] [PATCH] f2fs: reset free segment to prefree status when do_checkpoint() fail
Date: Mon, 2 Aug 2021 10:59:17 -0700
Message-ID: <YQgydetYHOkgY9+B@google.com>
In-Reply-To: <355ac2ff-f1f1-b9ea-bd8c-139cb24a03fb@kernel.org>

On 08/01, Chao Yu wrote:
> On 2021/7/31 6:18, Jaegeuk Kim wrote:
> > On 07/20, Chao Yu wrote:
> > > On 2021/7/20 2:25, Jaegeuk Kim wrote:
> > > > On 07/19, Chao Yu wrote:
> > > > > On 2021/4/27 20:37, Chao Yu wrote:
> > > > > > > I think just reverting the dirty/free bitmap is not enough if the checkpoint
> > > > > > > fails: since we have already updated sbi->cur_cp_pack and the nat/sit bitmap,
> > > > > > > the next CP will try to overwrite the last valid meta/node/data, and the
> > > > > > > filesystem will be corrupted.
> > > > > > 
> > > > > > > So I suggest setting cp_error if do_checkpoint() fails until we can handle
> > > > > > > all cases, which is not so easy.
> > > > > > > 
> > > > > > > What do you think?
> > > > > 
> > > > > Let's add below patch first before you figure out the patch which covers all
> > > > > things.
> > > > > 
> > > > >   From 3af957c98e9e04259f8bb93ca0b74ba164f3f27e Mon Sep 17 00:00:00 2001
> > > > > From: Chao Yu <chao@kernel.org>
> > > > > Date: Mon, 19 Jul 2021 16:37:44 +0800
> > > > > Subject: [PATCH] f2fs: fix to stop filesystem update once CP failed
> > > > > 
> > > > > During f2fs_write_checkpoint(), once we failed in
> > > > > f2fs_flush_nat_entries() or do_checkpoint(), metadata of filesystem
> > > > > such as prefree bitmap, nat/sit version bitmap won't be recovered,
> > > > > it may cause f2fs image to be inconsistent, let's just set CP error
> > > > > flag to avoid further updates until we figure out a scheme to rollback
> > > > > all metadatas in such condition.
> > > > > 
> > > > > Reported-by: Yangtao Li <frank.li@vivo.com>
> > > > > Signed-off-by: Yangtao Li <frank.li@vivo.com>
> > > > > Signed-off-by: Chao Yu <chao@kernel.org>
> > > > > ---
> > > > >    fs/f2fs/checkpoint.c | 10 +++++++---
> > > > >    1 file changed, 7 insertions(+), 3 deletions(-)
> > > > > 
> > > > > diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
> > > > > index 6c208108d69c..096c85022f62 100644
> > > > > --- a/fs/f2fs/checkpoint.c
> > > > > +++ b/fs/f2fs/checkpoint.c
> > > > > @@ -1639,8 +1639,10 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
> > > > > 
> > > > >    	/* write cached NAT/SIT entries to NAT/SIT area */
> > > > >    	err = f2fs_flush_nat_entries(sbi, cpc);
> > > > > -	if (err)
> > > > > +	if (err) {
> > > > > +		f2fs_stop_checkpoint(sbi, false);
> > > > 
> > > > > I think we shouldn't abuse this, since we can get a known ENOMEM here as well.
> > > 
> > > > Yup, but one critical issue here is that it can break the A/B update of the
> > > > NAT area, so, in order to fix this hole, how about using NOFAIL memory allocation
> > > > in f2fs_flush_nat_entries() first until we figure out the final scheme?
> > 
> > > NOFAIL is risky, so how about adding retry logic on ENOMEM with a message,
> > > and then giving up if we still can't get the memory? BTW, what about EIO or
> > > other errors of that kind?
> 
> How about this?

Hmm, it seems we won't get ENOMEM.

__flush_nat_entry_set
 -> get_next_nat_page
   -> ...
    -> __get_meta_page
      -> repeat on ENOMEM, but stop_checkpoint on EIO

If we get an error here, we should already have stopped checkpoint. Have you seen
any other issue?
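
To make the call chain above concrete, here is a minimal user-space sketch of the behaviour described for __get_meta_page(): repeat on ENOMEM, latch the CP error on EIO. It is not the kernel code. struct fake_sbi, fake_read_meta() and get_meta_retrying() are hypothetical names invented for this illustration, and setting cp_error merely stands in for f2fs_stop_checkpoint().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for f2fs_sb_info; tracks only what the sketch needs. */
struct fake_sbi {
	bool cp_error;		/* stands in for the CP error flag */
	int enomem_left;	/* number of transient -ENOMEM failures to simulate */
	bool io_broken;		/* simulate a device that returns -EIO */
};

/* Hypothetical low-level read: returns 0, -ENOMEM or -EIO. */
static int fake_read_meta(struct fake_sbi *sbi)
{
	if (sbi->enomem_left > 0) {
		sbi->enomem_left--;
		return -ENOMEM;
	}
	return sbi->io_broken ? -EIO : 0;
}

/*
 * Models "repeat on ENOMEM, but stop_checkpoint on EIO".
 * *attempts counts how many times the low-level read was issued.
 */
static int get_meta_retrying(struct fake_sbi *sbi, int *attempts)
{
	int err;

	assert(sbi != NULL && attempts != NULL);

	for (;;) {
		(*attempts)++;
		err = fake_read_meta(sbi);
		if (err == -ENOMEM)
			continue;		/* transient: try again */
		if (err == -EIO)
			sbi->cp_error = true;	/* assumed analogue of f2fs_stop_checkpoint() */
		return err;
	}
}
```

Under this model the caller never sees -ENOMEM, and any error it does see arrives with the CP error already set, which is why an extra retry one level up would have nothing to do.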

> 
> From ffb50d9a8220be7d9e159b8555533adcf11957a8 Mon Sep 17 00:00:00 2001
> From: Chao Yu <chao@kernel.org>
> Date: Mon, 19 Jul 2021 16:37:44 +0800
> Subject: [PATCH v2] f2fs: fix to stop filesystem update once CP failed
> 
> During f2fs_write_checkpoint(), once we failed in
> f2fs_flush_nat_entries() or do_checkpoint(), metadata of filesystem
> such as prefree bitmap, nat/sit version bitmap won't be recovered,
> it may cause f2fs image to be inconsistent, let's just set CP error
> flag to avoid further updates until we figure out a scheme to rollback
> all metadatas in such condition.
> 
> Reported-by: Yangtao Li <frank.li@vivo.com>
> Signed-off-by: Yangtao Li <frank.li@vivo.com>
> Signed-off-by: Chao Yu <chao@kernel.org>
> ---
>  fs/f2fs/checkpoint.c | 12 +++++++++---
>  fs/f2fs/node.c       |  9 ++++++++-
>  2 files changed, 17 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
> index 6c208108d69c..f3f66871ae42 100644
> --- a/fs/f2fs/checkpoint.c
> +++ b/fs/f2fs/checkpoint.c
> @@ -1639,8 +1639,11 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
> 
>  	/* write cached NAT/SIT entries to NAT/SIT area */
>  	err = f2fs_flush_nat_entries(sbi, cpc);
> -	if (err)
> +	if (err) {
> +		f2fs_err(sbi, "f2fs_flush_nat_entries failed err:%d, stop checkpoint", err);
> +		f2fs_stop_checkpoint(sbi, false);
>  		goto stop;
> +	}
> 
>  	f2fs_flush_sit_entries(sbi, cpc);
> 
> @@ -1648,10 +1651,13 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
>  	f2fs_save_inmem_curseg(sbi);
> 
>  	err = do_checkpoint(sbi, cpc);
> -	if (err)
> +	if (err) {
> +		f2fs_err(sbi, "do_checkpoint failed err:%d, stop checkpoint", err);
> +		f2fs_stop_checkpoint(sbi, false);
>  		f2fs_release_discard_addrs(sbi);
> -	else
> +	} else {
>  		f2fs_clear_prefree_segments(sbi, cpc);
> +	}
> 
>  	f2fs_restore_inmem_curseg(sbi);
>  stop:
> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> index 5840b82ce311..7162836d71c1 100644
> --- a/fs/f2fs/node.c
> +++ b/fs/f2fs/node.c
> @@ -141,13 +141,20 @@ static struct page *get_next_nat_page(struct f2fs_sb_info *sbi, nid_t nid)
>  	void *src_addr;
>  	void *dst_addr;
>  	struct f2fs_nm_info *nm_i = NM_I(sbi);
> +	bool retried = false;
> 
>  	dst_off = next_nat_addr(sbi, current_nat_addr(sbi, nid));
> 
> +retry:
>  	/* get current nat block page with lock */
>  	src_page = get_current_nat_page(sbi, nid);
> -	if (IS_ERR(src_page))
> +	if (IS_ERR(src_page)) {
> +		if (PTR_ERR(src_page) == -ENOMEM && !retried) {
> +			retried = true;
> +			goto retry;
> +		}
>  		return src_page;
> +	}
>  	dst_page = f2fs_grab_meta_page(sbi, dst_off);
>  	f2fs_bug_on(sbi, PageDirty(src_page));
> 
> -- 
> 2.22.1
> 
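
For readers following the quoted concern about the A/B update of the NAT area, the hazard can be sketched with a deliberately simplified model. It assumes each NAT block has two copies, that a flush writes to the copy that is not current and then flips an in-memory bit, and that only a successful checkpoint commits that bit. struct nat_model and every function below are hypothetical and are not f2fs APIs.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_NAT_BLOCKS 4

/* Hypothetical model: each NAT block has copy 0 and copy 1. */
struct nat_model {
	unsigned char on_disk_cp[NR_NAT_BLOCKS];  /* copy the last valid checkpoint refers to */
	unsigned char in_memory[NR_NAT_BLOCKS];   /* copy the running filesystem treats as current */
};

/* A flush always targets the copy that is NOT current in memory. */
static int write_target(const struct nat_model *m, int blk)
{
	assert(blk >= 0 && blk < NR_NAT_BLOCKS);
	return m->in_memory[blk] ^ 1;
}

/* Flushing a block writes the other copy and flips the in-memory bit. */
static void flush_block(struct nat_model *m, int blk)
{
	m->in_memory[blk] = (unsigned char)write_target(m, blk);
}

/* Only a successful checkpoint makes the in-memory choice durable. */
static void commit_checkpoint(struct nat_model *m)
{
	for (int i = 0; i < NR_NAT_BLOCKS; i++)
		m->on_disk_cp[i] = m->in_memory[i];
}

/* True if the next flush of blk would overwrite the copy the valid checkpoint still uses. */
static bool next_flush_clobbers_valid_copy(const struct nat_model *m, int blk)
{
	return write_target(m, blk) == m->on_disk_cp[blk];
}
```

In this model, a flush that is not followed by a committed checkpoint leaves the in-memory bit flipped, so the next flush of the same block would target the copy the last valid checkpoint still depends on. Refusing further updates once a checkpoint has failed avoids reaching that state.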

