From: Jaegeuk Kim <jaegeuk@kernel.org>
To: Chao Yu <chao@kernel.org>
Cc: Yangtao Li <frank.li@vivo.com>,
	linux-kernel@vger.kernel.org,
	linux-f2fs-devel@lists.sourceforge.net
Subject: Re: [f2fs-dev] [PATCH] f2fs: reset free segment to prefree status when do_checkpoint() fail
Date: Fri, 30 Jul 2021 15:18:57 -0700
Message-ID: <YQR60QUh0Pim8vSf@google.com>
In-Reply-To: <8d2e3a63-72f9-bcb2-24e5-dddd84136001@kernel.org>

On 07/20, Chao Yu wrote:
> On 2021/7/20 2:25, Jaegeuk Kim wrote:
> > On 07/19, Chao Yu wrote:
> > > On 2021/4/27 20:37, Chao Yu wrote:
> > > > I think just reverting the dirty/free bitmap is not enough if checkpoint fails:
> > > > since we have already updated sbi->cur_cp_pack and the nat/sit bitmap, the next
> > > > CP will try to overwrite the last valid meta/node/data, and the filesystem will
> > > > then be corrupted.
> > > > 
> > > > So I suggest setting cp_error if do_checkpoint() fails until we can handle
> > > > all cases, which is not so easy.
> > > > 
> > > > What do you think?
> > > 
> > > Let's add the patch below first, until you figure out a patch that covers
> > > all cases.
> > > 
> > >  From 3af957c98e9e04259f8bb93ca0b74ba164f3f27e Mon Sep 17 00:00:00 2001
> > > From: Chao Yu <chao@kernel.org>
> > > Date: Mon, 19 Jul 2021 16:37:44 +0800
> > > Subject: [PATCH] f2fs: fix to stop filesystem update once CP failed
> > > 
> > > During f2fs_write_checkpoint(), once we fail in
> > > f2fs_flush_nat_entries() or do_checkpoint(), filesystem metadata
> > > such as the prefree bitmap and nat/sit version bitmap won't be recovered,
> > > which may leave the f2fs image inconsistent. Let's just set the CP error
> > > flag to avoid further updates until we figure out a scheme to roll back
> > > all metadata in such a condition.
> > > 
> > > Reported-by: Yangtao Li <frank.li@vivo.com>
> > > Signed-off-by: Yangtao Li <frank.li@vivo.com>
> > > Signed-off-by: Chao Yu <chao@kernel.org>
> > > ---
> > >   fs/f2fs/checkpoint.c | 10 +++++++---
> > >   1 file changed, 7 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
> > > index 6c208108d69c..096c85022f62 100644
> > > --- a/fs/f2fs/checkpoint.c
> > > +++ b/fs/f2fs/checkpoint.c
> > > @@ -1639,8 +1639,10 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
> > > 
> > >   	/* write cached NAT/SIT entries to NAT/SIT area */
> > >   	err = f2fs_flush_nat_entries(sbi, cpc);
> > > -	if (err)
> > > +	if (err) {
> > > +		f2fs_stop_checkpoint(sbi, false);
> > 
> > I think we shouldn't abuse this, since we can get a known ENOMEM here as well.
> 
> Yup, but one critical issue here is that it can break the A/B update of the
> NAT area, so, in order to fix this hole, how about using NOFAIL memory allocation
> in f2fs_flush_nat_entries() first until we figure out the final scheme?

NOFAIL is risky, so how about adding retry logic on ENOMEM with a message,
and then giving up if we still can't get the memory? BTW, what about EIO or
other errors of that family?
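
To make the retry idea above concrete, here is a minimal user-space sketch,
not kernel code and not a proposed patch. Everything in it is an assumption
made for illustration: the names flush_with_enomem_retry(), flush_fn,
flaky_flush() and eio_flush(), and the retry bound, are hypothetical. It
retries only on -ENOMEM, prints a message per retry, and hands any other
error (e.g. -EIO) straight back to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for a flush step that may return -ENOMEM or -EIO. */
typedef int (*flush_fn)(void *ctx);

/*
 * Call flush() and retry only while it returns -ENOMEM, logging each retry.
 * Returns 0 on success. Returns the error unchanged once max_retries is
 * exhausted, or immediately for any non-ENOMEM error such as -EIO; the
 * caller then decides what to do (e.g. whether to set a CP error flag).
 */
static int flush_with_enomem_retry(flush_fn flush, void *ctx, int max_retries)
{
	int attempt = 0;
	int err;

	assert(flush != NULL && max_retries >= 0);

	for (;;) {
		err = flush(ctx);
		if (err != -ENOMEM || attempt >= max_retries)
			return err;
		attempt++;
		fprintf(stderr, "flush: -ENOMEM, retry %d/%d\n",
			attempt, max_retries);
	}
}

/* Illustration-only context: counts calls and injects ENOMEM failures. */
struct flaky_ctx {
	int failures_left;
	int calls;
};

/* Fails with -ENOMEM failures_left times, then succeeds. */
static int flaky_flush(void *ctx)
{
	struct flaky_ctx *c = ctx;

	c->calls++;
	if (c->failures_left > 0) {
		c->failures_left--;
		return -ENOMEM;
	}
	return 0;
}

/* Always fails with -EIO, which the helper must not retry. */
static int eio_flush(void *ctx)
{
	struct flaky_ctx *c = ctx;

	c->calls++;
	return -EIO;
}
```

With a bound of 3, a flush that fails twice with -ENOMEM succeeds on the
third call, one that keeps failing gives up after four calls in total, and
-EIO is returned after a single call. How EIO and similar errors should be
handled is left open here, as in the question above.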

> 
> Thanks,
> 
> > 
> > >   		goto stop;
> > > +	}
> > > 
> > >   	f2fs_flush_sit_entries(sbi, cpc);
> > > 
> > > @@ -1648,10 +1650,12 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
> > >   	f2fs_save_inmem_curseg(sbi);
> > > 
> > >   	err = do_checkpoint(sbi, cpc);
> > > -	if (err)
> > > +	if (err) {
> > > +		f2fs_stop_checkpoint(sbi, false);
> > >   		f2fs_release_discard_addrs(sbi);
> > > -	else
> > > +	} else {
> > >   		f2fs_clear_prefree_segments(sbi, cpc);
> > > +	}
> > > 
> > >   	f2fs_restore_inmem_curseg(sbi);
> > >   stop:
> > > -- 
> > > 2.22.1


