* [PATCH v5] f2fs: avoid dead loop in function find_fsync_dnodes
@ 2017-12-20 5:03 Yunlei He
2017-12-20 21:00 ` Jaegeuk Kim
2017-12-25 2:33 ` Chao Yu
0 siblings, 2 replies; 6+ messages in thread
From: Yunlei He @ 2017-12-20 5:03 UTC (permalink / raw)
To: jaegeuk, yuchao0, linux-f2fs-devel; +Cc: ning.jia, heyunlei
v4 -> v5 return err instead destory inode list for two reasons:
i. avoid duplicated destroy inode list in function recover_fsync_data.
ii. report an error for recovery, and set need_fsck flag
Came across a dead loop in recovery like this:
......
[ 24.680480s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597696
[ 24.698394s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597697
[ 24.724334s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
[ 24.724334s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
[ 24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
[ 24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
[ 24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
[ 24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
[ 24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
[ 24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
[ 24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
[ 24.724426s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
......
Mount process will block in dead loop and fsck can do nothing with this
error, This patch abandon recovery if node chain is cyclical.
Signed-off-by: Yunlei He <heyunlei@huawei.com>
---
fs/f2fs/recovery.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
index d025aa8..a535ec2 100644
--- a/fs/f2fs/recovery.c
+++ b/fs/f2fs/recovery.c
@@ -216,6 +216,17 @@ static int find_fsync_dnodes(struct f2fs_sb_info *sbi, struct list_head *head)
return 0;
page = get_tmp_page(sbi, blkaddr);
+ if (PageChecked(page)) {
+ f2fs_msg(sbi->sb, KERN_ERR, "Abandon looped node block list");
+ err = -EINVAL;
+ break;
+ }
+
+ /*
+ * it's not needed to clear PG_checked flag in temp page since we
+ * will truncate all those pages in the end of recovery.
+ */
+ SetPageChecked(page);
if (!is_recoverable_dnode(page))
break;
--
1.9.1
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH v5] f2fs: avoid dead loop in function find_fsync_dnodes
2017-12-20 5:03 [PATCH v5] f2fs: avoid dead loop in function find_fsync_dnodes Yunlei He
@ 2017-12-20 21:00 ` Jaegeuk Kim
2017-12-21 3:45 ` heyunlei
2017-12-25 2:33 ` Chao Yu
1 sibling, 1 reply; 6+ messages in thread
From: Jaegeuk Kim @ 2017-12-20 21:00 UTC (permalink / raw)
To: Yunlei He; +Cc: ning.jia, linux-f2fs-devel
Hi Yunlei,
On 12/20, Yunlei He wrote:
> v4 -> v5 return err instead destory inode list for two reasons:
> i. avoid duplicated destroy inode list in function recover_fsync_data.
> ii. report an error for recovery, and set need_fsck flag
>
> Came across a dead loop in recovery like this:
>
> ......
> [ 24.680480s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597696
> [ 24.698394s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597697
> [ 24.724334s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [ 24.724334s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [ 24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [ 24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [ 24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [ 24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [ 24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [ 24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [ 24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [ 24.724426s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> ......
>
> Mount process will block in dead loop and fsck can do nothing with this
> error, This patch abandon recovery if node chain is cyclical.
>
> Signed-off-by: Yunlei He <heyunlei@huawei.com>
> ---
> fs/f2fs/recovery.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> index d025aa8..a535ec2 100644
> --- a/fs/f2fs/recovery.c
> +++ b/fs/f2fs/recovery.c
> @@ -216,6 +216,17 @@ static int find_fsync_dnodes(struct f2fs_sb_info *sbi, struct list_head *head)
> return 0;
>
> page = get_tmp_page(sbi, blkaddr);
> + if (PageChecked(page)) {
> + f2fs_msg(sbi->sb, KERN_ERR, "Abandon looped node block list");
> + err = -EINVAL;
Is there any problem, if we break here without error, so that we could recover
some of the inodes in the chain as best efforts?
Thanks,
> + break;
> + }
> +
> + /*
> + * it's not needed to clear PG_checked flag in temp page since we
> + * will truncate all those pages in the end of recovery.
> + */
> + SetPageChecked(page);
>
> if (!is_recoverable_dnode(page))
> break;
> --
> 1.9.1
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v5] f2fs: avoid dead loop in function find_fsync_dnodes
2017-12-20 21:00 ` Jaegeuk Kim
@ 2017-12-21 3:45 ` heyunlei
0 siblings, 0 replies; 6+ messages in thread
From: heyunlei @ 2017-12-21 3:45 UTC (permalink / raw)
To: Jaegeuk Kim; +Cc: Jianing (Euler), linux-f2fs-devel
>-----Original Message-----
>From: Jaegeuk Kim [mailto:jaegeuk@kernel.org]
>Sent: Thursday, December 21, 2017 5:01 AM
>To: heyunlei
>Cc: Yuchao (T); linux-f2fs-devel@lists.sourceforge.net; Wangbintian; Jianing (Euler)
>Subject: Re: [f2fs-dev][PATCH v5] f2fs: avoid dead loop in function find_fsync_dnodes
>
>Hi Yunlei,
Hi Kim,
>
>On 12/20, Yunlei He wrote:
>> v4 -> v5 return err instead destory inode list for two reasons:
>> i. avoid duplicated destroy inode list in function recover_fsync_data.
>> ii. report an error for recovery, and set need_fsck flag
>>
>> Came across a dead loop in recovery like this:
>>
>> ......
>> [ 24.680480s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597696
>> [ 24.698394s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597697
>> [ 24.724334s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>> [ 24.724334s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>> [ 24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>> [ 24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>> [ 24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>> [ 24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>> [ 24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>> [ 24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>> [ 24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>> [ 24.724426s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>> ......
>>
>> Mount process will block in dead loop and fsck can do nothing with this
>> error, This patch abandon recovery if node chain is cyclical.
>>
>> Signed-off-by: Yunlei He <heyunlei@huawei.com>
>> ---
>> fs/f2fs/recovery.c | 11 +++++++++++
>> 1 file changed, 11 insertions(+)
>>
>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
>> index d025aa8..a535ec2 100644
>> --- a/fs/f2fs/recovery.c
>> +++ b/fs/f2fs/recovery.c
>> @@ -216,6 +216,17 @@ static int find_fsync_dnodes(struct f2fs_sb_info *sbi, struct list_head *head)
>> return 0;
>>
>> page = get_tmp_page(sbi, blkaddr);
>> + if (PageChecked(page)) {
>> + f2fs_msg(sbi->sb, KERN_ERR, "Abandon looped node block list");
>> + err = -EINVAL;
>
>Is there any problem, if we break here without error, so that we could recover
>some of the inodes in the chain as best efforts?
I think it's a corner case caused by some weird reason. Maybe data in recover node
Chain is also polluted and unreliable. So I think we'd better abandon this recovery.
Thanks,
>
>Thanks,
>
>> + break;
>> + }
>> +
>> + /*
>> + * it's not needed to clear PG_checked flag in temp page since we
>> + * will truncate all those pages in the end of recovery.
>> + */
>> + SetPageChecked(page);
>>
>> if (!is_recoverable_dnode(page))
>> break;
>> --
>> 1.9.1
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v5] f2fs: avoid dead loop in function find_fsync_dnodes
2017-12-20 5:03 [PATCH v5] f2fs: avoid dead loop in function find_fsync_dnodes Yunlei He
2017-12-20 21:00 ` Jaegeuk Kim
@ 2017-12-25 2:33 ` Chao Yu
2017-12-28 2:37 ` Jaegeuk Kim
1 sibling, 1 reply; 6+ messages in thread
From: Chao Yu @ 2017-12-25 2:33 UTC (permalink / raw)
To: Yunlei He, jaegeuk, linux-f2fs-devel; +Cc: ning.jia
On 2017/12/20 13:03, Yunlei He wrote:
> v4 -> v5 return err instead destory inode list for two reasons:
> i. avoid duplicated destroy inode list in function recover_fsync_data.
> ii. report an error for recovery, and set need_fsck flag
fsck can't fix this issue so far, it doesn't work even we set need_fsck flag,
would it be better to just drop last dnode with corrupted next_blkaddr and try
our best to recover fsynced data?
Thanks,
>
> Came across a dead loop in recovery like this:
>
> ......
> [ 24.680480s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597696
> [ 24.698394s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597697
> [ 24.724334s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [ 24.724334s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [ 24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [ 24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [ 24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [ 24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [ 24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [ 24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [ 24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [ 24.724426s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> ......
>
> Mount process will block in dead loop and fsck can do nothing with this
> error, This patch abandon recovery if node chain is cyclical.
>
> Signed-off-by: Yunlei He <heyunlei@huawei.com>
> ---
> fs/f2fs/recovery.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> index d025aa8..a535ec2 100644
> --- a/fs/f2fs/recovery.c
> +++ b/fs/f2fs/recovery.c
> @@ -216,6 +216,17 @@ static int find_fsync_dnodes(struct f2fs_sb_info *sbi, struct list_head *head)
> return 0;
>
> page = get_tmp_page(sbi, blkaddr);
> + if (PageChecked(page)) {
> + f2fs_msg(sbi->sb, KERN_ERR, "Abandon looped node block list");
> + err = -EINVAL;
> + break;
> + }
> +
> + /*
> + * it's not needed to clear PG_checked flag in temp page since we
> + * will truncate all those pages in the end of recovery.
> + */
> + SetPageChecked(page);
>
> if (!is_recoverable_dnode(page))
> break;
>
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v5] f2fs: avoid dead loop in function find_fsync_dnodes
2017-12-25 2:33 ` Chao Yu
@ 2017-12-28 2:37 ` Jaegeuk Kim
2018-01-04 3:07 ` Chao Yu
0 siblings, 1 reply; 6+ messages in thread
From: Jaegeuk Kim @ 2017-12-28 2:37 UTC (permalink / raw)
To: Chao Yu; +Cc: ning.jia, Yunlei He, linux-f2fs-devel
On 12/25, Chao Yu wrote:
> On 2017/12/20 13:03, Yunlei He wrote:
> > v4 -> v5 return err instead destory inode list for two reasons:
> > i. avoid duplicated destroy inode list in function recover_fsync_data.
> > ii. report an error for recovery, and set need_fsck flag
>
> fsck can't fix this issue so far, it doesn't work even we set need_fsck flag,
> would it be better to just drop last dnode with corrupted next_blkaddr and try
> our best to recover fsynced data?
Yup, agreed to this. There's no way to recover this, if we just return an error.
>
> Thanks,
>
> >
> > Came across a dead loop in recovery like this:
> >
> > ......
> > [ 24.680480s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597696
> > [ 24.698394s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597697
> > [ 24.724334s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> > [ 24.724334s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> > [ 24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> > [ 24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> > [ 24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> > [ 24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> > [ 24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> > [ 24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> > [ 24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> > [ 24.724426s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> > ......
> >
> > Mount process will block in dead loop and fsck can do nothing with this
> > error, This patch abandon recovery if node chain is cyclical.
> >
> > Signed-off-by: Yunlei He <heyunlei@huawei.com>
> > ---
> > fs/f2fs/recovery.c | 11 +++++++++++
> > 1 file changed, 11 insertions(+)
> >
> > diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> > index d025aa8..a535ec2 100644
> > --- a/fs/f2fs/recovery.c
> > +++ b/fs/f2fs/recovery.c
> > @@ -216,6 +216,17 @@ static int find_fsync_dnodes(struct f2fs_sb_info *sbi, struct list_head *head)
> > return 0;
> >
> > page = get_tmp_page(sbi, blkaddr);
> > + if (PageChecked(page)) {
> > + f2fs_msg(sbi->sb, KERN_ERR, "Abandon looped node block list");
> > + err = -EINVAL;
> > + break;
> > + }
> > +
> > + /*
> > + * it's not needed to clear PG_checked flag in temp page since we
> > + * will truncate all those pages in the end of recovery.
> > + */
> > + SetPageChecked(page);
> >
> > if (!is_recoverable_dnode(page))
> > break;
> >
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v5] f2fs: avoid dead loop in function find_fsync_dnodes
2017-12-28 2:37 ` Jaegeuk Kim
@ 2018-01-04 3:07 ` Chao Yu
0 siblings, 0 replies; 6+ messages in thread
From: Chao Yu @ 2018-01-04 3:07 UTC (permalink / raw)
To: Yunlei He; +Cc: Jaegeuk Kim, ning.jia, linux-f2fs-devel
Hi, Yunlei,
Could you rework this patch as we discussed?
Thanks,
On 2017/12/28 10:37, Jaegeuk Kim wrote:
> On 12/25, Chao Yu wrote:
>> On 2017/12/20 13:03, Yunlei He wrote:
>>> v4 -> v5 return err instead destory inode list for two reasons:
>>> i. avoid duplicated destroy inode list in function recover_fsync_data.
>>> ii. report an error for recovery, and set need_fsck flag
>>
>> fsck can't fix this issue so far, it doesn't work even we set need_fsck flag,
>> would it be better to just drop last dnode with corrupted next_blkaddr and try
>> our best to recover fsynced data?
>
> Yup, agreed to this. There's no way to recover this, if we just return an error.
>
>>
>> Thanks,
>>
>>>
>>> Came across a dead loop in recovery like this:
>>>
>>> ......
>>> [ 24.680480s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597696
>>> [ 24.698394s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597697
>>> [ 24.724334s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>>> [ 24.724334s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>>> [ 24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>>> [ 24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>>> [ 24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>>> [ 24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>>> [ 24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>>> [ 24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>>> [ 24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>>> [ 24.724426s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>>> ......
>>>
>>> Mount process will block in dead loop and fsck can do nothing with this
>>> error, This patch abandon recovery if node chain is cyclical.
>>>
>>> Signed-off-by: Yunlei He <heyunlei@huawei.com>
>>> ---
>>> fs/f2fs/recovery.c | 11 +++++++++++
>>> 1 file changed, 11 insertions(+)
>>>
>>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
>>> index d025aa8..a535ec2 100644
>>> --- a/fs/f2fs/recovery.c
>>> +++ b/fs/f2fs/recovery.c
>>> @@ -216,6 +216,17 @@ static int find_fsync_dnodes(struct f2fs_sb_info *sbi, struct list_head *head)
>>> return 0;
>>>
>>> page = get_tmp_page(sbi, blkaddr);
>>> + if (PageChecked(page)) {
>>> + f2fs_msg(sbi->sb, KERN_ERR, "Abandon looped node block list");
>>> + err = -EINVAL;
>>> + break;
>>> + }
>>> +
>>> + /*
>>> + * it's not needed to clear PG_checked flag in temp page since we
>>> + * will truncate all those pages in the end of recovery.
>>> + */
>>> + SetPageChecked(page);
>>>
>>> if (!is_recoverable_dnode(page))
>>> break;
>>>
>
> .
>
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2018-01-04 3:08 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-20 5:03 [PATCH v5] f2fs: avoid dead loop in function find_fsync_dnodes Yunlei He
2017-12-20 21:00 ` Jaegeuk Kim
2017-12-21 3:45 ` heyunlei
2017-12-25 2:33 ` Chao Yu
2017-12-28 2:37 ` Jaegeuk Kim
2018-01-04 3:07 ` Chao Yu
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.