All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5] f2fs: avoid dead loop in function find_fsync_dnodes
@ 2017-12-20  5:03 Yunlei He
  2017-12-20 21:00 ` Jaegeuk Kim
  2017-12-25  2:33 ` Chao Yu
  0 siblings, 2 replies; 6+ messages in thread
From: Yunlei He @ 2017-12-20  5:03 UTC (permalink / raw)
  To: jaegeuk, yuchao0, linux-f2fs-devel; +Cc: ning.jia, heyunlei

v4 -> v5 return err instead destory inode list for two reasons:
i.  avoid duplicated destroy inode list in function recover_fsync_data.
ii. report an error for recovery, and set need_fsck flag
        
Came across a dead loop in recovery like this:

......
[   24.680480s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597696
[   24.698394s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597697
[   24.724334s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
[   24.724334s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
[   24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
[   24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
[   24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
[   24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
[   24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
[   24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
[   24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
[   24.724426s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
......

Mount process will block in dead loop and fsck can do nothing with this
error, This patch abandon recovery if node chain is cyclical.

Signed-off-by: Yunlei He <heyunlei@huawei.com>
---
 fs/f2fs/recovery.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
index d025aa8..a535ec2 100644
--- a/fs/f2fs/recovery.c
+++ b/fs/f2fs/recovery.c
@@ -216,6 +216,17 @@ static int find_fsync_dnodes(struct f2fs_sb_info *sbi, struct list_head *head)
 			return 0;
 
 		page = get_tmp_page(sbi, blkaddr);
+		if (PageChecked(page)) {
+			f2fs_msg(sbi->sb, KERN_ERR, "Abandon looped node block list");
+			err = -EINVAL;
+			break;
+		}
+
+		/*
+		 * it's not needed to clear PG_checked flag in temp page since we
+		 * will truncate all those pages in the end of recovery.
+		 */
+		SetPageChecked(page);
 
 		if (!is_recoverable_dnode(page))
 			break;
-- 
1.9.1


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v5] f2fs: avoid dead loop in function find_fsync_dnodes
  2017-12-20  5:03 [PATCH v5] f2fs: avoid dead loop in function find_fsync_dnodes Yunlei He
@ 2017-12-20 21:00 ` Jaegeuk Kim
  2017-12-21  3:45   ` heyunlei
  2017-12-25  2:33 ` Chao Yu
  1 sibling, 1 reply; 6+ messages in thread
From: Jaegeuk Kim @ 2017-12-20 21:00 UTC (permalink / raw)
  To: Yunlei He; +Cc: ning.jia, linux-f2fs-devel

Hi Yunlei,

On 12/20, Yunlei He wrote:
> v4 -> v5 return err instead destory inode list for two reasons:
> i.  avoid duplicated destroy inode list in function recover_fsync_data.
> ii. report an error for recovery, and set need_fsck flag
>         
> Came across a dead loop in recovery like this:
> 
> ......
> [   24.680480s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597696
> [   24.698394s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597697
> [   24.724334s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [   24.724334s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [   24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [   24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [   24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [   24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [   24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [   24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [   24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [   24.724426s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> ......
> 
> Mount process will block in dead loop and fsck can do nothing with this
> error, This patch abandon recovery if node chain is cyclical.
> 
> Signed-off-by: Yunlei He <heyunlei@huawei.com>
> ---
>  fs/f2fs/recovery.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> index d025aa8..a535ec2 100644
> --- a/fs/f2fs/recovery.c
> +++ b/fs/f2fs/recovery.c
> @@ -216,6 +216,17 @@ static int find_fsync_dnodes(struct f2fs_sb_info *sbi, struct list_head *head)
>  			return 0;
>  
>  		page = get_tmp_page(sbi, blkaddr);
> +		if (PageChecked(page)) {
> +			f2fs_msg(sbi->sb, KERN_ERR, "Abandon looped node block list");
> +			err = -EINVAL;

Is there any problem, if we break here without error, so that we could recover
some of the inodes in the chain as best efforts?

Thanks,

> +			break;
> +		}
> +
> +		/*
> +		 * it's not needed to clear PG_checked flag in temp page since we
> +		 * will truncate all those pages in the end of recovery.
> +		 */
> +		SetPageChecked(page);
>  
>  		if (!is_recoverable_dnode(page))
>  			break;
> -- 
> 1.9.1

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v5] f2fs: avoid dead loop in function find_fsync_dnodes
  2017-12-20 21:00 ` Jaegeuk Kim
@ 2017-12-21  3:45   ` heyunlei
  0 siblings, 0 replies; 6+ messages in thread
From: heyunlei @ 2017-12-21  3:45 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: Jianing (Euler), linux-f2fs-devel



>-----Original Message-----
>From: Jaegeuk Kim [mailto:jaegeuk@kernel.org]
>Sent: Thursday, December 21, 2017 5:01 AM
>To: heyunlei
>Cc: Yuchao (T); linux-f2fs-devel@lists.sourceforge.net; Wangbintian; Jianing (Euler)
>Subject: Re: [f2fs-dev][PATCH v5] f2fs: avoid dead loop in function find_fsync_dnodes
>
>Hi Yunlei,
Hi Kim,
>
>On 12/20, Yunlei He wrote:
>> v4 -> v5 return err instead destory inode list for two reasons:
>> i.  avoid duplicated destroy inode list in function recover_fsync_data.
>> ii. report an error for recovery, and set need_fsck flag
>>
>> Came across a dead loop in recovery like this:
>>
>> ......
>> [   24.680480s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597696
>> [   24.698394s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597697
>> [   24.724334s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>> [   24.724334s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>> [   24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>> [   24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>> [   24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>> [   24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>> [   24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>> [   24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>> [   24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>> [   24.724426s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>> ......
>>
>> Mount process will block in dead loop and fsck can do nothing with this
>> error, This patch abandon recovery if node chain is cyclical.
>>
>> Signed-off-by: Yunlei He <heyunlei@huawei.com>
>> ---
>>  fs/f2fs/recovery.c | 11 +++++++++++
>>  1 file changed, 11 insertions(+)
>>
>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
>> index d025aa8..a535ec2 100644
>> --- a/fs/f2fs/recovery.c
>> +++ b/fs/f2fs/recovery.c
>> @@ -216,6 +216,17 @@ static int find_fsync_dnodes(struct f2fs_sb_info *sbi, struct list_head *head)
>>  			return 0;
>>
>>  		page = get_tmp_page(sbi, blkaddr);
>> +		if (PageChecked(page)) {
>> +			f2fs_msg(sbi->sb, KERN_ERR, "Abandon looped node block list");
>> +			err = -EINVAL;
>
>Is there any problem, if we break here without error, so that we could recover
>some of the inodes in the chain as best efforts?

I think it's a corner case caused by some weird reason. Maybe data in recover node
Chain is also polluted and unreliable. So I think we'd better abandon this recovery.  

Thanks,
>
>Thanks,
>
>> +			break;
>> +		}
>> +
>> +		/*
>> +		 * it's not needed to clear PG_checked flag in temp page since we
>> +		 * will truncate all those pages in the end of recovery.
>> +		 */
>> +		SetPageChecked(page);
>>
>>  		if (!is_recoverable_dnode(page))
>>  			break;
>> --
>> 1.9.1
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v5] f2fs: avoid dead loop in function find_fsync_dnodes
  2017-12-20  5:03 [PATCH v5] f2fs: avoid dead loop in function find_fsync_dnodes Yunlei He
  2017-12-20 21:00 ` Jaegeuk Kim
@ 2017-12-25  2:33 ` Chao Yu
  2017-12-28  2:37   ` Jaegeuk Kim
  1 sibling, 1 reply; 6+ messages in thread
From: Chao Yu @ 2017-12-25  2:33 UTC (permalink / raw)
  To: Yunlei He, jaegeuk, linux-f2fs-devel; +Cc: ning.jia

On 2017/12/20 13:03, Yunlei He wrote:
> v4 -> v5 return err instead destory inode list for two reasons:
> i.  avoid duplicated destroy inode list in function recover_fsync_data.
> ii. report an error for recovery, and set need_fsck flag

fsck can't fix this issue so far, it doesn't work even we set need_fsck flag,
would it be better to just drop last dnode with corrupted next_blkaddr and try
our best to recover fsynced data?

Thanks,

>         
> Came across a dead loop in recovery like this:
> 
> ......
> [   24.680480s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597696
> [   24.698394s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597697
> [   24.724334s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [   24.724334s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [   24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [   24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [   24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [   24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [   24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [   24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [   24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> [   24.724426s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> ......
> 
> Mount process will block in dead loop and fsck can do nothing with this
> error, This patch abandon recovery if node chain is cyclical.
> 
> Signed-off-by: Yunlei He <heyunlei@huawei.com>
> ---
>  fs/f2fs/recovery.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> index d025aa8..a535ec2 100644
> --- a/fs/f2fs/recovery.c
> +++ b/fs/f2fs/recovery.c
> @@ -216,6 +216,17 @@ static int find_fsync_dnodes(struct f2fs_sb_info *sbi, struct list_head *head)
>  			return 0;
>  
>  		page = get_tmp_page(sbi, blkaddr);
> +		if (PageChecked(page)) {
> +			f2fs_msg(sbi->sb, KERN_ERR, "Abandon looped node block list");
> +			err = -EINVAL;
> +			break;
> +		}
> +
> +		/*
> +		 * it's not needed to clear PG_checked flag in temp page since we
> +		 * will truncate all those pages in the end of recovery.
> +		 */
> +		SetPageChecked(page);
>  
>  		if (!is_recoverable_dnode(page))
>  			break;
> 


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v5] f2fs: avoid dead loop in function find_fsync_dnodes
  2017-12-25  2:33 ` Chao Yu
@ 2017-12-28  2:37   ` Jaegeuk Kim
  2018-01-04  3:07     ` Chao Yu
  0 siblings, 1 reply; 6+ messages in thread
From: Jaegeuk Kim @ 2017-12-28  2:37 UTC (permalink / raw)
  To: Chao Yu; +Cc: ning.jia, Yunlei He, linux-f2fs-devel

On 12/25, Chao Yu wrote:
> On 2017/12/20 13:03, Yunlei He wrote:
> > v4 -> v5 return err instead destory inode list for two reasons:
> > i.  avoid duplicated destroy inode list in function recover_fsync_data.
> > ii. report an error for recovery, and set need_fsck flag
> 
> fsck can't fix this issue so far, it doesn't work even we set need_fsck flag,
> would it be better to just drop last dnode with corrupted next_blkaddr and try
> our best to recover fsynced data?

Yup, agreed to this. There's no way to recover this, if we just return an error.

> 
> Thanks,
> 
> >         
> > Came across a dead loop in recovery like this:
> > 
> > ......
> > [   24.680480s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597696
> > [   24.698394s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597697
> > [   24.724334s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> > [   24.724334s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> > [   24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> > [   24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> > [   24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> > [   24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> > [   24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> > [   24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> > [   24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> > [   24.724426s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
> > ......
> > 
> > Mount process will block in dead loop and fsck can do nothing with this
> > error, This patch abandon recovery if node chain is cyclical.
> > 
> > Signed-off-by: Yunlei He <heyunlei@huawei.com>
> > ---
> >  fs/f2fs/recovery.c | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> > 
> > diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> > index d025aa8..a535ec2 100644
> > --- a/fs/f2fs/recovery.c
> > +++ b/fs/f2fs/recovery.c
> > @@ -216,6 +216,17 @@ static int find_fsync_dnodes(struct f2fs_sb_info *sbi, struct list_head *head)
> >  			return 0;
> >  
> >  		page = get_tmp_page(sbi, blkaddr);
> > +		if (PageChecked(page)) {
> > +			f2fs_msg(sbi->sb, KERN_ERR, "Abandon looped node block list");
> > +			err = -EINVAL;
> > +			break;
> > +		}
> > +
> > +		/*
> > +		 * it's not needed to clear PG_checked flag in temp page since we
> > +		 * will truncate all those pages in the end of recovery.
> > +		 */
> > +		SetPageChecked(page);
> >  
> >  		if (!is_recoverable_dnode(page))
> >  			break;
> > 

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v5] f2fs: avoid dead loop in function find_fsync_dnodes
  2017-12-28  2:37   ` Jaegeuk Kim
@ 2018-01-04  3:07     ` Chao Yu
  0 siblings, 0 replies; 6+ messages in thread
From: Chao Yu @ 2018-01-04  3:07 UTC (permalink / raw)
  To: Yunlei He; +Cc: Jaegeuk Kim, ning.jia, linux-f2fs-devel

Hi, Yunlei,

Could you rework this patch as we discussed?

Thanks,

On 2017/12/28 10:37, Jaegeuk Kim wrote:
> On 12/25, Chao Yu wrote:
>> On 2017/12/20 13:03, Yunlei He wrote:
>>> v4 -> v5 return err instead destory inode list for two reasons:
>>> i.  avoid duplicated destroy inode list in function recover_fsync_data.
>>> ii. report an error for recovery, and set need_fsck flag
>>
>> fsck can't fix this issue so far, it doesn't work even we set need_fsck flag,
>> would it be better to just drop last dnode with corrupted next_blkaddr and try
>> our best to recover fsynced data?
> 
> Yup, agreed to this. There's no way to recover this, if we just return an error.
> 
>>
>> Thanks,
>>
>>>         
>>> Came across a dead loop in recovery like this:
>>>
>>> ......
>>> [   24.680480s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597696
>>> [   24.698394s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597697
>>> [   24.724334s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>>> [   24.724334s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>>> [   24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>>> [   24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>>> [   24.724365s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>>> [   24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>>> [   24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>>> [   24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>>> [   24.724395s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>>> [   24.724426s][pid:320,cpu0,init]find_fsync_dnodes: blkaddr =13597698
>>> ......
>>>
>>> Mount process will block in dead loop and fsck can do nothing with this
>>> error, This patch abandon recovery if node chain is cyclical.
>>>
>>> Signed-off-by: Yunlei He <heyunlei@huawei.com>
>>> ---
>>>  fs/f2fs/recovery.c | 11 +++++++++++
>>>  1 file changed, 11 insertions(+)
>>>
>>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
>>> index d025aa8..a535ec2 100644
>>> --- a/fs/f2fs/recovery.c
>>> +++ b/fs/f2fs/recovery.c
>>> @@ -216,6 +216,17 @@ static int find_fsync_dnodes(struct f2fs_sb_info *sbi, struct list_head *head)
>>>  			return 0;
>>>  
>>>  		page = get_tmp_page(sbi, blkaddr);
>>> +		if (PageChecked(page)) {
>>> +			f2fs_msg(sbi->sb, KERN_ERR, "Abandon looped node block list");
>>> +			err = -EINVAL;
>>> +			break;
>>> +		}
>>> +
>>> +		/*
>>> +		 * it's not needed to clear PG_checked flag in temp page since we
>>> +		 * will truncate all those pages in the end of recovery.
>>> +		 */
>>> +		SetPageChecked(page);
>>>  
>>>  		if (!is_recoverable_dnode(page))
>>>  			break;
>>>
> 
> .
> 


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-01-04  3:08 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-20  5:03 [PATCH v5] f2fs: avoid dead loop in function find_fsync_dnodes Yunlei He
2017-12-20 21:00 ` Jaegeuk Kim
2017-12-21  3:45   ` heyunlei
2017-12-25  2:33 ` Chao Yu
2017-12-28  2:37   ` Jaegeuk Kim
2018-01-04  3:07     ` Chao Yu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.