linux-f2fs-devel.lists.sourceforge.net archive mirror
 help / color / mirror / Atom feed
* Re: [f2fs-dev] F2fs panic | update_sit_entry() | no free segment | se invalid
       [not found] <381334f9.10021.1704dcbf102.Coremail.spearmao@126.com>
@ 2020-02-17  3:50 ` Chao Yu
       [not found]   ` <6d0f19c8.9bd4.170518c5bf1.Coremail.spearmao@126.com>
       [not found]   ` <20200217052829.GD20234@codeaurora.org>
  0 siblings, 2 replies; 3+ messages in thread
From: Chao Yu @ 2020-02-17  3:50 UTC (permalink / raw)
  To: 王矛; +Cc: linux-f2fs-devel

Hi 王矛,

On 2020/2/16 19:39, 王矛 wrote:
> *So the problem is:*
> 
> 1. in new_curseg(), if the segno allocated is invalid(no free segment, max segno
> is returned).
> 
> F2fs should do something to indicate this exception.
> 
> 2. otherwise, we may hit the f2fs panic(se invalid).
> 
> Maybe we should do sanity check in update_sit_entry() to see if segno is really
> out of range and caused this panic.

I'm afraid it's too late to handle such error in update_sit_entry(), since we
expect all procedures in do_write_page() will be successful, it's a little hard
to handle such error in that context.

So the problem here is why we can not find any free segments w/ LFS allocation,
because in case of lack of free segments (check via has_not_enough_free_secs()),
f2fs will force to trigger f2fs_gc() to recycle free sections.

I doubt there may be some corner case we haven't considered, result all free
segments (including reserved free segments) was exhausted by data/node writes
when last checkpoint is triggered during umount.

If this issue can be reproduced (during umount, free_segments() <
reserved_segments()), we can add some logs to see why f2fs_balance_fs() fail to
recycle enough free segments previously.

Thanks,

> 
> 
> Thanks,
> Mao
> 
> 
> 
>  
> 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [f2fs-dev] F2fs panic | update_sit_entry() | no free segment | se invalid
       [not found]   ` <6d0f19c8.9bd4.170518c5bf1.Coremail.spearmao@126.com>
@ 2020-02-17  7:10     ` Chao Yu
  0 siblings, 0 replies; 3+ messages in thread
From: Chao Yu @ 2020-02-17  7:10 UTC (permalink / raw)
  To: 王矛; +Cc: linux-f2fs-devel

On 2020/2/17 13:08, 王矛 wrote:
> 
> Hi Chao,
> 
> Thanks for your prompt reply.
> 1. Is it possible that when the issue occurred, there is indeed all free segment
> used up?

No, f2fs reserved some sections for foreground GC, no one should use up those
reserved sections, though foreground GC may use them temporarily to migrate blocks.

>     (End user used up all free segments, system core service(uid root) used up
> all reserved segments)
> 
>     If this maybe true, it would lead such crash issue, is this right?
>     I mean if we can add more detailed check to see if there is free segment and
> return ENOSPC in time before hit crash.
> 2. In data/node write path, it will call has_not_enough_free_secs(sbi, 0, 0) to
> make sure there is enough segments.

That is for reclaim path in where wbc->for_reclaim should be true.

But anyway, in data/node write path, we may call f2fs_balance_fs() to check
foreground GC water line.

> But in this crash callstack, it crashed in
> do_write_page()->f2fs_allocate_data_block().
> Is there any corner case that has_not_enough_free_secs() is missed during this
> write procedure?

I've no idea, one way is to check whether there is a determinate method to
reproduce this issue, if there is, maybe I can check this issue in more details.

Thanks,

> 
> Thanks,
> Mao
> 
> 
> At 2020-02-17 11:50:56, "Chao Yu" <yuchao0@huawei.com> wrote:
>>Hi 王矛,
>>
>>On 2020/2/16 19:39, 王矛 wrote:
>>> *So the problem is:*
>>> 
>>> 1. in new_curseg(), if the segno allocated is invalid(no free segment, max segno
>>> is returned).
>>> 
>>> F2fs should do something to indicate this exception.
>>> 
>>> 2. otherwise, we may hit the f2fs panic(se invalid).
>>> 
>>> Maybe we should do sanity check in update_sit_entry() to see if segno is really
>>> out of range and caused this panic.
>>
>>I'm afraid it's too late to handle such error in update_sit_entry(), since we
>>expect all procedures in do_write_page() will be successful, it's a little hard
>>to handle such error in that context.
>>
>>So the problem here is why we can not find any free segments w/ LFS allocation,
>>because in case of lack of free segments (check via has_not_enough_free_secs()),
>>f2fs will force to trigger f2fs_gc() to recycle free sections.
>>
>>I doubt there may be some corner case we haven't considered, result all free
>>segments (including reserved free segments) was exhausted by data/node writes
>>when last checkpoint is triggered during umount.
>>
>>If this issue can be reproduced (during umount, free_segments() <
>>reserved_segments()), we can add some logs to see why f2fs_balance_fs() fail to
>>recycle enough free segments previously.
>>
>>Thanks,
>>
>>> 
>>> 
>>> Thanks,
>>> Mao
>>> 
>>> 
>>> 
>>>  
>>> 
> 
> 
> 
>  
> 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [f2fs-dev] F2fs panic | update_sit_entry() | no free segment | se invalid
       [not found]     ` <4a686a8b.6490.1709f940c2a.Coremail.spearmao@126.com>
@ 2020-03-03 11:38       ` Chao Yu
  0 siblings, 0 replies; 3+ messages in thread
From: Chao Yu @ 2020-03-03 11:38 UTC (permalink / raw)
  To: 王矛, Sahitya Tummala; +Cc: linux-f2fs-devel

On 2020/3/3 16:47, 王矛 wrote:
> 
> Hi Chao,
> 
> Can u kindly help to check Sahitya's comments?
> 
> Thanks,
> Mao
> 
> 
> At 2020-02-17 13:28:29, "Sahitya Tummala" <stummala@codeaurora.org> wrote:
>>Hi Chao,
>>
>>On Mon, Feb 17, 2020 at 11:50:56AM +0800, Chao Yu wrote:
>>> Hi 王矛,
>>> 
>>> On 2020/2/16 19:39, 王矛 wrote:
>>> > *So the problem is:*
>>> > 
>>> > 1. in new_curseg(), if the segno allocated is invalid(no free segment, max segno
>>> > is returned).
>>> > 
>>> > F2fs should do something to indicate this exception.
>>> > 
>>> > 2. otherwise, we may hit the f2fs panic(se invalid).
>>> > 
>>> > Maybe we should do sanity check in update_sit_entry() to see if segno is really
>>> > out of range and caused this panic.
>>> 
>>> I'm afraid it's too late to handle such error in update_sit_entry(), since we
>>> expect all procedures in do_write_page() will be successful, it's a little hard
>>> to handle such error in that context.
>>> 
>>> So the problem here is why we can not find any free segments w/ LFS allocation,
>>> because in case of lack of free segments (check via has_not_enough_free_secs()),
>>> f2fs will force to trigger f2fs_gc() to recycle free sections.
>>> 
>>
>>I think this issue can happen when the __write_data_page is invoked in
>>non-reclaim path (wbc->for_reclaim is false) and there aren't enough free_secs
>>available. In this case, we won't at all check for has_not_enough_free_secs() and
>>don't do f2fs_balance_fs().
>>
>>I think __write_data_page() can be enhanced to fix this case as below.
>>Can you please check and provide your feedback?
>>
>>diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>>index 66b02d3..71e1221 100644
>>--- a/fs/f2fs/data.c
>>+++ b/fs/f2fs/data.c
>>@@ -2176,10 +2176,18 @@ static int __write_data_page(struct page *page, bool *submitted,
>>
>>        if (!wbc->for_reclaim)
>>                need_balance_fs = true;
>>-       else if (has_not_enough_free_secs(sbi, 0, 0))
>>-               goto redirty_out;
>>-       else
>>-               set_inode_flag(inode, FI_HOT_DATA);
>>+
>>+       if (has_not_enough_free_secs(sbi, 0, 0)) {
>>+               if (!wbc->for_reclaim && !free_sections(sbi))

It's still too late if there is no free sections, as even we trigger f2fs_balance_fs
to run foreground GC, we may need a dirty segment to write data w/ SSR mode, however
all dirty segment could be node type...and prefree segment couldn't be reused as we
didn't finish calling last checkpoint during FGGC.

And with this way, we may cover the root cause of this issue, I guess we'd better
check why we have no free sections here first.

Thanks,

>>+                       goto redirty_out;
>>+               if (wbc->for_reclaim) {
>>+                       if (!free_sections(sbi))
>>+                               need_balance_fs = true;
>>+                       goto redirty_out;
>>+               }
>>+       }
>>+
>>+       set_inode_flag(inode, FI_HOT_DATA);
>>
>>        err = -EAGAIN;
>>        if (f2fs_has_inline_data(inode)) {
>>@@ -2252,6 +2260,9 @@ static int __write_data_page(struct page *page, bool *submitted,
>>        if (!err || wbc->for_reclaim)
>>                return AOP_WRITEPAGE_ACTIVATE;
>>        unlock_page(page);
>>+       if (!err && !S_ISDIR(inode->i_mode) && !IS_NOQUOTA(inode) &&
>>+                                       !F2FS_I(inode)->cp_task)
>>+               f2fs_balance_fs(sbi, need_balance_fs);
>>        return err;
>> }
>>
>>Thanks,
>>
>>> I doubt there may be some corner case we haven't considered, result all free
>>> segments (including reserved free segments) was exhausted by data/node writes
>>> when last checkpoint is triggered during umount.
>>> 
>>> If this issue can be reproduced (during umount, free_segments() <
>>> reserved_segments()), we can add some logs to see why f2fs_balance_fs() fail to
>>> recycle enough free segments previously.
>>> 
>>> Thanks,
>>> 
>>> > 
>>> > 
>>> > Thanks,
>>> > Mao
>>> > 
>>> > 
>>> > 
>>> >  
>>> > 
>>> 
>>> 
>>> _______________________________________________
>>> Linux-f2fs-devel mailing list
>>> Linux-f2fs-devel@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>
>>-- 
>>--
>>Sent by a consultant of the Qualcomm Innovation Center, Inc.
>>The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.
> 
> 
> 
>  
> 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-03-03 11:39 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <381334f9.10021.1704dcbf102.Coremail.spearmao@126.com>
2020-02-17  3:50 ` [f2fs-dev] F2fs panic | update_sit_entry() | no free segment | se invalid Chao Yu
     [not found]   ` <6d0f19c8.9bd4.170518c5bf1.Coremail.spearmao@126.com>
2020-02-17  7:10     ` Chao Yu
     [not found]   ` <20200217052829.GD20234@codeaurora.org>
     [not found]     ` <4a686a8b.6490.1709f940c2a.Coremail.spearmao@126.com>
2020-03-03 11:38       ` Chao Yu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).