Linux-f2fs-devel Archive on lore.kernel.org
 help / color / Atom feed
* [f2fs-dev] [PATCH] f2fs: Fix indefinite loop in f2fs_gc()
@ 2019-07-29  5:20 Sahitya Tummala
  2019-07-29 16:00 ` Chao Yu
  0 siblings, 1 reply; 6+ messages in thread
From: Sahitya Tummala @ 2019-07-29  5:20 UTC (permalink / raw)
  To: Jaegeuk Kim, Chao Yu, linux-f2fs-devel; +Cc: linux-kernel

Policy - foreground GC, LFS mode and greedy GC mode.

Under this policy, f2fs_gc() loops forever to GC as it doesn't have
enough free segements to proceed and thus it keeps calling gc_more
for the same victim segment.  This can happen if the selected victim
segment could not be GC'd due to failed blkaddr validity check i.e.
is_alive() returns false for the blocks set in current validity map.

Fix this by not resetting the sbi->cur_victim_sec to NULL_SEGNO, when
the segment selected could not be GC'd. This helps to select another
segment for GC and thus helps to proceed forward with GC.

Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
---
 fs/f2fs/gc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 8974672..7bbcc4a 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -1303,7 +1303,7 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
 		round++;
 	}
 
-	if (gc_type == FG_GC)
+	if (gc_type == FG_GC && seg_freed)
 		sbi->cur_victim_sec = NULL_SEGNO;
 
 	if (sync)
-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.



_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: Fix indefinite loop in f2fs_gc()
  2019-07-29  5:20 [f2fs-dev] [PATCH] f2fs: Fix indefinite loop in f2fs_gc() Sahitya Tummala
@ 2019-07-29 16:00 ` Chao Yu
  2019-07-30  4:36   ` Sahitya Tummala
  0 siblings, 1 reply; 6+ messages in thread
From: Chao Yu @ 2019-07-29 16:00 UTC (permalink / raw)
  To: Sahitya Tummala, Jaegeuk Kim, Chao Yu, linux-f2fs-devel; +Cc: linux-kernel

Hi Sahitya,

On 2019-7-29 13:20, Sahitya Tummala wrote:
> Policy - foreground GC, LFS mode and greedy GC mode.
> 
> Under this policy, f2fs_gc() loops forever to GC as it doesn't have
> enough free segements to proceed and thus it keeps calling gc_more
> for the same victim segment.  This can happen if the selected victim
> segment could not be GC'd due to failed blkaddr validity check i.e.
> is_alive() returns false for the blocks set in current validity map.
> 
> Fix this by not resetting the sbi->cur_victim_sec to NULL_SEGNO, when
> the segment selected could not be GC'd. This helps to select another
> segment for GC and thus helps to proceed forward with GC.
> 
> Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
> ---
>  fs/f2fs/gc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> index 8974672..7bbcc4a 100644
> --- a/fs/f2fs/gc.c
> +++ b/fs/f2fs/gc.c
> @@ -1303,7 +1303,7 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
>  		round++;
>  	}
>  
> -	if (gc_type == FG_GC)
> +	if (gc_type == FG_GC && seg_freed)
>  		sbi->cur_victim_sec = NULL_SEGNO;

In some cases, we may remain last victim in sbi->cur_victim_sec, and jump out of
GC cycle, then SSR can skip the last victim due to sec_usage_check()...

Thanks,

>  
>  	if (sync)
> 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: Fix indefinite loop in f2fs_gc()
  2019-07-29 16:00 ` Chao Yu
@ 2019-07-30  4:36   ` Sahitya Tummala
  2019-07-30 12:35     ` Chao Yu
  0 siblings, 1 reply; 6+ messages in thread
From: Sahitya Tummala @ 2019-07-30  4:36 UTC (permalink / raw)
  To: Chao Yu; +Cc: Jaegeuk Kim, linux-kernel, linux-f2fs-devel

Hi Chao,

On Tue, Jul 30, 2019 at 12:00:45AM +0800, Chao Yu wrote:
> Hi Sahitya,
> 
> On 2019-7-29 13:20, Sahitya Tummala wrote:
> > Policy - foreground GC, LFS mode and greedy GC mode.
> > 
> > Under this policy, f2fs_gc() loops forever to GC as it doesn't have
> > enough free segements to proceed and thus it keeps calling gc_more
> > for the same victim segment.  This can happen if the selected victim
> > segment could not be GC'd due to failed blkaddr validity check i.e.
> > is_alive() returns false for the blocks set in current validity map.
> > 
> > Fix this by not resetting the sbi->cur_victim_sec to NULL_SEGNO, when
> > the segment selected could not be GC'd. This helps to select another
> > segment for GC and thus helps to proceed forward with GC.
> > 
> > Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
> > ---
> >  fs/f2fs/gc.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> > index 8974672..7bbcc4a 100644
> > --- a/fs/f2fs/gc.c
> > +++ b/fs/f2fs/gc.c
> > @@ -1303,7 +1303,7 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
> >  		round++;
> >  	}
> >  
> > -	if (gc_type == FG_GC)
> > +	if (gc_type == FG_GC && seg_freed)
> >  		sbi->cur_victim_sec = NULL_SEGNO;
> 
> In some cases, we may remain last victim in sbi->cur_victim_sec, and jump out of
> GC cycle, then SSR can skip the last victim due to sec_usage_check()...
> 

I see. I have a few questions on how to fix this issue. Please share your
comments.

1. Do you think the scenario described is valid? It happens rarely, not very
easy to reproduce.  From the dumps, I see that only block is set as valid in
the sentry->cur_valid_map for which I see that summary block check is_alive()
could return false. As only one block is set as valid, chances are there it
can be always selected as the victim by get_victim_by_default() under FG_GC.

2. What are the possible scenarios where summary block check is_alive() could
fail for a segment?

3. How does GC handle such segments?

Thanks,

> Thanks,
> 
> >  
> >  	if (sync)
> > 

-- 
--
Sent by a consultant of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: Fix indefinite loop in f2fs_gc()
  2019-07-30  4:36   ` Sahitya Tummala
@ 2019-07-30 12:35     ` Chao Yu
  2019-07-31  3:41       ` Sahitya Tummala
  0 siblings, 1 reply; 6+ messages in thread
From: Chao Yu @ 2019-07-30 12:35 UTC (permalink / raw)
  To: Sahitya Tummala, Chao Yu; +Cc: Jaegeuk Kim, linux-kernel, linux-f2fs-devel

Hi Sahitya,

On 2019/7/30 12:36, Sahitya Tummala wrote:
> Hi Chao,
> 
> On Tue, Jul 30, 2019 at 12:00:45AM +0800, Chao Yu wrote:
>> Hi Sahitya,
>>
>> On 2019-7-29 13:20, Sahitya Tummala wrote:
>>> Policy - foreground GC, LFS mode and greedy GC mode.
>>>
>>> Under this policy, f2fs_gc() loops forever to GC as it doesn't have
>>> enough free segements to proceed and thus it keeps calling gc_more
>>> for the same victim segment.  This can happen if the selected victim
>>> segment could not be GC'd due to failed blkaddr validity check i.e.
>>> is_alive() returns false for the blocks set in current validity map.
>>>
>>> Fix this by not resetting the sbi->cur_victim_sec to NULL_SEGNO, when
>>> the segment selected could not be GC'd. This helps to select another
>>> segment for GC and thus helps to proceed forward with GC.
>>>
>>> Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
>>> ---
>>>  fs/f2fs/gc.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
>>> index 8974672..7bbcc4a 100644
>>> --- a/fs/f2fs/gc.c
>>> +++ b/fs/f2fs/gc.c
>>> @@ -1303,7 +1303,7 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
>>>  		round++;
>>>  	}
>>>  
>>> -	if (gc_type == FG_GC)
>>> +	if (gc_type == FG_GC && seg_freed)
>>>  		sbi->cur_victim_sec = NULL_SEGNO;
>>
>> In some cases, we may remain last victim in sbi->cur_victim_sec, and jump out of
>> GC cycle, then SSR can skip the last victim due to sec_usage_check()...
>>
> 
> I see. I have a few questions on how to fix this issue. Please share your
> comments.
> 
> 1. Do you think the scenario described is valid? It happens rarely, not very

IIRC, we suffered endless gc loop due to there is valid block belong to an
opened atomic write file. (because we will skip directly once we hit atomic file)

For your case, I'm not sure that would happen, did you look into is_alive(), why
will it fail? block address not match? If so, it looks like summary info and
dnode block and nat entry are inconsistent.

> easy to reproduce.  From the dumps, I see that only block is set as valid in
> the sentry->cur_valid_map for which I see that summary block check is_alive()
> could return false. As only one block is set as valid, chances are there it
> can be always selected as the victim by get_victim_by_default() under FG_GC.
> 
> 2. What are the possible scenarios where summary block check is_alive() could
> fail for a segment?

I guess, maybe after check_valid_map(), the block is been truncated before
is_alive(). If so the victim should be prefree directly instead of being
selected again...

> 
> 3. How does GC handle such segments?

I think that's not a normal case, or I'm missing something.

Thanks,

> 
> Thanks,
> 
>> Thanks,
>>
>>>  
>>>  	if (sync)
>>>
> 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: Fix indefinite loop in f2fs_gc()
  2019-07-30 12:35     ` Chao Yu
@ 2019-07-31  3:41       ` Sahitya Tummala
  2019-07-31 10:34         ` Chao Yu
  0 siblings, 1 reply; 6+ messages in thread
From: Sahitya Tummala @ 2019-07-31  3:41 UTC (permalink / raw)
  To: Chao Yu; +Cc: Jaegeuk Kim, linux-kernel, linux-f2fs-devel

Hi Chao,

On Tue, Jul 30, 2019 at 08:35:46PM +0800, Chao Yu wrote:
> Hi Sahitya,
> 
> On 2019/7/30 12:36, Sahitya Tummala wrote:
> > Hi Chao,
> > 
> > On Tue, Jul 30, 2019 at 12:00:45AM +0800, Chao Yu wrote:
> >> Hi Sahitya,
> >>
> >> On 2019-7-29 13:20, Sahitya Tummala wrote:
> >>> Policy - foreground GC, LFS mode and greedy GC mode.
> >>>
> >>> Under this policy, f2fs_gc() loops forever to GC as it doesn't have
> >>> enough free segements to proceed and thus it keeps calling gc_more
> >>> for the same victim segment.  This can happen if the selected victim
> >>> segment could not be GC'd due to failed blkaddr validity check i.e.
> >>> is_alive() returns false for the blocks set in current validity map.
> >>>
> >>> Fix this by not resetting the sbi->cur_victim_sec to NULL_SEGNO, when
> >>> the segment selected could not be GC'd. This helps to select another
> >>> segment for GC and thus helps to proceed forward with GC.
> >>>
> >>> Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
> >>> ---
> >>>  fs/f2fs/gc.c | 2 +-
> >>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> >>> index 8974672..7bbcc4a 100644
> >>> --- a/fs/f2fs/gc.c
> >>> +++ b/fs/f2fs/gc.c
> >>> @@ -1303,7 +1303,7 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
> >>>  		round++;
> >>>  	}
> >>>  
> >>> -	if (gc_type == FG_GC)
> >>> +	if (gc_type == FG_GC && seg_freed)
> >>>  		sbi->cur_victim_sec = NULL_SEGNO;
> >>
> >> In some cases, we may remain last victim in sbi->cur_victim_sec, and jump out of
> >> GC cycle, then SSR can skip the last victim due to sec_usage_check()...
> >>
> > 
> > I see. I have a few questions on how to fix this issue. Please share your
> > comments.
> > 
> > 1. Do you think the scenario described is valid? It happens rarely, not very
> 
> IIRC, we suffered endless gc loop due to there is valid block belong to an
> opened atomic write file. (because we will skip directly once we hit atomic file)
> 
> For your case, I'm not sure that would happen, did you look into is_alive(), why
> will it fail? block address not match? If so, it looks like summary info and
> dnode block and nat entry are inconsistent.

Yes, from the ramdumps, I could see that block address is not matching and
hence, is_alive() could fail in the issue scenario. Have you observed any such
cases before? What could be the reason for this mismatch?

Thanks,

> 
> > easy to reproduce.  From the dumps, I see that only block is set as valid in
> > the sentry->cur_valid_map for which I see that summary block check is_alive()
> > could return false. As only one block is set as valid, chances are there it
> > can be always selected as the victim by get_victim_by_default() under FG_GC.
> > 
> > 2. What are the possible scenarios where summary block check is_alive() could
> > fail for a segment?
> 
> I guess, maybe after check_valid_map(), the block is been truncated before
> is_alive(). If so the victim should be prefree directly instead of being
> selected again...
> 
> > 
> > 3. How does GC handle such segments?
> 
> I think that's not a normal case, or I'm missing something.
> 
> Thanks,
> 
> > 
> > Thanks,
> > 
> >> Thanks,
> >>
> >>>  
> >>>  	if (sync)
> >>>
> > 

-- 
--
Sent by a consultant of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: Fix indefinite loop in f2fs_gc()
  2019-07-31  3:41       ` Sahitya Tummala
@ 2019-07-31 10:34         ` Chao Yu
  0 siblings, 0 replies; 6+ messages in thread
From: Chao Yu @ 2019-07-31 10:34 UTC (permalink / raw)
  To: Sahitya Tummala; +Cc: Jaegeuk Kim, linux-kernel, linux-f2fs-devel

Hi Sahitya,

On 2019/7/31 11:41, Sahitya Tummala wrote:
> Hi Chao,
> 
> On Tue, Jul 30, 2019 at 08:35:46PM +0800, Chao Yu wrote:
>> Hi Sahitya,
>>
>> On 2019/7/30 12:36, Sahitya Tummala wrote:
>>> Hi Chao,
>>>
>>> On Tue, Jul 30, 2019 at 12:00:45AM +0800, Chao Yu wrote:
>>>> Hi Sahitya,
>>>>
>>>> On 2019-7-29 13:20, Sahitya Tummala wrote:
>>>>> Policy - foreground GC, LFS mode and greedy GC mode.
>>>>>
>>>>> Under this policy, f2fs_gc() loops forever to GC as it doesn't have
>>>>> enough free segements to proceed and thus it keeps calling gc_more
>>>>> for the same victim segment.  This can happen if the selected victim
>>>>> segment could not be GC'd due to failed blkaddr validity check i.e.
>>>>> is_alive() returns false for the blocks set in current validity map.
>>>>>
>>>>> Fix this by not resetting the sbi->cur_victim_sec to NULL_SEGNO, when
>>>>> the segment selected could not be GC'd. This helps to select another
>>>>> segment for GC and thus helps to proceed forward with GC.
>>>>>
>>>>> Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
>>>>> ---
>>>>>  fs/f2fs/gc.c | 2 +-
>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
>>>>> index 8974672..7bbcc4a 100644
>>>>> --- a/fs/f2fs/gc.c
>>>>> +++ b/fs/f2fs/gc.c
>>>>> @@ -1303,7 +1303,7 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
>>>>>  		round++;
>>>>>  	}
>>>>>  
>>>>> -	if (gc_type == FG_GC)
>>>>> +	if (gc_type == FG_GC && seg_freed)
>>>>>  		sbi->cur_victim_sec = NULL_SEGNO;
>>>>
>>>> In some cases, we may remain last victim in sbi->cur_victim_sec, and jump out of
>>>> GC cycle, then SSR can skip the last victim due to sec_usage_check()...
>>>>
>>>
>>> I see. I have a few questions on how to fix this issue. Please share your
>>> comments.
>>>
>>> 1. Do you think the scenario described is valid? It happens rarely, not very
>>
>> IIRC, we suffered endless gc loop due to there is valid block belong to an
>> opened atomic write file. (because we will skip directly once we hit atomic file)
>>
>> For your case, I'm not sure that would happen, did you look into is_alive(), why
>> will it fail? block address not match? If so, it looks like summary info and
>> dnode block and nat entry are inconsistent.
> 
> Yes, from the ramdumps, I could see that block address is not matching and
> hence, is_alive() could fail in the issue scenario. Have you observed any such
> cases before? What could be the reason for this mismatch?

Alright, I didn't suffer such case before...

I don't know, too few clues to find the root cause. I guess maybe:
- random data caused by emmc/ufs firmware bugs
- bit-flip or memory overflow
- f2fs bugs

So, for the solution, I suggest to detect such inconsistency, and tag in
somewhere to just get rid of selecting the corrupted section.

BTW, do you try fsck on that image? what's the result?

Thanks,

> 
> Thanks,
> 
>>
>>> easy to reproduce.  From the dumps, I see that only block is set as valid in
>>> the sentry->cur_valid_map for which I see that summary block check is_alive()
>>> could return false. As only one block is set as valid, chances are there it
>>> can be always selected as the victim by get_victim_by_default() under FG_GC.
>>>
>>> 2. What are the possible scenarios where summary block check is_alive() could
>>> fail for a segment?
>>
>> I guess, maybe after check_valid_map(), the block is been truncated before
>> is_alive(). If so the victim should be prefree directly instead of being
>> selected again...
>>
>>>
>>> 3. How does GC handle such segments?
>>
>> I think that's not a normal case, or I'm missing something.
>>
>> Thanks,
>>
>>>
>>> Thanks,
>>>
>>>> Thanks,
>>>>
>>>>>  
>>>>>  	if (sync)
>>>>>
>>>
> 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, back to index

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-29  5:20 [f2fs-dev] [PATCH] f2fs: Fix indefinite loop in f2fs_gc() Sahitya Tummala
2019-07-29 16:00 ` Chao Yu
2019-07-30  4:36   ` Sahitya Tummala
2019-07-30 12:35     ` Chao Yu
2019-07-31  3:41       ` Sahitya Tummala
2019-07-31 10:34         ` Chao Yu

Linux-f2fs-devel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-f2fs-devel/0 linux-f2fs-devel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-f2fs-devel linux-f2fs-devel/ https://lore.kernel.org/linux-f2fs-devel \
		linux-f2fs-devel@lists.sourceforge.net linux-f2fs-devel@archiver.kernel.org
	public-inbox-index linux-f2fs-devel


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/net.sourceforge.lists.linux-f2fs-devel


AGPL code for this site: git clone https://public-inbox.org/ public-inbox