[PATCH 1/2] f2fs: Fix mount failure due to SPO after a successful online resize FS

* [PATCH 1/2] f2fs: Fix mount failure due to SPO after a successful online resize FS
@ 2020-02-27 10:39 ` Sahitya Tummala
  0 siblings, 0 replies; 12+ messages in thread
From: Sahitya Tummala @ 2020-02-27 10:39 UTC (permalink / raw)
  To: Jaegeuk Kim, Chao Yu, linux-f2fs-devel; +Cc: Sahitya Tummala, linux-kernel

Even when an online resize completes successfully, a sudden power-off
(SPO) immediately after the resize still causes the error below on the
next mount.

[   11.294650] F2FS-fs (sda8): Wrong user_block_count: 2233856
[   11.300272] F2FS-fs (sda8): Failed to get valid F2FS checkpoint

This is because, after the FS metadata is updated in
update_fs_metadata(), no checkpoint (CP) is written to reflect the new
user_block_count if the SBI_IS_DIRTY flag is not set.

Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
---
 fs/f2fs/gc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index a92fa49..a14a75f 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -1577,6 +1577,7 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
 
 	update_fs_metadata(sbi, -secs);
 	clear_sbi_flag(sbi, SBI_IS_RESIZEFS);
+	set_sbi_flag(sbi, SBI_IS_DIRTY);
 	err = f2fs_sync_fs(sbi->sb, 1);
 	if (err) {
 		update_fs_metadata(sbi, secs);
-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
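
The failure mode described in this patch can be illustrated with a small
userspace model. This is only a sketch, not kernel code: struct model_sbi,
model_sync_fs() and model_resize() are hypothetical names, and the model
assumes (as the commit message states) that a sync writes no checkpoint
while the dirty flag is clear.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_sbi {
	bool dirty;                   /* stands in for SBI_IS_DIRTY */
	uint64_t user_block_count;    /* in-memory value after resize */
	uint64_t cp_user_block_count; /* value held by the last checkpoint */
};

/* Models a sync that writes a checkpoint only when the fs is dirty. */
static bool model_sync_fs(struct model_sbi *sbi)
{
	if (!sbi->dirty)
		return false; /* nothing marked dirty: checkpoint skipped */
	sbi->cp_user_block_count = sbi->user_block_count;
	sbi->dirty = false;
	return true;
}

/*
 * Models the tail of a resize. mark_dirty == false is the behaviour
 * before the patch, mark_dirty == true the behaviour after it.
 */
static bool model_resize(struct model_sbi *sbi, uint64_t new_count,
			 bool mark_dirty)
{
	sbi->user_block_count = new_count;
	if (mark_dirty)
		sbi->dirty = true;
	return model_sync_fs(sbi);
}
```

With mark_dirty false the checkpointed count stays stale, which is the
state a power cut would expose on the next mount. With mark_dirty true
the new count reaches the checkpoint.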


* [PATCH 2/2] f2fs: Add a new CP flag to help fsck fix resize SPO issues
  2020-02-27 10:39 ` [f2fs-dev] " Sahitya Tummala
@ 2020-02-27 10:39   ` Sahitya Tummala
  -1 siblings, 0 replies; 12+ messages in thread
From: Sahitya Tummala @ 2020-02-27 10:39 UTC (permalink / raw)
  To: Jaegeuk Kim, Chao Yu, linux-f2fs-devel; +Cc: Sahitya Tummala, linux-kernel

Add a new CP flag, CP_RESIZEFS_FLAG, and set it during online
resize to help fsck fix the metadata mismatch that may occur when
an SPO hits during resize, where the SB has been updated but the
CP data could not be written yet.

fsck errors -
Info: CKPT version = 6ed7bccb
        Wrong user_block_count(2233856)
[f2fs_do_mount:3365] Checkpoint is polluted

Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
---
 fs/f2fs/checkpoint.c    | 8 ++++++--
 include/linux/f2fs_fs.h | 1 +
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index fdd7f3d..0bd4cdb 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -1301,10 +1301,14 @@ static void update_ckpt_flags(struct f2fs_sb_info *sbi, struct cp_control *cpc)
 	else
 		__clear_ckpt_flags(ckpt, CP_ORPHAN_PRESENT_FLAG);
 
-	if (is_sbi_flag_set(sbi, SBI_NEED_FSCK) ||
-		is_sbi_flag_set(sbi, SBI_IS_RESIZEFS))
+	if (is_sbi_flag_set(sbi, SBI_NEED_FSCK))
 		__set_ckpt_flags(ckpt, CP_FSCK_FLAG);
 
+	if (is_sbi_flag_set(sbi, SBI_IS_RESIZEFS))
+		__set_ckpt_flags(ckpt, CP_RESIZEFS_FLAG);
+	else
+		__clear_ckpt_flags(ckpt, CP_RESIZEFS_FLAG);
+
 	if (is_sbi_flag_set(sbi, SBI_CP_DISABLED))
 		__set_ckpt_flags(ckpt, CP_DISABLED_FLAG);
 	else
diff --git a/include/linux/f2fs_fs.h b/include/linux/f2fs_fs.h
index ac3f488..3c383dd 100644
--- a/include/linux/f2fs_fs.h
+++ b/include/linux/f2fs_fs.h
@@ -125,6 +125,7 @@ struct f2fs_super_block {
 /*
  * For checkpoint
  */
+#define CP_RESIZEFS_FLAG		0x00004000
 #define CP_DISABLED_QUICK_FLAG		0x00002000
 #define CP_DISABLED_FLAG		0x00001000
 #define CP_QUOTA_NEED_FSCK_FLAG		0x00000800
-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
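
The flag handling in the checkpoint.c hunk above can be modelled in
userspace as follows. This is a sketch under stated assumptions:
model_update_ckpt_flags() is a hypothetical helper, CP_RESIZEFS_FLAG uses
the value from the patch, and MODEL_FSCK_FLAG is an illustrative stand-in
bit rather than a value taken from this thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define CP_RESIZEFS_FLAG 0x00004000u /* value taken from the patch */
#define MODEL_FSCK_FLAG  0x00000010u /* illustrative stand-in bit */

/*
 * Hypothetical model of the patched logic: the fsck bit is only ever
 * set here, while the resize bit tracks the in-progress state and is
 * cleared again once the resize is no longer running.
 */
static uint32_t model_update_ckpt_flags(uint32_t flags, bool need_fsck,
					bool is_resizefs)
{
	if (need_fsck)
		flags |= MODEL_FSCK_FLAG;

	if (is_resizefs)
		flags |= CP_RESIZEFS_FLAG;
	else
		flags &= ~CP_RESIZEFS_FLAG;

	return flags;
}
```

In this model a checkpoint written during resize carries only the resize
bit, not the fsck bit, so a tool reading the checkpoint could tell an
interrupted resize apart from a generic request for fsck.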


* Re: [PATCH 1/2] f2fs: Fix mount failure due to SPO after a successful online resize FS
  2020-02-27 10:39 ` [f2fs-dev] " Sahitya Tummala
@ 2020-02-28  8:35   ` Chao Yu
  -1 siblings, 0 replies; 12+ messages in thread
From: Chao Yu @ 2020-02-28  8:35 UTC (permalink / raw)
  To: Sahitya Tummala, Jaegeuk Kim, linux-f2fs-devel; +Cc: linux-kernel

Hi Sahitya,

Good catch.

On 2020/2/27 18:39, Sahitya Tummala wrote:
> Even when an online resize completes successfully, a sudden power-off
> (SPO) immediately after the resize still causes the error below on the
> next mount.
> 
> [   11.294650] F2FS-fs (sda8): Wrong user_block_count: 2233856
> [   11.300272] F2FS-fs (sda8): Failed to get valid F2FS checkpoint
> 
> This is because, after the FS metadata is updated in
> update_fs_metadata(), no checkpoint (CP) is written to reflect the new
> user_block_count if the SBI_IS_DIRTY flag is not set.
> 
> Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
> ---
>  fs/f2fs/gc.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> index a92fa49..a14a75f 100644
> --- a/fs/f2fs/gc.c
> +++ b/fs/f2fs/gc.c
> @@ -1577,6 +1577,7 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
>  
>  	update_fs_metadata(sbi, -secs);
>  	clear_sbi_flag(sbi, SBI_IS_RESIZEFS);

Do we need a barrier here to preserve the ordering between the code above and
set_sbi_flag(sbi, SBI_IS_DIRTY)?

> +	set_sbi_flag(sbi, SBI_IS_DIRTY);
>  	err = f2fs_sync_fs(sbi->sb, 1);
>  	if (err) {
>  		update_fs_metadata(sbi, secs);

Do we need to add clear_sbi_flag(, SBI_IS_DIRTY) to update_fs_metadata() so that
the path above is covered as well?

Thanks,

> 


* Re: [PATCH 1/2] f2fs: Fix mount failure due to SPO after a successful online resize FS
  2020-02-28  8:35   ` [f2fs-dev] " Chao Yu
@ 2020-03-02  4:39     ` Sahitya Tummala
  -1 siblings, 0 replies; 12+ messages in thread
From: Sahitya Tummala @ 2020-03-02  4:39 UTC (permalink / raw)
  To: Chao Yu; +Cc: Jaegeuk Kim, linux-f2fs-devel, linux-kernel, stummala

Hi Chao,

On Fri, Feb 28, 2020 at 04:35:37PM +0800, Chao Yu wrote:
> Hi Sahitya,
> 
> Good catch.
> 
> On 2020/2/27 18:39, Sahitya Tummala wrote:
> > Even when an online resize completes successfully, a sudden power-off
> > (SPO) immediately after the resize still causes the error below on the
> > next mount.
> > 
> > [   11.294650] F2FS-fs (sda8): Wrong user_block_count: 2233856
> > [   11.300272] F2FS-fs (sda8): Failed to get valid F2FS checkpoint
> > 
> > This is because, after the FS metadata is updated in
> > update_fs_metadata(), no checkpoint (CP) is written to reflect the new
> > user_block_count if the SBI_IS_DIRTY flag is not set.
> > 
> > Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
> > ---
> >  fs/f2fs/gc.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> > index a92fa49..a14a75f 100644
> > --- a/fs/f2fs/gc.c
> > +++ b/fs/f2fs/gc.c
> > @@ -1577,6 +1577,7 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
> >  
> >  	update_fs_metadata(sbi, -secs);
> >  	clear_sbi_flag(sbi, SBI_IS_RESIZEFS);
> 
> Do we need a barrier here to preserve the ordering between the code above and
> set_sbi_flag(sbi, SBI_IS_DIRTY)?

I don't think a barrier will help here. Suppose another context is already
doing a CP: it races with update_fs_metadata(), so it may or may not see the
resize updates, and it will also clear the SBI_IS_DIRTY flag set by resize
(even with a barrier).

I think we need to synchronize this with the CP context so that the resize
changes are reflected properly. Please see the new diff below and help review it.

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index a14a75f..5554af8 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -1467,6 +1467,7 @@ static void update_fs_metadata(struct f2fs_sb_info *sbi, int secs)
        long long user_block_count =
                                le64_to_cpu(F2FS_CKPT(sbi)->user_block_count);

+       clear_sbi_flag(sbi, SBI_IS_DIRTY);
        SM_I(sbi)->segment_count = (int)SM_I(sbi)->segment_count + segs;
        MAIN_SEGS(sbi) = (int)MAIN_SEGS(sbi) + segs;
        FREE_I(sbi)->free_sections = (int)FREE_I(sbi)->free_sections + secs;
@@ -1575,9 +1576,12 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
                goto out;
        }

+       mutex_lock(&sbi->cp_mutex);
        update_fs_metadata(sbi, -secs);
        clear_sbi_flag(sbi, SBI_IS_RESIZEFS);
        set_sbi_flag(sbi, SBI_IS_DIRTY);
+       mutex_unlock(&sbi->cp_mutex);
+
        err = f2fs_sync_fs(sbi->sb, 1);
        if (err) {
                update_fs_metadata(sbi, secs);

thanks,

> 
> > +	set_sbi_flag(sbi, SBI_IS_DIRTY);
> >  	err = f2fs_sync_fs(sbi->sb, 1);
> >  	if (err) {
> >  		update_fs_metadata(sbi, secs);
> 
> Do we need to add clear_sbi_flag(, SBI_IS_DIRTY) to update_fs_metadata() so that
> the path above is covered as well?
> 
> Thanks,
> 
> > 

-- 
--
Sent by a consultant of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.


* Re: [PATCH 1/2] f2fs: Fix mount failure due to SPO after a successful online resize FS
  2020-03-02  4:39     ` [f2fs-dev] " Sahitya Tummala
@ 2020-03-03 12:06       ` Chao Yu
  -1 siblings, 0 replies; 12+ messages in thread
From: Chao Yu @ 2020-03-03 12:06 UTC (permalink / raw)
  To: Sahitya Tummala; +Cc: Jaegeuk Kim, linux-f2fs-devel, linux-kernel

Hi Sahitya,

On 2020/3/2 12:39, Sahitya Tummala wrote:
> Hi Chao,
> 
> On Fri, Feb 28, 2020 at 04:35:37PM +0800, Chao Yu wrote:
>> Hi Sahitya,
>>
>> Good catch.
>>
>> On 2020/2/27 18:39, Sahitya Tummala wrote:
>>> Even when an online resize completes successfully, a sudden power-off
>>> (SPO) immediately after the resize still causes the error below on the
>>> next mount.
>>>
>>> [   11.294650] F2FS-fs (sda8): Wrong user_block_count: 2233856
>>> [   11.300272] F2FS-fs (sda8): Failed to get valid F2FS checkpoint
>>>
>>> This is because, after the FS metadata is updated in
>>> update_fs_metadata(), no checkpoint (CP) is written to reflect the new
>>> user_block_count if the SBI_IS_DIRTY flag is not set.
>>>
>>> Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
>>> ---
>>>  fs/f2fs/gc.c | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
>>> index a92fa49..a14a75f 100644
>>> --- a/fs/f2fs/gc.c
>>> +++ b/fs/f2fs/gc.c
>>> @@ -1577,6 +1577,7 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
>>>  
>>>  	update_fs_metadata(sbi, -secs);
>>>  	clear_sbi_flag(sbi, SBI_IS_RESIZEFS);
>>
>> Do we need a barrier here to preserve the ordering between the code above and
>> set_sbi_flag(sbi, SBI_IS_DIRTY)?
> 
> I don't think a barrier will help here. Suppose another context is already
> doing a CP: it races with update_fs_metadata(), so it may or may not see the
> resize updates, and it will also clear the SBI_IS_DIRTY flag set by resize
> (even with a barrier).

Agreed. Actually, we didn't consider the race condition between CP and
update_fs_metadata(); it should be fixed.

> 
> I think we need to synchronize this with the CP context so that the resize
> changes are reflected properly. Please see the new diff below and help review it.
> 
> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> index a14a75f..5554af8 100644
> --- a/fs/f2fs/gc.c
> +++ b/fs/f2fs/gc.c
> @@ -1467,6 +1467,7 @@ static void update_fs_metadata(struct f2fs_sb_info *sbi, int secs)
>         long long user_block_count =
>                                 le64_to_cpu(F2FS_CKPT(sbi)->user_block_count);
> 
> +       clear_sbi_flag(sbi, SBI_IS_DIRTY);

Why clear the dirty flag here?

And why not use cp_mutex to protect update_fs_metadata() in the error path of
f2fs_sync_fs() below?

>         SM_I(sbi)->segment_count = (int)SM_I(sbi)->segment_count + segs;
>         MAIN_SEGS(sbi) = (int)MAIN_SEGS(sbi) + segs;
>         FREE_I(sbi)->free_sections = (int)FREE_I(sbi)->free_sections + secs;
> @@ -1575,9 +1576,12 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
>                 goto out;
>         }
> 
> +       mutex_lock(&sbi->cp_mutex);
>         update_fs_metadata(sbi, -secs);
>         clear_sbi_flag(sbi, SBI_IS_RESIZEFS);
>         set_sbi_flag(sbi, SBI_IS_DIRTY);
> +       mutex_unlock(&sbi->cp_mutex);
> +
>         err = f2fs_sync_fs(sbi->sb, 1);
>         if (err) {
>                 update_fs_metadata(sbi, secs);

		  ^^^^^^^^^^^^^^

In addition, I found that we forgot to take sb_lock to protect updates to the
f2fs_super_block fields; I will submit a patch for that.

Thanks,

> 
> thanks,
> 
>>
>>> +	set_sbi_flag(sbi, SBI_IS_DIRTY);
>>>  	err = f2fs_sync_fs(sbi->sb, 1);
>>>  	if (err) {
>>>  		update_fs_metadata(sbi, secs);
>>
>> Do we need to add clear_sbi_flag(, SBI_IS_DIRTY) to update_fs_metadata() so that
>> the path above is covered as well?
>>
>> Thanks,
>>
>>>
> 


* Re: [PATCH 1/2] f2fs: Fix mount failure due to SPO after a successful online resize FS
  2020-03-03 12:06       ` [f2fs-dev] " Chao Yu
@ 2020-03-03 14:06         ` Sahitya Tummala
  -1 siblings, 0 replies; 12+ messages in thread
From: Sahitya Tummala @ 2020-03-03 14:06 UTC (permalink / raw)
  To: Chao Yu; +Cc: Jaegeuk Kim, linux-f2fs-devel, linux-kernel

Hi Chao,

On Tue, Mar 03, 2020 at 08:06:21PM +0800, Chao Yu wrote:
> Hi Sahitya,
> 
> On 2020/3/2 12:39, Sahitya Tummala wrote:
> > Hi Chao,
> > 
> > On Fri, Feb 28, 2020 at 04:35:37PM +0800, Chao Yu wrote:
> >> Hi Sahitya,
> >>
> >> Good catch.
> >>
> >> On 2020/2/27 18:39, Sahitya Tummala wrote:
> >>> Even when an online resize completes successfully, a sudden power-off
> >>> (SPO) immediately after the resize still causes the error below on the
> >>> next mount.
> >>>
> >>> [   11.294650] F2FS-fs (sda8): Wrong user_block_count: 2233856
> >>> [   11.300272] F2FS-fs (sda8): Failed to get valid F2FS checkpoint
> >>>
> >>> This is because, after the FS metadata is updated in
> >>> update_fs_metadata(), no checkpoint (CP) is written to reflect the new
> >>> user_block_count if the SBI_IS_DIRTY flag is not set.
> >>>
> >>> Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
> >>> ---
> >>>  fs/f2fs/gc.c | 1 +
> >>>  1 file changed, 1 insertion(+)
> >>>
> >>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> >>> index a92fa49..a14a75f 100644
> >>> --- a/fs/f2fs/gc.c
> >>> +++ b/fs/f2fs/gc.c
> >>> @@ -1577,6 +1577,7 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
> >>>  
> >>>  	update_fs_metadata(sbi, -secs);
> >>>  	clear_sbi_flag(sbi, SBI_IS_RESIZEFS);
> >>
> >> Do we need a barrier here to preserve the ordering between the code above and
> >> set_sbi_flag(sbi, SBI_IS_DIRTY)?
> > 
> > I don't think a barrier will help here. Suppose another context is already
> > doing a CP: it races with update_fs_metadata(), so it may or may not see the
> > resize updates, and it will also clear the SBI_IS_DIRTY flag set by resize
> > (even with a barrier).
> 
> Agreed. Actually, we didn't consider the race condition between CP and
> update_fs_metadata(); it should be fixed.
> 
> > 
> > I think we need to synchronize this with the CP context so that the resize
> > changes are reflected properly. Please see the new diff below and help review it.
> > 
> > diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> > index a14a75f..5554af8 100644
> > --- a/fs/f2fs/gc.c
> > +++ b/fs/f2fs/gc.c
> > @@ -1467,6 +1467,7 @@ static void update_fs_metadata(struct f2fs_sb_info *sbi, int secs)
> >         long long user_block_count =
> >                                 le64_to_cpu(F2FS_CKPT(sbi)->user_block_count);
> > 
> > +       clear_sbi_flag(sbi, SBI_IS_DIRTY);
> 
> Why clear the dirty flag here?

Yes, it is not required. I will remove it.

> 
> And why not use cp_mutex to protect update_fs_metadata() in the error path of
> f2fs_sync_fs() below?

Yes, will add a lock there too.

Thanks,

> 
> >         SM_I(sbi)->segment_count = (int)SM_I(sbi)->segment_count + segs;
> >         MAIN_SEGS(sbi) = (int)MAIN_SEGS(sbi) + segs;
> >         FREE_I(sbi)->free_sections = (int)FREE_I(sbi)->free_sections + secs;
> > @@ -1575,9 +1576,12 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
> >                 goto out;
> >         }
> > 
> > +       mutex_lock(&sbi->cp_mutex);
> >         update_fs_metadata(sbi, -secs);
> >         clear_sbi_flag(sbi, SBI_IS_RESIZEFS);
> >         set_sbi_flag(sbi, SBI_IS_DIRTY);
> > +       mutex_unlock(&sbi->cp_mutex);
> > +
> >         err = f2fs_sync_fs(sbi->sb, 1);
> >         if (err) {
> >                 update_fs_metadata(sbi, secs);
> 
> 		  ^^^^^^^^^^^^^^
> 
> In addition, I found that we forgot to take sb_lock to protect updates to the
> f2fs_super_block fields; I will submit a patch for that.
> 
> Thanks,
> 
> > 
> > thanks,
> > 
> >>
> >>> +	set_sbi_flag(sbi, SBI_IS_DIRTY);
> >>>  	err = f2fs_sync_fs(sbi->sb, 1);
> >>>  	if (err) {
> >>>  		update_fs_metadata(sbi, secs);
> >>
> >> Do we need to add clear_sbi_flag(, SBI_IS_DIRTY) to update_fs_metadata() so that
> >> the path above is covered as well?
> >>
> >> Thanks,
> >>
> >>>
> > 

-- 
--
Sent by a consultant of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.
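
The direction agreed in this exchange (no dirty-flag clearing inside
update_fs_metadata(), and the same lock around both the forward update and
the rollback) might look roughly like the userspace model below. It is a
sketch only and not the follow-up patch: struct resize_model and the
model_* functions are hypothetical, a pthread mutex stands in for
sbi->cp_mutex, and the model assumes that the checkpoint writer takes
that same mutex.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct resize_model {
	pthread_mutex_t cp_mutex; /* stands in for sbi->cp_mutex */
	bool dirty;               /* stands in for SBI_IS_DIRTY */
	int64_t free_sections;    /* in-memory metadata */
	int64_t cp_free_sections; /* value held by the last checkpoint */
};

/* Metadata update only; it no longer touches the dirty flag. */
static void model_update_fs_metadata(struct resize_model *m, int64_t secs)
{
	m->free_sections += secs;
}

/* Checkpoint writer, assumed to serialise on the same mutex. */
static int model_checkpoint(struct resize_model *m, bool fail)
{
	int err = 0;

	pthread_mutex_lock(&m->cp_mutex);
	if (fail) {
		err = -5; /* simulated I/O error */
	} else if (m->dirty) {
		m->cp_free_sections = m->free_sections;
		m->dirty = false;
	}
	pthread_mutex_unlock(&m->cp_mutex);
	return err;
}

/* Tail of a shrink by secs sections, with a locked rollback on error. */
static int model_resize_tail(struct resize_model *m, int64_t secs,
			     bool fail_sync)
{
	int err;

	pthread_mutex_lock(&m->cp_mutex);
	model_update_fs_metadata(m, -secs);
	m->dirty = true;
	pthread_mutex_unlock(&m->cp_mutex);

	err = model_checkpoint(m, fail_sync);
	if (err) {
		pthread_mutex_lock(&m->cp_mutex);
		model_update_fs_metadata(m, secs); /* undo under the lock */
		pthread_mutex_unlock(&m->cp_mutex);
	}
	return err;
}
```

Because the update and the dirty marking happen inside one critical
section, a concurrent checkpoint in this model sees either none of the
resize changes or all of them together with the dirty flag.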

