linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] f2fs: fix quota info to adjust recovered data
@ 2018-09-11 20:15 Jaegeuk Kim
  2018-09-11 23:51 ` [f2fs-dev] " Chao Yu
  0 siblings, 1 reply; 38+ messages in thread
From: Jaegeuk Kim @ 2018-09-11 20:15 UTC (permalink / raw)
  To: linux-kernel, linux-f2fs-devel; +Cc: Jaegeuk Kim

fsck.f2fs is able to recover the quota structure, since roll-forward recovery
can recover it based on previous user information.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 fs/f2fs/recovery.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
index 95511ed11a22..1fde86a2107e 100644
--- a/fs/f2fs/recovery.c
+++ b/fs/f2fs/recovery.c
@@ -675,6 +675,9 @@ int f2fs_recover_fsync_data(struct f2fs_sb_info *sbi, bool check_only)
 
 	need_writecp = true;
 
+	/* quota is not fully updated due to the lack of user information. */
+	set_sbi_flag(sbi, SBI_NEED_FSCK);
+
 	/* step #2: recover data */
 	err = recover_data(sbi, &inode_list, &dir_list);
 	if (!err)
-- 
2.17.0.441.gb46fe60e1d-goog


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-11 20:15 [PATCH] f2fs: fix quota info to adjust recovered data Jaegeuk Kim
@ 2018-09-11 23:51 ` Chao Yu
  2018-09-12  0:06   ` Jaegeuk Kim
  0 siblings, 1 reply; 38+ messages in thread
From: Chao Yu @ 2018-09-11 23:51 UTC (permalink / raw)
  To: Jaegeuk Kim, linux-kernel, linux-f2fs-devel

On 2018/9/12 4:15, Jaegeuk Kim wrote:
> fsck.f2fs is able to recover the quota structure, since roll-forward recovery
> can recover it based on previous user information.

I didn't get it, both fsck and kernel recover quota file based all inodes'
uid/gid/prjid, if {x}id didn't change, wouldn't those two recovery result be the
same?

Thanks,

> 
> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> ---
>  fs/f2fs/recovery.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> index 95511ed11a22..1fde86a2107e 100644
> --- a/fs/f2fs/recovery.c
> +++ b/fs/f2fs/recovery.c
> @@ -675,6 +675,9 @@ int f2fs_recover_fsync_data(struct f2fs_sb_info *sbi, bool check_only)
>  
>  	need_writecp = true;
>  
> +	/* quota is not fully updated due to the lack of user information. */
> +	set_sbi_flag(sbi, SBI_NEED_FSCK);
> +
>  	/* step #2: recover data */
>  	err = recover_data(sbi, &inode_list, &dir_list);
>  	if (!err)
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-11 23:51 ` [f2fs-dev] " Chao Yu
@ 2018-09-12  0:06   ` Jaegeuk Kim
  2018-09-12  0:27     ` Jaegeuk Kim
  0 siblings, 1 reply; 38+ messages in thread
From: Jaegeuk Kim @ 2018-09-12  0:06 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-kernel, linux-f2fs-devel

On 09/12, Chao Yu wrote:
> On 2018/9/12 4:15, Jaegeuk Kim wrote:
> > fsck.f2fs is able to recover the quota structure, since roll-forward recovery
> > can recover it based on previous user information.
> 
> I didn't get it, both fsck and kernel recover quota file based all inodes'
> uid/gid/prjid, if {x}id didn't change, wouldn't those two recovery result be the
> same?

I thought that, but had to add this, since I was encountering quota errors right
after getting some files recovered. And, I thought it'd make it more safe to do
fsck after roll-forward recovery.

Anyway, let me test again without this patch for a while.

Thanks,


> 
> Thanks,
> 
> > 
> > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> > ---
> >  fs/f2fs/recovery.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> > index 95511ed11a22..1fde86a2107e 100644
> > --- a/fs/f2fs/recovery.c
> > +++ b/fs/f2fs/recovery.c
> > @@ -675,6 +675,9 @@ int f2fs_recover_fsync_data(struct f2fs_sb_info *sbi, bool check_only)
> >  
> >  	need_writecp = true;
> >  
> > +	/* quota is not fully updated due to the lack of user information. */
> > +	set_sbi_flag(sbi, SBI_NEED_FSCK);
> > +
> >  	/* step #2: recover data */
> >  	err = recover_data(sbi, &inode_list, &dir_list);
> >  	if (!err)
> > 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-12  0:06   ` Jaegeuk Kim
@ 2018-09-12  0:27     ` Jaegeuk Kim
  2018-09-12  1:13       ` Chao Yu
  0 siblings, 1 reply; 38+ messages in thread
From: Jaegeuk Kim @ 2018-09-12  0:27 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-kernel, linux-f2fs-devel

On 09/11, Jaegeuk Kim wrote:
> On 09/12, Chao Yu wrote:
> > On 2018/9/12 4:15, Jaegeuk Kim wrote:
> > > fsck.f2fs is able to recover the quota structure, since roll-forward recovery
> > > can recover it based on previous user information.
> > 
> > I didn't get it, both fsck and kernel recover quota file based all inodes'
> > uid/gid/prjid, if {x}id didn't change, wouldn't those two recovery result be the
> > same?
> 
> I thought that, but had to add this, since I was encountering quota errors right
> after getting some files recovered. And, I thought it'd make it more safe to do
> fsck after roll-forward recovery.
> 
> Anyway, let me test again without this patch for a while.

Hmm, I just got a fsck failure right after some files recovered.

> 
> Thanks,
> 
> 
> > 
> > Thanks,
> > 
> > > 
> > > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> > > ---
> > >  fs/f2fs/recovery.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > > 
> > > diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> > > index 95511ed11a22..1fde86a2107e 100644
> > > --- a/fs/f2fs/recovery.c
> > > +++ b/fs/f2fs/recovery.c
> > > @@ -675,6 +675,9 @@ int f2fs_recover_fsync_data(struct f2fs_sb_info *sbi, bool check_only)
> > >  
> > >  	need_writecp = true;
> > >  
> > > +	/* quota is not fully updated due to the lack of user information. */
> > > +	set_sbi_flag(sbi, SBI_NEED_FSCK);
> > > +
> > >  	/* step #2: recover data */
> > >  	err = recover_data(sbi, &inode_list, &dir_list);
> > >  	if (!err)
> > > 
> 
> 
> _______________________________________________
> Linux-f2fs-devel mailing list
> Linux-f2fs-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-12  0:27     ` Jaegeuk Kim
@ 2018-09-12  1:13       ` Chao Yu
  2018-09-12  1:25         ` Jaegeuk Kim
  0 siblings, 1 reply; 38+ messages in thread
From: Chao Yu @ 2018-09-12  1:13 UTC (permalink / raw)
  To: Jaegeuk Kim, Chao Yu; +Cc: linux-kernel, linux-f2fs-devel

On 2018/9/12 8:27, Jaegeuk Kim wrote:
> On 09/11, Jaegeuk Kim wrote:
>> On 09/12, Chao Yu wrote:
>>> On 2018/9/12 4:15, Jaegeuk Kim wrote:
>>>> fsck.f2fs is able to recover the quota structure, since roll-forward recovery
>>>> can recover it based on previous user information.
>>>
>>> I didn't get it, both fsck and kernel recover quota file based all inodes'
>>> uid/gid/prjid, if {x}id didn't change, wouldn't those two recovery result be the
>>> same?
>>
>> I thought that, but had to add this, since I was encountering quota errors right
>> after getting some files recovered. And, I thought it'd make it more safe to do
>> fsck after roll-forward recovery.
>>
>> Anyway, let me test again without this patch for a while.
> 
> Hmm, I just got a fsck failure right after some files recovered.

To make sure, do you test with "f2fs: guarantee journalled quota data by
checkpoint"? if not, I think there is no guarantee that f2fs can recover
quote info into correct quote file, because, in last checkpoint, quota file
may was corrupted/inconsistent. Right?

Thanks,

> 
>>
>> Thanks,
>>
>>
>>>
>>> Thanks,
>>>
>>>>
>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>>> ---
>>>>  fs/f2fs/recovery.c | 3 +++
>>>>  1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
>>>> index 95511ed11a22..1fde86a2107e 100644
>>>> --- a/fs/f2fs/recovery.c
>>>> +++ b/fs/f2fs/recovery.c
>>>> @@ -675,6 +675,9 @@ int f2fs_recover_fsync_data(struct f2fs_sb_info *sbi, bool check_only)
>>>>  
>>>>  	need_writecp = true;
>>>>  
>>>> +	/* quota is not fully updated due to the lack of user information. */
>>>> +	set_sbi_flag(sbi, SBI_NEED_FSCK);
>>>> +
>>>>  	/* step #2: recover data */
>>>>  	err = recover_data(sbi, &inode_list, &dir_list);
>>>>  	if (!err)
>>>>
>>
>>
>> _______________________________________________
>> Linux-f2fs-devel mailing list
>> Linux-f2fs-devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> 
> .
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-12  1:13       ` Chao Yu
@ 2018-09-12  1:25         ` Jaegeuk Kim
  2018-09-12  1:40           ` Chao Yu
  0 siblings, 1 reply; 38+ messages in thread
From: Jaegeuk Kim @ 2018-09-12  1:25 UTC (permalink / raw)
  To: Chao Yu; +Cc: Chao Yu, linux-kernel, linux-f2fs-devel

On 09/12, Chao Yu wrote:
> On 2018/9/12 8:27, Jaegeuk Kim wrote:
> > On 09/11, Jaegeuk Kim wrote:
> >> On 09/12, Chao Yu wrote:
> >>> On 2018/9/12 4:15, Jaegeuk Kim wrote:
> >>>> fsck.f2fs is able to recover the quota structure, since roll-forward recovery
> >>>> can recover it based on previous user information.
> >>>
> >>> I didn't get it, both fsck and kernel recover quota file based all inodes'
> >>> uid/gid/prjid, if {x}id didn't change, wouldn't those two recovery result be the
> >>> same?
> >>
> >> I thought that, but had to add this, since I was encountering quota errors right
> >> after getting some files recovered. And, I thought it'd make it more safe to do
> >> fsck after roll-forward recovery.
> >>
> >> Anyway, let me test again without this patch for a while.
> > 
> > Hmm, I just got a fsck failure right after some files recovered.
> 
> To make sure, do you test with "f2fs: guarantee journalled quota data by
> checkpoint"? if not, I think there is no guarantee that f2fs can recover
> quote info into correct quote file, because, in last checkpoint, quota file
> may was corrupted/inconsistent. Right?

I hit the failure with v8. And, the test scenario is 1) enable fault injection
2) run fsstress, 3) call shutdowon, 4) kill fsstress, 5) unmount, 6) fsck, 7)
mount, 8) fsck, 9) go 1).

So, I'm hitting failure in 8) fsck. I expect 6) fsck should fix any corruption
and 7) recovered some files on clean checkpoint.

Thanks,

> 
> Thanks,
> 
> > 
> >>
> >> Thanks,
> >>
> >>
> >>>
> >>> Thanks,
> >>>
> >>>>
> >>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> >>>> ---
> >>>>  fs/f2fs/recovery.c | 3 +++
> >>>>  1 file changed, 3 insertions(+)
> >>>>
> >>>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> >>>> index 95511ed11a22..1fde86a2107e 100644
> >>>> --- a/fs/f2fs/recovery.c
> >>>> +++ b/fs/f2fs/recovery.c
> >>>> @@ -675,6 +675,9 @@ int f2fs_recover_fsync_data(struct f2fs_sb_info *sbi, bool check_only)
> >>>>  
> >>>>  	need_writecp = true;
> >>>>  
> >>>> +	/* quota is not fully updated due to the lack of user information. */
> >>>> +	set_sbi_flag(sbi, SBI_NEED_FSCK);
> >>>> +
> >>>>  	/* step #2: recover data */
> >>>>  	err = recover_data(sbi, &inode_list, &dir_list);
> >>>>  	if (!err)
> >>>>
> >>
> >>
> >> _______________________________________________
> >> Linux-f2fs-devel mailing list
> >> Linux-f2fs-devel@lists.sourceforge.net
> >> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> > 
> > .
> > 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-12  1:25         ` Jaegeuk Kim
@ 2018-09-12  1:40           ` Chao Yu
  2018-09-12  1:46             ` Chao Yu
  2018-09-12  2:46             ` Jaegeuk Kim
  0 siblings, 2 replies; 38+ messages in thread
From: Chao Yu @ 2018-09-12  1:40 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: Chao Yu, linux-kernel, linux-f2fs-devel

On 2018/9/12 9:25, Jaegeuk Kim wrote:
> On 09/12, Chao Yu wrote:
>> On 2018/9/12 8:27, Jaegeuk Kim wrote:
>>> On 09/11, Jaegeuk Kim wrote:
>>>> On 09/12, Chao Yu wrote:
>>>>> On 2018/9/12 4:15, Jaegeuk Kim wrote:
>>>>>> fsck.f2fs is able to recover the quota structure, since roll-forward recovery
>>>>>> can recover it based on previous user information.
>>>>>
>>>>> I didn't get it, both fsck and kernel recover quota file based all inodes'
>>>>> uid/gid/prjid, if {x}id didn't change, wouldn't those two recovery result be the
>>>>> same?
>>>>
>>>> I thought that, but had to add this, since I was encountering quota errors right
>>>> after getting some files recovered. And, I thought it'd make it more safe to do
>>>> fsck after roll-forward recovery.
>>>>
>>>> Anyway, let me test again without this patch for a while.
>>>
>>> Hmm, I just got a fsck failure right after some files recovered.
>>
>> To make sure, do you test with "f2fs: guarantee journalled quota data by
>> checkpoint"? if not, I think there is no guarantee that f2fs can recover
>> quote info into correct quote file, because, in last checkpoint, quota file
>> may was corrupted/inconsistent. Right?

Oh, I forget to mention that, I add a patch to fsck to let it noticing
CP_QUOTA_NEED_FSCK_FLAG flag, and by default, fsck will fix corrupted quote
file if the flag is set, but w/o this flag, quota file is still corrupted
detected by fsck, I guess there is bug in v8.

Can you add that in fsck too? so we can separate real kernel bug and quota
file corruption due to dquot subsystem error caused like below case:

+static int f2fs_dquot_acquire(struct dquot *dquot)
+{
+	int ret;
+
+	ret = dquot_acquire(dquot);
+	if (ret == -ENOSPC || ret == -EIO)
+		set_sbi_flag(F2FS_SB(dquot->dq_sb), SBI_QUOTA_NEED_REPAIR);
+	return ret;
+}

> 
> I hit the failure with v8. And, the test scenario is 1) enable fault injection
> 2) run fsstress, 3) call shutdowon, 4) kill fsstress, 5) unmount, 6) fsck, 7)
> mount, 8) fsck, 9) go 1).
> 
> So, I'm hitting failure in 8) fsck. I expect 6) fsck should fix any corruption
> and 7) recovered some files on clean checkpoint.

I see, I can add this case too, does this exist in your xfstest tree in github?

Thanks,

> 
> Thanks,
> 
>>
>> Thanks,
>>
>>>
>>>>
>>>> Thanks,
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>>>
>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>>>>> ---
>>>>>>  fs/f2fs/recovery.c | 3 +++
>>>>>>  1 file changed, 3 insertions(+)
>>>>>>
>>>>>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
>>>>>> index 95511ed11a22..1fde86a2107e 100644
>>>>>> --- a/fs/f2fs/recovery.c
>>>>>> +++ b/fs/f2fs/recovery.c
>>>>>> @@ -675,6 +675,9 @@ int f2fs_recover_fsync_data(struct f2fs_sb_info *sbi, bool check_only)
>>>>>>  
>>>>>>  	need_writecp = true;
>>>>>>  
>>>>>> +	/* quota is not fully updated due to the lack of user information. */
>>>>>> +	set_sbi_flag(sbi, SBI_NEED_FSCK);
>>>>>> +
>>>>>>  	/* step #2: recover data */
>>>>>>  	err = recover_data(sbi, &inode_list, &dir_list);
>>>>>>  	if (!err)
>>>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Linux-f2fs-devel mailing list
>>>> Linux-f2fs-devel@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>>
>>> .
>>>
> 
> .
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-12  1:40           ` Chao Yu
@ 2018-09-12  1:46             ` Chao Yu
  2018-09-12 19:54               ` Jaegeuk Kim
  2018-09-12  2:46             ` Jaegeuk Kim
  1 sibling, 1 reply; 38+ messages in thread
From: Chao Yu @ 2018-09-12  1:46 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: Chao Yu, linux-kernel, linux-f2fs-devel

On 2018/9/12 9:40, Chao Yu wrote:
> On 2018/9/12 9:25, Jaegeuk Kim wrote:
>> On 09/12, Chao Yu wrote:
>>> On 2018/9/12 8:27, Jaegeuk Kim wrote:
>>>> On 09/11, Jaegeuk Kim wrote:
>>>>> On 09/12, Chao Yu wrote:
>>>>>> On 2018/9/12 4:15, Jaegeuk Kim wrote:
>>>>>>> fsck.f2fs is able to recover the quota structure, since roll-forward recovery
>>>>>>> can recover it based on previous user information.
>>>>>>
>>>>>> I didn't get it, both fsck and kernel recover quota file based all inodes'
>>>>>> uid/gid/prjid, if {x}id didn't change, wouldn't those two recovery result be the
>>>>>> same?
>>>>>
>>>>> I thought that, but had to add this, since I was encountering quota errors right
>>>>> after getting some files recovered. And, I thought it'd make it more safe to do
>>>>> fsck after roll-forward recovery.
>>>>>
>>>>> Anyway, let me test again without this patch for a while.
>>>>
>>>> Hmm, I just got a fsck failure right after some files recovered.
>>>
>>> To make sure, do you test with "f2fs: guarantee journalled quota data by
>>> checkpoint"? if not, I think there is no guarantee that f2fs can recover
>>> quote info into correct quote file, because, in last checkpoint, quota file
>>> may was corrupted/inconsistent. Right?
> 
> Oh, I forget to mention that, I add a patch to fsck to let it noticing
> CP_QUOTA_NEED_FSCK_FLAG flag, and by default, fsck will fix corrupted quote
> file if the flag is set, but w/o this flag, quota file is still corrupted
> detected by fsck, I guess there is bug in v8.

In v8, there are two cases we didn't guarantee quota file's consistence:
1. flush time in block_operation exceed a threshold.
2. dquot subsystem error occurs.

For above case, fsck should repair the quota file by default.

> 
> Can you add that in fsck too? so we can separate real kernel bug and quota
> file corruption due to dquot subsystem error caused like below case:
> 
> +static int f2fs_dquot_acquire(struct dquot *dquot)
> +{
> +	int ret;
> +
> +	ret = dquot_acquire(dquot);
> +	if (ret == -ENOSPC || ret == -EIO)
> +		set_sbi_flag(F2FS_SB(dquot->dq_sb), SBI_QUOTA_NEED_REPAIR);
> +	return ret;
> +}
> 
>>
>> I hit the failure with v8. And, the test scenario is 1) enable fault injection
>> 2) run fsstress, 3) call shutdowon, 4) kill fsstress, 5) unmount, 6) fsck, 7)
>> mount, 8) fsck, 9) go 1).

8) fsck is do fscking in a mounted image?

Thanks,

>>
>> So, I'm hitting failure in 8) fsck. I expect 6) fsck should fix any corruption
>> and 7) recovered some files on clean checkpoint.
> 
> I see, I can add this case too, does this exist in your xfstest tree in github?
> 
> Thanks,
> 
>>
>> Thanks,
>>
>>>
>>> Thanks,
>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>>
>>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>>>>>> ---
>>>>>>>  fs/f2fs/recovery.c | 3 +++
>>>>>>>  1 file changed, 3 insertions(+)
>>>>>>>
>>>>>>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
>>>>>>> index 95511ed11a22..1fde86a2107e 100644
>>>>>>> --- a/fs/f2fs/recovery.c
>>>>>>> +++ b/fs/f2fs/recovery.c
>>>>>>> @@ -675,6 +675,9 @@ int f2fs_recover_fsync_data(struct f2fs_sb_info *sbi, bool check_only)
>>>>>>>  
>>>>>>>  	need_writecp = true;
>>>>>>>  
>>>>>>> +	/* quota is not fully updated due to the lack of user information. */
>>>>>>> +	set_sbi_flag(sbi, SBI_NEED_FSCK);
>>>>>>> +
>>>>>>>  	/* step #2: recover data */
>>>>>>>  	err = recover_data(sbi, &inode_list, &dir_list);
>>>>>>>  	if (!err)
>>>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Linux-f2fs-devel mailing list
>>>>> Linux-f2fs-devel@lists.sourceforge.net
>>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>>>
>>>> .
>>>>
>>
>> .
>>
> 
> 
> .
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-12  1:40           ` Chao Yu
  2018-09-12  1:46             ` Chao Yu
@ 2018-09-12  2:46             ` Jaegeuk Kim
  2018-09-12  2:57               ` Chao Yu
  1 sibling, 1 reply; 38+ messages in thread
From: Jaegeuk Kim @ 2018-09-12  2:46 UTC (permalink / raw)
  To: Chao Yu; +Cc: Chao Yu, linux-kernel, linux-f2fs-devel

On 09/12, Chao Yu wrote:
> On 2018/9/12 9:25, Jaegeuk Kim wrote:
> > On 09/12, Chao Yu wrote:
> >> On 2018/9/12 8:27, Jaegeuk Kim wrote:
> >>> On 09/11, Jaegeuk Kim wrote:
> >>>> On 09/12, Chao Yu wrote:
> >>>>> On 2018/9/12 4:15, Jaegeuk Kim wrote:
> >>>>>> fsck.f2fs is able to recover the quota structure, since roll-forward recovery
> >>>>>> can recover it based on previous user information.
> >>>>>
> >>>>> I didn't get it, both fsck and kernel recover quota file based all inodes'
> >>>>> uid/gid/prjid, if {x}id didn't change, wouldn't those two recovery result be the
> >>>>> same?
> >>>>
> >>>> I thought that, but had to add this, since I was encountering quota errors right
> >>>> after getting some files recovered. And, I thought it'd make it more safe to do
> >>>> fsck after roll-forward recovery.
> >>>>
> >>>> Anyway, let me test again without this patch for a while.
> >>>
> >>> Hmm, I just got a fsck failure right after some files recovered.
> >>
> >> To make sure, do you test with "f2fs: guarantee journalled quota data by
> >> checkpoint"? if not, I think there is no guarantee that f2fs can recover
> >> quote info into correct quote file, because, in last checkpoint, quota file
> >> may was corrupted/inconsistent. Right?
> 
> Oh, I forget to mention that, I add a patch to fsck to let it noticing
> CP_QUOTA_NEED_FSCK_FLAG flag, and by default, fsck will fix corrupted quote
> file if the flag is set, but w/o this flag, quota file is still corrupted
> detected by fsck, I guess there is bug in v8.
> 
> Can you add that in fsck too? so we can separate real kernel bug and quota
> file corruption due to dquot subsystem error caused like below case:

I'm testing to trigger fsck when it sees the below flag. But, when considering
old f2fs-tools, we may need a way to detect mkfs version in superblock in order
to determine whether we can rely on this new flag or not.

> 
> +static int f2fs_dquot_acquire(struct dquot *dquot)
> +{
> +	int ret;
> +
> +	ret = dquot_acquire(dquot);
> +	if (ret == -ENOSPC || ret == -EIO)
> +		set_sbi_flag(F2FS_SB(dquot->dq_sb), SBI_QUOTA_NEED_REPAIR);
> +	return ret;
> +}
> 
> > 
> > I hit the failure with v8. And, the test scenario is 1) enable fault injection
> > 2) run fsstress, 3) call shutdowon, 4) kill fsstress, 5) unmount, 6) fsck, 7)
> > mount, 8) fsck, 9) go 1).
> > 
> > So, I'm hitting failure in 8) fsck. I expect 6) fsck should fix any corruption
> > and 7) recovered some files on clean checkpoint.
> 
> I see, I can add this case too, does this exist in your xfstest tree in github?

I think so.

> 
> Thanks,
> 
> > 
> > Thanks,
> > 
> >>
> >> Thanks,
> >>
> >>>
> >>>>
> >>>> Thanks,
> >>>>
> >>>>
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>>>
> >>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> >>>>>> ---
> >>>>>>  fs/f2fs/recovery.c | 3 +++
> >>>>>>  1 file changed, 3 insertions(+)
> >>>>>>
> >>>>>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> >>>>>> index 95511ed11a22..1fde86a2107e 100644
> >>>>>> --- a/fs/f2fs/recovery.c
> >>>>>> +++ b/fs/f2fs/recovery.c
> >>>>>> @@ -675,6 +675,9 @@ int f2fs_recover_fsync_data(struct f2fs_sb_info *sbi, bool check_only)
> >>>>>>  
> >>>>>>  	need_writecp = true;
> >>>>>>  
> >>>>>> +	/* quota is not fully updated due to the lack of user information. */
> >>>>>> +	set_sbi_flag(sbi, SBI_NEED_FSCK);
> >>>>>> +
> >>>>>>  	/* step #2: recover data */
> >>>>>>  	err = recover_data(sbi, &inode_list, &dir_list);
> >>>>>>  	if (!err)
> >>>>>>
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> Linux-f2fs-devel mailing list
> >>>> Linux-f2fs-devel@lists.sourceforge.net
> >>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> >>>
> >>> .
> >>>
> > 
> > .
> > 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-12  2:46             ` Jaegeuk Kim
@ 2018-09-12  2:57               ` Chao Yu
  2018-09-12 19:50                 ` Jaegeuk Kim
  0 siblings, 1 reply; 38+ messages in thread
From: Chao Yu @ 2018-09-12  2:57 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: Chao Yu, linux-kernel, linux-f2fs-devel

On 2018/9/12 10:46, Jaegeuk Kim wrote:
> On 09/12, Chao Yu wrote:
>> On 2018/9/12 9:25, Jaegeuk Kim wrote:
>>> On 09/12, Chao Yu wrote:
>>>> On 2018/9/12 8:27, Jaegeuk Kim wrote:
>>>>> On 09/11, Jaegeuk Kim wrote:
>>>>>> On 09/12, Chao Yu wrote:
>>>>>>> On 2018/9/12 4:15, Jaegeuk Kim wrote:
>>>>>>>> fsck.f2fs is able to recover the quota structure, since roll-forward recovery
>>>>>>>> can recover it based on previous user information.
>>>>>>>
>>>>>>> I didn't get it, both fsck and kernel recover quota file based all inodes'
>>>>>>> uid/gid/prjid, if {x}id didn't change, wouldn't those two recovery result be the
>>>>>>> same?
>>>>>>
>>>>>> I thought that, but had to add this, since I was encountering quota errors right
>>>>>> after getting some files recovered. And, I thought it'd make it more safe to do
>>>>>> fsck after roll-forward recovery.
>>>>>>
>>>>>> Anyway, let me test again without this patch for a while.
>>>>>
>>>>> Hmm, I just got a fsck failure right after some files recovered.
>>>>
>>>> To make sure, do you test with "f2fs: guarantee journalled quota data by
>>>> checkpoint"? if not, I think there is no guarantee that f2fs can recover
>>>> quote info into correct quote file, because, in last checkpoint, quota file
>>>> may was corrupted/inconsistent. Right?
>>
>> Oh, I forget to mention that, I add a patch to fsck to let it noticing
>> CP_QUOTA_NEED_FSCK_FLAG flag, and by default, fsck will fix corrupted quote
>> file if the flag is set, but w/o this flag, quota file is still corrupted
>> detected by fsck, I guess there is bug in v8.
>>
>> Can you add that in fsck too? so we can separate real kernel bug and quota
>> file corruption due to dquot subsystem error caused like below case:
> 
> I'm testing to trigger fsck when it sees the below flag. But, when considering
> old f2fs-tools, we may need a way to detect mkfs version in superblock in order

Oh, that will make kernel be complicated... kernel should be aware of user
space things... if user use old tools and new kernel, how about just let
kernel give warning on dmesg to user to upgrade f2fs tool set.

And also, even if w/o CP_FSCK_FLAG, fsck can also detect such quote file
corruption, then do repair, right?

Thanks,

> to determine whether we can rely on this new flag or not.
> 
>>
>> +static int f2fs_dquot_acquire(struct dquot *dquot)
>> +{
>> +	int ret;
>> +
>> +	ret = dquot_acquire(dquot);
>> +	if (ret == -ENOSPC || ret == -EIO)
>> +		set_sbi_flag(F2FS_SB(dquot->dq_sb), SBI_QUOTA_NEED_REPAIR);
>> +	return ret;
>> +}
>>
>>>
>>> I hit the failure with v8. And, the test scenario is 1) enable fault injection
>>> 2) run fsstress, 3) call shutdowon, 4) kill fsstress, 5) unmount, 6) fsck, 7)
>>> mount, 8) fsck, 9) go 1).
>>>
>>> So, I'm hitting failure in 8) fsck. I expect 6) fsck should fix any corruption
>>> and 7) recovered some files on clean checkpoint.
>>
>> I see, I can add this case too, does this exist in your xfstest tree in github?
> 
> I think so.
> 
>>
>> Thanks,
>>
>>>
>>> Thanks,
>>>
>>>>
>>>> Thanks,
>>>>
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>>>
>>>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>>>>>>> ---
>>>>>>>>  fs/f2fs/recovery.c | 3 +++
>>>>>>>>  1 file changed, 3 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
>>>>>>>> index 95511ed11a22..1fde86a2107e 100644
>>>>>>>> --- a/fs/f2fs/recovery.c
>>>>>>>> +++ b/fs/f2fs/recovery.c
>>>>>>>> @@ -675,6 +675,9 @@ int f2fs_recover_fsync_data(struct f2fs_sb_info *sbi, bool check_only)
>>>>>>>>  
>>>>>>>>  	need_writecp = true;
>>>>>>>>  
>>>>>>>> +	/* quota is not fully updated due to the lack of user information. */
>>>>>>>> +	set_sbi_flag(sbi, SBI_NEED_FSCK);
>>>>>>>> +
>>>>>>>>  	/* step #2: recover data */
>>>>>>>>  	err = recover_data(sbi, &inode_list, &dir_list);
>>>>>>>>  	if (!err)
>>>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Linux-f2fs-devel mailing list
>>>>>> Linux-f2fs-devel@lists.sourceforge.net
>>>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>>>>
>>>>> .
>>>>>
>>>
>>> .
>>>
> 
> .
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-12  2:57               ` Chao Yu
@ 2018-09-12 19:50                 ` Jaegeuk Kim
  2018-09-12 23:30                   ` Chao Yu
  0 siblings, 1 reply; 38+ messages in thread
From: Jaegeuk Kim @ 2018-09-12 19:50 UTC (permalink / raw)
  To: Chao Yu; +Cc: Chao Yu, linux-kernel, linux-f2fs-devel

On 09/12, Chao Yu wrote:
> On 2018/9/12 10:46, Jaegeuk Kim wrote:
> > On 09/12, Chao Yu wrote:
> >> On 2018/9/12 9:25, Jaegeuk Kim wrote:
> >>> On 09/12, Chao Yu wrote:
> >>>> On 2018/9/12 8:27, Jaegeuk Kim wrote:
> >>>>> On 09/11, Jaegeuk Kim wrote:
> >>>>>> On 09/12, Chao Yu wrote:
> >>>>>>> On 2018/9/12 4:15, Jaegeuk Kim wrote:
> >>>>>>>> fsck.f2fs is able to recover the quota structure, since roll-forward recovery
> >>>>>>>> can recover it based on previous user information.
> >>>>>>>
> >>>>>>> I didn't get it, both fsck and kernel recover quota file based all inodes'
> >>>>>>> uid/gid/prjid, if {x}id didn't change, wouldn't those two recovery result be the
> >>>>>>> same?
> >>>>>>
> >>>>>> I thought that, but had to add this, since I was encountering quota errors right
> >>>>>> after getting some files recovered. And, I thought it'd make it more safe to do
> >>>>>> fsck after roll-forward recovery.
> >>>>>>
> >>>>>> Anyway, let me test again without this patch for a while.
> >>>>>
> >>>>> Hmm, I just got a fsck failure right after some files recovered.
> >>>>
> >>>> To make sure, do you test with "f2fs: guarantee journalled quota data by
> >>>> checkpoint"? if not, I think there is no guarantee that f2fs can recover
> >>>> quote info into correct quote file, because, in last checkpoint, quota file
> >>>> may was corrupted/inconsistent. Right?
> >>
> >> Oh, I forget to mention that, I add a patch to fsck to let it noticing
> >> CP_QUOTA_NEED_FSCK_FLAG flag, and by default, fsck will fix corrupted quote
> >> file if the flag is set, but w/o this flag, quota file is still corrupted
> >> detected by fsck, I guess there is bug in v8.
> >>
> >> Can you add that in fsck too? so we can separate real kernel bug and quota
> >> file corruption due to dquot subsystem error caused like below case:
> > 
> > I'm testing to trigger fsck when it sees the below flag. But, when considering
> > old f2fs-tools, we may need a way to detect mkfs version in superblock in order
> 
> Oh, that will make kernel be complicated... kernel should be aware of user
> space things... if user use old tools and new kernel, how about just let
> kernel give warning on dmesg to user to upgrade f2fs tool set.
> 
> And also, even if w/o CP_FSCK_FLAG, fsck can also detect such quote file
> corruption, then do repair, right?

It requires scanning the directory tree, which doesn't make sense.

> 
> Thanks,
> 
> > to determine whether we can rely on this new flag or not.
> > 
> >>
> >> +static int f2fs_dquot_acquire(struct dquot *dquot)
> >> +{
> >> +	int ret;
> >> +
> >> +	ret = dquot_acquire(dquot);
> >> +	if (ret == -ENOSPC || ret == -EIO)
> >> +		set_sbi_flag(F2FS_SB(dquot->dq_sb), SBI_QUOTA_NEED_REPAIR);
> >> +	return ret;
> >> +}
> >>
> >>>
> >>> I hit the failure with v8. And, the test scenario is 1) enable fault injection
> >>> 2) run fsstress, 3) call shutdowon, 4) kill fsstress, 5) unmount, 6) fsck, 7)
> >>> mount, 8) fsck, 9) go 1).
> >>>
> >>> So, I'm hitting failure in 8) fsck. I expect 6) fsck should fix any corruption
> >>> and 7) recovered some files on clean checkpoint.
> >>
> >> I see, I can add this case too, does this exist in your xfstest tree in github?
> > 
> > I think so.
> > 
> >>
> >> Thanks,
> >>
> >>>
> >>> Thanks,
> >>>
> >>>>
> >>>> Thanks,
> >>>>
> >>>>>
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>>
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>>
> >>>>>>>>
> >>>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> >>>>>>>> ---
> >>>>>>>>  fs/f2fs/recovery.c | 3 +++
> >>>>>>>>  1 file changed, 3 insertions(+)
> >>>>>>>>
> >>>>>>>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> >>>>>>>> index 95511ed11a22..1fde86a2107e 100644
> >>>>>>>> --- a/fs/f2fs/recovery.c
> >>>>>>>> +++ b/fs/f2fs/recovery.c
> >>>>>>>> @@ -675,6 +675,9 @@ int f2fs_recover_fsync_data(struct f2fs_sb_info *sbi, bool check_only)
> >>>>>>>>  
> >>>>>>>>  	need_writecp = true;
> >>>>>>>>  
> >>>>>>>> +	/* quota is not fully updated due to the lack of user information. */
> >>>>>>>> +	set_sbi_flag(sbi, SBI_NEED_FSCK);
> >>>>>>>> +
> >>>>>>>>  	/* step #2: recover data */
> >>>>>>>>  	err = recover_data(sbi, &inode_list, &dir_list);
> >>>>>>>>  	if (!err)
> >>>>>>>>
> >>>>>>
> >>>>>>
> >>>>>> _______________________________________________
> >>>>>> Linux-f2fs-devel mailing list
> >>>>>> Linux-f2fs-devel@lists.sourceforge.net
> >>>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> >>>>>
> >>>>> .
> >>>>>
> >>>
> >>> .
> >>>
> > 
> > .
> > 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-12  1:46             ` Chao Yu
@ 2018-09-12 19:54               ` Jaegeuk Kim
  2018-09-12 23:28                 ` Chao Yu
  0 siblings, 1 reply; 38+ messages in thread
From: Jaegeuk Kim @ 2018-09-12 19:54 UTC (permalink / raw)
  To: Chao Yu; +Cc: Chao Yu, linux-kernel, linux-f2fs-devel

On 09/12, Chao Yu wrote:
> On 2018/9/12 9:40, Chao Yu wrote:
> > On 2018/9/12 9:25, Jaegeuk Kim wrote:
> >> On 09/12, Chao Yu wrote:
> >>> On 2018/9/12 8:27, Jaegeuk Kim wrote:
> >>>> On 09/11, Jaegeuk Kim wrote:
> >>>>> On 09/12, Chao Yu wrote:
> >>>>>> On 2018/9/12 4:15, Jaegeuk Kim wrote:
> >>>>>>> fsck.f2fs is able to recover the quota structure, since roll-forward recovery
> >>>>>>> can recover it based on previous user information.
> >>>>>>
> >>>>>> I didn't get it, both fsck and kernel recover quota file based all inodes'
> >>>>>> uid/gid/prjid, if {x}id didn't change, wouldn't those two recovery result be the
> >>>>>> same?
> >>>>>
> >>>>> I thought that, but had to add this, since I was encountering quota errors right
> >>>>> after getting some files recovered. And, I thought it'd make it more safe to do
> >>>>> fsck after roll-forward recovery.
> >>>>>
> >>>>> Anyway, let me test again without this patch for a while.
> >>>>
> >>>> Hmm, I just got a fsck failure right after some files recovered.
> >>>
> >>> To make sure, do you test with "f2fs: guarantee journalled quota data by
> >>> checkpoint"? if not, I think there is no guarantee that f2fs can recover
> >>> quote info into correct quote file, because, in last checkpoint, quota file
> >>> may was corrupted/inconsistent. Right?
> > 
> > Oh, I forget to mention that, I add a patch to fsck to let it noticing
> > CP_QUOTA_NEED_FSCK_FLAG flag, and by default, fsck will fix corrupted quote
> > file if the flag is set, but w/o this flag, quota file is still corrupted
> > detected by fsck, I guess there is bug in v8.
> 
> In v8, there are two cases we didn't guarantee quota file's consistence:
> 1. flush time in block_operation exceed a threshold.
> 2. dquot subsystem error occurs.
> 
> For above case, fsck should repair the quota file by default.

Okay, I got another failure and it seems CP_QUOTA_NEED_FSCK_FLAG was not set
during the recovery. So, we have something missing in the recovery in terms
of quota updates.

> 
> > 
> > Can you add that in fsck too? so we can separate real kernel bug and quota
> > file corruption due to dquot subsystem error caused like below case:
> > 
> > +static int f2fs_dquot_acquire(struct dquot *dquot)
> > +{
> > +	int ret;
> > +
> > +	ret = dquot_acquire(dquot);
> > +	if (ret == -ENOSPC || ret == -EIO)
> > +		set_sbi_flag(F2FS_SB(dquot->dq_sb), SBI_QUOTA_NEED_REPAIR);
> > +	return ret;
> > +}
> > 
> >>
> >> I hit the failure with v8. And, the test scenario is 1) enable fault injection
> >> 2) run fsstress, 3) call shutdowon, 4) kill fsstress, 5) unmount, 6) fsck, 7)
> >> mount, 8) fsck, 9) go 1).
> 
> 8) fsck is do fscking in a mounted image?

Missing unmount before 8). :P

> 
> Thanks,
> 
> >>
> >> So, I'm hitting failure in 8) fsck. I expect 6) fsck should fix any corruption
> >> and 7) recovered some files on clean checkpoint.
> > 
> > I see, I can add this case too, does this exist in your xfstest tree in github?
> > 
> > Thanks,
> > 
> >>
> >> Thanks,
> >>
> >>>
> >>> Thanks,
> >>>
> >>>>
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>>
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>>>
> >>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> >>>>>>> ---
> >>>>>>>  fs/f2fs/recovery.c | 3 +++
> >>>>>>>  1 file changed, 3 insertions(+)
> >>>>>>>
> >>>>>>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> >>>>>>> index 95511ed11a22..1fde86a2107e 100644
> >>>>>>> --- a/fs/f2fs/recovery.c
> >>>>>>> +++ b/fs/f2fs/recovery.c
> >>>>>>> @@ -675,6 +675,9 @@ int f2fs_recover_fsync_data(struct f2fs_sb_info *sbi, bool check_only)
> >>>>>>>  
> >>>>>>>  	need_writecp = true;
> >>>>>>>  
> >>>>>>> +	/* quota is not fully updated due to the lack of user information. */
> >>>>>>> +	set_sbi_flag(sbi, SBI_NEED_FSCK);
> >>>>>>> +
> >>>>>>>  	/* step #2: recover data */
> >>>>>>>  	err = recover_data(sbi, &inode_list, &dir_list);
> >>>>>>>  	if (!err)
> >>>>>>>
> >>>>>
> >>>>>
> >>>>> _______________________________________________
> >>>>> Linux-f2fs-devel mailing list
> >>>>> Linux-f2fs-devel@lists.sourceforge.net
> >>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> >>>>
> >>>> .
> >>>>
> >>
> >> .
> >>
> > 
> > 
> > .
> > 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-12 19:54               ` Jaegeuk Kim
@ 2018-09-12 23:28                 ` Chao Yu
  2018-09-18  1:19                   ` Jaegeuk Kim
  0 siblings, 1 reply; 38+ messages in thread
From: Chao Yu @ 2018-09-12 23:28 UTC (permalink / raw)
  To: Jaegeuk Kim, Chao Yu; +Cc: linux-kernel, linux-f2fs-devel

On 2018/9/13 3:54, Jaegeuk Kim wrote:
> On 09/12, Chao Yu wrote:
>> On 2018/9/12 9:40, Chao Yu wrote:
>>> On 2018/9/12 9:25, Jaegeuk Kim wrote:
>>>> On 09/12, Chao Yu wrote:
>>>>> On 2018/9/12 8:27, Jaegeuk Kim wrote:
>>>>>> On 09/11, Jaegeuk Kim wrote:
>>>>>>> On 09/12, Chao Yu wrote:
>>>>>>>> On 2018/9/12 4:15, Jaegeuk Kim wrote:
>>>>>>>>> fsck.f2fs is able to recover the quota structure, since roll-forward recovery
>>>>>>>>> can recover it based on previous user information.
>>>>>>>>
>>>>>>>> I didn't get it, both fsck and kernel recover quota file based all inodes'
>>>>>>>> uid/gid/prjid, if {x}id didn't change, wouldn't those two recovery result be the
>>>>>>>> same?
>>>>>>>
>>>>>>> I thought that, but had to add this, since I was encountering quota errors right
>>>>>>> after getting some files recovered. And, I thought it'd make it more safe to do
>>>>>>> fsck after roll-forward recovery.
>>>>>>>
>>>>>>> Anyway, let me test again without this patch for a while.
>>>>>>
>>>>>> Hmm, I just got a fsck failure right after some files recovered.
>>>>>
>>>>> To make sure, do you test with "f2fs: guarantee journalled quota data by
>>>>> checkpoint"? if not, I think there is no guarantee that f2fs can recover
>>>>> quote info into correct quote file, because, in last checkpoint, quota file
>>>>> may was corrupted/inconsistent. Right?
>>>
>>> Oh, I forget to mention that, I add a patch to fsck to let it noticing
>>> CP_QUOTA_NEED_FSCK_FLAG flag, and by default, fsck will fix corrupted quote
>>> file if the flag is set, but w/o this flag, quota file is still corrupted
>>> detected by fsck, I guess there is bug in v8.
>>
>> In v8, there are two cases we didn't guarantee quota file's consistence:
>> 1. flush time in block_operation exceed a threshold.
>> 2. dquot subsystem error occurs.
>>
>> For above case, fsck should repair the quota file by default.
> 
> Okay, I got another failure and it seems CP_QUOTA_NEED_FSCK_FLAG was not set
> during the recovery. So, we have something missing in the recovery in terms
> of quota updates.

Yeah, I checked the code, just found one suspected place:

find_fsync_dnodes()
 - f2fs_recover_inode_page
  - inc_valid_node_count
   - dquot_reserve_block  dquot info is not initialized now
 - add_fsync_inode
  - dquot_initialize

I think we should reserve block for inode block after dquot_initialize(), can
you confirm this?

> 
>>
>>>
>>> Can you add that in fsck too? so we can separate real kernel bug and quota
>>> file corruption due to dquot subsystem error caused like below case:
>>>
>>> +static int f2fs_dquot_acquire(struct dquot *dquot)
>>> +{
>>> +	int ret;
>>> +
>>> +	ret = dquot_acquire(dquot);
>>> +	if (ret == -ENOSPC || ret == -EIO)
>>> +		set_sbi_flag(F2FS_SB(dquot->dq_sb), SBI_QUOTA_NEED_REPAIR);
>>> +	return ret;
>>> +}
>>>
>>>>
>>>> I hit the failure with v8. And, the test scenario is 1) enable fault injection
>>>> 2) run fsstress, 3) call shutdowon, 4) kill fsstress, 5) unmount, 6) fsck, 7)
>>>> mount, 8) fsck, 9) go 1).
>>
>> 8) fsck is do fscking in a mounted image?
> 
> Missing unmount before 8). :P

Alright. :)

Thanks,

> 
>>
>> Thanks,
>>
>>>>
>>>> So, I'm hitting failure in 8) fsck. I expect 6) fsck should fix any corruption
>>>> and 7) recovered some files on clean checkpoint.
>>>
>>> I see, I can add this case too, does this exist in your xfstest tree in github?
>>>
>>> Thanks,
>>>
>>>>
>>>> Thanks,
>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>>>>>>>> ---
>>>>>>>>>  fs/f2fs/recovery.c | 3 +++
>>>>>>>>>  1 file changed, 3 insertions(+)
>>>>>>>>>
>>>>>>>>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
>>>>>>>>> index 95511ed11a22..1fde86a2107e 100644
>>>>>>>>> --- a/fs/f2fs/recovery.c
>>>>>>>>> +++ b/fs/f2fs/recovery.c
>>>>>>>>> @@ -675,6 +675,9 @@ int f2fs_recover_fsync_data(struct f2fs_sb_info *sbi, bool check_only)
>>>>>>>>>  
>>>>>>>>>  	need_writecp = true;
>>>>>>>>>  
>>>>>>>>> +	/* quota is not fully updated due to the lack of user information. */
>>>>>>>>> +	set_sbi_flag(sbi, SBI_NEED_FSCK);
>>>>>>>>> +
>>>>>>>>>  	/* step #2: recover data */
>>>>>>>>>  	err = recover_data(sbi, &inode_list, &dir_list);
>>>>>>>>>  	if (!err)
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Linux-f2fs-devel mailing list
>>>>>>> Linux-f2fs-devel@lists.sourceforge.net
>>>>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>>>>>
>>>>>> .
>>>>>>
>>>>
>>>> .
>>>>
>>>
>>>
>>> .
>>>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-12 19:50                 ` Jaegeuk Kim
@ 2018-09-12 23:30                   ` Chao Yu
  0 siblings, 0 replies; 38+ messages in thread
From: Chao Yu @ 2018-09-12 23:30 UTC (permalink / raw)
  To: Jaegeuk Kim, Chao Yu; +Cc: linux-kernel, linux-f2fs-devel

On 2018/9/13 3:50, Jaegeuk Kim wrote:
> On 09/12, Chao Yu wrote:
>> On 2018/9/12 10:46, Jaegeuk Kim wrote:
>>> On 09/12, Chao Yu wrote:
>>>> On 2018/9/12 9:25, Jaegeuk Kim wrote:
>>>>> On 09/12, Chao Yu wrote:
>>>>>> On 2018/9/12 8:27, Jaegeuk Kim wrote:
>>>>>>> On 09/11, Jaegeuk Kim wrote:
>>>>>>>> On 09/12, Chao Yu wrote:
>>>>>>>>> On 2018/9/12 4:15, Jaegeuk Kim wrote:
>>>>>>>>>> fsck.f2fs is able to recover the quota structure, since roll-forward recovery
>>>>>>>>>> can recover it based on previous user information.
>>>>>>>>>
>>>>>>>>> I didn't get it, both fsck and kernel recover quota file based all inodes'
>>>>>>>>> uid/gid/prjid, if {x}id didn't change, wouldn't those two recovery result be the
>>>>>>>>> same?
>>>>>>>>
>>>>>>>> I thought that, but had to add this, since I was encountering quota errors right
>>>>>>>> after getting some files recovered. And, I thought it'd make it more safe to do
>>>>>>>> fsck after roll-forward recovery.
>>>>>>>>
>>>>>>>> Anyway, let me test again without this patch for a while.
>>>>>>>
>>>>>>> Hmm, I just got a fsck failure right after some files recovered.
>>>>>>
>>>>>> To make sure, do you test with "f2fs: guarantee journalled quota data by
>>>>>> checkpoint"? if not, I think there is no guarantee that f2fs can recover
>>>>>> quote info into correct quote file, because, in last checkpoint, quota file
>>>>>> may was corrupted/inconsistent. Right?
>>>>
>>>> Oh, I forget to mention that, I add a patch to fsck to let it noticing
>>>> CP_QUOTA_NEED_FSCK_FLAG flag, and by default, fsck will fix corrupted quote
>>>> file if the flag is set, but w/o this flag, quota file is still corrupted
>>>> detected by fsck, I guess there is bug in v8.
>>>>
>>>> Can you add that in fsck too? so we can separate real kernel bug and quota
>>>> file corruption due to dquot subsystem error caused like below case:
>>>
>>> I'm testing to trigger fsck when it sees the below flag. But, when considering
>>> old f2fs-tools, we may need a way to detect mkfs version in superblock in order
>>
>> Oh, that will make kernel be complicated... kernel should be aware of user
>> space things... if user use old tools and new kernel, how about just let
>> kernel give warning on dmesg to user to upgrade f2fs tool set.
>>
>> And also, even if w/o CP_FSCK_FLAG, fsck can also detect such quote file
>> corruption, then do repair, right?
> 
> It requires scanning the directory tree, which doesn't make sense.

Yeah, that's right.

Thanks,

> 
>>
>> Thanks,
>>
>>> to determine whether we can rely on this new flag or not.
>>>
>>>>
>>>> +static int f2fs_dquot_acquire(struct dquot *dquot)
>>>> +{
>>>> +	int ret;
>>>> +
>>>> +	ret = dquot_acquire(dquot);
>>>> +	if (ret == -ENOSPC || ret == -EIO)
>>>> +		set_sbi_flag(F2FS_SB(dquot->dq_sb), SBI_QUOTA_NEED_REPAIR);
>>>> +	return ret;
>>>> +}
>>>>
>>>>>
>>>>> I hit the failure with v8. And, the test scenario is 1) enable fault injection
>>>>> 2) run fsstress, 3) call shutdowon, 4) kill fsstress, 5) unmount, 6) fsck, 7)
>>>>> mount, 8) fsck, 9) go 1).
>>>>>
>>>>> So, I'm hitting failure in 8) fsck. I expect 6) fsck should fix any corruption
>>>>> and 7) recovered some files on clean checkpoint.
>>>>
>>>> I see, I can add this case too, does this exist in your xfstest tree in github?
>>>
>>> I think so.
>>>
>>>>
>>>> Thanks,
>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>>>>>>>>> ---
>>>>>>>>>>  fs/f2fs/recovery.c | 3 +++
>>>>>>>>>>  1 file changed, 3 insertions(+)
>>>>>>>>>>
>>>>>>>>>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
>>>>>>>>>> index 95511ed11a22..1fde86a2107e 100644
>>>>>>>>>> --- a/fs/f2fs/recovery.c
>>>>>>>>>> +++ b/fs/f2fs/recovery.c
>>>>>>>>>> @@ -675,6 +675,9 @@ int f2fs_recover_fsync_data(struct f2fs_sb_info *sbi, bool check_only)
>>>>>>>>>>  
>>>>>>>>>>  	need_writecp = true;
>>>>>>>>>>  
>>>>>>>>>> +	/* quota is not fully updated due to the lack of user information. */
>>>>>>>>>> +	set_sbi_flag(sbi, SBI_NEED_FSCK);
>>>>>>>>>> +
>>>>>>>>>>  	/* step #2: recover data */
>>>>>>>>>>  	err = recover_data(sbi, &inode_list, &dir_list);
>>>>>>>>>>  	if (!err)
>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Linux-f2fs-devel mailing list
>>>>>>>> Linux-f2fs-devel@lists.sourceforge.net
>>>>>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>>>>>>
>>>>>>> .
>>>>>>>
>>>>>
>>>>> .
>>>>>
>>>
>>> .
>>>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-12 23:28                 ` Chao Yu
@ 2018-09-18  1:19                   ` Jaegeuk Kim
  2018-09-18  1:38                     ` Chao Yu
  0 siblings, 1 reply; 38+ messages in thread
From: Jaegeuk Kim @ 2018-09-18  1:19 UTC (permalink / raw)
  To: Chao Yu; +Cc: Chao Yu, linux-kernel, linux-f2fs-devel

On 09/13, Chao Yu wrote:
> On 2018/9/13 3:54, Jaegeuk Kim wrote:
> > On 09/12, Chao Yu wrote:
> >> On 2018/9/12 9:40, Chao Yu wrote:
> >>> On 2018/9/12 9:25, Jaegeuk Kim wrote:
> >>>> On 09/12, Chao Yu wrote:
> >>>>> On 2018/9/12 8:27, Jaegeuk Kim wrote:
> >>>>>> On 09/11, Jaegeuk Kim wrote:
> >>>>>>> On 09/12, Chao Yu wrote:
> >>>>>>>> On 2018/9/12 4:15, Jaegeuk Kim wrote:
> >>>>>>>>> fsck.f2fs is able to recover the quota structure, since roll-forward recovery
> >>>>>>>>> can recover it based on previous user information.
> >>>>>>>>
> >>>>>>>> I didn't get it, both fsck and kernel recover quota file based all inodes'
> >>>>>>>> uid/gid/prjid, if {x}id didn't change, wouldn't those two recovery result be the
> >>>>>>>> same?
> >>>>>>>
> >>>>>>> I thought that, but had to add this, since I was encountering quota errors right
> >>>>>>> after getting some files recovered. And, I thought it'd make it more safe to do
> >>>>>>> fsck after roll-forward recovery.
> >>>>>>>
> >>>>>>> Anyway, let me test again without this patch for a while.
> >>>>>>
> >>>>>> Hmm, I just got a fsck failure right after some files recovered.
> >>>>>
> >>>>> To make sure, do you test with "f2fs: guarantee journalled quota data by
> >>>>> checkpoint"? if not, I think there is no guarantee that f2fs can recover
> >>>>> quote info into correct quote file, because, in last checkpoint, quota file
> >>>>> may was corrupted/inconsistent. Right?
> >>>
> >>> Oh, I forget to mention that, I add a patch to fsck to let it noticing
> >>> CP_QUOTA_NEED_FSCK_FLAG flag, and by default, fsck will fix corrupted quote
> >>> file if the flag is set, but w/o this flag, quota file is still corrupted
> >>> detected by fsck, I guess there is bug in v8.
> >>
> >> In v8, there are two cases we didn't guarantee quota file's consistence:
> >> 1. flush time in block_operation exceed a threshold.
> >> 2. dquot subsystem error occurs.
> >>
> >> For above case, fsck should repair the quota file by default.
> > 
> > Okay, I got another failure and it seems CP_QUOTA_NEED_FSCK_FLAG was not set
> > during the recovery. So, we have something missing in the recovery in terms
> > of quota updates.
> 
> Yeah, I checked the code, just found one suspected place:
> 
> find_fsync_dnodes()
>  - f2fs_recover_inode_page
>   - inc_valid_node_count
>    - dquot_reserve_block  dquot info is not initialized now
>  - add_fsync_inode
>   - dquot_initialize
> 
> I think we should reserve block for inode block after dquot_initialize(), can
> you confirm this?

Let me test this.

From b90260bc577fe87570b1ef7b134554a8295b1f6c Mon Sep 17 00:00:00 2001
From: Jaegeuk Kim <jaegeuk@kernel.org>
Date: Mon, 17 Sep 2018 18:14:41 -0700
Subject: [PATCH] f2fs: count inode block for recovered files

If a new file is recovered, we missed to reserve its inode block.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 fs/f2fs/recovery.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
index 56d34193a74b..bff5cf730e13 100644
--- a/fs/f2fs/recovery.c
+++ b/fs/f2fs/recovery.c
@@ -84,6 +84,11 @@ static struct fsync_inode_entry *add_fsync_inode(struct f2fs_sb_info *sbi,
 		err = dquot_alloc_inode(inode);
 		if (err)
 			goto err_out;
+		err = dquot_reserve_block(inode, 1);
+		if (err) {
+			dquot_drop(inode);
+			goto err_out;
+		}
 	}
 
 	entry = f2fs_kmem_cache_alloc(fsync_entry_slab, GFP_F2FS_ZERO);
-- 
2.17.0.441.gb46fe60e1d-goog


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-18  1:19                   ` Jaegeuk Kim
@ 2018-09-18  1:38                     ` Chao Yu
  2018-09-18  2:05                       ` Jaegeuk Kim
  0 siblings, 1 reply; 38+ messages in thread
From: Chao Yu @ 2018-09-18  1:38 UTC (permalink / raw)
  To: Jaegeuk Kim, Chao Yu; +Cc: linux-kernel, linux-f2fs-devel

On 2018/9/18 9:19, Jaegeuk Kim wrote:
> On 09/13, Chao Yu wrote:
>> On 2018/9/13 3:54, Jaegeuk Kim wrote:
>>> On 09/12, Chao Yu wrote:
>>>> On 2018/9/12 9:40, Chao Yu wrote:
>>>>> On 2018/9/12 9:25, Jaegeuk Kim wrote:
>>>>>> On 09/12, Chao Yu wrote:
>>>>>>> On 2018/9/12 8:27, Jaegeuk Kim wrote:
>>>>>>>> On 09/11, Jaegeuk Kim wrote:
>>>>>>>>> On 09/12, Chao Yu wrote:
>>>>>>>>>> On 2018/9/12 4:15, Jaegeuk Kim wrote:
>>>>>>>>>>> fsck.f2fs is able to recover the quota structure, since roll-forward recovery
>>>>>>>>>>> can recover it based on previous user information.
>>>>>>>>>>
>>>>>>>>>> I didn't get it, both fsck and kernel recover quota file based all inodes'
>>>>>>>>>> uid/gid/prjid, if {x}id didn't change, wouldn't those two recovery result be the
>>>>>>>>>> same?
>>>>>>>>>
>>>>>>>>> I thought that, but had to add this, since I was encountering quota errors right
>>>>>>>>> after getting some files recovered. And, I thought it'd make it more safe to do
>>>>>>>>> fsck after roll-forward recovery.
>>>>>>>>>
>>>>>>>>> Anyway, let me test again without this patch for a while.
>>>>>>>>
>>>>>>>> Hmm, I just got a fsck failure right after some files recovered.
>>>>>>>
>>>>>>> To make sure, do you test with "f2fs: guarantee journalled quota data by
>>>>>>> checkpoint"? if not, I think there is no guarantee that f2fs can recover
>>>>>>> quote info into correct quote file, because, in last checkpoint, quota file
>>>>>>> may was corrupted/inconsistent. Right?
>>>>>
>>>>> Oh, I forget to mention that, I add a patch to fsck to let it noticing
>>>>> CP_QUOTA_NEED_FSCK_FLAG flag, and by default, fsck will fix corrupted quote
>>>>> file if the flag is set, but w/o this flag, quota file is still corrupted
>>>>> detected by fsck, I guess there is bug in v8.
>>>>
>>>> In v8, there are two cases we didn't guarantee quota file's consistence:
>>>> 1. flush time in block_operation exceed a threshold.
>>>> 2. dquot subsystem error occurs.
>>>>
>>>> For above case, fsck should repair the quota file by default.
>>>
>>> Okay, I got another failure and it seems CP_QUOTA_NEED_FSCK_FLAG was not set
>>> during the recovery. So, we have something missing in the recovery in terms
>>> of quota updates.
>>
>> Yeah, I checked the code, just found one suspected place:
>>
>> find_fsync_dnodes()
>>  - f2fs_recover_inode_page
>>   - inc_valid_node_count
>>    - dquot_reserve_block  dquot info is not initialized now
>>  - add_fsync_inode
>>   - dquot_initialize
>>
>> I think we should reserve block for inode block after dquot_initialize(), can
>> you confirm this?
> 
> Let me test this.
> 
>>From b90260bc577fe87570b1ef7b134554a8295b1f6c Mon Sep 17 00:00:00 2001
> From: Jaegeuk Kim <jaegeuk@kernel.org>
> Date: Mon, 17 Sep 2018 18:14:41 -0700
> Subject: [PATCH] f2fs: count inode block for recovered files
> 
> If a new file is recovered, we missed to reserve its inode block.

I remember, in order to keep line with other filesystem, unlike on-disk, we
have to keep backward compatibilty, in memory we don't account block number
for f2fs' inode block, but only account inode number for it, so here like
we did in inc_valid_node_count(), we don't need to do this.

Can you test v9 first? I didn't encounter quota corruption with your
testcase right now. Will check it in cell phone environment.

> 
> Signed-off-by: Chao Yu <yuchao0@huawei.com>
> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> ---
>  fs/f2fs/recovery.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> index 56d34193a74b..bff5cf730e13 100644
> --- a/fs/f2fs/recovery.c
> +++ b/fs/f2fs/recovery.c
> @@ -84,6 +84,11 @@ static struct fsync_inode_entry *add_fsync_inode(struct f2fs_sb_info *sbi,
>  		err = dquot_alloc_inode(inode);
>  		if (err)
>  			goto err_out;
> +		err = dquot_reserve_block(inode, 1);
> +		if (err) {
> +			dquot_drop(inode);
> +			goto err_out;
> +		}
>  	}
>  
>  	entry = f2fs_kmem_cache_alloc(fsync_entry_slab, GFP_F2FS_ZERO);
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-18  1:38                     ` Chao Yu
@ 2018-09-18  2:05                       ` Jaegeuk Kim
  2018-09-18 10:13                         ` Chao Yu
  0 siblings, 1 reply; 38+ messages in thread
From: Jaegeuk Kim @ 2018-09-18  2:05 UTC (permalink / raw)
  To: Chao Yu; +Cc: Chao Yu, linux-kernel, linux-f2fs-devel

On 09/18, Chao Yu wrote:
> On 2018/9/18 9:19, Jaegeuk Kim wrote:
> > On 09/13, Chao Yu wrote:
> >> On 2018/9/13 3:54, Jaegeuk Kim wrote:
> >>> On 09/12, Chao Yu wrote:
> >>>> On 2018/9/12 9:40, Chao Yu wrote:
> >>>>> On 2018/9/12 9:25, Jaegeuk Kim wrote:
> >>>>>> On 09/12, Chao Yu wrote:
> >>>>>>> On 2018/9/12 8:27, Jaegeuk Kim wrote:
> >>>>>>>> On 09/11, Jaegeuk Kim wrote:
> >>>>>>>>> On 09/12, Chao Yu wrote:
> >>>>>>>>>> On 2018/9/12 4:15, Jaegeuk Kim wrote:
> >>>>>>>>>>> fsck.f2fs is able to recover the quota structure, since roll-forward recovery
> >>>>>>>>>>> can recover it based on previous user information.
> >>>>>>>>>>
> >>>>>>>>>> I didn't get it, both fsck and kernel recover quota file based all inodes'
> >>>>>>>>>> uid/gid/prjid, if {x}id didn't change, wouldn't those two recovery result be the
> >>>>>>>>>> same?
> >>>>>>>>>
> >>>>>>>>> I thought that, but had to add this, since I was encountering quota errors right
> >>>>>>>>> after getting some files recovered. And, I thought it'd make it more safe to do
> >>>>>>>>> fsck after roll-forward recovery.
> >>>>>>>>>
> >>>>>>>>> Anyway, let me test again without this patch for a while.
> >>>>>>>>
> >>>>>>>> Hmm, I just got a fsck failure right after some files recovered.
> >>>>>>>
> >>>>>>> To make sure, do you test with "f2fs: guarantee journalled quota data by
> >>>>>>> checkpoint"? if not, I think there is no guarantee that f2fs can recover
> >>>>>>> quote info into correct quote file, because, in last checkpoint, quota file
> >>>>>>> may was corrupted/inconsistent. Right?
> >>>>>
> >>>>> Oh, I forget to mention that, I add a patch to fsck to let it noticing
> >>>>> CP_QUOTA_NEED_FSCK_FLAG flag, and by default, fsck will fix corrupted quote
> >>>>> file if the flag is set, but w/o this flag, quota file is still corrupted
> >>>>> detected by fsck, I guess there is bug in v8.
> >>>>
> >>>> In v8, there are two cases we didn't guarantee quota file's consistence:
> >>>> 1. flush time in block_operation exceed a threshold.
> >>>> 2. dquot subsystem error occurs.
> >>>>
> >>>> For above case, fsck should repair the quota file by default.
> >>>
> >>> Okay, I got another failure and it seems CP_QUOTA_NEED_FSCK_FLAG was not set
> >>> during the recovery. So, we have something missing in the recovery in terms
> >>> of quota updates.
> >>
> >> Yeah, I checked the code, just found one suspected place:
> >>
> >> find_fsync_dnodes()
> >>  - f2fs_recover_inode_page
> >>   - inc_valid_node_count
> >>    - dquot_reserve_block  dquot info is not initialized now
> >>  - add_fsync_inode
> >>   - dquot_initialize
> >>
> >> I think we should reserve block for inode block after dquot_initialize(), can
> >> you confirm this?
> > 
> > Let me test this.
> > 
> >>From b90260bc577fe87570b1ef7b134554a8295b1f6c Mon Sep 17 00:00:00 2001
> > From: Jaegeuk Kim <jaegeuk@kernel.org>
> > Date: Mon, 17 Sep 2018 18:14:41 -0700
> > Subject: [PATCH] f2fs: count inode block for recovered files
> > 
> > If a new file is recovered, we missed to reserve its inode block.
> 
> I remember, in order to keep line with other filesystem, unlike on-disk, we
> have to keep backward compatibilty, in memory we don't account block number
> for f2fs' inode block, but only account inode number for it, so here like
> we did in inc_valid_node_count(), we don't need to do this.

Okay, I just hit the error again w/o your patch. Another one coming to my mind
is that caused by uid/gid change during recovery. Let me try out your patch.

> 
> Can you test v9 first? I didn't encounter quota corruption with your
> testcase right now. Will check it in cell phone environment.
> 
> > 
> > Signed-off-by: Chao Yu <yuchao0@huawei.com>
> > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> > ---
> >  fs/f2fs/recovery.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> > index 56d34193a74b..bff5cf730e13 100644
> > --- a/fs/f2fs/recovery.c
> > +++ b/fs/f2fs/recovery.c
> > @@ -84,6 +84,11 @@ static struct fsync_inode_entry *add_fsync_inode(struct f2fs_sb_info *sbi,
> >  		err = dquot_alloc_inode(inode);
> >  		if (err)
> >  			goto err_out;
> > +		err = dquot_reserve_block(inode, 1);
> > +		if (err) {
> > +			dquot_drop(inode);
> > +			goto err_out;
> > +		}
> >  	}
> >  
> >  	entry = f2fs_kmem_cache_alloc(fsync_entry_slab, GFP_F2FS_ZERO);
> > 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-18  2:05                       ` Jaegeuk Kim
@ 2018-09-18 10:13                         ` Chao Yu
  2018-09-18 16:45                           ` Jaegeuk Kim
  0 siblings, 1 reply; 38+ messages in thread
From: Chao Yu @ 2018-09-18 10:13 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: Chao Yu, linux-kernel, linux-f2fs-devel

On 2018/9/18 10:05, Jaegeuk Kim wrote:
> On 09/18, Chao Yu wrote:
>> On 2018/9/18 9:19, Jaegeuk Kim wrote:
>>> On 09/13, Chao Yu wrote:
>>>> On 2018/9/13 3:54, Jaegeuk Kim wrote:
>>>>> On 09/12, Chao Yu wrote:
>>>>>> On 2018/9/12 9:40, Chao Yu wrote:
>>>>>>> On 2018/9/12 9:25, Jaegeuk Kim wrote:
>>>>>>>> On 09/12, Chao Yu wrote:
>>>>>>>>> On 2018/9/12 8:27, Jaegeuk Kim wrote:
>>>>>>>>>> On 09/11, Jaegeuk Kim wrote:
>>>>>>>>>>> On 09/12, Chao Yu wrote:
>>>>>>>>>>>> On 2018/9/12 4:15, Jaegeuk Kim wrote:
>>>>>>>>>>>>> fsck.f2fs is able to recover the quota structure, since roll-forward recovery
>>>>>>>>>>>>> can recover it based on previous user information.
>>>>>>>>>>>>
>>>>>>>>>>>> I didn't get it, both fsck and kernel recover quota file based all inodes'
>>>>>>>>>>>> uid/gid/prjid, if {x}id didn't change, wouldn't those two recovery result be the
>>>>>>>>>>>> same?
>>>>>>>>>>>
>>>>>>>>>>> I thought that, but had to add this, since I was encountering quota errors right
>>>>>>>>>>> after getting some files recovered. And, I thought it'd make it more safe to do
>>>>>>>>>>> fsck after roll-forward recovery.
>>>>>>>>>>>
>>>>>>>>>>> Anyway, let me test again without this patch for a while.
>>>>>>>>>>
>>>>>>>>>> Hmm, I just got a fsck failure right after some files recovered.
>>>>>>>>>
>>>>>>>>> To make sure, do you test with "f2fs: guarantee journalled quota data by
>>>>>>>>> checkpoint"? if not, I think there is no guarantee that f2fs can recover
>>>>>>>>> quote info into correct quote file, because, in last checkpoint, quota file
>>>>>>>>> may was corrupted/inconsistent. Right?
>>>>>>>
>>>>>>> Oh, I forget to mention that, I add a patch to fsck to let it noticing
>>>>>>> CP_QUOTA_NEED_FSCK_FLAG flag, and by default, fsck will fix corrupted quote
>>>>>>> file if the flag is set, but w/o this flag, quota file is still corrupted
>>>>>>> detected by fsck, I guess there is bug in v8.
>>>>>>
>>>>>> In v8, there are two cases we didn't guarantee quota file's consistence:
>>>>>> 1. flush time in block_operation exceed a threshold.
>>>>>> 2. dquot subsystem error occurs.
>>>>>>
>>>>>> For above case, fsck should repair the quota file by default.
>>>>>
>>>>> Okay, I got another failure and it seems CP_QUOTA_NEED_FSCK_FLAG was not set
>>>>> during the recovery. So, we have something missing in the recovery in terms
>>>>> of quota updates.
>>>>
>>>> Yeah, I checked the code, just found one suspected place:
>>>>
>>>> find_fsync_dnodes()
>>>>  - f2fs_recover_inode_page
>>>>   - inc_valid_node_count
>>>>    - dquot_reserve_block  dquot info is not initialized now
>>>>  - add_fsync_inode
>>>>   - dquot_initialize
>>>>
>>>> I think we should reserve block for inode block after dquot_initialize(), can
>>>> you confirm this?
>>>
>>> Let me test this.
>>>
>>> >From b90260bc577fe87570b1ef7b134554a8295b1f6c Mon Sep 17 00:00:00 2001
>>> From: Jaegeuk Kim <jaegeuk@kernel.org>
>>> Date: Mon, 17 Sep 2018 18:14:41 -0700
>>> Subject: [PATCH] f2fs: count inode block for recovered files
>>>
>>> If a new file is recovered, we missed to reserve its inode block.
>>
>> I remember, in order to keep line with other filesystem, unlike on-disk, we
>> have to keep backward compatibilty, in memory we don't account block number
>> for f2fs' inode block, but only account inode number for it, so here like
>> we did in inc_valid_node_count(), we don't need to do this.
> 
> Okay, I just hit the error again w/o your patch. Another one coming to my mind
> is that caused by uid/gid change during recovery. Let me try out your patch.

I guess we should update dquot and inode's uid/gid atomically under
lock_op() in f2fs_setattr() to prevent corruption on sys quota file.

v9 can pass all xfstest cases and por_fsstress case w/ sys quota file
enabled, but w/ normal quota file, I got one regression reported by
generic/232, I fixed in v10, will do some tests and release it later.

Note that, my fsck can fix corrupted quota file automatically once
CP_QUOTA_NEED_FSCK_FLAG is set.

Thanks,

> 
>>
>> Can you test v9 first? I didn't encounter quota corruption with your
>> testcase right now. Will check it in cell phone environment.
>>
>>>
>>> Signed-off-by: Chao Yu <yuchao0@huawei.com>
>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>> ---
>>>  fs/f2fs/recovery.c | 5 +++++
>>>  1 file changed, 5 insertions(+)
>>>
>>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
>>> index 56d34193a74b..bff5cf730e13 100644
>>> --- a/fs/f2fs/recovery.c
>>> +++ b/fs/f2fs/recovery.c
>>> @@ -84,6 +84,11 @@ static struct fsync_inode_entry *add_fsync_inode(struct f2fs_sb_info *sbi,
>>>  		err = dquot_alloc_inode(inode);
>>>  		if (err)
>>>  			goto err_out;
>>> +		err = dquot_reserve_block(inode, 1);
>>> +		if (err) {
>>> +			dquot_drop(inode);
>>> +			goto err_out;
>>> +		}
>>>  	}
>>>  
>>>  	entry = f2fs_kmem_cache_alloc(fsync_entry_slab, GFP_F2FS_ZERO);
>>>
> 
> .
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-18 10:13                         ` Chao Yu
@ 2018-09-18 16:45                           ` Jaegeuk Kim
  2018-09-19  1:38                             ` Chao Yu
  0 siblings, 1 reply; 38+ messages in thread
From: Jaegeuk Kim @ 2018-09-18 16:45 UTC (permalink / raw)
  To: Chao Yu; +Cc: Chao Yu, linux-kernel, linux-f2fs-devel

On 09/18, Chao Yu wrote:
> On 2018/9/18 10:05, Jaegeuk Kim wrote:
> > On 09/18, Chao Yu wrote:
> >> On 2018/9/18 9:19, Jaegeuk Kim wrote:
> >>> On 09/13, Chao Yu wrote:
> >>>> On 2018/9/13 3:54, Jaegeuk Kim wrote:
> >>>>> On 09/12, Chao Yu wrote:
> >>>>>> On 2018/9/12 9:40, Chao Yu wrote:
> >>>>>>> On 2018/9/12 9:25, Jaegeuk Kim wrote:
> >>>>>>>> On 09/12, Chao Yu wrote:
> >>>>>>>>> On 2018/9/12 8:27, Jaegeuk Kim wrote:
> >>>>>>>>>> On 09/11, Jaegeuk Kim wrote:
> >>>>>>>>>>> On 09/12, Chao Yu wrote:
> >>>>>>>>>>>> On 2018/9/12 4:15, Jaegeuk Kim wrote:
> >>>>>>>>>>>>> fsck.f2fs is able to recover the quota structure, since roll-forward recovery
> >>>>>>>>>>>>> can recover it based on previous user information.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I didn't get it, both fsck and kernel recover quota file based all inodes'
> >>>>>>>>>>>> uid/gid/prjid, if {x}id didn't change, wouldn't those two recovery result be the
> >>>>>>>>>>>> same?
> >>>>>>>>>>>
> >>>>>>>>>>> I thought that, but had to add this, since I was encountering quota errors right
> >>>>>>>>>>> after getting some files recovered. And, I thought it'd make it more safe to do
> >>>>>>>>>>> fsck after roll-forward recovery.
> >>>>>>>>>>>
> >>>>>>>>>>> Anyway, let me test again without this patch for a while.
> >>>>>>>>>>
> >>>>>>>>>> Hmm, I just got a fsck failure right after some files recovered.
> >>>>>>>>>
> >>>>>>>>> To make sure, do you test with "f2fs: guarantee journalled quota data by
> >>>>>>>>> checkpoint"? if not, I think there is no guarantee that f2fs can recover
> >>>>>>>>> quote info into correct quote file, because, in last checkpoint, quota file
> >>>>>>>>> may was corrupted/inconsistent. Right?
> >>>>>>>
> >>>>>>> Oh, I forget to mention that, I add a patch to fsck to let it noticing
> >>>>>>> CP_QUOTA_NEED_FSCK_FLAG flag, and by default, fsck will fix corrupted quote
> >>>>>>> file if the flag is set, but w/o this flag, quota file is still corrupted
> >>>>>>> detected by fsck, I guess there is bug in v8.
> >>>>>>
> >>>>>> In v8, there are two cases we didn't guarantee quota file's consistence:
> >>>>>> 1. flush time in block_operation exceed a threshold.
> >>>>>> 2. dquot subsystem error occurs.
> >>>>>>
> >>>>>> For above case, fsck should repair the quota file by default.
> >>>>>
> >>>>> Okay, I got another failure and it seems CP_QUOTA_NEED_FSCK_FLAG was not set
> >>>>> during the recovery. So, we have something missing in the recovery in terms
> >>>>> of quota updates.
> >>>>
> >>>> Yeah, I checked the code, just found one suspected place:
> >>>>
> >>>> find_fsync_dnodes()
> >>>>  - f2fs_recover_inode_page
> >>>>   - inc_valid_node_count
> >>>>    - dquot_reserve_block  dquot info is not initialized now
> >>>>  - add_fsync_inode
> >>>>   - dquot_initialize
> >>>>
> >>>> I think we should reserve block for inode block after dquot_initialize(), can
> >>>> you confirm this?
> >>>
> >>> Let me test this.
> >>>
> >>> >From b90260bc577fe87570b1ef7b134554a8295b1f6c Mon Sep 17 00:00:00 2001
> >>> From: Jaegeuk Kim <jaegeuk@kernel.org>
> >>> Date: Mon, 17 Sep 2018 18:14:41 -0700
> >>> Subject: [PATCH] f2fs: count inode block for recovered files
> >>>
> >>> If a new file is recovered, we missed to reserve its inode block.
> >>
> >> I remember, in order to keep line with other filesystem, unlike on-disk, we
> >> have to keep backward compatibilty, in memory we don't account block number
> >> for f2fs' inode block, but only account inode number for it, so here like
> >> we did in inc_valid_node_count(), we don't need to do this.
> > 
> > Okay, I just hit the error again w/o your patch. Another one coming to my mind
> > is that caused by uid/gid change during recovery. Let me try out your patch.
> 
> I guess we should update dquot and inode's uid/gid atomically under
> lock_op() in f2fs_setattr() to prevent corruption on sys quota file.
> 
> v9 can pass all xfstest cases and por_fsstress case w/ sys quota file
> enabled, but w/ normal quota file, I got one regression reported by
> generic/232, I fixed in v10, will do some tests and release it later.
> 
> Note that, my fsck can fix corrupted quota file automatically once
> CP_QUOTA_NEED_FSCK_FLAG is set.

I hit failures again with your v9 w/ sysfile quota and modified fsck to detect
CP_QUOTA_NEED_FSCK_FLAG to fix the partition. Note that, if I set NEED_FSCK
flag in roll-forward recovery, everything is fine.

> 
> Thanks,
> 
> > 
> >>
> >> Can you test v9 first? I didn't encounter quota corruption with your
> >> testcase right now. Will check it in cell phone environment.
> >>
> >>>
> >>> Signed-off-by: Chao Yu <yuchao0@huawei.com>
> >>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> >>> ---
> >>>  fs/f2fs/recovery.c | 5 +++++
> >>>  1 file changed, 5 insertions(+)
> >>>
> >>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> >>> index 56d34193a74b..bff5cf730e13 100644
> >>> --- a/fs/f2fs/recovery.c
> >>> +++ b/fs/f2fs/recovery.c
> >>> @@ -84,6 +84,11 @@ static struct fsync_inode_entry *add_fsync_inode(struct f2fs_sb_info *sbi,
> >>>  		err = dquot_alloc_inode(inode);
> >>>  		if (err)
> >>>  			goto err_out;
> >>> +		err = dquot_reserve_block(inode, 1);
> >>> +		if (err) {
> >>> +			dquot_drop(inode);
> >>> +			goto err_out;
> >>> +		}
> >>>  	}
> >>>  
> >>>  	entry = f2fs_kmem_cache_alloc(fsync_entry_slab, GFP_F2FS_ZERO);
> >>>
> > 
> > .
> > 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-18 16:45                           ` Jaegeuk Kim
@ 2018-09-19  1:38                             ` Chao Yu
  2018-09-19 22:38                               ` Jaegeuk Kim
  0 siblings, 1 reply; 38+ messages in thread
From: Chao Yu @ 2018-09-19  1:38 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: Chao Yu, linux-kernel, linux-f2fs-devel

On 2018/9/19 0:45, Jaegeuk Kim wrote:
> On 09/18, Chao Yu wrote:
>> On 2018/9/18 10:05, Jaegeuk Kim wrote:
>>> On 09/18, Chao Yu wrote:
>>>> On 2018/9/18 9:19, Jaegeuk Kim wrote:
>>>>> On 09/13, Chao Yu wrote:
>>>>>> On 2018/9/13 3:54, Jaegeuk Kim wrote:
>>>>>>> On 09/12, Chao Yu wrote:
>>>>>>>> On 2018/9/12 9:40, Chao Yu wrote:
>>>>>>>>> On 2018/9/12 9:25, Jaegeuk Kim wrote:
>>>>>>>>>> On 09/12, Chao Yu wrote:
>>>>>>>>>>> On 2018/9/12 8:27, Jaegeuk Kim wrote:
>>>>>>>>>>>> On 09/11, Jaegeuk Kim wrote:
>>>>>>>>>>>>> On 09/12, Chao Yu wrote:
>>>>>>>>>>>>>> On 2018/9/12 4:15, Jaegeuk Kim wrote:
>>>>>>>>>>>>>>> fsck.f2fs is able to recover the quota structure, since roll-forward recovery
>>>>>>>>>>>>>>> can recover it based on previous user information.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I didn't get it, both fsck and kernel recover quota file based all inodes'
>>>>>>>>>>>>>> uid/gid/prjid, if {x}id didn't change, wouldn't those two recovery result be the
>>>>>>>>>>>>>> same?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I thought that, but had to add this, since I was encountering quota errors right
>>>>>>>>>>>>> after getting some files recovered. And, I thought it'd make it more safe to do
>>>>>>>>>>>>> fsck after roll-forward recovery.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Anyway, let me test again without this patch for a while.
>>>>>>>>>>>>
>>>>>>>>>>>> Hmm, I just got a fsck failure right after some files recovered.
>>>>>>>>>>>
>>>>>>>>>>> To make sure, do you test with "f2fs: guarantee journalled quota data by
>>>>>>>>>>> checkpoint"? if not, I think there is no guarantee that f2fs can recover
>>>>>>>>>>> quote info into correct quote file, because, in last checkpoint, quota file
>>>>>>>>>>> may was corrupted/inconsistent. Right?
>>>>>>>>>
>>>>>>>>> Oh, I forget to mention that, I add a patch to fsck to let it noticing
>>>>>>>>> CP_QUOTA_NEED_FSCK_FLAG flag, and by default, fsck will fix corrupted quote
>>>>>>>>> file if the flag is set, but w/o this flag, quota file is still corrupted
>>>>>>>>> detected by fsck, I guess there is bug in v8.
>>>>>>>>
>>>>>>>> In v8, there are two cases we didn't guarantee quota file's consistence:
>>>>>>>> 1. flush time in block_operation exceed a threshold.
>>>>>>>> 2. dquot subsystem error occurs.
>>>>>>>>
>>>>>>>> For above case, fsck should repair the quota file by default.
>>>>>>>
>>>>>>> Okay, I got another failure and it seems CP_QUOTA_NEED_FSCK_FLAG was not set
>>>>>>> during the recovery. So, we have something missing in the recovery in terms
>>>>>>> of quota updates.
>>>>>>
>>>>>> Yeah, I checked the code, just found one suspected place:
>>>>>>
>>>>>> find_fsync_dnodes()
>>>>>>  - f2fs_recover_inode_page
>>>>>>   - inc_valid_node_count
>>>>>>    - dquot_reserve_block  dquot info is not initialized now
>>>>>>  - add_fsync_inode
>>>>>>   - dquot_initialize
>>>>>>
>>>>>> I think we should reserve block for inode block after dquot_initialize(), can
>>>>>> you confirm this?
>>>>>
>>>>> Let me test this.
>>>>>
>>>>> >From b90260bc577fe87570b1ef7b134554a8295b1f6c Mon Sep 17 00:00:00 2001
>>>>> From: Jaegeuk Kim <jaegeuk@kernel.org>
>>>>> Date: Mon, 17 Sep 2018 18:14:41 -0700
>>>>> Subject: [PATCH] f2fs: count inode block for recovered files
>>>>>
>>>>> If a new file is recovered, we missed to reserve its inode block.
>>>>
>>>> I remember, in order to keep line with other filesystem, unlike on-disk, we
>>>> have to keep backward compatibilty, in memory we don't account block number
>>>> for f2fs' inode block, but only account inode number for it, so here like
>>>> we did in inc_valid_node_count(), we don't need to do this.
>>>
>>> Okay, I just hit the error again w/o your patch. Another one coming to my mind
>>> is that caused by uid/gid change during recovery. Let me try out your patch.
>>
>> I guess we should update dquot and inode's uid/gid atomically under
>> lock_op() in f2fs_setattr() to prevent corruption on sys quota file.
>>
>> v9 can pass all xfstest cases and por_fsstress case w/ sys quota file
>> enabled, but w/ normal quota file, I got one regression reported by
>> generic/232, I fixed in v10, will do some tests and release it later.
>>
>> Note that, my fsck can fix corrupted quota file automatically once
>> CP_QUOTA_NEED_FSCK_FLAG is set.
> 
> I hit failures again with your v9 w/ sysfile quota and modified fsck to detect

That's strange, in my environment, before v9, I always encounter corrupted
quota sysfile after step 9), after v9, I never hit failure again.

1) enable fault injection
2) run fsstress
3) call shutdowon
4) kill fsstress
5) unmount
6) fsck
7) mount
8) umount
9) fsck
10) go 1).

> CP_QUOTA_NEED_FSCK_FLAG to fix the partition. Note that, if I set NEED_FSCK
> flag in roll-forward recovery, everything is fine.

I do the test based on codes in my git tree, could you check the result
again based on my code? in where I just disable nat_bits recovery, not
sure, in step 6) fsck can break some thing in image.

https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=f2fs-dev

Also, I just send the fsck code, could you check that too?

And I'd like to know your mount option and mkfs option, could you list for me?

Thanks,

> 
>>
>> Thanks,
>>
>>>
>>>>
>>>> Can you test v9 first? I didn't encounter quota corruption with your
>>>> testcase right now. Will check it in cell phone environment.
>>>>
>>>>>
>>>>> Signed-off-by: Chao Yu <yuchao0@huawei.com>
>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>>>> ---
>>>>>  fs/f2fs/recovery.c | 5 +++++
>>>>>  1 file changed, 5 insertions(+)
>>>>>
>>>>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
>>>>> index 56d34193a74b..bff5cf730e13 100644
>>>>> --- a/fs/f2fs/recovery.c
>>>>> +++ b/fs/f2fs/recovery.c
>>>>> @@ -84,6 +84,11 @@ static struct fsync_inode_entry *add_fsync_inode(struct f2fs_sb_info *sbi,
>>>>>  		err = dquot_alloc_inode(inode);
>>>>>  		if (err)
>>>>>  			goto err_out;
>>>>> +		err = dquot_reserve_block(inode, 1);
>>>>> +		if (err) {
>>>>> +			dquot_drop(inode);
>>>>> +			goto err_out;
>>>>> +		}
>>>>>  	}
>>>>>  
>>>>>  	entry = f2fs_kmem_cache_alloc(fsync_entry_slab, GFP_F2FS_ZERO);
>>>>>
>>>
>>> .
>>>
> 
> .
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-19  1:38                             ` Chao Yu
@ 2018-09-19 22:38                               ` Jaegeuk Kim
  2018-09-20  9:46                                 ` Chao Yu
  0 siblings, 1 reply; 38+ messages in thread
From: Jaegeuk Kim @ 2018-09-19 22:38 UTC (permalink / raw)
  To: Chao Yu; +Cc: Chao Yu, linux-kernel, linux-f2fs-devel

On 09/19, Chao Yu wrote:
> On 2018/9/19 0:45, Jaegeuk Kim wrote:
> > On 09/18, Chao Yu wrote:
> >> On 2018/9/18 10:05, Jaegeuk Kim wrote:
> >>> On 09/18, Chao Yu wrote:
> >>>> On 2018/9/18 9:19, Jaegeuk Kim wrote:
> >>>>> On 09/13, Chao Yu wrote:
> >>>>>> On 2018/9/13 3:54, Jaegeuk Kim wrote:
> >>>>>>> On 09/12, Chao Yu wrote:
> >>>>>>>> On 2018/9/12 9:40, Chao Yu wrote:
> >>>>>>>>> On 2018/9/12 9:25, Jaegeuk Kim wrote:
> >>>>>>>>>> On 09/12, Chao Yu wrote:
> >>>>>>>>>>> On 2018/9/12 8:27, Jaegeuk Kim wrote:
> >>>>>>>>>>>> On 09/11, Jaegeuk Kim wrote:
> >>>>>>>>>>>>> On 09/12, Chao Yu wrote:
> >>>>>>>>>>>>>> On 2018/9/12 4:15, Jaegeuk Kim wrote:
> >>>>>>>>>>>>>>> fsck.f2fs is able to recover the quota structure, since roll-forward recovery
> >>>>>>>>>>>>>>> can recover it based on previous user information.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I didn't get it, both fsck and kernel recover quota file based all inodes'
> >>>>>>>>>>>>>> uid/gid/prjid, if {x}id didn't change, wouldn't those two recovery result be the
> >>>>>>>>>>>>>> same?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I thought that, but had to add this, since I was encountering quota errors right
> >>>>>>>>>>>>> after getting some files recovered. And, I thought it'd make it more safe to do
> >>>>>>>>>>>>> fsck after roll-forward recovery.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Anyway, let me test again without this patch for a while.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Hmm, I just got a fsck failure right after some files recovered.
> >>>>>>>>>>>
> >>>>>>>>>>> To make sure, do you test with "f2fs: guarantee journalled quota data by
> >>>>>>>>>>> checkpoint"? if not, I think there is no guarantee that f2fs can recover
> >>>>>>>>>>> quote info into correct quote file, because, in last checkpoint, quota file
> >>>>>>>>>>> may was corrupted/inconsistent. Right?
> >>>>>>>>>
> >>>>>>>>> Oh, I forget to mention that, I add a patch to fsck to let it noticing
> >>>>>>>>> CP_QUOTA_NEED_FSCK_FLAG flag, and by default, fsck will fix corrupted quote
> >>>>>>>>> file if the flag is set, but w/o this flag, quota file is still corrupted
> >>>>>>>>> detected by fsck, I guess there is bug in v8.
> >>>>>>>>
> >>>>>>>> In v8, there are two cases we didn't guarantee quota file's consistence:
> >>>>>>>> 1. flush time in block_operation exceed a threshold.
> >>>>>>>> 2. dquot subsystem error occurs.
> >>>>>>>>
> >>>>>>>> For above case, fsck should repair the quota file by default.
> >>>>>>>
> >>>>>>> Okay, I got another failure and it seems CP_QUOTA_NEED_FSCK_FLAG was not set
> >>>>>>> during the recovery. So, we have something missing in the recovery in terms
> >>>>>>> of quota updates.
> >>>>>>
> >>>>>> Yeah, I checked the code, just found one suspected place:
> >>>>>>
> >>>>>> find_fsync_dnodes()
> >>>>>>  - f2fs_recover_inode_page
> >>>>>>   - inc_valid_node_count
> >>>>>>    - dquot_reserve_block  dquot info is not initialized now
> >>>>>>  - add_fsync_inode
> >>>>>>   - dquot_initialize
> >>>>>>
> >>>>>> I think we should reserve block for inode block after dquot_initialize(), can
> >>>>>> you confirm this?
> >>>>>
> >>>>> Let me test this.
> >>>>>
> >>>>> >From b90260bc577fe87570b1ef7b134554a8295b1f6c Mon Sep 17 00:00:00 2001
> >>>>> From: Jaegeuk Kim <jaegeuk@kernel.org>
> >>>>> Date: Mon, 17 Sep 2018 18:14:41 -0700
> >>>>> Subject: [PATCH] f2fs: count inode block for recovered files
> >>>>>
> >>>>> If a new file is recovered, we missed to reserve its inode block.
> >>>>
> >>>> I remember, in order to keep line with other filesystem, unlike on-disk, we
> >>>> have to keep backward compatibilty, in memory we don't account block number
> >>>> for f2fs' inode block, but only account inode number for it, so here like
> >>>> we did in inc_valid_node_count(), we don't need to do this.
> >>>
> >>> Okay, I just hit the error again w/o your patch. Another one coming to my mind
> >>> is that caused by uid/gid change during recovery. Let me try out your patch.
> >>
> >> I guess we should update dquot and inode's uid/gid atomically under
> >> lock_op() in f2fs_setattr() to prevent corruption on sys quota file.
> >>
> >> v9 can pass all xfstest cases and por_fsstress case w/ sys quota file
> >> enabled, but w/ normal quota file, I got one regression reported by
> >> generic/232, I fixed in v10, will do some tests and release it later.
> >>
> >> Note that, my fsck can fix corrupted quota file automatically once
> >> CP_QUOTA_NEED_FSCK_FLAG is set.
> > 
> > I hit failures again with your v9 w/ sysfile quota and modified fsck to detect
> 
> That's strange, in my environment, before v9, I always encounter corrupted
> quota sysfile after step 9), after v9, I never hit failure again.
> 
> 1) enable fault injection
> 2) run fsstress
> 3) call shutdowon
> 4) kill fsstress
> 5) unmount
> 6) fsck
> 7) mount
> 8) umount
> 9) fsck
> 10) go 1).
> 
> > CP_QUOTA_NEED_FSCK_FLAG to fix the partition. Note that, if I set NEED_FSCK
> > flag in roll-forward recovery, everything is fine.
> 
> I do the test based on codes in my git tree, could you check the result
> again based on my code? in where I just disable nat_bits recovery, not
> sure, in step 6) fsck can break some thing in image.
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=f2fs-dev
> 
> Also, I just send the fsck code, could you check that too?
> 
> And I'd like to know your mount option and mkfs option, could you list for me?

I'm just doing this.
https://github.com/jaegeuk/xfstests-f2fs/blob/f2fs/run.sh#L220

> 
> Thanks,
> 
> > 
> >>
> >> Thanks,
> >>
> >>>
> >>>>
> >>>> Can you test v9 first? I didn't encounter quota corruption with your
> >>>> testcase right now. Will check it in cell phone environment.
> >>>>
> >>>>>
> >>>>> Signed-off-by: Chao Yu <yuchao0@huawei.com>
> >>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> >>>>> ---
> >>>>>  fs/f2fs/recovery.c | 5 +++++
> >>>>>  1 file changed, 5 insertions(+)
> >>>>>
> >>>>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> >>>>> index 56d34193a74b..bff5cf730e13 100644
> >>>>> --- a/fs/f2fs/recovery.c
> >>>>> +++ b/fs/f2fs/recovery.c
> >>>>> @@ -84,6 +84,11 @@ static struct fsync_inode_entry *add_fsync_inode(struct f2fs_sb_info *sbi,
> >>>>>  		err = dquot_alloc_inode(inode);
> >>>>>  		if (err)
> >>>>>  			goto err_out;
> >>>>> +		err = dquot_reserve_block(inode, 1);
> >>>>> +		if (err) {
> >>>>> +			dquot_drop(inode);
> >>>>> +			goto err_out;
> >>>>> +		}
> >>>>>  	}
> >>>>>  
> >>>>>  	entry = f2fs_kmem_cache_alloc(fsync_entry_slab, GFP_F2FS_ZERO);
> >>>>>
> >>>
> >>> .
> >>>
> > 
> > .
> > 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-19 22:38                               ` Jaegeuk Kim
@ 2018-09-20  9:46                                 ` Chao Yu
  2018-09-20 21:42                                   ` Jaegeuk Kim
  0 siblings, 1 reply; 38+ messages in thread
From: Chao Yu @ 2018-09-20  9:46 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: Chao Yu, linux-kernel, linux-f2fs-devel

On 2018/9/20 6:38, Jaegeuk Kim wrote:
> On 09/19, Chao Yu wrote:
>> On 2018/9/19 0:45, Jaegeuk Kim wrote:
>>> On 09/18, Chao Yu wrote:
>>>> On 2018/9/18 10:05, Jaegeuk Kim wrote:
>>>>> On 09/18, Chao Yu wrote:
>>>>>> On 2018/9/18 9:19, Jaegeuk Kim wrote:
>>>>>>> On 09/13, Chao Yu wrote:
>>>>>>>> On 2018/9/13 3:54, Jaegeuk Kim wrote:
>>>>>>>>> On 09/12, Chao Yu wrote:
>>>>>>>>>> On 2018/9/12 9:40, Chao Yu wrote:
>>>>>>>>>>> On 2018/9/12 9:25, Jaegeuk Kim wrote:
>>>>>>>>>>>> On 09/12, Chao Yu wrote:
>>>>>>>>>>>>> On 2018/9/12 8:27, Jaegeuk Kim wrote:
>>>>>>>>>>>>>> On 09/11, Jaegeuk Kim wrote:
>>>>>>>>>>>>>>> On 09/12, Chao Yu wrote:
>>>>>>>>>>>>>>>> On 2018/9/12 4:15, Jaegeuk Kim wrote:
>>>>>>>>>>>>>>>>> fsck.f2fs is able to recover the quota structure, since roll-forward recovery
>>>>>>>>>>>>>>>>> can recover it based on previous user information.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I didn't get it, both fsck and kernel recover quota file based all inodes'
>>>>>>>>>>>>>>>> uid/gid/prjid, if {x}id didn't change, wouldn't those two recovery result be the
>>>>>>>>>>>>>>>> same?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I thought that, but had to add this, since I was encountering quota errors right
>>>>>>>>>>>>>>> after getting some files recovered. And, I thought it'd make it more safe to do
>>>>>>>>>>>>>>> fsck after roll-forward recovery.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Anyway, let me test again without this patch for a while.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hmm, I just got a fsck failure right after some files recovered.
>>>>>>>>>>>>>
>>>>>>>>>>>>> To make sure, do you test with "f2fs: guarantee journalled quota data by
>>>>>>>>>>>>> checkpoint"? if not, I think there is no guarantee that f2fs can recover
>>>>>>>>>>>>> quote info into correct quote file, because, in last checkpoint, quota file
>>>>>>>>>>>>> may was corrupted/inconsistent. Right?
>>>>>>>>>>>
>>>>>>>>>>> Oh, I forget to mention that, I add a patch to fsck to let it noticing
>>>>>>>>>>> CP_QUOTA_NEED_FSCK_FLAG flag, and by default, fsck will fix corrupted quote
>>>>>>>>>>> file if the flag is set, but w/o this flag, quota file is still corrupted
>>>>>>>>>>> detected by fsck, I guess there is bug in v8.
>>>>>>>>>>
>>>>>>>>>> In v8, there are two cases we didn't guarantee quota file's consistence:
>>>>>>>>>> 1. flush time in block_operation exceed a threshold.
>>>>>>>>>> 2. dquot subsystem error occurs.
>>>>>>>>>>
>>>>>>>>>> For above case, fsck should repair the quota file by default.
>>>>>>>>>
>>>>>>>>> Okay, I got another failure and it seems CP_QUOTA_NEED_FSCK_FLAG was not set
>>>>>>>>> during the recovery. So, we have something missing in the recovery in terms
>>>>>>>>> of quota updates.
>>>>>>>>
>>>>>>>> Yeah, I checked the code, just found one suspected place:
>>>>>>>>
>>>>>>>> find_fsync_dnodes()
>>>>>>>>  - f2fs_recover_inode_page
>>>>>>>>   - inc_valid_node_count
>>>>>>>>    - dquot_reserve_block  dquot info is not initialized now
>>>>>>>>  - add_fsync_inode
>>>>>>>>   - dquot_initialize
>>>>>>>>
>>>>>>>> I think we should reserve block for inode block after dquot_initialize(), can
>>>>>>>> you confirm this?
>>>>>>>
>>>>>>> Let me test this.
>>>>>>>
>>>>>>> >From b90260bc577fe87570b1ef7b134554a8295b1f6c Mon Sep 17 00:00:00 2001
>>>>>>> From: Jaegeuk Kim <jaegeuk@kernel.org>
>>>>>>> Date: Mon, 17 Sep 2018 18:14:41 -0700
>>>>>>> Subject: [PATCH] f2fs: count inode block for recovered files
>>>>>>>
>>>>>>> If a new file is recovered, we missed to reserve its inode block.
>>>>>>
>>>>>> I remember, in order to keep line with other filesystem, unlike on-disk, we
>>>>>> have to keep backward compatibilty, in memory we don't account block number
>>>>>> for f2fs' inode block, but only account inode number for it, so here like
>>>>>> we did in inc_valid_node_count(), we don't need to do this.
>>>>>
>>>>> Okay, I just hit the error again w/o your patch. Another one coming to my mind
>>>>> is that caused by uid/gid change during recovery. Let me try out your patch.
>>>>
>>>> I guess we should update dquot and inode's uid/gid atomically under
>>>> lock_op() in f2fs_setattr() to prevent corruption on sys quota file.
>>>>
>>>> v9 can pass all xfstest cases and por_fsstress case w/ sys quota file
>>>> enabled, but w/ normal quota file, I got one regression reported by
>>>> generic/232, I fixed in v10, will do some tests and release it later.
>>>>
>>>> Note that, my fsck can fix corrupted quota file automatically once
>>>> CP_QUOTA_NEED_FSCK_FLAG is set.
>>>
>>> I hit failures again with your v9 w/ sysfile quota and modified fsck to detect
>>
>> That's strange, in my environment, before v9, I always encounter corrupted
>> quota sysfile after step 9), after v9, I never hit failure again.
>>
>> 1) enable fault injection
>> 2) run fsstress
>> 3) call shutdowon
>> 4) kill fsstress
>> 5) unmount
>> 6) fsck
>> 7) mount
>> 8) umount
>> 9) fsck
>> 10) go 1).
>>
>>> CP_QUOTA_NEED_FSCK_FLAG to fix the partition. Note that, if I set NEED_FSCK
>>> flag in roll-forward recovery, everything is fine.
>>
>> I do the test based on codes in my git tree, could you check the result
>> again based on my code? in where I just disable nat_bits recovery, not
>> sure, in step 6) fsck can break some thing in image.
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=f2fs-dev
>>
>> Also, I just send the fsck code, could you check that too?
>>
>> And I'd like to know your mount option and mkfs option, could you list for me?
> 
> I'm just doing this.
> https://github.com/jaegeuk/xfstests-f2fs/blob/f2fs/run.sh#L220

I just sent one patch to fix POR issue which missed to recover uid/gid of
inode.

[PATCH] f2fs: fix to recover inode's uid/gid during POR

After applying this patch, I can reproduce sys quota file corruption... let
me figure out the solution.

Thanks,

> 
>>
>> Thanks,
>>
>>>
>>>>
>>>> Thanks,
>>>>
>>>>>
>>>>>>
>>>>>> Can you test v9 first? I didn't encounter quota corruption with your
>>>>>> testcase right now. Will check it in cell phone environment.
>>>>>>
>>>>>>>
>>>>>>> Signed-off-by: Chao Yu <yuchao0@huawei.com>
>>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>>>>>> ---
>>>>>>>  fs/f2fs/recovery.c | 5 +++++
>>>>>>>  1 file changed, 5 insertions(+)
>>>>>>>
>>>>>>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
>>>>>>> index 56d34193a74b..bff5cf730e13 100644
>>>>>>> --- a/fs/f2fs/recovery.c
>>>>>>> +++ b/fs/f2fs/recovery.c
>>>>>>> @@ -84,6 +84,11 @@ static struct fsync_inode_entry *add_fsync_inode(struct f2fs_sb_info *sbi,
>>>>>>>  		err = dquot_alloc_inode(inode);
>>>>>>>  		if (err)
>>>>>>>  			goto err_out;
>>>>>>> +		err = dquot_reserve_block(inode, 1);
>>>>>>> +		if (err) {
>>>>>>> +			dquot_drop(inode);
>>>>>>> +			goto err_out;
>>>>>>> +		}
>>>>>>>  	}
>>>>>>>  
>>>>>>>  	entry = f2fs_kmem_cache_alloc(fsync_entry_slab, GFP_F2FS_ZERO);
>>>>>>>
>>>>>
>>>>> .
>>>>>
>>>
>>> .
>>>
> 
> .
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-20  9:46                                 ` Chao Yu
@ 2018-09-20 21:42                                   ` Jaegeuk Kim
  2018-09-21  7:48                                     ` Chao Yu
  0 siblings, 1 reply; 38+ messages in thread
From: Jaegeuk Kim @ 2018-09-20 21:42 UTC (permalink / raw)
  To: Chao Yu; +Cc: Chao Yu, linux-kernel, linux-f2fs-devel

On 09/20, Chao Yu wrote:
> On 2018/9/20 6:38, Jaegeuk Kim wrote:
> > On 09/19, Chao Yu wrote:
> >> On 2018/9/19 0:45, Jaegeuk Kim wrote:
> >>> On 09/18, Chao Yu wrote:
> >>>> On 2018/9/18 10:05, Jaegeuk Kim wrote:
> >>>>> On 09/18, Chao Yu wrote:
> >>>>>> On 2018/9/18 9:19, Jaegeuk Kim wrote:
> >>>>>>> On 09/13, Chao Yu wrote:
> >>>>>>>> On 2018/9/13 3:54, Jaegeuk Kim wrote:
> >>>>>>>>> On 09/12, Chao Yu wrote:
> >>>>>>>>>> On 2018/9/12 9:40, Chao Yu wrote:
> >>>>>>>>>>> On 2018/9/12 9:25, Jaegeuk Kim wrote:
> >>>>>>>>>>>> On 09/12, Chao Yu wrote:
> >>>>>>>>>>>>> On 2018/9/12 8:27, Jaegeuk Kim wrote:
> >>>>>>>>>>>>>> On 09/11, Jaegeuk Kim wrote:
> >>>>>>>>>>>>>>> On 09/12, Chao Yu wrote:
> >>>>>>>>>>>>>>>> On 2018/9/12 4:15, Jaegeuk Kim wrote:
> >>>>>>>>>>>>>>>>> fsck.f2fs is able to recover the quota structure, since roll-forward recovery
> >>>>>>>>>>>>>>>>> can recover it based on previous user information.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I didn't get it, both fsck and kernel recover quota file based all inodes'
> >>>>>>>>>>>>>>>> uid/gid/prjid, if {x}id didn't change, wouldn't those two recovery result be the
> >>>>>>>>>>>>>>>> same?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I thought that, but had to add this, since I was encountering quota errors right
> >>>>>>>>>>>>>>> after getting some files recovered. And, I thought it'd make it more safe to do
> >>>>>>>>>>>>>>> fsck after roll-forward recovery.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Anyway, let me test again without this patch for a while.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hmm, I just got a fsck failure right after some files recovered.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> To make sure, do you test with "f2fs: guarantee journalled quota data by
> >>>>>>>>>>>>> checkpoint"? if not, I think there is no guarantee that f2fs can recover
> >>>>>>>>>>>>> quote info into correct quote file, because, in last checkpoint, quota file
> >>>>>>>>>>>>> may was corrupted/inconsistent. Right?
> >>>>>>>>>>>
> >>>>>>>>>>> Oh, I forget to mention that, I add a patch to fsck to let it noticing
> >>>>>>>>>>> CP_QUOTA_NEED_FSCK_FLAG flag, and by default, fsck will fix corrupted quote
> >>>>>>>>>>> file if the flag is set, but w/o this flag, quota file is still corrupted
> >>>>>>>>>>> detected by fsck, I guess there is bug in v8.
> >>>>>>>>>>
> >>>>>>>>>> In v8, there are two cases we didn't guarantee quota file's consistence:
> >>>>>>>>>> 1. flush time in block_operation exceed a threshold.
> >>>>>>>>>> 2. dquot subsystem error occurs.
> >>>>>>>>>>
> >>>>>>>>>> For above case, fsck should repair the quota file by default.
> >>>>>>>>>
> >>>>>>>>> Okay, I got another failure and it seems CP_QUOTA_NEED_FSCK_FLAG was not set
> >>>>>>>>> during the recovery. So, we have something missing in the recovery in terms
> >>>>>>>>> of quota updates.
> >>>>>>>>
> >>>>>>>> Yeah, I checked the code, just found one suspected place:
> >>>>>>>>
> >>>>>>>> find_fsync_dnodes()
> >>>>>>>>  - f2fs_recover_inode_page
> >>>>>>>>   - inc_valid_node_count
> >>>>>>>>    - dquot_reserve_block  dquot info is not initialized now
> >>>>>>>>  - add_fsync_inode
> >>>>>>>>   - dquot_initialize
> >>>>>>>>
> >>>>>>>> I think we should reserve block for inode block after dquot_initialize(), can
> >>>>>>>> you confirm this?
> >>>>>>>
> >>>>>>> Let me test this.
> >>>>>>>
> >>>>>>> >From b90260bc577fe87570b1ef7b134554a8295b1f6c Mon Sep 17 00:00:00 2001
> >>>>>>> From: Jaegeuk Kim <jaegeuk@kernel.org>
> >>>>>>> Date: Mon, 17 Sep 2018 18:14:41 -0700
> >>>>>>> Subject: [PATCH] f2fs: count inode block for recovered files
> >>>>>>>
> >>>>>>> If a new file is recovered, we missed to reserve its inode block.
> >>>>>>
> >>>>>> I remember, in order to keep line with other filesystem, unlike on-disk, we
> >>>>>> have to keep backward compatibilty, in memory we don't account block number
> >>>>>> for f2fs' inode block, but only account inode number for it, so here like
> >>>>>> we did in inc_valid_node_count(), we don't need to do this.
> >>>>>
> >>>>> Okay, I just hit the error again w/o your patch. Another one coming to my mind
> >>>>> is that caused by uid/gid change during recovery. Let me try out your patch.
> >>>>
> >>>> I guess we should update dquot and inode's uid/gid atomically under
> >>>> lock_op() in f2fs_setattr() to prevent corruption on sys quota file.
> >>>>
> >>>> v9 can pass all xfstest cases and por_fsstress case w/ sys quota file
> >>>> enabled, but w/ normal quota file, I got one regression reported by
> >>>> generic/232, I fixed in v10, will do some tests and release it later.
> >>>>
> >>>> Note that, my fsck can fix corrupted quota file automatically once
> >>>> CP_QUOTA_NEED_FSCK_FLAG is set.
> >>>
> >>> I hit failures again with your v9 w/ sysfile quota and modified fsck to detect
> >>
> >> That's strange, in my environment, before v9, I always encounter corrupted
> >> quota sysfile after step 9), after v9, I never hit failure again.
> >>
> >> 1) enable fault injection
> >> 2) run fsstress
> >> 3) call shutdowon
> >> 4) kill fsstress
> >> 5) unmount
> >> 6) fsck
> >> 7) mount
> >> 8) umount
> >> 9) fsck
> >> 10) go 1).
> >>
> >>> CP_QUOTA_NEED_FSCK_FLAG to fix the partition. Note that, if I set NEED_FSCK
> >>> flag in roll-forward recovery, everything is fine.
> >>
> >> I do the test based on codes in my git tree, could you check the result
> >> again based on my code? in where I just disable nat_bits recovery, not
> >> sure, in step 6) fsck can break some thing in image.
> >>
> >> https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=f2fs-dev
> >>
> >> Also, I just send the fsck code, could you check that too?
> >>
> >> And I'd like to know your mount option and mkfs option, could you list for me?
> > 
> > I'm just doing this.
> > https://github.com/jaegeuk/xfstests-f2fs/blob/f2fs/run.sh#L220
> 
> I just sent one patch to fix POR issue which missed to recover uid/gid of
> inode.
> 
> [PATCH] f2fs: fix to recover inode's uid/gid during POR
> 
> After applying this patch, I can reproduce sys quota file corruption... let
> me figure out the solution.

Okay.

> 
> Thanks,
> 
> > 
> >>
> >> Thanks,
> >>
> >>>
> >>>>
> >>>> Thanks,
> >>>>
> >>>>>
> >>>>>>
> >>>>>> Can you test v9 first? I didn't encounter quota corruption with your
> >>>>>> testcase right now. Will check it in cell phone environment.
> >>>>>>
> >>>>>>>
> >>>>>>> Signed-off-by: Chao Yu <yuchao0@huawei.com>
> >>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> >>>>>>> ---
> >>>>>>>  fs/f2fs/recovery.c | 5 +++++
> >>>>>>>  1 file changed, 5 insertions(+)
> >>>>>>>
> >>>>>>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> >>>>>>> index 56d34193a74b..bff5cf730e13 100644
> >>>>>>> --- a/fs/f2fs/recovery.c
> >>>>>>> +++ b/fs/f2fs/recovery.c
> >>>>>>> @@ -84,6 +84,11 @@ static struct fsync_inode_entry *add_fsync_inode(struct f2fs_sb_info *sbi,
> >>>>>>>  		err = dquot_alloc_inode(inode);
> >>>>>>>  		if (err)
> >>>>>>>  			goto err_out;
> >>>>>>> +		err = dquot_reserve_block(inode, 1);
> >>>>>>> +		if (err) {
> >>>>>>> +			dquot_drop(inode);
> >>>>>>> +			goto err_out;
> >>>>>>> +		}
> >>>>>>>  	}
> >>>>>>>  
> >>>>>>>  	entry = f2fs_kmem_cache_alloc(fsync_entry_slab, GFP_F2FS_ZERO);
> >>>>>>>
> >>>>>
> >>>>> .
> >>>>>
> >>>
> >>> .
> >>>
> > 
> > .
> > 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-20 21:42                                   ` Jaegeuk Kim
@ 2018-09-21  7:48                                     ` Chao Yu
  2018-09-26  0:29                                       ` Jaegeuk Kim
  0 siblings, 1 reply; 38+ messages in thread
From: Chao Yu @ 2018-09-21  7:48 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: Chao Yu, linux-kernel, linux-f2fs-devel

On 2018/9/21 5:42, Jaegeuk Kim wrote:
> On 09/20, Chao Yu wrote:
>> On 2018/9/20 6:38, Jaegeuk Kim wrote:
>>> On 09/19, Chao Yu wrote:
>>>> On 2018/9/19 0:45, Jaegeuk Kim wrote:
>>>>> On 09/18, Chao Yu wrote:
>>>>>> On 2018/9/18 10:05, Jaegeuk Kim wrote:
>>>>>>> On 09/18, Chao Yu wrote:
>>>>>>>> On 2018/9/18 9:19, Jaegeuk Kim wrote:
>>>>>>>>> On 09/13, Chao Yu wrote:
>>>>>>>>>> On 2018/9/13 3:54, Jaegeuk Kim wrote:
>>>>>>>>>>> On 09/12, Chao Yu wrote:
>>>>>>>>>>>> On 2018/9/12 9:40, Chao Yu wrote:
>>>>>>>>>>>>> On 2018/9/12 9:25, Jaegeuk Kim wrote:
>>>>>>>>>>>>>> On 09/12, Chao Yu wrote:
>>>>>>>>>>>>>>> On 2018/9/12 8:27, Jaegeuk Kim wrote:
>>>>>>>>>>>>>>>> On 09/11, Jaegeuk Kim wrote:
>>>>>>>>>>>>>>>>> On 09/12, Chao Yu wrote:
>>>>>>>>>>>>>>>>>> On 2018/9/12 4:15, Jaegeuk Kim wrote:
>>>>>>>>>>>>>>>>>>> fsck.f2fs is able to recover the quota structure, since roll-forward recovery
>>>>>>>>>>>>>>>>>>> can recover it based on previous user information.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I didn't get it, both fsck and kernel recover quota file based all inodes'
>>>>>>>>>>>>>>>>>> uid/gid/prjid, if {x}id didn't change, wouldn't those two recovery result be the
>>>>>>>>>>>>>>>>>> same?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I thought that, but had to add this, since I was encountering quota errors right
>>>>>>>>>>>>>>>>> after getting some files recovered. And, I thought it'd make it more safe to do
>>>>>>>>>>>>>>>>> fsck after roll-forward recovery.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Anyway, let me test again without this patch for a while.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hmm, I just got a fsck failure right after some files recovered.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> To make sure, do you test with "f2fs: guarantee journalled quota data by
>>>>>>>>>>>>>>> checkpoint"? if not, I think there is no guarantee that f2fs can recover
>>>>>>>>>>>>>>> quote info into correct quote file, because, in last checkpoint, quota file
>>>>>>>>>>>>>>> may was corrupted/inconsistent. Right?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Oh, I forget to mention that, I add a patch to fsck to let it noticing
>>>>>>>>>>>>> CP_QUOTA_NEED_FSCK_FLAG flag, and by default, fsck will fix corrupted quote
>>>>>>>>>>>>> file if the flag is set, but w/o this flag, quota file is still corrupted
>>>>>>>>>>>>> detected by fsck, I guess there is bug in v8.
>>>>>>>>>>>>
>>>>>>>>>>>> In v8, there are two cases we didn't guarantee quota file's consistence:
>>>>>>>>>>>> 1. flush time in block_operation exceed a threshold.
>>>>>>>>>>>> 2. dquot subsystem error occurs.
>>>>>>>>>>>>
>>>>>>>>>>>> For above case, fsck should repair the quota file by default.
>>>>>>>>>>>
>>>>>>>>>>> Okay, I got another failure and it seems CP_QUOTA_NEED_FSCK_FLAG was not set
>>>>>>>>>>> during the recovery. So, we have something missing in the recovery in terms
>>>>>>>>>>> of quota updates.
>>>>>>>>>>
>>>>>>>>>> Yeah, I checked the code, just found one suspected place:
>>>>>>>>>>
>>>>>>>>>> find_fsync_dnodes()
>>>>>>>>>>  - f2fs_recover_inode_page
>>>>>>>>>>   - inc_valid_node_count
>>>>>>>>>>    - dquot_reserve_block  dquot info is not initialized now
>>>>>>>>>>  - add_fsync_inode
>>>>>>>>>>   - dquot_initialize
>>>>>>>>>>
>>>>>>>>>> I think we should reserve block for inode block after dquot_initialize(), can
>>>>>>>>>> you confirm this?
>>>>>>>>>
>>>>>>>>> Let me test this.
>>>>>>>>>
>>>>>>>>> >From b90260bc577fe87570b1ef7b134554a8295b1f6c Mon Sep 17 00:00:00 2001
>>>>>>>>> From: Jaegeuk Kim <jaegeuk@kernel.org>
>>>>>>>>> Date: Mon, 17 Sep 2018 18:14:41 -0700
>>>>>>>>> Subject: [PATCH] f2fs: count inode block for recovered files
>>>>>>>>>
>>>>>>>>> If a new file is recovered, we missed to reserve its inode block.
>>>>>>>>
>>>>>>>> I remember, in order to keep line with other filesystem, unlike on-disk, we
>>>>>>>> have to keep backward compatibilty, in memory we don't account block number
>>>>>>>> for f2fs' inode block, but only account inode number for it, so here like
>>>>>>>> we did in inc_valid_node_count(), we don't need to do this.
>>>>>>>
>>>>>>> Okay, I just hit the error again w/o your patch. Another one coming to my mind
>>>>>>> is that caused by uid/gid change during recovery. Let me try out your patch.
>>>>>>
>>>>>> I guess we should update dquot and inode's uid/gid atomically under
>>>>>> lock_op() in f2fs_setattr() to prevent corruption on sys quota file.
>>>>>>
>>>>>> v9 can pass all xfstest cases and por_fsstress case w/ sys quota file
>>>>>> enabled, but w/ normal quota file, I got one regression reported by
>>>>>> generic/232, I fixed in v10, will do some tests and release it later.
>>>>>>
>>>>>> Note that, my fsck can fix corrupted quota file automatically once
>>>>>> CP_QUOTA_NEED_FSCK_FLAG is set.
>>>>>
>>>>> I hit failures again with your v9 w/ sysfile quota and modified fsck to detect
>>>>
>>>> That's strange, in my environment, before v9, I always encounter corrupted
>>>> quota sysfile after step 9), after v9, I never hit failure again.
>>>>
>>>> 1) enable fault injection
>>>> 2) run fsstress
>>>> 3) call shutdowon
>>>> 4) kill fsstress
>>>> 5) unmount
>>>> 6) fsck
>>>> 7) mount
>>>> 8) umount
>>>> 9) fsck
>>>> 10) go 1).
>>>>
>>>>> CP_QUOTA_NEED_FSCK_FLAG to fix the partition. Note that, if I set NEED_FSCK
>>>>> flag in roll-forward recovery, everything is fine.
>>>>
>>>> I do the test based on codes in my git tree, could you check the result
>>>> again based on my code? in where I just disable nat_bits recovery, not
>>>> sure, in step 6) fsck can break some thing in image.
>>>>
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=f2fs-dev
>>>>
>>>> Also, I just send the fsck code, could you check that too?
>>>>
>>>> And I'd like to know your mount option and mkfs option, could you list for me?
>>>
>>> I'm just doing this.
>>> https://github.com/jaegeuk/xfstests-f2fs/blob/f2fs/run.sh#L220
>>
>> I just sent one patch to fix POR issue which missed to recover uid/gid of
>> inode.
>>
>> [PATCH] f2fs: fix to recover inode's uid/gid during POR
>>
>> After applying this patch, I can reproduce sys quota file corruption... let
>> me figure out the solution.
> 
> Okay.

Could you try v11, no quota corruption in my test now.

Thanks,

> 
>>
>> Thanks,
>>
>>>
>>>>
>>>> Thanks,
>>>>
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Can you test v9 first? I didn't encounter quota corruption with your
>>>>>>>> testcase right now. Will check it in cell phone environment.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Signed-off-by: Chao Yu <yuchao0@huawei.com>
>>>>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>>>>>>>> ---
>>>>>>>>>  fs/f2fs/recovery.c | 5 +++++
>>>>>>>>>  1 file changed, 5 insertions(+)
>>>>>>>>>
>>>>>>>>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
>>>>>>>>> index 56d34193a74b..bff5cf730e13 100644
>>>>>>>>> --- a/fs/f2fs/recovery.c
>>>>>>>>> +++ b/fs/f2fs/recovery.c
>>>>>>>>> @@ -84,6 +84,11 @@ static struct fsync_inode_entry *add_fsync_inode(struct f2fs_sb_info *sbi,
>>>>>>>>>  		err = dquot_alloc_inode(inode);
>>>>>>>>>  		if (err)
>>>>>>>>>  			goto err_out;
>>>>>>>>> +		err = dquot_reserve_block(inode, 1);
>>>>>>>>> +		if (err) {
>>>>>>>>> +			dquot_drop(inode);
>>>>>>>>> +			goto err_out;
>>>>>>>>> +		}
>>>>>>>>>  	}
>>>>>>>>>  
>>>>>>>>>  	entry = f2fs_kmem_cache_alloc(fsync_entry_slab, GFP_F2FS_ZERO);
>>>>>>>>>
>>>>>>>
>>>>>>> .
>>>>>>>
>>>>>
>>>>> .
>>>>>
>>>
>>> .
>>>
> 
> .
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-21  7:48                                     ` Chao Yu
@ 2018-09-26  0:29                                       ` Jaegeuk Kim
  2018-09-26  1:21                                         ` Chao Yu
  0 siblings, 1 reply; 38+ messages in thread
From: Jaegeuk Kim @ 2018-09-26  0:29 UTC (permalink / raw)
  To: Chao Yu; +Cc: Chao Yu, linux-kernel, linux-f2fs-devel

On 09/21, Chao Yu wrote:
> On 2018/9/21 5:42, Jaegeuk Kim wrote:
> > On 09/20, Chao Yu wrote:
> >> On 2018/9/20 6:38, Jaegeuk Kim wrote:
> >>> On 09/19, Chao Yu wrote:
> >>>> On 2018/9/19 0:45, Jaegeuk Kim wrote:
> >>>>> On 09/18, Chao Yu wrote:
> >>>>>> On 2018/9/18 10:05, Jaegeuk Kim wrote:
> >>>>>>> On 09/18, Chao Yu wrote:
> >>>>>>>> On 2018/9/18 9:19, Jaegeuk Kim wrote:
> >>>>>>>>> On 09/13, Chao Yu wrote:
> >>>>>>>>>> On 2018/9/13 3:54, Jaegeuk Kim wrote:
> >>>>>>>>>>> On 09/12, Chao Yu wrote:
> >>>>>>>>>>>> On 2018/9/12 9:40, Chao Yu wrote:
> >>>>>>>>>>>>> On 2018/9/12 9:25, Jaegeuk Kim wrote:
> >>>>>>>>>>>>>> On 09/12, Chao Yu wrote:
> >>>>>>>>>>>>>>> On 2018/9/12 8:27, Jaegeuk Kim wrote:
> >>>>>>>>>>>>>>>> On 09/11, Jaegeuk Kim wrote:
> >>>>>>>>>>>>>>>>> On 09/12, Chao Yu wrote:
> >>>>>>>>>>>>>>>>>> On 2018/9/12 4:15, Jaegeuk Kim wrote:
> >>>>>>>>>>>>>>>>>>> fsck.f2fs is able to recover the quota structure, since roll-forward recovery
> >>>>>>>>>>>>>>>>>>> can recover it based on previous user information.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> I didn't get it, both fsck and kernel recover quota file based all inodes'
> >>>>>>>>>>>>>>>>>> uid/gid/prjid, if {x}id didn't change, wouldn't those two recovery result be the
> >>>>>>>>>>>>>>>>>> same?
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I thought that, but had to add this, since I was encountering quota errors right
> >>>>>>>>>>>>>>>>> after getting some files recovered. And, I thought it'd make it more safe to do
> >>>>>>>>>>>>>>>>> fsck after roll-forward recovery.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Anyway, let me test again without this patch for a while.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Hmm, I just got a fsck failure right after some files recovered.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> To make sure, do you test with "f2fs: guarantee journalled quota data by
> >>>>>>>>>>>>>>> checkpoint"? if not, I think there is no guarantee that f2fs can recover
> >>>>>>>>>>>>>>> quote info into correct quote file, because, in last checkpoint, quota file
> >>>>>>>>>>>>>>> may was corrupted/inconsistent. Right?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Oh, I forget to mention that, I add a patch to fsck to let it noticing
> >>>>>>>>>>>>> CP_QUOTA_NEED_FSCK_FLAG flag, and by default, fsck will fix corrupted quote
> >>>>>>>>>>>>> file if the flag is set, but w/o this flag, quota file is still corrupted
> >>>>>>>>>>>>> detected by fsck, I guess there is bug in v8.
> >>>>>>>>>>>>
> >>>>>>>>>>>> In v8, there are two cases we didn't guarantee quota file's consistence:
> >>>>>>>>>>>> 1. flush time in block_operation exceed a threshold.
> >>>>>>>>>>>> 2. dquot subsystem error occurs.
> >>>>>>>>>>>>
> >>>>>>>>>>>> For above case, fsck should repair the quota file by default.
> >>>>>>>>>>>
> >>>>>>>>>>> Okay, I got another failure and it seems CP_QUOTA_NEED_FSCK_FLAG was not set
> >>>>>>>>>>> during the recovery. So, we have something missing in the recovery in terms
> >>>>>>>>>>> of quota updates.
> >>>>>>>>>>
> >>>>>>>>>> Yeah, I checked the code, just found one suspected place:
> >>>>>>>>>>
> >>>>>>>>>> find_fsync_dnodes()
> >>>>>>>>>>  - f2fs_recover_inode_page
> >>>>>>>>>>   - inc_valid_node_count
> >>>>>>>>>>    - dquot_reserve_block  dquot info is not initialized now
> >>>>>>>>>>  - add_fsync_inode
> >>>>>>>>>>   - dquot_initialize
> >>>>>>>>>>
> >>>>>>>>>> I think we should reserve block for inode block after dquot_initialize(), can
> >>>>>>>>>> you confirm this?
> >>>>>>>>>
> >>>>>>>>> Let me test this.
> >>>>>>>>>
> >>>>>>>>> >From b90260bc577fe87570b1ef7b134554a8295b1f6c Mon Sep 17 00:00:00 2001
> >>>>>>>>> From: Jaegeuk Kim <jaegeuk@kernel.org>
> >>>>>>>>> Date: Mon, 17 Sep 2018 18:14:41 -0700
> >>>>>>>>> Subject: [PATCH] f2fs: count inode block for recovered files
> >>>>>>>>>
> >>>>>>>>> If a new file is recovered, we missed to reserve its inode block.
> >>>>>>>>
> >>>>>>>> I remember, in order to keep line with other filesystem, unlike on-disk, we
> >>>>>>>> have to keep backward compatibilty, in memory we don't account block number
> >>>>>>>> for f2fs' inode block, but only account inode number for it, so here like
> >>>>>>>> we did in inc_valid_node_count(), we don't need to do this.
> >>>>>>>
> >>>>>>> Okay, I just hit the error again w/o your patch. Another one coming to my mind
> >>>>>>> is that caused by uid/gid change during recovery. Let me try out your patch.
> >>>>>>
> >>>>>> I guess we should update dquot and inode's uid/gid atomically under
> >>>>>> lock_op() in f2fs_setattr() to prevent corruption on sys quota file.
> >>>>>>
> >>>>>> v9 can pass all xfstest cases and por_fsstress case w/ sys quota file
> >>>>>> enabled, but w/ normal quota file, I got one regression reported by
> >>>>>> generic/232, I fixed in v10, will do some tests and release it later.
> >>>>>>
> >>>>>> Note that, my fsck can fix corrupted quota file automatically once
> >>>>>> CP_QUOTA_NEED_FSCK_FLAG is set.
> >>>>>
> >>>>> I hit failures again with your v9 w/ sysfile quota and modified fsck to detect
> >>>>
> >>>> That's strange, in my environment, before v9, I always encounter corrupted
> >>>> quota sysfile after step 9), after v9, I never hit failure again.
> >>>>
> >>>> 1) enable fault injection
> >>>> 2) run fsstress
> >>>> 3) call shutdowon
> >>>> 4) kill fsstress
> >>>> 5) unmount
> >>>> 6) fsck
> >>>> 7) mount
> >>>> 8) umount
> >>>> 9) fsck
> >>>> 10) go 1).
> >>>>
> >>>>> CP_QUOTA_NEED_FSCK_FLAG to fix the partition. Note that, if I set NEED_FSCK
> >>>>> flag in roll-forward recovery, everything is fine.
> >>>>
> >>>> I do the test based on codes in my git tree, could you check the result
> >>>> again based on my code? in where I just disable nat_bits recovery, not
> >>>> sure, in step 6) fsck can break some thing in image.
> >>>>
> >>>> https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=f2fs-dev
> >>>>
> >>>> Also, I just send the fsck code, could you check that too?
> >>>>
> >>>> And I'd like to know your mount option and mkfs option, could you list for me?
> >>>
> >>> I'm just doing this.
> >>> https://github.com/jaegeuk/xfstests-f2fs/blob/f2fs/run.sh#L220
> >>
> >> I just sent one patch to fix POR issue which missed to recover uid/gid of
> >> inode.
> >>
> >> [PATCH] f2fs: fix to recover inode's uid/gid during POR
> >>
> >> After applying this patch, I can reproduce sys quota file corruption... let
> >> me figure out the solution.
> > 
> > Okay.
> 
> Could you try v11, no quota corruption in my test now.

Chao,

I missed your fsck patch to recover this. Could you post it as well?

Thanks,

> 
> Thanks,
> 
> > 
> >>
> >> Thanks,
> >>
> >>>
> >>>>
> >>>> Thanks,
> >>>>
> >>>>>
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>>>
> >>>>>>>>
> >>>>>>>> Can you test v9 first? I didn't encounter quota corruption with your
> >>>>>>>> testcase right now. Will check it in cell phone environment.
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Signed-off-by: Chao Yu <yuchao0@huawei.com>
> >>>>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> >>>>>>>>> ---
> >>>>>>>>>  fs/f2fs/recovery.c | 5 +++++
> >>>>>>>>>  1 file changed, 5 insertions(+)
> >>>>>>>>>
> >>>>>>>>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> >>>>>>>>> index 56d34193a74b..bff5cf730e13 100644
> >>>>>>>>> --- a/fs/f2fs/recovery.c
> >>>>>>>>> +++ b/fs/f2fs/recovery.c
> >>>>>>>>> @@ -84,6 +84,11 @@ static struct fsync_inode_entry *add_fsync_inode(struct f2fs_sb_info *sbi,
> >>>>>>>>>  		err = dquot_alloc_inode(inode);
> >>>>>>>>>  		if (err)
> >>>>>>>>>  			goto err_out;
> >>>>>>>>> +		err = dquot_reserve_block(inode, 1);
> >>>>>>>>> +		if (err) {
> >>>>>>>>> +			dquot_drop(inode);
> >>>>>>>>> +			goto err_out;
> >>>>>>>>> +		}
> >>>>>>>>>  	}
> >>>>>>>>>  
> >>>>>>>>>  	entry = f2fs_kmem_cache_alloc(fsync_entry_slab, GFP_F2FS_ZERO);
> >>>>>>>>>
> >>>>>>>
> >>>>>>> .
> >>>>>>>
> >>>>>
> >>>>> .
> >>>>>
> >>>
> >>> .
> >>>
> > 
> > .
> > 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-26  0:29                                       ` Jaegeuk Kim
@ 2018-09-26  1:21                                         ` Chao Yu
  2018-09-26  1:44                                           ` Jaegeuk Kim
  0 siblings, 1 reply; 38+ messages in thread
From: Chao Yu @ 2018-09-26  1:21 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: Chao Yu, linux-kernel, linux-f2fs-devel

On 2018/9/26 8:29, Jaegeuk Kim wrote:
> On 09/21, Chao Yu wrote:
>> On 2018/9/21 5:42, Jaegeuk Kim wrote:
>>> On 09/20, Chao Yu wrote:
>>>> On 2018/9/20 6:38, Jaegeuk Kim wrote:
>>>>> On 09/19, Chao Yu wrote:
>>>>>> On 2018/9/19 0:45, Jaegeuk Kim wrote:
>>>>>>> On 09/18, Chao Yu wrote:
>>>>>>>> On 2018/9/18 10:05, Jaegeuk Kim wrote:
>>>>>>>>> On 09/18, Chao Yu wrote:
>>>>>>>>>> On 2018/9/18 9:19, Jaegeuk Kim wrote:
>>>>>>>>>>> On 09/13, Chao Yu wrote:
>>>>>>>>>>>> On 2018/9/13 3:54, Jaegeuk Kim wrote:
>>>>>>>>>>>>> On 09/12, Chao Yu wrote:
>>>>>>>>>>>>>> On 2018/9/12 9:40, Chao Yu wrote:
>>>>>>>>>>>>>>> On 2018/9/12 9:25, Jaegeuk Kim wrote:
>>>>>>>>>>>>>>>> On 09/12, Chao Yu wrote:
>>>>>>>>>>>>>>>>> On 2018/9/12 8:27, Jaegeuk Kim wrote:
>>>>>>>>>>>>>>>>>> On 09/11, Jaegeuk Kim wrote:
>>>>>>>>>>>>>>>>>>> On 09/12, Chao Yu wrote:
>>>>>>>>>>>>>>>>>>>> On 2018/9/12 4:15, Jaegeuk Kim wrote:
>>>>>>>>>>>>>>>>>>>>> fsck.f2fs is able to recover the quota structure, since roll-forward recovery
>>>>>>>>>>>>>>>>>>>>> can recover it based on previous user information.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I didn't get it, both fsck and kernel recover quota file based all inodes'
>>>>>>>>>>>>>>>>>>>> uid/gid/prjid, if {x}id didn't change, wouldn't those two recovery result be the
>>>>>>>>>>>>>>>>>>>> same?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I thought that, but had to add this, since I was encountering quota errors right
>>>>>>>>>>>>>>>>>>> after getting some files recovered. And, I thought it'd make it more safe to do
>>>>>>>>>>>>>>>>>>> fsck after roll-forward recovery.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Anyway, let me test again without this patch for a while.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hmm, I just got a fsck failure right after some files recovered.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> To make sure, do you test with "f2fs: guarantee journalled quota data by
>>>>>>>>>>>>>>>>> checkpoint"? if not, I think there is no guarantee that f2fs can recover
>>>>>>>>>>>>>>>>> quote info into correct quote file, because, in last checkpoint, quota file
>>>>>>>>>>>>>>>>> may was corrupted/inconsistent. Right?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Oh, I forget to mention that, I add a patch to fsck to let it noticing
>>>>>>>>>>>>>>> CP_QUOTA_NEED_FSCK_FLAG flag, and by default, fsck will fix corrupted quote
>>>>>>>>>>>>>>> file if the flag is set, but w/o this flag, quota file is still corrupted
>>>>>>>>>>>>>>> detected by fsck, I guess there is bug in v8.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In v8, there are two cases we didn't guarantee quota file's consistence:
>>>>>>>>>>>>>> 1. flush time in block_operation exceed a threshold.
>>>>>>>>>>>>>> 2. dquot subsystem error occurs.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> For above case, fsck should repair the quota file by default.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Okay, I got another failure and it seems CP_QUOTA_NEED_FSCK_FLAG was not set
>>>>>>>>>>>>> during the recovery. So, we have something missing in the recovery in terms
>>>>>>>>>>>>> of quota updates.
>>>>>>>>>>>>
>>>>>>>>>>>> Yeah, I checked the code, just found one suspected place:
>>>>>>>>>>>>
>>>>>>>>>>>> find_fsync_dnodes()
>>>>>>>>>>>>  - f2fs_recover_inode_page
>>>>>>>>>>>>   - inc_valid_node_count
>>>>>>>>>>>>    - dquot_reserve_block  dquot info is not initialized now
>>>>>>>>>>>>  - add_fsync_inode
>>>>>>>>>>>>   - dquot_initialize
>>>>>>>>>>>>
>>>>>>>>>>>> I think we should reserve block for inode block after dquot_initialize(), can
>>>>>>>>>>>> you confirm this?
>>>>>>>>>>>
>>>>>>>>>>> Let me test this.
>>>>>>>>>>>
>>>>>>>>>>> >From b90260bc577fe87570b1ef7b134554a8295b1f6c Mon Sep 17 00:00:00 2001
>>>>>>>>>>> From: Jaegeuk Kim <jaegeuk@kernel.org>
>>>>>>>>>>> Date: Mon, 17 Sep 2018 18:14:41 -0700
>>>>>>>>>>> Subject: [PATCH] f2fs: count inode block for recovered files
>>>>>>>>>>>
>>>>>>>>>>> If a new file is recovered, we missed to reserve its inode block.
>>>>>>>>>>
>>>>>>>>>> I remember, in order to keep line with other filesystem, unlike on-disk, we
>>>>>>>>>> have to keep backward compatibilty, in memory we don't account block number
>>>>>>>>>> for f2fs' inode block, but only account inode number for it, so here like
>>>>>>>>>> we did in inc_valid_node_count(), we don't need to do this.
>>>>>>>>>
>>>>>>>>> Okay, I just hit the error again w/o your patch. Another one coming to my mind
>>>>>>>>> is that caused by uid/gid change during recovery. Let me try out your patch.
>>>>>>>>
>>>>>>>> I guess we should update dquot and inode's uid/gid atomically under
>>>>>>>> lock_op() in f2fs_setattr() to prevent corruption on sys quota file.
>>>>>>>>
>>>>>>>> v9 can pass all xfstest cases and por_fsstress case w/ sys quota file
>>>>>>>> enabled, but w/ normal quota file, I got one regression reported by
>>>>>>>> generic/232, I fixed in v10, will do some tests and release it later.
>>>>>>>>
>>>>>>>> Note that, my fsck can fix corrupted quota file automatically once
>>>>>>>> CP_QUOTA_NEED_FSCK_FLAG is set.
>>>>>>>
>>>>>>> I hit failures again with your v9 w/ sysfile quota and modified fsck to detect
>>>>>>
>>>>>> That's strange, in my environment, before v9, I always encounter corrupted
>>>>>> quota sysfile after step 9), after v9, I never hit failure again.
>>>>>>
>>>>>> 1) enable fault injection
>>>>>> 2) run fsstress
>>>>>> 3) call shutdowon
>>>>>> 4) kill fsstress
>>>>>> 5) unmount
>>>>>> 6) fsck
>>>>>> 7) mount
>>>>>> 8) umount
>>>>>> 9) fsck
>>>>>> 10) go 1).
>>>>>>
>>>>>>> CP_QUOTA_NEED_FSCK_FLAG to fix the partition. Note that, if I set NEED_FSCK
>>>>>>> flag in roll-forward recovery, everything is fine.
>>>>>>
>>>>>> I do the test based on codes in my git tree, could you check the result
>>>>>> again based on my code? in where I just disable nat_bits recovery, not
>>>>>> sure, in step 6) fsck can break some thing in image.
>>>>>>
>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=f2fs-dev
>>>>>>
>>>>>> Also, I just send the fsck code, could you check that too?
>>>>>>
>>>>>> And I'd like to know your mount option and mkfs option, could you list for me?
>>>>>
>>>>> I'm just doing this.
>>>>> https://github.com/jaegeuk/xfstests-f2fs/blob/f2fs/run.sh#L220
>>>>
>>>> I just sent one patch to fix POR issue which missed to recover uid/gid of
>>>> inode.
>>>>
>>>> [PATCH] f2fs: fix to recover inode's uid/gid during POR
>>>>
>>>> After applying this patch, I can reproduce sys quota file corruption... let
>>>> me figure out the solution.
>>>
>>> Okay.
>>
>> Could you try v11, no quota corruption in my test now.
> 
> Chao,
> 
> I missed your fsck patch to recover this. Could you post it as well?

Could you check below one?

https://lore.kernel.org/patchwork/patch/988210/

Thanks,

> 
> Thanks,
> 
>>
>> Thanks,
>>
>>>
>>>>
>>>> Thanks,
>>>>
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Can you test v9 first? I didn't encounter quota corruption with your
>>>>>>>>>> testcase right now. Will check it in cell phone environment.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Chao Yu <yuchao0@huawei.com>
>>>>>>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>>>>>>>>>> ---
>>>>>>>>>>>  fs/f2fs/recovery.c | 5 +++++
>>>>>>>>>>>  1 file changed, 5 insertions(+)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
>>>>>>>>>>> index 56d34193a74b..bff5cf730e13 100644
>>>>>>>>>>> --- a/fs/f2fs/recovery.c
>>>>>>>>>>> +++ b/fs/f2fs/recovery.c
>>>>>>>>>>> @@ -84,6 +84,11 @@ static struct fsync_inode_entry *add_fsync_inode(struct f2fs_sb_info *sbi,
>>>>>>>>>>>  		err = dquot_alloc_inode(inode);
>>>>>>>>>>>  		if (err)
>>>>>>>>>>>  			goto err_out;
>>>>>>>>>>> +		err = dquot_reserve_block(inode, 1);
>>>>>>>>>>> +		if (err) {
>>>>>>>>>>> +			dquot_drop(inode);
>>>>>>>>>>> +			goto err_out;
>>>>>>>>>>> +		}
>>>>>>>>>>>  	}
>>>>>>>>>>>  
>>>>>>>>>>>  	entry = f2fs_kmem_cache_alloc(fsync_entry_slab, GFP_F2FS_ZERO);
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> .
>>>>>>>>>
>>>>>>>
>>>>>>> .
>>>>>>>
>>>>>
>>>>> .
>>>>>
>>>
>>> .
>>>
> 
> .
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-26  1:21                                         ` Chao Yu
@ 2018-09-26  1:44                                           ` Jaegeuk Kim
  2018-09-26  2:06                                             ` Chao Yu
  0 siblings, 1 reply; 38+ messages in thread
From: Jaegeuk Kim @ 2018-09-26  1:44 UTC (permalink / raw)
  To: Chao Yu; +Cc: Chao Yu, linux-kernel, linux-f2fs-devel

On 09/26, Chao Yu wrote:
> On 2018/9/26 8:29, Jaegeuk Kim wrote:
> > On 09/21, Chao Yu wrote:
> >> On 2018/9/21 5:42, Jaegeuk Kim wrote:
> >>> On 09/20, Chao Yu wrote:
> >>>> On 2018/9/20 6:38, Jaegeuk Kim wrote:
> >>>>> On 09/19, Chao Yu wrote:
> >>>>>> On 2018/9/19 0:45, Jaegeuk Kim wrote:
> >>>>>>> On 09/18, Chao Yu wrote:
> >>>>>>>> On 2018/9/18 10:05, Jaegeuk Kim wrote:
> >>>>>>>>> On 09/18, Chao Yu wrote:
> >>>>>>>>>> On 2018/9/18 9:19, Jaegeuk Kim wrote:
> >>>>>>>>>>> On 09/13, Chao Yu wrote:
> >>>>>>>>>>>> On 2018/9/13 3:54, Jaegeuk Kim wrote:
> >>>>>>>>>>>>> On 09/12, Chao Yu wrote:
> >>>>>>>>>>>>>> On 2018/9/12 9:40, Chao Yu wrote:
> >>>>>>>>>>>>>>> On 2018/9/12 9:25, Jaegeuk Kim wrote:
> >>>>>>>>>>>>>>>> On 09/12, Chao Yu wrote:
> >>>>>>>>>>>>>>>>> On 2018/9/12 8:27, Jaegeuk Kim wrote:
> >>>>>>>>>>>>>>>>>> On 09/11, Jaegeuk Kim wrote:
> >>>>>>>>>>>>>>>>>>> On 09/12, Chao Yu wrote:
> >>>>>>>>>>>>>>>>>>>> On 2018/9/12 4:15, Jaegeuk Kim wrote:
> >>>>>>>>>>>>>>>>>>>>> fsck.f2fs is able to recover the quota structure, since roll-forward recovery
> >>>>>>>>>>>>>>>>>>>>> can recover it based on previous user information.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> I didn't get it, both fsck and kernel recover quota file based all inodes'
> >>>>>>>>>>>>>>>>>>>> uid/gid/prjid, if {x}id didn't change, wouldn't those two recovery result be the
> >>>>>>>>>>>>>>>>>>>> same?
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> I thought that, but had to add this, since I was encountering quota errors right
> >>>>>>>>>>>>>>>>>>> after getting some files recovered. And, I thought it'd make it more safe to do
> >>>>>>>>>>>>>>>>>>> fsck after roll-forward recovery.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Anyway, let me test again without this patch for a while.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Hmm, I just got a fsck failure right after some files recovered.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> To make sure, do you test with "f2fs: guarantee journalled quota data by
> >>>>>>>>>>>>>>>>> checkpoint"? if not, I think there is no guarantee that f2fs can recover
> >>>>>>>>>>>>>>>>> quote info into correct quote file, because, in last checkpoint, quota file
> >>>>>>>>>>>>>>>>> may was corrupted/inconsistent. Right?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Oh, I forget to mention that, I add a patch to fsck to let it noticing
> >>>>>>>>>>>>>>> CP_QUOTA_NEED_FSCK_FLAG flag, and by default, fsck will fix corrupted quote
> >>>>>>>>>>>>>>> file if the flag is set, but w/o this flag, quota file is still corrupted
> >>>>>>>>>>>>>>> detected by fsck, I guess there is bug in v8.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> In v8, there are two cases we didn't guarantee quota file's consistence:
> >>>>>>>>>>>>>> 1. flush time in block_operation exceed a threshold.
> >>>>>>>>>>>>>> 2. dquot subsystem error occurs.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> For above case, fsck should repair the quota file by default.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Okay, I got another failure and it seems CP_QUOTA_NEED_FSCK_FLAG was not set
> >>>>>>>>>>>>> during the recovery. So, we have something missing in the recovery in terms
> >>>>>>>>>>>>> of quota updates.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Yeah, I checked the code, just found one suspected place:
> >>>>>>>>>>>>
> >>>>>>>>>>>> find_fsync_dnodes()
> >>>>>>>>>>>>  - f2fs_recover_inode_page
> >>>>>>>>>>>>   - inc_valid_node_count
> >>>>>>>>>>>>    - dquot_reserve_block  dquot info is not initialized now
> >>>>>>>>>>>>  - add_fsync_inode
> >>>>>>>>>>>>   - dquot_initialize
> >>>>>>>>>>>>
> >>>>>>>>>>>> I think we should reserve block for inode block after dquot_initialize(), can
> >>>>>>>>>>>> you confirm this?
> >>>>>>>>>>>
> >>>>>>>>>>> Let me test this.
> >>>>>>>>>>>
> >>>>>>>>>>> >From b90260bc577fe87570b1ef7b134554a8295b1f6c Mon Sep 17 00:00:00 2001
> >>>>>>>>>>> From: Jaegeuk Kim <jaegeuk@kernel.org>
> >>>>>>>>>>> Date: Mon, 17 Sep 2018 18:14:41 -0700
> >>>>>>>>>>> Subject: [PATCH] f2fs: count inode block for recovered files
> >>>>>>>>>>>
> >>>>>>>>>>> If a new file is recovered, we missed to reserve its inode block.
> >>>>>>>>>>
> >>>>>>>>>> I remember, in order to keep line with other filesystem, unlike on-disk, we
> >>>>>>>>>> have to keep backward compatibilty, in memory we don't account block number
> >>>>>>>>>> for f2fs' inode block, but only account inode number for it, so here like
> >>>>>>>>>> we did in inc_valid_node_count(), we don't need to do this.
> >>>>>>>>>
> >>>>>>>>> Okay, I just hit the error again w/o your patch. Another one coming to my mind
> >>>>>>>>> is that caused by uid/gid change during recovery. Let me try out your patch.
> >>>>>>>>
> >>>>>>>> I guess we should update dquot and inode's uid/gid atomically under
> >>>>>>>> lock_op() in f2fs_setattr() to prevent corruption on sys quota file.
> >>>>>>>>
> >>>>>>>> v9 can pass all xfstest cases and por_fsstress case w/ sys quota file
> >>>>>>>> enabled, but w/ normal quota file, I got one regression reported by
> >>>>>>>> generic/232, I fixed in v10, will do some tests and release it later.
> >>>>>>>>
> >>>>>>>> Note that, my fsck can fix corrupted quota file automatically once
> >>>>>>>> CP_QUOTA_NEED_FSCK_FLAG is set.
> >>>>>>>
> >>>>>>> I hit failures again with your v9 w/ sysfile quota and modified fsck to detect
> >>>>>>
> >>>>>> That's strange, in my environment, before v9, I always encounter corrupted
> >>>>>> quota sysfile after step 9), after v9, I never hit failure again.
> >>>>>>
> >>>>>> 1) enable fault injection
> >>>>>> 2) run fsstress
> >>>>>> 3) call shutdowon
> >>>>>> 4) kill fsstress
> >>>>>> 5) unmount
> >>>>>> 6) fsck
> >>>>>> 7) mount
> >>>>>> 8) umount
> >>>>>> 9) fsck
> >>>>>> 10) go 1).
> >>>>>>
> >>>>>>> CP_QUOTA_NEED_FSCK_FLAG to fix the partition. Note that, if I set NEED_FSCK
> >>>>>>> flag in roll-forward recovery, everything is fine.
> >>>>>>
> >>>>>> I do the test based on codes in my git tree, could you check the result
> >>>>>> again based on my code? in where I just disable nat_bits recovery, not
> >>>>>> sure, in step 6) fsck can break some thing in image.
> >>>>>>
> >>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=f2fs-dev
> >>>>>>
> >>>>>> Also, I just send the fsck code, could you check that too?
> >>>>>>
> >>>>>> And I'd like to know your mount option and mkfs option, could you list for me?
> >>>>>
> >>>>> I'm just doing this.
> >>>>> https://github.com/jaegeuk/xfstests-f2fs/blob/f2fs/run.sh#L220
> >>>>
> >>>> I just sent one patch to fix POR issue which missed to recover uid/gid of
> >>>> inode.
> >>>>
> >>>> [PATCH] f2fs: fix to recover inode's uid/gid during POR
> >>>>
> >>>> After applying this patch, I can reproduce sys quota file corruption... let
> >>>> me figure out the solution.
> >>>
> >>> Okay.
> >>
> >> Could you try v11, no quota corruption in my test now.
> > 
> > Chao,
> > 
> > I missed your fsck patch to recover this. Could you post it as well?
> 
> Could you check below one?
> 
> https://lore.kernel.org/patchwork/patch/988210/

It'd be worth to show the flag in print_cp_state.

> 
> Thanks,
> 
> > 
> > Thanks,
> > 
> >>
> >> Thanks,
> >>
> >>>
> >>>>
> >>>> Thanks,
> >>>>
> >>>>>
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>>>
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Can you test v9 first? I didn't encounter quota corruption with your
> >>>>>>>>>> testcase right now. Will check it in cell phone environment.
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Signed-off-by: Chao Yu <yuchao0@huawei.com>
> >>>>>>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> >>>>>>>>>>> ---
> >>>>>>>>>>>  fs/f2fs/recovery.c | 5 +++++
> >>>>>>>>>>>  1 file changed, 5 insertions(+)
> >>>>>>>>>>>
> >>>>>>>>>>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> >>>>>>>>>>> index 56d34193a74b..bff5cf730e13 100644
> >>>>>>>>>>> --- a/fs/f2fs/recovery.c
> >>>>>>>>>>> +++ b/fs/f2fs/recovery.c
> >>>>>>>>>>> @@ -84,6 +84,11 @@ static struct fsync_inode_entry *add_fsync_inode(struct f2fs_sb_info *sbi,
> >>>>>>>>>>>  		err = dquot_alloc_inode(inode);
> >>>>>>>>>>>  		if (err)
> >>>>>>>>>>>  			goto err_out;
> >>>>>>>>>>> +		err = dquot_reserve_block(inode, 1);
> >>>>>>>>>>> +		if (err) {
> >>>>>>>>>>> +			dquot_drop(inode);
> >>>>>>>>>>> +			goto err_out;
> >>>>>>>>>>> +		}
> >>>>>>>>>>>  	}
> >>>>>>>>>>>  
> >>>>>>>>>>>  	entry = f2fs_kmem_cache_alloc(fsync_entry_slab, GFP_F2FS_ZERO);
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> .
> >>>>>>>>>
> >>>>>>>
> >>>>>>> .
> >>>>>>>
> >>>>>
> >>>>> .
> >>>>>
> >>>
> >>> .
> >>>
> > 
> > .
> > 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-26  1:44                                           ` Jaegeuk Kim
@ 2018-09-26  2:06                                             ` Chao Yu
  2018-09-26  2:09                                               ` Jaegeuk Kim
  0 siblings, 1 reply; 38+ messages in thread
From: Chao Yu @ 2018-09-26  2:06 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: Chao Yu, linux-kernel, linux-f2fs-devel

On 2018/9/26 9:44, Jaegeuk Kim wrote:
> On 09/26, Chao Yu wrote:
>> On 2018/9/26 8:29, Jaegeuk Kim wrote:
>>> On 09/21, Chao Yu wrote:
>>>> On 2018/9/21 5:42, Jaegeuk Kim wrote:
>>>>> On 09/20, Chao Yu wrote:
>>>>>> On 2018/9/20 6:38, Jaegeuk Kim wrote:
>>>>>>> On 09/19, Chao Yu wrote:
>>>>>>>> On 2018/9/19 0:45, Jaegeuk Kim wrote:
>>>>>>>>> On 09/18, Chao Yu wrote:
>>>>>>>>>> On 2018/9/18 10:05, Jaegeuk Kim wrote:
>>>>>>>>>>> On 09/18, Chao Yu wrote:
>>>>>>>>>>>> On 2018/9/18 9:19, Jaegeuk Kim wrote:
>>>>>>>>>>>>> On 09/13, Chao Yu wrote:
>>>>>>>>>>>>>> On 2018/9/13 3:54, Jaegeuk Kim wrote:
>>>>>>>>>>>>>>> On 09/12, Chao Yu wrote:
>>>>>>>>>>>>>>>> On 2018/9/12 9:40, Chao Yu wrote:
>>>>>>>>>>>>>>>>> On 2018/9/12 9:25, Jaegeuk Kim wrote:
>>>>>>>>>>>>>>>>>> On 09/12, Chao Yu wrote:
>>>>>>>>>>>>>>>>>>> On 2018/9/12 8:27, Jaegeuk Kim wrote:
>>>>>>>>>>>>>>>>>>>> On 09/11, Jaegeuk Kim wrote:
>>>>>>>>>>>>>>>>>>>>> On 09/12, Chao Yu wrote:
>>>>>>>>>>>>>>>>>>>>>> On 2018/9/12 4:15, Jaegeuk Kim wrote:
>>>>>>>>>>>>>>>>>>>>>>> fsck.f2fs is able to recover the quota structure, since roll-forward recovery
>>>>>>>>>>>>>>>>>>>>>>> can recover it based on previous user information.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I didn't get it, both fsck and kernel recover quota file based all inodes'
>>>>>>>>>>>>>>>>>>>>>> uid/gid/prjid, if {x}id didn't change, wouldn't those two recovery result be the
>>>>>>>>>>>>>>>>>>>>>> same?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I thought that, but had to add this, since I was encountering quota errors right
>>>>>>>>>>>>>>>>>>>>> after getting some files recovered. And, I thought it'd make it more safe to do
>>>>>>>>>>>>>>>>>>>>> fsck after roll-forward recovery.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Anyway, let me test again without this patch for a while.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hmm, I just got a fsck failure right after some files recovered.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> To make sure, do you test with "f2fs: guarantee journalled quota data by
>>>>>>>>>>>>>>>>>>> checkpoint"? if not, I think there is no guarantee that f2fs can recover
>>>>>>>>>>>>>>>>>>> quote info into correct quote file, because, in last checkpoint, quota file
>>>>>>>>>>>>>>>>>>> may was corrupted/inconsistent. Right?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Oh, I forget to mention that, I add a patch to fsck to let it noticing
>>>>>>>>>>>>>>>>> CP_QUOTA_NEED_FSCK_FLAG flag, and by default, fsck will fix corrupted quote
>>>>>>>>>>>>>>>>> file if the flag is set, but w/o this flag, quota file is still corrupted
>>>>>>>>>>>>>>>>> detected by fsck, I guess there is bug in v8.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> In v8, there are two cases we didn't guarantee quota file's consistence:
>>>>>>>>>>>>>>>> 1. flush time in block_operation exceed a threshold.
>>>>>>>>>>>>>>>> 2. dquot subsystem error occurs.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> For above case, fsck should repair the quota file by default.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Okay, I got another failure and it seems CP_QUOTA_NEED_FSCK_FLAG was not set
>>>>>>>>>>>>>>> during the recovery. So, we have something missing in the recovery in terms
>>>>>>>>>>>>>>> of quota updates.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yeah, I checked the code, just found one suspected place:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> find_fsync_dnodes()
>>>>>>>>>>>>>>  - f2fs_recover_inode_page
>>>>>>>>>>>>>>   - inc_valid_node_count
>>>>>>>>>>>>>>    - dquot_reserve_block  dquot info is not initialized now
>>>>>>>>>>>>>>  - add_fsync_inode
>>>>>>>>>>>>>>   - dquot_initialize
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think we should reserve block for inode block after dquot_initialize(), can
>>>>>>>>>>>>>> you confirm this?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Let me test this.
>>>>>>>>>>>>>
>>>>>>>>>>>>> >From b90260bc577fe87570b1ef7b134554a8295b1f6c Mon Sep 17 00:00:00 2001
>>>>>>>>>>>>> From: Jaegeuk Kim <jaegeuk@kernel.org>
>>>>>>>>>>>>> Date: Mon, 17 Sep 2018 18:14:41 -0700
>>>>>>>>>>>>> Subject: [PATCH] f2fs: count inode block for recovered files
>>>>>>>>>>>>>
>>>>>>>>>>>>> If a new file is recovered, we missed to reserve its inode block.
>>>>>>>>>>>>
>>>>>>>>>>>> I remember, in order to keep line with other filesystem, unlike on-disk, we
>>>>>>>>>>>> have to keep backward compatibilty, in memory we don't account block number
>>>>>>>>>>>> for f2fs' inode block, but only account inode number for it, so here like
>>>>>>>>>>>> we did in inc_valid_node_count(), we don't need to do this.
>>>>>>>>>>>
>>>>>>>>>>> Okay, I just hit the error again w/o your patch. Another one coming to my mind
>>>>>>>>>>> is that caused by uid/gid change during recovery. Let me try out your patch.
>>>>>>>>>>
>>>>>>>>>> I guess we should update dquot and inode's uid/gid atomically under
>>>>>>>>>> lock_op() in f2fs_setattr() to prevent corruption on sys quota file.
>>>>>>>>>>
>>>>>>>>>> v9 can pass all xfstest cases and por_fsstress case w/ sys quota file
>>>>>>>>>> enabled, but w/ normal quota file, I got one regression reported by
>>>>>>>>>> generic/232, I fixed in v10, will do some tests and release it later.
>>>>>>>>>>
>>>>>>>>>> Note that, my fsck can fix corrupted quota file automatically once
>>>>>>>>>> CP_QUOTA_NEED_FSCK_FLAG is set.
>>>>>>>>>
>>>>>>>>> I hit failures again with your v9 w/ sysfile quota and modified fsck to detect
>>>>>>>>
>>>>>>>> That's strange, in my environment, before v9, I always encounter corrupted
>>>>>>>> quota sysfile after step 9), after v9, I never hit failure again.
>>>>>>>>
>>>>>>>> 1) enable fault injection
>>>>>>>> 2) run fsstress
>>>>>>>> 3) call shutdowon
>>>>>>>> 4) kill fsstress
>>>>>>>> 5) unmount
>>>>>>>> 6) fsck
>>>>>>>> 7) mount
>>>>>>>> 8) umount
>>>>>>>> 9) fsck
>>>>>>>> 10) go 1).
>>>>>>>>
>>>>>>>>> CP_QUOTA_NEED_FSCK_FLAG to fix the partition. Note that, if I set NEED_FSCK
>>>>>>>>> flag in roll-forward recovery, everything is fine.
>>>>>>>>
>>>>>>>> I do the test based on codes in my git tree, could you check the result
>>>>>>>> again based on my code? in where I just disable nat_bits recovery, not
>>>>>>>> sure, in step 6) fsck can break some thing in image.
>>>>>>>>
>>>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=f2fs-dev
>>>>>>>>
>>>>>>>> Also, I just send the fsck code, could you check that too?
>>>>>>>>
>>>>>>>> And I'd like to know your mount option and mkfs option, could you list for me?
>>>>>>>
>>>>>>> I'm just doing this.
>>>>>>> https://github.com/jaegeuk/xfstests-f2fs/blob/f2fs/run.sh#L220
>>>>>>
>>>>>> I just sent one patch to fix POR issue which missed to recover uid/gid of
>>>>>> inode.
>>>>>>
>>>>>> [PATCH] f2fs: fix to recover inode's uid/gid during POR
>>>>>>
>>>>>> After applying this patch, I can reproduce sys quota file corruption... let
>>>>>> me figure out the solution.
>>>>>
>>>>> Okay.
>>>>
>>>> Could you try v11, no quota corruption in my test now.
>>>
>>> Chao,
>>>
>>> I missed your fsck patch to recover this. Could you post it as well?
>>
>> Could you check below one?
>>
>> https://lore.kernel.org/patchwork/patch/988210/
> 
> It'd be worth to show the flag in print_cp_state.

That patch has already added that?

diff --git a/fsck/mount.c b/fsck/mount.c
index 6a3382dbd449..21a39a7222c6 100644
--- a/fsck/mount.c
+++ b/fsck/mount.c
@@ -405,6 +405,8 @@  void print_ckpt_info(struct f2fs_sb_info *sbi)
 void print_cp_state(u32 flag)
 {
 	MSG(0, "Info: checkpoint state = %x : ", flag);
+	if (flag & CP_QUOTA_NEED_FSCK_FLAG)
+		MSG(0, "%s", " quota_need_fsck");

Thanks,

> 
>>
>> Thanks,
>>
>>>
>>> Thanks,
>>>
>>>>
>>>> Thanks,
>>>>
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Can you test v9 first? I didn't encounter quota corruption with your
>>>>>>>>>>>> testcase right now. Will check it in cell phone environment.
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Signed-off-by: Chao Yu <yuchao0@huawei.com>
>>>>>>>>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>>>>>>>>>>>> ---
>>>>>>>>>>>>>  fs/f2fs/recovery.c | 5 +++++
>>>>>>>>>>>>>  1 file changed, 5 insertions(+)
>>>>>>>>>>>>>
>>>>>>>>>>>>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
>>>>>>>>>>>>> index 56d34193a74b..bff5cf730e13 100644
>>>>>>>>>>>>> --- a/fs/f2fs/recovery.c
>>>>>>>>>>>>> +++ b/fs/f2fs/recovery.c
>>>>>>>>>>>>> @@ -84,6 +84,11 @@ static struct fsync_inode_entry *add_fsync_inode(struct f2fs_sb_info *sbi,
>>>>>>>>>>>>>  		err = dquot_alloc_inode(inode);
>>>>>>>>>>>>>  		if (err)
>>>>>>>>>>>>>  			goto err_out;
>>>>>>>>>>>>> +		err = dquot_reserve_block(inode, 1);
>>>>>>>>>>>>> +		if (err) {
>>>>>>>>>>>>> +			dquot_drop(inode);
>>>>>>>>>>>>> +			goto err_out;
>>>>>>>>>>>>> +		}
>>>>>>>>>>>>>  	}
>>>>>>>>>>>>>  
>>>>>>>>>>>>>  	entry = f2fs_kmem_cache_alloc(fsync_entry_slab, GFP_F2FS_ZERO);
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> .
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> .
>>>>>>>>>
>>>>>>>
>>>>>>> .
>>>>>>>
>>>>>
>>>>> .
>>>>>
>>>
>>> .
>>>
> 
> .
> 


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-26  2:06                                             ` Chao Yu
@ 2018-09-26  2:09                                               ` Jaegeuk Kim
  2018-09-26  2:14                                                 ` Chao Yu
  2018-09-27  1:16                                                 ` Chao Yu
  0 siblings, 2 replies; 38+ messages in thread
From: Jaegeuk Kim @ 2018-09-26  2:09 UTC (permalink / raw)
  To: Chao Yu; +Cc: Chao Yu, linux-kernel, linux-f2fs-devel

On 09/26, Chao Yu wrote:
> On 2018/9/26 9:44, Jaegeuk Kim wrote:
> > On 09/26, Chao Yu wrote:
> >> On 2018/9/26 8:29, Jaegeuk Kim wrote:
> >>> On 09/21, Chao Yu wrote:
> >>>> On 2018/9/21 5:42, Jaegeuk Kim wrote:
> >>>>> On 09/20, Chao Yu wrote:
> >>>>>> On 2018/9/20 6:38, Jaegeuk Kim wrote:
> >>>>>>> On 09/19, Chao Yu wrote:
> >>>>>>>> On 2018/9/19 0:45, Jaegeuk Kim wrote:
> >>>>>>>>> On 09/18, Chao Yu wrote:
> >>>>>>>>>> On 2018/9/18 10:05, Jaegeuk Kim wrote:
> >>>>>>>>>>> On 09/18, Chao Yu wrote:
> >>>>>>>>>>>> On 2018/9/18 9:19, Jaegeuk Kim wrote:
> >>>>>>>>>>>>> On 09/13, Chao Yu wrote:
> >>>>>>>>>>>>>> On 2018/9/13 3:54, Jaegeuk Kim wrote:
> >>>>>>>>>>>>>>> On 09/12, Chao Yu wrote:
> >>>>>>>>>>>>>>>> On 2018/9/12 9:40, Chao Yu wrote:
> >>>>>>>>>>>>>>>>> On 2018/9/12 9:25, Jaegeuk Kim wrote:
> >>>>>>>>>>>>>>>>>> On 09/12, Chao Yu wrote:
> >>>>>>>>>>>>>>>>>>> On 2018/9/12 8:27, Jaegeuk Kim wrote:
> >>>>>>>>>>>>>>>>>>>> On 09/11, Jaegeuk Kim wrote:
> >>>>>>>>>>>>>>>>>>>>> On 09/12, Chao Yu wrote:
> >>>>>>>>>>>>>>>>>>>>>> On 2018/9/12 4:15, Jaegeuk Kim wrote:
> >>>>>>>>>>>>>>>>>>>>>>> fsck.f2fs is able to recover the quota structure, since roll-forward recovery
> >>>>>>>>>>>>>>>>>>>>>>> can recover it based on previous user information.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> I didn't get it, both fsck and kernel recover quota file based all inodes'
> >>>>>>>>>>>>>>>>>>>>>> uid/gid/prjid, if {x}id didn't change, wouldn't those two recovery result be the
> >>>>>>>>>>>>>>>>>>>>>> same?
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> I thought that, but had to add this, since I was encountering quota errors right
> >>>>>>>>>>>>>>>>>>>>> after getting some files recovered. And, I thought it'd make it more safe to do
> >>>>>>>>>>>>>>>>>>>>> fsck after roll-forward recovery.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Anyway, let me test again without this patch for a while.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Hmm, I just got a fsck failure right after some files recovered.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> To make sure, do you test with "f2fs: guarantee journalled quota data by
> >>>>>>>>>>>>>>>>>>> checkpoint"? if not, I think there is no guarantee that f2fs can recover
> >>>>>>>>>>>>>>>>>>> quote info into correct quote file, because, in last checkpoint, quota file
> >>>>>>>>>>>>>>>>>>> may was corrupted/inconsistent. Right?
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Oh, I forget to mention that, I add a patch to fsck to let it noticing
> >>>>>>>>>>>>>>>>> CP_QUOTA_NEED_FSCK_FLAG flag, and by default, fsck will fix corrupted quote
> >>>>>>>>>>>>>>>>> file if the flag is set, but w/o this flag, quota file is still corrupted
> >>>>>>>>>>>>>>>>> detected by fsck, I guess there is bug in v8.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> In v8, there are two cases we didn't guarantee quota file's consistence:
> >>>>>>>>>>>>>>>> 1. flush time in block_operation exceed a threshold.
> >>>>>>>>>>>>>>>> 2. dquot subsystem error occurs.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> For above case, fsck should repair the quota file by default.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Okay, I got another failure and it seems CP_QUOTA_NEED_FSCK_FLAG was not set
> >>>>>>>>>>>>>>> during the recovery. So, we have something missing in the recovery in terms
> >>>>>>>>>>>>>>> of quota updates.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Yeah, I checked the code, just found one suspected place:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> find_fsync_dnodes()
> >>>>>>>>>>>>>>  - f2fs_recover_inode_page
> >>>>>>>>>>>>>>   - inc_valid_node_count
> >>>>>>>>>>>>>>    - dquot_reserve_block  dquot info is not initialized now
> >>>>>>>>>>>>>>  - add_fsync_inode
> >>>>>>>>>>>>>>   - dquot_initialize
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I think we should reserve block for inode block after dquot_initialize(), can
> >>>>>>>>>>>>>> you confirm this?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Let me test this.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> >From b90260bc577fe87570b1ef7b134554a8295b1f6c Mon Sep 17 00:00:00 2001
> >>>>>>>>>>>>> From: Jaegeuk Kim <jaegeuk@kernel.org>
> >>>>>>>>>>>>> Date: Mon, 17 Sep 2018 18:14:41 -0700
> >>>>>>>>>>>>> Subject: [PATCH] f2fs: count inode block for recovered files
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> If a new file is recovered, we missed to reserve its inode block.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I remember, in order to keep line with other filesystem, unlike on-disk, we
> >>>>>>>>>>>> have to keep backward compatibilty, in memory we don't account block number
> >>>>>>>>>>>> for f2fs' inode block, but only account inode number for it, so here like
> >>>>>>>>>>>> we did in inc_valid_node_count(), we don't need to do this.
> >>>>>>>>>>>
> >>>>>>>>>>> Okay, I just hit the error again w/o your patch. Another one coming to my mind
> >>>>>>>>>>> is that caused by uid/gid change during recovery. Let me try out your patch.
> >>>>>>>>>>
> >>>>>>>>>> I guess we should update dquot and inode's uid/gid atomically under
> >>>>>>>>>> lock_op() in f2fs_setattr() to prevent corruption on sys quota file.
> >>>>>>>>>>
> >>>>>>>>>> v9 can pass all xfstest cases and por_fsstress case w/ sys quota file
> >>>>>>>>>> enabled, but w/ normal quota file, I got one regression reported by
> >>>>>>>>>> generic/232, I fixed in v10, will do some tests and release it later.
> >>>>>>>>>>
> >>>>>>>>>> Note that, my fsck can fix corrupted quota file automatically once
> >>>>>>>>>> CP_QUOTA_NEED_FSCK_FLAG is set.
> >>>>>>>>>
> >>>>>>>>> I hit failures again with your v9 w/ sysfile quota and modified fsck to detect
> >>>>>>>>
> >>>>>>>> That's strange, in my environment, before v9, I always encounter corrupted
> >>>>>>>> quota sysfile after step 9), after v9, I never hit failure again.
> >>>>>>>>
> >>>>>>>> 1) enable fault injection
> >>>>>>>> 2) run fsstress
> >>>>>>>> 3) call shutdowon
> >>>>>>>> 4) kill fsstress
> >>>>>>>> 5) unmount
> >>>>>>>> 6) fsck
> >>>>>>>> 7) mount
> >>>>>>>> 8) umount
> >>>>>>>> 9) fsck
> >>>>>>>> 10) go 1).
> >>>>>>>>
> >>>>>>>>> CP_QUOTA_NEED_FSCK_FLAG to fix the partition. Note that, if I set NEED_FSCK
> >>>>>>>>> flag in roll-forward recovery, everything is fine.
> >>>>>>>>
> >>>>>>>> I do the test based on codes in my git tree, could you check the result
> >>>>>>>> again based on my code? in where I just disable nat_bits recovery, not
> >>>>>>>> sure, in step 6) fsck can break some thing in image.
> >>>>>>>>
> >>>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=f2fs-dev
> >>>>>>>>
> >>>>>>>> Also, I just send the fsck code, could you check that too?
> >>>>>>>>
> >>>>>>>> And I'd like to know your mount option and mkfs option, could you list for me?
> >>>>>>>
> >>>>>>> I'm just doing this.
> >>>>>>> https://github.com/jaegeuk/xfstests-f2fs/blob/f2fs/run.sh#L220
> >>>>>>
> >>>>>> I just sent one patch to fix POR issue which missed to recover uid/gid of
> >>>>>> inode.
> >>>>>>
> >>>>>> [PATCH] f2fs: fix to recover inode's uid/gid during POR
> >>>>>>
> >>>>>> After applying this patch, I can reproduce sys quota file corruption... let
> >>>>>> me figure out the solution.
> >>>>>
> >>>>> Okay.
> >>>>
> >>>> Could you try v11, no quota corruption in my test now.
> >>>
> >>> Chao,
> >>>
> >>> I missed your fsck patch to recover this. Could you post it as well?
> >>
> >> Could you check below one?
> >>
> >> https://lore.kernel.org/patchwork/patch/988210/
> > 
> > It'd be worth to show the flag in print_cp_state.
> 
> That patch has already added that?
> 
> diff --git a/fsck/mount.c b/fsck/mount.c
> index 6a3382dbd449..21a39a7222c6 100644
> --- a/fsck/mount.c
> +++ b/fsck/mount.c
> @@ -405,6 +405,8 @@  void print_ckpt_info(struct f2fs_sb_info *sbi)
>  void print_cp_state(u32 flag)
>  {
>  	MSG(0, "Info: checkpoint state = %x : ", flag);
> +	if (flag & CP_QUOTA_NEED_FSCK_FLAG)
> +		MSG(0, "%s", " quota_need_fsck");

Oh, yeah. :P
I started to run all my tests with this. Let me see what will happen.

> 
> Thanks,
> 
> > 
> >>
> >> Thanks,
> >>
> >>>
> >>> Thanks,
> >>>
> >>>>
> >>>> Thanks,
> >>>>
> >>>>>
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>>>
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Can you test v9 first? I didn't encounter quota corruption with your
> >>>>>>>>>>>> testcase right now. Will check it in cell phone environment.
> >>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Signed-off-by: Chao Yu <yuchao0@huawei.com>
> >>>>>>>>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> >>>>>>>>>>>>> ---
> >>>>>>>>>>>>>  fs/f2fs/recovery.c | 5 +++++
> >>>>>>>>>>>>>  1 file changed, 5 insertions(+)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> >>>>>>>>>>>>> index 56d34193a74b..bff5cf730e13 100644
> >>>>>>>>>>>>> --- a/fs/f2fs/recovery.c
> >>>>>>>>>>>>> +++ b/fs/f2fs/recovery.c
> >>>>>>>>>>>>> @@ -84,6 +84,11 @@ static struct fsync_inode_entry *add_fsync_inode(struct f2fs_sb_info *sbi,
> >>>>>>>>>>>>>  		err = dquot_alloc_inode(inode);
> >>>>>>>>>>>>>  		if (err)
> >>>>>>>>>>>>>  			goto err_out;
> >>>>>>>>>>>>> +		err = dquot_reserve_block(inode, 1);
> >>>>>>>>>>>>> +		if (err) {
> >>>>>>>>>>>>> +			dquot_drop(inode);
> >>>>>>>>>>>>> +			goto err_out;
> >>>>>>>>>>>>> +		}
> >>>>>>>>>>>>>  	}
> >>>>>>>>>>>>>  
> >>>>>>>>>>>>>  	entry = f2fs_kmem_cache_alloc(fsync_entry_slab, GFP_F2FS_ZERO);
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> .
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> .
> >>>>>>>>>
> >>>>>>>
> >>>>>>> .
> >>>>>>>
> >>>>>
> >>>>> .
> >>>>>
> >>>
> >>> .
> >>>
> > 
> > .
> > 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-26  2:09                                               ` Jaegeuk Kim
@ 2018-09-26  2:14                                                 ` Chao Yu
  2018-09-27  1:16                                                 ` Chao Yu
  1 sibling, 0 replies; 38+ messages in thread
From: Chao Yu @ 2018-09-26  2:14 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: Chao Yu, linux-kernel, linux-f2fs-devel

On 2018/9/26 10:09, Jaegeuk Kim wrote:
> On 09/26, Chao Yu wrote:
>> On 2018/9/26 9:44, Jaegeuk Kim wrote:
>>> On 09/26, Chao Yu wrote:
>>>> On 2018/9/26 8:29, Jaegeuk Kim wrote:
>>>>> On 09/21, Chao Yu wrote:
>>>>>> On 2018/9/21 5:42, Jaegeuk Kim wrote:
>>>>>>> On 09/20, Chao Yu wrote:
>>>>>>>> On 2018/9/20 6:38, Jaegeuk Kim wrote:
>>>>>>>>> On 09/19, Chao Yu wrote:
>>>>>>>>>> On 2018/9/19 0:45, Jaegeuk Kim wrote:
>>>>>>>>>>> On 09/18, Chao Yu wrote:
>>>>>>>>>>>> On 2018/9/18 10:05, Jaegeuk Kim wrote:
>>>>>>>>>>>>> On 09/18, Chao Yu wrote:
>>>>>>>>>>>>>> On 2018/9/18 9:19, Jaegeuk Kim wrote:
>>>>>>>>>>>>>>> On 09/13, Chao Yu wrote:
>>>>>>>>>>>>>>>> On 2018/9/13 3:54, Jaegeuk Kim wrote:
>>>>>>>>>>>>>>>>> On 09/12, Chao Yu wrote:
>>>>>>>>>>>>>>>>>> On 2018/9/12 9:40, Chao Yu wrote:
>>>>>>>>>>>>>>>>>>> On 2018/9/12 9:25, Jaegeuk Kim wrote:
>>>>>>>>>>>>>>>>>>>> On 09/12, Chao Yu wrote:
>>>>>>>>>>>>>>>>>>>>> On 2018/9/12 8:27, Jaegeuk Kim wrote:
>>>>>>>>>>>>>>>>>>>>>> On 09/11, Jaegeuk Kim wrote:
>>>>>>>>>>>>>>>>>>>>>>> On 09/12, Chao Yu wrote:
>>>>>>>>>>>>>>>>>>>>>>>> On 2018/9/12 4:15, Jaegeuk Kim wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> fsck.f2fs is able to recover the quota structure, since roll-forward recovery
>>>>>>>>>>>>>>>>>>>>>>>>> can recover it based on previous user information.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> I didn't get it, both fsck and kernel recover quota file based all inodes'
>>>>>>>>>>>>>>>>>>>>>>>> uid/gid/prjid, if {x}id didn't change, wouldn't those two recovery result be the
>>>>>>>>>>>>>>>>>>>>>>>> same?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I thought that, but had to add this, since I was encountering quota errors right
>>>>>>>>>>>>>>>>>>>>>>> after getting some files recovered. And, I thought it'd make it more safe to do
>>>>>>>>>>>>>>>>>>>>>>> fsck after roll-forward recovery.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Anyway, let me test again without this patch for a while.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Hmm, I just got a fsck failure right after some files recovered.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> To make sure, do you test with "f2fs: guarantee journalled quota data by
>>>>>>>>>>>>>>>>>>>>> checkpoint"? if not, I think there is no guarantee that f2fs can recover
>>>>>>>>>>>>>>>>>>>>> quote info into correct quote file, because, in last checkpoint, quota file
>>>>>>>>>>>>>>>>>>>>> may was corrupted/inconsistent. Right?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Oh, I forget to mention that, I add a patch to fsck to let it noticing
>>>>>>>>>>>>>>>>>>> CP_QUOTA_NEED_FSCK_FLAG flag, and by default, fsck will fix corrupted quote
>>>>>>>>>>>>>>>>>>> file if the flag is set, but w/o this flag, quota file is still corrupted
>>>>>>>>>>>>>>>>>>> detected by fsck, I guess there is bug in v8.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> In v8, there are two cases we didn't guarantee quota file's consistence:
>>>>>>>>>>>>>>>>>> 1. flush time in block_operation exceed a threshold.
>>>>>>>>>>>>>>>>>> 2. dquot subsystem error occurs.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> For above case, fsck should repair the quota file by default.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Okay, I got another failure and it seems CP_QUOTA_NEED_FSCK_FLAG was not set
>>>>>>>>>>>>>>>>> during the recovery. So, we have something missing in the recovery in terms
>>>>>>>>>>>>>>>>> of quota updates.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yeah, I checked the code, just found one suspected place:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> find_fsync_dnodes()
>>>>>>>>>>>>>>>>  - f2fs_recover_inode_page
>>>>>>>>>>>>>>>>   - inc_valid_node_count
>>>>>>>>>>>>>>>>    - dquot_reserve_block  dquot info is not initialized now
>>>>>>>>>>>>>>>>  - add_fsync_inode
>>>>>>>>>>>>>>>>   - dquot_initialize
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I think we should reserve block for inode block after dquot_initialize(), can
>>>>>>>>>>>>>>>> you confirm this?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Let me test this.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> >From b90260bc577fe87570b1ef7b134554a8295b1f6c Mon Sep 17 00:00:00 2001
>>>>>>>>>>>>>>> From: Jaegeuk Kim <jaegeuk@kernel.org>
>>>>>>>>>>>>>>> Date: Mon, 17 Sep 2018 18:14:41 -0700
>>>>>>>>>>>>>>> Subject: [PATCH] f2fs: count inode block for recovered files
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If a new file is recovered, we missed to reserve its inode block.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I remember, in order to keep line with other filesystem, unlike on-disk, we
>>>>>>>>>>>>>> have to keep backward compatibilty, in memory we don't account block number
>>>>>>>>>>>>>> for f2fs' inode block, but only account inode number for it, so here like
>>>>>>>>>>>>>> we did in inc_valid_node_count(), we don't need to do this.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Okay, I just hit the error again w/o your patch. Another one coming to my mind
>>>>>>>>>>>>> is that caused by uid/gid change during recovery. Let me try out your patch.
>>>>>>>>>>>>
>>>>>>>>>>>> I guess we should update dquot and inode's uid/gid atomically under
>>>>>>>>>>>> lock_op() in f2fs_setattr() to prevent corruption on sys quota file.
>>>>>>>>>>>>
>>>>>>>>>>>> v9 can pass all xfstest cases and por_fsstress case w/ sys quota file
>>>>>>>>>>>> enabled, but w/ normal quota file, I got one regression reported by
>>>>>>>>>>>> generic/232, I fixed in v10, will do some tests and release it later.
>>>>>>>>>>>>
>>>>>>>>>>>> Note that, my fsck can fix corrupted quota file automatically once
>>>>>>>>>>>> CP_QUOTA_NEED_FSCK_FLAG is set.
>>>>>>>>>>>
>>>>>>>>>>> I hit failures again with your v9 w/ sysfile quota and modified fsck to detect
>>>>>>>>>>
>>>>>>>>>> That's strange, in my environment, before v9, I always encounter corrupted
>>>>>>>>>> quota sysfile after step 9), after v9, I never hit failure again.
>>>>>>>>>>
>>>>>>>>>> 1) enable fault injection
>>>>>>>>>> 2) run fsstress
>>>>>>>>>> 3) call shutdowon
>>>>>>>>>> 4) kill fsstress
>>>>>>>>>> 5) unmount
>>>>>>>>>> 6) fsck
>>>>>>>>>> 7) mount
>>>>>>>>>> 8) umount
>>>>>>>>>> 9) fsck
>>>>>>>>>> 10) go 1).
>>>>>>>>>>
>>>>>>>>>>> CP_QUOTA_NEED_FSCK_FLAG to fix the partition. Note that, if I set NEED_FSCK
>>>>>>>>>>> flag in roll-forward recovery, everything is fine.
>>>>>>>>>>
>>>>>>>>>> I do the test based on codes in my git tree, could you check the result
>>>>>>>>>> again based on my code? in where I just disable nat_bits recovery, not
>>>>>>>>>> sure, in step 6) fsck can break some thing in image.
>>>>>>>>>>
>>>>>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=f2fs-dev
>>>>>>>>>>
>>>>>>>>>> Also, I just send the fsck code, could you check that too?
>>>>>>>>>>
>>>>>>>>>> And I'd like to know your mount option and mkfs option, could you list for me?
>>>>>>>>>
>>>>>>>>> I'm just doing this.
>>>>>>>>> https://github.com/jaegeuk/xfstests-f2fs/blob/f2fs/run.sh#L220
>>>>>>>>
>>>>>>>> I just sent one patch to fix POR issue which missed to recover uid/gid of
>>>>>>>> inode.
>>>>>>>>
>>>>>>>> [PATCH] f2fs: fix to recover inode's uid/gid during POR
>>>>>>>>
>>>>>>>> After applying this patch, I can reproduce sys quota file corruption... let
>>>>>>>> me figure out the solution.
>>>>>>>
>>>>>>> Okay.
>>>>>>
>>>>>> Could you try v11, no quota corruption in my test now.
>>>>>
>>>>> Chao,
>>>>>
>>>>> I missed your fsck patch to recover this. Could you post it as well?
>>>>
>>>> Could you check below one?
>>>>
>>>> https://lore.kernel.org/patchwork/patch/988210/
>>>
>>> It'd be worth to show the flag in print_cp_state.
>>
>> That patch has already added that?
>>
>> diff --git a/fsck/mount.c b/fsck/mount.c
>> index 6a3382dbd449..21a39a7222c6 100644
>> --- a/fsck/mount.c
>> +++ b/fsck/mount.c
>> @@ -405,6 +405,8 @@  void print_ckpt_info(struct f2fs_sb_info *sbi)
>>  void print_cp_state(u32 flag)
>>  {
>>  	MSG(0, "Info: checkpoint state = %x : ", flag);
>> +	if (flag & CP_QUOTA_NEED_FSCK_FLAG)
>> +		MSG(0, "%s", " quota_need_fsck");
> 
> Oh, yeah. :P
> I started to run all my tests with this. Let me see what will happen.

Ah, thanks, I just don't want another quota corruption again... :P

Thanks,

> 
>>
>> Thanks,
>>
>>>
>>>>
>>>> Thanks,
>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can you test v9 first? I didn't encounter quota corruption with your
>>>>>>>>>>>>>> testcase right now. Will check it in cell phone environment.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Signed-off-by: Chao Yu <yuchao0@huawei.com>
>>>>>>>>>>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>  fs/f2fs/recovery.c | 5 +++++
>>>>>>>>>>>>>>>  1 file changed, 5 insertions(+)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
>>>>>>>>>>>>>>> index 56d34193a74b..bff5cf730e13 100644
>>>>>>>>>>>>>>> --- a/fs/f2fs/recovery.c
>>>>>>>>>>>>>>> +++ b/fs/f2fs/recovery.c
>>>>>>>>>>>>>>> @@ -84,6 +84,11 @@ static struct fsync_inode_entry *add_fsync_inode(struct f2fs_sb_info *sbi,
>>>>>>>>>>>>>>>  		err = dquot_alloc_inode(inode);
>>>>>>>>>>>>>>>  		if (err)
>>>>>>>>>>>>>>>  			goto err_out;
>>>>>>>>>>>>>>> +		err = dquot_reserve_block(inode, 1);
>>>>>>>>>>>>>>> +		if (err) {
>>>>>>>>>>>>>>> +			dquot_drop(inode);
>>>>>>>>>>>>>>> +			goto err_out;
>>>>>>>>>>>>>>> +		}
>>>>>>>>>>>>>>>  	}
>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>  	entry = f2fs_kmem_cache_alloc(fsync_entry_slab, GFP_F2FS_ZERO);
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> .
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> .
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> .
>>>>>>>>>
>>>>>>>
>>>>>>> .
>>>>>>>
>>>>>
>>>>> .
>>>>>
>>>
>>> .
>>>
> 
> .
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-26  2:09                                               ` Jaegeuk Kim
  2018-09-26  2:14                                                 ` Chao Yu
@ 2018-09-27  1:16                                                 ` Chao Yu
  2018-09-28 17:37                                                   ` Jaegeuk Kim
  1 sibling, 1 reply; 38+ messages in thread
From: Chao Yu @ 2018-09-27  1:16 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: Chao Yu, linux-kernel, linux-f2fs-devel

On 2018/9/26 10:09, Jaegeuk Kim wrote:
> Oh, yeah. :P
> I started to run all my tests with this. Let me see what will happen.

Any updates of this test?

Thanks


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-27  1:16                                                 ` Chao Yu
@ 2018-09-28 17:37                                                   ` Jaegeuk Kim
  2018-09-28 23:40                                                     ` Jaegeuk Kim
  0 siblings, 1 reply; 38+ messages in thread
From: Jaegeuk Kim @ 2018-09-28 17:37 UTC (permalink / raw)
  To: Chao Yu; +Cc: Chao Yu, linux-kernel, linux-f2fs-devel

On 09/27, Chao Yu wrote:
> On 2018/9/26 10:09, Jaegeuk Kim wrote:
> > Oh, yeah. :P
> > I started to run all my tests with this. Let me see what will happen.
> 
> Any updates of this test?

Not a good result. I'm trying to fix some bug in your patch.
Could you take a look at this?

---
 fs/f2fs/checkpoint.c |  8 ++++++++
 fs/f2fs/f2fs.h       |  1 +
 fs/f2fs/gc.c         |  4 ++++
 fs/f2fs/super.c      | 12 +++++++++++-
 4 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 524b87667cf4..b4b1586c683c 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -1494,6 +1494,7 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
 {
 	struct f2fs_checkpoint *ckpt = F2FS_CKPT(sbi);
 	unsigned long long ckpt_ver;
+	bool need_up = false;
 	int err = 0;
 
 	if (unlikely(is_sbi_flag_set(sbi, SBI_CP_DISABLED))) {
@@ -1506,6 +1507,11 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
 		f2fs_msg(sbi->sb, KERN_WARNING,
 				"Start checkpoint disabled!");
 	}
+	if (!is_sbi_flag_set(sbi, SBI_IS_MOUNT) &&
+			!is_sbi_flag_set(sbi, SBI_IS_CLOSE)) {
+		need_up = true;
+		down_read(&sbi->sb->s_umount);
+	}
 	mutex_lock(&sbi->cp_mutex);
 
 	if (!is_sbi_flag_set(sbi, SBI_IS_DIRTY) &&
@@ -1582,6 +1588,8 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
 	trace_f2fs_write_checkpoint(sbi->sb, cpc->reason, "finish checkpoint");
 out:
 	mutex_unlock(&sbi->cp_mutex);
+	if (need_up)
+		up_read(&sbi->sb->s_umount);
 	return err;
 }
 
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 57c829dd107e..461656aed78a 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1089,6 +1089,7 @@ struct inode_management {
 enum {
 	SBI_IS_DIRTY,				/* dirty flag for checkpoint */
 	SBI_IS_CLOSE,				/* specify unmounting */
+	SBI_IS_MOUNT,				/* specify mounting */
 	SBI_NEED_FSCK,				/* need fsck.f2fs to fix */
 	SBI_POR_DOING,				/* recovery is doing or not */
 	SBI_NEED_SB_WRITE,			/* need to recover superblock */
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index adaf5a695b12..ee6dbd72ad8a 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -55,6 +55,9 @@ static int gc_thread_func(void *data)
 			f2fs_stop_checkpoint(sbi, false);
 		}
 
+		if (!down_read_trylock(&sbi->sb->s_umount))
+			continue;
+
 		if (!sb_start_write_trylock(sbi->sb))
 			continue;
 
@@ -104,6 +107,7 @@ static int gc_thread_func(void *data)
 		f2fs_balance_fs_bg(sbi);
 next:
 		sb_end_write(sbi->sb);
+		up_read(&sbi->sb->s_umount);
 
 	} while (!kthread_should_stop());
 	return 0;
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 1d89d5e9e829..603386525336 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1534,6 +1534,7 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
 		}
 	}
 #endif
+	set_sbi_flag(sbi, SBI_IS_MOUNT);
 
 	/* recover superblocks we couldn't write due to previous RO mount */
 	if (!(*flags & SB_RDONLY) && is_sbi_flag_set(sbi, SBI_NEED_SB_WRITE)) {
@@ -1653,6 +1654,7 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
 		(test_opt(sbi, POSIX_ACL) ? SB_POSIXACL : 0);
 
 	limit_reserve_root(sbi);
+	clear_sbi_flag(sbi, SBI_IS_MOUNT);
 	return 0;
 restore_gc:
 	if (need_restart_gc) {
@@ -1672,6 +1674,7 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
 #endif
 	sbi->mount_opt = org_mount_opt;
 	sb->s_flags = old_sb_flags;
+	clear_sbi_flag(sbi, SBI_IS_MOUNT);
 	return err;
 }
 
@@ -1705,6 +1708,7 @@ static ssize_t f2fs_quota_read(struct super_block *sb, int type, char *data,
 				congestion_wait(BLK_RW_ASYNC, HZ/50);
 				goto repeat;
 			}
+			set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_NEED_REPAIR);
 			return PTR_ERR(page);
 		}
 
@@ -1716,6 +1720,7 @@ static ssize_t f2fs_quota_read(struct super_block *sb, int type, char *data,
 		}
 		if (unlikely(!PageUptodate(page))) {
 			f2fs_put_page(page, 1);
+			set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_NEED_REPAIR);
 			return -EIO;
 		}
 
@@ -1757,6 +1762,7 @@ static ssize_t f2fs_quota_write(struct super_block *sb, int type,
 				congestion_wait(BLK_RW_ASYNC, HZ/50);
 				goto retry;
 			}
+			set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_NEED_REPAIR);
 			break;
 		}
 
@@ -1793,7 +1799,6 @@ static qsize_t *f2fs_get_reserved_space(struct inode *inode)
 
 static int f2fs_quota_on_mount(struct f2fs_sb_info *sbi, int type)
 {
-
 	if (is_set_ckpt_flags(sbi, CP_QUOTA_NEED_FSCK_FLAG)) {
 		f2fs_msg(sbi->sb, KERN_ERR,
 			"quota sysfile may be corrupted, skip loading it");
@@ -3151,6 +3156,8 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
 	init_waitqueue_head(&sbi->cp_wait);
 	init_sb_info(sbi);
 
+	set_sbi_flag(sbi, SBI_IS_MOUNT);
+
 	err = init_percpu_info(sbi);
 	if (err)
 		goto free_bio_info;
@@ -3369,6 +3376,8 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
 				cur_cp_version(F2FS_CKPT(sbi)));
 	f2fs_update_time(sbi, CP_TIME);
 	f2fs_update_time(sbi, REQ_TIME);
+
+	clear_sbi_flag(sbi, SBI_IS_MOUNT);
 	return 0;
 
 free_meta:
@@ -3433,6 +3442,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
 		shrink_dcache_sb(sb);
 		goto try_onemore;
 	}
+	clear_sbi_flag(sbi, SBI_IS_MOUNT);
 	return err;
 }
 
-- 
2.19.0.605.g01d371f741-goog


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-28 17:37                                                   ` Jaegeuk Kim
@ 2018-09-28 23:40                                                     ` Jaegeuk Kim
  2018-09-29 10:38                                                       ` Chao Yu
  0 siblings, 1 reply; 38+ messages in thread
From: Jaegeuk Kim @ 2018-09-28 23:40 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-kernel, linux-f2fs-devel

Testing other fix.

---
 fs/f2fs/checkpoint.c |  7 +++++++
 fs/f2fs/f2fs.h       |  1 +
 fs/f2fs/gc.c         | 10 +++++++++-
 fs/f2fs/super.c      | 22 +++++++++++++++++++++-
 4 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 524b87667cf4..3fde91f41a91 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -1494,6 +1494,7 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
 {
 	struct f2fs_checkpoint *ckpt = F2FS_CKPT(sbi);
 	unsigned long long ckpt_ver;
+	bool need_up = false;
 	int err = 0;
 
 	if (unlikely(is_sbi_flag_set(sbi, SBI_CP_DISABLED))) {
@@ -1506,6 +1507,10 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
 		f2fs_msg(sbi->sb, KERN_WARNING,
 				"Start checkpoint disabled!");
 	}
+	if (!is_sbi_flag_set(sbi, SBI_QUOTA_INIT)) {
+		need_up = true;
+		down_read(&sbi->sb->s_umount);
+	}
 	mutex_lock(&sbi->cp_mutex);
 
 	if (!is_sbi_flag_set(sbi, SBI_IS_DIRTY) &&
@@ -1582,6 +1587,8 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
 	trace_f2fs_write_checkpoint(sbi->sb, cpc->reason, "finish checkpoint");
 out:
 	mutex_unlock(&sbi->cp_mutex);
+	if (need_up)
+		up_read(&sbi->sb->s_umount);
 	return err;
 }
 
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 57c829dd107e..30194f2f108e 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1096,6 +1096,7 @@ enum {
 	SBI_IS_SHUTDOWN,			/* shutdown by ioctl */
 	SBI_IS_RECOVERED,			/* recovered orphan/data */
 	SBI_CP_DISABLED,			/* CP was disabled last mount */
+	SBI_QUOTA_INIT,				/* avoid sb->s_umount lock */
 	SBI_QUOTA_NEED_FLUSH,			/* need to flush quota info in CP */
 	SBI_QUOTA_SKIP_FLUSH,			/* skip flushing quota in current CP */
 	SBI_QUOTA_NEED_REPAIR,			/* quota file may be corrupted */
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index adaf5a695b12..deece448cb3b 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -55,9 +55,14 @@ static int gc_thread_func(void *data)
 			f2fs_stop_checkpoint(sbi, false);
 		}
 
-		if (!sb_start_write_trylock(sbi->sb))
+		if (!down_read_trylock(&sbi->sb->s_umount))
 			continue;
 
+		set_sbi_flag(sbi, SBI_QUOTA_INIT);
+
+		if (!sb_start_write_trylock(sbi->sb))
+			goto next_umount;
+
 		/*
 		 * [GC triggering condition]
 		 * 0. GC is not conducted currently.
@@ -104,6 +109,9 @@ static int gc_thread_func(void *data)
 		f2fs_balance_fs_bg(sbi);
 next:
 		sb_end_write(sbi->sb);
+next_umount:
+		clear_sbi_flag(sbi, SBI_QUOTA_INIT);
+		up_read(&sbi->sb->s_umount);
 
 	} while (!kthread_should_stop());
 	return 0;
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index a28c245b1288..40a77a4eb465 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1029,6 +1029,8 @@ static void f2fs_put_super(struct super_block *sb)
 	int i;
 	bool dropped;
 
+	set_sbi_flag(sbi, SBI_QUOTA_INIT);
+
 	f2fs_quota_off_umount(sb);
 
 	/* prevent remaining shrinker jobs */
@@ -1122,11 +1124,17 @@ int f2fs_sync_fs(struct super_block *sb, int sync)
 
 	if (sync) {
 		struct cp_control cpc;
+		bool keep = is_sbi_flag_set(sbi, SBI_QUOTA_INIT);
 
 		cpc.reason = __get_cp_reason(sbi);
 
 		mutex_lock(&sbi->gc_mutex);
+		if (sbi->sb->s_writers.frozen >= SB_FREEZE_WRITE)
+			set_sbi_flag(sbi, SBI_QUOTA_INIT);
+
 		err = f2fs_write_checkpoint(sbi, &cpc);
+		if (!keep)
+			clear_sbi_flag(sbi, SBI_QUOTA_INIT);
 		mutex_unlock(&sbi->gc_mutex);
 	}
 	f2fs_trace_ios(NULL, 1);
@@ -1534,6 +1542,7 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
 		}
 	}
 #endif
+	set_sbi_flag(sbi, SBI_QUOTA_INIT);
 
 	/* recover superblocks we couldn't write due to previous RO mount */
 	if (!(*flags & SB_RDONLY) && is_sbi_flag_set(sbi, SBI_NEED_SB_WRITE)) {
@@ -1653,6 +1662,7 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
 		(test_opt(sbi, POSIX_ACL) ? SB_POSIXACL : 0);
 
 	limit_reserve_root(sbi);
+	clear_sbi_flag(sbi, SBI_QUOTA_INIT);
 	*flags = (*flags & ~SB_LAZYTIME) | (sb->s_flags & SB_LAZYTIME);
 	return 0;
 restore_gc:
@@ -1673,6 +1683,7 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
 #endif
 	sbi->mount_opt = org_mount_opt;
 	sb->s_flags = old_sb_flags;
+	clear_sbi_flag(sbi, SBI_QUOTA_INIT);
 	return err;
 }
 
@@ -1706,6 +1717,7 @@ static ssize_t f2fs_quota_read(struct super_block *sb, int type, char *data,
 				congestion_wait(BLK_RW_ASYNC, HZ/50);
 				goto repeat;
 			}
+			set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_NEED_REPAIR);
 			return PTR_ERR(page);
 		}
 
@@ -1717,6 +1729,7 @@ static ssize_t f2fs_quota_read(struct super_block *sb, int type, char *data,
 		}
 		if (unlikely(!PageUptodate(page))) {
 			f2fs_put_page(page, 1);
+			set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_NEED_REPAIR);
 			return -EIO;
 		}
 
@@ -1758,6 +1771,7 @@ static ssize_t f2fs_quota_write(struct super_block *sb, int type,
 				congestion_wait(BLK_RW_ASYNC, HZ/50);
 				goto retry;
 			}
+			set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_NEED_REPAIR);
 			break;
 		}
 
@@ -1794,7 +1808,6 @@ static qsize_t *f2fs_get_reserved_space(struct inode *inode)
 
 static int f2fs_quota_on_mount(struct f2fs_sb_info *sbi, int type)
 {
-
 	if (is_set_ckpt_flags(sbi, CP_QUOTA_NEED_FSCK_FLAG)) {
 		f2fs_msg(sbi->sb, KERN_ERR,
 			"quota sysfile may be corrupted, skip loading it");
@@ -1958,7 +1971,9 @@ static int f2fs_quota_on(struct super_block *sb, int type, int format_id,
 	if (err)
 		return err;
 
+	set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_INIT);
 	err = dquot_quota_on(sb, type, format_id, path);
+	clear_sbi_flag(F2FS_SB(sb), SBI_QUOTA_INIT);
 	if (err)
 		return err;
 
@@ -3179,6 +3194,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
 		goto free_meta_inode;
 	}
 
+	set_sbi_flag(sbi, SBI_QUOTA_INIT);
 	if (__is_set_ckpt_flags(F2FS_CKPT(sbi), CP_QUOTA_NEED_FSCK_FLAG))
 		set_sbi_flag(sbi, SBI_QUOTA_NEED_REPAIR);
 
@@ -3370,6 +3386,8 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
 				cur_cp_version(F2FS_CKPT(sbi)));
 	f2fs_update_time(sbi, CP_TIME);
 	f2fs_update_time(sbi, REQ_TIME);
+
+	clear_sbi_flag(sbi, SBI_QUOTA_INIT);
 	return 0;
 
 free_meta:
@@ -3434,6 +3452,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
 		shrink_dcache_sb(sb);
 		goto try_onemore;
 	}
+	clear_sbi_flag(sbi, SBI_QUOTA_INIT);
 	return err;
 }
 
@@ -3449,6 +3468,7 @@ static void kill_f2fs_super(struct super_block *sb)
 		struct f2fs_sb_info *sbi = F2FS_SB(sb);
 
 		set_sbi_flag(sbi, SBI_IS_CLOSE);
+		set_sbi_flag(sbi, SBI_QUOTA_INIT);
 		f2fs_stop_gc_thread(sbi);
 		f2fs_stop_discard_thread(sbi);
 
-- 
2.19.0.605.g01d371f741-goog


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-28 23:40                                                     ` Jaegeuk Kim
@ 2018-09-29 10:38                                                       ` Chao Yu
  2018-09-30 23:58                                                         ` Jaegeuk Kim
  0 siblings, 1 reply; 38+ messages in thread
From: Chao Yu @ 2018-09-29 10:38 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: linux-kernel, linux-f2fs-devel

On 2018/9/29 7:40, Jaegeuk Kim wrote:
> Testing other fix.
> 
> ---
>  fs/f2fs/checkpoint.c |  7 +++++++
>  fs/f2fs/f2fs.h       |  1 +
>  fs/f2fs/gc.c         | 10 +++++++++-
>  fs/f2fs/super.c      | 22 +++++++++++++++++++++-
>  4 files changed, 38 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
> index 524b87667cf4..3fde91f41a91 100644
> --- a/fs/f2fs/checkpoint.c
> +++ b/fs/f2fs/checkpoint.c
> @@ -1494,6 +1494,7 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
>  {
>  	struct f2fs_checkpoint *ckpt = F2FS_CKPT(sbi);
>  	unsigned long long ckpt_ver;
> +	bool need_up = false;
>  	int err = 0;
>  
>  	if (unlikely(is_sbi_flag_set(sbi, SBI_CP_DISABLED))) {
> @@ -1506,6 +1507,10 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
>  		f2fs_msg(sbi->sb, KERN_WARNING,
>  				"Start checkpoint disabled!");
>  	}
> +	if (!is_sbi_flag_set(sbi, SBI_QUOTA_INIT)) {
> +		need_up = true;
> +		down_read(&sbi->sb->s_umount);

This is to avoid show warning when calling dquot_writeback_dquots() in
f2fs_quota_sync(), right?

> +	}
>  	mutex_lock(&sbi->cp_mutex);
>  
>  	if (!is_sbi_flag_set(sbi, SBI_IS_DIRTY) &&
> @@ -1582,6 +1587,8 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
>  	trace_f2fs_write_checkpoint(sbi->sb, cpc->reason, "finish checkpoint");
>  out:
>  	mutex_unlock(&sbi->cp_mutex);
> +	if (need_up)
> +		up_read(&sbi->sb->s_umount);
>  	return err;
>  }
>  
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index 57c829dd107e..30194f2f108e 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -1096,6 +1096,7 @@ enum {
>  	SBI_IS_SHUTDOWN,			/* shutdown by ioctl */
>  	SBI_IS_RECOVERED,			/* recovered orphan/data */
>  	SBI_CP_DISABLED,			/* CP was disabled last mount */
> +	SBI_QUOTA_INIT,				/* avoid sb->s_umount lock */
>  	SBI_QUOTA_NEED_FLUSH,			/* need to flush quota info in CP */
>  	SBI_QUOTA_SKIP_FLUSH,			/* skip flushing quota in current CP */
>  	SBI_QUOTA_NEED_REPAIR,			/* quota file may be corrupted */
> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> index adaf5a695b12..deece448cb3b 100644
> --- a/fs/f2fs/gc.c
> +++ b/fs/f2fs/gc.c
> @@ -55,9 +55,14 @@ static int gc_thread_func(void *data)
>  			f2fs_stop_checkpoint(sbi, false);
>  		}
>  
> -		if (!sb_start_write_trylock(sbi->sb))
> +		if (!down_read_trylock(&sbi->sb->s_umount))
>  			continue;
>  
> +		set_sbi_flag(sbi, SBI_QUOTA_INIT);
> +
> +		if (!sb_start_write_trylock(sbi->sb))
> +			goto next_umount;
> +
>  		/*
>  		 * [GC triggering condition]
>  		 * 0. GC is not conducted currently.
> @@ -104,6 +109,9 @@ static int gc_thread_func(void *data)
>  		f2fs_balance_fs_bg(sbi);
>  next:
>  		sb_end_write(sbi->sb);
> +next_umount:
> +		clear_sbi_flag(sbi, SBI_QUOTA_INIT);
> +		up_read(&sbi->sb->s_umount);
>  
>  	} while (!kthread_should_stop());
>  	return 0;
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> index a28c245b1288..40a77a4eb465 100644
> --- a/fs/f2fs/super.c
> +++ b/fs/f2fs/super.c
> @@ -1029,6 +1029,8 @@ static void f2fs_put_super(struct super_block *sb)
>  	int i;
>  	bool dropped;
>  
> +	set_sbi_flag(sbi, SBI_QUOTA_INIT);
> +
>  	f2fs_quota_off_umount(sb);
>  
>  	/* prevent remaining shrinker jobs */
> @@ -1122,11 +1124,17 @@ int f2fs_sync_fs(struct super_block *sb, int sync)
>  
>  	if (sync) {
>  		struct cp_control cpc;
> +		bool keep = is_sbi_flag_set(sbi, SBI_QUOTA_INIT);
>  
>  		cpc.reason = __get_cp_reason(sbi);
>  
>  		mutex_lock(&sbi->gc_mutex);
> +		if (sbi->sb->s_writers.frozen >= SB_FREEZE_WRITE)
> +			set_sbi_flag(sbi, SBI_QUOTA_INIT);
> +
>  		err = f2fs_write_checkpoint(sbi, &cpc);
> +		if (!keep)
> +			clear_sbi_flag(sbi, SBI_QUOTA_INIT);
>  		mutex_unlock(&sbi->gc_mutex);
>  	}
>  	f2fs_trace_ios(NULL, 1);
> @@ -1534,6 +1542,7 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
>  		}
>  	}
>  #endif
> +	set_sbi_flag(sbi, SBI_QUOTA_INIT);
>  
>  	/* recover superblocks we couldn't write due to previous RO mount */
>  	if (!(*flags & SB_RDONLY) && is_sbi_flag_set(sbi, SBI_NEED_SB_WRITE)) {
> @@ -1653,6 +1662,7 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
>  		(test_opt(sbi, POSIX_ACL) ? SB_POSIXACL : 0);
>  
>  	limit_reserve_root(sbi);
> +	clear_sbi_flag(sbi, SBI_QUOTA_INIT);
>  	*flags = (*flags & ~SB_LAZYTIME) | (sb->s_flags & SB_LAZYTIME);
>  	return 0;
>  restore_gc:
> @@ -1673,6 +1683,7 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
>  #endif
>  	sbi->mount_opt = org_mount_opt;
>  	sb->s_flags = old_sb_flags;
> +	clear_sbi_flag(sbi, SBI_QUOTA_INIT);
>  	return err;
>  }
>  
> @@ -1706,6 +1717,7 @@ static ssize_t f2fs_quota_read(struct super_block *sb, int type, char *data,
>  				congestion_wait(BLK_RW_ASYNC, HZ/50);
>  				goto repeat;
>  			}
> +			set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_NEED_REPAIR);
>  			return PTR_ERR(page);
>  		}
>  
> @@ -1717,6 +1729,7 @@ static ssize_t f2fs_quota_read(struct super_block *sb, int type, char *data,
>  		}
>  		if (unlikely(!PageUptodate(page))) {
>  			f2fs_put_page(page, 1);
> +			set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_NEED_REPAIR);
>  			return -EIO;
>  		}
>  
> @@ -1758,6 +1771,7 @@ static ssize_t f2fs_quota_write(struct super_block *sb, int type,
>  				congestion_wait(BLK_RW_ASYNC, HZ/50);
>  				goto retry;
>  			}
> +			set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_NEED_REPAIR);
>  			break;

I added these before, but later I didn't encounter quota corruption w/o
them, so I removed them.

Do you still hit corruption after this change?

Thanks,

>  		}
>  
> @@ -1794,7 +1808,6 @@ static qsize_t *f2fs_get_reserved_space(struct inode *inode)
>  
>  static int f2fs_quota_on_mount(struct f2fs_sb_info *sbi, int type)
>  {
> -
>  	if (is_set_ckpt_flags(sbi, CP_QUOTA_NEED_FSCK_FLAG)) {
>  		f2fs_msg(sbi->sb, KERN_ERR,
>  			"quota sysfile may be corrupted, skip loading it");
> @@ -1958,7 +1971,9 @@ static int f2fs_quota_on(struct super_block *sb, int type, int format_id,
>  	if (err)
>  		return err;
>  
> +	set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_INIT);
>  	err = dquot_quota_on(sb, type, format_id, path);
> +	clear_sbi_flag(F2FS_SB(sb), SBI_QUOTA_INIT);
>  	if (err)
>  		return err;
>  
> @@ -3179,6 +3194,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>  		goto free_meta_inode;
>  	}
>  
> +	set_sbi_flag(sbi, SBI_QUOTA_INIT);
>  	if (__is_set_ckpt_flags(F2FS_CKPT(sbi), CP_QUOTA_NEED_FSCK_FLAG))
>  		set_sbi_flag(sbi, SBI_QUOTA_NEED_REPAIR);
>  
> @@ -3370,6 +3386,8 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>  				cur_cp_version(F2FS_CKPT(sbi)));
>  	f2fs_update_time(sbi, CP_TIME);
>  	f2fs_update_time(sbi, REQ_TIME);
> +
> +	clear_sbi_flag(sbi, SBI_QUOTA_INIT);
>  	return 0;
>  
>  free_meta:
> @@ -3434,6 +3452,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>  		shrink_dcache_sb(sb);
>  		goto try_onemore;
>  	}
> +	clear_sbi_flag(sbi, SBI_QUOTA_INIT);
>  	return err;
>  }
>  
> @@ -3449,6 +3468,7 @@ static void kill_f2fs_super(struct super_block *sb)
>  		struct f2fs_sb_info *sbi = F2FS_SB(sb);
>  
>  		set_sbi_flag(sbi, SBI_IS_CLOSE);
> +		set_sbi_flag(sbi, SBI_QUOTA_INIT);
>  		f2fs_stop_gc_thread(sbi);
>  		f2fs_stop_discard_thread(sbi);
>  
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-29 10:38                                                       ` Chao Yu
@ 2018-09-30 23:58                                                         ` Jaegeuk Kim
  2018-10-01  0:35                                                           ` Chao Yu
  0 siblings, 1 reply; 38+ messages in thread
From: Jaegeuk Kim @ 2018-09-30 23:58 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-kernel, linux-f2fs-devel

On 09/29, Chao Yu wrote:
> On 2018/9/29 7:40, Jaegeuk Kim wrote:
> > Testing other fix.
> > 
> > ---
> >  fs/f2fs/checkpoint.c |  7 +++++++
> >  fs/f2fs/f2fs.h       |  1 +
> >  fs/f2fs/gc.c         | 10 +++++++++-
> >  fs/f2fs/super.c      | 22 +++++++++++++++++++++-
> >  4 files changed, 38 insertions(+), 2 deletions(-)
> > 
> > diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
> > index 524b87667cf4..3fde91f41a91 100644
> > --- a/fs/f2fs/checkpoint.c
> > +++ b/fs/f2fs/checkpoint.c
> > @@ -1494,6 +1494,7 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
> >  {
> >  	struct f2fs_checkpoint *ckpt = F2FS_CKPT(sbi);
> >  	unsigned long long ckpt_ver;
> > +	bool need_up = false;
> >  	int err = 0;
> >  
> >  	if (unlikely(is_sbi_flag_set(sbi, SBI_CP_DISABLED))) {
> > @@ -1506,6 +1507,10 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
> >  		f2fs_msg(sbi->sb, KERN_WARNING,
> >  				"Start checkpoint disabled!");
> >  	}
> > +	if (!is_sbi_flag_set(sbi, SBI_QUOTA_INIT)) {
> > +		need_up = true;
> > +		down_read(&sbi->sb->s_umount);
> 
> This is to avoid show warning when calling dquot_writeback_dquots() in
> f2fs_quota_sync(), right?

Yup. Unfortunately, this can't fix all the issues, so I'm testing trylock
simply in this case.

> 
> > +	}
> >  	mutex_lock(&sbi->cp_mutex);
> >  
> >  	if (!is_sbi_flag_set(sbi, SBI_IS_DIRTY) &&
> > @@ -1582,6 +1587,8 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
> >  	trace_f2fs_write_checkpoint(sbi->sb, cpc->reason, "finish checkpoint");
> >  out:
> >  	mutex_unlock(&sbi->cp_mutex);
> > +	if (need_up)
> > +		up_read(&sbi->sb->s_umount);
> >  	return err;
> >  }
> >  
> > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> > index 57c829dd107e..30194f2f108e 100644
> > --- a/fs/f2fs/f2fs.h
> > +++ b/fs/f2fs/f2fs.h
> > @@ -1096,6 +1096,7 @@ enum {
> >  	SBI_IS_SHUTDOWN,			/* shutdown by ioctl */
> >  	SBI_IS_RECOVERED,			/* recovered orphan/data */
> >  	SBI_CP_DISABLED,			/* CP was disabled last mount */
> > +	SBI_QUOTA_INIT,				/* avoid sb->s_umount lock */
> >  	SBI_QUOTA_NEED_FLUSH,			/* need to flush quota info in CP */
> >  	SBI_QUOTA_SKIP_FLUSH,			/* skip flushing quota in current CP */
> >  	SBI_QUOTA_NEED_REPAIR,			/* quota file may be corrupted */
> > diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> > index adaf5a695b12..deece448cb3b 100644
> > --- a/fs/f2fs/gc.c
> > +++ b/fs/f2fs/gc.c
> > @@ -55,9 +55,14 @@ static int gc_thread_func(void *data)
> >  			f2fs_stop_checkpoint(sbi, false);
> >  		}
> >  
> > -		if (!sb_start_write_trylock(sbi->sb))
> > +		if (!down_read_trylock(&sbi->sb->s_umount))
> >  			continue;
> >  
> > +		set_sbi_flag(sbi, SBI_QUOTA_INIT);
> > +
> > +		if (!sb_start_write_trylock(sbi->sb))
> > +			goto next_umount;
> > +
> >  		/*
> >  		 * [GC triggering condition]
> >  		 * 0. GC is not conducted currently.
> > @@ -104,6 +109,9 @@ static int gc_thread_func(void *data)
> >  		f2fs_balance_fs_bg(sbi);
> >  next:
> >  		sb_end_write(sbi->sb);
> > +next_umount:
> > +		clear_sbi_flag(sbi, SBI_QUOTA_INIT);
> > +		up_read(&sbi->sb->s_umount);
> >  
> >  	} while (!kthread_should_stop());
> >  	return 0;
> > diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> > index a28c245b1288..40a77a4eb465 100644
> > --- a/fs/f2fs/super.c
> > +++ b/fs/f2fs/super.c
> > @@ -1029,6 +1029,8 @@ static void f2fs_put_super(struct super_block *sb)
> >  	int i;
> >  	bool dropped;
> >  
> > +	set_sbi_flag(sbi, SBI_QUOTA_INIT);
> > +
> >  	f2fs_quota_off_umount(sb);
> >  
> >  	/* prevent remaining shrinker jobs */
> > @@ -1122,11 +1124,17 @@ int f2fs_sync_fs(struct super_block *sb, int sync)
> >  
> >  	if (sync) {
> >  		struct cp_control cpc;
> > +		bool keep = is_sbi_flag_set(sbi, SBI_QUOTA_INIT);
> >  
> >  		cpc.reason = __get_cp_reason(sbi);
> >  
> >  		mutex_lock(&sbi->gc_mutex);
> > +		if (sbi->sb->s_writers.frozen >= SB_FREEZE_WRITE)
> > +			set_sbi_flag(sbi, SBI_QUOTA_INIT);
> > +
> >  		err = f2fs_write_checkpoint(sbi, &cpc);
> > +		if (!keep)
> > +			clear_sbi_flag(sbi, SBI_QUOTA_INIT);
> >  		mutex_unlock(&sbi->gc_mutex);
> >  	}
> >  	f2fs_trace_ios(NULL, 1);
> > @@ -1534,6 +1542,7 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
> >  		}
> >  	}
> >  #endif
> > +	set_sbi_flag(sbi, SBI_QUOTA_INIT);
> >  
> >  	/* recover superblocks we couldn't write due to previous RO mount */
> >  	if (!(*flags & SB_RDONLY) && is_sbi_flag_set(sbi, SBI_NEED_SB_WRITE)) {
> > @@ -1653,6 +1662,7 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
> >  		(test_opt(sbi, POSIX_ACL) ? SB_POSIXACL : 0);
> >  
> >  	limit_reserve_root(sbi);
> > +	clear_sbi_flag(sbi, SBI_QUOTA_INIT);
> >  	*flags = (*flags & ~SB_LAZYTIME) | (sb->s_flags & SB_LAZYTIME);
> >  	return 0;
> >  restore_gc:
> > @@ -1673,6 +1683,7 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
> >  #endif
> >  	sbi->mount_opt = org_mount_opt;
> >  	sb->s_flags = old_sb_flags;
> > +	clear_sbi_flag(sbi, SBI_QUOTA_INIT);
> >  	return err;
> >  }
> >  
> > @@ -1706,6 +1717,7 @@ static ssize_t f2fs_quota_read(struct super_block *sb, int type, char *data,
> >  				congestion_wait(BLK_RW_ASYNC, HZ/50);
> >  				goto repeat;
> >  			}
> > +			set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_NEED_REPAIR);
> >  			return PTR_ERR(page);
> >  		}
> >  
> > @@ -1717,6 +1729,7 @@ static ssize_t f2fs_quota_read(struct super_block *sb, int type, char *data,
> >  		}
> >  		if (unlikely(!PageUptodate(page))) {
> >  			f2fs_put_page(page, 1);
> > +			set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_NEED_REPAIR);
> >  			return -EIO;
> >  		}
> >  
> > @@ -1758,6 +1771,7 @@ static ssize_t f2fs_quota_write(struct super_block *sb, int type,
> >  				congestion_wait(BLK_RW_ASYNC, HZ/50);
> >  				goto retry;
> >  			}
> > +			set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_NEED_REPAIR);
> >  			break;
> 
> I added these before, but later I didn't encounter quota corruption w/o
> them, so I removed them.
> 
> Do you still hit corruption after this change?

I found one issue where we must avoid roll-forward recovery, if fsck overwrote some
blocks to fix quota which will be used for recovery.

> 
> Thanks,
> 
> >  		}
> >  
> > @@ -1794,7 +1808,6 @@ static qsize_t *f2fs_get_reserved_space(struct inode *inode)
> >  
> >  static int f2fs_quota_on_mount(struct f2fs_sb_info *sbi, int type)
> >  {
> > -
> >  	if (is_set_ckpt_flags(sbi, CP_QUOTA_NEED_FSCK_FLAG)) {
> >  		f2fs_msg(sbi->sb, KERN_ERR,
> >  			"quota sysfile may be corrupted, skip loading it");
> > @@ -1958,7 +1971,9 @@ static int f2fs_quota_on(struct super_block *sb, int type, int format_id,
> >  	if (err)
> >  		return err;
> >  
> > +	set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_INIT);
> >  	err = dquot_quota_on(sb, type, format_id, path);
> > +	clear_sbi_flag(F2FS_SB(sb), SBI_QUOTA_INIT);
> >  	if (err)
> >  		return err;
> >  
> > @@ -3179,6 +3194,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
> >  		goto free_meta_inode;
> >  	}
> >  
> > +	set_sbi_flag(sbi, SBI_QUOTA_INIT);
> >  	if (__is_set_ckpt_flags(F2FS_CKPT(sbi), CP_QUOTA_NEED_FSCK_FLAG))
> >  		set_sbi_flag(sbi, SBI_QUOTA_NEED_REPAIR);
> >  
> > @@ -3370,6 +3386,8 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
> >  				cur_cp_version(F2FS_CKPT(sbi)));
> >  	f2fs_update_time(sbi, CP_TIME);
> >  	f2fs_update_time(sbi, REQ_TIME);
> > +
> > +	clear_sbi_flag(sbi, SBI_QUOTA_INIT);
> >  	return 0;
> >  
> >  free_meta:
> > @@ -3434,6 +3452,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
> >  		shrink_dcache_sb(sb);
> >  		goto try_onemore;
> >  	}
> > +	clear_sbi_flag(sbi, SBI_QUOTA_INIT);
> >  	return err;
> >  }
> >  
> > @@ -3449,6 +3468,7 @@ static void kill_f2fs_super(struct super_block *sb)
> >  		struct f2fs_sb_info *sbi = F2FS_SB(sb);
> >  
> >  		set_sbi_flag(sbi, SBI_IS_CLOSE);
> > +		set_sbi_flag(sbi, SBI_QUOTA_INIT);
> >  		f2fs_stop_gc_thread(sbi);
> >  		f2fs_stop_discard_thread(sbi);
> >  
> > 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-09-30 23:58                                                         ` Jaegeuk Kim
@ 2018-10-01  0:35                                                           ` Chao Yu
  2018-10-01  1:27                                                             ` Jaegeuk Kim
  0 siblings, 1 reply; 38+ messages in thread
From: Chao Yu @ 2018-10-01  0:35 UTC (permalink / raw)
  To: Jaegeuk Kim, Chao Yu; +Cc: linux-kernel, linux-f2fs-devel

On 2018-10-1 7:58, Jaegeuk Kim wrote:
> On 09/29, Chao Yu wrote:
>> On 2018/9/29 7:40, Jaegeuk Kim wrote:
>>> Testing other fix.
>>>
>>> ---
>>>  fs/f2fs/checkpoint.c |  7 +++++++
>>>  fs/f2fs/f2fs.h       |  1 +
>>>  fs/f2fs/gc.c         | 10 +++++++++-
>>>  fs/f2fs/super.c      | 22 +++++++++++++++++++++-
>>>  4 files changed, 38 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
>>> index 524b87667cf4..3fde91f41a91 100644
>>> --- a/fs/f2fs/checkpoint.c
>>> +++ b/fs/f2fs/checkpoint.c
>>> @@ -1494,6 +1494,7 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
>>>  {
>>>  	struct f2fs_checkpoint *ckpt = F2FS_CKPT(sbi);
>>>  	unsigned long long ckpt_ver;
>>> +	bool need_up = false;
>>>  	int err = 0;
>>>  
>>>  	if (unlikely(is_sbi_flag_set(sbi, SBI_CP_DISABLED))) {
>>> @@ -1506,6 +1507,10 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
>>>  		f2fs_msg(sbi->sb, KERN_WARNING,
>>>  				"Start checkpoint disabled!");
>>>  	}
>>> +	if (!is_sbi_flag_set(sbi, SBI_QUOTA_INIT)) {
>>> +		need_up = true;
>>> +		down_read(&sbi->sb->s_umount);
>>
>> This is to avoid show warning when calling dquot_writeback_dquots() in
>> f2fs_quota_sync(), right?
> 
> Yup. Unfortunately, this can't fix all the issues, so I'm testing trylock
> simply in this case.

Oh, that's just warning, it could not be harmful, I think we can simply remove
WARN_ON_ONCE in dquot_writeback_dquots to fix this?

> 
>>
>>> +	}
>>>  	mutex_lock(&sbi->cp_mutex);
>>>  
>>>  	if (!is_sbi_flag_set(sbi, SBI_IS_DIRTY) &&
>>> @@ -1582,6 +1587,8 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
>>>  	trace_f2fs_write_checkpoint(sbi->sb, cpc->reason, "finish checkpoint");
>>>  out:
>>>  	mutex_unlock(&sbi->cp_mutex);
>>> +	if (need_up)
>>> +		up_read(&sbi->sb->s_umount);
>>>  	return err;
>>>  }
>>>  
>>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>>> index 57c829dd107e..30194f2f108e 100644
>>> --- a/fs/f2fs/f2fs.h
>>> +++ b/fs/f2fs/f2fs.h
>>> @@ -1096,6 +1096,7 @@ enum {
>>>  	SBI_IS_SHUTDOWN,			/* shutdown by ioctl */
>>>  	SBI_IS_RECOVERED,			/* recovered orphan/data */
>>>  	SBI_CP_DISABLED,			/* CP was disabled last mount */
>>> +	SBI_QUOTA_INIT,				/* avoid sb->s_umount lock */
>>>  	SBI_QUOTA_NEED_FLUSH,			/* need to flush quota info in CP */
>>>  	SBI_QUOTA_SKIP_FLUSH,			/* skip flushing quota in current CP */
>>>  	SBI_QUOTA_NEED_REPAIR,			/* quota file may be corrupted */
>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
>>> index adaf5a695b12..deece448cb3b 100644
>>> --- a/fs/f2fs/gc.c
>>> +++ b/fs/f2fs/gc.c
>>> @@ -55,9 +55,14 @@ static int gc_thread_func(void *data)
>>>  			f2fs_stop_checkpoint(sbi, false);
>>>  		}
>>>  
>>> -		if (!sb_start_write_trylock(sbi->sb))
>>> +		if (!down_read_trylock(&sbi->sb->s_umount))
>>>  			continue;
>>>  
>>> +		set_sbi_flag(sbi, SBI_QUOTA_INIT);
>>> +
>>> +		if (!sb_start_write_trylock(sbi->sb))
>>> +			goto next_umount;
>>> +
>>>  		/*
>>>  		 * [GC triggering condition]
>>>  		 * 0. GC is not conducted currently.
>>> @@ -104,6 +109,9 @@ static int gc_thread_func(void *data)
>>>  		f2fs_balance_fs_bg(sbi);
>>>  next:
>>>  		sb_end_write(sbi->sb);
>>> +next_umount:
>>> +		clear_sbi_flag(sbi, SBI_QUOTA_INIT);
>>> +		up_read(&sbi->sb->s_umount);
>>>  
>>>  	} while (!kthread_should_stop());
>>>  	return 0;
>>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>>> index a28c245b1288..40a77a4eb465 100644
>>> --- a/fs/f2fs/super.c
>>> +++ b/fs/f2fs/super.c
>>> @@ -1029,6 +1029,8 @@ static void f2fs_put_super(struct super_block *sb)
>>>  	int i;
>>>  	bool dropped;
>>>  
>>> +	set_sbi_flag(sbi, SBI_QUOTA_INIT);
>>> +
>>>  	f2fs_quota_off_umount(sb);
>>>  
>>>  	/* prevent remaining shrinker jobs */
>>> @@ -1122,11 +1124,17 @@ int f2fs_sync_fs(struct super_block *sb, int sync)
>>>  
>>>  	if (sync) {
>>>  		struct cp_control cpc;
>>> +		bool keep = is_sbi_flag_set(sbi, SBI_QUOTA_INIT);
>>>  
>>>  		cpc.reason = __get_cp_reason(sbi);
>>>  
>>>  		mutex_lock(&sbi->gc_mutex);
>>> +		if (sbi->sb->s_writers.frozen >= SB_FREEZE_WRITE)
>>> +			set_sbi_flag(sbi, SBI_QUOTA_INIT);
>>> +
>>>  		err = f2fs_write_checkpoint(sbi, &cpc);
>>> +		if (!keep)
>>> +			clear_sbi_flag(sbi, SBI_QUOTA_INIT);
>>>  		mutex_unlock(&sbi->gc_mutex);
>>>  	}
>>>  	f2fs_trace_ios(NULL, 1);
>>> @@ -1534,6 +1542,7 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
>>>  		}
>>>  	}
>>>  #endif
>>> +	set_sbi_flag(sbi, SBI_QUOTA_INIT);
>>>  
>>>  	/* recover superblocks we couldn't write due to previous RO mount */
>>>  	if (!(*flags & SB_RDONLY) && is_sbi_flag_set(sbi, SBI_NEED_SB_WRITE)) {
>>> @@ -1653,6 +1662,7 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
>>>  		(test_opt(sbi, POSIX_ACL) ? SB_POSIXACL : 0);
>>>  
>>>  	limit_reserve_root(sbi);
>>> +	clear_sbi_flag(sbi, SBI_QUOTA_INIT);
>>>  	*flags = (*flags & ~SB_LAZYTIME) | (sb->s_flags & SB_LAZYTIME);
>>>  	return 0;
>>>  restore_gc:
>>> @@ -1673,6 +1683,7 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
>>>  #endif
>>>  	sbi->mount_opt = org_mount_opt;
>>>  	sb->s_flags = old_sb_flags;
>>> +	clear_sbi_flag(sbi, SBI_QUOTA_INIT);
>>>  	return err;
>>>  }
>>>  
>>> @@ -1706,6 +1717,7 @@ static ssize_t f2fs_quota_read(struct super_block *sb, int type, char *data,
>>>  				congestion_wait(BLK_RW_ASYNC, HZ/50);
>>>  				goto repeat;
>>>  			}
>>> +			set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_NEED_REPAIR);
>>>  			return PTR_ERR(page);
>>>  		}
>>>  
>>> @@ -1717,6 +1729,7 @@ static ssize_t f2fs_quota_read(struct super_block *sb, int type, char *data,
>>>  		}
>>>  		if (unlikely(!PageUptodate(page))) {
>>>  			f2fs_put_page(page, 1);
>>> +			set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_NEED_REPAIR);
>>>  			return -EIO;
>>>  		}
>>>  
>>> @@ -1758,6 +1771,7 @@ static ssize_t f2fs_quota_write(struct super_block *sb, int type,
>>>  				congestion_wait(BLK_RW_ASYNC, HZ/50);
>>>  				goto retry;
>>>  			}
>>> +			set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_NEED_REPAIR);
>>>  			break;
>>
>> I added these before, but later I didn't encounter quota corruption w/o
>> them, so I removed them.
>>
>> Do you still hit corruption after this change?
> 
> I found one issue where we must avoid roll-forward recovery, if fsck overwrote some
> blocks to fix quota which will be used for recovery.

That's correct, IMO, the right way is that fsck needs to be aware of whole dnode
chain and block address in dnode block, when doing allocation, it needs to
bypass those block address of dnode and block which can be recovered later by
kernel, so that fsynced data can not be lost.

But, in order to trouble shoot current problem more quickly, we can just disable
kernel recovery once fsck recovers quota file for now.

Thanks,

> 
>>
>> Thanks,
>>
>>>  		}
>>>  
>>> @@ -1794,7 +1808,6 @@ static qsize_t *f2fs_get_reserved_space(struct inode *inode)
>>>  
>>>  static int f2fs_quota_on_mount(struct f2fs_sb_info *sbi, int type)
>>>  {
>>> -
>>>  	if (is_set_ckpt_flags(sbi, CP_QUOTA_NEED_FSCK_FLAG)) {
>>>  		f2fs_msg(sbi->sb, KERN_ERR,
>>>  			"quota sysfile may be corrupted, skip loading it");
>>> @@ -1958,7 +1971,9 @@ static int f2fs_quota_on(struct super_block *sb, int type, int format_id,
>>>  	if (err)
>>>  		return err;
>>>  
>>> +	set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_INIT);
>>>  	err = dquot_quota_on(sb, type, format_id, path);
>>> +	clear_sbi_flag(F2FS_SB(sb), SBI_QUOTA_INIT);
>>>  	if (err)
>>>  		return err;
>>>  
>>> @@ -3179,6 +3194,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>>>  		goto free_meta_inode;
>>>  	}
>>>  
>>> +	set_sbi_flag(sbi, SBI_QUOTA_INIT);
>>>  	if (__is_set_ckpt_flags(F2FS_CKPT(sbi), CP_QUOTA_NEED_FSCK_FLAG))
>>>  		set_sbi_flag(sbi, SBI_QUOTA_NEED_REPAIR);
>>>  
>>> @@ -3370,6 +3386,8 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>>>  				cur_cp_version(F2FS_CKPT(sbi)));
>>>  	f2fs_update_time(sbi, CP_TIME);
>>>  	f2fs_update_time(sbi, REQ_TIME);
>>> +
>>> +	clear_sbi_flag(sbi, SBI_QUOTA_INIT);
>>>  	return 0;
>>>  
>>>  free_meta:
>>> @@ -3434,6 +3452,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>>>  		shrink_dcache_sb(sb);
>>>  		goto try_onemore;
>>>  	}
>>> +	clear_sbi_flag(sbi, SBI_QUOTA_INIT);
>>>  	return err;
>>>  }
>>>  
>>> @@ -3449,6 +3468,7 @@ static void kill_f2fs_super(struct super_block *sb)
>>>  		struct f2fs_sb_info *sbi = F2FS_SB(sb);
>>>  
>>>  		set_sbi_flag(sbi, SBI_IS_CLOSE);
>>> +		set_sbi_flag(sbi, SBI_QUOTA_INIT);
>>>  		f2fs_stop_gc_thread(sbi);
>>>  		f2fs_stop_discard_thread(sbi);
>>>  
>>>
> 
> 
> _______________________________________________
> Linux-f2fs-devel mailing list
> Linux-f2fs-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-10-01  0:35                                                           ` Chao Yu
@ 2018-10-01  1:27                                                             ` Jaegeuk Kim
  2018-10-01  1:37                                                               ` Chao Yu
  0 siblings, 1 reply; 38+ messages in thread
From: Jaegeuk Kim @ 2018-10-01  1:27 UTC (permalink / raw)
  To: Chao Yu; +Cc: Chao Yu, linux-kernel, linux-f2fs-devel

On 10/01, Chao Yu wrote:
> On 2018-10-1 7:58, Jaegeuk Kim wrote:
> > On 09/29, Chao Yu wrote:
> >> On 2018/9/29 7:40, Jaegeuk Kim wrote:
> >>> Testing other fix.
> >>>
> >>> ---
> >>>  fs/f2fs/checkpoint.c |  7 +++++++
> >>>  fs/f2fs/f2fs.h       |  1 +
> >>>  fs/f2fs/gc.c         | 10 +++++++++-
> >>>  fs/f2fs/super.c      | 22 +++++++++++++++++++++-
> >>>  4 files changed, 38 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
> >>> index 524b87667cf4..3fde91f41a91 100644
> >>> --- a/fs/f2fs/checkpoint.c
> >>> +++ b/fs/f2fs/checkpoint.c
> >>> @@ -1494,6 +1494,7 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
> >>>  {
> >>>  	struct f2fs_checkpoint *ckpt = F2FS_CKPT(sbi);
> >>>  	unsigned long long ckpt_ver;
> >>> +	bool need_up = false;
> >>>  	int err = 0;
> >>>  
> >>>  	if (unlikely(is_sbi_flag_set(sbi, SBI_CP_DISABLED))) {
> >>> @@ -1506,6 +1507,10 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
> >>>  		f2fs_msg(sbi->sb, KERN_WARNING,
> >>>  				"Start checkpoint disabled!");
> >>>  	}
> >>> +	if (!is_sbi_flag_set(sbi, SBI_QUOTA_INIT)) {
> >>> +		need_up = true;
> >>> +		down_read(&sbi->sb->s_umount);
> >>
> >> This is to avoid show warning when calling dquot_writeback_dquots() in
> >> f2fs_quota_sync(), right?
> > 
> > Yup. Unfortunately, this can't fix all the issues, so I'm testing trylock
> > simply in this case.
> 
> Oh, that's just warning, it could not be harmful, I think we can simply remove
> WARN_ON_ONCE in dquot_writeback_dquots to fix this?

Well, I think it'd be better to keep it.

> 
> > 
> >>
> >>> +	}
> >>>  	mutex_lock(&sbi->cp_mutex);
> >>>  
> >>>  	if (!is_sbi_flag_set(sbi, SBI_IS_DIRTY) &&
> >>> @@ -1582,6 +1587,8 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
> >>>  	trace_f2fs_write_checkpoint(sbi->sb, cpc->reason, "finish checkpoint");
> >>>  out:
> >>>  	mutex_unlock(&sbi->cp_mutex);
> >>> +	if (need_up)
> >>> +		up_read(&sbi->sb->s_umount);
> >>>  	return err;
> >>>  }
> >>>  
> >>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> >>> index 57c829dd107e..30194f2f108e 100644
> >>> --- a/fs/f2fs/f2fs.h
> >>> +++ b/fs/f2fs/f2fs.h
> >>> @@ -1096,6 +1096,7 @@ enum {
> >>>  	SBI_IS_SHUTDOWN,			/* shutdown by ioctl */
> >>>  	SBI_IS_RECOVERED,			/* recovered orphan/data */
> >>>  	SBI_CP_DISABLED,			/* CP was disabled last mount */
> >>> +	SBI_QUOTA_INIT,				/* avoid sb->s_umount lock */
> >>>  	SBI_QUOTA_NEED_FLUSH,			/* need to flush quota info in CP */
> >>>  	SBI_QUOTA_SKIP_FLUSH,			/* skip flushing quota in current CP */
> >>>  	SBI_QUOTA_NEED_REPAIR,			/* quota file may be corrupted */
> >>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> >>> index adaf5a695b12..deece448cb3b 100644
> >>> --- a/fs/f2fs/gc.c
> >>> +++ b/fs/f2fs/gc.c
> >>> @@ -55,9 +55,14 @@ static int gc_thread_func(void *data)
> >>>  			f2fs_stop_checkpoint(sbi, false);
> >>>  		}
> >>>  
> >>> -		if (!sb_start_write_trylock(sbi->sb))
> >>> +		if (!down_read_trylock(&sbi->sb->s_umount))
> >>>  			continue;
> >>>  
> >>> +		set_sbi_flag(sbi, SBI_QUOTA_INIT);
> >>> +
> >>> +		if (!sb_start_write_trylock(sbi->sb))
> >>> +			goto next_umount;
> >>> +
> >>>  		/*
> >>>  		 * [GC triggering condition]
> >>>  		 * 0. GC is not conducted currently.
> >>> @@ -104,6 +109,9 @@ static int gc_thread_func(void *data)
> >>>  		f2fs_balance_fs_bg(sbi);
> >>>  next:
> >>>  		sb_end_write(sbi->sb);
> >>> +next_umount:
> >>> +		clear_sbi_flag(sbi, SBI_QUOTA_INIT);
> >>> +		up_read(&sbi->sb->s_umount);
> >>>  
> >>>  	} while (!kthread_should_stop());
> >>>  	return 0;
> >>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> >>> index a28c245b1288..40a77a4eb465 100644
> >>> --- a/fs/f2fs/super.c
> >>> +++ b/fs/f2fs/super.c
> >>> @@ -1029,6 +1029,8 @@ static void f2fs_put_super(struct super_block *sb)
> >>>  	int i;
> >>>  	bool dropped;
> >>>  
> >>> +	set_sbi_flag(sbi, SBI_QUOTA_INIT);
> >>> +
> >>>  	f2fs_quota_off_umount(sb);
> >>>  
> >>>  	/* prevent remaining shrinker jobs */
> >>> @@ -1122,11 +1124,17 @@ int f2fs_sync_fs(struct super_block *sb, int sync)
> >>>  
> >>>  	if (sync) {
> >>>  		struct cp_control cpc;
> >>> +		bool keep = is_sbi_flag_set(sbi, SBI_QUOTA_INIT);
> >>>  
> >>>  		cpc.reason = __get_cp_reason(sbi);
> >>>  
> >>>  		mutex_lock(&sbi->gc_mutex);
> >>> +		if (sbi->sb->s_writers.frozen >= SB_FREEZE_WRITE)
> >>> +			set_sbi_flag(sbi, SBI_QUOTA_INIT);
> >>> +
> >>>  		err = f2fs_write_checkpoint(sbi, &cpc);
> >>> +		if (!keep)
> >>> +			clear_sbi_flag(sbi, SBI_QUOTA_INIT);
> >>>  		mutex_unlock(&sbi->gc_mutex);
> >>>  	}
> >>>  	f2fs_trace_ios(NULL, 1);
> >>> @@ -1534,6 +1542,7 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
> >>>  		}
> >>>  	}
> >>>  #endif
> >>> +	set_sbi_flag(sbi, SBI_QUOTA_INIT);
> >>>  
> >>>  	/* recover superblocks we couldn't write due to previous RO mount */
> >>>  	if (!(*flags & SB_RDONLY) && is_sbi_flag_set(sbi, SBI_NEED_SB_WRITE)) {
> >>> @@ -1653,6 +1662,7 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
> >>>  		(test_opt(sbi, POSIX_ACL) ? SB_POSIXACL : 0);
> >>>  
> >>>  	limit_reserve_root(sbi);
> >>> +	clear_sbi_flag(sbi, SBI_QUOTA_INIT);
> >>>  	*flags = (*flags & ~SB_LAZYTIME) | (sb->s_flags & SB_LAZYTIME);
> >>>  	return 0;
> >>>  restore_gc:
> >>> @@ -1673,6 +1683,7 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
> >>>  #endif
> >>>  	sbi->mount_opt = org_mount_opt;
> >>>  	sb->s_flags = old_sb_flags;
> >>> +	clear_sbi_flag(sbi, SBI_QUOTA_INIT);
> >>>  	return err;
> >>>  }
> >>>  
> >>> @@ -1706,6 +1717,7 @@ static ssize_t f2fs_quota_read(struct super_block *sb, int type, char *data,
> >>>  				congestion_wait(BLK_RW_ASYNC, HZ/50);
> >>>  				goto repeat;
> >>>  			}
> >>> +			set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_NEED_REPAIR);
> >>>  			return PTR_ERR(page);
> >>>  		}
> >>>  
> >>> @@ -1717,6 +1729,7 @@ static ssize_t f2fs_quota_read(struct super_block *sb, int type, char *data,
> >>>  		}
> >>>  		if (unlikely(!PageUptodate(page))) {
> >>>  			f2fs_put_page(page, 1);
> >>> +			set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_NEED_REPAIR);
> >>>  			return -EIO;
> >>>  		}
> >>>  
> >>> @@ -1758,6 +1771,7 @@ static ssize_t f2fs_quota_write(struct super_block *sb, int type,
> >>>  				congestion_wait(BLK_RW_ASYNC, HZ/50);
> >>>  				goto retry;
> >>>  			}
> >>> +			set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_NEED_REPAIR);
> >>>  			break;
> >>
> >> I added these before, but later I didn't encounter quota corruption w/o
> >> them, so I removed them.
> >>
> >> Do you still hit corruption after this change?
> > 
> > I found one issue where we must avoid roll-forward recovery, if fsck overwrote some
> > blocks to fix quota which will be used for recovery.
> 
> That's correct, IMO, the right way is that fsck needs to be aware of whole dnode
> chain and block address in dnode block, when doing allocation, it needs to
> bypass those block address of dnode and block which can be recovered later by
> kernel, so that fsynced data can not be lost.
> 
> But, in order to trouble shoot current problem more quickly, we can just disable
> kernel recovery once fsck recovers quota file for now.
> 
> Thanks,
> 
> > 
> >>
> >> Thanks,
> >>
> >>>  		}
> >>>  
> >>> @@ -1794,7 +1808,6 @@ static qsize_t *f2fs_get_reserved_space(struct inode *inode)
> >>>  
> >>>  static int f2fs_quota_on_mount(struct f2fs_sb_info *sbi, int type)
> >>>  {
> >>> -
> >>>  	if (is_set_ckpt_flags(sbi, CP_QUOTA_NEED_FSCK_FLAG)) {
> >>>  		f2fs_msg(sbi->sb, KERN_ERR,
> >>>  			"quota sysfile may be corrupted, skip loading it");
> >>> @@ -1958,7 +1971,9 @@ static int f2fs_quota_on(struct super_block *sb, int type, int format_id,
> >>>  	if (err)
> >>>  		return err;
> >>>  
> >>> +	set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_INIT);
> >>>  	err = dquot_quota_on(sb, type, format_id, path);
> >>> +	clear_sbi_flag(F2FS_SB(sb), SBI_QUOTA_INIT);
> >>>  	if (err)
> >>>  		return err;
> >>>  
> >>> @@ -3179,6 +3194,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
> >>>  		goto free_meta_inode;
> >>>  	}
> >>>  
> >>> +	set_sbi_flag(sbi, SBI_QUOTA_INIT);
> >>>  	if (__is_set_ckpt_flags(F2FS_CKPT(sbi), CP_QUOTA_NEED_FSCK_FLAG))
> >>>  		set_sbi_flag(sbi, SBI_QUOTA_NEED_REPAIR);
> >>>  
> >>> @@ -3370,6 +3386,8 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
> >>>  				cur_cp_version(F2FS_CKPT(sbi)));
> >>>  	f2fs_update_time(sbi, CP_TIME);
> >>>  	f2fs_update_time(sbi, REQ_TIME);
> >>> +
> >>> +	clear_sbi_flag(sbi, SBI_QUOTA_INIT);
> >>>  	return 0;
> >>>  
> >>>  free_meta:
> >>> @@ -3434,6 +3452,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
> >>>  		shrink_dcache_sb(sb);
> >>>  		goto try_onemore;
> >>>  	}
> >>> +	clear_sbi_flag(sbi, SBI_QUOTA_INIT);
> >>>  	return err;
> >>>  }
> >>>  
> >>> @@ -3449,6 +3468,7 @@ static void kill_f2fs_super(struct super_block *sb)
> >>>  		struct f2fs_sb_info *sbi = F2FS_SB(sb);
> >>>  
> >>>  		set_sbi_flag(sbi, SBI_IS_CLOSE);
> >>> +		set_sbi_flag(sbi, SBI_QUOTA_INIT);
> >>>  		f2fs_stop_gc_thread(sbi);
> >>>  		f2fs_stop_discard_thread(sbi);
> >>>  
> >>>
> > 
> > 
> > _______________________________________________
> > Linux-f2fs-devel mailing list
> > Linux-f2fs-devel@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> > 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix quota info to adjust recovered data
  2018-10-01  1:27                                                             ` Jaegeuk Kim
@ 2018-10-01  1:37                                                               ` Chao Yu
  0 siblings, 0 replies; 38+ messages in thread
From: Chao Yu @ 2018-10-01  1:37 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: Chao Yu, linux-kernel, linux-f2fs-devel

On 2018-10-1 9:27, Jaegeuk Kim wrote:
> On 10/01, Chao Yu wrote:
>> On 2018-10-1 7:58, Jaegeuk Kim wrote:
>>> On 09/29, Chao Yu wrote:
>>>> On 2018/9/29 7:40, Jaegeuk Kim wrote:
>>>>> Testing other fix.
>>>>>
>>>>> ---
>>>>>  fs/f2fs/checkpoint.c |  7 +++++++
>>>>>  fs/f2fs/f2fs.h       |  1 +
>>>>>  fs/f2fs/gc.c         | 10 +++++++++-
>>>>>  fs/f2fs/super.c      | 22 +++++++++++++++++++++-
>>>>>  4 files changed, 38 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
>>>>> index 524b87667cf4..3fde91f41a91 100644
>>>>> --- a/fs/f2fs/checkpoint.c
>>>>> +++ b/fs/f2fs/checkpoint.c
>>>>> @@ -1494,6 +1494,7 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
>>>>>  {
>>>>>  	struct f2fs_checkpoint *ckpt = F2FS_CKPT(sbi);
>>>>>  	unsigned long long ckpt_ver;
>>>>> +	bool need_up = false;
>>>>>  	int err = 0;
>>>>>  
>>>>>  	if (unlikely(is_sbi_flag_set(sbi, SBI_CP_DISABLED))) {
>>>>> @@ -1506,6 +1507,10 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
>>>>>  		f2fs_msg(sbi->sb, KERN_WARNING,
>>>>>  				"Start checkpoint disabled!");
>>>>>  	}
>>>>> +	if (!is_sbi_flag_set(sbi, SBI_QUOTA_INIT)) {
>>>>> +		need_up = true;
>>>>> +		down_read(&sbi->sb->s_umount);
>>>>
>>>> This is to avoid show warning when calling dquot_writeback_dquots() in
>>>> f2fs_quota_sync(), right?
>>>
>>> Yup. Unfortunately, this can't fix all the issues, so I'm testing trylock
>>> simply in this case.
>>
>> Oh, that's just warning, it could not be harmful, I think we can simply remove
>> WARN_ON_ONCE in dquot_writeback_dquots to fix this?
> 
> Well, I think it'd be better to keep it.

We'd better to ask suggestion from maintainer of quota subsystem?

Thanks,

> 
>>
>>>
>>>>
>>>>> +	}
>>>>>  	mutex_lock(&sbi->cp_mutex);
>>>>>  
>>>>>  	if (!is_sbi_flag_set(sbi, SBI_IS_DIRTY) &&
>>>>> @@ -1582,6 +1587,8 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
>>>>>  	trace_f2fs_write_checkpoint(sbi->sb, cpc->reason, "finish checkpoint");
>>>>>  out:
>>>>>  	mutex_unlock(&sbi->cp_mutex);
>>>>> +	if (need_up)
>>>>> +		up_read(&sbi->sb->s_umount);
>>>>>  	return err;
>>>>>  }
>>>>>  
>>>>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>>>>> index 57c829dd107e..30194f2f108e 100644
>>>>> --- a/fs/f2fs/f2fs.h
>>>>> +++ b/fs/f2fs/f2fs.h
>>>>> @@ -1096,6 +1096,7 @@ enum {
>>>>>  	SBI_IS_SHUTDOWN,			/* shutdown by ioctl */
>>>>>  	SBI_IS_RECOVERED,			/* recovered orphan/data */
>>>>>  	SBI_CP_DISABLED,			/* CP was disabled last mount */
>>>>> +	SBI_QUOTA_INIT,				/* avoid sb->s_umount lock */
>>>>>  	SBI_QUOTA_NEED_FLUSH,			/* need to flush quota info in CP */
>>>>>  	SBI_QUOTA_SKIP_FLUSH,			/* skip flushing quota in current CP */
>>>>>  	SBI_QUOTA_NEED_REPAIR,			/* quota file may be corrupted */
>>>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
>>>>> index adaf5a695b12..deece448cb3b 100644
>>>>> --- a/fs/f2fs/gc.c
>>>>> +++ b/fs/f2fs/gc.c
>>>>> @@ -55,9 +55,14 @@ static int gc_thread_func(void *data)
>>>>>  			f2fs_stop_checkpoint(sbi, false);
>>>>>  		}
>>>>>  
>>>>> -		if (!sb_start_write_trylock(sbi->sb))
>>>>> +		if (!down_read_trylock(&sbi->sb->s_umount))
>>>>>  			continue;
>>>>>  
>>>>> +		set_sbi_flag(sbi, SBI_QUOTA_INIT);
>>>>> +
>>>>> +		if (!sb_start_write_trylock(sbi->sb))
>>>>> +			goto next_umount;
>>>>> +
>>>>>  		/*
>>>>>  		 * [GC triggering condition]
>>>>>  		 * 0. GC is not conducted currently.
>>>>> @@ -104,6 +109,9 @@ static int gc_thread_func(void *data)
>>>>>  		f2fs_balance_fs_bg(sbi);
>>>>>  next:
>>>>>  		sb_end_write(sbi->sb);
>>>>> +next_umount:
>>>>> +		clear_sbi_flag(sbi, SBI_QUOTA_INIT);
>>>>> +		up_read(&sbi->sb->s_umount);
>>>>>  
>>>>>  	} while (!kthread_should_stop());
>>>>>  	return 0;
>>>>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>>>>> index a28c245b1288..40a77a4eb465 100644
>>>>> --- a/fs/f2fs/super.c
>>>>> +++ b/fs/f2fs/super.c
>>>>> @@ -1029,6 +1029,8 @@ static void f2fs_put_super(struct super_block *sb)
>>>>>  	int i;
>>>>>  	bool dropped;
>>>>>  
>>>>> +	set_sbi_flag(sbi, SBI_QUOTA_INIT);
>>>>> +
>>>>>  	f2fs_quota_off_umount(sb);
>>>>>  
>>>>>  	/* prevent remaining shrinker jobs */
>>>>> @@ -1122,11 +1124,17 @@ int f2fs_sync_fs(struct super_block *sb, int sync)
>>>>>  
>>>>>  	if (sync) {
>>>>>  		struct cp_control cpc;
>>>>> +		bool keep = is_sbi_flag_set(sbi, SBI_QUOTA_INIT);
>>>>>  
>>>>>  		cpc.reason = __get_cp_reason(sbi);
>>>>>  
>>>>>  		mutex_lock(&sbi->gc_mutex);
>>>>> +		if (sbi->sb->s_writers.frozen >= SB_FREEZE_WRITE)
>>>>> +			set_sbi_flag(sbi, SBI_QUOTA_INIT);
>>>>> +
>>>>>  		err = f2fs_write_checkpoint(sbi, &cpc);
>>>>> +		if (!keep)
>>>>> +			clear_sbi_flag(sbi, SBI_QUOTA_INIT);
>>>>>  		mutex_unlock(&sbi->gc_mutex);
>>>>>  	}
>>>>>  	f2fs_trace_ios(NULL, 1);
>>>>> @@ -1534,6 +1542,7 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
>>>>>  		}
>>>>>  	}
>>>>>  #endif
>>>>> +	set_sbi_flag(sbi, SBI_QUOTA_INIT);
>>>>>  
>>>>>  	/* recover superblocks we couldn't write due to previous RO mount */
>>>>>  	if (!(*flags & SB_RDONLY) && is_sbi_flag_set(sbi, SBI_NEED_SB_WRITE)) {
>>>>> @@ -1653,6 +1662,7 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
>>>>>  		(test_opt(sbi, POSIX_ACL) ? SB_POSIXACL : 0);
>>>>>  
>>>>>  	limit_reserve_root(sbi);
>>>>> +	clear_sbi_flag(sbi, SBI_QUOTA_INIT);
>>>>>  	*flags = (*flags & ~SB_LAZYTIME) | (sb->s_flags & SB_LAZYTIME);
>>>>>  	return 0;
>>>>>  restore_gc:
>>>>> @@ -1673,6 +1683,7 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
>>>>>  #endif
>>>>>  	sbi->mount_opt = org_mount_opt;
>>>>>  	sb->s_flags = old_sb_flags;
>>>>> +	clear_sbi_flag(sbi, SBI_QUOTA_INIT);
>>>>>  	return err;
>>>>>  }
>>>>>  
>>>>> @@ -1706,6 +1717,7 @@ static ssize_t f2fs_quota_read(struct super_block *sb, int type, char *data,
>>>>>  				congestion_wait(BLK_RW_ASYNC, HZ/50);
>>>>>  				goto repeat;
>>>>>  			}
>>>>> +			set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_NEED_REPAIR);
>>>>>  			return PTR_ERR(page);
>>>>>  		}
>>>>>  
>>>>> @@ -1717,6 +1729,7 @@ static ssize_t f2fs_quota_read(struct super_block *sb, int type, char *data,
>>>>>  		}
>>>>>  		if (unlikely(!PageUptodate(page))) {
>>>>>  			f2fs_put_page(page, 1);
>>>>> +			set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_NEED_REPAIR);
>>>>>  			return -EIO;
>>>>>  		}
>>>>>  
>>>>> @@ -1758,6 +1771,7 @@ static ssize_t f2fs_quota_write(struct super_block *sb, int type,
>>>>>  				congestion_wait(BLK_RW_ASYNC, HZ/50);
>>>>>  				goto retry;
>>>>>  			}
>>>>> +			set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_NEED_REPAIR);
>>>>>  			break;
>>>>
>>>> I added these before, but later I didn't encounter quota corruption w/o
>>>> them, so I removed them.
>>>>
>>>> Do you still hit corruption after this change?
>>>
>>> I found one issue where we must avoid roll-forward recovery, if fsck overwrote some
>>> blocks to fix quota which will be used for recovery.
>>
>> That's correct, IMO, the right way is that fsck needs to be aware of whole dnode
>> chain and block address in dnode block, when doing allocation, it needs to
>> bypass those block address of dnode and block which can be recovered later by
>> kernel, so that fsynced data can not be lost.
>>
>> But, in order to trouble shoot current problem more quickly, we can just disable
>> kernel recovery once fsck recovers quota file for now.
>>
>> Thanks,
>>
>>>
>>>>
>>>> Thanks,
>>>>
>>>>>  		}
>>>>>  
>>>>> @@ -1794,7 +1808,6 @@ static qsize_t *f2fs_get_reserved_space(struct inode *inode)
>>>>>  
>>>>>  static int f2fs_quota_on_mount(struct f2fs_sb_info *sbi, int type)
>>>>>  {
>>>>> -
>>>>>  	if (is_set_ckpt_flags(sbi, CP_QUOTA_NEED_FSCK_FLAG)) {
>>>>>  		f2fs_msg(sbi->sb, KERN_ERR,
>>>>>  			"quota sysfile may be corrupted, skip loading it");
>>>>> @@ -1958,7 +1971,9 @@ static int f2fs_quota_on(struct super_block *sb, int type, int format_id,
>>>>>  	if (err)
>>>>>  		return err;
>>>>>  
>>>>> +	set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_INIT);
>>>>>  	err = dquot_quota_on(sb, type, format_id, path);
>>>>> +	clear_sbi_flag(F2FS_SB(sb), SBI_QUOTA_INIT);
>>>>>  	if (err)
>>>>>  		return err;
>>>>>  
>>>>> @@ -3179,6 +3194,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>>>>>  		goto free_meta_inode;
>>>>>  	}
>>>>>  
>>>>> +	set_sbi_flag(sbi, SBI_QUOTA_INIT);
>>>>>  	if (__is_set_ckpt_flags(F2FS_CKPT(sbi), CP_QUOTA_NEED_FSCK_FLAG))
>>>>>  		set_sbi_flag(sbi, SBI_QUOTA_NEED_REPAIR);
>>>>>  
>>>>> @@ -3370,6 +3386,8 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>>>>>  				cur_cp_version(F2FS_CKPT(sbi)));
>>>>>  	f2fs_update_time(sbi, CP_TIME);
>>>>>  	f2fs_update_time(sbi, REQ_TIME);
>>>>> +
>>>>> +	clear_sbi_flag(sbi, SBI_QUOTA_INIT);
>>>>>  	return 0;
>>>>>  
>>>>>  free_meta:
>>>>> @@ -3434,6 +3452,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>>>>>  		shrink_dcache_sb(sb);
>>>>>  		goto try_onemore;
>>>>>  	}
>>>>> +	clear_sbi_flag(sbi, SBI_QUOTA_INIT);
>>>>>  	return err;
>>>>>  }
>>>>>  
>>>>> @@ -3449,6 +3468,7 @@ static void kill_f2fs_super(struct super_block *sb)
>>>>>  		struct f2fs_sb_info *sbi = F2FS_SB(sb);
>>>>>  
>>>>>  		set_sbi_flag(sbi, SBI_IS_CLOSE);
>>>>> +		set_sbi_flag(sbi, SBI_QUOTA_INIT);
>>>>>  		f2fs_stop_gc_thread(sbi);
>>>>>  		f2fs_stop_discard_thread(sbi);
>>>>>  
>>>>>
>>>
>>>
>>> _______________________________________________
>>> Linux-f2fs-devel mailing list
>>> Linux-f2fs-devel@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>>

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2018-10-01  1:37 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-11 20:15 [PATCH] f2fs: fix quota info to adjust recovered data Jaegeuk Kim
2018-09-11 23:51 ` [f2fs-dev] " Chao Yu
2018-09-12  0:06   ` Jaegeuk Kim
2018-09-12  0:27     ` Jaegeuk Kim
2018-09-12  1:13       ` Chao Yu
2018-09-12  1:25         ` Jaegeuk Kim
2018-09-12  1:40           ` Chao Yu
2018-09-12  1:46             ` Chao Yu
2018-09-12 19:54               ` Jaegeuk Kim
2018-09-12 23:28                 ` Chao Yu
2018-09-18  1:19                   ` Jaegeuk Kim
2018-09-18  1:38                     ` Chao Yu
2018-09-18  2:05                       ` Jaegeuk Kim
2018-09-18 10:13                         ` Chao Yu
2018-09-18 16:45                           ` Jaegeuk Kim
2018-09-19  1:38                             ` Chao Yu
2018-09-19 22:38                               ` Jaegeuk Kim
2018-09-20  9:46                                 ` Chao Yu
2018-09-20 21:42                                   ` Jaegeuk Kim
2018-09-21  7:48                                     ` Chao Yu
2018-09-26  0:29                                       ` Jaegeuk Kim
2018-09-26  1:21                                         ` Chao Yu
2018-09-26  1:44                                           ` Jaegeuk Kim
2018-09-26  2:06                                             ` Chao Yu
2018-09-26  2:09                                               ` Jaegeuk Kim
2018-09-26  2:14                                                 ` Chao Yu
2018-09-27  1:16                                                 ` Chao Yu
2018-09-28 17:37                                                   ` Jaegeuk Kim
2018-09-28 23:40                                                     ` Jaegeuk Kim
2018-09-29 10:38                                                       ` Chao Yu
2018-09-30 23:58                                                         ` Jaegeuk Kim
2018-10-01  0:35                                                           ` Chao Yu
2018-10-01  1:27                                                             ` Jaegeuk Kim
2018-10-01  1:37                                                               ` Chao Yu
2018-09-12  2:46             ` Jaegeuk Kim
2018-09-12  2:57               ` Chao Yu
2018-09-12 19:50                 ` Jaegeuk Kim
2018-09-12 23:30                   ` Chao Yu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).