linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] f2fs: fix to avoid data corruption by forbidding SSR overwrite
@ 2019-08-16  3:03 Chao Yu
  2019-08-19  6:30 ` Chao Yu
  2019-09-26 20:37 ` [f2fs-dev] " Eric Biggers
  0 siblings, 2 replies; 7+ messages in thread
From: Chao Yu @ 2019-08-16  3:03 UTC (permalink / raw)
  To: jaegeuk; +Cc: linux-f2fs-devel, linux-kernel, chao, Chao Yu

There is one case can cause data corruption.

- write 4k to fileA
- fsync fileA, 4k data is writebacked to lbaA
- write 4k to fileA
- kworker flushs 4k to lbaB; dnode contain lbaB didn't be persisted yet
- write 4k to fileB
- kworker flush 4k to lbaA due to SSR
- SPOR -> dnode with lbaA will be recovered, however lbaA contains fileB's
data

One solution is tracking all fsynced file's block history, and disallow
SSR overwrite on newly invalidated block on that file.

However, during recovery, no matter the dnode is flushed or fsynced, all
previous dnodes until last fsynced one in node chain can be recovered,
that means we need to record all block change in flushed dnode, which
will cause heavy cost, so let's just use simple fix by forbidding SSR
overwrite directly.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
---
 fs/f2fs/segment.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 9d9d9a050d59..69b3b553ee6b 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -2205,9 +2205,11 @@ static void update_sit_entry(struct f2fs_sb_info *sbi, block_t blkaddr, int del)
 		if (!f2fs_test_and_set_bit(offset, se->discard_map))
 			sbi->discard_blks--;
 
-		/* don't overwrite by SSR to keep node chain */
-		if (IS_NODESEG(se->type) &&
-				!is_sbi_flag_set(sbi, SBI_CP_DISABLED)) {
+		/*
+		 * SSR should never reuse block which is checkpointed
+		 * or newly invalidated.
+		 */
+		if (!is_sbi_flag_set(sbi, SBI_CP_DISABLED)) {
 			if (!f2fs_test_and_set_bit(offset, se->ckpt_valid_map))
 				se->ckpt_valid_blocks++;
 		}
-- 
2.18.0.rc1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] f2fs: fix to avoid data corruption by forbidding SSR overwrite
  2019-08-16  3:03 [PATCH] f2fs: fix to avoid data corruption by forbidding SSR overwrite Chao Yu
@ 2019-08-19  6:30 ` Chao Yu
  2019-08-20  1:00   ` Jaegeuk Kim
  2019-09-26 20:37 ` [f2fs-dev] " Eric Biggers
  1 sibling, 1 reply; 7+ messages in thread
From: Chao Yu @ 2019-08-19  6:30 UTC (permalink / raw)
  To: jaegeuk; +Cc: linux-f2fs-devel, linux-kernel, chao

On 2019/8/16 11:03, Chao Yu wrote:
> There is one case can cause data corruption.
> 
> - write 4k to fileA
> - fsync fileA, 4k data is writebacked to lbaA
> - write 4k to fileA
> - kworker flushs 4k to lbaB; dnode contain lbaB didn't be persisted yet
> - write 4k to fileB
> - kworker flush 4k to lbaA due to SSR
> - SPOR -> dnode with lbaA will be recovered, however lbaA contains fileB's
> data
> 
> One solution is tracking all fsynced file's block history, and disallow
> SSR overwrite on newly invalidated block on that file.
> 
> However, during recovery, no matter the dnode is flushed or fsynced, all
> previous dnodes until last fsynced one in node chain can be recovered,
> that means we need to record all block change in flushed dnode, which
> will cause heavy cost, so let's just use simple fix by forbidding SSR
> overwrite directly.
> 

Jaegeuk,

Please help to add below missed tag to keep this patch being merged in stable
kernel.

Fixes: 5b6c6be2d878 ("f2fs: use SSR for warm node as well")

Thanks,

> Signed-off-by: Chao Yu <yuchao0@huawei.com>
> ---
>  fs/f2fs/segment.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> index 9d9d9a050d59..69b3b553ee6b 100644
> --- a/fs/f2fs/segment.c
> +++ b/fs/f2fs/segment.c
> @@ -2205,9 +2205,11 @@ static void update_sit_entry(struct f2fs_sb_info *sbi, block_t blkaddr, int del)
>  		if (!f2fs_test_and_set_bit(offset, se->discard_map))
>  			sbi->discard_blks--;
>  
> -		/* don't overwrite by SSR to keep node chain */
> -		if (IS_NODESEG(se->type) &&
> -				!is_sbi_flag_set(sbi, SBI_CP_DISABLED)) {
> +		/*
> +		 * SSR should never reuse block which is checkpointed
> +		 * or newly invalidated.
> +		 */
> +		if (!is_sbi_flag_set(sbi, SBI_CP_DISABLED)) {
>  			if (!f2fs_test_and_set_bit(offset, se->ckpt_valid_map))
>  				se->ckpt_valid_blocks++;
>  		}
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] f2fs: fix to avoid data corruption by forbidding SSR overwrite
  2019-08-19  6:30 ` Chao Yu
@ 2019-08-20  1:00   ` Jaegeuk Kim
  2019-08-20  9:16     ` Chao Yu
  0 siblings, 1 reply; 7+ messages in thread
From: Jaegeuk Kim @ 2019-08-20  1:00 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-f2fs-devel, linux-kernel, chao

On 08/19, Chao Yu wrote:
> On 2019/8/16 11:03, Chao Yu wrote:
> > There is one case can cause data corruption.
> > 
> > - write 4k to fileA
> > - fsync fileA, 4k data is writebacked to lbaA
> > - write 4k to fileA
> > - kworker flushs 4k to lbaB; dnode contain lbaB didn't be persisted yet
> > - write 4k to fileB
> > - kworker flush 4k to lbaA due to SSR
> > - SPOR -> dnode with lbaA will be recovered, however lbaA contains fileB's
> > data
> > 
> > One solution is tracking all fsynced file's block history, and disallow
> > SSR overwrite on newly invalidated block on that file.
> > 
> > However, during recovery, no matter the dnode is flushed or fsynced, all
> > previous dnodes until last fsynced one in node chain can be recovered,
> > that means we need to record all block change in flushed dnode, which
> > will cause heavy cost, so let's just use simple fix by forbidding SSR
> > overwrite directly.
> > 
> 
> Jaegeuk,
> 
> Please help to add below missed tag to keep this patch being merged in stable
> kernel.
> 
> Fixes: 5b6c6be2d878 ("f2fs: use SSR for warm node as well")

Done.

> 
> Thanks,
> 
> > Signed-off-by: Chao Yu <yuchao0@huawei.com>
> > ---
> >  fs/f2fs/segment.c | 8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> > 
> > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> > index 9d9d9a050d59..69b3b553ee6b 100644
> > --- a/fs/f2fs/segment.c
> > +++ b/fs/f2fs/segment.c
> > @@ -2205,9 +2205,11 @@ static void update_sit_entry(struct f2fs_sb_info *sbi, block_t blkaddr, int del)
> >  		if (!f2fs_test_and_set_bit(offset, se->discard_map))
> >  			sbi->discard_blks--;
> >  
> > -		/* don't overwrite by SSR to keep node chain */
> > -		if (IS_NODESEG(se->type) &&
> > -				!is_sbi_flag_set(sbi, SBI_CP_DISABLED)) {
> > +		/*
> > +		 * SSR should never reuse block which is checkpointed
> > +		 * or newly invalidated.
> > +		 */
> > +		if (!is_sbi_flag_set(sbi, SBI_CP_DISABLED)) {
> >  			if (!f2fs_test_and_set_bit(offset, se->ckpt_valid_map))
> >  				se->ckpt_valid_blocks++;
> >  		}
> > 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] f2fs: fix to avoid data corruption by forbidding SSR overwrite
  2019-08-20  1:00   ` Jaegeuk Kim
@ 2019-08-20  9:16     ` Chao Yu
  0 siblings, 0 replies; 7+ messages in thread
From: Chao Yu @ 2019-08-20  9:16 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: linux-f2fs-devel, linux-kernel, chao

On 2019/8/20 9:00, Jaegeuk Kim wrote:
> On 08/19, Chao Yu wrote:
>> On 2019/8/16 11:03, Chao Yu wrote:
>>> There is one case can cause data corruption.
>>>
>>> - write 4k to fileA
>>> - fsync fileA, 4k data is writebacked to lbaA
>>> - write 4k to fileA
>>> - kworker flushs 4k to lbaB; dnode contain lbaB didn't be persisted yet
>>> - write 4k to fileB
>>> - kworker flush 4k to lbaA due to SSR
>>> - SPOR -> dnode with lbaA will be recovered, however lbaA contains fileB's
>>> data
>>>
>>> One solution is tracking all fsynced file's block history, and disallow
>>> SSR overwrite on newly invalidated block on that file.
>>>
>>> However, during recovery, no matter the dnode is flushed or fsynced, all
>>> previous dnodes until last fsynced one in node chain can be recovered,
>>> that means we need to record all block change in flushed dnode, which
>>> will cause heavy cost, so let's just use simple fix by forbidding SSR
>>> overwrite directly.
>>>
>>
>> Jaegeuk,
>>
>> Please help to add below missed tag to keep this patch being merged in stable
>> kernel.
>>
>> Fixes: 5b6c6be2d878 ("f2fs: use SSR for warm node as well")
> 
> Done.

Thanks! :)

Thanks,

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix to avoid data corruption by forbidding SSR overwrite
  2019-08-16  3:03 [PATCH] f2fs: fix to avoid data corruption by forbidding SSR overwrite Chao Yu
  2019-08-19  6:30 ` Chao Yu
@ 2019-09-26 20:37 ` Eric Biggers
  2019-09-26 21:46   ` Jaegeuk Kim
  1 sibling, 1 reply; 7+ messages in thread
From: Eric Biggers @ 2019-09-26 20:37 UTC (permalink / raw)
  To: Chao Yu; +Cc: jaegeuk, linux-kernel, linux-f2fs-devel

On Fri, Aug 16, 2019 at 11:03:34AM +0800, Chao Yu wrote:
> There is one case can cause data corruption.
> 
> - write 4k to fileA
> - fsync fileA, 4k data is writebacked to lbaA
> - write 4k to fileA
> - kworker flushs 4k to lbaB; dnode contain lbaB didn't be persisted yet
> - write 4k to fileB
> - kworker flush 4k to lbaA due to SSR
> - SPOR -> dnode with lbaA will be recovered, however lbaA contains fileB's
> data
> 
> One solution is tracking all fsynced file's block history, and disallow
> SSR overwrite on newly invalidated block on that file.
> 
> However, during recovery, no matter the dnode is flushed or fsynced, all
> previous dnodes until last fsynced one in node chain can be recovered,
> that means we need to record all block change in flushed dnode, which
> will cause heavy cost, so let's just use simple fix by forbidding SSR
> overwrite directly.
> 
> Signed-off-by: Chao Yu <yuchao0@huawei.com>
> ---
>  fs/f2fs/segment.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> index 9d9d9a050d59..69b3b553ee6b 100644
> --- a/fs/f2fs/segment.c
> +++ b/fs/f2fs/segment.c
> @@ -2205,9 +2205,11 @@ static void update_sit_entry(struct f2fs_sb_info *sbi, block_t blkaddr, int del)
>  		if (!f2fs_test_and_set_bit(offset, se->discard_map))
>  			sbi->discard_blks--;
>  
> -		/* don't overwrite by SSR to keep node chain */
> -		if (IS_NODESEG(se->type) &&
> -				!is_sbi_flag_set(sbi, SBI_CP_DISABLED)) {
> +		/*
> +		 * SSR should never reuse block which is checkpointed
> +		 * or newly invalidated.
> +		 */
> +		if (!is_sbi_flag_set(sbi, SBI_CP_DISABLED)) {
>  			if (!f2fs_test_and_set_bit(offset, se->ckpt_valid_map))
>  				se->ckpt_valid_blocks++;
>  		}
> -- 

FYI, this commit caused xfstests generic/064 to start failing:

$ kvm-xfstests -c f2fs generic/064
...
generic/064 3s ... 	[13:36:37][    5.946293] run fstests generic/064 at 2019-09-26 13:36:37
 [13:36:41]- output mismatch (see /results/f2fs/results-default/generic/064.out.bad)
    --- tests/generic/064.out	2019-09-18 04:53:46.000000000 -0700
    +++ /results/f2fs/results-default/generic/064.out.bad	2019-09-26 13:36:41.533018683 -0700
    @@ -1,2 +1,3 @@
     QA output created by 064
     Extent count after inserts is in range
    +extents mismatched before = 1 after = 50
    ...
    (Run 'diff -u /root/xfstests/tests/generic/064.out /results/f2fs/results-default/generic/064.out.bad'  to see the entire diff)
Ran: generic/064
Failures: generic/064
Failed 1 of 1 tests

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix to avoid data corruption by forbidding SSR overwrite
  2019-09-26 20:37 ` [f2fs-dev] " Eric Biggers
@ 2019-09-26 21:46   ` Jaegeuk Kim
  2019-09-27  0:32     ` Jaegeuk Kim
  0 siblings, 1 reply; 7+ messages in thread
From: Jaegeuk Kim @ 2019-09-26 21:46 UTC (permalink / raw)
  To: Chao Yu, linux-kernel, linux-f2fs-devel

On 09/26, Eric Biggers wrote:
> On Fri, Aug 16, 2019 at 11:03:34AM +0800, Chao Yu wrote:
> > There is one case can cause data corruption.
> > 
> > - write 4k to fileA
> > - fsync fileA, 4k data is writebacked to lbaA
> > - write 4k to fileA
> > - kworker flushs 4k to lbaB; dnode contain lbaB didn't be persisted yet
> > - write 4k to fileB
> > - kworker flush 4k to lbaA due to SSR
> > - SPOR -> dnode with lbaA will be recovered, however lbaA contains fileB's
> > data
> > 
> > One solution is tracking all fsynced file's block history, and disallow
> > SSR overwrite on newly invalidated block on that file.
> > 
> > However, during recovery, no matter the dnode is flushed or fsynced, all
> > previous dnodes until last fsynced one in node chain can be recovered,
> > that means we need to record all block change in flushed dnode, which
> > will cause heavy cost, so let's just use simple fix by forbidding SSR
> > overwrite directly.
> > 
> > Signed-off-by: Chao Yu <yuchao0@huawei.com>
> > ---
> >  fs/f2fs/segment.c | 8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> > 
> > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> > index 9d9d9a050d59..69b3b553ee6b 100644
> > --- a/fs/f2fs/segment.c
> > +++ b/fs/f2fs/segment.c
> > @@ -2205,9 +2205,11 @@ static void update_sit_entry(struct f2fs_sb_info *sbi, block_t blkaddr, int del)
> >  		if (!f2fs_test_and_set_bit(offset, se->discard_map))
> >  			sbi->discard_blks--;
> >  
> > -		/* don't overwrite by SSR to keep node chain */
> > -		if (IS_NODESEG(se->type) &&
> > -				!is_sbi_flag_set(sbi, SBI_CP_DISABLED)) {
> > +		/*
> > +		 * SSR should never reuse block which is checkpointed
> > +		 * or newly invalidated.
> > +		 */
> > +		if (!is_sbi_flag_set(sbi, SBI_CP_DISABLED)) {
> >  			if (!f2fs_test_and_set_bit(offset, se->ckpt_valid_map))
> >  				se->ckpt_valid_blocks++;
> >  		}
> > -- 
> 
> FYI, this commit caused xfstests generic/064 to start failing:

Yup, I was looking at this.

> 
> $ kvm-xfstests -c f2fs generic/064
> ...
> generic/064 3s ... 	[13:36:37][    5.946293] run fstests generic/064 at 2019-09-26 13:36:37
>  [13:36:41]- output mismatch (see /results/f2fs/results-default/generic/064.out.bad)
>     --- tests/generic/064.out	2019-09-18 04:53:46.000000000 -0700
>     +++ /results/f2fs/results-default/generic/064.out.bad	2019-09-26 13:36:41.533018683 -0700
>     @@ -1,2 +1,3 @@
>      QA output created by 064
>      Extent count after inserts is in range
>     +extents mismatched before = 1 after = 50
>     ...
>     (Run 'diff -u /root/xfstests/tests/generic/064.out /results/f2fs/results-default/generic/064.out.bad'  to see the entire diff)
> Ran: generic/064
> Failures: generic/064
> Failed 1 of 1 tests

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [f2fs-dev] [PATCH] f2fs: fix to avoid data corruption by forbidding SSR overwrite
  2019-09-26 21:46   ` Jaegeuk Kim
@ 2019-09-27  0:32     ` Jaegeuk Kim
  0 siblings, 0 replies; 7+ messages in thread
From: Jaegeuk Kim @ 2019-09-27  0:32 UTC (permalink / raw)
  To: Chao Yu, linux-kernel, linux-f2fs-devel

On 09/26, Jaegeuk Kim wrote:
> On 09/26, Eric Biggers wrote:
> > On Fri, Aug 16, 2019 at 11:03:34AM +0800, Chao Yu wrote:
> > > There is one case can cause data corruption.
> > > 
> > > - write 4k to fileA
> > > - fsync fileA, 4k data is writebacked to lbaA
> > > - write 4k to fileA
> > > - kworker flushs 4k to lbaB; dnode contain lbaB didn't be persisted yet
> > > - write 4k to fileB
> > > - kworker flush 4k to lbaA due to SSR
> > > - SPOR -> dnode with lbaA will be recovered, however lbaA contains fileB's
> > > data
> > > 
> > > One solution is tracking all fsynced file's block history, and disallow
> > > SSR overwrite on newly invalidated block on that file.
> > > 
> > > However, during recovery, no matter the dnode is flushed or fsynced, all
> > > previous dnodes until last fsynced one in node chain can be recovered,
> > > that means we need to record all block change in flushed dnode, which
> > > will cause heavy cost, so let's just use simple fix by forbidding SSR
> > > overwrite directly.
> > > 
> > > Signed-off-by: Chao Yu <yuchao0@huawei.com>
> > > ---
> > >  fs/f2fs/segment.c | 8 +++++---
> > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> > > index 9d9d9a050d59..69b3b553ee6b 100644
> > > --- a/fs/f2fs/segment.c
> > > +++ b/fs/f2fs/segment.c
> > > @@ -2205,9 +2205,11 @@ static void update_sit_entry(struct f2fs_sb_info *sbi, block_t blkaddr, int del)
> > >  		if (!f2fs_test_and_set_bit(offset, se->discard_map))
> > >  			sbi->discard_blks--;
> > >  
> > > -		/* don't overwrite by SSR to keep node chain */
> > > -		if (IS_NODESEG(se->type) &&
> > > -				!is_sbi_flag_set(sbi, SBI_CP_DISABLED)) {
> > > +		/*
> > > +		 * SSR should never reuse block which is checkpointed
> > > +		 * or newly invalidated.
> > > +		 */
> > > +		if (!is_sbi_flag_set(sbi, SBI_CP_DISABLED)) {
> > >  			if (!f2fs_test_and_set_bit(offset, se->ckpt_valid_map))
> > >  				se->ckpt_valid_blocks++;
> > >  		}
> > > -- 
> > 
> > FYI, this commit caused xfstests generic/064 to start failing:
> 
> Yup, I was looking at this.

It seems fcollapse couldn't allocate blocks sequential when rewriting blocks.

We need to adjust like this:
---
 tests/generic/064 | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tests/generic/064 b/tests/generic/064
index 1ace14b6..058258d5 100755
--- a/tests/generic/064
+++ b/tests/generic/064
@@ -72,7 +72,9 @@ done
 
 extent_after=`_count_extents $dest`
 if [ $extent_before -ne $extent_after ]; then
-	echo "extents mismatched before = $extent_before after = $extent_after"
+	if [ "$FSTYP" != "f2fs" ] || [ $extent_before -ne 1 ] || [ $extent_after -ne 50 ]; then
+		echo "extents mismatched before = $extent_before after = $extent_after"
+	fi
 fi
 
 # compare original file and test file.
-- 
2.19.0.605.g01d371f741-goog


^ permalink raw reply related	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-09-27  0:32 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-16  3:03 [PATCH] f2fs: fix to avoid data corruption by forbidding SSR overwrite Chao Yu
2019-08-19  6:30 ` Chao Yu
2019-08-20  1:00   ` Jaegeuk Kim
2019-08-20  9:16     ` Chao Yu
2019-09-26 20:37 ` [f2fs-dev] " Eric Biggers
2019-09-26 21:46   ` Jaegeuk Kim
2019-09-27  0:32     ` Jaegeuk Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).