Linux-f2fs-devel Archive on lore.kernel.org
 help / color / Atom feed
* [f2fs-dev] [PATCH 1/2] f2fs: do not select same victim right again
@ 2019-09-09  1:25 Jaegeuk Kim
  2019-09-09  1:25 ` [f2fs-dev] [PATCH 2/2] f2fs: avoid infinite GC loop due to stale atomic files Jaegeuk Kim
  2019-09-09  2:56 ` [f2fs-dev] [PATCH 1/2] f2fs: do not select same victim right again Chao Yu
  0 siblings, 2 replies; 32+ messages in thread
From: Jaegeuk Kim @ 2019-09-09  1:25 UTC (permalink / raw)
  To: linux-kernel, linux-f2fs-devel; +Cc: Jaegeuk Kim

GC must avoid select the same victim again.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 fs/f2fs/gc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index e88f98ddf396..15ca8bbb0b22 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -274,6 +274,9 @@ static unsigned int get_cb_cost(struct f2fs_sb_info *sbi, unsigned int segno)
 static inline unsigned int get_gc_cost(struct f2fs_sb_info *sbi,
 			unsigned int segno, struct victim_sel_policy *p)
 {
+	if (sbi->cur_victim_sec == GET_SEC_FROM_SEG(sbi, segno))
+		return UINT_MAX;
+
 	if (p->alloc_mode == SSR)
 		return get_seg_entry(sbi, segno)->ckpt_valid_blocks;
 
@@ -1326,9 +1329,6 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
 		round++;
 	}
 
-	if (gc_type == FG_GC)
-		sbi->cur_victim_sec = NULL_SEGNO;
-
 	if (sync)
 		goto stop;
 
-- 
2.19.0.605.g01d371f741-goog



_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [f2fs-dev] [PATCH 2/2] f2fs: avoid infinite GC loop due to stale atomic files
  2019-09-09  1:25 [f2fs-dev] [PATCH 1/2] f2fs: do not select same victim right again Jaegeuk Kim
@ 2019-09-09  1:25 ` Jaegeuk Kim
  2019-09-09  3:03   ` Chao Yu
  2019-09-09  2:56 ` [f2fs-dev] [PATCH 1/2] f2fs: do not select same victim right again Chao Yu
  1 sibling, 1 reply; 32+ messages in thread
From: Jaegeuk Kim @ 2019-09-09  1:25 UTC (permalink / raw)
  To: linux-kernel, linux-f2fs-devel; +Cc: Jaegeuk Kim

If committing atomic pages is failed when doing f2fs_do_sync_file(), we can
get commited pages but atomic_file being still set like:

- inmem:    0, atomic IO:    4 (Max.   10), volatile IO:    0 (Max.    0)

If GC selects this block, we can get an infinite loop like this:

f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c

In that moment, we can observe:

[Before]
Try to move 5084219 blocks (BG: 384508)
  - data blocks : 4962373 (274483)
  - node blocks : 121846 (110025)
Skipped : atomic write 4534686 (10)

[After]
Try to move 5088973 blocks (BG: 384508)
  - data blocks : 4967127 (274483)
  - node blocks : 121846 (110025)
Skipped : atomic write 4539440 (10)

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 fs/f2fs/file.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 7ae2f3bd8c2f..68b6da734e5f 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -1997,11 +1997,11 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
 			goto err_out;
 
 		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
-		if (!ret) {
-			clear_inode_flag(inode, FI_ATOMIC_FILE);
-			F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
-			stat_dec_atomic_write(inode);
-		}
+
+		/* doesn't need to check error */
+		clear_inode_flag(inode, FI_ATOMIC_FILE);
+		F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
+		stat_dec_atomic_write(inode);
 	} else {
 		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
 	}
-- 
2.19.0.605.g01d371f741-goog



_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: do not select same victim right again
  2019-09-09  1:25 [f2fs-dev] [PATCH 1/2] f2fs: do not select same victim right again Jaegeuk Kim
  2019-09-09  1:25 ` [f2fs-dev] [PATCH 2/2] f2fs: avoid infinite GC loop due to stale atomic files Jaegeuk Kim
@ 2019-09-09  2:56 ` Chao Yu
  2019-09-09  8:06   ` Jaegeuk Kim
  1 sibling, 1 reply; 32+ messages in thread
From: Chao Yu @ 2019-09-09  2:56 UTC (permalink / raw)
  To: Jaegeuk Kim, linux-kernel, linux-f2fs-devel

On 2019/9/9 9:25, Jaegeuk Kim wrote:
> GC must avoid select the same victim again.

Blocks in previous victim will occupy addition free segment, I doubt after this
change, FGGC may encounter out-of-free space issue more frequently.

Thanks,

> 
> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> ---
>  fs/f2fs/gc.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> index e88f98ddf396..15ca8bbb0b22 100644
> --- a/fs/f2fs/gc.c
> +++ b/fs/f2fs/gc.c
> @@ -274,6 +274,9 @@ static unsigned int get_cb_cost(struct f2fs_sb_info *sbi, unsigned int segno)
>  static inline unsigned int get_gc_cost(struct f2fs_sb_info *sbi,
>  			unsigned int segno, struct victim_sel_policy *p)
>  {
> +	if (sbi->cur_victim_sec == GET_SEC_FROM_SEG(sbi, segno))
> +		return UINT_MAX;
> +
>  	if (p->alloc_mode == SSR)
>  		return get_seg_entry(sbi, segno)->ckpt_valid_blocks;
>  
> @@ -1326,9 +1329,6 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
>  		round++;
>  	}
>  
> -	if (gc_type == FG_GC)
> -		sbi->cur_victim_sec = NULL_SEGNO;
> -
>  	if (sync)
>  		goto stop;
>  
> 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 2/2] f2fs: avoid infinite GC loop due to stale atomic files
  2019-09-09  1:25 ` [f2fs-dev] [PATCH 2/2] f2fs: avoid infinite GC loop due to stale atomic files Jaegeuk Kim
@ 2019-09-09  3:03   ` Chao Yu
  2019-09-09  7:30     ` Jaegeuk Kim
  0 siblings, 1 reply; 32+ messages in thread
From: Chao Yu @ 2019-09-09  3:03 UTC (permalink / raw)
  To: Jaegeuk Kim, linux-kernel, linux-f2fs-devel

On 2019/9/9 9:25, Jaegeuk Kim wrote:
> If committing atomic pages is failed when doing f2fs_do_sync_file(), we can
> get commited pages but atomic_file being still set like:
> 
> - inmem:    0, atomic IO:    4 (Max.   10), volatile IO:    0 (Max.    0)
> 
> If GC selects this block, we can get an infinite loop like this:
> 
> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> 
> In that moment, we can observe:
> 
> [Before]
> Try to move 5084219 blocks (BG: 384508)
>   - data blocks : 4962373 (274483)
>   - node blocks : 121846 (110025)
> Skipped : atomic write 4534686 (10)
> 
> [After]
> Try to move 5088973 blocks (BG: 384508)
>   - data blocks : 4967127 (274483)
>   - node blocks : 121846 (110025)
> Skipped : atomic write 4539440 (10)
> 
> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> ---
>  fs/f2fs/file.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> index 7ae2f3bd8c2f..68b6da734e5f 100644
> --- a/fs/f2fs/file.c
> +++ b/fs/f2fs/file.c
> @@ -1997,11 +1997,11 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
>  			goto err_out;
>  
>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
> -		if (!ret) {
> -			clear_inode_flag(inode, FI_ATOMIC_FILE);
> -			F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
> -			stat_dec_atomic_write(inode);
> -		}
> +
> +		/* doesn't need to check error */
> +		clear_inode_flag(inode, FI_ATOMIC_FILE);
> +		F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
> +		stat_dec_atomic_write(inode);

If there are still valid atomic write pages linked in .inmem_pages, it may cause
memory leak when we just clear FI_ATOMIC_FILE flag.

So my question is why below logic didn't handle such condition well?

f2fs_gc()

	if (has_not_enough_free_secs(sbi, sec_freed, 0)) {
		if (skipped_round <= MAX_SKIP_GC_COUNT ||
					skipped_round * 2 < round) {
			segno = NULL_SEGNO;
			goto gc_more;
		}

		if (first_skipped < last_skipped &&
				(last_skipped - first_skipped) >
						sbi->skipped_gc_rwsem) {
			f2fs_drop_inmem_pages_all(sbi, true);
			segno = NULL_SEGNO;
			goto gc_more;
		}
		if (gc_type == FG_GC && !is_sbi_flag_set(sbi, SBI_CP_DISABLED))
			ret = f2fs_write_checkpoint(sbi, &cpc);
	}

>  	} else {
>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
>  	}
> 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 2/2] f2fs: avoid infinite GC loop due to stale atomic files
  2019-09-09  3:03   ` Chao Yu
@ 2019-09-09  7:30     ` Jaegeuk Kim
  2019-09-09  7:54       ` Chao Yu
  0 siblings, 1 reply; 32+ messages in thread
From: Jaegeuk Kim @ 2019-09-09  7:30 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-kernel, linux-f2fs-devel

On 09/09, Chao Yu wrote:
> On 2019/9/9 9:25, Jaegeuk Kim wrote:
> > If committing atomic pages is failed when doing f2fs_do_sync_file(), we can
> > get commited pages but atomic_file being still set like:
> > 
> > - inmem:    0, atomic IO:    4 (Max.   10), volatile IO:    0 (Max.    0)
> > 
> > If GC selects this block, we can get an infinite loop like this:
> > 
> > f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> > f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> > f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> > f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> > f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> > f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> > f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> > f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> > 
> > In that moment, we can observe:
> > 
> > [Before]
> > Try to move 5084219 blocks (BG: 384508)
> >   - data blocks : 4962373 (274483)
> >   - node blocks : 121846 (110025)
> > Skipped : atomic write 4534686 (10)
> > 
> > [After]
> > Try to move 5088973 blocks (BG: 384508)
> >   - data blocks : 4967127 (274483)
> >   - node blocks : 121846 (110025)
> > Skipped : atomic write 4539440 (10)
> > 
> > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> > ---
> >  fs/f2fs/file.c | 10 +++++-----
> >  1 file changed, 5 insertions(+), 5 deletions(-)
> > 
> > diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> > index 7ae2f3bd8c2f..68b6da734e5f 100644
> > --- a/fs/f2fs/file.c
> > +++ b/fs/f2fs/file.c
> > @@ -1997,11 +1997,11 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
> >  			goto err_out;
> >  
> >  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
> > -		if (!ret) {
> > -			clear_inode_flag(inode, FI_ATOMIC_FILE);
> > -			F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
> > -			stat_dec_atomic_write(inode);
> > -		}
> > +
> > +		/* doesn't need to check error */
> > +		clear_inode_flag(inode, FI_ATOMIC_FILE);
> > +		F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
> > +		stat_dec_atomic_write(inode);
> 
> If there are still valid atomic write pages linked in .inmem_pages, it may cause
> memory leak when we just clear FI_ATOMIC_FILE flag.

f2fs_commit_inmem_pages() should have flushed them.

> 
> So my question is why below logic didn't handle such condition well?
> 
> f2fs_gc()
> 
> 	if (has_not_enough_free_secs(sbi, sec_freed, 0)) {
> 		if (skipped_round <= MAX_SKIP_GC_COUNT ||
> 					skipped_round * 2 < round) {
> 			segno = NULL_SEGNO;
> 			goto gc_more;
> 		}
> 
> 		if (first_skipped < last_skipped &&
> 				(last_skipped - first_skipped) >
> 						sbi->skipped_gc_rwsem) {
> 			f2fs_drop_inmem_pages_all(sbi, true);

This is doing nothing, since f2fs_commit_inmem_pages() removed the inode
from inmem list.

> 			segno = NULL_SEGNO;
> 			goto gc_more;
> 		}
> 		if (gc_type == FG_GC && !is_sbi_flag_set(sbi, SBI_CP_DISABLED))
> 			ret = f2fs_write_checkpoint(sbi, &cpc);
> 	}
> 
> >  	} else {
> >  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
> >  	}
> > 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 2/2] f2fs: avoid infinite GC loop due to stale atomic files
  2019-09-09  7:30     ` Jaegeuk Kim
@ 2019-09-09  7:54       ` Chao Yu
  2019-09-09  8:01         ` Jaegeuk Kim
  0 siblings, 1 reply; 32+ messages in thread
From: Chao Yu @ 2019-09-09  7:54 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: linux-kernel, linux-f2fs-devel

On 2019/9/9 15:30, Jaegeuk Kim wrote:
> On 09/09, Chao Yu wrote:
>> On 2019/9/9 9:25, Jaegeuk Kim wrote:
>>> If committing atomic pages is failed when doing f2fs_do_sync_file(), we can
>>> get commited pages but atomic_file being still set like:
>>>
>>> - inmem:    0, atomic IO:    4 (Max.   10), volatile IO:    0 (Max.    0)
>>>
>>> If GC selects this block, we can get an infinite loop like this:
>>>
>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
>>>
>>> In that moment, we can observe:
>>>
>>> [Before]
>>> Try to move 5084219 blocks (BG: 384508)
>>>   - data blocks : 4962373 (274483)
>>>   - node blocks : 121846 (110025)
>>> Skipped : atomic write 4534686 (10)
>>>
>>> [After]
>>> Try to move 5088973 blocks (BG: 384508)
>>>   - data blocks : 4967127 (274483)
>>>   - node blocks : 121846 (110025)
>>> Skipped : atomic write 4539440 (10)
>>>
>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>> ---
>>>  fs/f2fs/file.c | 10 +++++-----
>>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
>>> index 7ae2f3bd8c2f..68b6da734e5f 100644
>>> --- a/fs/f2fs/file.c
>>> +++ b/fs/f2fs/file.c
>>> @@ -1997,11 +1997,11 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
>>>  			goto err_out;
>>>  
>>>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
>>> -		if (!ret) {
>>> -			clear_inode_flag(inode, FI_ATOMIC_FILE);
>>> -			F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
>>> -			stat_dec_atomic_write(inode);
>>> -		}
>>> +
>>> +		/* doesn't need to check error */
>>> +		clear_inode_flag(inode, FI_ATOMIC_FILE);
>>> +		F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
>>> +		stat_dec_atomic_write(inode);
>>
>> If there are still valid atomic write pages linked in .inmem_pages, it may cause
>> memory leak when we just clear FI_ATOMIC_FILE flag.
> 
> f2fs_commit_inmem_pages() should have flushed them.

Oh, we failed to flush its nodes.

However we won't clear such info if we failed to flush inmen pages, it looks
inconsistent.

Any interface needed to drop inmem pages or clear ATOMIC_FILE flag in that two
error path? I'm not very clear how sqlite handle such error.

Thanks,

> 
>>
>> So my question is why below logic didn't handle such condition well?
>>
>> f2fs_gc()
>>
>> 	if (has_not_enough_free_secs(sbi, sec_freed, 0)) {
>> 		if (skipped_round <= MAX_SKIP_GC_COUNT ||
>> 					skipped_round * 2 < round) {
>> 			segno = NULL_SEGNO;
>> 			goto gc_more;
>> 		}
>>
>> 		if (first_skipped < last_skipped &&
>> 				(last_skipped - first_skipped) >
>> 						sbi->skipped_gc_rwsem) {
>> 			f2fs_drop_inmem_pages_all(sbi, true);
> 
> This is doing nothing, since f2fs_commit_inmem_pages() removed the inode
> from inmem list.
> 
>> 			segno = NULL_SEGNO;
>> 			goto gc_more;
>> 		}
>> 		if (gc_type == FG_GC && !is_sbi_flag_set(sbi, SBI_CP_DISABLED))
>> 			ret = f2fs_write_checkpoint(sbi, &cpc);
>> 	}
>>
>>>  	} else {
>>>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
>>>  	}
>>>
> .
> 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 2/2] f2fs: avoid infinite GC loop due to stale atomic files
  2019-09-09  7:54       ` Chao Yu
@ 2019-09-09  8:01         ` Jaegeuk Kim
  2019-09-09  8:05           ` Chao Yu
  0 siblings, 1 reply; 32+ messages in thread
From: Jaegeuk Kim @ 2019-09-09  8:01 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-kernel, linux-f2fs-devel

On 09/09, Chao Yu wrote:
> On 2019/9/9 15:30, Jaegeuk Kim wrote:
> > On 09/09, Chao Yu wrote:
> >> On 2019/9/9 9:25, Jaegeuk Kim wrote:
> >>> If committing atomic pages is failed when doing f2fs_do_sync_file(), we can
> >>> get commited pages but atomic_file being still set like:
> >>>
> >>> - inmem:    0, atomic IO:    4 (Max.   10), volatile IO:    0 (Max.    0)
> >>>
> >>> If GC selects this block, we can get an infinite loop like this:
> >>>
> >>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> >>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> >>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> >>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> >>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> >>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> >>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> >>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> >>>
> >>> In that moment, we can observe:
> >>>
> >>> [Before]
> >>> Try to move 5084219 blocks (BG: 384508)
> >>>   - data blocks : 4962373 (274483)
> >>>   - node blocks : 121846 (110025)
> >>> Skipped : atomic write 4534686 (10)
> >>>
> >>> [After]
> >>> Try to move 5088973 blocks (BG: 384508)
> >>>   - data blocks : 4967127 (274483)
> >>>   - node blocks : 121846 (110025)
> >>> Skipped : atomic write 4539440 (10)
> >>>
> >>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> >>> ---
> >>>  fs/f2fs/file.c | 10 +++++-----
> >>>  1 file changed, 5 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> >>> index 7ae2f3bd8c2f..68b6da734e5f 100644
> >>> --- a/fs/f2fs/file.c
> >>> +++ b/fs/f2fs/file.c
> >>> @@ -1997,11 +1997,11 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
> >>>  			goto err_out;
> >>>  
> >>>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
> >>> -		if (!ret) {
> >>> -			clear_inode_flag(inode, FI_ATOMIC_FILE);
> >>> -			F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
> >>> -			stat_dec_atomic_write(inode);
> >>> -		}
> >>> +
> >>> +		/* doesn't need to check error */
> >>> +		clear_inode_flag(inode, FI_ATOMIC_FILE);
> >>> +		F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
> >>> +		stat_dec_atomic_write(inode);
> >>
> >> If there are still valid atomic write pages linked in .inmem_pages, it may cause
> >> memory leak when we just clear FI_ATOMIC_FILE flag.
> > 
> > f2fs_commit_inmem_pages() should have flushed them.
> 
> Oh, we failed to flush its nodes.
> 
> However we won't clear such info if we failed to flush inmen pages, it looks
> inconsistent.
> 
> Any interface needed to drop inmem pages or clear ATOMIC_FILE flag in that two
> error path? I'm not very clear how sqlite handle such error.

f2fs_drop_inmem_pages() did that, but not in this case.

> 
> Thanks,
> 
> > 
> >>
> >> So my question is why below logic didn't handle such condition well?
> >>
> >> f2fs_gc()
> >>
> >> 	if (has_not_enough_free_secs(sbi, sec_freed, 0)) {
> >> 		if (skipped_round <= MAX_SKIP_GC_COUNT ||
> >> 					skipped_round * 2 < round) {
> >> 			segno = NULL_SEGNO;
> >> 			goto gc_more;
> >> 		}
> >>
> >> 		if (first_skipped < last_skipped &&
> >> 				(last_skipped - first_skipped) >
> >> 						sbi->skipped_gc_rwsem) {
> >> 			f2fs_drop_inmem_pages_all(sbi, true);
> > 
> > This is doing nothing, since f2fs_commit_inmem_pages() removed the inode
> > from inmem list.
> > 
> >> 			segno = NULL_SEGNO;
> >> 			goto gc_more;
> >> 		}
> >> 		if (gc_type == FG_GC && !is_sbi_flag_set(sbi, SBI_CP_DISABLED))
> >> 			ret = f2fs_write_checkpoint(sbi, &cpc);
> >> 	}
> >>
> >>>  	} else {
> >>>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
> >>>  	}
> >>>
> > .
> > 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 2/2] f2fs: avoid infinite GC loop due to stale atomic files
  2019-09-09  8:01         ` Jaegeuk Kim
@ 2019-09-09  8:05           ` Chao Yu
  2019-09-09  8:21             ` Jaegeuk Kim
  0 siblings, 1 reply; 32+ messages in thread
From: Chao Yu @ 2019-09-09  8:05 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: linux-kernel, linux-f2fs-devel

On 2019/9/9 16:01, Jaegeuk Kim wrote:
> On 09/09, Chao Yu wrote:
>> On 2019/9/9 15:30, Jaegeuk Kim wrote:
>>> On 09/09, Chao Yu wrote:
>>>> On 2019/9/9 9:25, Jaegeuk Kim wrote:
>>>>> If committing atomic pages is failed when doing f2fs_do_sync_file(), we can
>>>>> get commited pages but atomic_file being still set like:
>>>>>
>>>>> - inmem:    0, atomic IO:    4 (Max.   10), volatile IO:    0 (Max.    0)
>>>>>
>>>>> If GC selects this block, we can get an infinite loop like this:
>>>>>
>>>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
>>>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
>>>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
>>>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
>>>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
>>>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
>>>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
>>>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
>>>>>
>>>>> In that moment, we can observe:
>>>>>
>>>>> [Before]
>>>>> Try to move 5084219 blocks (BG: 384508)
>>>>>   - data blocks : 4962373 (274483)
>>>>>   - node blocks : 121846 (110025)
>>>>> Skipped : atomic write 4534686 (10)
>>>>>
>>>>> [After]
>>>>> Try to move 5088973 blocks (BG: 384508)
>>>>>   - data blocks : 4967127 (274483)
>>>>>   - node blocks : 121846 (110025)
>>>>> Skipped : atomic write 4539440 (10)
>>>>>
>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>>>> ---
>>>>>  fs/f2fs/file.c | 10 +++++-----
>>>>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
>>>>> index 7ae2f3bd8c2f..68b6da734e5f 100644
>>>>> --- a/fs/f2fs/file.c
>>>>> +++ b/fs/f2fs/file.c
>>>>> @@ -1997,11 +1997,11 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
>>>>>  			goto err_out;
>>>>>  
>>>>>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
>>>>> -		if (!ret) {
>>>>> -			clear_inode_flag(inode, FI_ATOMIC_FILE);
>>>>> -			F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
>>>>> -			stat_dec_atomic_write(inode);
>>>>> -		}
>>>>> +
>>>>> +		/* doesn't need to check error */
>>>>> +		clear_inode_flag(inode, FI_ATOMIC_FILE);
>>>>> +		F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
>>>>> +		stat_dec_atomic_write(inode);
>>>>
>>>> If there are still valid atomic write pages linked in .inmem_pages, it may cause
>>>> memory leak when we just clear FI_ATOMIC_FILE flag.
>>>
>>> f2fs_commit_inmem_pages() should have flushed them.
>>
>> Oh, we failed to flush its nodes.
>>
>> However we won't clear such info if we failed to flush inmen pages, it looks
>> inconsistent.
>>
>> Any interface needed to drop inmem pages or clear ATOMIC_FILE flag in that two
>> error path? I'm not very clear how sqlite handle such error.
> 
> f2fs_drop_inmem_pages() did that, but not in this case.

What I mean is, for any error returned from atomic_commit() interface, should
userspace application handle it with consistent way, like trigger
f2fs_drop_inmem_pages(), so we don't need to handle it inside atomic_commit().

> 
>>
>> Thanks,
>>
>>>
>>>>
>>>> So my question is why below logic didn't handle such condition well?
>>>>
>>>> f2fs_gc()
>>>>
>>>> 	if (has_not_enough_free_secs(sbi, sec_freed, 0)) {
>>>> 		if (skipped_round <= MAX_SKIP_GC_COUNT ||
>>>> 					skipped_round * 2 < round) {
>>>> 			segno = NULL_SEGNO;
>>>> 			goto gc_more;
>>>> 		}
>>>>
>>>> 		if (first_skipped < last_skipped &&
>>>> 				(last_skipped - first_skipped) >
>>>> 						sbi->skipped_gc_rwsem) {
>>>> 			f2fs_drop_inmem_pages_all(sbi, true);
>>>
>>> This is doing nothing, since f2fs_commit_inmem_pages() removed the inode
>>> from inmem list.
>>>
>>>> 			segno = NULL_SEGNO;
>>>> 			goto gc_more;
>>>> 		}
>>>> 		if (gc_type == FG_GC && !is_sbi_flag_set(sbi, SBI_CP_DISABLED))
>>>> 			ret = f2fs_write_checkpoint(sbi, &cpc);
>>>> 	}
>>>>
>>>>>  	} else {
>>>>>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
>>>>>  	}
>>>>>
>>> .
>>>
> .
> 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: do not select same victim right again
  2019-09-09  2:56 ` [f2fs-dev] [PATCH 1/2] f2fs: do not select same victim right again Chao Yu
@ 2019-09-09  8:06   ` Jaegeuk Kim
  2019-09-09 11:32     ` Chao Yu
  0 siblings, 1 reply; 32+ messages in thread
From: Jaegeuk Kim @ 2019-09-09  8:06 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-kernel, linux-f2fs-devel

On 09/09, Chao Yu wrote:
> On 2019/9/9 9:25, Jaegeuk Kim wrote:
> > GC must avoid select the same victim again.
> 
> Blocks in previous victim will occupy addition free segment, I doubt after this
> change, FGGC may encounter out-of-free space issue more frequently.

Hmm, actually this change seems wrong by sec_usage_check().
We may be able to avoid this only in the suspicious loop?

---
 fs/f2fs/gc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index e88f98ddf396..5877bd729689 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -1326,7 +1326,7 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
 		round++;
 	}
 
-	if (gc_type == FG_GC)
+	if (gc_type == FG_GC && seg_freed)
 		sbi->cur_victim_sec = NULL_SEGNO;
 
 	if (sync)
-- 
2.19.0.605.g01d371f741-goog



_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 2/2] f2fs: avoid infinite GC loop due to stale atomic files
  2019-09-09  8:05           ` Chao Yu
@ 2019-09-09  8:21             ` Jaegeuk Kim
  2019-09-09  8:27               ` Chao Yu
  0 siblings, 1 reply; 32+ messages in thread
From: Jaegeuk Kim @ 2019-09-09  8:21 UTC (permalink / raw)
  To: Chao Yu, g; +Cc: linux-kernel, linux-f2fs-devel

On 09/09, Chao Yu wrote:
> On 2019/9/9 16:01, Jaegeuk Kim wrote:
> > On 09/09, Chao Yu wrote:
> >> On 2019/9/9 15:30, Jaegeuk Kim wrote:
> >>> On 09/09, Chao Yu wrote:
> >>>> On 2019/9/9 9:25, Jaegeuk Kim wrote:
> >>>>> If committing atomic pages is failed when doing f2fs_do_sync_file(), we can
> >>>>> get commited pages but atomic_file being still set like:
> >>>>>
> >>>>> - inmem:    0, atomic IO:    4 (Max.   10), volatile IO:    0 (Max.    0)
> >>>>>
> >>>>> If GC selects this block, we can get an infinite loop like this:
> >>>>>
> >>>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> >>>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> >>>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> >>>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> >>>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> >>>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> >>>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> >>>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> >>>>>
> >>>>> In that moment, we can observe:
> >>>>>
> >>>>> [Before]
> >>>>> Try to move 5084219 blocks (BG: 384508)
> >>>>>   - data blocks : 4962373 (274483)
> >>>>>   - node blocks : 121846 (110025)
> >>>>> Skipped : atomic write 4534686 (10)
> >>>>>
> >>>>> [After]
> >>>>> Try to move 5088973 blocks (BG: 384508)
> >>>>>   - data blocks : 4967127 (274483)
> >>>>>   - node blocks : 121846 (110025)
> >>>>> Skipped : atomic write 4539440 (10)
> >>>>>
> >>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> >>>>> ---
> >>>>>  fs/f2fs/file.c | 10 +++++-----
> >>>>>  1 file changed, 5 insertions(+), 5 deletions(-)
> >>>>>
> >>>>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> >>>>> index 7ae2f3bd8c2f..68b6da734e5f 100644
> >>>>> --- a/fs/f2fs/file.c
> >>>>> +++ b/fs/f2fs/file.c
> >>>>> @@ -1997,11 +1997,11 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
> >>>>>  			goto err_out;
> >>>>>  
> >>>>>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
> >>>>> -		if (!ret) {
> >>>>> -			clear_inode_flag(inode, FI_ATOMIC_FILE);
> >>>>> -			F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
> >>>>> -			stat_dec_atomic_write(inode);
> >>>>> -		}
> >>>>> +
> >>>>> +		/* doesn't need to check error */
> >>>>> +		clear_inode_flag(inode, FI_ATOMIC_FILE);
> >>>>> +		F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
> >>>>> +		stat_dec_atomic_write(inode);
> >>>>
> >>>> If there are still valid atomic write pages linked in .inmem_pages, it may cause
> >>>> memory leak when we just clear FI_ATOMIC_FILE flag.
> >>>
> >>> f2fs_commit_inmem_pages() should have flushed them.
> >>
> >> Oh, we failed to flush its nodes.
> >>
> >> However we won't clear such info if we failed to flush inmen pages, it looks
> >> inconsistent.
> >>
> >> Any interface needed to drop inmem pages or clear ATOMIC_FILE flag in that two
> >> error path? I'm not very clear how sqlite handle such error.
> > 
> > f2fs_drop_inmem_pages() did that, but not in this case.
> 
> What I mean is, for any error returned from atomic_commit() interface, should
> userspace application handle it with consistent way, like trigger
> f2fs_drop_inmem_pages(), so we don't need to handle it inside atomic_commit().

f2fs_ioc_abort_volatile_write() will be triggered.

> 
> > 
> >>
> >> Thanks,
> >>
> >>>
> >>>>
> >>>> So my question is why below logic didn't handle such condition well?
> >>>>
> >>>> f2fs_gc()
> >>>>
> >>>> 	if (has_not_enough_free_secs(sbi, sec_freed, 0)) {
> >>>> 		if (skipped_round <= MAX_SKIP_GC_COUNT ||
> >>>> 					skipped_round * 2 < round) {
> >>>> 			segno = NULL_SEGNO;
> >>>> 			goto gc_more;
> >>>> 		}
> >>>>
> >>>> 		if (first_skipped < last_skipped &&
> >>>> 				(last_skipped - first_skipped) >
> >>>> 						sbi->skipped_gc_rwsem) {
> >>>> 			f2fs_drop_inmem_pages_all(sbi, true);
> >>>
> >>> This is doing nothing, since f2fs_commit_inmem_pages() removed the inode
> >>> from inmem list.
> >>>
> >>>> 			segno = NULL_SEGNO;
> >>>> 			goto gc_more;
> >>>> 		}
> >>>> 		if (gc_type == FG_GC && !is_sbi_flag_set(sbi, SBI_CP_DISABLED))
> >>>> 			ret = f2fs_write_checkpoint(sbi, &cpc);
> >>>> 	}
> >>>>
> >>>>>  	} else {
> >>>>>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
> >>>>>  	}
> >>>>>
> >>> .
> >>>
> > .
> > 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 2/2] f2fs: avoid infinite GC loop due to stale atomic files
  2019-09-09  8:21             ` Jaegeuk Kim
@ 2019-09-09  8:27               ` Chao Yu
  2019-09-09  8:38                 ` Jaegeuk Kim
  0 siblings, 1 reply; 32+ messages in thread
From: Chao Yu @ 2019-09-09  8:27 UTC (permalink / raw)
  To: Jaegeuk Kim, g; +Cc: linux-kernel, linux-f2fs-devel

On 2019/9/9 16:21, Jaegeuk Kim wrote:
> On 09/09, Chao Yu wrote:
>> On 2019/9/9 16:01, Jaegeuk Kim wrote:
>>> On 09/09, Chao Yu wrote:
>>>> On 2019/9/9 15:30, Jaegeuk Kim wrote:
>>>>> On 09/09, Chao Yu wrote:
>>>>>> On 2019/9/9 9:25, Jaegeuk Kim wrote:
>>>>>>> If committing atomic pages is failed when doing f2fs_do_sync_file(), we can
>>>>>>> get commited pages but atomic_file being still set like:
>>>>>>>
>>>>>>> - inmem:    0, atomic IO:    4 (Max.   10), volatile IO:    0 (Max.    0)
>>>>>>>
>>>>>>> If GC selects this block, we can get an infinite loop like this:
>>>>>>>
>>>>>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
>>>>>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
>>>>>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
>>>>>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
>>>>>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
>>>>>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
>>>>>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
>>>>>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
>>>>>>>
>>>>>>> In that moment, we can observe:
>>>>>>>
>>>>>>> [Before]
>>>>>>> Try to move 5084219 blocks (BG: 384508)
>>>>>>>   - data blocks : 4962373 (274483)
>>>>>>>   - node blocks : 121846 (110025)
>>>>>>> Skipped : atomic write 4534686 (10)
>>>>>>>
>>>>>>> [After]
>>>>>>> Try to move 5088973 blocks (BG: 384508)
>>>>>>>   - data blocks : 4967127 (274483)
>>>>>>>   - node blocks : 121846 (110025)
>>>>>>> Skipped : atomic write 4539440 (10)
>>>>>>>
>>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>>>>>> ---
>>>>>>>  fs/f2fs/file.c | 10 +++++-----
>>>>>>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>>>>>>
>>>>>>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
>>>>>>> index 7ae2f3bd8c2f..68b6da734e5f 100644
>>>>>>> --- a/fs/f2fs/file.c
>>>>>>> +++ b/fs/f2fs/file.c
>>>>>>> @@ -1997,11 +1997,11 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
>>>>>>>  			goto err_out;
>>>>>>>  
>>>>>>>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
>>>>>>> -		if (!ret) {
>>>>>>> -			clear_inode_flag(inode, FI_ATOMIC_FILE);
>>>>>>> -			F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
>>>>>>> -			stat_dec_atomic_write(inode);
>>>>>>> -		}
>>>>>>> +
>>>>>>> +		/* doesn't need to check error */
>>>>>>> +		clear_inode_flag(inode, FI_ATOMIC_FILE);
>>>>>>> +		F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
>>>>>>> +		stat_dec_atomic_write(inode);
>>>>>>
>>>>>> If there are still valid atomic write pages linked in .inmem_pages, it may cause
>>>>>> memory leak when we just clear FI_ATOMIC_FILE flag.
>>>>>
>>>>> f2fs_commit_inmem_pages() should have flushed them.
>>>>
>>>> Oh, we failed to flush its nodes.
>>>>
>>>> However we won't clear such info if we failed to flush inmen pages, it looks
>>>> inconsistent.
>>>>
>>>> Any interface needed to drop inmem pages or clear ATOMIC_FILE flag in that two
>>>> error path? I'm not very clear how sqlite handle such error.
>>>
>>> f2fs_drop_inmem_pages() did that, but not in this case.
>>
>> What I mean is, for any error returned from atomic_commit() interface, should
>> userspace application handle it with consistent way, like trigger
>> f2fs_drop_inmem_pages(), so we don't need to handle it inside atomic_commit().
> 
> f2fs_ioc_abort_volatile_write() will be triggered.

If userspace can do this, we can get rid of this patch, or am I missing sth?

- f2fs_ioc_abort_volatile_write
 - f2fs_drop_inmem_pages
  - clear_inode_flag(inode, FI_ATOMIC_FILE);
  - fi->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
  - stat_dec_atomic_write(inode);

> 
>>
>>>
>>>>
>>>> Thanks,
>>>>
>>>>>
>>>>>>
>>>>>> So my question is why below logic didn't handle such condition well?
>>>>>>
>>>>>> f2fs_gc()
>>>>>>
>>>>>> 	if (has_not_enough_free_secs(sbi, sec_freed, 0)) {
>>>>>> 		if (skipped_round <= MAX_SKIP_GC_COUNT ||
>>>>>> 					skipped_round * 2 < round) {
>>>>>> 			segno = NULL_SEGNO;
>>>>>> 			goto gc_more;
>>>>>> 		}
>>>>>>
>>>>>> 		if (first_skipped < last_skipped &&
>>>>>> 				(last_skipped - first_skipped) >
>>>>>> 						sbi->skipped_gc_rwsem) {
>>>>>> 			f2fs_drop_inmem_pages_all(sbi, true);
>>>>>
>>>>> This is doing nothing, since f2fs_commit_inmem_pages() removed the inode
>>>>> from inmem list.
>>>>>
>>>>>> 			segno = NULL_SEGNO;
>>>>>> 			goto gc_more;
>>>>>> 		}
>>>>>> 		if (gc_type == FG_GC && !is_sbi_flag_set(sbi, SBI_CP_DISABLED))
>>>>>> 			ret = f2fs_write_checkpoint(sbi, &cpc);
>>>>>> 	}
>>>>>>
>>>>>>>  	} else {
>>>>>>>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
>>>>>>>  	}
>>>>>>>
>>>>> .
>>>>>
>>> .
>>>
> .
> 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 2/2] f2fs: avoid infinite GC loop due to stale atomic files
  2019-09-09  8:27               ` Chao Yu
@ 2019-09-09  8:38                 ` Jaegeuk Kim
  2019-09-09  8:44                   ` Jaegeuk Kim
  2019-09-09 11:26                   ` Chao Yu
  0 siblings, 2 replies; 32+ messages in thread
From: Jaegeuk Kim @ 2019-09-09  8:38 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-f2fs-devel, g, linux-kernel

On 09/09, Chao Yu wrote:
> On 2019/9/9 16:21, Jaegeuk Kim wrote:
> > On 09/09, Chao Yu wrote:
> >> On 2019/9/9 16:01, Jaegeuk Kim wrote:
> >>> On 09/09, Chao Yu wrote:
> >>>> On 2019/9/9 15:30, Jaegeuk Kim wrote:
> >>>>> On 09/09, Chao Yu wrote:
> >>>>>> On 2019/9/9 9:25, Jaegeuk Kim wrote:
> >>>>>>> If committing atomic pages is failed when doing f2fs_do_sync_file(), we can
> >>>>>>> get commited pages but atomic_file being still set like:
> >>>>>>>
> >>>>>>> - inmem:    0, atomic IO:    4 (Max.   10), volatile IO:    0 (Max.    0)
> >>>>>>>
> >>>>>>> If GC selects this block, we can get an infinite loop like this:
> >>>>>>>
> >>>>>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> >>>>>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> >>>>>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> >>>>>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> >>>>>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> >>>>>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> >>>>>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> >>>>>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> >>>>>>>
> >>>>>>> In that moment, we can observe:
> >>>>>>>
> >>>>>>> [Before]
> >>>>>>> Try to move 5084219 blocks (BG: 384508)
> >>>>>>>   - data blocks : 4962373 (274483)
> >>>>>>>   - node blocks : 121846 (110025)
> >>>>>>> Skipped : atomic write 4534686 (10)
> >>>>>>>
> >>>>>>> [After]
> >>>>>>> Try to move 5088973 blocks (BG: 384508)
> >>>>>>>   - data blocks : 4967127 (274483)
> >>>>>>>   - node blocks : 121846 (110025)
> >>>>>>> Skipped : atomic write 4539440 (10)
> >>>>>>>
> >>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> >>>>>>> ---
> >>>>>>>  fs/f2fs/file.c | 10 +++++-----
> >>>>>>>  1 file changed, 5 insertions(+), 5 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> >>>>>>> index 7ae2f3bd8c2f..68b6da734e5f 100644
> >>>>>>> --- a/fs/f2fs/file.c
> >>>>>>> +++ b/fs/f2fs/file.c
> >>>>>>> @@ -1997,11 +1997,11 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
> >>>>>>>  			goto err_out;
> >>>>>>>  
> >>>>>>>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
> >>>>>>> -		if (!ret) {
> >>>>>>> -			clear_inode_flag(inode, FI_ATOMIC_FILE);
> >>>>>>> -			F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
> >>>>>>> -			stat_dec_atomic_write(inode);
> >>>>>>> -		}
> >>>>>>> +
> >>>>>>> +		/* doesn't need to check error */
> >>>>>>> +		clear_inode_flag(inode, FI_ATOMIC_FILE);
> >>>>>>> +		F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
> >>>>>>> +		stat_dec_atomic_write(inode);
> >>>>>>
> >>>>>> If there are still valid atomic write pages linked in .inmem_pages, it may cause
> >>>>>> memory leak when we just clear FI_ATOMIC_FILE flag.
> >>>>>
> >>>>> f2fs_commit_inmem_pages() should have flushed them.
> >>>>
> >>>> Oh, we failed to flush its nodes.
> >>>>
> >>>> However we won't clear such info if we failed to flush inmen pages, it looks
> >>>> inconsistent.
> >>>>
> >>>> Any interface needed to drop inmem pages or clear ATOMIC_FILE flag in that two
> >>>> error path? I'm not very clear how sqlite handle such error.
> >>>
> >>> f2fs_drop_inmem_pages() did that, but not in this case.
> >>
> >> What I mean is, for any error returned from atomic_commit() interface, should
> >> userspace application handle it with consistent way, like trigger
> >> f2fs_drop_inmem_pages(), so we don't need to handle it inside atomic_commit().
> > 
> > f2fs_ioc_abort_volatile_write() will be triggered.
> 
> If userspace can do this, we can get rid of this patch, or am I missing sth?

We don't know when that will come. And, other threads are waiting for GC here.

> 
> - f2fs_ioc_abort_volatile_write
>  - f2fs_drop_inmem_pages
>   - clear_inode_flag(inode, FI_ATOMIC_FILE);
>   - fi->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
>   - stat_dec_atomic_write(inode);
> 
> > 
> >>
> >>>
> >>>>
> >>>> Thanks,
> >>>>
> >>>>>
> >>>>>>
> >>>>>> So my question is why below logic didn't handle such condition well?
> >>>>>>
> >>>>>> f2fs_gc()
> >>>>>>
> >>>>>> 	if (has_not_enough_free_secs(sbi, sec_freed, 0)) {
> >>>>>> 		if (skipped_round <= MAX_SKIP_GC_COUNT ||
> >>>>>> 					skipped_round * 2 < round) {
> >>>>>> 			segno = NULL_SEGNO;
> >>>>>> 			goto gc_more;
> >>>>>> 		}
> >>>>>>
> >>>>>> 		if (first_skipped < last_skipped &&
> >>>>>> 				(last_skipped - first_skipped) >
> >>>>>> 						sbi->skipped_gc_rwsem) {
> >>>>>> 			f2fs_drop_inmem_pages_all(sbi, true);
> >>>>>
> >>>>> This is doing nothing, since f2fs_commit_inmem_pages() removed the inode
> >>>>> from inmem list.
> >>>>>
> >>>>>> 			segno = NULL_SEGNO;
> >>>>>> 			goto gc_more;
> >>>>>> 		}
> >>>>>> 		if (gc_type == FG_GC && !is_sbi_flag_set(sbi, SBI_CP_DISABLED))
> >>>>>> 			ret = f2fs_write_checkpoint(sbi, &cpc);
> >>>>>> 	}
> >>>>>>
> >>>>>>>  	} else {
> >>>>>>>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
> >>>>>>>  	}
> >>>>>>>
> >>>>> .
> >>>>>
> >>> .
> >>>
> > .
> > 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 2/2] f2fs: avoid infinite GC loop due to stale atomic files
  2019-09-09  8:38                 ` Jaegeuk Kim
@ 2019-09-09  8:44                   ` Jaegeuk Kim
  2019-09-09 11:26                   ` Chao Yu
  1 sibling, 0 replies; 32+ messages in thread
From: Jaegeuk Kim @ 2019-09-09  8:44 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-kernel, g, linux-f2fs-devel

On 09/09, Jaegeuk Kim wrote:
> On 09/09, Chao Yu wrote:
> > On 2019/9/9 16:21, Jaegeuk Kim wrote:
> > > On 09/09, Chao Yu wrote:
> > >> On 2019/9/9 16:01, Jaegeuk Kim wrote:
> > >>> On 09/09, Chao Yu wrote:
> > >>>> On 2019/9/9 15:30, Jaegeuk Kim wrote:
> > >>>>> On 09/09, Chao Yu wrote:
> > >>>>>> On 2019/9/9 9:25, Jaegeuk Kim wrote:
> > >>>>>>> If committing atomic pages is failed when doing f2fs_do_sync_file(), we can
> > >>>>>>> get commited pages but atomic_file being still set like:
> > >>>>>>>
> > >>>>>>> - inmem:    0, atomic IO:    4 (Max.   10), volatile IO:    0 (Max.    0)
> > >>>>>>>
> > >>>>>>> If GC selects this block, we can get an infinite loop like this:
> > >>>>>>>
> > >>>>>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> > >>>>>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> > >>>>>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> > >>>>>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> > >>>>>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> > >>>>>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> > >>>>>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> > >>>>>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> > >>>>>>>
> > >>>>>>> In that moment, we can observe:
> > >>>>>>>
> > >>>>>>> [Before]
> > >>>>>>> Try to move 5084219 blocks (BG: 384508)
> > >>>>>>>   - data blocks : 4962373 (274483)
> > >>>>>>>   - node blocks : 121846 (110025)
> > >>>>>>> Skipped : atomic write 4534686 (10)
> > >>>>>>>
> > >>>>>>> [After]
> > >>>>>>> Try to move 5088973 blocks (BG: 384508)
> > >>>>>>>   - data blocks : 4967127 (274483)
> > >>>>>>>   - node blocks : 121846 (110025)
> > >>>>>>> Skipped : atomic write 4539440 (10)
> > >>>>>>>
> > >>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> > >>>>>>> ---
> > >>>>>>>  fs/f2fs/file.c | 10 +++++-----
> > >>>>>>>  1 file changed, 5 insertions(+), 5 deletions(-)
> > >>>>>>>
> > >>>>>>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> > >>>>>>> index 7ae2f3bd8c2f..68b6da734e5f 100644
> > >>>>>>> --- a/fs/f2fs/file.c
> > >>>>>>> +++ b/fs/f2fs/file.c
> > >>>>>>> @@ -1997,11 +1997,11 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
> > >>>>>>>  			goto err_out;
> > >>>>>>>  
> > >>>>>>>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
> > >>>>>>> -		if (!ret) {
> > >>>>>>> -			clear_inode_flag(inode, FI_ATOMIC_FILE);
> > >>>>>>> -			F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
> > >>>>>>> -			stat_dec_atomic_write(inode);
> > >>>>>>> -		}
> > >>>>>>> +
> > >>>>>>> +		/* doesn't need to check error */
> > >>>>>>> +		clear_inode_flag(inode, FI_ATOMIC_FILE);
> > >>>>>>> +		F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
> > >>>>>>> +		stat_dec_atomic_write(inode);
> > >>>>>>
> > >>>>>> If there are still valid atomic write pages linked in .inmem_pages, it may cause
> > >>>>>> memory leak when we just clear FI_ATOMIC_FILE flag.
> > >>>>>
> > >>>>> f2fs_commit_inmem_pages() should have flushed them.
> > >>>>
> > >>>> Oh, we failed to flush its nodes.
> > >>>>
> > >>>> However we won't clear such info if we failed to flush inmen pages, it looks
> > >>>> inconsistent.
> > >>>>
> > >>>> Any interface needed to drop inmem pages or clear ATOMIC_FILE flag in that two
> > >>>> error path? I'm not very clear how sqlite handle such error.
> > >>>
> > >>> f2fs_drop_inmem_pages() did that, but not in this case.
> > >>
> > >> What I mean is, for any error returned from atomic_commit() interface, should
> > >> userspace application handle it with consistent way, like trigger
> > >> f2fs_drop_inmem_pages(), so we don't need to handle it inside atomic_commit().
> > > 
> > > f2fs_ioc_abort_volatile_write() will be triggered.
> > 
> > If userspace can do this, we can get rid of this patch, or am I missing sth?
> 
> We don't know when that will come. And, other threads are waiting for GC here.
> 

Actually, we can call this.

---
 fs/f2fs/file.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 7ae2f3bd8c2f..98e2f58467d3 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -1997,11 +1997,7 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
 			goto err_out;
 
 		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
-		if (!ret) {
-			clear_inode_flag(inode, FI_ATOMIC_FILE);
-			F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
-			stat_dec_atomic_write(inode);
-		}
+		f2fs_drop_inmem_pages(inode);
 	} else {
 		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
 	}
-- 
2.19.0.605.g01d371f741-goog



_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 2/2] f2fs: avoid infinite GC loop due to stale atomic files
  2019-09-09  8:38                 ` Jaegeuk Kim
  2019-09-09  8:44                   ` Jaegeuk Kim
@ 2019-09-09 11:26                   ` Chao Yu
  2019-09-09 14:34                     ` Jaegeuk Kim
  1 sibling, 1 reply; 32+ messages in thread
From: Chao Yu @ 2019-09-09 11:26 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: linux-kernel, linux-f2fs-devel

On 2019/9/9 16:38, Jaegeuk Kim wrote:
> On 09/09, Chao Yu wrote:
>> On 2019/9/9 16:21, Jaegeuk Kim wrote:
>>> On 09/09, Chao Yu wrote:
>>>> On 2019/9/9 16:01, Jaegeuk Kim wrote:
>>>>> On 09/09, Chao Yu wrote:
>>>>>> On 2019/9/9 15:30, Jaegeuk Kim wrote:
>>>>>>> On 09/09, Chao Yu wrote:
>>>>>>>> On 2019/9/9 9:25, Jaegeuk Kim wrote:
>>>>>>>>> If committing atomic pages is failed when doing f2fs_do_sync_file(), we can
>>>>>>>>> get commited pages but atomic_file being still set like:
>>>>>>>>>
>>>>>>>>> - inmem:    0, atomic IO:    4 (Max.   10), volatile IO:    0 (Max.    0)
>>>>>>>>>
>>>>>>>>> If GC selects this block, we can get an infinite loop like this:
>>>>>>>>>
>>>>>>>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
>>>>>>>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
>>>>>>>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
>>>>>>>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
>>>>>>>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
>>>>>>>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
>>>>>>>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
>>>>>>>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
>>>>>>>>>
>>>>>>>>> In that moment, we can observe:
>>>>>>>>>
>>>>>>>>> [Before]
>>>>>>>>> Try to move 5084219 blocks (BG: 384508)
>>>>>>>>>   - data blocks : 4962373 (274483)
>>>>>>>>>   - node blocks : 121846 (110025)
>>>>>>>>> Skipped : atomic write 4534686 (10)
>>>>>>>>>
>>>>>>>>> [After]
>>>>>>>>> Try to move 5088973 blocks (BG: 384508)
>>>>>>>>>   - data blocks : 4967127 (274483)
>>>>>>>>>   - node blocks : 121846 (110025)
>>>>>>>>> Skipped : atomic write 4539440 (10)
>>>>>>>>>
>>>>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>>>>>>>> ---
>>>>>>>>>  fs/f2fs/file.c | 10 +++++-----
>>>>>>>>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
>>>>>>>>> index 7ae2f3bd8c2f..68b6da734e5f 100644
>>>>>>>>> --- a/fs/f2fs/file.c
>>>>>>>>> +++ b/fs/f2fs/file.c
>>>>>>>>> @@ -1997,11 +1997,11 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
>>>>>>>>>  			goto err_out;
>>>>>>>>>  
>>>>>>>>>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
>>>>>>>>> -		if (!ret) {
>>>>>>>>> -			clear_inode_flag(inode, FI_ATOMIC_FILE);
>>>>>>>>> -			F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
>>>>>>>>> -			stat_dec_atomic_write(inode);
>>>>>>>>> -		}
>>>>>>>>> +
>>>>>>>>> +		/* doesn't need to check error */
>>>>>>>>> +		clear_inode_flag(inode, FI_ATOMIC_FILE);
>>>>>>>>> +		F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
>>>>>>>>> +		stat_dec_atomic_write(inode);
>>>>>>>>
>>>>>>>> If there are still valid atomic write pages linked in .inmem_pages, it may cause
>>>>>>>> memory leak when we just clear FI_ATOMIC_FILE flag.
>>>>>>>
>>>>>>> f2fs_commit_inmem_pages() should have flushed them.
>>>>>>
>>>>>> Oh, we failed to flush its nodes.
>>>>>>
>>>>>> However we won't clear such info if we failed to flush inmen pages, it looks
>>>>>> inconsistent.
>>>>>>
>>>>>> Any interface needed to drop inmem pages or clear ATOMIC_FILE flag in that two
>>>>>> error path? I'm not very clear how sqlite handle such error.
>>>>>
>>>>> f2fs_drop_inmem_pages() did that, but not in this case.
>>>>
>>>> What I mean is, for any error returned from atomic_commit() interface, should
>>>> userspace application handle it with consistent way, like trigger
>>>> f2fs_drop_inmem_pages(), so we don't need to handle it inside atomic_commit().
>>>
>>> f2fs_ioc_abort_volatile_write() will be triggered.
>>
>> If userspace can do this, we can get rid of this patch, or am I missing sth?
> 
> We don't know when that will come. And, other threads are waiting for GC here.

Yes, however, even atomic_write won't be called sometimes... that's why we add
handling logic in f2fs_gc().

> 
>>
>> - f2fs_ioc_abort_volatile_write
>>  - f2fs_drop_inmem_pages
>>   - clear_inode_flag(inode, FI_ATOMIC_FILE);
>>   - fi->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
>>   - stat_dec_atomic_write(inode);
>>
>>>
>>>>
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> So my question is why below logic didn't handle such condition well?
>>>>>>>>
>>>>>>>> f2fs_gc()
>>>>>>>>
>>>>>>>> 	if (has_not_enough_free_secs(sbi, sec_freed, 0)) {
>>>>>>>> 		if (skipped_round <= MAX_SKIP_GC_COUNT ||
>>>>>>>> 					skipped_round * 2 < round) {
>>>>>>>> 			segno = NULL_SEGNO;
>>>>>>>> 			goto gc_more;
>>>>>>>> 		}
>>>>>>>>
>>>>>>>> 		if (first_skipped < last_skipped &&
>>>>>>>> 				(last_skipped - first_skipped) >
>>>>>>>> 						sbi->skipped_gc_rwsem) {
>>>>>>>> 			f2fs_drop_inmem_pages_all(sbi, true);
>>>>>>>
>>>>>>> This is doing nothing, since f2fs_commit_inmem_pages() removed the inode
>>>>>>> from inmem list.

One possible solution is keeping inode in sbi->inode_lock[ATOMIC_FILE] list
until f2fs_do_sync_file() return, then f2fs_drop_inmem_pages_all() can find and
drop the inode.

Thanks,

>>>>>>>
>>>>>>>> 			segno = NULL_SEGNO;
>>>>>>>> 			goto gc_more;
>>>>>>>> 		}
>>>>>>>> 		if (gc_type == FG_GC && !is_sbi_flag_set(sbi, SBI_CP_DISABLED))
>>>>>>>> 			ret = f2fs_write_checkpoint(sbi, &cpc);
>>>>>>>> 	}
>>>>>>>>
>>>>>>>>>  	} else {
>>>>>>>>>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
>>>>>>>>>  	}
>>>>>>>>>
>>>>>>> .
>>>>>>>
>>>>> .
>>>>>
>>> .
>>>
> .
> 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: do not select same victim right again
  2019-09-09  8:06   ` Jaegeuk Kim
@ 2019-09-09 11:32     ` Chao Yu
  2019-09-09 12:04       ` Jaegeuk Kim
  0 siblings, 1 reply; 32+ messages in thread
From: Chao Yu @ 2019-09-09 11:32 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: linux-kernel, linux-f2fs-devel

On 2019/9/9 16:06, Jaegeuk Kim wrote:
> On 09/09, Chao Yu wrote:
>> On 2019/9/9 9:25, Jaegeuk Kim wrote:
>>> GC must avoid select the same victim again.
>>
>> Blocks in previous victim will occupy addition free segment, I doubt after this
>> change, FGGC may encounter out-of-free space issue more frequently.
> 
> Hmm, actually this change seems wrong by sec_usage_check().
> We may be able to avoid this only in the suspicious loop?
> 
> ---
>  fs/f2fs/gc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> index e88f98ddf396..5877bd729689 100644
> --- a/fs/f2fs/gc.c
> +++ b/fs/f2fs/gc.c
> @@ -1326,7 +1326,7 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
>  		round++;
>  	}
>  
> -	if (gc_type == FG_GC)
> +	if (gc_type == FG_GC && seg_freed)

That's original solution Sahitya provided to avoid infinite loop of GC, but I
suggest to find the root cause first, then we added .invalid_segmap for that
purpose.

Thanks,

>  		sbi->cur_victim_sec = NULL_SEGNO;
>  
>  	if (sync)
> 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: do not select same victim right again
  2019-09-09 11:32     ` Chao Yu
@ 2019-09-09 12:04       ` Jaegeuk Kim
  2019-09-16  1:22         ` Chao Yu
  0 siblings, 1 reply; 32+ messages in thread
From: Jaegeuk Kim @ 2019-09-09 12:04 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-kernel, linux-f2fs-devel

On 09/09, Chao Yu wrote:
> On 2019/9/9 16:06, Jaegeuk Kim wrote:
> > On 09/09, Chao Yu wrote:
> >> On 2019/9/9 9:25, Jaegeuk Kim wrote:
> >>> GC must avoid select the same victim again.
> >>
> >> Blocks in previous victim will occupy addition free segment, I doubt after this
> >> change, FGGC may encounter out-of-free space issue more frequently.
> > 
> > Hmm, actually this change seems wrong by sec_usage_check().
> > We may be able to avoid this only in the suspicious loop?
> > 
> > ---
> >  fs/f2fs/gc.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> > index e88f98ddf396..5877bd729689 100644
> > --- a/fs/f2fs/gc.c
> > +++ b/fs/f2fs/gc.c
> > @@ -1326,7 +1326,7 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
> >  		round++;
> >  	}
> >  
> > -	if (gc_type == FG_GC)
> > +	if (gc_type == FG_GC && seg_freed)
> 
> That's original solution Sahitya provided to avoid infinite loop of GC, but I
> suggest to find the root cause first, then we added .invalid_segmap for that
> purpose.

I've checked the Sahitya's patch. So, it seems the problem can happen due to
is_alive or atomic_file.

> 
> Thanks,
> 
> >  		sbi->cur_victim_sec = NULL_SEGNO;
> >  
> >  	if (sync)
> > 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 2/2] f2fs: avoid infinite GC loop due to stale atomic files
  2019-09-09 11:26                   ` Chao Yu
@ 2019-09-09 14:34                     ` Jaegeuk Kim
  2019-09-10  0:59                       ` Chao Yu
  0 siblings, 1 reply; 32+ messages in thread
From: Jaegeuk Kim @ 2019-09-09 14:34 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-kernel, linux-f2fs-devel

On 09/09, Chao Yu wrote:
> On 2019/9/9 16:38, Jaegeuk Kim wrote:
> > On 09/09, Chao Yu wrote:
> >> On 2019/9/9 16:21, Jaegeuk Kim wrote:
> >>> On 09/09, Chao Yu wrote:
> >>>> On 2019/9/9 16:01, Jaegeuk Kim wrote:
> >>>>> On 09/09, Chao Yu wrote:
> >>>>>> On 2019/9/9 15:30, Jaegeuk Kim wrote:
> >>>>>>> On 09/09, Chao Yu wrote:
> >>>>>>>> On 2019/9/9 9:25, Jaegeuk Kim wrote:
> >>>>>>>>> If committing atomic pages is failed when doing f2fs_do_sync_file(), we can
> >>>>>>>>> get commited pages but atomic_file being still set like:
> >>>>>>>>>
> >>>>>>>>> - inmem:    0, atomic IO:    4 (Max.   10), volatile IO:    0 (Max.    0)
> >>>>>>>>>
> >>>>>>>>> If GC selects this block, we can get an infinite loop like this:
> >>>>>>>>>
> >>>>>>>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> >>>>>>>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> >>>>>>>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> >>>>>>>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> >>>>>>>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> >>>>>>>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> >>>>>>>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> >>>>>>>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> >>>>>>>>>
> >>>>>>>>> In that moment, we can observe:
> >>>>>>>>>
> >>>>>>>>> [Before]
> >>>>>>>>> Try to move 5084219 blocks (BG: 384508)
> >>>>>>>>>   - data blocks : 4962373 (274483)
> >>>>>>>>>   - node blocks : 121846 (110025)
> >>>>>>>>> Skipped : atomic write 4534686 (10)
> >>>>>>>>>
> >>>>>>>>> [After]
> >>>>>>>>> Try to move 5088973 blocks (BG: 384508)
> >>>>>>>>>   - data blocks : 4967127 (274483)
> >>>>>>>>>   - node blocks : 121846 (110025)
> >>>>>>>>> Skipped : atomic write 4539440 (10)
> >>>>>>>>>
> >>>>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> >>>>>>>>> ---
> >>>>>>>>>  fs/f2fs/file.c | 10 +++++-----
> >>>>>>>>>  1 file changed, 5 insertions(+), 5 deletions(-)
> >>>>>>>>>
> >>>>>>>>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> >>>>>>>>> index 7ae2f3bd8c2f..68b6da734e5f 100644
> >>>>>>>>> --- a/fs/f2fs/file.c
> >>>>>>>>> +++ b/fs/f2fs/file.c
> >>>>>>>>> @@ -1997,11 +1997,11 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
> >>>>>>>>>  			goto err_out;
> >>>>>>>>>  
> >>>>>>>>>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
> >>>>>>>>> -		if (!ret) {
> >>>>>>>>> -			clear_inode_flag(inode, FI_ATOMIC_FILE);
> >>>>>>>>> -			F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
> >>>>>>>>> -			stat_dec_atomic_write(inode);
> >>>>>>>>> -		}
> >>>>>>>>> +
> >>>>>>>>> +		/* doesn't need to check error */
> >>>>>>>>> +		clear_inode_flag(inode, FI_ATOMIC_FILE);
> >>>>>>>>> +		F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
> >>>>>>>>> +		stat_dec_atomic_write(inode);
> >>>>>>>>
> >>>>>>>> If there are still valid atomic write pages linked in .inmem_pages, it may cause
> >>>>>>>> memory leak when we just clear FI_ATOMIC_FILE flag.
> >>>>>>>
> >>>>>>> f2fs_commit_inmem_pages() should have flushed them.
> >>>>>>
> >>>>>> Oh, we failed to flush its nodes.
> >>>>>>
> >>>>>> However we won't clear such info if we failed to flush inmen pages, it looks
> >>>>>> inconsistent.
> >>>>>>
> >>>>>> Any interface needed to drop inmem pages or clear ATOMIC_FILE flag in that two
> >>>>>> error path? I'm not very clear how sqlite handle such error.
> >>>>>
> >>>>> f2fs_drop_inmem_pages() did that, but not in this case.
> >>>>
> >>>> What I mean is, for any error returned from atomic_commit() interface, should
> >>>> userspace application handle it with consistent way, like trigger
> >>>> f2fs_drop_inmem_pages(), so we don't need to handle it inside atomic_commit().
> >>>
> >>> f2fs_ioc_abort_volatile_write() will be triggered.
> >>
> >> If userspace can do this, we can get rid of this patch, or am I missing sth?
> > 
> > We don't know when that will come. And, other threads are waiting for GC here.
> 
> Yes, however, even atomic_write won't be called sometimes... that's why we add
> handling logic in f2fs_gc().

We need to clean this up like:

If committing atomic pages is failed when doing f2fs_do_sync_file(), we can
get commited pages but atomic_file being still set like:

- inmem:    0, atomic IO:    4 (Max.   10), volatile IO:    0 (Max.    0)

If GC selects this block, we can get an infinite loop like this:

f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c

In that moment, we can observe:

[Before]
Try to move 5084219 blocks (BG: 384508)
  - data blocks : 4962373 (274483)
  - node blocks : 121846 (110025)
Skipped : atomic write 4534686 (10)

[After]
Try to move 5088973 blocks (BG: 384508)
  - data blocks : 4967127 (274483)
  - node blocks : 121846 (110025)
Skipped : atomic write 4539440 (10)

So, refactor atomic_write flow like this:
1. start_atomic_write
 - add inmem_list and set atomic_file

2. write()
 - register it in inmem_pages

3. commit_atomic_write
 - if no error, f2fs_drop_inmem_pages()
 - f2fs_commit_inmme_pages() failed
   : __revoked_inmem_pages() was done
 - f2fs_do_sync_file failed
   : abort_atomic_write later

4. abort_atomic_write
 - f2fs_drop_inmem_pages

5. f2fs_drop_inmem_pages
 - clear atomic_file
 - remove inmem_list

Based on this change, when GC fails to move block in atomic_file,
f2fs_drop_inmem_pages_all() can call f2fs_drop_inmem_pages().

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 fs/f2fs/file.c    | 17 +++++++++++------
 fs/f2fs/segment.c | 24 +++++-------------------
 2 files changed, 16 insertions(+), 25 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 10927a0b8df3..1f499ac13d9a 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -1830,6 +1830,7 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
 {
 	struct inode *inode = file_inode(filp);
 	int ret;
+	struct f2fs_inode_info *fi = F2FS_I(inode);
 
 	if (!inode_owner_or_capable(inode))
 		return -EACCES;
@@ -1871,10 +1872,17 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
 		goto out;
 	}
 
-	set_inode_flag(inode, FI_ATOMIC_FILE);
 	clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
 	up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
 
+	spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
+	if (list_empty(&fi->inmem_ilist))
+		list_add_tail(&fi->inmem_ilist, &sbi->inode_list[ATOMIC_FILE]);
+	spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
+
+	/* add inode in inmem_list first and set atomic_file */
+	set_inode_flag(inode, FI_ATOMIC_FILE);
+
 	f2fs_update_time(F2FS_I_SB(inode), REQ_TIME);
 	F2FS_I(inode)->inmem_task = current;
 	stat_inc_atomic_write(inode);
@@ -1912,11 +1920,8 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
 			goto err_out;
 
 		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
-		if (!ret) {
-			clear_inode_flag(inode, FI_ATOMIC_FILE);
-			F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
-			stat_dec_atomic_write(inode);
-		}
+		if (!ret)
+			f2fs_drop_inmem_pages(inode);
 	} else {
 		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
 	}
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 18584d4c078a..a36bc8bf04c3 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -186,7 +186,6 @@ bool f2fs_need_SSR(struct f2fs_sb_info *sbi)
 void f2fs_register_inmem_page(struct inode *inode, struct page *page)
 {
 	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
-	struct f2fs_inode_info *fi = F2FS_I(inode);
 	struct inmem_pages *new;
 
 	f2fs_trace_pid(page);
@@ -200,15 +199,9 @@ void f2fs_register_inmem_page(struct inode *inode, struct page *page)
 	INIT_LIST_HEAD(&new->list);
 
 	/* increase reference count with clean state */
-	mutex_lock(&fi->inmem_lock);
 	get_page(page);
 	list_add_tail(&new->list, &fi->inmem_pages);
-	spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
-	if (list_empty(&fi->inmem_ilist))
-		list_add_tail(&fi->inmem_ilist, &sbi->inode_list[ATOMIC_FILE]);
-	spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
 	inc_page_count(F2FS_I_SB(inode), F2FS_INMEM_PAGES);
-	mutex_unlock(&fi->inmem_lock);
 
 	trace_f2fs_register_inmem_page(page, INMEM);
 }
@@ -330,19 +323,17 @@ void f2fs_drop_inmem_pages(struct inode *inode)
 		mutex_lock(&fi->inmem_lock);
 		__revoke_inmem_pages(inode, &fi->inmem_pages,
 						true, false, true);
-
-		if (list_empty(&fi->inmem_pages)) {
-			spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
-			if (!list_empty(&fi->inmem_ilist))
-				list_del_init(&fi->inmem_ilist);
-			spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
-		}
 		mutex_unlock(&fi->inmem_lock);
 	}
 
 	clear_inode_flag(inode, FI_ATOMIC_FILE);
 	fi->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
 	stat_dec_atomic_write(inode);
+
+	spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
+	if (!list_empty(&fi->inmem_ilist))
+		list_del_init(&fi->inmem_ilist);
+	spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
 }
 
 void f2fs_drop_inmem_page(struct inode *inode, struct page *page)
@@ -471,11 +462,6 @@ int f2fs_commit_inmem_pages(struct inode *inode)
 
 	mutex_lock(&fi->inmem_lock);
 	err = __f2fs_commit_inmem_pages(inode);
-
-	spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
-	if (!list_empty(&fi->inmem_ilist))
-		list_del_init(&fi->inmem_ilist);
-	spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
 	mutex_unlock(&fi->inmem_lock);
 
 	clear_inode_flag(inode, FI_ATOMIC_COMMIT);
-- 
2.19.0.605.g01d371f741-goog



_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 2/2] f2fs: avoid infinite GC loop due to stale atomic files
  2019-09-09 14:34                     ` Jaegeuk Kim
@ 2019-09-10  0:59                       ` Chao Yu
  2019-09-10 11:58                         ` Jaegeuk Kim
  0 siblings, 1 reply; 32+ messages in thread
From: Chao Yu @ 2019-09-10  0:59 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: linux-kernel, linux-f2fs-devel

On 2019/9/9 22:34, Jaegeuk Kim wrote:
> On 09/09, Chao Yu wrote:
>> On 2019/9/9 16:38, Jaegeuk Kim wrote:
>>> On 09/09, Chao Yu wrote:
>>>> On 2019/9/9 16:21, Jaegeuk Kim wrote:
>>>>> On 09/09, Chao Yu wrote:
>>>>>> On 2019/9/9 16:01, Jaegeuk Kim wrote:
>>>>>>> On 09/09, Chao Yu wrote:
>>>>>>>> On 2019/9/9 15:30, Jaegeuk Kim wrote:
>>>>>>>>> On 09/09, Chao Yu wrote:
>>>>>>>>>> On 2019/9/9 9:25, Jaegeuk Kim wrote:
>>>>>>>>>>> If committing atomic pages is failed when doing f2fs_do_sync_file(), we can
>>>>>>>>>>> get commited pages but atomic_file being still set like:
>>>>>>>>>>>
>>>>>>>>>>> - inmem:    0, atomic IO:    4 (Max.   10), volatile IO:    0 (Max.    0)
>>>>>>>>>>>
>>>>>>>>>>> If GC selects this block, we can get an infinite loop like this:
>>>>>>>>>>>
>>>>>>>>>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
>>>>>>>>>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
>>>>>>>>>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
>>>>>>>>>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
>>>>>>>>>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
>>>>>>>>>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
>>>>>>>>>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
>>>>>>>>>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
>>>>>>>>>>>
>>>>>>>>>>> In that moment, we can observe:
>>>>>>>>>>>
>>>>>>>>>>> [Before]
>>>>>>>>>>> Try to move 5084219 blocks (BG: 384508)
>>>>>>>>>>>   - data blocks : 4962373 (274483)
>>>>>>>>>>>   - node blocks : 121846 (110025)
>>>>>>>>>>> Skipped : atomic write 4534686 (10)
>>>>>>>>>>>
>>>>>>>>>>> [After]
>>>>>>>>>>> Try to move 5088973 blocks (BG: 384508)
>>>>>>>>>>>   - data blocks : 4967127 (274483)
>>>>>>>>>>>   - node blocks : 121846 (110025)
>>>>>>>>>>> Skipped : atomic write 4539440 (10)
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>>>>>>>>>> ---
>>>>>>>>>>>  fs/f2fs/file.c | 10 +++++-----
>>>>>>>>>>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
>>>>>>>>>>> index 7ae2f3bd8c2f..68b6da734e5f 100644
>>>>>>>>>>> --- a/fs/f2fs/file.c
>>>>>>>>>>> +++ b/fs/f2fs/file.c
>>>>>>>>>>> @@ -1997,11 +1997,11 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
>>>>>>>>>>>  			goto err_out;
>>>>>>>>>>>  
>>>>>>>>>>>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
>>>>>>>>>>> -		if (!ret) {
>>>>>>>>>>> -			clear_inode_flag(inode, FI_ATOMIC_FILE);
>>>>>>>>>>> -			F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
>>>>>>>>>>> -			stat_dec_atomic_write(inode);
>>>>>>>>>>> -		}
>>>>>>>>>>> +
>>>>>>>>>>> +		/* doesn't need to check error */
>>>>>>>>>>> +		clear_inode_flag(inode, FI_ATOMIC_FILE);
>>>>>>>>>>> +		F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
>>>>>>>>>>> +		stat_dec_atomic_write(inode);
>>>>>>>>>>
>>>>>>>>>> If there are still valid atomic write pages linked in .inmem_pages, it may cause
>>>>>>>>>> memory leak when we just clear FI_ATOMIC_FILE flag.
>>>>>>>>>
>>>>>>>>> f2fs_commit_inmem_pages() should have flushed them.
>>>>>>>>
>>>>>>>> Oh, we failed to flush its nodes.
>>>>>>>>
>>>>>>>> However we won't clear such info if we failed to flush inmen pages, it looks
>>>>>>>> inconsistent.
>>>>>>>>
>>>>>>>> Any interface needed to drop inmem pages or clear ATOMIC_FILE flag in that two
>>>>>>>> error path? I'm not very clear how sqlite handle such error.
>>>>>>>
>>>>>>> f2fs_drop_inmem_pages() did that, but not in this case.
>>>>>>
>>>>>> What I mean is, for any error returned from atomic_commit() interface, should
>>>>>> userspace application handle it with consistent way, like trigger
>>>>>> f2fs_drop_inmem_pages(), so we don't need to handle it inside atomic_commit().
>>>>>
>>>>> f2fs_ioc_abort_volatile_write() will be triggered.
>>>>
>>>> If userspace can do this, we can get rid of this patch, or am I missing sth?
>>>
>>> We don't know when that will come. And, other threads are waiting for GC here.
>>
>> Yes, however, even atomic_write won't be called sometimes... that's why we add
>> handling logic in f2fs_gc().
> 
> We need to clean this up like:
> 
> If committing atomic pages is failed when doing f2fs_do_sync_file(), we can
> get commited pages but atomic_file being still set like:
> 
> - inmem:    0, atomic IO:    4 (Max.   10), volatile IO:    0 (Max.    0)
> 
> If GC selects this block, we can get an infinite loop like this:
> 
> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> 
> In that moment, we can observe:
> 
> [Before]
> Try to move 5084219 blocks (BG: 384508)
>   - data blocks : 4962373 (274483)
>   - node blocks : 121846 (110025)
> Skipped : atomic write 4534686 (10)
> 
> [After]
> Try to move 5088973 blocks (BG: 384508)
>   - data blocks : 4967127 (274483)
>   - node blocks : 121846 (110025)
> Skipped : atomic write 4539440 (10)
> 
> So, refactor atomic_write flow like this:
> 1. start_atomic_write
>  - add inmem_list and set atomic_file
> 
> 2. write()
>  - register it in inmem_pages
> 
> 3. commit_atomic_write
>  - if no error, f2fs_drop_inmem_pages()
>  - f2fs_commit_inmme_pages() failed
>    : __revoked_inmem_pages() was done
>  - f2fs_do_sync_file failed
>    : abort_atomic_write later
> 
> 4. abort_atomic_write
>  - f2fs_drop_inmem_pages
> 
> 5. f2fs_drop_inmem_pages
>  - clear atomic_file
>  - remove inmem_list

Looks good.

> 
> Based on this change, when GC fails to move block in atomic_file,
> f2fs_drop_inmem_pages_all() can call f2fs_drop_inmem_pages().
> 
> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> ---
>  fs/f2fs/file.c    | 17 +++++++++++------
>  fs/f2fs/segment.c | 24 +++++-------------------
>  2 files changed, 16 insertions(+), 25 deletions(-)
> 
> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> index 10927a0b8df3..1f499ac13d9a 100644
> --- a/fs/f2fs/file.c
> +++ b/fs/f2fs/file.c
> @@ -1830,6 +1830,7 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
>  {
>  	struct inode *inode = file_inode(filp);
>  	int ret;
> +	struct f2fs_inode_info *fi = F2FS_I(inode);
>  
>  	if (!inode_owner_or_capable(inode))
>  		return -EACCES;
> @@ -1871,10 +1872,17 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
>  		goto out;
>  	}
>  
> -	set_inode_flag(inode, FI_ATOMIC_FILE);
>  	clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
>  	up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
>  
> +	spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
> +	if (list_empty(&fi->inmem_ilist))
> +		list_add_tail(&fi->inmem_ilist, &sbi->inode_list[ATOMIC_FILE]);
> +	spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
> +
> +	/* add inode in inmem_list first and set atomic_file */
> +	set_inode_flag(inode, FI_ATOMIC_FILE);

Should call up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]) here? Then in f2fs_gc(),
we can check FI_ATOMIC_FILE atomically under i_gc_rwsem.

Thanks,

> +
>  	f2fs_update_time(F2FS_I_SB(inode), REQ_TIME);
>  	F2FS_I(inode)->inmem_task = current;
>  	stat_inc_atomic_write(inode);
> @@ -1912,11 +1920,8 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
>  			goto err_out;
>  
>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
> -		if (!ret) {
> -			clear_inode_flag(inode, FI_ATOMIC_FILE);
> -			F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
> -			stat_dec_atomic_write(inode);
> -		}
> +		if (!ret)
> +			f2fs_drop_inmem_pages(inode);
>  	} else {
>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
>  	}
> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> index 18584d4c078a..a36bc8bf04c3 100644
> --- a/fs/f2fs/segment.c
> +++ b/fs/f2fs/segment.c
> @@ -186,7 +186,6 @@ bool f2fs_need_SSR(struct f2fs_sb_info *sbi)
>  void f2fs_register_inmem_page(struct inode *inode, struct page *page)
>  {
>  	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
> -	struct f2fs_inode_info *fi = F2FS_I(inode);
>  	struct inmem_pages *new;
>  
>  	f2fs_trace_pid(page);
> @@ -200,15 +199,9 @@ void f2fs_register_inmem_page(struct inode *inode, struct page *page)
>  	INIT_LIST_HEAD(&new->list);
>  
>  	/* increase reference count with clean state */
> -	mutex_lock(&fi->inmem_lock);
>  	get_page(page);
>  	list_add_tail(&new->list, &fi->inmem_pages);
> -	spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
> -	if (list_empty(&fi->inmem_ilist))
> -		list_add_tail(&fi->inmem_ilist, &sbi->inode_list[ATOMIC_FILE]);
> -	spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
>  	inc_page_count(F2FS_I_SB(inode), F2FS_INMEM_PAGES);
> -	mutex_unlock(&fi->inmem_lock);
>  
>  	trace_f2fs_register_inmem_page(page, INMEM);
>  }
> @@ -330,19 +323,17 @@ void f2fs_drop_inmem_pages(struct inode *inode)
>  		mutex_lock(&fi->inmem_lock);
>  		__revoke_inmem_pages(inode, &fi->inmem_pages,
>  						true, false, true);
> -
> -		if (list_empty(&fi->inmem_pages)) {
> -			spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
> -			if (!list_empty(&fi->inmem_ilist))
> -				list_del_init(&fi->inmem_ilist);
> -			spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
> -		}
>  		mutex_unlock(&fi->inmem_lock);
>  	}
>  
>  	clear_inode_flag(inode, FI_ATOMIC_FILE);
>  	fi->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
>  	stat_dec_atomic_write(inode);
> +
> +	spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
> +	if (!list_empty(&fi->inmem_ilist))
> +		list_del_init(&fi->inmem_ilist);
> +	spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
>  }
>  
>  void f2fs_drop_inmem_page(struct inode *inode, struct page *page)
> @@ -471,11 +462,6 @@ int f2fs_commit_inmem_pages(struct inode *inode)
>  
>  	mutex_lock(&fi->inmem_lock);
>  	err = __f2fs_commit_inmem_pages(inode);
> -
> -	spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
> -	if (!list_empty(&fi->inmem_ilist))
> -		list_del_init(&fi->inmem_ilist);
> -	spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
>  	mutex_unlock(&fi->inmem_lock);
>  
>  	clear_inode_flag(inode, FI_ATOMIC_COMMIT);
> 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 2/2] f2fs: avoid infinite GC loop due to stale atomic files
  2019-09-10  0:59                       ` Chao Yu
@ 2019-09-10 11:58                         ` Jaegeuk Kim
  2019-09-10 12:04                           ` Chao Yu
  0 siblings, 1 reply; 32+ messages in thread
From: Jaegeuk Kim @ 2019-09-10 11:58 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-kernel, linux-f2fs-devel

On 09/10, Chao Yu wrote:
> On 2019/9/9 22:34, Jaegeuk Kim wrote:
> > On 09/09, Chao Yu wrote:
> >> On 2019/9/9 16:38, Jaegeuk Kim wrote:
> >>> On 09/09, Chao Yu wrote:
> >>>> On 2019/9/9 16:21, Jaegeuk Kim wrote:
> >>>>> On 09/09, Chao Yu wrote:
> >>>>>> On 2019/9/9 16:01, Jaegeuk Kim wrote:
> >>>>>>> On 09/09, Chao Yu wrote:
> >>>>>>>> On 2019/9/9 15:30, Jaegeuk Kim wrote:
> >>>>>>>>> On 09/09, Chao Yu wrote:
> >>>>>>>>>> On 2019/9/9 9:25, Jaegeuk Kim wrote:
> >>>>>>>>>>> If committing atomic pages is failed when doing f2fs_do_sync_file(), we can
> >>>>>>>>>>> get commited pages but atomic_file being still set like:
> >>>>>>>>>>>
> >>>>>>>>>>> - inmem:    0, atomic IO:    4 (Max.   10), volatile IO:    0 (Max.    0)
> >>>>>>>>>>>
> >>>>>>>>>>> If GC selects this block, we can get an infinite loop like this:
> >>>>>>>>>>>
> >>>>>>>>>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> >>>>>>>>>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> >>>>>>>>>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> >>>>>>>>>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> >>>>>>>>>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> >>>>>>>>>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> >>>>>>>>>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> >>>>>>>>>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> >>>>>>>>>>>
> >>>>>>>>>>> In that moment, we can observe:
> >>>>>>>>>>>
> >>>>>>>>>>> [Before]
> >>>>>>>>>>> Try to move 5084219 blocks (BG: 384508)
> >>>>>>>>>>>   - data blocks : 4962373 (274483)
> >>>>>>>>>>>   - node blocks : 121846 (110025)
> >>>>>>>>>>> Skipped : atomic write 4534686 (10)
> >>>>>>>>>>>
> >>>>>>>>>>> [After]
> >>>>>>>>>>> Try to move 5088973 blocks (BG: 384508)
> >>>>>>>>>>>   - data blocks : 4967127 (274483)
> >>>>>>>>>>>   - node blocks : 121846 (110025)
> >>>>>>>>>>> Skipped : atomic write 4539440 (10)
> >>>>>>>>>>>
> >>>>>>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> >>>>>>>>>>> ---
> >>>>>>>>>>>  fs/f2fs/file.c | 10 +++++-----
> >>>>>>>>>>>  1 file changed, 5 insertions(+), 5 deletions(-)
> >>>>>>>>>>>
> >>>>>>>>>>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> >>>>>>>>>>> index 7ae2f3bd8c2f..68b6da734e5f 100644
> >>>>>>>>>>> --- a/fs/f2fs/file.c
> >>>>>>>>>>> +++ b/fs/f2fs/file.c
> >>>>>>>>>>> @@ -1997,11 +1997,11 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
> >>>>>>>>>>>  			goto err_out;
> >>>>>>>>>>>  
> >>>>>>>>>>>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
> >>>>>>>>>>> -		if (!ret) {
> >>>>>>>>>>> -			clear_inode_flag(inode, FI_ATOMIC_FILE);
> >>>>>>>>>>> -			F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
> >>>>>>>>>>> -			stat_dec_atomic_write(inode);
> >>>>>>>>>>> -		}
> >>>>>>>>>>> +
> >>>>>>>>>>> +		/* doesn't need to check error */
> >>>>>>>>>>> +		clear_inode_flag(inode, FI_ATOMIC_FILE);
> >>>>>>>>>>> +		F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
> >>>>>>>>>>> +		stat_dec_atomic_write(inode);
> >>>>>>>>>>
> >>>>>>>>>> If there are still valid atomic write pages linked in .inmem_pages, it may cause
> >>>>>>>>>> memory leak when we just clear FI_ATOMIC_FILE flag.
> >>>>>>>>>
> >>>>>>>>> f2fs_commit_inmem_pages() should have flushed them.
> >>>>>>>>
> >>>>>>>> Oh, we failed to flush its nodes.
> >>>>>>>>
> >>>>>>>> However we won't clear such info if we failed to flush inmen pages, it looks
> >>>>>>>> inconsistent.
> >>>>>>>>
> >>>>>>>> Any interface needed to drop inmem pages or clear ATOMIC_FILE flag in that two
> >>>>>>>> error path? I'm not very clear how sqlite handle such error.
> >>>>>>>
> >>>>>>> f2fs_drop_inmem_pages() did that, but not in this case.
> >>>>>>
> >>>>>> What I mean is, for any error returned from atomic_commit() interface, should
> >>>>>> userspace application handle it with consistent way, like trigger
> >>>>>> f2fs_drop_inmem_pages(), so we don't need to handle it inside atomic_commit().
> >>>>>
> >>>>> f2fs_ioc_abort_volatile_write() will be triggered.
> >>>>
> >>>> If userspace can do this, we can get rid of this patch, or am I missing sth?
> >>>
> >>> We don't know when that will come. And, other threads are waiting for GC here.
> >>
> >> Yes, however, even atomic_write won't be called sometimes... that's why we add
> >> handling logic in f2fs_gc().
> > 
> > We need to clean this up like:
> > 
> > If committing atomic pages is failed when doing f2fs_do_sync_file(), we can
> > get commited pages but atomic_file being still set like:
> > 
> > - inmem:    0, atomic IO:    4 (Max.   10), volatile IO:    0 (Max.    0)
> > 
> > If GC selects this block, we can get an infinite loop like this:
> > 
> > f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> > f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> > f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> > f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> > f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> > f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> > f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> > f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> > 
> > In that moment, we can observe:
> > 
> > [Before]
> > Try to move 5084219 blocks (BG: 384508)
> >   - data blocks : 4962373 (274483)
> >   - node blocks : 121846 (110025)
> > Skipped : atomic write 4534686 (10)
> > 
> > [After]
> > Try to move 5088973 blocks (BG: 384508)
> >   - data blocks : 4967127 (274483)
> >   - node blocks : 121846 (110025)
> > Skipped : atomic write 4539440 (10)
> > 
> > So, refactor atomic_write flow like this:
> > 1. start_atomic_write
> >  - add inmem_list and set atomic_file
> > 
> > 2. write()
> >  - register it in inmem_pages
> > 
> > 3. commit_atomic_write
> >  - if no error, f2fs_drop_inmem_pages()
> >  - f2fs_commit_inmme_pages() failed
> >    : __revoked_inmem_pages() was done
> >  - f2fs_do_sync_file failed
> >    : abort_atomic_write later
> > 
> > 4. abort_atomic_write
> >  - f2fs_drop_inmem_pages
> > 
> > 5. f2fs_drop_inmem_pages
> >  - clear atomic_file
> >  - remove inmem_list
> 
> Looks good.
> 
> > 
> > Based on this change, when GC fails to move block in atomic_file,
> > f2fs_drop_inmem_pages_all() can call f2fs_drop_inmem_pages().
> > 
> > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> > ---
> >  fs/f2fs/file.c    | 17 +++++++++++------
> >  fs/f2fs/segment.c | 24 +++++-------------------
> >  2 files changed, 16 insertions(+), 25 deletions(-)
> > 
> > diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> > index 10927a0b8df3..1f499ac13d9a 100644
> > --- a/fs/f2fs/file.c
> > +++ b/fs/f2fs/file.c
> > @@ -1830,6 +1830,7 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
> >  {
> >  	struct inode *inode = file_inode(filp);
> >  	int ret;
> > +	struct f2fs_inode_info *fi = F2FS_I(inode);
> >  
> >  	if (!inode_owner_or_capable(inode))
> >  		return -EACCES;
> > @@ -1871,10 +1872,17 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
> >  		goto out;
> >  	}
> >  
> > -	set_inode_flag(inode, FI_ATOMIC_FILE);
> >  	clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
> >  	up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
> >  
> > +	spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
> > +	if (list_empty(&fi->inmem_ilist))
> > +		list_add_tail(&fi->inmem_ilist, &sbi->inode_list[ATOMIC_FILE]);
> > +	spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
> > +
> > +	/* add inode in inmem_list first and set atomic_file */
> > +	set_inode_flag(inode, FI_ATOMIC_FILE);
> 
> Should call up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]) here? Then in f2fs_gc(),
> we can check FI_ATOMIC_FILE atomically under i_gc_rwsem.

Yup.

If committing atomic pages is failed when doing f2fs_do_sync_file(), we can
get commited pages but atomic_file being still set like:

- inmem:    0, atomic IO:    4 (Max.   10), volatile IO:    0 (Max.    0)

If GC selects this block, we can get an infinite loop like this:

f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c

In that moment, we can observe:

[Before]
Try to move 5084219 blocks (BG: 384508)
  - data blocks : 4962373 (274483)
  - node blocks : 121846 (110025)
Skipped : atomic write 4534686 (10)

[After]
Try to move 5088973 blocks (BG: 384508)
  - data blocks : 4967127 (274483)
  - node blocks : 121846 (110025)
Skipped : atomic write 4539440 (10)

So, refactor atomic_write flow like this:
1. start_atomic_write
 - add inmem_list and set atomic_file

2. write()
 - register it in inmem_pages

3. commit_atomic_write
 - if no error, f2fs_drop_inmem_pages()
 - f2fs_commit_inmme_pages() failed
   : __revoked_inmem_pages() was done
 - f2fs_do_sync_file failed
   : abort_atomic_write later

4. abort_atomic_write
 - f2fs_drop_inmem_pages

5. f2fs_drop_inmem_pages
 - clear atomic_file
 - remove inmem_list

Based on this change, when GC fails to move block in atomic_file,
f2fs_drop_inmem_pages_all() can call f2fs_drop_inmem_pages().

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 fs/f2fs/file.c    | 15 ++++++++++-----
 fs/f2fs/segment.c | 27 ++++++---------------------
 2 files changed, 16 insertions(+), 26 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 10927a0b8df3..fab6e4cf8f06 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -1829,6 +1829,8 @@ static int f2fs_ioc_getversion(struct file *filp, unsigned long arg)
 static int f2fs_ioc_start_atomic_write(struct file *filp)
 {
 	struct inode *inode = file_inode(filp);
+	struct f2fs_inode_info *fi = F2FS_I(inode);
+	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
 	int ret;
 
 	if (!inode_owner_or_capable(inode))
@@ -1871,6 +1873,12 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
 		goto out;
 	}
 
+	spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
+	if (list_empty(&fi->inmem_ilist))
+		list_add_tail(&fi->inmem_ilist, &sbi->inode_list[ATOMIC_FILE]);
+	spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
+
+	/* add inode in inmem_list first and set atomic_file */
 	set_inode_flag(inode, FI_ATOMIC_FILE);
 	clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
 	up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
@@ -1912,11 +1920,8 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
 			goto err_out;
 
 		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
-		if (!ret) {
-			clear_inode_flag(inode, FI_ATOMIC_FILE);
-			F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
-			stat_dec_atomic_write(inode);
-		}
+		if (!ret)
+			f2fs_drop_inmem_pages(inode);
 	} else {
 		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
 	}
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 18584d4c078a..66343f451d73 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -185,8 +185,6 @@ bool f2fs_need_SSR(struct f2fs_sb_info *sbi)
 
 void f2fs_register_inmem_page(struct inode *inode, struct page *page)
 {
-	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
-	struct f2fs_inode_info *fi = F2FS_I(inode);
 	struct inmem_pages *new;
 
 	f2fs_trace_pid(page);
@@ -200,15 +198,9 @@ void f2fs_register_inmem_page(struct inode *inode, struct page *page)
 	INIT_LIST_HEAD(&new->list);
 
 	/* increase reference count with clean state */
-	mutex_lock(&fi->inmem_lock);
 	get_page(page);
-	list_add_tail(&new->list, &fi->inmem_pages);
-	spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
-	if (list_empty(&fi->inmem_ilist))
-		list_add_tail(&fi->inmem_ilist, &sbi->inode_list[ATOMIC_FILE]);
-	spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
+	list_add_tail(&new->list, &F2FS_I(inode)->inmem_pages);
 	inc_page_count(F2FS_I_SB(inode), F2FS_INMEM_PAGES);
-	mutex_unlock(&fi->inmem_lock);
 
 	trace_f2fs_register_inmem_page(page, INMEM);
 }
@@ -330,19 +322,17 @@ void f2fs_drop_inmem_pages(struct inode *inode)
 		mutex_lock(&fi->inmem_lock);
 		__revoke_inmem_pages(inode, &fi->inmem_pages,
 						true, false, true);
-
-		if (list_empty(&fi->inmem_pages)) {
-			spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
-			if (!list_empty(&fi->inmem_ilist))
-				list_del_init(&fi->inmem_ilist);
-			spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
-		}
 		mutex_unlock(&fi->inmem_lock);
 	}
 
 	clear_inode_flag(inode, FI_ATOMIC_FILE);
 	fi->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
 	stat_dec_atomic_write(inode);
+
+	spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
+	if (!list_empty(&fi->inmem_ilist))
+		list_del_init(&fi->inmem_ilist);
+	spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
 }
 
 void f2fs_drop_inmem_page(struct inode *inode, struct page *page)
@@ -471,11 +461,6 @@ int f2fs_commit_inmem_pages(struct inode *inode)
 
 	mutex_lock(&fi->inmem_lock);
 	err = __f2fs_commit_inmem_pages(inode);
-
-	spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
-	if (!list_empty(&fi->inmem_ilist))
-		list_del_init(&fi->inmem_ilist);
-	spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
 	mutex_unlock(&fi->inmem_lock);
 
 	clear_inode_flag(inode, FI_ATOMIC_COMMIT);
-- 
2.19.0.605.g01d371f741-goog



_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 2/2] f2fs: avoid infinite GC loop due to stale atomic files
  2019-09-10 11:58                         ` Jaegeuk Kim
@ 2019-09-10 12:04                           ` Chao Yu
  2019-09-10 12:09                             ` Jaegeuk Kim
  0 siblings, 1 reply; 32+ messages in thread
From: Chao Yu @ 2019-09-10 12:04 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: linux-kernel, linux-f2fs-devel

On 2019/9/10 19:58, Jaegeuk Kim wrote:
> On 09/10, Chao Yu wrote:
>> On 2019/9/9 22:34, Jaegeuk Kim wrote:
>>> On 09/09, Chao Yu wrote:
>>>> On 2019/9/9 16:38, Jaegeuk Kim wrote:
>>>>> On 09/09, Chao Yu wrote:
>>>>>> On 2019/9/9 16:21, Jaegeuk Kim wrote:
>>>>>>> On 09/09, Chao Yu wrote:
>>>>>>>> On 2019/9/9 16:01, Jaegeuk Kim wrote:
>>>>>>>>> On 09/09, Chao Yu wrote:
>>>>>>>>>> On 2019/9/9 15:30, Jaegeuk Kim wrote:
>>>>>>>>>>> On 09/09, Chao Yu wrote:
>>>>>>>>>>>> On 2019/9/9 9:25, Jaegeuk Kim wrote:
>>>>>>>>>>>>> If committing atomic pages is failed when doing f2fs_do_sync_file(), we can
>>>>>>>>>>>>> get commited pages but atomic_file being still set like:
>>>>>>>>>>>>>
>>>>>>>>>>>>> - inmem:    0, atomic IO:    4 (Max.   10), volatile IO:    0 (Max.    0)
>>>>>>>>>>>>>
>>>>>>>>>>>>> If GC selects this block, we can get an infinite loop like this:
>>>>>>>>>>>>>
>>>>>>>>>>>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
>>>>>>>>>>>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
>>>>>>>>>>>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
>>>>>>>>>>>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
>>>>>>>>>>>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
>>>>>>>>>>>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
>>>>>>>>>>>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
>>>>>>>>>>>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
>>>>>>>>>>>>>
>>>>>>>>>>>>> In that moment, we can observe:
>>>>>>>>>>>>>
>>>>>>>>>>>>> [Before]
>>>>>>>>>>>>> Try to move 5084219 blocks (BG: 384508)
>>>>>>>>>>>>>   - data blocks : 4962373 (274483)
>>>>>>>>>>>>>   - node blocks : 121846 (110025)
>>>>>>>>>>>>> Skipped : atomic write 4534686 (10)
>>>>>>>>>>>>>
>>>>>>>>>>>>> [After]
>>>>>>>>>>>>> Try to move 5088973 blocks (BG: 384508)
>>>>>>>>>>>>>   - data blocks : 4967127 (274483)
>>>>>>>>>>>>>   - node blocks : 121846 (110025)
>>>>>>>>>>>>> Skipped : atomic write 4539440 (10)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>>>>>>>>>>>> ---
>>>>>>>>>>>>>  fs/f2fs/file.c | 10 +++++-----
>>>>>>>>>>>>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>>>>>>>>>>>>
>>>>>>>>>>>>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
>>>>>>>>>>>>> index 7ae2f3bd8c2f..68b6da734e5f 100644
>>>>>>>>>>>>> --- a/fs/f2fs/file.c
>>>>>>>>>>>>> +++ b/fs/f2fs/file.c
>>>>>>>>>>>>> @@ -1997,11 +1997,11 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
>>>>>>>>>>>>>  			goto err_out;
>>>>>>>>>>>>>  
>>>>>>>>>>>>>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
>>>>>>>>>>>>> -		if (!ret) {
>>>>>>>>>>>>> -			clear_inode_flag(inode, FI_ATOMIC_FILE);
>>>>>>>>>>>>> -			F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
>>>>>>>>>>>>> -			stat_dec_atomic_write(inode);
>>>>>>>>>>>>> -		}
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +		/* doesn't need to check error */
>>>>>>>>>>>>> +		clear_inode_flag(inode, FI_ATOMIC_FILE);
>>>>>>>>>>>>> +		F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
>>>>>>>>>>>>> +		stat_dec_atomic_write(inode);
>>>>>>>>>>>>
>>>>>>>>>>>> If there are still valid atomic write pages linked in .inmem_pages, it may cause
>>>>>>>>>>>> memory leak when we just clear FI_ATOMIC_FILE flag.
>>>>>>>>>>>
>>>>>>>>>>> f2fs_commit_inmem_pages() should have flushed them.
>>>>>>>>>>
>>>>>>>>>> Oh, we failed to flush its nodes.
>>>>>>>>>>
>>>>>>>>>> However we won't clear such info if we failed to flush inmen pages, it looks
>>>>>>>>>> inconsistent.
>>>>>>>>>>
>>>>>>>>>> Any interface needed to drop inmem pages or clear ATOMIC_FILE flag in that two
>>>>>>>>>> error path? I'm not very clear how sqlite handle such error.
>>>>>>>>>
>>>>>>>>> f2fs_drop_inmem_pages() did that, but not in this case.
>>>>>>>>
>>>>>>>> What I mean is, for any error returned from atomic_commit() interface, should
>>>>>>>> userspace application handle it with consistent way, like trigger
>>>>>>>> f2fs_drop_inmem_pages(), so we don't need to handle it inside atomic_commit().
>>>>>>>
>>>>>>> f2fs_ioc_abort_volatile_write() will be triggered.
>>>>>>
>>>>>> If userspace can do this, we can get rid of this patch, or am I missing sth?
>>>>>
>>>>> We don't know when that will come. And, other threads are waiting for GC here.
>>>>
>>>> Yes, however, even atomic_write won't be called sometimes... that's why we add
>>>> handling logic in f2fs_gc().
>>>
>>> We need to clean this up like:
>>>
>>> If committing atomic pages is failed when doing f2fs_do_sync_file(), we can
>>> get commited pages but atomic_file being still set like:
>>>
>>> - inmem:    0, atomic IO:    4 (Max.   10), volatile IO:    0 (Max.    0)
>>>
>>> If GC selects this block, we can get an infinite loop like this:
>>>
>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
>>>
>>> In that moment, we can observe:
>>>
>>> [Before]
>>> Try to move 5084219 blocks (BG: 384508)
>>>   - data blocks : 4962373 (274483)
>>>   - node blocks : 121846 (110025)
>>> Skipped : atomic write 4534686 (10)
>>>
>>> [After]
>>> Try to move 5088973 blocks (BG: 384508)
>>>   - data blocks : 4967127 (274483)
>>>   - node blocks : 121846 (110025)
>>> Skipped : atomic write 4539440 (10)
>>>
>>> So, refactor atomic_write flow like this:
>>> 1. start_atomic_write
>>>  - add inmem_list and set atomic_file
>>>
>>> 2. write()
>>>  - register it in inmem_pages
>>>
>>> 3. commit_atomic_write
>>>  - if no error, f2fs_drop_inmem_pages()
>>>  - f2fs_commit_inmme_pages() failed
>>>    : __revoked_inmem_pages() was done
>>>  - f2fs_do_sync_file failed
>>>    : abort_atomic_write later
>>>
>>> 4. abort_atomic_write
>>>  - f2fs_drop_inmem_pages
>>>
>>> 5. f2fs_drop_inmem_pages
>>>  - clear atomic_file
>>>  - remove inmem_list
>>
>> Looks good.
>>
>>>
>>> Based on this change, when GC fails to move block in atomic_file,
>>> f2fs_drop_inmem_pages_all() can call f2fs_drop_inmem_pages().
>>>
>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>> ---
>>>  fs/f2fs/file.c    | 17 +++++++++++------
>>>  fs/f2fs/segment.c | 24 +++++-------------------
>>>  2 files changed, 16 insertions(+), 25 deletions(-)
>>>
>>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
>>> index 10927a0b8df3..1f499ac13d9a 100644
>>> --- a/fs/f2fs/file.c
>>> +++ b/fs/f2fs/file.c
>>> @@ -1830,6 +1830,7 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
>>>  {
>>>  	struct inode *inode = file_inode(filp);
>>>  	int ret;
>>> +	struct f2fs_inode_info *fi = F2FS_I(inode);
>>>  
>>>  	if (!inode_owner_or_capable(inode))
>>>  		return -EACCES;
>>> @@ -1871,10 +1872,17 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
>>>  		goto out;
>>>  	}
>>>  
>>> -	set_inode_flag(inode, FI_ATOMIC_FILE);
>>>  	clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
>>>  	up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
>>>  
>>> +	spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
>>> +	if (list_empty(&fi->inmem_ilist))
>>> +		list_add_tail(&fi->inmem_ilist, &sbi->inode_list[ATOMIC_FILE]);
>>> +	spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
>>> +
>>> +	/* add inode in inmem_list first and set atomic_file */
>>> +	set_inode_flag(inode, FI_ATOMIC_FILE);
>>
>> Should call up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]) here? Then in f2fs_gc(),
>> we can check FI_ATOMIC_FILE atomically under i_gc_rwsem.
> 
> Yup.
> 
> If committing atomic pages is failed when doing f2fs_do_sync_file(), we can
> get commited pages but atomic_file being still set like:
> 
> - inmem:    0, atomic IO:    4 (Max.   10), volatile IO:    0 (Max.    0)
> 
> If GC selects this block, we can get an infinite loop like this:
> 
> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> 
> In that moment, we can observe:
> 
> [Before]
> Try to move 5084219 blocks (BG: 384508)
>   - data blocks : 4962373 (274483)
>   - node blocks : 121846 (110025)
> Skipped : atomic write 4534686 (10)
> 
> [After]
> Try to move 5088973 blocks (BG: 384508)
>   - data blocks : 4967127 (274483)
>   - node blocks : 121846 (110025)
> Skipped : atomic write 4539440 (10)
> 
> So, refactor atomic_write flow like this:
> 1. start_atomic_write
>  - add inmem_list and set atomic_file
> 
> 2. write()
>  - register it in inmem_pages
> 
> 3. commit_atomic_write
>  - if no error, f2fs_drop_inmem_pages()
>  - f2fs_commit_inmme_pages() failed
>    : __revoked_inmem_pages() was done
>  - f2fs_do_sync_file failed
>    : abort_atomic_write later
> 
> 4. abort_atomic_write
>  - f2fs_drop_inmem_pages
> 
> 5. f2fs_drop_inmem_pages
>  - clear atomic_file
>  - remove inmem_list
> 
> Based on this change, when GC fails to move block in atomic_file,
> f2fs_drop_inmem_pages_all() can call f2fs_drop_inmem_pages().
> 
> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> ---
>  fs/f2fs/file.c    | 15 ++++++++++-----
>  fs/f2fs/segment.c | 27 ++++++---------------------
>  2 files changed, 16 insertions(+), 26 deletions(-)
> 
> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> index 10927a0b8df3..fab6e4cf8f06 100644
> --- a/fs/f2fs/file.c
> +++ b/fs/f2fs/file.c
> @@ -1829,6 +1829,8 @@ static int f2fs_ioc_getversion(struct file *filp, unsigned long arg)
>  static int f2fs_ioc_start_atomic_write(struct file *filp)
>  {
>  	struct inode *inode = file_inode(filp);
> +	struct f2fs_inode_info *fi = F2FS_I(inode);
> +	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
>  	int ret;
>  
>  	if (!inode_owner_or_capable(inode))
> @@ -1871,6 +1873,12 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
>  		goto out;
>  	}
>  
> +	spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
> +	if (list_empty(&fi->inmem_ilist))
> +		list_add_tail(&fi->inmem_ilist, &sbi->inode_list[ATOMIC_FILE]);
> +	spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
> +
> +	/* add inode in inmem_list first and set atomic_file */
>  	set_inode_flag(inode, FI_ATOMIC_FILE);
>  	clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
>  	up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
> @@ -1912,11 +1920,8 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
>  			goto err_out;
>  
>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
> -		if (!ret) {
> -			clear_inode_flag(inode, FI_ATOMIC_FILE);
> -			F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
> -			stat_dec_atomic_write(inode);
> -		}
> +		if (!ret)
> +			f2fs_drop_inmem_pages(inode);
>  	} else {
>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
>  	}
> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> index 18584d4c078a..66343f451d73 100644
> --- a/fs/f2fs/segment.c
> +++ b/fs/f2fs/segment.c
> @@ -185,8 +185,6 @@ bool f2fs_need_SSR(struct f2fs_sb_info *sbi)
>  
>  void f2fs_register_inmem_page(struct inode *inode, struct page *page)
>  {
> -	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
> -	struct f2fs_inode_info *fi = F2FS_I(inode);
>  	struct inmem_pages *new;
>  
>  	f2fs_trace_pid(page);
> @@ -200,15 +198,9 @@ void f2fs_register_inmem_page(struct inode *inode, struct page *page)
>  	INIT_LIST_HEAD(&new->list);
>  
>  	/* increase reference count with clean state */
> -	mutex_lock(&fi->inmem_lock);

The lock is used to protect F2FS_I(inode)->inmem_pages list... it should be kept?

Thanks,

>  	get_page(page);
> -	list_add_tail(&new->list, &fi->inmem_pages);
> -	spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
> -	if (list_empty(&fi->inmem_ilist))
> -		list_add_tail(&fi->inmem_ilist, &sbi->inode_list[ATOMIC_FILE]);
> -	spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
> +	list_add_tail(&new->list, &F2FS_I(inode)->inmem_pages);
>  	inc_page_count(F2FS_I_SB(inode), F2FS_INMEM_PAGES);
> -	mutex_unlock(&fi->inmem_lock);
>  
>  	trace_f2fs_register_inmem_page(page, INMEM);
>  }
> @@ -330,19 +322,17 @@ void f2fs_drop_inmem_pages(struct inode *inode)
>  		mutex_lock(&fi->inmem_lock);
>  		__revoke_inmem_pages(inode, &fi->inmem_pages,
>  						true, false, true);
> -
> -		if (list_empty(&fi->inmem_pages)) {
> -			spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
> -			if (!list_empty(&fi->inmem_ilist))
> -				list_del_init(&fi->inmem_ilist);
> -			spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
> -		}
>  		mutex_unlock(&fi->inmem_lock);
>  	}
>  
>  	clear_inode_flag(inode, FI_ATOMIC_FILE);
>  	fi->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
>  	stat_dec_atomic_write(inode);
> +
> +	spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
> +	if (!list_empty(&fi->inmem_ilist))
> +		list_del_init(&fi->inmem_ilist);
> +	spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
>  }
>  
>  void f2fs_drop_inmem_page(struct inode *inode, struct page *page)
> @@ -471,11 +461,6 @@ int f2fs_commit_inmem_pages(struct inode *inode)
>  
>  	mutex_lock(&fi->inmem_lock);
>  	err = __f2fs_commit_inmem_pages(inode);
> -
> -	spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
> -	if (!list_empty(&fi->inmem_ilist))
> -		list_del_init(&fi->inmem_ilist);
> -	spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
>  	mutex_unlock(&fi->inmem_lock);
>  
>  	clear_inode_flag(inode, FI_ATOMIC_COMMIT);
> 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 2/2] f2fs: avoid infinite GC loop due to stale atomic files
  2019-09-10 12:04                           ` Chao Yu
@ 2019-09-10 12:09                             ` Jaegeuk Kim
  2019-09-16  1:15                               ` Chao Yu
  0 siblings, 1 reply; 32+ messages in thread
From: Jaegeuk Kim @ 2019-09-10 12:09 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-kernel, linux-f2fs-devel

On 09/10, Chao Yu wrote:
> On 2019/9/10 19:58, Jaegeuk Kim wrote:
> > On 09/10, Chao Yu wrote:
> >> On 2019/9/9 22:34, Jaegeuk Kim wrote:
> >>> On 09/09, Chao Yu wrote:
> >>>> On 2019/9/9 16:38, Jaegeuk Kim wrote:
> >>>>> On 09/09, Chao Yu wrote:
> >>>>>> On 2019/9/9 16:21, Jaegeuk Kim wrote:
> >>>>>>> On 09/09, Chao Yu wrote:
> >>>>>>>> On 2019/9/9 16:01, Jaegeuk Kim wrote:
> >>>>>>>>> On 09/09, Chao Yu wrote:
> >>>>>>>>>> On 2019/9/9 15:30, Jaegeuk Kim wrote:
> >>>>>>>>>>> On 09/09, Chao Yu wrote:
> >>>>>>>>>>>> On 2019/9/9 9:25, Jaegeuk Kim wrote:
> >>>>>>>>>>>>> If committing atomic pages is failed when doing f2fs_do_sync_file(), we can
> >>>>>>>>>>>>> get commited pages but atomic_file being still set like:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> - inmem:    0, atomic IO:    4 (Max.   10), volatile IO:    0 (Max.    0)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> If GC selects this block, we can get an infinite loop like this:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> >>>>>>>>>>>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> >>>>>>>>>>>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> >>>>>>>>>>>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> >>>>>>>>>>>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> >>>>>>>>>>>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> >>>>>>>>>>>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> >>>>>>>>>>>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> In that moment, we can observe:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> [Before]
> >>>>>>>>>>>>> Try to move 5084219 blocks (BG: 384508)
> >>>>>>>>>>>>>   - data blocks : 4962373 (274483)
> >>>>>>>>>>>>>   - node blocks : 121846 (110025)
> >>>>>>>>>>>>> Skipped : atomic write 4534686 (10)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> [After]
> >>>>>>>>>>>>> Try to move 5088973 blocks (BG: 384508)
> >>>>>>>>>>>>>   - data blocks : 4967127 (274483)
> >>>>>>>>>>>>>   - node blocks : 121846 (110025)
> >>>>>>>>>>>>> Skipped : atomic write 4539440 (10)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> >>>>>>>>>>>>> ---
> >>>>>>>>>>>>>  fs/f2fs/file.c | 10 +++++-----
> >>>>>>>>>>>>>  1 file changed, 5 insertions(+), 5 deletions(-)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> >>>>>>>>>>>>> index 7ae2f3bd8c2f..68b6da734e5f 100644
> >>>>>>>>>>>>> --- a/fs/f2fs/file.c
> >>>>>>>>>>>>> +++ b/fs/f2fs/file.c
> >>>>>>>>>>>>> @@ -1997,11 +1997,11 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
> >>>>>>>>>>>>>  			goto err_out;
> >>>>>>>>>>>>>  
> >>>>>>>>>>>>>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
> >>>>>>>>>>>>> -		if (!ret) {
> >>>>>>>>>>>>> -			clear_inode_flag(inode, FI_ATOMIC_FILE);
> >>>>>>>>>>>>> -			F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
> >>>>>>>>>>>>> -			stat_dec_atomic_write(inode);
> >>>>>>>>>>>>> -		}
> >>>>>>>>>>>>> +
> >>>>>>>>>>>>> +		/* doesn't need to check error */
> >>>>>>>>>>>>> +		clear_inode_flag(inode, FI_ATOMIC_FILE);
> >>>>>>>>>>>>> +		F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
> >>>>>>>>>>>>> +		stat_dec_atomic_write(inode);
> >>>>>>>>>>>>
> >>>>>>>>>>>> If there are still valid atomic write pages linked in .inmem_pages, it may cause
> >>>>>>>>>>>> memory leak when we just clear FI_ATOMIC_FILE flag.
> >>>>>>>>>>>
> >>>>>>>>>>> f2fs_commit_inmem_pages() should have flushed them.
> >>>>>>>>>>
> >>>>>>>>>> Oh, we failed to flush its nodes.
> >>>>>>>>>>
> >>>>>>>>>> However we won't clear such info if we failed to flush inmen pages, it looks
> >>>>>>>>>> inconsistent.
> >>>>>>>>>>
> >>>>>>>>>> Any interface needed to drop inmem pages or clear ATOMIC_FILE flag in that two
> >>>>>>>>>> error path? I'm not very clear how sqlite handle such error.
> >>>>>>>>>
> >>>>>>>>> f2fs_drop_inmem_pages() did that, but not in this case.
> >>>>>>>>
> >>>>>>>> What I mean is, for any error returned from atomic_commit() interface, should
> >>>>>>>> userspace application handle it with consistent way, like trigger
> >>>>>>>> f2fs_drop_inmem_pages(), so we don't need to handle it inside atomic_commit().
> >>>>>>>
> >>>>>>> f2fs_ioc_abort_volatile_write() will be triggered.
> >>>>>>
> >>>>>> If userspace can do this, we can get rid of this patch, or am I missing sth?
> >>>>>
> >>>>> We don't know when that will come. And, other threads are waiting for GC here.
> >>>>
> >>>> Yes, however, even atomic_write won't be called sometimes... that's why we add
> >>>> handling logic in f2fs_gc().
> >>>
> >>> We need to clean this up like:
> >>>
> >>> If committing atomic pages is failed when doing f2fs_do_sync_file(), we can
> >>> get commited pages but atomic_file being still set like:
> >>>
> >>> - inmem:    0, atomic IO:    4 (Max.   10), volatile IO:    0 (Max.    0)
> >>>
> >>> If GC selects this block, we can get an infinite loop like this:
> >>>
> >>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> >>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> >>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> >>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> >>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> >>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> >>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> >>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> >>>
> >>> In that moment, we can observe:
> >>>
> >>> [Before]
> >>> Try to move 5084219 blocks (BG: 384508)
> >>>   - data blocks : 4962373 (274483)
> >>>   - node blocks : 121846 (110025)
> >>> Skipped : atomic write 4534686 (10)
> >>>
> >>> [After]
> >>> Try to move 5088973 blocks (BG: 384508)
> >>>   - data blocks : 4967127 (274483)
> >>>   - node blocks : 121846 (110025)
> >>> Skipped : atomic write 4539440 (10)
> >>>
> >>> So, refactor atomic_write flow like this:
> >>> 1. start_atomic_write
> >>>  - add inmem_list and set atomic_file
> >>>
> >>> 2. write()
> >>>  - register it in inmem_pages
> >>>
> >>> 3. commit_atomic_write
> >>>  - if no error, f2fs_drop_inmem_pages()
> >>>  - f2fs_commit_inmme_pages() failed
> >>>    : __revoked_inmem_pages() was done
> >>>  - f2fs_do_sync_file failed
> >>>    : abort_atomic_write later
> >>>
> >>> 4. abort_atomic_write
> >>>  - f2fs_drop_inmem_pages
> >>>
> >>> 5. f2fs_drop_inmem_pages
> >>>  - clear atomic_file
> >>>  - remove inmem_list
> >>
> >> Looks good.
> >>
> >>>
> >>> Based on this change, when GC fails to move block in atomic_file,
> >>> f2fs_drop_inmem_pages_all() can call f2fs_drop_inmem_pages().
> >>>
> >>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> >>> ---
> >>>  fs/f2fs/file.c    | 17 +++++++++++------
> >>>  fs/f2fs/segment.c | 24 +++++-------------------
> >>>  2 files changed, 16 insertions(+), 25 deletions(-)
> >>>
> >>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> >>> index 10927a0b8df3..1f499ac13d9a 100644
> >>> --- a/fs/f2fs/file.c
> >>> +++ b/fs/f2fs/file.c
> >>> @@ -1830,6 +1830,7 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
> >>>  {
> >>>  	struct inode *inode = file_inode(filp);
> >>>  	int ret;
> >>> +	struct f2fs_inode_info *fi = F2FS_I(inode);
> >>>  
> >>>  	if (!inode_owner_or_capable(inode))
> >>>  		return -EACCES;
> >>> @@ -1871,10 +1872,17 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
> >>>  		goto out;
> >>>  	}
> >>>  
> >>> -	set_inode_flag(inode, FI_ATOMIC_FILE);
> >>>  	clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
> >>>  	up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
> >>>  
> >>> +	spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
> >>> +	if (list_empty(&fi->inmem_ilist))
> >>> +		list_add_tail(&fi->inmem_ilist, &sbi->inode_list[ATOMIC_FILE]);
> >>> +	spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
> >>> +
> >>> +	/* add inode in inmem_list first and set atomic_file */
> >>> +	set_inode_flag(inode, FI_ATOMIC_FILE);
> >>
> >> Should call up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]) here? Then in f2fs_gc(),
> >> we can check FI_ATOMIC_FILE atomically under i_gc_rwsem.
> > 
> > Yup.
> > 
> > If committing atomic pages is failed when doing f2fs_do_sync_file(), we can
> > get commited pages but atomic_file being still set like:
> > 
> > - inmem:    0, atomic IO:    4 (Max.   10), volatile IO:    0 (Max.    0)
> > 
> > If GC selects this block, we can get an infinite loop like this:
> > 
> > f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> > f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> > f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> > f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> > f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> > f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> > f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> > f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> > 
> > In that moment, we can observe:
> > 
> > [Before]
> > Try to move 5084219 blocks (BG: 384508)
> >   - data blocks : 4962373 (274483)
> >   - node blocks : 121846 (110025)
> > Skipped : atomic write 4534686 (10)
> > 
> > [After]
> > Try to move 5088973 blocks (BG: 384508)
> >   - data blocks : 4967127 (274483)
> >   - node blocks : 121846 (110025)
> > Skipped : atomic write 4539440 (10)
> > 
> > So, refactor atomic_write flow like this:
> > 1. start_atomic_write
> >  - add inmem_list and set atomic_file
> > 
> > 2. write()
> >  - register it in inmem_pages
> > 
> > 3. commit_atomic_write
> >  - if no error, f2fs_drop_inmem_pages()
> >  - f2fs_commit_inmme_pages() failed
> >    : __revoked_inmem_pages() was done
> >  - f2fs_do_sync_file failed
> >    : abort_atomic_write later
> > 
> > 4. abort_atomic_write
> >  - f2fs_drop_inmem_pages
> > 
> > 5. f2fs_drop_inmem_pages
> >  - clear atomic_file
> >  - remove inmem_list
> > 
> > Based on this change, when GC fails to move block in atomic_file,
> > f2fs_drop_inmem_pages_all() can call f2fs_drop_inmem_pages().
> > 
> > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> > ---
> >  fs/f2fs/file.c    | 15 ++++++++++-----
> >  fs/f2fs/segment.c | 27 ++++++---------------------
> >  2 files changed, 16 insertions(+), 26 deletions(-)
> > 
> > diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> > index 10927a0b8df3..fab6e4cf8f06 100644
> > --- a/fs/f2fs/file.c
> > +++ b/fs/f2fs/file.c
> > @@ -1829,6 +1829,8 @@ static int f2fs_ioc_getversion(struct file *filp, unsigned long arg)
> >  static int f2fs_ioc_start_atomic_write(struct file *filp)
> >  {
> >  	struct inode *inode = file_inode(filp);
> > +	struct f2fs_inode_info *fi = F2FS_I(inode);
> > +	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
> >  	int ret;
> >  
> >  	if (!inode_owner_or_capable(inode))
> > @@ -1871,6 +1873,12 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
> >  		goto out;
> >  	}
> >  
> > +	spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
> > +	if (list_empty(&fi->inmem_ilist))
> > +		list_add_tail(&fi->inmem_ilist, &sbi->inode_list[ATOMIC_FILE]);
> > +	spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
> > +
> > +	/* add inode in inmem_list first and set atomic_file */
> >  	set_inode_flag(inode, FI_ATOMIC_FILE);
> >  	clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
> >  	up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
> > @@ -1912,11 +1920,8 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
> >  			goto err_out;
> >  
> >  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
> > -		if (!ret) {
> > -			clear_inode_flag(inode, FI_ATOMIC_FILE);
> > -			F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
> > -			stat_dec_atomic_write(inode);
> > -		}
> > +		if (!ret)
> > +			f2fs_drop_inmem_pages(inode);
> >  	} else {
> >  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
> >  	}
> > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> > index 18584d4c078a..66343f451d73 100644
> > --- a/fs/f2fs/segment.c
> > +++ b/fs/f2fs/segment.c
> > @@ -185,8 +185,6 @@ bool f2fs_need_SSR(struct f2fs_sb_info *sbi)
> >  
> >  void f2fs_register_inmem_page(struct inode *inode, struct page *page)
> >  {
> > -	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
> > -	struct f2fs_inode_info *fi = F2FS_I(inode);
> >  	struct inmem_pages *new;
> >  
> >  	f2fs_trace_pid(page);
> > @@ -200,15 +198,9 @@ void f2fs_register_inmem_page(struct inode *inode, struct page *page)
> >  	INIT_LIST_HEAD(&new->list);
> >  
> >  	/* increase reference count with clean state */
> > -	mutex_lock(&fi->inmem_lock);
> 
> The lock is used to protect F2FS_I(inode)->inmem_pages list... it should be kept?

Urg.. yup. I added.

> 
> Thanks,
> 
> >  	get_page(page);
> > -	list_add_tail(&new->list, &fi->inmem_pages);
> > -	spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
> > -	if (list_empty(&fi->inmem_ilist))
> > -		list_add_tail(&fi->inmem_ilist, &sbi->inode_list[ATOMIC_FILE]);
> > -	spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
> > +	list_add_tail(&new->list, &F2FS_I(inode)->inmem_pages);
> >  	inc_page_count(F2FS_I_SB(inode), F2FS_INMEM_PAGES);
> > -	mutex_unlock(&fi->inmem_lock);
> >  
> >  	trace_f2fs_register_inmem_page(page, INMEM);
> >  }
> > @@ -330,19 +322,17 @@ void f2fs_drop_inmem_pages(struct inode *inode)
> >  		mutex_lock(&fi->inmem_lock);
> >  		__revoke_inmem_pages(inode, &fi->inmem_pages,
> >  						true, false, true);
> > -
> > -		if (list_empty(&fi->inmem_pages)) {
> > -			spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
> > -			if (!list_empty(&fi->inmem_ilist))
> > -				list_del_init(&fi->inmem_ilist);
> > -			spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
> > -		}
> >  		mutex_unlock(&fi->inmem_lock);
> >  	}
> >  
> >  	clear_inode_flag(inode, FI_ATOMIC_FILE);
> >  	fi->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
> >  	stat_dec_atomic_write(inode);
> > +
> > +	spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
> > +	if (!list_empty(&fi->inmem_ilist))
> > +		list_del_init(&fi->inmem_ilist);
> > +	spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
> >  }
> >  
> >  void f2fs_drop_inmem_page(struct inode *inode, struct page *page)
> > @@ -471,11 +461,6 @@ int f2fs_commit_inmem_pages(struct inode *inode)
> >  
> >  	mutex_lock(&fi->inmem_lock);
> >  	err = __f2fs_commit_inmem_pages(inode);
> > -
> > -	spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
> > -	if (!list_empty(&fi->inmem_ilist))
> > -		list_del_init(&fi->inmem_ilist);
> > -	spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
> >  	mutex_unlock(&fi->inmem_lock);
> >  
> >  	clear_inode_flag(inode, FI_ATOMIC_COMMIT);
> > 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 2/2] f2fs: avoid infinite GC loop due to stale atomic files
  2019-09-10 12:09                             ` Jaegeuk Kim
@ 2019-09-16  1:15                               ` Chao Yu
  0 siblings, 0 replies; 32+ messages in thread
From: Chao Yu @ 2019-09-16  1:15 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: linux-kernel, linux-f2fs-devel

On 2019/9/10 20:09, Jaegeuk Kim wrote:
>> The lock is used to protect F2FS_I(inode)->inmem_pages list... it should be kept?
> Urg.. yup. I added.

Reviewed-by: Chao Yu <yuchao0@huawei.com>

Thanks,

> 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: do not select same victim right again
  2019-09-09 12:04       ` Jaegeuk Kim
@ 2019-09-16  1:22         ` Chao Yu
  2019-09-16 15:37           ` Jaegeuk Kim
  0 siblings, 1 reply; 32+ messages in thread
From: Chao Yu @ 2019-09-16  1:22 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: linux-kernel, linux-f2fs-devel

On 2019/9/9 20:04, Jaegeuk Kim wrote:
> On 09/09, Chao Yu wrote:
>> On 2019/9/9 16:06, Jaegeuk Kim wrote:
>>> On 09/09, Chao Yu wrote:
>>>> On 2019/9/9 9:25, Jaegeuk Kim wrote:
>>>>> GC must avoid select the same victim again.
>>>>
>>>> Blocks in previous victim will occupy addition free segment, I doubt after this
>>>> change, FGGC may encounter out-of-free space issue more frequently.
>>>
>>> Hmm, actually this change seems wrong by sec_usage_check().
>>> We may be able to avoid this only in the suspicious loop?
>>>
>>> ---
>>>  fs/f2fs/gc.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
>>> index e88f98ddf396..5877bd729689 100644
>>> --- a/fs/f2fs/gc.c
>>> +++ b/fs/f2fs/gc.c
>>> @@ -1326,7 +1326,7 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
>>>  		round++;
>>>  	}
>>>  
>>> -	if (gc_type == FG_GC)
>>> +	if (gc_type == FG_GC && seg_freed)
>>
>> That's original solution Sahitya provided to avoid infinite loop of GC, but I
>> suggest to find the root cause first, then we added .invalid_segmap for that
>> purpose.
> 
> I've checked the Sahitya's patch. So, it seems the problem can happen due to
> is_alive or atomic_file.

For some conditions, this doesn't help, for example, two sections contain the
same fewest valid blocks, it will cause to loop selecting them if it fails to
migrate blocks.

How about keeping it as it is to find potential bug.

Thanks,

> 
>>
>> Thanks,
>>
>>>  		sbi->cur_victim_sec = NULL_SEGNO;
>>>  
>>>  	if (sync)
>>>
> .
> 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: do not select same victim right again
  2019-09-16  1:22         ` Chao Yu
@ 2019-09-16 15:37           ` Jaegeuk Kim
  2019-09-17  1:42             ` Chao Yu
  0 siblings, 1 reply; 32+ messages in thread
From: Jaegeuk Kim @ 2019-09-16 15:37 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-kernel, linux-f2fs-devel

On 09/16, Chao Yu wrote:
> On 2019/9/9 20:04, Jaegeuk Kim wrote:
> > On 09/09, Chao Yu wrote:
> >> On 2019/9/9 16:06, Jaegeuk Kim wrote:
> >>> On 09/09, Chao Yu wrote:
> >>>> On 2019/9/9 9:25, Jaegeuk Kim wrote:
> >>>>> GC must avoid select the same victim again.
> >>>>
> >>>> Blocks in previous victim will occupy addition free segment, I doubt after this
> >>>> change, FGGC may encounter out-of-free space issue more frequently.
> >>>
> >>> Hmm, actually this change seems wrong by sec_usage_check().
> >>> We may be able to avoid this only in the suspicious loop?
> >>>
> >>> ---
> >>>  fs/f2fs/gc.c | 2 +-
> >>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> >>> index e88f98ddf396..5877bd729689 100644
> >>> --- a/fs/f2fs/gc.c
> >>> +++ b/fs/f2fs/gc.c
> >>> @@ -1326,7 +1326,7 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
> >>>  		round++;
> >>>  	}
> >>>  
> >>> -	if (gc_type == FG_GC)
> >>> +	if (gc_type == FG_GC && seg_freed)
> >>
> >> That's original solution Sahitya provided to avoid infinite loop of GC, but I
> >> suggest to find the root cause first, then we added .invalid_segmap for that
> >> purpose.
> > 
> > I've checked the Sahitya's patch. So, it seems the problem can happen due to
> > is_alive or atomic_file.
> 
> For some conditions, this doesn't help, for example, two sections contain the
> same fewest valid blocks, it will cause to loop selecting them if it fails to
> migrate blocks.
> 
> How about keeping it as it is to find potential bug.

I think it'd be fine to merge this. Could you check the above scenario in more
detail?

Thanks,

> 
> Thanks,
> 
> > 
> >>
> >> Thanks,
> >>
> >>>  		sbi->cur_victim_sec = NULL_SEGNO;
> >>>  
> >>>  	if (sync)
> >>>
> > .
> > 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: do not select same victim right again
  2019-09-16 15:37           ` Jaegeuk Kim
@ 2019-09-17  1:42             ` Chao Yu
  2019-09-17 20:55               ` Jaegeuk Kim
  0 siblings, 1 reply; 32+ messages in thread
From: Chao Yu @ 2019-09-17  1:42 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: linux-kernel, linux-f2fs-devel

On 2019/9/16 23:37, Jaegeuk Kim wrote:
> On 09/16, Chao Yu wrote:
>> On 2019/9/9 20:04, Jaegeuk Kim wrote:
>>> On 09/09, Chao Yu wrote:
>>>> On 2019/9/9 16:06, Jaegeuk Kim wrote:
>>>>> On 09/09, Chao Yu wrote:
>>>>>> On 2019/9/9 9:25, Jaegeuk Kim wrote:
>>>>>>> GC must avoid select the same victim again.
>>>>>>
>>>>>> Blocks in previous victim will occupy addition free segment, I doubt after this
>>>>>> change, FGGC may encounter out-of-free space issue more frequently.
>>>>>
>>>>> Hmm, actually this change seems wrong by sec_usage_check().
>>>>> We may be able to avoid this only in the suspicious loop?
>>>>>
>>>>> ---
>>>>>  fs/f2fs/gc.c | 2 +-
>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
>>>>> index e88f98ddf396..5877bd729689 100644
>>>>> --- a/fs/f2fs/gc.c
>>>>> +++ b/fs/f2fs/gc.c
>>>>> @@ -1326,7 +1326,7 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
>>>>>  		round++;
>>>>>  	}
>>>>>  
>>>>> -	if (gc_type == FG_GC)
>>>>> +	if (gc_type == FG_GC && seg_freed)
>>>>
>>>> That's original solution Sahitya provided to avoid infinite loop of GC, but I
>>>> suggest to find the root cause first, then we added .invalid_segmap for that
>>>> purpose.
>>>
>>> I've checked the Sahitya's patch. So, it seems the problem can happen due to
>>> is_alive or atomic_file.
>>
>> For some conditions, this doesn't help, for example, two sections contain the
>> same fewest valid blocks, it will cause to loop selecting them if it fails to
>> migrate blocks.
>>
>> How about keeping it as it is to find potential bug.
> 
> I think it'd be fine to merge this. Could you check the above scenario in more
> detail?

I haven't saw this in real scenario yet.

What I mean is if there is a bug (maybe in is_alive()) failing us to GC on one
section, when that bug happens in two candidates, there could be the same
condition that GC will run into loop (select A, fail to migrate; select B, fail
to migrate, select A...).

But I guess the benefit of this change is, if FGGC fails to migrate block due to
i_gc_rwsem race, selecting another section and later retrying previous one may
avoid lock race, right?

Thanks,

> 
> Thanks,
> 
>>
>> Thanks,
>>
>>>
>>>>
>>>> Thanks,
>>>>
>>>>>  		sbi->cur_victim_sec = NULL_SEGNO;
>>>>>  
>>>>>  	if (sync)
>>>>>
>>> .
>>>
> .
> 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: do not select same victim right again
  2019-09-17  1:42             ` Chao Yu
@ 2019-09-17 20:55               ` Jaegeuk Kim
  2019-09-18  1:43                 ` Chao Yu
  0 siblings, 1 reply; 32+ messages in thread
From: Jaegeuk Kim @ 2019-09-17 20:55 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-kernel, linux-f2fs-devel

On 09/17, Chao Yu wrote:
> On 2019/9/16 23:37, Jaegeuk Kim wrote:
> > On 09/16, Chao Yu wrote:
> >> On 2019/9/9 20:04, Jaegeuk Kim wrote:
> >>> On 09/09, Chao Yu wrote:
> >>>> On 2019/9/9 16:06, Jaegeuk Kim wrote:
> >>>>> On 09/09, Chao Yu wrote:
> >>>>>> On 2019/9/9 9:25, Jaegeuk Kim wrote:
> >>>>>>> GC must avoid select the same victim again.
> >>>>>>
> >>>>>> Blocks in previous victim will occupy addition free segment, I doubt after this
> >>>>>> change, FGGC may encounter out-of-free space issue more frequently.
> >>>>>
> >>>>> Hmm, actually this change seems wrong by sec_usage_check().
> >>>>> We may be able to avoid this only in the suspicious loop?
> >>>>>
> >>>>> ---
> >>>>>  fs/f2fs/gc.c | 2 +-
> >>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>
> >>>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> >>>>> index e88f98ddf396..5877bd729689 100644
> >>>>> --- a/fs/f2fs/gc.c
> >>>>> +++ b/fs/f2fs/gc.c
> >>>>> @@ -1326,7 +1326,7 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
> >>>>>  		round++;
> >>>>>  	}
> >>>>>  
> >>>>> -	if (gc_type == FG_GC)
> >>>>> +	if (gc_type == FG_GC && seg_freed)
> >>>>
> >>>> That's original solution Sahitya provided to avoid infinite loop of GC, but I
> >>>> suggest to find the root cause first, then we added .invalid_segmap for that
> >>>> purpose.
> >>>
> >>> I've checked the Sahitya's patch. So, it seems the problem can happen due to
> >>> is_alive or atomic_file.
> >>
> >> For some conditions, this doesn't help, for example, two sections contain the
> >> same fewest valid blocks, it will cause to loop selecting them if it fails to
> >> migrate blocks.
> >>
> >> How about keeping it as it is to find potential bug.
> > 
> > I think it'd be fine to merge this. Could you check the above scenario in more
> > detail?
> 
> I haven't saw this in real scenario yet.
> 
> What I mean is if there is a bug (maybe in is_alive()) failing us to GC on one
> section, when that bug happens in two candidates, there could be the same
> condition that GC will run into loop (select A, fail to migrate; select B, fail
> to migrate, select A...).
> 
> But I guess the benefit of this change is, if FGGC fails to migrate block due to
> i_gc_rwsem race, selecting another section and later retrying previous one may
> avoid lock race, right?

In any case, I think this can avoid potenial GC loop. At least to me, it'd be
quite risky, if we remain this just for debugging purpose only.

> 
> Thanks,
> 
> > 
> > Thanks,
> > 
> >>
> >> Thanks,
> >>
> >>>
> >>>>
> >>>> Thanks,
> >>>>
> >>>>>  		sbi->cur_victim_sec = NULL_SEGNO;
> >>>>>  
> >>>>>  	if (sync)
> >>>>>
> >>> .
> >>>
> > .
> > 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: do not select same victim right again
  2019-09-17 20:55               ` Jaegeuk Kim
@ 2019-09-18  1:43                 ` Chao Yu
  2019-09-18  3:12                   ` Jaegeuk Kim
  0 siblings, 1 reply; 32+ messages in thread
From: Chao Yu @ 2019-09-18  1:43 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: linux-kernel, linux-f2fs-devel

On 2019/9/18 4:55, Jaegeuk Kim wrote:
> On 09/17, Chao Yu wrote:
>> On 2019/9/16 23:37, Jaegeuk Kim wrote:
>>> On 09/16, Chao Yu wrote:
>>>> On 2019/9/9 20:04, Jaegeuk Kim wrote:
>>>>> On 09/09, Chao Yu wrote:
>>>>>> On 2019/9/9 16:06, Jaegeuk Kim wrote:
>>>>>>> On 09/09, Chao Yu wrote:
>>>>>>>> On 2019/9/9 9:25, Jaegeuk Kim wrote:
>>>>>>>>> GC must avoid select the same victim again.
>>>>>>>>
>>>>>>>> Blocks in previous victim will occupy addition free segment, I doubt after this
>>>>>>>> change, FGGC may encounter out-of-free space issue more frequently.
>>>>>>>
>>>>>>> Hmm, actually this change seems wrong by sec_usage_check().
>>>>>>> We may be able to avoid this only in the suspicious loop?
>>>>>>>
>>>>>>> ---
>>>>>>>  fs/f2fs/gc.c | 2 +-
>>>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
>>>>>>> index e88f98ddf396..5877bd729689 100644
>>>>>>> --- a/fs/f2fs/gc.c
>>>>>>> +++ b/fs/f2fs/gc.c
>>>>>>> @@ -1326,7 +1326,7 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
>>>>>>>  		round++;
>>>>>>>  	}
>>>>>>>  
>>>>>>> -	if (gc_type == FG_GC)
>>>>>>> +	if (gc_type == FG_GC && seg_freed)
>>>>>>
>>>>>> That's original solution Sahitya provided to avoid infinite loop of GC, but I
>>>>>> suggest to find the root cause first, then we added .invalid_segmap for that
>>>>>> purpose.
>>>>>
>>>>> I've checked the Sahitya's patch. So, it seems the problem can happen due to
>>>>> is_alive or atomic_file.
>>>>
>>>> For some conditions, this doesn't help, for example, two sections contain the
>>>> same fewest valid blocks, it will cause to loop selecting them if it fails to
>>>> migrate blocks.
>>>>
>>>> How about keeping it as it is to find potential bug.
>>>
>>> I think it'd be fine to merge this. Could you check the above scenario in more
>>> detail?
>>
>> I haven't saw this in real scenario yet.
>>
>> What I mean is if there is a bug (maybe in is_alive()) failing us to GC on one
>> section, when that bug happens in two candidates, there could be the same
>> condition that GC will run into loop (select A, fail to migrate; select B, fail
>> to migrate, select A...).
>>
>> But I guess the benefit of this change is, if FGGC fails to migrate block due to
>> i_gc_rwsem race, selecting another section and later retrying previous one may
>> avoid lock race, right?
> 
> In any case, I think this can avoid potenial GC loop. At least to me, it'd be
> quite risky, if we remain this just for debugging purpose only.

Yup,

One more concern is would this cur_victim_sec remain after FGGC? then BGGC/SSR
will always skip the section cur_victim_sec points to.

So could we reset cur_victim_sec in the end of FGGC?

Thanks,

> 
>>
>> Thanks,
>>
>>>
>>> Thanks,
>>>
>>>>
>>>> Thanks,
>>>>
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>>  		sbi->cur_victim_sec = NULL_SEGNO;
>>>>>>>  
>>>>>>>  	if (sync)
>>>>>>>
>>>>> .
>>>>>
>>> .
>>>
> .
> 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: do not select same victim right again
  2019-09-18  1:43                 ` Chao Yu
@ 2019-09-18  3:12                   ` Jaegeuk Kim
  2019-09-18  3:26                     ` Chao Yu
  0 siblings, 1 reply; 32+ messages in thread
From: Jaegeuk Kim @ 2019-09-18  3:12 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-kernel, linux-f2fs-devel

On 09/18, Chao Yu wrote:
> On 2019/9/18 4:55, Jaegeuk Kim wrote:
> > On 09/17, Chao Yu wrote:
> >> On 2019/9/16 23:37, Jaegeuk Kim wrote:
> >>> On 09/16, Chao Yu wrote:
> >>>> On 2019/9/9 20:04, Jaegeuk Kim wrote:
> >>>>> On 09/09, Chao Yu wrote:
> >>>>>> On 2019/9/9 16:06, Jaegeuk Kim wrote:
> >>>>>>> On 09/09, Chao Yu wrote:
> >>>>>>>> On 2019/9/9 9:25, Jaegeuk Kim wrote:
> >>>>>>>>> GC must avoid select the same victim again.
> >>>>>>>>
> >>>>>>>> Blocks in previous victim will occupy addition free segment, I doubt after this
> >>>>>>>> change, FGGC may encounter out-of-free space issue more frequently.
> >>>>>>>
> >>>>>>> Hmm, actually this change seems wrong by sec_usage_check().
> >>>>>>> We may be able to avoid this only in the suspicious loop?
> >>>>>>>
> >>>>>>> ---
> >>>>>>>  fs/f2fs/gc.c | 2 +-
> >>>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>>>
> >>>>>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> >>>>>>> index e88f98ddf396..5877bd729689 100644
> >>>>>>> --- a/fs/f2fs/gc.c
> >>>>>>> +++ b/fs/f2fs/gc.c
> >>>>>>> @@ -1326,7 +1326,7 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
> >>>>>>>  		round++;
> >>>>>>>  	}
> >>>>>>>  
> >>>>>>> -	if (gc_type == FG_GC)
> >>>>>>> +	if (gc_type == FG_GC && seg_freed)
> >>>>>>
> >>>>>> That's original solution Sahitya provided to avoid infinite loop of GC, but I
> >>>>>> suggest to find the root cause first, then we added .invalid_segmap for that
> >>>>>> purpose.
> >>>>>
> >>>>> I've checked the Sahitya's patch. So, it seems the problem can happen due to
> >>>>> is_alive or atomic_file.
> >>>>
> >>>> For some conditions, this doesn't help, for example, two sections contain the
> >>>> same fewest valid blocks, it will cause to loop selecting them if it fails to
> >>>> migrate blocks.
> >>>>
> >>>> How about keeping it as it is to find potential bug.
> >>>
> >>> I think it'd be fine to merge this. Could you check the above scenario in more
> >>> detail?
> >>
> >> I haven't saw this in real scenario yet.
> >>
> >> What I mean is if there is a bug (maybe in is_alive()) failing us to GC on one
> >> section, when that bug happens in two candidates, there could be the same
> >> condition that GC will run into loop (select A, fail to migrate; select B, fail
> >> to migrate, select A...).
> >>
> >> But I guess the benefit of this change is, if FGGC fails to migrate block due to
> >> i_gc_rwsem race, selecting another section and later retrying previous one may
> >> avoid lock race, right?
> > 
> > In any case, I think this can avoid potenial GC loop. At least to me, it'd be
> > quite risky, if we remain this just for debugging purpose only.
> 
> Yup,
> 
> One more concern is would this cur_victim_sec remain after FGGC? then BGGC/SSR
> will always skip the section cur_victim_sec points to.

Then, we can get another loop before using it by BGGC/SSR.

> 
> So could we reset cur_victim_sec in the end of FGGC?
> 
> Thanks,
> 
> > 
> >>
> >> Thanks,
> >>
> >>>
> >>> Thanks,
> >>>
> >>>>
> >>>> Thanks,
> >>>>
> >>>>>
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>>>  		sbi->cur_victim_sec = NULL_SEGNO;
> >>>>>>>  
> >>>>>>>  	if (sync)
> >>>>>>>
> >>>>> .
> >>>>>
> >>> .
> >>>
> > .
> > 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: do not select same victim right again
  2019-09-18  3:12                   ` Jaegeuk Kim
@ 2019-09-18  3:26                     ` Chao Yu
  2019-09-18 16:47                       ` Jaegeuk Kim
  0 siblings, 1 reply; 32+ messages in thread
From: Chao Yu @ 2019-09-18  3:26 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: linux-kernel, linux-f2fs-devel

On 2019/9/18 11:12, Jaegeuk Kim wrote:
> On 09/18, Chao Yu wrote:
>> On 2019/9/18 4:55, Jaegeuk Kim wrote:
>>> On 09/17, Chao Yu wrote:
>>>> On 2019/9/16 23:37, Jaegeuk Kim wrote:
>>>>> On 09/16, Chao Yu wrote:
>>>>>> On 2019/9/9 20:04, Jaegeuk Kim wrote:
>>>>>>> On 09/09, Chao Yu wrote:
>>>>>>>> On 2019/9/9 16:06, Jaegeuk Kim wrote:
>>>>>>>>> On 09/09, Chao Yu wrote:
>>>>>>>>>> On 2019/9/9 9:25, Jaegeuk Kim wrote:
>>>>>>>>>>> GC must avoid select the same victim again.
>>>>>>>>>>
>>>>>>>>>> Blocks in previous victim will occupy addition free segment, I doubt after this
>>>>>>>>>> change, FGGC may encounter out-of-free space issue more frequently.
>>>>>>>>>
>>>>>>>>> Hmm, actually this change seems wrong by sec_usage_check().
>>>>>>>>> We may be able to avoid this only in the suspicious loop?
>>>>>>>>>
>>>>>>>>> ---
>>>>>>>>>  fs/f2fs/gc.c | 2 +-
>>>>>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>>>
>>>>>>>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
>>>>>>>>> index e88f98ddf396..5877bd729689 100644
>>>>>>>>> --- a/fs/f2fs/gc.c
>>>>>>>>> +++ b/fs/f2fs/gc.c
>>>>>>>>> @@ -1326,7 +1326,7 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
>>>>>>>>>  		round++;
>>>>>>>>>  	}
>>>>>>>>>  
>>>>>>>>> -	if (gc_type == FG_GC)
>>>>>>>>> +	if (gc_type == FG_GC && seg_freed)
>>>>>>>>
>>>>>>>> That's original solution Sahitya provided to avoid infinite loop of GC, but I
>>>>>>>> suggest to find the root cause first, then we added .invalid_segmap for that
>>>>>>>> purpose.
>>>>>>>
>>>>>>> I've checked the Sahitya's patch. So, it seems the problem can happen due to
>>>>>>> is_alive or atomic_file.
>>>>>>
>>>>>> For some conditions, this doesn't help, for example, two sections contain the
>>>>>> same fewest valid blocks, it will cause to loop selecting them if it fails to
>>>>>> migrate blocks.
>>>>>>
>>>>>> How about keeping it as it is to find potential bug.
>>>>>
>>>>> I think it'd be fine to merge this. Could you check the above scenario in more
>>>>> detail?
>>>>
>>>> I haven't saw this in real scenario yet.
>>>>
>>>> What I mean is if there is a bug (maybe in is_alive()) failing us to GC on one
>>>> section, when that bug happens in two candidates, there could be the same
>>>> condition that GC will run into loop (select A, fail to migrate; select B, fail
>>>> to migrate, select A...).
>>>>
>>>> But I guess the benefit of this change is, if FGGC fails to migrate block due to
>>>> i_gc_rwsem race, selecting another section and later retrying previous one may
>>>> avoid lock race, right?
>>>
>>> In any case, I think this can avoid potenial GC loop. At least to me, it'd be
>>> quite risky, if we remain this just for debugging purpose only.
>>
>> Yup,
>>
>> One more concern is would this cur_victim_sec remain after FGGC? then BGGC/SSR
>> will always skip the section cur_victim_sec points to.
> 
> Then, we can get another loop before using it by BGGC/SSR.

I guess I didn't catch your point, do you mean, if we reset it in the end of
FGGC, we may encounter the loop during BGGC/SSR?

I meant:

f2fs_gc()
...

+	if (gc_type == FG_GC)
+		sbi->cur_victim_sec = NULL_SEGNO;

	mutex_unlock(&sbi->gc_mutex);

	put_gc_inode(&gc_list);
...

Thanks,

> 
>>
>> So could we reset cur_victim_sec in the end of FGGC?
>>
>> Thanks,
>>
>>>
>>>>
>>>> Thanks,
>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>>>  		sbi->cur_victim_sec = NULL_SEGNO;
>>>>>>>>>  
>>>>>>>>>  	if (sync)
>>>>>>>>>
>>>>>>> .
>>>>>>>
>>>>> .
>>>>>
>>> .
>>>
> .
> 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: do not select same victim right again
  2019-09-18  3:26                     ` Chao Yu
@ 2019-09-18 16:47                       ` Jaegeuk Kim
  2019-09-19  0:53                         ` Chao Yu
  0 siblings, 1 reply; 32+ messages in thread
From: Jaegeuk Kim @ 2019-09-18 16:47 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-kernel, linux-f2fs-devel

On 09/18, Chao Yu wrote:
> On 2019/9/18 11:12, Jaegeuk Kim wrote:
> > On 09/18, Chao Yu wrote:
> >> On 2019/9/18 4:55, Jaegeuk Kim wrote:
> >>> On 09/17, Chao Yu wrote:
> >>>> On 2019/9/16 23:37, Jaegeuk Kim wrote:
> >>>>> On 09/16, Chao Yu wrote:
> >>>>>> On 2019/9/9 20:04, Jaegeuk Kim wrote:
> >>>>>>> On 09/09, Chao Yu wrote:
> >>>>>>>> On 2019/9/9 16:06, Jaegeuk Kim wrote:
> >>>>>>>>> On 09/09, Chao Yu wrote:
> >>>>>>>>>> On 2019/9/9 9:25, Jaegeuk Kim wrote:
> >>>>>>>>>>> GC must avoid select the same victim again.
> >>>>>>>>>>
> >>>>>>>>>> Blocks in previous victim will occupy addition free segment, I doubt after this
> >>>>>>>>>> change, FGGC may encounter out-of-free space issue more frequently.
> >>>>>>>>>
> >>>>>>>>> Hmm, actually this change seems wrong by sec_usage_check().
> >>>>>>>>> We may be able to avoid this only in the suspicious loop?
> >>>>>>>>>
> >>>>>>>>> ---
> >>>>>>>>>  fs/f2fs/gc.c | 2 +-
> >>>>>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>>>>>
> >>>>>>>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> >>>>>>>>> index e88f98ddf396..5877bd729689 100644
> >>>>>>>>> --- a/fs/f2fs/gc.c
> >>>>>>>>> +++ b/fs/f2fs/gc.c
> >>>>>>>>> @@ -1326,7 +1326,7 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
> >>>>>>>>>  		round++;
> >>>>>>>>>  	}
> >>>>>>>>>  
> >>>>>>>>> -	if (gc_type == FG_GC)
> >>>>>>>>> +	if (gc_type == FG_GC && seg_freed)
> >>>>>>>>
> >>>>>>>> That's original solution Sahitya provided to avoid infinite loop of GC, but I
> >>>>>>>> suggest to find the root cause first, then we added .invalid_segmap for that
> >>>>>>>> purpose.
> >>>>>>>
> >>>>>>> I've checked the Sahitya's patch. So, it seems the problem can happen due to
> >>>>>>> is_alive or atomic_file.
> >>>>>>
> >>>>>> For some conditions, this doesn't help, for example, two sections contain the
> >>>>>> same fewest valid blocks, it will cause to loop selecting them if it fails to
> >>>>>> migrate blocks.
> >>>>>>
> >>>>>> How about keeping it as it is to find potential bug.
> >>>>>
> >>>>> I think it'd be fine to merge this. Could you check the above scenario in more
> >>>>> detail?
> >>>>
> >>>> I haven't saw this in real scenario yet.
> >>>>
> >>>> What I mean is if there is a bug (maybe in is_alive()) failing us to GC on one
> >>>> section, when that bug happens in two candidates, there could be the same
> >>>> condition that GC will run into loop (select A, fail to migrate; select B, fail
> >>>> to migrate, select A...).
> >>>>
> >>>> But I guess the benefit of this change is, if FGGC fails to migrate block due to
> >>>> i_gc_rwsem race, selecting another section and later retrying previous one may
> >>>> avoid lock race, right?
> >>>
> >>> In any case, I think this can avoid potenial GC loop. At least to me, it'd be
> >>> quite risky, if we remain this just for debugging purpose only.
> >>
> >> Yup,
> >>
> >> One more concern is would this cur_victim_sec remain after FGGC? then BGGC/SSR
> >> will always skip the section cur_victim_sec points to.
> > 
> > Then, we can get another loop before using it by BGGC/SSR.
> 
> I guess I didn't catch your point, do you mean, if we reset it in the end of
> FGGC, we may encounter the loop during BGGC/SSR?

FGGC failed in a loop and last victim was remained in cur_victim_sec.
Next FGGC kicked in and did the same thing again. I don't expect BGGC/SSR
wants to select this victim much, since it will have CB policy.

> 
> I meant:
> 
> f2fs_gc()
> ...
> 
> +	if (gc_type == FG_GC)
> +		sbi->cur_victim_sec = NULL_SEGNO;
> 
> 	mutex_unlock(&sbi->gc_mutex);
> 
> 	put_gc_inode(&gc_list);
> ...
> 
> Thanks,
> 
> > 
> >>
> >> So could we reset cur_victim_sec in the end of FGGC?
> >>
> >> Thanks,
> >>
> >>>
> >>>>
> >>>> Thanks,
> >>>>
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>>>
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>>
> >>>>>>>>>  		sbi->cur_victim_sec = NULL_SEGNO;
> >>>>>>>>>  
> >>>>>>>>>  	if (sync)
> >>>>>>>>>
> >>>>>>> .
> >>>>>>>
> >>>>> .
> >>>>>
> >>> .
> >>>
> > .
> > 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: do not select same victim right again
  2019-09-18 16:47                       ` Jaegeuk Kim
@ 2019-09-19  0:53                         ` Chao Yu
  2019-09-19 17:11                           ` Jaegeuk Kim
  0 siblings, 1 reply; 32+ messages in thread
From: Chao Yu @ 2019-09-19  0:53 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: linux-kernel, linux-f2fs-devel

On 2019/9/19 0:47, Jaegeuk Kim wrote:
> On 09/18, Chao Yu wrote:
>> On 2019/9/18 11:12, Jaegeuk Kim wrote:
>>> On 09/18, Chao Yu wrote:
>>>> On 2019/9/18 4:55, Jaegeuk Kim wrote:
>>>>> On 09/17, Chao Yu wrote:
>>>>>> On 2019/9/16 23:37, Jaegeuk Kim wrote:
>>>>>>> On 09/16, Chao Yu wrote:
>>>>>>>> On 2019/9/9 20:04, Jaegeuk Kim wrote:
>>>>>>>>> On 09/09, Chao Yu wrote:
>>>>>>>>>> On 2019/9/9 16:06, Jaegeuk Kim wrote:
>>>>>>>>>>> On 09/09, Chao Yu wrote:
>>>>>>>>>>>> On 2019/9/9 9:25, Jaegeuk Kim wrote:
>>>>>>>>>>>>> GC must avoid select the same victim again.
>>>>>>>>>>>>
>>>>>>>>>>>> Blocks in previous victim will occupy addition free segment, I doubt after this
>>>>>>>>>>>> change, FGGC may encounter out-of-free space issue more frequently.
>>>>>>>>>>>
>>>>>>>>>>> Hmm, actually this change seems wrong by sec_usage_check().
>>>>>>>>>>> We may be able to avoid this only in the suspicious loop?
>>>>>>>>>>>
>>>>>>>>>>> ---
>>>>>>>>>>>  fs/f2fs/gc.c | 2 +-
>>>>>>>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
>>>>>>>>>>> index e88f98ddf396..5877bd729689 100644
>>>>>>>>>>> --- a/fs/f2fs/gc.c
>>>>>>>>>>> +++ b/fs/f2fs/gc.c
>>>>>>>>>>> @@ -1326,7 +1326,7 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
>>>>>>>>>>>  		round++;
>>>>>>>>>>>  	}
>>>>>>>>>>>  
>>>>>>>>>>> -	if (gc_type == FG_GC)
>>>>>>>>>>> +	if (gc_type == FG_GC && seg_freed)
>>>>>>>>>>
>>>>>>>>>> That's original solution Sahitya provided to avoid infinite loop of GC, but I
>>>>>>>>>> suggest to find the root cause first, then we added .invalid_segmap for that
>>>>>>>>>> purpose.
>>>>>>>>>
>>>>>>>>> I've checked the Sahitya's patch. So, it seems the problem can happen due to
>>>>>>>>> is_alive or atomic_file.
>>>>>>>>
>>>>>>>> For some conditions, this doesn't help, for example, two sections contain the
>>>>>>>> same fewest valid blocks, it will cause to loop selecting them if it fails to
>>>>>>>> migrate blocks.
>>>>>>>>
>>>>>>>> How about keeping it as it is to find potential bug.
>>>>>>>
>>>>>>> I think it'd be fine to merge this. Could you check the above scenario in more
>>>>>>> detail?
>>>>>>
>>>>>> I haven't saw this in real scenario yet.
>>>>>>
>>>>>> What I mean is if there is a bug (maybe in is_alive()) failing us to GC on one
>>>>>> section, when that bug happens in two candidates, there could be the same
>>>>>> condition that GC will run into loop (select A, fail to migrate; select B, fail
>>>>>> to migrate, select A...).
>>>>>>
>>>>>> But I guess the benefit of this change is, if FGGC fails to migrate block due to
>>>>>> i_gc_rwsem race, selecting another section and later retrying previous one may
>>>>>> avoid lock race, right?
>>>>>
>>>>> In any case, I think this can avoid potenial GC loop. At least to me, it'd be
>>>>> quite risky, if we remain this just for debugging purpose only.
>>>>
>>>> Yup,
>>>>
>>>> One more concern is would this cur_victim_sec remain after FGGC? then BGGC/SSR
>>>> will always skip the section cur_victim_sec points to.
>>>
>>> Then, we can get another loop before using it by BGGC/SSR.
>>
>> I guess I didn't catch your point, do you mean, if we reset it in the end of
>> FGGC, we may encounter the loop during BGGC/SSR?
> 
> FGGC failed in a loop and last victim was remained in cur_victim_sec.

It won't run into a loop because we keep below condition?

+	if (gc_type == FG_GC && seg_freed)
+		sbi->cur_victim_sec = NULL_SEGNO;

	if (sync)
		goto stop;

I meant add below logic in addition:

+	if (gc_type == FG_GC)
+		sbi->cur_victim_sec = NULL_SEGNO;

	mutex_unlock(&sbi->gc_mutex);

Thanks,

> Next FGGC kicked in and did the same thing again. I don't expect BGGC/SSR
> wants to select this victim much, since it will have CB policy.
> 
>>
>> I meant:
>>
>> f2fs_gc()
>> ...
>>
>> +	if (gc_type == FG_GC)
>> +		sbi->cur_victim_sec = NULL_SEGNO;
>>
>> 	mutex_unlock(&sbi->gc_mutex);
>>
>> 	put_gc_inode(&gc_list);
>> ...
>>
>> Thanks,
>>
>>>
>>>>
>>>> So could we reset cur_victim_sec in the end of FGGC?
>>>>
>>>> Thanks,
>>>>
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>>>  		sbi->cur_victim_sec = NULL_SEGNO;
>>>>>>>>>>>  
>>>>>>>>>>>  	if (sync)
>>>>>>>>>>>
>>>>>>>>> .
>>>>>>>>>
>>>>>>> .
>>>>>>>
>>>>> .
>>>>>
>>> .
>>>
> .
> 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: do not select same victim right again
  2019-09-19  0:53                         ` Chao Yu
@ 2019-09-19 17:11                           ` Jaegeuk Kim
  0 siblings, 0 replies; 32+ messages in thread
From: Jaegeuk Kim @ 2019-09-19 17:11 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-kernel, linux-f2fs-devel

On 09/19, Chao Yu wrote:
> On 2019/9/19 0:47, Jaegeuk Kim wrote:
> > On 09/18, Chao Yu wrote:
> >> On 2019/9/18 11:12, Jaegeuk Kim wrote:
> >>> On 09/18, Chao Yu wrote:
> >>>> On 2019/9/18 4:55, Jaegeuk Kim wrote:
> >>>>> On 09/17, Chao Yu wrote:
> >>>>>> On 2019/9/16 23:37, Jaegeuk Kim wrote:
> >>>>>>> On 09/16, Chao Yu wrote:
> >>>>>>>> On 2019/9/9 20:04, Jaegeuk Kim wrote:
> >>>>>>>>> On 09/09, Chao Yu wrote:
> >>>>>>>>>> On 2019/9/9 16:06, Jaegeuk Kim wrote:
> >>>>>>>>>>> On 09/09, Chao Yu wrote:
> >>>>>>>>>>>> On 2019/9/9 9:25, Jaegeuk Kim wrote:
> >>>>>>>>>>>>> GC must avoid select the same victim again.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Blocks in previous victim will occupy addition free segment, I doubt after this
> >>>>>>>>>>>> change, FGGC may encounter out-of-free space issue more frequently.
> >>>>>>>>>>>
> >>>>>>>>>>> Hmm, actually this change seems wrong by sec_usage_check().
> >>>>>>>>>>> We may be able to avoid this only in the suspicious loop?
> >>>>>>>>>>>
> >>>>>>>>>>> ---
> >>>>>>>>>>>  fs/f2fs/gc.c | 2 +-
> >>>>>>>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>>>>>>>
> >>>>>>>>>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> >>>>>>>>>>> index e88f98ddf396..5877bd729689 100644
> >>>>>>>>>>> --- a/fs/f2fs/gc.c
> >>>>>>>>>>> +++ b/fs/f2fs/gc.c
> >>>>>>>>>>> @@ -1326,7 +1326,7 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
> >>>>>>>>>>>  		round++;
> >>>>>>>>>>>  	}
> >>>>>>>>>>>  
> >>>>>>>>>>> -	if (gc_type == FG_GC)
> >>>>>>>>>>> +	if (gc_type == FG_GC && seg_freed)
> >>>>>>>>>>
> >>>>>>>>>> That's original solution Sahitya provided to avoid infinite loop of GC, but I
> >>>>>>>>>> suggest to find the root cause first, then we added .invalid_segmap for that
> >>>>>>>>>> purpose.
> >>>>>>>>>
> >>>>>>>>> I've checked the Sahitya's patch. So, it seems the problem can happen due to
> >>>>>>>>> is_alive or atomic_file.
> >>>>>>>>
> >>>>>>>> For some conditions, this doesn't help, for example, two sections contain the
> >>>>>>>> same fewest valid blocks, it will cause to loop selecting them if it fails to
> >>>>>>>> migrate blocks.
> >>>>>>>>
> >>>>>>>> How about keeping it as it is to find potential bug.
> >>>>>>>
> >>>>>>> I think it'd be fine to merge this. Could you check the above scenario in more
> >>>>>>> detail?
> >>>>>>
> >>>>>> I haven't saw this in real scenario yet.
> >>>>>>
> >>>>>> What I mean is if there is a bug (maybe in is_alive()) failing us to GC on one
> >>>>>> section, when that bug happens in two candidates, there could be the same
> >>>>>> condition that GC will run into loop (select A, fail to migrate; select B, fail
> >>>>>> to migrate, select A...).
> >>>>>>
> >>>>>> But I guess the benefit of this change is, if FGGC fails to migrate block due to
> >>>>>> i_gc_rwsem race, selecting another section and later retrying previous one may
> >>>>>> avoid lock race, right?
> >>>>>
> >>>>> In any case, I think this can avoid potenial GC loop. At least to me, it'd be
> >>>>> quite risky, if we remain this just for debugging purpose only.
> >>>>
> >>>> Yup,
> >>>>
> >>>> One more concern is would this cur_victim_sec remain after FGGC? then BGGC/SSR
> >>>> will always skip the section cur_victim_sec points to.
> >>>
> >>> Then, we can get another loop before using it by BGGC/SSR.
> >>
> >> I guess I didn't catch your point, do you mean, if we reset it in the end of
> >> FGGC, we may encounter the loop during BGGC/SSR?
> > 
> > FGGC failed in a loop and last victim was remained in cur_victim_sec.
> 
> It won't run into a loop because we keep below condition?

The following FGGC will be likely to select this victim again, which doesn't
mean "this loop" but "loop of f2fs_gc".

> 
> +	if (gc_type == FG_GC && seg_freed)
> +		sbi->cur_victim_sec = NULL_SEGNO;
> 
> 	if (sync)
> 		goto stop;
> 
> I meant add below logic in addition:
> 
> +	if (gc_type == FG_GC)
> +		sbi->cur_victim_sec = NULL_SEGNO;
> 
> 	mutex_unlock(&sbi->gc_mutex);
> 
> Thanks,
> 
> > Next FGGC kicked in and did the same thing again. I don't expect BGGC/SSR
> > wants to select this victim much, since it will have CB policy.
> > 
> >>
> >> I meant:
> >>
> >> f2fs_gc()
> >> ...
> >>
> >> +	if (gc_type == FG_GC)
> >> +		sbi->cur_victim_sec = NULL_SEGNO;
> >>
> >> 	mutex_unlock(&sbi->gc_mutex);
> >>
> >> 	put_gc_inode(&gc_list);
> >> ...
> >>
> >> Thanks,
> >>
> >>>
> >>>>
> >>>> So could we reset cur_victim_sec in the end of FGGC?
> >>>>
> >>>> Thanks,
> >>>>
> >>>>>
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>>
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>>
> >>>>>>>>>>>  		sbi->cur_victim_sec = NULL_SEGNO;
> >>>>>>>>>>>  
> >>>>>>>>>>>  	if (sync)
> >>>>>>>>>>>
> >>>>>>>>> .
> >>>>>>>>>
> >>>>>>> .
> >>>>>>>
> >>>>> .
> >>>>>
> >>> .
> >>>
> > .
> > 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, back to index

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-09  1:25 [f2fs-dev] [PATCH 1/2] f2fs: do not select same victim right again Jaegeuk Kim
2019-09-09  1:25 ` [f2fs-dev] [PATCH 2/2] f2fs: avoid infinite GC loop due to stale atomic files Jaegeuk Kim
2019-09-09  3:03   ` Chao Yu
2019-09-09  7:30     ` Jaegeuk Kim
2019-09-09  7:54       ` Chao Yu
2019-09-09  8:01         ` Jaegeuk Kim
2019-09-09  8:05           ` Chao Yu
2019-09-09  8:21             ` Jaegeuk Kim
2019-09-09  8:27               ` Chao Yu
2019-09-09  8:38                 ` Jaegeuk Kim
2019-09-09  8:44                   ` Jaegeuk Kim
2019-09-09 11:26                   ` Chao Yu
2019-09-09 14:34                     ` Jaegeuk Kim
2019-09-10  0:59                       ` Chao Yu
2019-09-10 11:58                         ` Jaegeuk Kim
2019-09-10 12:04                           ` Chao Yu
2019-09-10 12:09                             ` Jaegeuk Kim
2019-09-16  1:15                               ` Chao Yu
2019-09-09  2:56 ` [f2fs-dev] [PATCH 1/2] f2fs: do not select same victim right again Chao Yu
2019-09-09  8:06   ` Jaegeuk Kim
2019-09-09 11:32     ` Chao Yu
2019-09-09 12:04       ` Jaegeuk Kim
2019-09-16  1:22         ` Chao Yu
2019-09-16 15:37           ` Jaegeuk Kim
2019-09-17  1:42             ` Chao Yu
2019-09-17 20:55               ` Jaegeuk Kim
2019-09-18  1:43                 ` Chao Yu
2019-09-18  3:12                   ` Jaegeuk Kim
2019-09-18  3:26                     ` Chao Yu
2019-09-18 16:47                       ` Jaegeuk Kim
2019-09-19  0:53                         ` Chao Yu
2019-09-19 17:11                           ` Jaegeuk Kim

Linux-f2fs-devel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-f2fs-devel/0 linux-f2fs-devel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-f2fs-devel linux-f2fs-devel/ https://lore.kernel.org/linux-f2fs-devel \
		linux-f2fs-devel@lists.sourceforge.net linux-f2fs-devel@archiver.kernel.org
	public-inbox-index linux-f2fs-devel


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/net.sourceforge.lists.linux-f2fs-devel


AGPL code for this site: git clone https://public-inbox.org/ public-inbox