From: Jaegeuk Kim <jaegeuk@kernel.org>
To: Chao Yu <yuchao0@huawei.com>
Cc: g@jaegeuk-macbookpro.roam.corp.google.com,
	linux-kernel@vger.kernel.org,
	linux-f2fs-devel@lists.sourceforge.net
Subject: Re: [f2fs-dev] [PATCH 2/2] f2fs: avoid infinite GC loop due to stale atomic files
Date: Mon, 9 Sep 2019 09:38:44 +0100
Message-ID: <20190909083844.GC25724@jaegeuk-macbookpro.roam.corp.google.com>
In-Reply-To: <2f5b844c-f722-6a80-a4ab-61bdd72b8be4@huawei.com>

On 09/09, Chao Yu wrote:
> On 2019/9/9 16:21, Jaegeuk Kim wrote:
> > On 09/09, Chao Yu wrote:
> >> On 2019/9/9 16:01, Jaegeuk Kim wrote:
> >>> On 09/09, Chao Yu wrote:
> >>>> On 2019/9/9 15:30, Jaegeuk Kim wrote:
> >>>>> On 09/09, Chao Yu wrote:
> >>>>>> On 2019/9/9 9:25, Jaegeuk Kim wrote:
> >>>>>>> If committing atomic pages fails in f2fs_do_sync_file(), we can end up
> >>>>>>> with committed pages but atomic_file still set, like:
> >>>>>>>
> >>>>>>> - inmem:    0, atomic IO:    4 (Max.   10), volatile IO:    0 (Max.    0)
> >>>>>>>
> >>>>>>> If GC selects this block, we can get an infinite loop like this:
> >>>>>>>
> >>>>>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> >>>>>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> >>>>>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> >>>>>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> >>>>>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
> >>>>>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
> >>>>>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
> >>>>>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
> >>>>>>>
> >>>>>>> At that moment, we can observe:
> >>>>>>>
> >>>>>>> [Before]
> >>>>>>> Try to move 5084219 blocks (BG: 384508)
> >>>>>>>   - data blocks : 4962373 (274483)
> >>>>>>>   - node blocks : 121846 (110025)
> >>>>>>> Skipped : atomic write 4534686 (10)
> >>>>>>>
> >>>>>>> [After]
> >>>>>>> Try to move 5088973 blocks (BG: 384508)
> >>>>>>>   - data blocks : 4967127 (274483)
> >>>>>>>   - node blocks : 121846 (110025)
> >>>>>>> Skipped : atomic write 4539440 (10)
> >>>>>>>
> >>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> >>>>>>> ---
> >>>>>>>  fs/f2fs/file.c | 10 +++++-----
> >>>>>>>  1 file changed, 5 insertions(+), 5 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> >>>>>>> index 7ae2f3bd8c2f..68b6da734e5f 100644
> >>>>>>> --- a/fs/f2fs/file.c
> >>>>>>> +++ b/fs/f2fs/file.c
> >>>>>>> @@ -1997,11 +1997,11 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
> >>>>>>>  			goto err_out;
> >>>>>>>  
> >>>>>>>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
> >>>>>>> -		if (!ret) {
> >>>>>>> -			clear_inode_flag(inode, FI_ATOMIC_FILE);
> >>>>>>> -			F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
> >>>>>>> -			stat_dec_atomic_write(inode);
> >>>>>>> -		}
> >>>>>>> +
> >>>>>>> +		/* doesn't need to check error */
> >>>>>>> +		clear_inode_flag(inode, FI_ATOMIC_FILE);
> >>>>>>> +		F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
> >>>>>>> +		stat_dec_atomic_write(inode);
> >>>>>>
> >>>>>> If there are still valid atomic write pages linked in .inmem_pages, it may
> >>>>>> cause a memory leak if we just clear the FI_ATOMIC_FILE flag.
> >>>>>
> >>>>> f2fs_commit_inmem_pages() should have flushed them.
> >>>>
> >>>> Oh, we failed to flush its nodes.
> >>>>
> >>>> However, we won't clear such info if we fail to flush the inmem pages,
> >>>> which looks inconsistent.
> >>>>
> >>>> Is any interface needed to drop inmem pages or clear the ATOMIC_FILE flag in
> >>>> those two error paths? I'm not sure how sqlite handles such an error.
> >>>
> >>> f2fs_drop_inmem_pages() did that, but not in this case.
> >>
> >> What I mean is: for any error returned from the atomic_commit() interface,
> >> should the userspace application handle it in a consistent way, e.g. by
> >> triggering f2fs_drop_inmem_pages(), so we don't need to handle it inside
> >> atomic_commit()?
> > 
> > f2fs_ioc_abort_volatile_write() will be triggered.
> 
> If userspace can do this, we can get rid of this patch, or am I missing something?

We don't know when that will come. And, other threads are waiting for GC here.

> 
> - f2fs_ioc_abort_volatile_write
>  - f2fs_drop_inmem_pages
>   - clear_inode_flag(inode, FI_ATOMIC_FILE);
>   - fi->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
>   - stat_dec_atomic_write(inode);
> 
> > 
> >>
> >>>
> >>>>
> >>>> Thanks,
> >>>>
> >>>>>
> >>>>>>
> >>>>>> So my question is: why didn't the logic below handle this condition?
> >>>>>>
> >>>>>> f2fs_gc()
> >>>>>>
> >>>>>> 	if (has_not_enough_free_secs(sbi, sec_freed, 0)) {
> >>>>>> 		if (skipped_round <= MAX_SKIP_GC_COUNT ||
> >>>>>> 					skipped_round * 2 < round) {
> >>>>>> 			segno = NULL_SEGNO;
> >>>>>> 			goto gc_more;
> >>>>>> 		}
> >>>>>>
> >>>>>> 		if (first_skipped < last_skipped &&
> >>>>>> 				(last_skipped - first_skipped) >
> >>>>>> 						sbi->skipped_gc_rwsem) {
> >>>>>> 			f2fs_drop_inmem_pages_all(sbi, true);
> >>>>>
> >>>>> This is doing nothing, since f2fs_commit_inmem_pages() removed the inode
> >>>>> from inmem list.
> >>>>>
> >>>>>> 			segno = NULL_SEGNO;
> >>>>>> 			goto gc_more;
> >>>>>> 		}
> >>>>>> 		if (gc_type == FG_GC && !is_sbi_flag_set(sbi, SBI_CP_DISABLED))
> >>>>>> 			ret = f2fs_write_checkpoint(sbi, &cpc);
> >>>>>> 	}
> >>>>>>
> >>>>>>>  	} else {
> >>>>>>>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
> >>>>>>>  	}
> >>>>>>>
> >>>>> .
> >>>>>
> >>> .
> >>>
> > .
> > 
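
To make the argument above easier to follow, here is a small user-space
sketch of the retry logic being discussed. It is a toy model, not kernel
code: toy_inode, toy_commit(), toy_drop_inmem_all() and toy_gc() are
hypothetical names, the value of MAX_SKIP_GC_COUNT is an illustrative
assumption, and the model assumes that GC skips a block whenever its inode
still has the atomic flag set. It only illustrates why dropping inmem pages
cannot help once the commit path has already taken the inode off the inmem
list.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SKIP_GC_COUNT 16	/* assumption: illustrative value only */

/* Toy stand-in for the per-inode state discussed in the thread. */
struct toy_inode {
	bool atomic_file;	/* models FI_ATOMIC_FILE */
	bool on_inmem_list;	/* models membership in the inmem inode list */
};

/*
 * Models the commit ioctl. The in-memory pages are committed, so the inode
 * leaves the inmem list either way. With clear_always == false the flag is
 * cleared only when the sync step succeeded (old behaviour); with
 * clear_always == true it is cleared unconditionally (patched behaviour).
 */
static void toy_commit(struct toy_inode *in, bool sync_ok, bool clear_always)
{
	in->on_inmem_list = false;
	if (sync_ok || clear_always)
		in->atomic_file = false;
}

/* Models the "drop all inmem pages" escape hatch: it only sees listed inodes. */
static void toy_drop_inmem_all(struct toy_inode *in)
{
	if (in->on_inmem_list) {
		in->on_inmem_list = false;
		in->atomic_file = false;
	}
}

/*
 * Models the GC retry loop. Returns the round in which the victim block
 * could be moved, or -1 if max_rounds passed without progress (standing in
 * for "loops forever").
 */
static int toy_gc(struct toy_inode *in, int max_rounds)
{
	int round = 0, skipped_round = 0;

	assert(max_rounds > 0);
	while (round < max_rounds) {
		round++;
		if (!in->atomic_file)
			return round;	/* block moved, GC makes progress */
		skipped_round++;	/* block skipped: atomic flag still set */
		if (skipped_round <= MAX_SKIP_GC_COUNT ||
		    skipped_round * 2 < round)
			continue;	/* plain retry */
		toy_drop_inmem_all(in);	/* escape hatch, then retry */
	}
	return -1;
}
```

In this model, an inode whose commit failed under the old behaviour never
becomes movable because the escape hatch no longer finds it, while the
patched behaviour lets GC move the block on the first round.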
