linux-fsdevel.vger.kernel.org archive mirror
From: Nicholas Krause <xerofoify@gmail.com>
To: Chao Yu <chao2.yu@samsung.com>,'Jaegeuk Kim' <jaegeuk@kernel.org>
Cc: linux-fsdevel@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net
Subject: Re: Space leak in f2fs
Date: Sun, 17 May 2015 22:50:17 -0400
Message-ID: <B9543FEE-3BB9-480B-BFB0-A0D980969C0B@gmail.com>
In-Reply-To: <000001d09114$7e0318d0$7a094a70$@samsung.com>



On May 17, 2015 10:43:14 PM EDT, Chao Yu <chao2.yu@samsung.com> wrote:
>Hi Jaegeuk,
>
>> -----Original Message-----
>> From: Jaegeuk Kim [mailto:jaegeuk@kernel.org]
>> Sent: Saturday, May 16, 2015 8:56 AM
>> To: Chao Yu
>> Cc: 'hujianyang'; linux-fsdevel@vger.kernel.org; linux-f2fs-devel@lists.sourceforge.net
>> Subject: Re: [f2fs-dev] Space leak in f2fs
>> 
>> Hi Chao,
>> 
>> On Fri, May 15, 2015 at 04:31:43PM +0800, Chao Yu wrote:
>> > Hi Jaegeuk,
>> >
>> 
>> [snip]
>> 
>> > > +	/* if orphan inode, we don't need to write its data */
>> > > +	if (is_orphan_inode(sbi, inode->i_ino))
>> > > +		goto out;
>> >
>> > When a user creates a temp file by invoking open() with the O_TMPFILE
>> > flag, ->tmpfile adds the file to the orphan list because its nlink
>> > is zero.
>> >
>> > If we skip writing out data for this orphan inode, then even after
>> > we later add an nlink/directory entry for it by calling linkat(),
>> > the file's in-memory and on-disk data will be inconsistent.
>> >
>> > So how about handling this case?
>> 
>> Right.
>> How about the below patch?
>> 
>> >
>> > BTW, the previous fixing patch looks good to me.
>> 
>> But my new concern here is memory pressure. If we do not drop the
>> inode when iput() is called, we need to wait for another time slot to
>> reclaim its memory.
>
>Agree. Please see below.
>
>> 
>> Thanks,
>> 
>> ---
>>  fs/f2fs/checkpoint.c | 19 +++++++++++++++++++
>>  fs/f2fs/data.c       |  8 ++++++++
>>  fs/f2fs/dir.c        |  1 +
>>  fs/f2fs/f2fs.h       |  2 ++
>>  fs/f2fs/super.c      | 14 +++++++++++++-
>>  5 files changed, 43 insertions(+), 1 deletion(-)
>> 
>> diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
>> index 7b7a9d8..74875fb 100644
>> --- a/fs/f2fs/checkpoint.c
>> +++ b/fs/f2fs/checkpoint.c
>> @@ -378,6 +378,20 @@ static void __remove_ino_entry(struct f2fs_sb_info *sbi, nid_t ino, int type)
>>  	spin_unlock(&im->ino_lock);
>>  }
>> 
>> +static bool __exist_ino_entry(struct f2fs_sb_info *sbi, nid_t ino, int type)
>> +{
>> +	struct inode_management *im = &sbi->im[type];
>> +	struct ino_entry *e;
>> +	bool exist = false;
>> +
>> +	spin_lock(&im->ino_lock);
>> +	e = radix_tree_lookup(&im->ino_root, ino);
>> +	if (e)
>> +		exist = true;
>> +	spin_unlock(&im->ino_lock);
>> +	return exist;
>> +}
>> +
>>  void add_dirty_inode(struct f2fs_sb_info *sbi, nid_t ino, int type)
>>  {
>>  	/* add new dirty ino entry into list */
>> @@ -458,6 +472,11 @@ void remove_orphan_inode(struct f2fs_sb_info *sbi, nid_t ino)
>>  	__remove_ino_entry(sbi, ino, ORPHAN_INO);
>>  }
>> 
>> +bool is_orphan_inode(struct f2fs_sb_info *sbi, nid_t ino)
>> +{
>> +	return __exist_ino_entry(sbi, ino, ORPHAN_INO);
>> +}
>> +
>>  static void recover_orphan_inode(struct f2fs_sb_info *sbi, nid_t ino)
>>  {
>>  	struct inode *inode = f2fs_iget(sbi->sb, ino);
>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>> index b0cc2aa..d883c14 100644
>> --- a/fs/f2fs/data.c
>> +++ b/fs/f2fs/data.c
>> @@ -1749,6 +1749,14 @@ write:
>>  		goto out;
>>  	}
>> 
>> +	/*
>> +	 * if orphan inode, we don't need to write its data,
>> +	 * but, tmpfile is not the case.
>> +	 */
>> +	if (is_orphan_inode(sbi, inode->i_ino) &&
>> +			!is_inode_flag_set(F2FS_I(inode), FI_TMP_INODE))
>
>For a normal inode, none of the dirty pages will be written out, and
>after that the pages can be reclaimed by the VM at any time, because
>they are marked clean when flushed. Then any process that holds the
>orphan inode may not be able to read the original data correctly from
>this inode.
>
>And here is the unlink description in POSIX:
>"If one or more processes have the file open when the last link is
>removed, the link shall be removed before unlink() returns, but the
>removal of the file contents shall be postponed until all references
>to the file are closed."
>
>As I understand the above description, we should keep the data of a
>held orphan inode in memory or on disk until it is no longer referenced
>by any process.
>
>What do you think?
>
>How about using
>"if (is_orphan_inode(sbi, inode->i_ino) && !atomic_read(&inode->i_count))"
>to skip writing at the beginning of ->writepage()?
>
>Thanks,
>
Chao,
You're correct here. I was going to recommend this, but my explanation
was pretty badly worded. Also, I am not that well versed in the f2fs
code base, so I wasn't sure if my answer was correct.
Nick
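
For reference, the POSIX behavior quoted above can be observed from user
space. The following is only a minimal sketch and not f2fs code: the
function name read_back_after_unlink() and the path passed to it are
hypothetical, chosen for illustration. It writes to a file, removes the
last link while the descriptor is still open, and then reads the data
back through that descriptor. That is the case where the inode is an
orphan but is still referenced, so its data has to stay readable.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create `path`, write `msg` to it, unlink it while
 * the descriptor is still open, then read the contents back through the
 * descriptor. Returns the number of bytes read, or -1 on error.
 */
ssize_t read_back_after_unlink(const char *path, const char *msg,
			       char *buf, size_t buflen)
{
	size_t len = strlen(msg);
	ssize_t n = -1;
	int fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);

	if (fd < 0)
		return -1;

	if (write(fd, msg, len) != (ssize_t)len) {
		unlink(path);
		close(fd);
		return -1;
	}

	/*
	 * Remove the last link. The link count drops to zero, but the open
	 * descriptor still holds a reference, so POSIX requires the file
	 * contents to remain available until close().
	 */
	if (unlink(path) == 0)
		n = pread(fd, buf, buflen, 0);

	/* Closing the last reference lets the filesystem free the data. */
	close(fd);
	return n;
}
```

Calling it with a scratch path on the filesystem under test and the
message "hello" should return 5 with "hello" in the buffer, even though
the path no longer exists by the time the data is read.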
>> +		goto out;
>> +
>>  	if (!wbc->for_reclaim)
>>  		need_balance_fs = true;
>>  	else if (has_not_enough_free_secs(sbi, 0))
>> diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
>> index 3e92376..a2ea1b9 100644
>> --- a/fs/f2fs/dir.c
>> +++ b/fs/f2fs/dir.c
>> @@ -648,6 +648,7 @@ int f2fs_do_tmpfile(struct inode *inode, struct inode *dir)
>>  	update_inode(inode, page);
>>  	f2fs_put_page(page, 1);
>> 
>> +	set_inode_flag(F2FS_I(inode), FI_TMP_INODE);
>>  	clear_inode_flag(F2FS_I(inode), FI_NEW_INODE);
>>  fail:
>>  	up_write(&F2FS_I(inode)->i_sem);
>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>> index cdcae06..de21d38 100644
>> --- a/fs/f2fs/f2fs.h
>> +++ b/fs/f2fs/f2fs.h
>> @@ -1337,6 +1337,7 @@ static inline void f2fs_change_bit(unsigned int nr, char *addr)
>>  /* used for f2fs_inode_info->flags */
>>  enum {
>>  	FI_NEW_INODE,		/* indicate newly allocated inode */
>> +	FI_TMP_INODE,		/* indicate tmpfile */
>>  	FI_DIRTY_INODE,		/* indicate inode is dirty or not */
>>  	FI_DIRTY_DIR,		/* indicate directory has dirty pages */
>>  	FI_INC_LINK,		/* need to increment i_nlink */
>> @@ -1726,6 +1727,7 @@ int acquire_orphan_inode(struct f2fs_sb_info *);
>>  void release_orphan_inode(struct f2fs_sb_info *);
>>  void add_orphan_inode(struct f2fs_sb_info *, nid_t);
>>  void remove_orphan_inode(struct f2fs_sb_info *, nid_t);
>> +bool is_orphan_inode(struct f2fs_sb_info *, nid_t);
>>  void recover_orphan_inodes(struct f2fs_sb_info *);
>>  int get_valid_checkpoint(struct f2fs_sb_info *);
>>  void update_dirty_page(struct inode *, struct page *);
>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>> index 7464d08..98af3bf 100644
>> --- a/fs/f2fs/super.c
>> +++ b/fs/f2fs/super.c
>> @@ -430,9 +430,21 @@ static int f2fs_drop_inode(struct inode *inode)
>>  	 *  - f2fs_write_data_page
>>  	 *    - f2fs_gc -> iput -> evict
>>  	 *       - inode_wait_for_writeback(inode)
>> +	 * In order to avoid that, f2fs_write_data_page does not write data
>> +	 * pages for orphan inode except tmpfile.
>> +	 * Nevertheless, we need to truncate the tmpfile's data to avoid
>> +	 * needless cleaning.
>>  	 */
>> -	if (!inode_unhashed(inode) && inode->i_state & I_SYNC)
>> +	if (is_inode_flag_set(F2FS_I(inode), FI_TMP_INODE) &&
>> +						inode->i_state & I_SYNC) {
>> +		spin_unlock(&inode->i_lock);
>> +		i_size_write(inode, 0);
>> +
>> +		if (F2FS_HAS_BLOCKS(inode))
>> +			f2fs_truncate(inode);
>> +		spin_lock(&inode->i_lock);
>>  		return 0;
>> +	}
>>  	return generic_drop_inode(inode);
>>  }
>> 
>> --
>> 2.1.1
>
>
>
>------------------------------------------------------------------------------
>One dashboard for servers and applications across
>Physical-Virtual-Cloud 
>Widest out-of-the-box monitoring support with 50+ applications
>Performance metrics, stats and reports that give you Actionable
>Insights
>Deep dive visibility with transaction tracing using APM Insight.
>http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
>_______________________________________________
>Linux-f2fs-devel mailing list
>Linux-f2fs-devel@lists.sourceforge.net
>https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Thread overview: 11+ messages
2015-05-13  7:17 Space leak in f2fs hujianyang
2015-05-13 17:46 ` [f2fs-dev] " Jaegeuk Kim
2015-05-14  0:24   ` Jaegeuk Kim
2015-05-14  1:40     ` hujianyang
2015-05-14  1:45       ` [f2fs-dev] " Jaegeuk Kim
2015-05-14 21:14       ` Jaegeuk Kim
2015-05-15  8:31         ` Chao Yu
2015-05-16  0:55           ` Jaegeuk Kim
2015-05-18  2:43             ` Chao Yu
2015-05-18  2:50               ` Nicholas Krause [this message]
2015-05-18  5:44               ` Jaegeuk Kim
