From: Nicholas Krause <xerofoify@gmail.com>
To: Chao Yu <chao2.yu@samsung.com>,'Jaegeuk Kim' <jaegeuk@kernel.org>
Cc: linux-fsdevel@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net
Subject: Re: Space leak in f2fs
Date: Sun, 17 May 2015 22:50:17 -0400 [thread overview]
Message-ID: <B9543FEE-3BB9-480B-BFB0-A0D980969C0B@gmail.com> (raw)
In-Reply-To: <000001d09114$7e0318d0$7a094a70$@samsung.com>
On May 17, 2015 10:43:14 PM EDT, Chao Yu <chao2.yu@samsung.com> wrote:
>Hi Jaegeuk,
>
>> -----Original Message-----
>> From: Jaegeuk Kim [mailto:jaegeuk@kernel.org]
>> Sent: Saturday, May 16, 2015 8:56 AM
>> To: Chao Yu
>> Cc: 'hujianyang'; linux-fsdevel@vger.kernel.org;
>linux-f2fs-devel@lists.sourceforge.net
>> Subject: Re: [f2fs-dev] Space leak in f2fs
>>
>> Hi Chao,
>>
>> On Fri, May 15, 2015 at 04:31:43PM +0800, Chao Yu wrote:
>> > Hi Jaegeuk,
>> >
>>
>> [snip]
>>
>> > > + /* if orphan inode, we don't need to write its data */
>> > > + if (is_orphan_inode(sbi, inode->i_ino))
>> > > + goto out;
>> >
>> > When user create a temp file by invoking open with O_TMPFILE flag,
>> > in ->tmpfile our temp file will be added into orphan list as its
>> > nlink is zero.
>> >
>> > If we skip writting out data for this orphan inode, later, even
>though
>> > we add nlink/directory entry for orphan inode by calling linkat,
>> > our file will contain inconsistent data between in-memory and
>on-disk.
>> >
>> > So how about considering for this case?
>>
>> Right.
>> How about the below patch?
>>
>> >
>> > BTW, the previous fixing patch looks good to me.
>>
>> But, my new concern here is a memory pressure. If we do not drop the
>inode
>> when iput was called, we need to wait for another time slot to
>reclaim its
>> memory.
>
>Agree. Please see below.
>
>>
>> Thanks,
>>
>> ---
>> fs/f2fs/checkpoint.c | 19 +++++++++++++++++++
>> fs/f2fs/data.c | 8 ++++++++
>> fs/f2fs/dir.c | 1 +
>> fs/f2fs/f2fs.h | 2 ++
>> fs/f2fs/super.c | 14 +++++++++++++-
>> 5 files changed, 43 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
>> index 7b7a9d8..74875fb 100644
>> --- a/fs/f2fs/checkpoint.c
>> +++ b/fs/f2fs/checkpoint.c
>> @@ -378,6 +378,20 @@ static void __remove_ino_entry(struct
>f2fs_sb_info *sbi, nid_t ino, int
>> type)
>> spin_unlock(&im->ino_lock);
>> }
>>
>> +static bool __exist_ino_entry(struct f2fs_sb_info *sbi, nid_t ino,
>int type)
>> +{
>> + struct inode_management *im = &sbi->im[type];
>> + struct ino_entry *e;
>> + bool exist = false;
>> +
>> + spin_lock(&im->ino_lock);
>> + e = radix_tree_lookup(&im->ino_root, ino);
>> + if (e)
>> + exist = true;
>> + spin_unlock(&im->ino_lock);
>> + return exist;
>> +}
>> +
>> void add_dirty_inode(struct f2fs_sb_info *sbi, nid_t ino, int type)
>> {
>> /* add new dirty ino entry into list */
>> @@ -458,6 +472,11 @@ void remove_orphan_inode(struct f2fs_sb_info
>*sbi, nid_t ino)
>> __remove_ino_entry(sbi, ino, ORPHAN_INO);
>> }
>>
>> +bool is_orphan_inode(struct f2fs_sb_info *sbi, nid_t ino)
>> +{
>> + return __exist_ino_entry(sbi, ino, ORPHAN_INO);
>> +}
>> +
>> static void recover_orphan_inode(struct f2fs_sb_info *sbi, nid_t
>ino)
>> {
>> struct inode *inode = f2fs_iget(sbi->sb, ino);
>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>> index b0cc2aa..d883c14 100644
>> --- a/fs/f2fs/data.c
>> +++ b/fs/f2fs/data.c
>> @@ -1749,6 +1749,14 @@ write:
>> goto out;
>> }
>>
>> + /*
>> + * if orphan inode, we don't need to write its data,
>> + * but, tmpfile is not the case.
>> + */
>> + if (is_orphan_inode(sbi, inode->i_ino) &&
>> + !is_inode_flag_set(F2FS_I(inode), FI_TMP_INODE))
>
>For normal inode, all dirty pages will not be written out, and after
>that pages
>can be reclaimed by VM any time due to they are be cleaned when flush.
>Then any
>process who held the orphan inode may not read any original data
>correctly from
>this inode.
>
>And here is the unlink description in POSIX:
>"If one or more processes have the file open when the last link is
>removed,
>the link shall be removed before unlink() returns, but the removal of
>the
>file contents shall be postponed until all references to the file are
>closed."
>
>To my understanding for above description, we should keep data of
>helded orphan
>inode in memory or on disk until it is not referenced by any processes.
>
>How do you think of it?
>
>using "if (is_orphan_inode(sbi, inode->i_ino) &&
>!atomic_read(&inode->i_count))"
>to skip writing at the beginning of ->writepage()?
>
>Thanks,
>
Chao,
Your correct here, I was going to recommend this but my explanation
was pretty badly worded. Again also
I am not that well versed in the f2fs
code base so I wasn't sure if my
answer was correct.
Nick
>> + goto out;
>> +
>> if (!wbc->for_reclaim)
>> need_balance_fs = true;
>> else if (has_not_enough_free_secs(sbi, 0))
>> diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
>> index 3e92376..a2ea1b9 100644
>> --- a/fs/f2fs/dir.c
>> +++ b/fs/f2fs/dir.c
>> @@ -648,6 +648,7 @@ int f2fs_do_tmpfile(struct inode *inode, struct
>inode *dir)
>> update_inode(inode, page);
>> f2fs_put_page(page, 1);
>>
>> + set_inode_flag(F2FS_I(inode), FI_TMP_INODE);
>> clear_inode_flag(F2FS_I(inode), FI_NEW_INODE);
>> fail:
>> up_write(&F2FS_I(inode)->i_sem);
>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>> index cdcae06..de21d38 100644
>> --- a/fs/f2fs/f2fs.h
>> +++ b/fs/f2fs/f2fs.h
>> @@ -1337,6 +1337,7 @@ static inline void f2fs_change_bit(unsigned int
>nr, char *addr)
>> /* used for f2fs_inode_info->flags */
>> enum {
>> FI_NEW_INODE, /* indicate newly allocated inode */
>> + FI_TMP_INODE, /* indicate tmpfile */
>> FI_DIRTY_INODE, /* indicate inode is dirty or not */
>> FI_DIRTY_DIR, /* indicate directory has dirty pages */
>> FI_INC_LINK, /* need to increment i_nlink */
>> @@ -1726,6 +1727,7 @@ int acquire_orphan_inode(struct f2fs_sb_info
>*);
>> void release_orphan_inode(struct f2fs_sb_info *);
>> void add_orphan_inode(struct f2fs_sb_info *, nid_t);
>> void remove_orphan_inode(struct f2fs_sb_info *, nid_t);
>> +bool is_orphan_inode(struct f2fs_sb_info *, nid_t);
>> void recover_orphan_inodes(struct f2fs_sb_info *);
>> int get_valid_checkpoint(struct f2fs_sb_info *);
>> void update_dirty_page(struct inode *, struct page *);
>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>> index 7464d08..98af3bf 100644
>> --- a/fs/f2fs/super.c
>> +++ b/fs/f2fs/super.c
>> @@ -430,9 +430,21 @@ static int f2fs_drop_inode(struct inode *inode)
>> * - f2fs_write_data_page
>> * - f2fs_gc -> iput -> evict
>> * - inode_wait_for_writeback(inode)
>> + * In order to avoid that, f2fs_write_data_page does not write data
>> + * pages for orphan inode except tmpfile.
>> + * Nevertheless, we need to truncate the tmpfile's data to avoid
>> + * needless cleaning.
>> */
>> - if (!inode_unhashed(inode) && inode->i_state & I_SYNC)
>> + if (is_inode_flag_set(F2FS_I(inode), FI_TMP_INODE) &&
>> + inode->i_state & I_SYNC) {
>> + spin_unlock(&inode->i_lock);
>> + i_size_write(inode, 0);
>> +
>> + if (F2FS_HAS_BLOCKS(inode))
>> + f2fs_truncate(inode);
>> + spin_lock(&inode->i_lock);
>> return 0;
>> + }
>> return generic_drop_inode(inode);
>> }
>>
>> --
>> 2.1.1
>
>
>
>------------------------------------------------------------------------------
>One dashboard for servers and applications across
>Physical-Virtual-Cloud
>Widest out-of-the-box monitoring support with 50+ applications
>Performance metrics, stats and reports that give you Actionable
>Insights
>Deep dive visibility with transaction tracing using APM Insight.
>http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
>_______________________________________________
>Linux-f2fs-devel mailing list
>Linux-f2fs-devel@lists.sourceforge.net
>https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
------------------------------------------------------------------------------
One dashboard for servers and applications across Physical-Virtual-Cloud
Widest out-of-the-box monitoring support with 50+ applications
Performance metrics, stats and reports that give you Actionable Insights
Deep dive visibility with transaction tracing using APM Insight.
http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
next prev parent reply other threads:[~2015-05-18 2:50 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-05-13 7:17 Space leak in f2fs hujianyang
2015-05-13 17:46 ` [f2fs-dev] " Jaegeuk Kim
2015-05-14 0:24 ` Jaegeuk Kim
2015-05-14 1:40 ` hujianyang
2015-05-14 1:45 ` [f2fs-dev] " Jaegeuk Kim
2015-05-14 21:14 ` Jaegeuk Kim
2015-05-15 8:31 ` Chao Yu
2015-05-16 0:55 ` Jaegeuk Kim
2015-05-18 2:43 ` Chao Yu
2015-05-18 2:50 ` Nicholas Krause [this message]
2015-05-18 5:44 ` Jaegeuk Kim
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=B9543FEE-3BB9-480B-BFB0-A0D980969C0B@gmail.com \
--to=xerofoify@gmail.com \
--cc=chao2.yu@samsung.com \
--cc=jaegeuk@kernel.org \
--cc=linux-f2fs-devel@lists.sourceforge.net \
--cc=linux-fsdevel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).