From: Hou Tao <houtao1@huawei.com>
To: Dominique Martinet <asmadeus@codewreck.org>
Cc: <v9fs-developer@lists.sourceforge.net>, <lucho@ionkov.net>,
<ericvh@gmail.com>, <linux-fsdevel@vger.kernel.org>,
<linux-kernel@vger.kernel.org>, <xingaopeng@huawei.com>
Subject: Re: [PATCH] 9p: use inode->i_lock to protect i_size_write()
Date: Thu, 10 Jan 2019 10:50:12 +0800 [thread overview]
Message-ID: <831fe284-c9e9-49a4-d530-5af57c2dd9d1@huawei.com> (raw)
Message-ID: <20190110025012.z2ilKrOfo0advylfVkunVqFVrpDBPaaXnW8dWF2CjJE@z> (raw)
In-Reply-To: <20190109023832.GA12389@nautica>
Hi,
On 2019/1/9 10:38, Dominique Martinet wrote:
> Hou Tao wrote on Wed, Jan 09, 2019:
>> Use inode->i_lock to protect i_size_write(), else i_size_read() in
>> generic_fillattr() may loop infinitely when multiple processes invoke
>> v9fs_vfs_getattr() or v9fs_vfs_getattr_dotl() simultaneously under
>> 32-bit SMP environment, and a soft lockup will be triggered as show below:
> Hmm, I'm not familiar with the read/write seqcount code for 32 bit but I
> don't understand how locking here helps besides slowing things down (so
> if the value is constantly updated, the read thread might have a chance
> to be scheduled between two updates which was harder to do before ; and
> thus "solving" your soft lockup)
i_size_read() will call read_seqcount_begin() under 32-bit SMP environment,
and it may loop in __read_seqcount_begin() infinitely because two or more
invocations of write_seqcount_begin interleave and s->sequence becomes
an odd number. It's noted in comments of i_size_write():
/*
* NOTE: unlike i_size_read(), i_size_write() does need locking around it
* (normally i_mutex), otherwise on 32bit/SMP an update of i_size_seqcount
* can be lost, resulting in subsequent i_size_read() calls spinning forever.
*/
static inline void i_size_write(struct inode *inode, loff_t i_size)
{
#if BITS_PER_LONG==32 && defined(CONFIG_SMP)
preempt_disable();
write_seqcount_begin(&inode->i_size_seqcount);
inode->i_size = i_size;
write_seqcount_end(&inode->i_size_seqcount);
preempt_enable();
#elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPT)
preempt_disable();
inode->i_size = i_size;
preempt_enable();
#else
inode->i_size = i_size;
#endif
}
> Instead, a better fix would be to update v9fs_stat2inode to first read
> the inode size, and only call i_size_write if it changed - I'd bet this
> also fixes the problem and looks better than locking to me.
> (Can also probably reuse stat->length instead of the following
> i_size_read for i_blocks...)
For read-only case, this fix will work. However if the inode size is changed
constantly, there will be two or more callers of i_size_write() and the soft-lockup
is still possible.
>
> On the other hand it might make sense to also lock the inode for
> stat2inode because we're dealing with partially updated inodes at time,
> but if we do this I'd rather put the locking in v9fs_stat2inode and not
> outside of it to catch all the places where it's used; but the readers
> don't lock so I'm not sure it makes much sense.
Moving lock into v9fs_stat2inode() sounds reasonable. There are callers
which don't need it (e.g. v9fs_qid_iget() uses it to fill attribute for a
newly-created inode and v9fs_mount() uses it to fill attribute for root inode),
so i will rename v9fs_stat2inode() to v9fs_stat2inode_nolock(), and wrap
v9fs_stat2inode() upon v9fs_stat2inode_nolock().
>
> There's also a window during which the inode's nlink is dropped down to
> 1 then set again appropriately if the extension is present; that's
> rather ugly and we probably should only reset it to 1 if the attribute
> wasn't set before... That can be another patch and/or I'll do it
> eventually if you don't.
I can not follow that. Do you mean inode->i_nlink may be updated concurrently
by v9fs_stat2inode() and v9fs_remove() and that will lead to corruption of i_nlink ?
I also note a race about updating of v9inode->cache_validity. It seems that it is possible
the clear of V9FS_INO_INVALID_ATTR in v9fs_remove() may lost if there are invocations of
v9fs_vfs_getattr() in the same time. We may need to ensure V9FS_INO_INVALID_ATTR is enabled
before clearing it atomically in v9fs_vfs_getattr() and i will send another patch for it.
Regards,
Tao
> I hope what I said makes sense.
>
> Thanks,
next prev parent reply other threads:[~2019-01-10 2:50 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-01-09 2:05 [PATCH] 9p: use inode->i_lock to protect i_size_write() Hou Tao
2019-01-09 2:05 ` Hou Tao
2019-01-09 2:38 ` Dominique Martinet
2019-01-10 2:50 ` Hou Tao [this message]
2019-01-10 2:50 ` Hou Tao
2019-01-10 4:18 ` Dominique Martinet
2019-01-10 4:18 ` Dominique Martinet
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=831fe284-c9e9-49a4-d530-5af57c2dd9d1@huawei.com \
--to=houtao1@huawei.com \
--cc=asmadeus@codewreck.org \
--cc=ericvh@gmail.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=lucho@ionkov.net \
--cc=v9fs-developer@lists.sourceforge.net \
--cc=xingaopeng@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).