* possible race condition in vfs code
@ 2003-04-15 16:22 Dave Peterson
2003-04-29 22:51 ` Mika Penttilä
0 siblings, 1 reply; 4+ messages in thread
From: Dave Peterson @ 2003-04-15 16:22 UTC (permalink / raw)
To: linux-kernel
I think I found a race condition in some of the inode handling
code. Here's where I think it is:
1. Task A is inside kill_super(). It clears the MS_ACTIVE
flag of the s_flags field of the super_block struct and
calls invalidate_inodes(). In invalidate_inodes(), it
attempts to acquire inode_lock and spins because task B,
executing inside iput(), just decremented the reference
count of some inode i to 0 and acquired inode_lock.
2. Then the "if (!inode->i_nlink)" test condition evaluates
to false for task B so it executes the following chunk
of code:
01 } else {
02 if (!list_empty(&inode->i_hash)) {
03 if (!(inode->i_state & (I_DIRTY|I_LOCK))) {
04 list_del(&inode->i_list);
05 list_add(&inode->i_list, &inode_unused);
06 }
07 inodes_stat.nr_unused++;
08 spin_unlock(&inode_lock);
09 if (!sb || (sb->s_flags & MS_ACTIVE))
10 return;
11 write_inode_now(inode, 1);
12 spin_lock(&inode_lock);
13 inodes_stat.nr_unused--;
14 list_del_init(&inode->i_hash);
15 }
16 list_del_init(&inode->i_list);
17 inode->i_state|=I_FREEING;
18 inodes_stat.nr_inodes--;
19 spin_unlock(&inode_lock);
20 if (inode->i_data.nrpages)
21 truncate_inode_pages(&inode->i_data, 0);
22 clear_inode(inode);
23 }
24 destroy_inode(inode);
Now the test condition on line 02 evaluates to true, so
task B adds the inode to the inode_unused list and then
releases inode_lock on line 08.
3. Now A acquires inode_lock and B spins on inode_lock inside
write_inode_now();
4. Task A calls invalidate_list(), transferring inode i from
the inode_unused list to its own private throw_away list.
5. Task A releases inode_lock, allowing B to acquire inode_lock
and continue executing.
6. A attempts to destroy inode i inside dispose_list() while B
simultaneously attempts to destroy i, potentially causing
all sorts of bad things to happen.
So, did I find a bug or are my eyes playing tricks on me?
-Dave Peterson
dsp@llnl.gov
P.S. Please CC my email address when responding to this message.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: possible race condition in vfs code
2003-04-15 16:22 possible race condition in vfs code Dave Peterson
@ 2003-04-29 22:51 ` Mika Penttilä
2003-04-29 23:03 ` viro
0 siblings, 1 reply; 4+ messages in thread
From: Mika Penttilä @ 2003-04-29 22:51 UTC (permalink / raw)
To: Dave Peterson; +Cc: linux-kernel
That piece of code looks wrong in other ways also..if we have unmounted
an active fs (which shouldn't be done but happens!) we shouldn't be at
least writing back to it anything! The !sb test isn't useful (we never
clear it in live inodes) and MS_ACTIVE handling is racy as hell wrt
umount...
--Mika
Dave Peterson wrote:
>I think I found a race condition in some of the inode handling
>code. Here's where I think it is:
>
> 1. Task A is inside kill_super(). It clears the MS_ACTIVE
> flag of the s_flags field of the super_block struct and
> calls invalidate_inodes(). In invalidate_inodes(), it
> attempts to acquire inode_lock and spins because task B,
> executing inside iput(), just decremented the reference
> count of some inode i to 0 and acquired inode_lock.
>
> 2. Then the "if (!inode->i_nlink)" test condition evaluates
> to false for task B so it executes the following chunk
> of code:
>
> 01 } else {
> 02 if (!list_empty(&inode->i_hash)) {
> 03 if (!(inode->i_state & (I_DIRTY|I_LOCK))) {
> 04 list_del(&inode->i_list);
> 05 list_add(&inode->i_list, &inode_unused);
> 06 }
> 07 inodes_stat.nr_unused++;
> 08 spin_unlock(&inode_lock);
> 09 if (!sb || (sb->s_flags & MS_ACTIVE))
> 10 return;
> 11 write_inode_now(inode, 1);
> 12 spin_lock(&inode_lock);
> 13 inodes_stat.nr_unused--;
> 14 list_del_init(&inode->i_hash);
> 15 }
> 16 list_del_init(&inode->i_list);
> 17 inode->i_state|=I_FREEING;
> 18 inodes_stat.nr_inodes--;
> 19 spin_unlock(&inode_lock);
> 20 if (inode->i_data.nrpages)
> 21 truncate_inode_pages(&inode->i_data, 0);
> 22 clear_inode(inode);
> 23 }
> 24 destroy_inode(inode);
>
> Now the test condition on line 02 evaluates to true, so
> task B adds the inode to the inode_unused list and then
> releases inode_lock on line 08.
>
> 3. Now A acquires inode_lock and B spins on inode_lock inside
> write_inode_now();
>
> 4. Task A calls invalidate_list(), transferring inode i from
> the inode_unused list to its own private throw_away list.
>
> 5. Task A releases inode_lock, allowing B to acquire inode_lock
> and continue executing.
>
> 6. A attempts to destroy inode i inside dispose_list() while B
> simultaneously attempts to destroy i, potentially causing
> all sorts of bad things to happen.
>
>So, did I find a bug or are my eyes playing tricks on me?
>
>-Dave Peterson
> dsp@llnl.gov
>
>P.S. Please CC my email address when responding to this message.
>
>-
>To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at http://vger.kernel.org/majordomo-info.html
>Please read the FAQ at http://www.tux.org/lkml/
>
>
>
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: possible race condition in vfs code
2003-04-29 22:51 ` Mika Penttilä
@ 2003-04-29 23:03 ` viro
2003-04-30 6:52 ` Mika Penttilä
0 siblings, 1 reply; 4+ messages in thread
From: viro @ 2003-04-29 23:03 UTC (permalink / raw)
To: Mika Penttilä; +Cc: Dave Peterson, linux-kernel
On Wed, Apr 30, 2003 at 01:51:42AM +0300, Mika Penttilä wrote:
> That piece of code looks wrong in other ways also..if we have unmounted
> an active fs (which shouldn't be done but happens!) we shouldn't be at
> least writing back to it anything! The !sb test isn't useful (we never
> clear it in live inodes) and MS_ACTIVE handling is racy as hell wrt
> umount...
Would you mind actually _reading_ kill_super() and stuff called by it?
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: possible race condition in vfs code
2003-04-29 23:03 ` viro
@ 2003-04-30 6:52 ` Mika Penttilä
0 siblings, 0 replies; 4+ messages in thread
From: Mika Penttilä @ 2003-04-30 6:52 UTC (permalink / raw)
To: viro; +Cc: Dave Peterson, linux-kernel
I mean busy inodes after umount doing bogus write_inode_now. Busy inodes
don't pin the superblock (vfsmnt does but it's gone, otherwise we
wouldn't be in kill_sb in the first place).
Dave Peterson's fix solves the double free issue, but does potentially
introduce another busy inode after sb has gone.
--Mika
viro@parcelfarce.linux.theplanet.co.uk wrote:
>On Wed, Apr 30, 2003 at 01:51:42AM +0300, Mika Penttilä wrote:
>
>
>>That piece of code looks wrong in other ways also..if we have unmounted
>>an active fs (which shouldn't be done but happens!) we shouldn't be at
>>least writing back to it anything! The !sb test isn't useful (we never
>>clear it in live inodes) and MS_ACTIVE handling is racy as hell wrt
>>umount...
>>
>>
>
>Would you mind actually _reading_ kill_super() and stuff called by it?
>-
>To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at http://vger.kernel.org/majordomo-info.html
>Please read the FAQ at http://www.tux.org/lkml/
>
>
>
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2003-04-30 6:34 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-04-15 16:22 possible race condition in vfs code Dave Peterson
2003-04-29 22:51 ` Mika Penttilä
2003-04-29 23:03 ` viro
2003-04-30 6:52 ` Mika Penttilä
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).