* i_mutex and deadlock
@ 2007-02-23 16:02 Steve French (smfltc)
  2007-02-23 16:29 ` [linux-cifs-client] " Dave Kleikamp
  0 siblings, 1 reply; 2+ messages in thread
From: Steve French (smfltc) @ 2007-02-23 16:02 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: linux-cifs-client

A field updated by i_size_write (i_size_seqcount) must be protected against 
simultaneous update; otherwise we risk looping in i_size_read.
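
To make the failure mode concrete, here is a minimal userspace sketch of the sequence-counter pattern. It is a model under stated assumptions, not the kernel code: all names (model_inode, model_size_write, model_size_try_read) are hypothetical, the 64-bit size is split into two 32-bit halves to mimic a 32-bit SMP build, and the reader gives up after a bounded number of tries so that a stuck counter is visible instead of hanging.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model only; every name here is hypothetical, not kernel API. */
struct model_inode {
    atomic_uint seq;            /* even: stable, odd: update in progress */
    _Atomic uint32_t size_lo;   /* 64-bit size split into two 32-bit halves */
    _Atomic uint32_t size_hi;
};

/*
 * Writer side. The counter bump is deliberately a separate load and store,
 * like a plain "sequence++", so two concurrent callers could lose an
 * increment. Callers must therefore serialize among themselves.
 */
static void model_size_write(struct model_inode *ino, uint64_t size)
{
    atomic_store(&ino->seq, atomic_load(&ino->seq) + 1);   /* now odd */
    atomic_store(&ino->size_lo, (uint32_t)size);
    atomic_store(&ino->size_hi, (uint32_t)(size >> 32));
    atomic_store(&ino->seq, atomic_load(&ino->seq) + 1);   /* even again */
}

/*
 * Reader side: retry while the counter is odd or changed during the read.
 * Gives up after max_tries so a stuck counter is reported as "false"
 * rather than looping forever.
 */
static bool model_size_try_read(struct model_inode *ino, unsigned max_tries,
                                uint64_t *out)
{
    while (max_tries--) {
        unsigned start = atomic_load(&ino->seq);
        uint32_t lo = atomic_load(&ino->size_lo);
        uint32_t hi = atomic_load(&ino->size_hi);

        if (!(start & 1) && start == atomic_load(&ino->seq)) {
            *out = ((uint64_t)hi << 32) | lo;
            return true;
        }
    }
    return false;
}
```

If two unserialized writers lose an increment, the counter can be left odd with no writer active, and in this model every later read fails; an unbounded reader would spin forever.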

The suggestion in fs.h is to use i_mutex which seems too dangerous due 
to the possibility of deadlock.

There are 65 places in the fs directory which lock an i_mutex, and seven 
more in the mm directory.   The vfs does clearly lock file inodes in 
some paths before calling into a particular filesystem (nfs, ext3, cifs 
etc.) - in particular for fsync but probably for others that are harder 
to trace.  This seems to introduce the possibility of deadlock if a 
filesystem also uses i_mutex to protect file size updates.
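
The deadlock concern can be sketched in userspace as follows. This is only an illustration with hypothetical names (model_lock, model_vfs_fsync, model_fs_fsync); a try-lock stands in for the blocking mutex so the example reports the problem instead of hanging.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a non-recursive lock such as i_mutex. */
struct model_lock {
    atomic_flag held;
};

/* Returns true if the lock was free and is now ours. */
static bool model_trylock(struct model_lock *l)
{
    return !atomic_flag_test_and_set(&l->held);
}

static void model_unlock(struct model_lock *l)
{
    atomic_flag_clear(&l->held);
}

/*
 * Filesystem callback that wants the same lock for its size update.
 * Returns 0 on success, -1 where a blocking lock would wait forever.
 */
static int model_fs_fsync(struct model_lock *l)
{
    if (!model_trylock(l))
        return -1;      /* already held by our caller: self-deadlock */
    /* ... the file size update would go here ... */
    model_unlock(l);
    return 0;
}

/* VFS-like caller that takes the lock before calling into the filesystem. */
static int model_vfs_fsync(struct model_lock *l)
{
    int ret;

    if (!model_trylock(l))
        return -1;
    ret = model_fs_fsync(l);
    model_unlock(l);
    return ret;
}
```

Calling model_fs_fsync directly succeeds, while going through model_vfs_fsync fails at the inner lock attempt, which is the point where a real mutex_lock would block indefinitely.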

Documentation/filesystems/Locking describes the use of i_mutex (was 
"i_sem" previously) and indicates that it is held by the vfs on three 
additional calls on file inodes which concern me (for deadlock 
possibility), setattr, truncate and unlink.

nfs seems to limit its use of i_mutex to llseek and invalidate_mapping, 
and does not appear to grab the i_mutex (or any sem for that matter) to 
protect i_size_write
(nfs calls i_size_write in nfs_grow_file) - and for the case of 
nfs_fhget (in which they bypass i_size_write and set i_size directly) 
does not seem to grab i_mutex either.

ext3 also does not use i_mutex for this purpose (protecting 
i_size_write) - only to protect a journalling ioctl.

I am concerned about using i_mutex to protect the cifs calls to 
i_size_write (although it seems to fix a problem reported in i_size_read 
under stress) because of the following:

1) no one else calls i_size_write AFAIK (on our file inodes)
2) we don't block inside i_size_write do we ... (so why in the world do 
they take a slow mutex instead of a fast spinlock)
3) we don't really know what happens inside fsync (the paths through the 
page cache code seem complex and we don't want to reenter writepage in 
low memory conditions and deadlock updating the file size), and there is 
some concern that the vfs takes the i_mutex in other paths on file 
inodes before entering our code and could deadlock.

Is there any reason why an fs shouldn't simply use something other than 
i_mutex (a spinlock) to protect the i_size_write call?
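
A rough userspace sketch of that alternative, assuming a small lock private to the filesystem and used only around the size update. The names (fs_inode_model, fs_size_lock, fs_set_size) are hypothetical, and an atomic_flag spin loop stands in for a kernel spinlock.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical per-inode state private to one filesystem. */
struct fs_inode_model {
    atomic_flag size_lock;  /* tiny spinlock, used only for size updates */
    atomic_uint seq;        /* even: stable, odd: update in progress */
    _Atomic uint64_t size;
};

static void fs_size_lock(struct fs_inode_model *ino)
{
    while (atomic_flag_test_and_set(&ino->size_lock))
        ;   /* spin: the critical section is short and never sleeps */
}

static void fs_size_unlock(struct fs_inode_model *ino)
{
    atomic_flag_clear(&ino->size_lock);
}

/*
 * Serialized size update: with the lock held, the two counter bumps
 * cannot interleave with those of another writer.
 */
static void fs_set_size(struct fs_inode_model *ino, uint64_t size)
{
    fs_size_lock(ino);
    atomic_store(&ino->seq, atomic_load(&ino->seq) + 1);   /* odd */
    atomic_store(&ino->size, size);
    atomic_store(&ino->seq, atomic_load(&ino->seq) + 1);   /* even */
    fs_size_unlock(ino);
}
```

In kernel terms this would correspond roughly to wrapping i_size_write() in spin_lock()/spin_unlock() on a lock that only the filesystem takes, so no VFS path can already be holding it on entry.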


* Re: [linux-cifs-client] i_mutex and deadlock
  2007-02-23 16:02 i_mutex and deadlock Steve French (smfltc)
@ 2007-02-23 16:29 ` Dave Kleikamp
  0 siblings, 0 replies; 2+ messages in thread
From: Dave Kleikamp @ 2007-02-23 16:29 UTC (permalink / raw)
  To: Steve French (smfltc); +Cc: linux-fsdevel, linux-cifs-client

On Fri, 2007-02-23 at 10:02 -0600, Steve French (smfltc) wrote:
> A field in i_size_write (i_size_seqcount) must be protected against 
> simultaneous update otherwise we risk looping in i_size_read.
> 
> The suggestion in fs.h is to use i_mutex which seems too dangerous due 
> to the possibility of deadlock.

I'm not sure if it's as much a suggestion as a way of documenting the
locking that exists (or existed when the comment was written).

"... i_size_write() does need locking around it (normally i_mutex) ..."

> There are 65 places in the fs directory which lock an i_mutex, and seven 
> more in the mm directory.   The vfs does clearly lock file inodes in 
> some paths before calling into a particular filesystem (nfs, ext3, cifs 
> etc.) - in particular for fsync but probably for others that are harder 
> to trace.  This seems to introduce the possibility of deadlock if a 
> filesystem also uses i_mutex to protect file size updates
> 
> Documentation/filesystems/Locking describes the use of i_mutex (was 
> "i_sem" previously) and indicates that it is held by the vfs on three 
> additional calls on file inodes which concern me (for deadlock 
> possibility), setattr, truncate and unlink.
> 
> nfs seems to limit its use of i_mutex to llseek and invalidate_mapping, 
> and does not appear to grab the i_mutex (or any sem for that matter) to 
> protect i_size_write
> (nfs calls i_size_write in nfs_grow_file) - and for the case of 
> nfs_fhget (in which they bypass i_size_write and set i_size directly) 
> does not seem to grab i_mutex either.
> 
> ext3 also does not use i_mutex for this purpose (protecting 
> i_size_write) - only to protect a journalling ioctl.
> 
> I am concerned about using i_mutex to protect the cifs calls to 
> i_size_write (although it seems to fix a problem reported in i_size_read 
> under stress) because of the following:
> 
> 1) no one else calls i_size_write AFAIK (on our file inodes)

I think you're right.

> 2) we don't block inside i_size_write do we ... (so why in the world do 
> they take a slow mutex instead of a fast spinlock)

My guess is that in existing cases, it was already being held, so there
was no need to do something different.  I'm not sure if the comment is
still accurate.  What locking protects it in generic_commit_write() and
nobh_commit_write()?

> 3) we don't really know what happens inside fsync (the paths through the 
> page cache code seem complex and we don't want to reenter writepage in 
> low memory conditions and deadlock updating the file size), and there is 
> some concern that the vfs takes the i_mutex in other paths on file 
> inodes before entering our code and could deadlock.
> 
> Any reason, why an fs shouldn't simply use something else (a spinlock) 
> other than i_mutex to protect the i_size_write call?

i_mutex doesn't make sense in your case.  Use whatever makes sense in
cifs.

Shaggy
-- 
David Kleikamp
IBM Linux Technology Center

