All of lore.kernel.org
 help / color / mirror / Atom feed
* + fs-affs-use-inode-writecount-instead-of-local-i_opencnt.patch added to -mm tree
@ 2015-01-28 21:45 akpm
       [not found] ` <20150128224534.GB29656@ZenIV.linux.org.uk>
  0 siblings, 1 reply; 6+ messages in thread
From: akpm @ 2015-01-28 21:45 UTC (permalink / raw)
  To: fabf, jack, viro, mm-commits


The patch titled
     Subject: fs/affs: use inode writecount instead of local i_opencnt
has been added to the -mm tree.  Its filename is
     fs-affs-use-inode-writecount-instead-of-local-i_opencnt.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/fs-affs-use-inode-writecount-instead-of-local-i_opencnt.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/fs-affs-use-inode-writecount-instead-of-local-i_opencnt.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/SubmitChecklist when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Fabian Frederick <fabf@skynet.be>
Subject: fs/affs: use inode writecount instead of local i_opencnt

atomic_t i_opencnt was used to free allocation in case there were no more
opens.  This patch replaces affs_file_open by generic_file_open and uses
FMODE_WRITE/i_writecount==1 for the task like other FS.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/affs/affs.h  |    1 -
 fs/affs/file.c  |   17 ++++-------------
 fs/affs/inode.c |    2 --
 3 files changed, 4 insertions(+), 16 deletions(-)

diff -puN fs/affs/affs.h~fs-affs-use-inode-writecount-instead-of-local-i_opencnt fs/affs/affs.h
--- a/fs/affs/affs.h~fs-affs-use-inode-writecount-instead-of-local-i_opencnt
+++ a/fs/affs/affs.h
@@ -39,7 +39,6 @@ struct affs_ext_key {
  * affs fs inode data in memory
  */
 struct affs_inode_info {
-	atomic_t i_opencnt;
 	struct semaphore i_link_lock;		/* Protects internal inode access. */
 	struct semaphore i_ext_lock;		/* Protects internal inode access. */
 #define i_hash_lock i_ext_lock
diff -puN fs/affs/file.c~fs-affs-use-inode-writecount-instead-of-local-i_opencnt fs/affs/file.c
--- a/fs/affs/file.c~fs-affs-use-inode-writecount-instead-of-local-i_opencnt
+++ a/fs/affs/file.c
@@ -18,21 +18,12 @@
 static struct buffer_head *affs_get_extblock_slow(struct inode *inode, u32 ext);
 
 static int
-affs_file_open(struct inode *inode, struct file *filp)
-{
-	pr_debug("open(%lu,%d)\n",
-		 inode->i_ino, atomic_read(&AFFS_I(inode)->i_opencnt));
-	atomic_inc(&AFFS_I(inode)->i_opencnt);
-	return 0;
-}
-
-static int
 affs_file_release(struct inode *inode, struct file *filp)
 {
-	pr_debug("release(%lu, %d)\n",
-		 inode->i_ino, atomic_read(&AFFS_I(inode)->i_opencnt));
+	pr_debug("release(%lu)\n", inode->i_ino);
 
-	if (atomic_dec_and_test(&AFFS_I(inode)->i_opencnt)) {
+	if ((filp->f_mode & FMODE_WRITE) &&
+	    (atomic_read(&inode->i_writecount) == 1)) {
 		mutex_lock(&inode->i_mutex);
 		if (inode->i_size != AFFS_I(inode)->mmu_private)
 			affs_truncate(inode);
@@ -969,7 +960,7 @@ const struct file_operations affs_file_o
 	.write		= new_sync_write,
 	.write_iter	= generic_file_write_iter,
 	.mmap		= generic_file_mmap,
-	.open		= affs_file_open,
+	.open		= generic_file_open,
 	.release	= affs_file_release,
 	.fsync		= affs_file_fsync,
 	.splice_read	= generic_file_splice_read,
diff -puN fs/affs/inode.c~fs-affs-use-inode-writecount-instead-of-local-i_opencnt fs/affs/inode.c
--- a/fs/affs/inode.c~fs-affs-use-inode-writecount-instead-of-local-i_opencnt
+++ a/fs/affs/inode.c
@@ -56,7 +56,6 @@ struct inode *affs_iget(struct super_blo
 	AFFS_I(inode)->i_extcnt = 1;
 	AFFS_I(inode)->i_ext_last = ~1;
 	AFFS_I(inode)->i_protect = prot;
-	atomic_set(&AFFS_I(inode)->i_opencnt, 0);
 	AFFS_I(inode)->i_blkcnt = 0;
 	AFFS_I(inode)->i_lc = NULL;
 	AFFS_I(inode)->i_lc_size = 0;
@@ -307,7 +306,6 @@ affs_new_inode(struct inode *dir)
 	inode->i_ino     = block;
 	set_nlink(inode, 1);
 	inode->i_mtime   = inode->i_atime = inode->i_ctime = CURRENT_TIME_SEC;
-	atomic_set(&AFFS_I(inode)->i_opencnt, 0);
 	AFFS_I(inode)->i_blkcnt = 0;
 	AFFS_I(inode)->i_lc = NULL;
 	AFFS_I(inode)->i_lc_size = 0;
_

Patches currently in -mm which might be from fabf@skynet.be are

fs-ext4-fsyncc-generic_file_fsync-call-based-on-barrier-flag.patch
ocfs2-remove-unnecessary-else-in-ocfs2_set_acl.patch
fs-befs-linuxvfsc-remove-unnecessary-casting.patch
fs-befs-linuxvfsc-remove-unnecessary-casting-fix.patch
fs-coda-dirc-forward-declaration-clean-up.patch
fs-ufs-superc-remove-unnecessary-casting.patch
fs-reiserfs-inodec-replace-0-by-null-for-pointers.patch
ptrace-remove-linux-compath-inclusion-under-config_compat.patch
vmcore-fix-pt_note-n_namesz-n_descsz-overflow-issue.patch
fs-affs-filec-replace-if-bug-by-bug_on.patch
fs-affs-filec-fix-direct-io-writes-beyond-eof.patch
fs-affs-superc-destroy-sbi-mutex-in-affs_kill_sb.patch
fs-affs-use-inode-writecount-instead-of-local-i_opencnt.patch
linux-next.patch


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Elevated i_writecount doesn't guarantee ->release to be called
       [not found] ` <20150128224534.GB29656@ZenIV.linux.org.uk>
@ 2015-01-29 12:46   ` Jan Kara
  2015-01-29 16:47     ` Fabian Frederick
  2015-01-29 17:35     ` Al Viro
  0 siblings, 2 replies; 6+ messages in thread
From: Jan Kara @ 2015-01-29 12:46 UTC (permalink / raw)
  To: Al Viro; +Cc: akpm, fabf, jack, mm-commits, linux-fsdevel

  Changed subject and added linux-fsdevel to CC so that other developers
read this don't fall into the same trap :).

On Wed 28-01-15 22:45:34, Al Viro wrote:
> On Wed, Jan 28, 2015 at 01:45:24PM -0800, akpm@linux-foundation.org wrote:
> > atomic_t i_opencnt was used to free allocation in case there were no more
> > opens.  This patch replaces affs_file_open by generic_file_open and uses
> > FMODE_WRITE/i_writecount==1 for the task like other FS.
> 
> 
> >  affs_file_release(struct inode *inode, struct file *filp)
> >  {
> > -	pr_debug("release(%lu, %d)\n",
> > -		 inode->i_ino, atomic_read(&AFFS_I(inode)->i_opencnt));
> > +	pr_debug("release(%lu)\n", inode->i_ino);
> >  
> > -	if (atomic_dec_and_test(&AFFS_I(inode)->i_opencnt)) {
> > +	if ((filp->f_mode & FMODE_WRITE) &&
> > +	    (atomic_read(&inode->i_writecount) == 1)) {
> 
> I'm not at all convinced that this is correct for affs.  Or for anything
> else, for that matter.  Look: suppose somebody else is trying to open
> that sucker with O_TRUNC at that moment and they'd already gotten past
> get_write_access() in handle_truncate(), only to fail on locks_verify_locked().
> _That_ open() won't get anywhere near opening the file, so there won't be
> ->release() for it.  And our ->release() will see ->i_writecount greater
> than 1, due to get_write_access() done in handle_truncate() and still not
> balanced by coming put_write_access() in there - we'll call it after the
> locks_verify_locked() reports failure, but that hasn't happened yet.
> 
> Similar scenarios can almost certainly be constructed for other calls of
> get_write_access() as well, but this one is enough to NAK this patch, _and_
> to make the similar logics in other filesystems very suspicious...
  Thanks for pointing this out. You made me at look where exactly is
get_write_access() called and there are even places where we call it
without having file descriptor at all (e.g.  truncate path). So ext3, ext4,
udf, and gfs2 are racy. If we race, results aren't that bad (we just keep
preallocated blocks in the inode) but still it would be nice to fix.

Obviously we could maintain a private writecount in ->open() method but it
would seem a bit sad to do that for this mostly theoretical issue. Maybe we
just verify whether preallocation is truncated when evicting inode from
memory and if not, do it there. It's not perfect but even with current racy
solution noone noticed in practice.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Elevated i_writecount doesn't guarantee ->release to be called
  2015-01-29 12:46   ` Elevated i_writecount doesn't guarantee ->release to be called Jan Kara
@ 2015-01-29 16:47     ` Fabian Frederick
  2015-01-29 16:57       ` Jan Kara
  2015-01-29 17:35     ` Al Viro
  1 sibling, 1 reply; 6+ messages in thread
From: Fabian Frederick @ 2015-01-29 16:47 UTC (permalink / raw)
  To: Al Viro, Jan Kara; +Cc: akpm, mm-commits, linux-fsdevel



> On 29 January 2015 at 13:46 Jan Kara <jack@suse.cz> wrote:
>
>
>   Changed subject and added linux-fsdevel to CC so that other developers
> read this don't fall into the same trap :).
>
> On Wed 28-01-15 22:45:34, Al Viro wrote:
> > On Wed, Jan 28, 2015 at 01:45:24PM -0800, akpm@linux-foundation.org wrote:
> > > atomic_t i_opencnt was used to free allocation in case there were no more
> > > opens.  This patch replaces affs_file_open by generic_file_open and uses
> > > FMODE_WRITE/i_writecount==1 for the task like other FS.
> >
> >
> > >  affs_file_release(struct inode *inode, struct file *filp)
> > >  {
> > > - pr_debug("release(%lu, %d)\n",
> > > -          inode->i_ino, atomic_read(&AFFS_I(inode)->i_opencnt));
> > > + pr_debug("release(%lu)\n", inode->i_ino);
> > > 
> > > - if (atomic_dec_and_test(&AFFS_I(inode)->i_opencnt)) {
> > > + if ((filp->f_mode & FMODE_WRITE) &&
> > > +     (atomic_read(&inode->i_writecount) == 1)) {
> >
> > I'm not at all convinced that this is correct for affs.  Or for anything
> > else, for that matter.  Look: suppose somebody else is trying to open
> > that sucker with O_TRUNC at that moment and they'd already gotten past
> > get_write_access() in handle_truncate(), only to fail on
> > locks_verify_locked().
> > _That_ open() won't get anywhere near opening the file, so there won't be
> > ->release() for it.  And our ->release() will see ->i_writecount greater
> > than 1, due to get_write_access() done in handle_truncate() and still not
> > balanced by coming put_write_access() in there - we'll call it after the
> > locks_verify_locked() reports failure, but that hasn't happened yet.
> >
> > Similar scenarios can almost certainly be constructed for other calls of
> > get_write_access() as well, but this one is enough to NAK this patch, _and_
> > to make the similar logics in other filesystems very suspicious...
>   Thanks for pointing this out. You made me at look where exactly is
> get_write_access() called and there are even places where we call it
> without having file descriptor at all (e.g.  truncate path). So ext3, ext4,
> udf, and gfs2 are racy. If we race, results aren't that bad (we just keep
> preallocated blocks in the inode) but still it would be nice to fix.
>
> Obviously we could maintain a private writecount in ->open() method but it
> would seem a bit sad to do that for this mostly theoretical issue. Maybe we
> just verify whether preallocation is truncated when evicting inode from
> memory and if not, do it there. It's not perfect but even with current racy
> solution noone noticed in practice.
Note that udf is slightly different ; it checks for i_writecount > 1 not =1
which means it would release the file in scenario described above ...

Regards,
Fabian

>
>                                                               Honza
> --
> Jan Kara <jack@suse.cz>
> SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Elevated i_writecount doesn't guarantee ->release to be called
  2015-01-29 16:47     ` Fabian Frederick
@ 2015-01-29 16:57       ` Jan Kara
  0 siblings, 0 replies; 6+ messages in thread
From: Jan Kara @ 2015-01-29 16:57 UTC (permalink / raw)
  To: Fabian Frederick; +Cc: Al Viro, Jan Kara, akpm, mm-commits, linux-fsdevel

On Thu 29-01-15 17:47:56, Fabian Frederick wrote:
> > On 29 January 2015 at 13:46 Jan Kara <jack@suse.cz> wrote:
> >
> >
> >   Changed subject and added linux-fsdevel to CC so that other developers
> > read this don't fall into the same trap :).
> >
> > On Wed 28-01-15 22:45:34, Al Viro wrote:
> > > On Wed, Jan 28, 2015 at 01:45:24PM -0800, akpm@linux-foundation.org wrote:
> > > > atomic_t i_opencnt was used to free allocation in case there were no more
> > > > opens.  This patch replaces affs_file_open by generic_file_open and uses
> > > > FMODE_WRITE/i_writecount==1 for the task like other FS.
> > >
> > >
> > > >  affs_file_release(struct inode *inode, struct file *filp)
> > > >  {
> > > > - pr_debug("release(%lu, %d)\n",
> > > > -          inode->i_ino, atomic_read(&AFFS_I(inode)->i_opencnt));
> > > > + pr_debug("release(%lu)\n", inode->i_ino);
> > > > 
> > > > - if (atomic_dec_and_test(&AFFS_I(inode)->i_opencnt)) {
> > > > + if ((filp->f_mode & FMODE_WRITE) &&
> > > > +     (atomic_read(&inode->i_writecount) == 1)) {
> > >
> > > I'm not at all convinced that this is correct for affs.  Or for anything
> > > else, for that matter.  Look: suppose somebody else is trying to open
> > > that sucker with O_TRUNC at that moment and they'd already gotten past
> > > get_write_access() in handle_truncate(), only to fail on
> > > locks_verify_locked().
> > > _That_ open() won't get anywhere near opening the file, so there won't be
> > > ->release() for it.  And our ->release() will see ->i_writecount greater
> > > than 1, due to get_write_access() done in handle_truncate() and still not
> > > balanced by coming put_write_access() in there - we'll call it after the
> > > locks_verify_locked() reports failure, but that hasn't happened yet.
> > >
> > > Similar scenarios can almost certainly be constructed for other calls of
> > > get_write_access() as well, but this one is enough to NAK this patch, _and_
> > > to make the similar logics in other filesystems very suspicious...
> >   Thanks for pointing this out. You made me at look where exactly is
> > get_write_access() called and there are even places where we call it
> > without having file descriptor at all (e.g.  truncate path). So ext3, ext4,
> > udf, and gfs2 are racy. If we race, results aren't that bad (we just keep
> > preallocated blocks in the inode) but still it would be nice to fix.
> >
> > Obviously we could maintain a private writecount in ->open() method but it
> > would seem a bit sad to do that for this mostly theoretical issue. Maybe we
> > just verify whether preallocation is truncated when evicting inode from
> > memory and if not, do it there. It's not perfect but even with current racy
> > solution noone noticed in practice.
> Note that udf is slightly different ; it checks for i_writecount > 1 not =1
> which means it would release the file in scenario described above ...
  I know but that's because it has two bugs in a single condition ;) I have
a patch queued for udf which changes the condition to == 1.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Elevated i_writecount doesn't guarantee ->release to be called
  2015-01-29 12:46   ` Elevated i_writecount doesn't guarantee ->release to be called Jan Kara
  2015-01-29 16:47     ` Fabian Frederick
@ 2015-01-29 17:35     ` Al Viro
  2015-01-30  5:33       ` Fabian Frederick
  1 sibling, 1 reply; 6+ messages in thread
From: Al Viro @ 2015-01-29 17:35 UTC (permalink / raw)
  To: Jan Kara; +Cc: akpm, fabf, mm-commits, linux-fsdevel

On Thu, Jan 29, 2015 at 01:46:30PM +0100, Jan Kara wrote:
>   Thanks for pointing this out. You made me at look where exactly is
> get_write_access() called and there are even places where we call it
> without having file descriptor at all (e.g.  truncate path). So ext3, ext4,
> udf, and gfs2 are racy. If we race, results aren't that bad (we just keep
> preallocated blocks in the inode) but still it would be nice to fix.
> 
> Obviously we could maintain a private writecount in ->open() method but it
> would seem a bit sad to do that for this mostly theoretical issue. Maybe we
> just verify whether preallocation is truncated when evicting inode from
> memory and if not, do it there. It's not perfect but even with current racy
> solution noone noticed in practice.

The trouble with doing that on inode eviction is that we might have done
r/o remount by then, so any metadata writes are unexpected at that point...

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Elevated i_writecount doesn't guarantee ->release to be called
  2015-01-29 17:35     ` Al Viro
@ 2015-01-30  5:33       ` Fabian Frederick
  0 siblings, 0 replies; 6+ messages in thread
From: Fabian Frederick @ 2015-01-30  5:33 UTC (permalink / raw)
  To: Jan Kara, Al Viro; +Cc: akpm, mm-commits, linux-fsdevel



> On 29 January 2015 at 18:35 Al Viro <viro@ZenIV.linux.org.uk> wrote:
>
>
> On Thu, Jan 29, 2015 at 01:46:30PM +0100, Jan Kara wrote:
> >   Thanks for pointing this out. You made me at look where exactly is
> > get_write_access() called and there are even places where we call it
> > without having file descriptor at all (e.g.  truncate path). So ext3, ext4,
> > udf, and gfs2 are racy. If we race, results aren't that bad (we just keep
> > preallocated blocks in the inode) but still it would be nice to fix.
> >
> > Obviously we could maintain a private writecount in ->open() method but it
> > would seem a bit sad to do that for this mostly theoretical issue. Maybe we
> > just verify whether preallocation is truncated when evicting inode from
> > memory and if not, do it there. It's not perfect but even with current racy
> > solution noone noticed in practice.
>
> The trouble with doing that on inode eviction is that we might have done
> r/o remount by then, so any metadata writes are unexpected at that point...

What would you suggest then ? I could try to do the right update on AFFS
then apply it on ext3,4,udf,gfs2 as well. Maybe some generic_release
would be useful to avoid repeating the same logic around private
preallocation management ?

Regards,
Fabian
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2015-01-30  5:33 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-28 21:45 + fs-affs-use-inode-writecount-instead-of-local-i_opencnt.patch added to -mm tree akpm
     [not found] ` <20150128224534.GB29656@ZenIV.linux.org.uk>
2015-01-29 12:46   ` Elevated i_writecount doesn't guarantee ->release to be called Jan Kara
2015-01-29 16:47     ` Fabian Frederick
2015-01-29 16:57       ` Jan Kara
2015-01-29 17:35     ` Al Viro
2015-01-30  5:33       ` Fabian Frederick

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.