From: William Koh <kkc6196@fb.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Andreas Dilger <adilger@dilger.ca>,
	"Theodore Ts'o" <tytso@mit.edu>,
	linux-ext4 <linux-ext4@vger.kernel.org>,
	lkml <linux-kernel@vger.kernel.org>,
	Kernel Team <Kernel-team@fb.com>,
	"J. Bruce Fields" <bfields@fieldses.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Trond Myklebust <trond.myklebust@primarydata.com>,
	xfs <linux-xfs@vger.kernel.org>
Subject: Re: [PATCH] fs: ext4: inode->i_generation not assigned 0.
Date: Thu, 29 Jun 2017 14:28:41 +0000
Message-ID: <0D8EA9C1-E2E7-4D63-8F12-4BDED555F18E@fb.com>
In-Reply-To: <20170629045940.GB5865@birch.djwong.org>

On 6/28/17, 9:59 PM, "Darrick J. Wong" <darrick.wong@oracle.com> wrote:

    [add linux-xfs to cc]
    
    On Thu, Jun 29, 2017 at 04:37:14AM +0000, William Koh wrote:
    > On 6/28/17, 7:32 PM, "Andreas Dilger" <adilger@dilger.ca> wrote:
    > 
    >     On Jun 28, 2017, at 4:06 PM, Kyungchan Koh <kkc6196@fb.com> wrote:
    >     > 
    >     > In fs/ext4/super.c, the function ext4_nfs_get_inode takes as input
    >     > "generation" that can be used to specify the generation of the inode to
    >     > be returned. When 0 is given as input, then inodes of any generation can
    >     > be returned. Therefore, generation 0 is a special case that should be
    >     > avoided when assigning generation to inodes.
    >     
    >     I'd agree with this change to avoid assigning generation == 0 to real inodes.
    >     
    >     Also, the separate question arises about whether we need to allow file handle
    >     lookup with generation == 0?  That allows FID guessing easily, while requiring
    >     a non-zero generation makes that a lot harder.
    >     
    >     What are the cases where generation == 0 is used?
    > 
    > Honestly, I’m not too sure. I just noticed that generation 0 was a special
    > case from reading the code.
    > 
    >     > A new inline function, ext4_inode_set_gen, will take care of the
    >     > problem.  Now, inodes cannot have a generation of 0, so this patch fixes
    >     > the issue.
    >     > 
    >     > Signed-off-by: Kyungchan Koh <kkc6196@fb.com>
    >     > 
    >     > diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
    >     > index 3219154..74c6677 100644
    >     > --- a/fs/ext4/ext4.h
    >     > +++ b/fs/ext4/ext4.h
    >     > @@ -1549,6 +1549,14 @@ static inline int ext4_valid_inum(struct super_block *sb, unsigned long ino)
    >     > 		 ino <= le32_to_cpu(EXT4_SB(sb)->s_es->s_inodes_count));
    >     > }
    >     > 
    >     > +static inline void ext4_inode_set_gen(struct inode *inode,
    >     > +				      struct ext4_sb_info *sbi)
    >     > +{
    >     > +	inode->i_generation = sbi->s_next_generation++;
    >     > +	if (!inode->i_generation)
    >     
    >     This should be marked "unlikely()" since it happens at most once every 4B
    >     file creations (though likely even less since it is unlikely that so many
    >     files will be created in a single mount).
    > 
    > Got it.
    >     
    >     > +		inode->i_generation = sbi->s_next_generation++;
    >     > +}
    >     > +
    >     > 
    >     > diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
    >     > index 98ac2f1..d33f6f0 100644
    >     > --- a/fs/ext4/ialloc.c
    >     > +++ b/fs/ext4/ialloc.c
    >     > @@ -1072,7 +1072,7 @@ struct inode *__ext4_new_inode(handle_t *handle, struct inode
    >     > 	}
    >     > 	spin_lock(&sbi->s_next_gen_lock);
    >     > -	inode->i_generation = sbi->s_next_generation++;
    >     > +	ext4_inode_set_gen(inode, sbi);
    >     > 	spin_unlock(&sbi->s_next_gen_lock);
    >     > 
    >     > diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
    >     > index 0c21e22..d52a467 100644
    >     > --- a/fs/ext4/ioctl.c
    >     > +++ b/fs/ext4/ioctl.c
    >     > @@ -160,8 +160,8 @@ static long swap_inode_boot_loader(struct super_block *sb,
    >     > 
    >     > 	spin_lock(&sbi->s_next_gen_lock);
    >     > -	inode->i_generation = sbi->s_next_generation++;
    >     > -	inode_bl->i_generation = sbi->s_next_generation++;
    >     > +	ext4_inode_set_gen(inode, sbi);
    >     > +	ext4_inode_set_gen(inode_bl, sbi);
    >     > 	spin_unlock(&sbi->s_next_gen_lock);
    >     > 
    >     
    >     
    >     Cheers, Andreas
    > 
    > This is applicable to many fs, including ext2, ext4, exofs, jfs, and f2fs.
    > Therefore, a shared helper in linux/fs.h will allow for easy changes
    > in all fs. Is there any reason that might be a bad idea?
    
    AFAICT, i_generation == 0 in XFS and btrfs is just as valid as any other
    number.  There is no special casing of zero in either filesystem.
    
    So now, my curiosity piqued, I surveyed all the Linux filesystems
    that can export to NFS.  I see that there are actually quite a few fs
    (ext[2-4], exofs, efs, fat, jfs, f2fs, isofs, nilfs2, reiserfs, udf,
    ufs) that treat zero as a special value meaning "ignore generation
    check"; others (xfs, btrfs, fuse, ntfs, ocfs2) that don't consider zero
    special and always require a match; and still others (affs, befs, ceph,
    gfs2, jffs2, squashfs) that don't check at all.
    
    That to me strongly suggests that more research is necessary to figure
    out why some of the filesystems that support i_generation reserve zero
    as a special value to disable generation checks and why others always
    require an exact match.  Until we can recapture why things are the way
    they are, it doesn't make much sense to have a helper that only applies
    to half the filesystems.
    
    Granted, the contents of a file handle are generally left up to the
    individual filesystem, and the behaviors are very different, so I also
    don't see that much value in hoisting i_generation updates to the VFS
    level.
    
    I guess it wouldn't really matter if XFS stopped writing i_generation =
    0 onto disk, but I'm too curious about this odd difference in behavior
    to let it go just yet. :)
    
    --D
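
To check that I am reading the survey correctly, here is a rough
userspace model of the three behaviors. It is only a sketch: the enum
names and gen_check_passes() are made up for illustration and do not
exist in any filesystem, and the real checks live in each filesystem's
own export code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical labels for the three behaviors in the survey above. */
enum gen_policy {
	GEN_ZERO_WILDCARD,	/* ext[2-4], jfs, f2fs, ...: 0 skips the check */
	GEN_EXACT_MATCH,	/* xfs, btrfs, ...: always compare */
	GEN_UNCHECKED,		/* affs, befs, ...: never compare */
};

/*
 * Returns true if a file handle carrying handle_gen may resolve to an
 * inode whose current generation is inode_gen.
 */
static bool gen_check_passes(enum gen_policy policy,
			     uint32_t handle_gen, uint32_t inode_gen)
{
	switch (policy) {
	case GEN_ZERO_WILDCARD:
		return handle_gen == 0 || handle_gen == inode_gen;
	case GEN_EXACT_MATCH:
		return handle_gen == inode_gen;
	case GEN_UNCHECKED:
		return true;
	}
	return false;
}
```

Under the first policy a handle with generation 0 matches any inode,
which is the guessing concern Andreas raised. Under the second, 0 is
just another value that has to match.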

That makes sense. I’ll look into this as well and send a new patch
with a better fix for this issue.
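
For reference, this is roughly how I expect the helper to look with
Andreas's unlikely() suggestion folded in. It is a userspace sketch,
not the next version of the patch: struct gen_counter and
next_nonzero_generation() are stand-in names, and unlikely() is
redefined here with the GCC/Clang builtin because the kernel macro is
not available outside the kernel.

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's unlikely() branch hint. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical stand-in for the s_next_generation counter. */
struct gen_counter {
	uint32_t next;
};

/*
 * Hand out the next generation, skipping 0 when the counter wraps.
 * The caller is assumed to hold the lock that protects the counter,
 * as s_next_gen_lock does in the patch.
 */
static uint32_t next_nonzero_generation(struct gen_counter *c)
{
	uint32_t gen = c->next++;

	if (unlikely(gen == 0))
		gen = c->next++;
	return gen;
}
```

The second increment can never yield 0 again, because the counter has
just moved from 0 to 1, so a single retry is enough.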

-Kyungchan Koh