All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] [RESEND] 32/64 bit llseek hashes (v5)
@ 2012-01-09 13:21 Bernd Schubert
  2012-01-09 13:21 ` [PATCH 5 1/4] Add new FMODE flags: FMODE_32bithash and FMODE_64bithash Bernd Schubert
                   ` (4 more replies)
  0 siblings, 5 replies; 81+ messages in thread
From: Bernd Schubert @ 2012-01-09 13:21 UTC (permalink / raw)
  To: linux-nfs, linux-ext4; +Cc: linux-fsdevel, yong.fan, bfields, sandeen, adilger

With the ext3/ext4 directory index implementation hashes are used to specify
offsets for llseek(). For compatibility with NFSv2 and 32-bit user space
on 64-bit systems (kernel space) ext3/ext4 currently only return 32-bit 
hashes and therefore the probability of hash collisions for larger directories
is rather high. As recently reported on the NFS mailing list that theoretical
problem also happens on real systems:
http://comments.gmane.org/gmane.linux.nfs/40863

The following series adds two new f_mode flags to tell ext4
to use 32-bit or 64-bit hash values for llseek() calls.
These flags can then used by network file systems, such as NFS, to
request 32-bit or 64-bit offsets (hashes).

Version 5
- update NFSD_MAY_64BIT_COOKIE to 0x1000

Version 4
- Andreas noticed there was HAVE_IS_COMPAT_TASK instead of 
  CONFIG_COMPAT in the 
  "Return 32/64-bit dir name hash according to usage type"
  patch

Version 3:
- remove patch "RFC: Remove check for a 32-bit cookie in nfsd4_readdir()", 
  as it was included upstream separately
- split "nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes)" into two
  two separate patches as suggested by Bruce, one patch to rename 
  'access' to 'may_flags'. And the remainder of the original patch to set 
  FMODE_32BITHASH/FMODE_64BITHASH flags and to introduce the new 
  NFSD_MAY_64BIT_COOKIE flag

Version 2:
- use f_mode instead of O_* flags and also in a separate patch
- introduce EXT4_HTREE_EOF_32BIT and EXT4_HTREE_EOF_64BIT
- fix SEEK_END in ext4_dir_llseek()
- set f_mode flags in NFS code as early as possible and introduce a new
  NFSD_MAY_64BIT_COOKIE flag for that

--
Bernd Schubert
Fraunhofer ITWM


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 5 1/4] Add new FMODE flags: FMODE_32bithash and FMODE_64bithash
  2012-01-09 13:21 [PATCH 0/4] [RESEND] 32/64 bit llseek hashes (v5) Bernd Schubert
@ 2012-01-09 13:21 ` Bernd Schubert
  2012-01-09 13:21 ` [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type Bernd Schubert
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 81+ messages in thread
From: Bernd Schubert @ 2012-01-09 13:21 UTC (permalink / raw)
  To: linux-nfs, linux-ext4; +Cc: linux-fsdevel, yong.fan, bfields, sandeen, adilger

Those flags are supposed to be set by NFS readdir() to tell ext3/ext4
to 32bit (NFSv2) or 64bit hash values (offsets) in seekdir().

Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
---
 include/linux/fs.h |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 7aacf31..b158386 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -92,6 +92,11 @@ struct inodes_stat_t {
 /* File is opened using open(.., 3, ..) and is writeable only for ioctls
    (specialy hack for floppy.c) */
 #define FMODE_WRITE_IOCTL	((__force fmode_t)0x100)
+/* 32bit hashes as llseek() offset (for directories) */
+#define FMODE_32BITHASH         ((__force fmode_t)0x200)
+/* 64bit hashes as llseek() offset (for directories) */
+#define FMODE_64BITHASH         ((__force fmode_t)0x400)
+
 
 /*
  * Don't update ctime and mtime.


^ permalink raw reply related	[flat|nested] 81+ messages in thread

* [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
  2012-01-09 13:21 [PATCH 0/4] [RESEND] 32/64 bit llseek hashes (v5) Bernd Schubert
  2012-01-09 13:21 ` [PATCH 5 1/4] Add new FMODE flags: FMODE_32bithash and FMODE_64bithash Bernd Schubert
@ 2012-01-09 13:21 ` Bernd Schubert
  2012-03-05 15:59   ` Ted Ts'o
       [not found]   ` <20120109132148.2616029.68798.stgit-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
  2012-01-09 13:21 ` [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open() Bernd Schubert
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 81+ messages in thread
From: Bernd Schubert @ 2012-01-09 13:21 UTC (permalink / raw)
  To: linux-nfs, linux-ext4
  Cc: linux-fsdevel, Fan Yong, bfields, sandeen, Andreas Dilger

From: Fan Yong <yong.fan@whamcloud.com>

Traditionally ext2/3/4 has returned a 32-bit hash value from llseek()
to appease NFSv2, which can only handle a 32-bit cookie for seekdir()
and telldir().  However, this causes problems if there are 32-bit hash
collisions, since the NFSv2 server can get stuck resending the same
entries from the directory repeatedly.

Allow ext4 to return a full 64-bit hash (both major and minor) for
telldir to decrease the chance of hash collisions.  This still needs
integration on the NFS side.

Patch-updated-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
(blame me if something is not correct)

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
---
 fs/ext4/dir.c  |  185 ++++++++++++++++++++++++++++++++++++++++++++------------
 fs/ext4/ext4.h |    6 ++
 fs/ext4/hash.c |    4 +
 3 files changed, 154 insertions(+), 41 deletions(-)

diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c
index 164c560..cee09f2 100644
--- a/fs/ext4/dir.c
+++ b/fs/ext4/dir.c
@@ -32,24 +32,8 @@ static unsigned char ext4_filetype_table[] = {
 	DT_UNKNOWN, DT_REG, DT_DIR, DT_CHR, DT_BLK, DT_FIFO, DT_SOCK, DT_LNK
 };
 
-static int ext4_readdir(struct file *, void *, filldir_t);
 static int ext4_dx_readdir(struct file *filp,
 			   void *dirent, filldir_t filldir);
-static int ext4_release_dir(struct inode *inode,
-				struct file *filp);
-
-const struct file_operations ext4_dir_operations = {
-	.llseek		= ext4_llseek,
-	.read		= generic_read_dir,
-	.readdir	= ext4_readdir,		/* we take BKL. needed?*/
-	.unlocked_ioctl = ext4_ioctl,
-#ifdef CONFIG_COMPAT
-	.compat_ioctl	= ext4_compat_ioctl,
-#endif
-	.fsync		= ext4_sync_file,
-	.release	= ext4_release_dir,
-};
-
 
 static unsigned char get_dtype(struct super_block *sb, int filetype)
 {
@@ -254,22 +238,134 @@ out:
 	return ret;
 }
 
+static inline int is_32bit_api(void)
+{
+#ifdef CONFIG_COMPAT
+	return is_compat_task();
+#else
+	return (BITS_PER_LONG == 32);
+#endif
+}
+
 /*
  * These functions convert from the major/minor hash to an f_pos
- * value.
+ * value for dx directories
  *
- * Currently we only use major hash numer.  This is unfortunate, but
- * on 32-bit machines, the same VFS interface is used for lseek and
- * llseek, so if we use the 64 bit offset, then the 32-bit versions of
- * lseek/telldir/seekdir will blow out spectacularly, and from within
- * the ext2 low-level routine, we don't know if we're being called by
- * a 64-bit version of the system call or the 32-bit version of the
- * system call.  Worse yet, NFSv2 only allows for a 32-bit readdir
- * cookie.  Sigh.
+ * Upper layer (for example NFS) should specify FMODE_32BITHASH or
+ * FMODE_64BITHASH explicitly. On the other hand, we allow ext4 to be mounted
+ * directly on both 32-bit and 64-bit nodes, under such case, neither
+ * FMODE_32BITHASH nor FMODE_64BITHASH is specified.
  */
-#define hash2pos(major, minor)	(major >> 1)
-#define pos2maj_hash(pos)	((pos << 1) & 0xffffffff)
-#define pos2min_hash(pos)	(0)
+static inline loff_t hash2pos(struct file *filp, __u32 major, __u32 minor)
+{
+	if ((filp->f_flags & FMODE_32BITHASH) ||
+	    (!(filp->f_flags & FMODE_64BITHASH) && is_32bit_api()))
+		return major >> 1;
+	else
+		return ((__u64)(major >> 1) << 32) | (__u64)minor;
+}
+
+static inline __u32 pos2maj_hash(struct file *filp, loff_t pos)
+{
+	if ((filp->f_flags & FMODE_32BITHASH) ||
+	    (!(filp->f_mode & FMODE_64BITHASH) && is_32bit_api()))
+		return (pos << 1) & 0xffffffff;
+	else
+		return ((pos >> 32) << 1) & 0xffffffff;
+}
+
+static inline __u32 pos2min_hash(struct file *filp, loff_t pos)
+{
+	if ((filp->f_flags & FMODE_32BITHASH) ||
+	    (!(filp->f_flags & FMODE_64BITHASH) && is_32bit_api()))
+		return 0;
+	else
+		return pos & 0xffffffff;
+}
+
+/*
+ * Return 32- or 64-bit end-of-file for dx directories
+ */
+static inline loff_t ext4_get_htree_eof(struct file *filp)
+{
+	if ((filp->f_mode & FMODE_32BITHASH) ||
+	    (!(filp->f_mode & FMODE_64BITHASH) && is_32bit_api()))
+		return EXT4_HTREE_EOF_32BIT;
+	else
+		return EXT4_HTREE_EOF_64BIT;
+}
+
+
+/*
+ * ext4_dir_llseek() based on generic_file_llseek() to handle both
+ * non-htree and htree directories, where the "offset" is in terms
+ * of the filename hash value instead of the byte offset.
+ *
+ * NOTE: offsets obtained *before* ext4_set_inode_flag(dir, EXT4_INODE_INDEX)
+ *       will be invalid once the directory was converted into a dx directory
+ */
+loff_t ext4_dir_llseek(struct file *file, loff_t offset, int origin)
+{
+	struct inode *inode = file->f_mapping->host;
+	loff_t ret = -EINVAL;
+	int is_dx_dir = ext4_test_inode_flag(inode, EXT4_INODE_INDEX);
+
+	mutex_lock(&inode->i_mutex);
+
+	/* NOTE: relative offsets with dx directories might not work
+	 *       as expected, as it is difficult to figure out the
+	 *       correct offset between dx hashes */
+
+	switch (origin) {
+	case SEEK_END:
+		if (unlikely(offset > 0))
+			goto out_err; /* not supported for directories */
+
+		/* so only negative offsets are left, does that have a
+		 * meaning for directories at all? */
+		if (is_dx_dir)
+			offset += ext4_get_htree_eof(file);
+		else
+			offset += inode->i_size;
+		break;
+	case SEEK_CUR:
+		/*
+		 * Here we special-case the lseek(fd, 0, SEEK_CUR)
+		 * position-querying operation.  Avoid rewriting the "same"
+		 * f_pos value back to the file because a concurrent read(),
+		 * write() or lseek() might have altered it
+		 */
+		if (offset == 0) {
+			offset = file->f_pos;
+			goto out_ok;
+		}
+
+		offset += file->f_pos;
+		break;
+	}
+
+	if (unlikely(offset < 0))
+		goto out_err;
+
+	if (!is_dx_dir) {
+		if (offset > inode->i_sb->s_maxbytes)
+			goto out_err;
+	} else if (offset > ext4_get_htree_eof(file))
+		goto out_err;
+
+	/* Special lock needed here? */
+	if (offset != file->f_pos) {
+		file->f_pos = offset;
+		file->f_version = 0;
+	}
+
+out_ok:
+	ret = offset;
+out_err:
+	mutex_unlock(&inode->i_mutex);
+
+	return ret;
+}
 
 /*
  * This structure holds the nodes of the red-black tree used to store
@@ -330,15 +426,16 @@ static void free_rb_tree_fname(struct rb_root *root)
 }
 
 
-static struct dir_private_info *ext4_htree_create_dir_info(loff_t pos)
+static struct dir_private_info *ext4_htree_create_dir_info(struct file *filp,
+							   loff_t pos)
 {
 	struct dir_private_info *p;
 
 	p = kzalloc(sizeof(struct dir_private_info), GFP_KERNEL);
 	if (!p)
 		return NULL;
-	p->curr_hash = pos2maj_hash(pos);
-	p->curr_minor_hash = pos2min_hash(pos);
+	p->curr_hash = pos2maj_hash(filp, pos);
+	p->curr_minor_hash = pos2min_hash(filp, pos);
 	return p;
 }
 
@@ -429,7 +526,7 @@ static int call_filldir(struct file *filp, void *dirent,
 		       "null fname?!?\n");
 		return 0;
 	}
-	curr_pos = hash2pos(fname->hash, fname->minor_hash);
+	curr_pos = hash2pos(filp, fname->hash, fname->minor_hash);
 	while (fname) {
 		error = filldir(dirent, fname->name,
 				fname->name_len, curr_pos,
@@ -454,13 +551,13 @@ static int ext4_dx_readdir(struct file *filp,
 	int	ret;
 
 	if (!info) {
-		info = ext4_htree_create_dir_info(filp->f_pos);
+		info = ext4_htree_create_dir_info(filp, filp->f_pos);
 		if (!info)
 			return -ENOMEM;
 		filp->private_data = info;
 	}
 
-	if (filp->f_pos == EXT4_HTREE_EOF)
+	if (filp->f_pos == ext4_get_htree_eof(filp))
 		return 0;	/* EOF */
 
 	/* Some one has messed with f_pos; reset the world */
@@ -468,8 +565,8 @@ static int ext4_dx_readdir(struct file *filp,
 		free_rb_tree_fname(&info->root);
 		info->curr_node = NULL;
 		info->extra_fname = NULL;
-		info->curr_hash = pos2maj_hash(filp->f_pos);
-		info->curr_minor_hash = pos2min_hash(filp->f_pos);
+		info->curr_hash = pos2maj_hash(filp, filp->f_pos);
+		info->curr_minor_hash = pos2min_hash(filp, filp->f_pos);
 	}
 
 	/*
@@ -501,7 +598,7 @@ static int ext4_dx_readdir(struct file *filp,
 			if (ret < 0)
 				return ret;
 			if (ret == 0) {
-				filp->f_pos = EXT4_HTREE_EOF;
+				filp->f_pos = ext4_get_htree_eof(filp);
 				break;
 			}
 			info->curr_node = rb_first(&info->root);
@@ -521,7 +618,7 @@ static int ext4_dx_readdir(struct file *filp,
 			info->curr_minor_hash = fname->minor_hash;
 		} else {
 			if (info->next_hash == ~0) {
-				filp->f_pos = EXT4_HTREE_EOF;
+				filp->f_pos = ext4_get_htree_eof(filp);
 				break;
 			}
 			info->curr_hash = info->next_hash;
@@ -540,3 +637,15 @@ static int ext4_release_dir(struct inode *inode, struct file *filp)
 
 	return 0;
 }
+
+const struct file_operations ext4_dir_operations = {
+	.llseek		= ext4_dir_llseek,
+	.read		= generic_read_dir,
+	.readdir	= ext4_readdir,
+	.unlocked_ioctl = ext4_ioctl,
+#ifdef CONFIG_COMPAT
+	.compat_ioctl	= ext4_compat_ioctl,
+#endif
+	.fsync		= ext4_sync_file,
+	.release	= ext4_release_dir,
+};
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 1554b15..d3fe1ea 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1599,7 +1599,11 @@ struct dx_hash_info
 	u32		*seed;
 };
 
-#define EXT4_HTREE_EOF	0x7fffffff
+
+/* 32 and 64 bit signed EOF for dx directories */
+#define EXT4_HTREE_EOF_32BIT   ((1UL  << (32 - 1)) - 1)
+#define EXT4_HTREE_EOF_64BIT   ((1ULL << (64 - 1)) - 1)
+
 
 /*
  * Control parameters used by ext4_htree_next_block
diff --git a/fs/ext4/hash.c b/fs/ext4/hash.c
index ac8f168..fa8e491 100644
--- a/fs/ext4/hash.c
+++ b/fs/ext4/hash.c
@@ -200,8 +200,8 @@ int ext4fs_dirhash(const char *name, int len, struct dx_hash_info *hinfo)
 		return -1;
 	}
 	hash = hash & ~1;
-	if (hash == (EXT4_HTREE_EOF << 1))
-		hash = (EXT4_HTREE_EOF-1) << 1;
+	if (hash == (EXT4_HTREE_EOF_32BIT << 1))
+		hash = (EXT4_HTREE_EOF_32BIT - 1) << 1;
 	hinfo->hash = hash;
 	hinfo->minor_hash = minor_hash;
 	return 0;


^ permalink raw reply related	[flat|nested] 81+ messages in thread

* [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
  2012-01-09 13:21 [PATCH 0/4] [RESEND] 32/64 bit llseek hashes (v5) Bernd Schubert
  2012-01-09 13:21 ` [PATCH 5 1/4] Add new FMODE flags: FMODE_32bithash and FMODE_64bithash Bernd Schubert
  2012-01-09 13:21 ` [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type Bernd Schubert
@ 2012-01-09 13:21 ` Bernd Schubert
  2012-01-09 13:21 ` [PATCH 5 4/4] nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes) Bernd Schubert
  2012-01-10 11:27 ` [PATCH 0/4] [RESEND] 32/64 bit llseek hashes (v5) Andreas Dilger
  4 siblings, 0 replies; 81+ messages in thread
From: Bernd Schubert @ 2012-01-09 13:21 UTC (permalink / raw)
  To: linux-nfs, linux-ext4
  Cc: linux-fsdevel, yong.fan, J. Bruce Fields, sandeen, adilger

Just rename this variable, as the next patch will add a flag and
'access' as variable name would not be correct any more.


Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
Acked-by: J. Bruce Fields<bfields@redhat.com>
---
 fs/nfsd/vfs.c |   18 ++++++++++--------
 1 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index d25a723..20fa252 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -726,12 +726,13 @@ static int nfsd_open_break_lease(struct inode *inode, int access)
 
 /*
  * Open an existing file or directory.
- * The access argument indicates the type of open (read/write/lock)
+ * The may_flags argument indicates the type of open (read/write/lock)
+ * and additional flags.
  * N.B. After this call fhp needs an fh_put
  */
 __be32
 nfsd_open(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type,
-			int access, struct file **filp)
+			int may_flags, struct file **filp)
 {
 	struct dentry	*dentry;
 	struct inode	*inode;
@@ -746,7 +747,7 @@ nfsd_open(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type,
 	 * and (hopefully) checked permission - so allow OWNER_OVERRIDE
 	 * in case a chmod has now revoked permission.
 	 */
-	err = fh_verify(rqstp, fhp, type, access | NFSD_MAY_OWNER_OVERRIDE);
+	err = fh_verify(rqstp, fhp, type, may_flags | NFSD_MAY_OWNER_OVERRIDE);
 	if (err)
 		goto out;
 
@@ -757,7 +758,7 @@ nfsd_open(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type,
 	 * or any access when mandatory locking enabled
 	 */
 	err = nfserr_perm;
-	if (IS_APPEND(inode) && (access & NFSD_MAY_WRITE))
+	if (IS_APPEND(inode) && (may_flags & NFSD_MAY_WRITE))
 		goto out;
 	/*
 	 * We must ignore files (but only files) which might have mandatory
@@ -770,12 +771,12 @@ nfsd_open(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type,
 	if (!inode->i_fop)
 		goto out;
 
-	host_err = nfsd_open_break_lease(inode, access);
+	host_err = nfsd_open_break_lease(inode, may_flags);
 	if (host_err) /* NOMEM or WOULDBLOCK */
 		goto out_nfserr;
 
-	if (access & NFSD_MAY_WRITE) {
-		if (access & NFSD_MAY_READ)
+	if (may_flags & NFSD_MAY_WRITE) {
+		if (may_flags & NFSD_MAY_READ)
 			flags = O_RDWR|O_LARGEFILE;
 		else
 			flags = O_WRONLY|O_LARGEFILE;
@@ -785,7 +786,8 @@ nfsd_open(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type,
 	if (IS_ERR(*filp))
 		host_err = PTR_ERR(*filp);
 	else
-		host_err = ima_file_check(*filp, access);
+		host_err = ima_file_check(*filp, may_flags);
+
 out_nfserr:
 	err = nfserrno(host_err);
 out:


^ permalink raw reply related	[flat|nested] 81+ messages in thread

* [PATCH 5 4/4] nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes)
  2012-01-09 13:21 [PATCH 0/4] [RESEND] 32/64 bit llseek hashes (v5) Bernd Schubert
                   ` (2 preceding siblings ...)
  2012-01-09 13:21 ` [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open() Bernd Schubert
@ 2012-01-09 13:21 ` Bernd Schubert
       [not found]   ` <20120109132158.2616029.30467.stgit-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
  2012-01-10 11:27 ` [PATCH 0/4] [RESEND] 32/64 bit llseek hashes (v5) Andreas Dilger
  4 siblings, 1 reply; 81+ messages in thread
From: Bernd Schubert @ 2012-01-09 13:21 UTC (permalink / raw)
  To: linux-nfs, linux-ext4
  Cc: linux-fsdevel, yong.fan, J. Bruce Fields, sandeen, adilger

Use 32-bit or 64-bit llseek() hashes for directory offsets depending on
the NFS version. NFSv2 gets 32-bit hashes only.

NOTE: This patch got rather complex as Christoph asked to set the
filp->f_mode flag in the open call or immediatly after dentry_open()
in nfsd_open() to avoid races.
Personally I still do not see a reason for that and in my opinion
FMODE_32BITHASH/FMODE_64BITHASH flags could be set nfsd_readdir(), as it
follows directly after nfsd_open() without a chance of races.


Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
Acked-by: J. Bruce Fields<bfields@redhat.com>
---
 fs/nfsd/vfs.c |   15 +++++++++++++--
 fs/nfsd/vfs.h |    2 ++
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 20fa252..f162dc3 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -785,9 +785,15 @@ nfsd_open(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type,
 			    flags, current_cred());
 	if (IS_ERR(*filp))
 		host_err = PTR_ERR(*filp);
-	else
+	else {
 		host_err = ima_file_check(*filp, may_flags);
 
+		if (may_flags & NFSD_MAY_64BIT_COOKIE)
+			(*filp)->f_mode |= FMODE_64BITHASH;
+		else
+			(*filp)->f_mode |= FMODE_32BITHASH;
+	}
+
 out_nfserr:
 	err = nfserrno(host_err);
 out:
@@ -2011,8 +2017,13 @@ nfsd_readdir(struct svc_rqst *rqstp, struct svc_fh *fhp, loff_t *offsetp,
 	__be32		err;
 	struct file	*file;
 	loff_t		offset = *offsetp;
+	int             may_flags = NFSD_MAY_READ;
+
+	/* NFSv2 only supports 32 bit cookies */
+	if (rqstp->rq_vers > 2)
+		may_flags |= NFSD_MAY_64BIT_COOKIE;
 
-	err = nfsd_open(rqstp, fhp, S_IFDIR, NFSD_MAY_READ, &file);
+	err = nfsd_open(rqstp, fhp, S_IFDIR, may_flags, &file);
 	if (err)
 		goto out;
 
diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
index 1dcd238..ec0611b 100644
--- a/fs/nfsd/vfs.h
+++ b/fs/nfsd/vfs.h
@@ -27,6 +27,8 @@
 #define NFSD_MAY_BYPASS_GSS		0x400
 #define NFSD_MAY_READ_IF_EXEC		0x800
 
+#define NFSD_MAY_64BIT_COOKIE		0x1000 /* 64 bit readdir cookies for >= NFSv3 */
+
 #define NFSD_MAY_CREATE		(NFSD_MAY_EXEC|NFSD_MAY_WRITE)
 #define NFSD_MAY_REMOVE		(NFSD_MAY_EXEC|NFSD_MAY_WRITE|NFSD_MAY_TRUNC)
 


^ permalink raw reply related	[flat|nested] 81+ messages in thread

* Re: [PATCH 0/4] [RESEND] 32/64 bit llseek hashes (v5)
  2012-01-09 13:21 [PATCH 0/4] [RESEND] 32/64 bit llseek hashes (v5) Bernd Schubert
                   ` (3 preceding siblings ...)
  2012-01-09 13:21 ` [PATCH 5 4/4] nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes) Bernd Schubert
@ 2012-01-10 11:27 ` Andreas Dilger
  2012-01-11 14:48   ` J. Bruce Fields
  4 siblings, 1 reply; 81+ messages in thread
From: Andreas Dilger @ 2012-01-10 11:27 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: linux-nfs, Bernd Schubert, ext4 development, linux-fsdevel Devel,
	Fan Yong, J. Bruce Fields, Eric Sandeen

On 2012-01-09, at 6:21 AM, Bernd Schubert wrote:
> With the ext3/ext4 directory index implementation hashes are used to specify
> offsets for llseek(). For compatibility with NFSv2 and 32-bit user space
> on 64-bit systems (kernel space) ext3/ext4 currently only return 32-bit 
> hashes and therefore the probability of hash collisions for larger directories
> is rather high. As recently reported on the NFS mailing list that theoretical
> problem also happens on real systems:
> http://comments.gmane.org/gmane.linux.nfs/40863
> 
> The following series adds two new f_mode flags to tell ext4
> to use 32-bit or 64-bit hash values for llseek() calls.
> These flags can then used by network file systems, such as NFS, to
> request 32-bit or 64-bit offsets (hashes).

Ted, it would be great if these patches could land.  We hit issues like
this previously as well, which is why we started this patch series in the
first place.

> Version 5
> - update NFSD_MAY_64BIT_COOKIE to 0x1000
> 
> Version 4
> - Andreas noticed there was HAVE_IS_COMPAT_TASK instead of 
>  CONFIG_COMPAT in the 
>  "Return 32/64-bit dir name hash according to usage type"
>  patch
> 
> Version 3:
> - remove patch "RFC: Remove check for a 32-bit cookie in nfsd4_readdir()", 
>  as it was included upstream separately
> - split "nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes)" into two
>  two separate patches as suggested by Bruce, one patch to rename 
>  'access' to 'may_flags'. And the remainder of the original patch to set 
>  FMODE_32BITHASH/FMODE_64BITHASH flags and to introduce the new 
>  NFSD_MAY_64BIT_COOKIE flag
> 
> Version 2:
> - use f_mode instead of O_* flags and also in a separate patch
> - introduce EXT4_HTREE_EOF_32BIT and EXT4_HTREE_EOF_64BIT
> - fix SEEK_END in ext4_dir_llseek()
> - set f_mode flags in NFS code as early as possible and introduce a new
>  NFSD_MAY_64BIT_COOKIE flag for that
> 
> --
> Bernd Schubert
> Fraunhofer ITWM
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Cheers, Andreas
--
Andreas Dilger                       Whamcloud, Inc.
Principal Engineer                   http://www.whamcloud.com/





^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 0/4] [RESEND] 32/64 bit llseek hashes (v5)
  2012-01-10 11:27 ` [PATCH 0/4] [RESEND] 32/64 bit llseek hashes (v5) Andreas Dilger
@ 2012-01-11 14:48   ` J. Bruce Fields
       [not found]     ` <20120111144827.GA32381-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
  0 siblings, 1 reply; 81+ messages in thread
From: J. Bruce Fields @ 2012-01-11 14:48 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: Theodore Ts'o, linux-nfs, Bernd Schubert, ext4 development,
	linux-fsdevel Devel, Fan Yong, J. Bruce Fields, Eric Sandeen

On Tue, Jan 10, 2012 at 04:27:15AM -0700, Andreas Dilger wrote:
> On 2012-01-09, at 6:21 AM, Bernd Schubert wrote:
> > With the ext3/ext4 directory index implementation hashes are used to specify
> > offsets for llseek(). For compatibility with NFSv2 and 32-bit user space
> > on 64-bit systems (kernel space) ext3/ext4 currently only return 32-bit 
> > hashes and therefore the probability of hash collisions for larger directories
> > is rather high. As recently reported on the NFS mailing list that theoretical
> > problem also happens on real systems:
> > http://comments.gmane.org/gmane.linux.nfs/40863
> > 
> > The following series adds two new f_mode flags to tell ext4
> > to use 32-bit or 64-bit hash values for llseek() calls.
> > These flags can then used by network file systems, such as NFS, to
> > request 32-bit or 64-bit offsets (hashes).
> 
> Ted, it would be great if these patches could land.  We hit issues like
> this previously as well, which is why we started this patch series in the
> first place.

Yes, this needs to be fixed--is there anything in particular holding up
these patches?

--b.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 0/4] [RESEND] 32/64 bit llseek hashes (v5)
  2012-01-11 14:48   ` J. Bruce Fields
@ 2012-01-11 15:31         ` Ted Ts'o
  0 siblings, 0 replies; 81+ messages in thread
From: Ted Ts'o @ 2012-01-11 15:31 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Andreas Dilger, linux-nfs-u79uwXL29TY76Z2rM5mHXA, Bernd Schubert,
	ext4 development, linux-fsdevel Devel, Fan Yong, J. Bruce Fields,
	Eric Sandeen

On Wed, Jan 11, 2012 at 09:48:27AM -0500, J. Bruce Fields wrote:
> 
> Yes, this needs to be fixed--is there anything in particular holding up
> these patches?

Yes, I need more people reviewing ext4 patches, so it's not me doing
all of the review.  :-)

I will put this on my "high priority to review ASAP for the next merge
window" list.

						- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 0/4] [RESEND] 32/64 bit llseek hashes (v5)
@ 2012-01-11 15:31         ` Ted Ts'o
  0 siblings, 0 replies; 81+ messages in thread
From: Ted Ts'o @ 2012-01-11 15:31 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Andreas Dilger, linux-nfs, Bernd Schubert, ext4 development,
	linux-fsdevel Devel, Fan Yong, J. Bruce Fields, Eric Sandeen

On Wed, Jan 11, 2012 at 09:48:27AM -0500, J. Bruce Fields wrote:
> 
> Yes, this needs to be fixed--is there anything in particular holding up
> these patches?

Yes, I need more people reviewing ext4 patches, so it's not me doing
all of the review.  :-)

I will put this on my "high priority to review ASAP for the next merge
window" list.

						- Ted

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 0/4] [RESEND] 32/64 bit llseek hashes (v5)
  2012-01-11 15:31         ` Ted Ts'o
  (?)
@ 2012-03-05 12:23         ` Bernd Schubert
  -1 siblings, 0 replies; 81+ messages in thread
From: Bernd Schubert @ 2012-03-05 12:23 UTC (permalink / raw)
  To: Ted Ts'o
  Cc: J. Bruce Fields, Andreas Dilger, linux-nfs, ext4 development,
	linux-fsdevel Devel, Fan Yong, J. Bruce Fields, Eric Sandeen

On 01/11/2012 04:31 PM, Ted Ts'o wrote:
> On Wed, Jan 11, 2012 at 09:48:27AM -0500, J. Bruce Fields wrote:
>>
>> Yes, this needs to be fixed--is there anything in particular holding up
>> these patches?
>
> Yes, I need more people reviewing ext4 patches, so it's not me doing
> all of the review.  :-)
>
> I will put this on my "high priority to review ASAP for the next merge
> window" list.


Ted,

did you have a chance to review the patches in the mean time?


I will go on vacation on Wednesday for two weeks in Moscow and I'm not 
sure yet if I will have internet access there. So if I need to update 
the patches for linux-3.4, tomorrow is probably the last day me to do that.


Thanks,
Bernd


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
  2012-01-09 13:21 ` [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type Bernd Schubert
@ 2012-03-05 15:59   ` Ted Ts'o
       [not found]     ` <20120305155939.GE21356-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org>
       [not found]   ` <20120109132148.2616029.68798.stgit-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
  1 sibling, 1 reply; 81+ messages in thread
From: Ted Ts'o @ 2012-03-05 15:59 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: linux-nfs, linux-ext4, linux-fsdevel, Fan Yong, bfields, sandeen,
	Andreas Dilger

On Mon, Jan 09, 2012 at 02:21:48PM +0100, Bernd Schubert wrote:
> diff --git a/fs/ext4/hash.c b/fs/ext4/hash.c
> index ac8f168..fa8e491 100644
> --- a/fs/ext4/hash.c
> +++ b/fs/ext4/hash.c
> @@ -200,8 +200,8 @@ int ext4fs_dirhash(const char *name, int len, struct dx_hash_info *hinfo)
>  		return -1;
>  	}
>  	hash = hash & ~1;
> -	if (hash == (EXT4_HTREE_EOF << 1))
> -		hash = (EXT4_HTREE_EOF-1) << 1;
> +	if (hash == (EXT4_HTREE_EOF_32BIT << 1))
> +		hash = (EXT4_HTREE_EOF_32BIT - 1) << 1;
>  	hinfo->hash = hash;
>  	hinfo->minor_hash = minor_hash;
>  	return 0;

Is there a reason why we don't need to avoid the collsion with the
64-bit EOF value as well?  i.e., I think we also need to add:

	if (hash == (EXT4_HTREE_EOF_64BIT << 1))
		hash = (EXT4_HTREE_EOF_64BIT - 1) << 1;

		       			       	  - Ted

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
  2012-01-09 13:21 ` [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open() Bernd Schubert
@ 2012-03-06  0:08         ` Ted Ts'o
  -1 siblings, 0 replies; 81+ messages in thread
From: Ted Ts'o @ 2012-03-06  0:08 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	yong.fan-KloliPT79xf2eFz/2MeuCQ, J. Bruce Fields,
	sandeen-H+wXaHxf7aLQT0dZR+AlfA, adilger-KloliPT79xf2eFz/2MeuCQ

On Mon, Jan 09, 2012 at 02:21:53PM +0100, Bernd Schubert wrote:
> Just rename this variable, as the next patch will add a flag and
> 'access' as variable name would not be correct any more.
> 
> Signed-off-by: Bernd Schubert <bernd.schubert-mPn0NPGs4xGatNDF+KUbs4QuADTiUCJX@public.gmane.org>
> Acked-by: J. Bruce Fields<bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> ---
>  fs/nfsd/vfs.c |   18 ++++++++++--------
>  1 files changed, 10 insertions(+), 8 deletions(-)

On Mon, Jan 09, 2012 at 02:21:58PM +0100, Bernd Schubert wrote:
> Use 32-bit or 64-bit llseek() hashes for directory offsets depending on
> the NFS version. NFSv2 gets 32-bit hashes only.
> 
> NOTE: This patch got rather complex as Christoph asked to set the
> filp->f_mode flag in the open call or immediatly after dentry_open()
> in nfsd_open() to avoid races.
> Personally I still do not see a reason for that and in my opinion
> FMODE_32BITHASH/FMODE_64BITHASH flags could be set nfsd_readdir(), as it
> follows directly after nfsd_open() without a chance of races.
> 
> Signed-off-by: Bernd Schubert <bernd.schubert-mPn0NPGs4xGatNDF+KUbs4QuADTiUCJX@public.gmane.org>
> Acked-by: J. Bruce Fields<bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> ---
>  fs/nfsd/vfs.c |   15 +++++++++++++--
>  fs/nfsd/vfs.h |    2 ++
>  2 files changed, 15 insertions(+), 2 deletions(-)

NFS folks,

Since Bruce has ack'ed these patches I assume it will be OK if I carry
them in the ext4 tree for merging?  (They depend on changes in ext4
found in the first two patches of these patch series.)

						- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
@ 2012-03-06  0:08         ` Ted Ts'o
  0 siblings, 0 replies; 81+ messages in thread
From: Ted Ts'o @ 2012-03-06  0:08 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: linux-nfs, linux-ext4, linux-fsdevel, yong.fan, J. Bruce Fields,
	sandeen, adilger

On Mon, Jan 09, 2012 at 02:21:53PM +0100, Bernd Schubert wrote:
> Just rename this variable, as the next patch will add a flag and
> 'access' as variable name would not be correct any more.
> 
> Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
> Acked-by: J. Bruce Fields<bfields@redhat.com>
> ---
>  fs/nfsd/vfs.c |   18 ++++++++++--------
>  1 files changed, 10 insertions(+), 8 deletions(-)

On Mon, Jan 09, 2012 at 02:21:58PM +0100, Bernd Schubert wrote:
> Use 32-bit or 64-bit llseek() hashes for directory offsets depending on
> the NFS version. NFSv2 gets 32-bit hashes only.
> 
> NOTE: This patch got rather complex as Christoph asked to set the
> filp->f_mode flag in the open call or immediatly after dentry_open()
> in nfsd_open() to avoid races.
> Personally I still do not see a reason for that and in my opinion
> FMODE_32BITHASH/FMODE_64BITHASH flags could be set nfsd_readdir(), as it
> follows directly after nfsd_open() without a chance of races.
> 
> Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
> Acked-by: J. Bruce Fields<bfields@redhat.com>
> ---
>  fs/nfsd/vfs.c |   15 +++++++++++++--
>  fs/nfsd/vfs.h |    2 ++
>  2 files changed, 15 insertions(+), 2 deletions(-)

NFS folks,

Since Bruce has ack'ed these patches I assume it will be OK if I carry
them in the ext4 tree for merging?  (They depend on changes in ext4
found in the first two patches of these patch series.)

						- Ted

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
  2012-03-05 15:59   ` Ted Ts'o
@ 2012-03-06  0:40         ` Bernd Schubert
  0 siblings, 0 replies; 81+ messages in thread
From: Bernd Schubert @ 2012-03-06  0:40 UTC (permalink / raw)
  To: Ted Ts'o
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Fan Yong,
	bfields-H+wXaHxf7aLQT0dZR+AlfA, sandeen-H+wXaHxf7aLQT0dZR+AlfA,
	Andreas Dilger

On 03/05/2012 04:59 PM, Ted Ts'o wrote:
> On Mon, Jan 09, 2012 at 02:21:48PM +0100, Bernd Schubert wrote:
>> diff --git a/fs/ext4/hash.c b/fs/ext4/hash.c
>> index ac8f168..fa8e491 100644
>> --- a/fs/ext4/hash.c
>> +++ b/fs/ext4/hash.c
>> @@ -200,8 +200,8 @@ int ext4fs_dirhash(const char *name, int len, struct dx_hash_info *hinfo)
>>   		return -1;
>>   	}
>>   	hash = hash&  ~1;
>> -	if (hash == (EXT4_HTREE_EOF<<  1))
>> -		hash = (EXT4_HTREE_EOF-1)<<  1;
>> +	if (hash == (EXT4_HTREE_EOF_32BIT<<  1))
>> +		hash = (EXT4_HTREE_EOF_32BIT - 1)<<  1;
>>   	hinfo->hash = hash;
>>   	hinfo->minor_hash = minor_hash;
>>   	return 0;
>
> Is there a reason why we don't need to avoid the collsion with the
> 64-bit EOF value as well?  i.e., I think we also need to add:
>
> 	if (hash == (EXT4_HTREE_EOF_64BIT<<  1))
> 		hash = (EXT4_HTREE_EOF_64BIT - 1)<<  1;
>
> 		       			       	  - Ted


Thanks for looking into it, really appreciated!

Yeah, you are right, we also should check for 64-bit EOF. But wouldn't 
be something like this be better?

	/* check for hash collision */
	if(is_32bit_api() ) {
		if (hash == (EXT4_HTREE_EOF_32BIT<<  1))
			hash = (EXT4_HTREE_EOF_32BIT - 1)<<  1;
	} else {
		if (hash == (EXT4_HTREE_EOF_64BIT<<  1))
			hash = (EXT4_HTREE_EOF_64BIT - 1)<<  1;
	}



Or am I over engineering?


Thanks,
Bernd


--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
@ 2012-03-06  0:40         ` Bernd Schubert
  0 siblings, 0 replies; 81+ messages in thread
From: Bernd Schubert @ 2012-03-06  0:40 UTC (permalink / raw)
  To: Ted Ts'o
  Cc: linux-nfs, linux-ext4, linux-fsdevel, Fan Yong, bfields, sandeen,
	Andreas Dilger

On 03/05/2012 04:59 PM, Ted Ts'o wrote:
> On Mon, Jan 09, 2012 at 02:21:48PM +0100, Bernd Schubert wrote:
>> diff --git a/fs/ext4/hash.c b/fs/ext4/hash.c
>> index ac8f168..fa8e491 100644
>> --- a/fs/ext4/hash.c
>> +++ b/fs/ext4/hash.c
>> @@ -200,8 +200,8 @@ int ext4fs_dirhash(const char *name, int len, struct dx_hash_info *hinfo)
>>   		return -1;
>>   	}
>>   	hash = hash&  ~1;
>> -	if (hash == (EXT4_HTREE_EOF<<  1))
>> -		hash = (EXT4_HTREE_EOF-1)<<  1;
>> +	if (hash == (EXT4_HTREE_EOF_32BIT<<  1))
>> +		hash = (EXT4_HTREE_EOF_32BIT - 1)<<  1;
>>   	hinfo->hash = hash;
>>   	hinfo->minor_hash = minor_hash;
>>   	return 0;
>
> Is there a reason why we don't need to avoid the collsion with the
> 64-bit EOF value as well?  i.e., I think we also need to add:
>
> 	if (hash == (EXT4_HTREE_EOF_64BIT<<  1))
> 		hash = (EXT4_HTREE_EOF_64BIT - 1)<<  1;
>
> 		       			       	  - Ted


Thanks for looking into it, really appreciated!

Yeah, you are right, we also should check for 64-bit EOF. But wouldn't 
be something like this be better?

	/* check for hash collision */
	if(is_32bit_api() ) {
		if (hash == (EXT4_HTREE_EOF_32BIT<<  1))
			hash = (EXT4_HTREE_EOF_32BIT - 1)<<  1;
	} else {
		if (hash == (EXT4_HTREE_EOF_64BIT<<  1))
			hash = (EXT4_HTREE_EOF_64BIT - 1)<<  1;
	}



Or am I over engineering?


Thanks,
Bernd



^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
  2012-03-06  0:08         ` Ted Ts'o
@ 2012-03-06  2:08             ` J. Bruce Fields
  -1 siblings, 0 replies; 81+ messages in thread
From: J. Bruce Fields @ 2012-03-06  2:08 UTC (permalink / raw)
  To: Ted Ts'o
  Cc: Bernd Schubert, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	yong.fan-KloliPT79xf2eFz/2MeuCQ, sandeen-H+wXaHxf7aLQT0dZR+AlfA,
	adilger-KloliPT79xf2eFz/2MeuCQ

On Mon, Mar 05, 2012 at 07:08:37PM -0500, Ted Ts'o wrote:
> On Mon, Jan 09, 2012 at 02:21:53PM +0100, Bernd Schubert wrote:
> > Just rename this variable, as the next patch will add a flag and
> > 'access' as variable name would not be correct any more.
> > 
> > Signed-off-by: Bernd Schubert <bernd.schubert-mPn0NPGs4xGatNDF+KUbs4QuADTiUCJX@public.gmane.org>
> > Acked-by: J. Bruce Fields<bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> > ---
> >  fs/nfsd/vfs.c |   18 ++++++++++--------
> >  1 files changed, 10 insertions(+), 8 deletions(-)
> 
> On Mon, Jan 09, 2012 at 02:21:58PM +0100, Bernd Schubert wrote:
> > Use 32-bit or 64-bit llseek() hashes for directory offsets depending on
> > the NFS version. NFSv2 gets 32-bit hashes only.
> > 
> > NOTE: This patch got rather complex as Christoph asked to set the
> > filp->f_mode flag in the open call or immediatly after dentry_open()
> > in nfsd_open() to avoid races.
> > Personally I still do not see a reason for that and in my opinion
> > FMODE_32BITHASH/FMODE_64BITHASH flags could be set nfsd_readdir(), as it
> > follows directly after nfsd_open() without a chance of races.
> > 
> > Signed-off-by: Bernd Schubert <bernd.schubert-mPn0NPGs4xGatNDF+KUbs4QuADTiUCJX@public.gmane.org>
> > Acked-by: J. Bruce Fields<bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> > ---
> >  fs/nfsd/vfs.c |   15 +++++++++++++--
> >  fs/nfsd/vfs.h |    2 ++
> >  2 files changed, 15 insertions(+), 2 deletions(-)
> 
> NFS folks,
> 
> Since Bruce has ack'ed these patches I assume it will be OK if I carry
> them in the ext4 tree for merging?  (They depend on changes in ext4
> found in the first two patches of these patch series.)

Yes, that would be fine.

(There may well be some conflict: simplest might be if you can get that
pushed out to a stable public branch somewhere, hopefully before the
merge window opens, and that'll give me a chance to pull it and handle
any merge conflicts.

Then as long as your initial pull request goes in early (so that the
ext4 changes don't get sucked in with mine), everything works.)

--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
@ 2012-03-06  2:08             ` J. Bruce Fields
  0 siblings, 0 replies; 81+ messages in thread
From: J. Bruce Fields @ 2012-03-06  2:08 UTC (permalink / raw)
  To: Ted Ts'o
  Cc: Bernd Schubert, linux-nfs, linux-ext4, linux-fsdevel, yong.fan,
	sandeen, adilger

On Mon, Mar 05, 2012 at 07:08:37PM -0500, Ted Ts'o wrote:
> On Mon, Jan 09, 2012 at 02:21:53PM +0100, Bernd Schubert wrote:
> > Just rename this variable, as the next patch will add a flag and
> > 'access' as variable name would not be correct any more.
> > 
> > Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
> > Acked-by: J. Bruce Fields<bfields@redhat.com>
> > ---
> >  fs/nfsd/vfs.c |   18 ++++++++++--------
> >  1 files changed, 10 insertions(+), 8 deletions(-)
> 
> On Mon, Jan 09, 2012 at 02:21:58PM +0100, Bernd Schubert wrote:
> > Use 32-bit or 64-bit llseek() hashes for directory offsets depending on
> > the NFS version. NFSv2 gets 32-bit hashes only.
> > 
> > NOTE: This patch got rather complex as Christoph asked to set the
> > filp->f_mode flag in the open call or immediatly after dentry_open()
> > in nfsd_open() to avoid races.
> > Personally I still do not see a reason for that and in my opinion
> > FMODE_32BITHASH/FMODE_64BITHASH flags could be set nfsd_readdir(), as it
> > follows directly after nfsd_open() without a chance of races.
> > 
> > Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
> > Acked-by: J. Bruce Fields<bfields@redhat.com>
> > ---
> >  fs/nfsd/vfs.c |   15 +++++++++++++--
> >  fs/nfsd/vfs.h |    2 ++
> >  2 files changed, 15 insertions(+), 2 deletions(-)
> 
> NFS folks,
> 
> Since Bruce has ack'ed these patches I assume it will be OK if I carry
> them in the ext4 tree for merging?  (They depend on changes in ext4
> found in the first two patches of these patch series.)

Yes, that would be fine.

(There may well be some conflict: simplest might be if you can get that
pushed out to a stable public branch somewhere, hopefully before the
merge window opens, and that'll give me a chance to pull it and handle
any merge conflicts.

Then as long as your initial pull request goes in early (so that the
ext4 changes don't get sucked in with mine), everything works.)

--b.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
  2012-03-06  0:40         ` Bernd Schubert
  (?)
@ 2012-03-06  2:28         ` Ted Ts'o
       [not found]           ` <20120306022838.GA24323-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org>
  -1 siblings, 1 reply; 81+ messages in thread
From: Ted Ts'o @ 2012-03-06  2:28 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: linux-nfs, linux-ext4, linux-fsdevel, Fan Yong, bfields, sandeen,
	Andreas Dilger

On Tue, Mar 06, 2012 at 01:40:05AM +0100, Bernd Schubert wrote:
> 
> Yeah, you are right, we also should check for 64-bit EOF. But
> wouldn't be something like this be better?
> 
> 	/* check for hash collision */
> 	if(is_32bit_api() ) {
> 		if (hash == (EXT4_HTREE_EOF_32BIT<<  1))
> 			hash = (EXT4_HTREE_EOF_32BIT - 1)<<  1;
> 	} else {
> 		if (hash == (EXT4_HTREE_EOF_64BIT<<  1))
> 			hash = (EXT4_HTREE_EOF_64BIT - 1)<<  1;
> 	}

Actually, neither change is needed, now that I look at things more
closely.  hash is a __u32, so it could never been
EXT4_HTREE_EOF_64BIT.  But given that we won't let major hash become
larger than 0xfffffffc, that means the largest possible position value
is 0x7ffffffeffffffff.  So using an EOF value of 0x0x7fffffffffffffff
will work fine.

The bigger problem that I found when I looked more closely at the
patch is that the patch uses f_flags in places where f_mode needs to
be used:

static inline loff_t hash2pos(struct file *filp, __u32 major, __u32 minor)
{
	if ((filp->f_flags & FMODE_32BITHASH) ||
                   ^^^^^^^
	    (!(filp->f_flags & FMODE_64BITHASH) && is_32bit_api()))
                     ^^^^^^^
		return major >> 1;
	else
		return ((__u64)(major >> 1) << 32) | (__u64)minor;
}

static inline __u32 pos2maj_hash(struct file *filp, loff_t pos)
{
	if ((filp->f_flags & FMODE_32BITHASH) ||
                   ^^^^^^
	    (!(filp->f_mode & FMODE_64BITHASH) && is_32bit_api()))
                     ^^^^^^
		return (pos << 1) & 0xffffffff;
	else
		return ((pos >> 32) << 1) & 0xffffffff;
}

Which makes me wonder how much this has been tested?

      	       	      	       	    	     - Ted

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
  2012-03-06  2:28         ` Ted Ts'o
@ 2012-03-06  9:59               ` Bernd Schubert
  0 siblings, 0 replies; 81+ messages in thread
From: Bernd Schubert @ 2012-03-06  9:59 UTC (permalink / raw)
  To: Ted Ts'o
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Fan Yong,
	bfields-H+wXaHxf7aLQT0dZR+AlfA, sandeen-H+wXaHxf7aLQT0dZR+AlfA,
	Andreas Dilger

On 03/06/2012 03:28 AM, Ted Ts'o wrote:
> On Tue, Mar 06, 2012 at 01:40:05AM +0100, Bernd Schubert wrote:
>>
>> Yeah, you are right, we also should check for 64-bit EOF. But
>> wouldn't be something like this be better?
>>
>> 	/* check for hash collision */
>> 	if(is_32bit_api() ) {
>> 		if (hash == (EXT4_HTREE_EOF_32BIT<<   1))
>> 			hash = (EXT4_HTREE_EOF_32BIT - 1)<<   1;
>> 	} else {
>> 		if (hash == (EXT4_HTREE_EOF_64BIT<<   1))
>> 			hash = (EXT4_HTREE_EOF_64BIT - 1)<<   1;
>> 	}
>
> Actually, neither change is needed, now that I look at things more
> closely.  hash is a __u32, so it could never been
> EXT4_HTREE_EOF_64BIT.  But given that we won't let major hash become
> larger than 0xfffffffc, that means the largest possible position value
> is 0x7ffffffeffffffff.  So using an EOF value of 0x0x7fffffffffffffff
> will work fine.

Ah, I looked after 1 a.m., seems that was too late for me to notice.

>
> The bigger problem that I found when I looked more closely at the
> patch is that the patch uses f_flags in places where f_mode needs to
> be used:
>
> static inline loff_t hash2pos(struct file *filp, __u32 major, __u32 minor)
> {
> 	if ((filp->f_flags&  FMODE_32BITHASH) ||
>                     ^^^^^^^
> 	    (!(filp->f_flags&  FMODE_64BITHASH)&&  is_32bit_api()))
>                       ^^^^^^^
> 		return major>>  1;
> 	else
> 		return ((__u64)(major>>  1)<<  32) | (__u64)minor;
> }
>
> static inline __u32 pos2maj_hash(struct file *filp, loff_t pos)
> {
> 	if ((filp->f_flags&  FMODE_32BITHASH) ||
>                     ^^^^^^
> 	    (!(filp->f_mode&  FMODE_64BITHASH)&&  is_32bit_api()))
>                       ^^^^^^
> 		return (pos<<  1)&  0xffffffff;
> 	else
> 		return ((pos>>  32)<<  1)&  0xffffffff;
> }
>
> Which makes me wonder how much this has been tested?

Arg, my bad, I introduced this issue when I converted from f_flags to 
f_mode, seems I forgot all of those above :(
Hrm, I thought I had tested sufficiently, but obviously I did not :( 
Here's the test tool.
http://www.pci.uni-heidelberg.de/tc/usr/bernd/downloads/test_seekdir/

While quickly looking, I think it only affects NFSv2, which I think I 
indeed didn't test. I only run tests for 32 bit and 64-bit user space 
and NFSv3. But yes, NFSv2 is an important test too. Not sure if I will 
find time for that today.

Will send an updated version later on.


Thanks for your review,
Bernd
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
@ 2012-03-06  9:59               ` Bernd Schubert
  0 siblings, 0 replies; 81+ messages in thread
From: Bernd Schubert @ 2012-03-06  9:59 UTC (permalink / raw)
  To: Ted Ts'o
  Cc: linux-nfs, linux-ext4, linux-fsdevel, Fan Yong, bfields, sandeen,
	Andreas Dilger

On 03/06/2012 03:28 AM, Ted Ts'o wrote:
> On Tue, Mar 06, 2012 at 01:40:05AM +0100, Bernd Schubert wrote:
>>
>> Yeah, you are right, we also should check for 64-bit EOF. But
>> wouldn't be something like this be better?
>>
>> 	/* check for hash collision */
>> 	if(is_32bit_api() ) {
>> 		if (hash == (EXT4_HTREE_EOF_32BIT<<   1))
>> 			hash = (EXT4_HTREE_EOF_32BIT - 1)<<   1;
>> 	} else {
>> 		if (hash == (EXT4_HTREE_EOF_64BIT<<   1))
>> 			hash = (EXT4_HTREE_EOF_64BIT - 1)<<   1;
>> 	}
>
> Actually, neither change is needed, now that I look at things more
> closely.  hash is a __u32, so it could never been
> EXT4_HTREE_EOF_64BIT.  But given that we won't let major hash become
> larger than 0xfffffffc, that means the largest possible position value
> is 0x7ffffffeffffffff.  So using an EOF value of 0x0x7fffffffffffffff
> will work fine.

Ah, I looked after 1 a.m., seems that was too late for me to notice.

>
> The bigger problem that I found when I looked more closely at the
> patch is that the patch uses f_flags in places where f_mode needs to
> be used:
>
> static inline loff_t hash2pos(struct file *filp, __u32 major, __u32 minor)
> {
> 	if ((filp->f_flags&  FMODE_32BITHASH) ||
>                     ^^^^^^^
> 	    (!(filp->f_flags&  FMODE_64BITHASH)&&  is_32bit_api()))
>                       ^^^^^^^
> 		return major>>  1;
> 	else
> 		return ((__u64)(major>>  1)<<  32) | (__u64)minor;
> }
>
> static inline __u32 pos2maj_hash(struct file *filp, loff_t pos)
> {
> 	if ((filp->f_flags&  FMODE_32BITHASH) ||
>                     ^^^^^^
> 	    (!(filp->f_mode&  FMODE_64BITHASH)&&  is_32bit_api()))
>                       ^^^^^^
> 		return (pos<<  1)&  0xffffffff;
> 	else
> 		return ((pos>>  32)<<  1)&  0xffffffff;
> }
>
> Which makes me wonder how much this has been tested?

Arg, my bad, I introduced this issue when I converted from f_flags to 
f_mode, seems I forgot all of those above :(
Hrm, I thought I had tested sufficiently, but obviously I did not :( 
Here's the test tool.
http://www.pci.uni-heidelberg.de/tc/usr/bernd/downloads/test_seekdir/

While quickly looking, I think it only affects NFSv2, which I think I 
indeed didn't test. I only run tests for 32 bit and 64-bit user space 
and NFSv3. But yes, NFSv2 is an important test too. Not sure if I will 
find time for that today.

Will send an updated version later on.


Thanks for your review,
Bernd

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
  2012-03-06  9:59               ` Bernd Schubert
@ 2012-03-06 15:15                   ` Ted Ts'o
  -1 siblings, 0 replies; 81+ messages in thread
From: Ted Ts'o @ 2012-03-06 15:15 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Fan Yong,
	bfields-H+wXaHxf7aLQT0dZR+AlfA, sandeen-H+wXaHxf7aLQT0dZR+AlfA,
	Andreas Dilger

On Tue, Mar 06, 2012 at 10:59:55AM +0100, Bernd Schubert wrote:
> Arg, my bad, I introduced this issue when I converted from f_flags
> to f_mode, seems I forgot all of those above :(
> Hrm, I thought I had tested sufficiently, but obviously I did not :(
> Here's the test tool.
> http://www.pci.uni-heidelberg.de/tc/usr/bernd/downloads/test_seekdir/
> 
> While quickly looking, I think it only affects NFSv2, which I think
> I indeed didn't test. I only run tests for 32 bit and 64-bit user
> space and NFSv3. But yes, NFSv2 is an important test too. Not sure
> if I will find time for that today.

I think the problem case to worry about would have been on a 32-bit
NFS server (think embedded/bookshelf NAS servers) and NFSv3, since
that's where the inconsistency would have meant that hash2pos would
use the 32-bit codepath, but pos2maj_hash would use the 64-bit
codepath, and so then...  kablooey.

> Will send an updated version later on.

If it's just those changes, no worries, as I've already fixed up the
f_flags vs. f_mode issue in my copy of the patches.

Let me know if there are any other changes that you'd like to make or
issues that you've spotted.

						- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
@ 2012-03-06 15:15                   ` Ted Ts'o
  0 siblings, 0 replies; 81+ messages in thread
From: Ted Ts'o @ 2012-03-06 15:15 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: linux-nfs, linux-ext4, linux-fsdevel, Fan Yong, bfields, sandeen,
	Andreas Dilger

On Tue, Mar 06, 2012 at 10:59:55AM +0100, Bernd Schubert wrote:
> Arg, my bad, I introduced this issue when I converted from f_flags
> to f_mode, seems I forgot all of those above :(
> Hrm, I thought I had tested sufficiently, but obviously I did not :(
> Here's the test tool.
> http://www.pci.uni-heidelberg.de/tc/usr/bernd/downloads/test_seekdir/
> 
> While quickly looking, I think it only affects NFSv2, which I think
> I indeed didn't test. I only run tests for 32 bit and 64-bit user
> space and NFSv3. But yes, NFSv2 is an important test too. Not sure
> if I will find time for that today.

I think the problem case to worry about would have been on a 32-bit
NFS server (think embedded/bookshelf NAS servers) and NFSv3, since
that's where the inconsistency would have meant that hash2pos would
use the 32-bit codepath, but pos2maj_hash would use the 64-bit
codepath, and so then...  kablooey.

> Will send an updated version later on.

If it's just those changes, no worries, as I've already fixed up the
f_flags vs. f_mode issue in my copy of the patches.

Let me know if there are any other changes that you'd like to make or
issues that you've spotted.

						- Ted

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
  2012-03-06  2:08             ` J. Bruce Fields
  (?)
@ 2012-03-06 15:18             ` Ted Ts'o
  2012-03-06 15:28               ` J. Bruce Fields
  -1 siblings, 1 reply; 81+ messages in thread
From: Ted Ts'o @ 2012-03-06 15:18 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Bernd Schubert, linux-nfs, linux-ext4, linux-fsdevel, yong.fan,
	sandeen, adilger

On Mon, Mar 05, 2012 at 09:08:19PM -0500, J. Bruce Fields wrote:
> Yes, that would be fine.
> 
> (There may well be some conflict: simplest might be if you can get that
> pushed out to a stable public branch somewhere, hopefully before the
> merge window opens, and that'll give me a chance to pull it and handle
> any merge conflicts.

Are the commits you're worried about on the NFS side already in
linux-next?  If so, I'll do a quick test merge and see if there are
any conflicts.  If so, I'll segregate out these patches into a
separate pre-merge branch --- and then if you like you could pull them
into your branch and we can deal with it that way.

     	  	     	    	      	 - Ted

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
  2012-03-06 15:18             ` Ted Ts'o
@ 2012-03-06 15:28               ` J. Bruce Fields
  2012-03-09 20:51                 ` Ted Ts'o
  0 siblings, 1 reply; 81+ messages in thread
From: J. Bruce Fields @ 2012-03-06 15:28 UTC (permalink / raw)
  To: Ted Ts'o
  Cc: Bernd Schubert, linux-nfs, linux-ext4, linux-fsdevel, yong.fan,
	sandeen, adilger

On Tue, Mar 06, 2012 at 10:18:16AM -0500, Ted Ts'o wrote:
> On Mon, Mar 05, 2012 at 09:08:19PM -0500, J. Bruce Fields wrote:
> > Yes, that would be fine.
> > 
> > (There may well be some conflict: simplest might be if you can get that
> > pushed out to a stable public branch somewhere, hopefully before the
> > merge window opens, and that'll give me a chance to pull it and handle
> > any merge conflicts.
> 
> Are the commits you're worried about on the NFS side already in
> linux-next?

No (though there might be some other conflicts there I haven't thought
of).

> If so, I'll do a quick test merge and see if there are
> any conflicts.  If so, I'll segregate out these patches into a
> separate pre-merge branch --- and then if you like you could pull them
> into your branch and we can deal with it that way.

That sounds like a good way to do it regardless?

--b.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
  2012-03-06 15:15                   ` Ted Ts'o
@ 2012-03-07  9:01                       ` Bernd Schubert
  -1 siblings, 0 replies; 81+ messages in thread
From: Bernd Schubert @ 2012-03-07  9:01 UTC (permalink / raw)
  To: Ted Ts'o
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Fan Yong,
	bfields-H+wXaHxf7aLQT0dZR+AlfA, sandeen-H+wXaHxf7aLQT0dZR+AlfA,
	Andreas Dilger

On 03/06/2012 04:15 PM, Ted Ts'o wrote:
> On Tue, Mar 06, 2012 at 10:59:55AM +0100, Bernd Schubert wrote:
>> Arg, my bad, I introduced this issue when I converted from f_flags
>> to f_mode, seems I forgot all of those above :(
>> Hrm, I thought I had tested sufficiently, but obviously I did not :(
>> Here's the test tool.
>> http://www.pci.uni-heidelberg.de/tc/usr/bernd/downloads/test_seekdir/
>>
>> While quickly looking, I think it only affects NFSv2, which I think
>> I indeed didn't test. I only run tests for 32 bit and 64-bit user
>> space and NFSv3. But yes, NFSv2 is an important test too. Not sure
>> if I will find time for that today.
>
> I think the problem case to worry about would have been on a 32-bit
> NFS server (think embedded/bookshelf NAS servers) and NFSv3, since
> that's where the inconsistency would have meant that hash2pos would
> use the 32-bit codepath, but pos2maj_hash would use the 64-bit
> codepath, and so then...  kablooey.
>
>> Will send an updated version later on.
>
> If it's just those changes, no worries, as I've already fixed up the
> f_flags vs. f_mode issue in my copy of the patches.
>
> Let me know if there are any other changes that you'd like to make or
> issues that you've spotted.

Ted, thanks a bunch! I really will try to be more careful updating 
patches next time. And during the next days I'm also going to review the 
patches again, at least I hope I find the time for that while being on 
vacation).


Thanks again,
Bernd
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
@ 2012-03-07  9:01                       ` Bernd Schubert
  0 siblings, 0 replies; 81+ messages in thread
From: Bernd Schubert @ 2012-03-07  9:01 UTC (permalink / raw)
  To: Ted Ts'o
  Cc: linux-nfs, linux-ext4, linux-fsdevel, Fan Yong, bfields, sandeen,
	Andreas Dilger

On 03/06/2012 04:15 PM, Ted Ts'o wrote:
> On Tue, Mar 06, 2012 at 10:59:55AM +0100, Bernd Schubert wrote:
>> Arg, my bad, I introduced this issue when I converted from f_flags
>> to f_mode, seems I forgot all of those above :(
>> Hrm, I thought I had tested sufficiently, but obviously I did not :(
>> Here's the test tool.
>> http://www.pci.uni-heidelberg.de/tc/usr/bernd/downloads/test_seekdir/
>>
>> While quickly looking, I think it only affects NFSv2, which I think
>> I indeed didn't test. I only run tests for 32 bit and 64-bit user
>> space and NFSv3. But yes, NFSv2 is an important test too. Not sure
>> if I will find time for that today.
>
> I think the problem case to worry about would have been on a 32-bit
> NFS server (think embedded/bookshelf NAS servers) and NFSv3, since
> that's where the inconsistency would have meant that hash2pos would
> use the 32-bit codepath, but pos2maj_hash would use the 64-bit
> codepath, and so then...  kablooey.
>
>> Will send an updated version later on.
>
> If it's just those changes, no worries, as I've already fixed up the
> f_flags vs. f_mode issue in my copy of the patches.
>
> Let me know if there are any other changes that you'd like to make or
> issues that you've spotted.

Ted, thanks a bunch! I really will try to be more careful updating 
patches next time. And during the next days I'm also going to review the 
patches again, at least I hope I find the time for that while being on 
vacation).


Thanks again,
Bernd

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
  2012-03-06 15:28               ` J. Bruce Fields
@ 2012-03-09 20:51                 ` Ted Ts'o
       [not found]                   ` <20120309205148.GB5635-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org>
  0 siblings, 1 reply; 81+ messages in thread
From: Ted Ts'o @ 2012-03-09 20:51 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Bernd Schubert, linux-nfs, linux-ext4, linux-fsdevel, yong.fan,
	sandeen, adilger

On Tue, Mar 06, 2012 at 10:28:43AM -0500, J. Bruce Fields wrote:
> > If so, I'll do a quick test merge and see if there are
> > any conflicts.  If so, I'll segregate out these patches into a
> > separate pre-merge branch --- and then if you like you could pull them
> > into your branch and we can deal with it that way.
> 
> That sounds like a good way to do it regardless?

Linus in general doesn't like cross tree merges (or any extraneous
merges) unless they are absolutely necessary; but then, he trusts git
merges more than many of us do.  :-)

I've put the the patch series on a separate patch, with a signed tag, at:

     git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git nfs-ext4-premerge

Can you confirm that you've pulled it into your tree and it's landed
in linux-next?  I probably won't bother pulling it in mine unless
there is definitely a merge conflict that I need to resolve.  (I've
checked and there do not appear to be any against linux-next as of
last night.)


					- Ted

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
  2012-03-09 20:51                 ` Ted Ts'o
@ 2012-03-12 15:09                       ` Ted Ts'o
  0 siblings, 0 replies; 81+ messages in thread
From: Ted Ts'o @ 2012-03-12 15:09 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Bernd Schubert, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	yong.fan-KloliPT79xf2eFz/2MeuCQ, sandeen-H+wXaHxf7aLQT0dZR+AlfA,
	adilger-KloliPT79xf2eFz/2MeuCQ

On Fri, Mar 09, 2012 at 03:51:48PM -0500, Ted Ts'o wrote:
> 
> Linus in general doesn't like cross tree merges (or any extraneous
> merges) unless they are absolutely necessary; but then, he trusts git
> merges more than many of us do.  :-)
> 
> I've put the the patch series on a separate patch, with a signed tag, at:
> 
>      git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git nfs-ext4-premerge
> 
> Can you confirm that you've pulled it into your tree and it's landed
> in linux-next?  I probably won't bother pulling it in mine unless
> there is definitely a merge conflict that I need to resolve.  (I've
> checked and there do not appear to be any against linux-next as of
> last night.)

Ping?

						- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
@ 2012-03-12 15:09                       ` Ted Ts'o
  0 siblings, 0 replies; 81+ messages in thread
From: Ted Ts'o @ 2012-03-12 15:09 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Bernd Schubert, linux-nfs, linux-ext4, linux-fsdevel, yong.fan,
	sandeen, adilger

On Fri, Mar 09, 2012 at 03:51:48PM -0500, Ted Ts'o wrote:
> 
> Linus in general doesn't like cross tree merges (or any extraneous
> merges) unless they are absolutely necessary; but then, he trusts git
> merges more than many of us do.  :-)
> 
> I've put the the patch series on a separate patch, with a signed tag, at:
> 
>      git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git nfs-ext4-premerge
> 
> Can you confirm that you've pulled it into your tree and it's landed
> in linux-next?  I probably won't bother pulling it in mine unless
> there is definitely a merge conflict that I need to resolve.  (I've
> checked and there do not appear to be any against linux-next as of
> last night.)

Ping?

						- Ted

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
  2012-03-12 15:09                       ` Ted Ts'o
@ 2012-03-12 15:49                           ` J. Bruce Fields
  -1 siblings, 0 replies; 81+ messages in thread
From: J. Bruce Fields @ 2012-03-12 15:49 UTC (permalink / raw)
  To: Ted Ts'o
  Cc: Bernd Schubert, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	yong.fan-KloliPT79xf2eFz/2MeuCQ, sandeen-H+wXaHxf7aLQT0dZR+AlfA,
	adilger-KloliPT79xf2eFz/2MeuCQ

On Mon, Mar 12, 2012 at 11:09:12AM -0400, Ted Ts'o wrote:
> On Fri, Mar 09, 2012 at 03:51:48PM -0500, Ted Ts'o wrote:
> > 
> > Linus in general doesn't like cross tree merges (or any extraneous
> > merges) unless they are absolutely necessary; but then, he trusts git
> > merges more than many of us do.  :-)
> > 
> > I've put the the patch series on a separate patch, with a signed tag, at:
> > 
> >      git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git nfs-ext4-premerge
> > 
> > Can you confirm that you've pulled it into your tree and it's landed
> > in linux-next?  I probably won't bother pulling it in mine unless
> > there is definitely a merge conflict that I need to resolve.  (I've
> > checked and there do not appear to be any against linux-next as of
> > last night.)
> 
> Ping?

Whoops, sorry.

Looks perfect, thanks for handling these!

I'm running them through my usual regression tests and then I'll push
the result out soon, with a "here's why I'm merging this" comment on the
merge commit.  (Though I can't see it would trigger Linus's usual
back-merge rant anyway, as it's obviously quite targetted, not a case of
"oh I think I'll update to -rc5 now".)

--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
@ 2012-03-12 15:49                           ` J. Bruce Fields
  0 siblings, 0 replies; 81+ messages in thread
From: J. Bruce Fields @ 2012-03-12 15:49 UTC (permalink / raw)
  To: Ted Ts'o
  Cc: Bernd Schubert, linux-nfs, linux-ext4, linux-fsdevel, yong.fan,
	sandeen, adilger

On Mon, Mar 12, 2012 at 11:09:12AM -0400, Ted Ts'o wrote:
> On Fri, Mar 09, 2012 at 03:51:48PM -0500, Ted Ts'o wrote:
> > 
> > Linus in general doesn't like cross tree merges (or any extraneous
> > merges) unless they are absolutely necessary; but then, he trusts git
> > merges more than many of us do.  :-)
> > 
> > I've put the the patch series on a separate patch, with a signed tag, at:
> > 
> >      git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git nfs-ext4-premerge
> > 
> > Can you confirm that you've pulled it into your tree and it's landed
> > in linux-next?  I probably won't bother pulling it in mine unless
> > there is definitely a merge conflict that I need to resolve.  (I've
> > checked and there do not appear to be any against linux-next as of
> > last night.)
> 
> Ping?

Whoops, sorry.

Looks perfect, thanks for handling these!

I'm running them through my usual regression tests and then I'll push
the result out soon, with a "here's why I'm merging this" comment on the
merge commit.  (Though I can't see it would trigger Linus's usual
back-merge rant anyway, as it's obviously quite targetted, not a case of
"oh I think I'll update to -rc5 now".)

--b.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
  2012-03-12 15:49                           ` J. Bruce Fields
  (?)
@ 2012-03-12 22:22                           ` J. Bruce Fields
  2012-03-13 20:01                             ` J. Bruce Fields
  -1 siblings, 1 reply; 81+ messages in thread
From: J. Bruce Fields @ 2012-03-12 22:22 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Ted Ts'o, Bernd Schubert, linux-nfs, linux-ext4,
	linux-fsdevel, yong.fan, sandeen, adilger

On Mon, Mar 12, 2012 at 11:49:21AM -0400, J. Bruce Fields wrote:
> On Mon, Mar 12, 2012 at 11:09:12AM -0400, Ted Ts'o wrote:
> > On Fri, Mar 09, 2012 at 03:51:48PM -0500, Ted Ts'o wrote:
> > > 
> > > Linus in general doesn't like cross tree merges (or any extraneous
> > > merges) unless they are absolutely necessary; but then, he trusts git
> > > merges more than many of us do.  :-)
> > > 
> > > I've put the the patch series on a separate patch, with a signed tag, at:
> > > 
> > >      git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git nfs-ext4-premerge
> > > 
> > > Can you confirm that you've pulled it into your tree and it's landed
> > > in linux-next?  I probably won't bother pulling it in mine unless
> > > there is definitely a merge conflict that I need to resolve.  (I've
> > > checked and there do not appear to be any against linux-next as of
> > > last night.)
> > 
> > Ping?
> 
> Whoops, sorry.
> 
> Looks perfect, thanks for handling these!
> 
> I'm running them through my usual regression tests

Urgh, I'm seeing a failure on the telldir test (part of the "special"
connectathon tests).  I haven't looked at what it's trying to do yet.
Reproduceable even just on a local filesystem without NFS involved.  If
someone wants to look at it, you can just do:

	git clone git://linux-nfs.org/~bfields/cthon04.git
	cd cthon04
	make 
	NFS_TESTDIR=/somewhere_on_an_ext4_fs/TMP ./runtests -s

(Or after running that, more specifically,

	cd /somewhere_on_an_ext4_fs/TMP/
	./telldir

)

I'll look at it tomorrow.

--b.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
  2012-03-12 22:22                           ` J. Bruce Fields
@ 2012-03-13 20:01                             ` J. Bruce Fields
       [not found]                               ` <20120313200117.GA21991-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
  0 siblings, 1 reply; 81+ messages in thread
From: J. Bruce Fields @ 2012-03-13 20:01 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Ted Ts'o, Bernd Schubert, linux-nfs, linux-ext4,
	linux-fsdevel, yong.fan, sandeen, adilger

On Mon, Mar 12, 2012 at 06:22:50PM -0400, bfields wrote:
> On Mon, Mar 12, 2012 at 11:49:21AM -0400, J. Bruce Fields wrote:
> > On Mon, Mar 12, 2012 at 11:09:12AM -0400, Ted Ts'o wrote:
> > > On Fri, Mar 09, 2012 at 03:51:48PM -0500, Ted Ts'o wrote:
> > > > 
> > > > Linus in general doesn't like cross tree merges (or any extraneous
> > > > merges) unless they are absolutely necessary; but then, he trusts git
> > > > merges more than many of us do.  :-)
> > > > 
> > > > I've put the the patch series on a separate patch, with a signed tag, at:
> > > > 
> > > >      git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git nfs-ext4-premerge
> > > > 
> > > > Can you confirm that you've pulled it into your tree and it's landed
> > > > in linux-next?  I probably won't bother pulling it in mine unless
> > > > there is definitely a merge conflict that I need to resolve.  (I've
> > > > checked and there do not appear to be any against linux-next as of
> > > > last night.)
> > > 
> > > Ping?
> > 
> > Whoops, sorry.
> > 
> > Looks perfect, thanks for handling these!
> > 
> > I'm running them through my usual regression tests
> 
> Urgh, I'm seeing a failure on the telldir test (part of the "special"
> connectathon tests).  I haven't looked at what it's trying to do yet.
> Reproduceable even just on a local filesystem without NFS involved.  If
> someone wants to look at it, you can just do:
> 
> 	git clone git://linux-nfs.org/~bfields/cthon04.git
> 	cd cthon04
> 	make 
> 	NFS_TESTDIR=/somewhere_on_an_ext4_fs/TMP ./runtests -s
> 
> (Or after running that, more specifically,
> 
> 	cd /somewhere_on_an_ext4_fs/TMP/
> 	./telldir
> 
> )
> 
> I'll look at it tomorrow.

Looks like seekdir is just not accepting an offset returned from
getdents:

	openat(AT_FDCWD, "telldir-test", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
	getdents(3, {...{d_ino=20386, d_off=5728083968307607285, d_reclen=24, d_name="192"}...} = 4848
	lseek(3, 5728083968307607285, SEEK_SET) = -1 EINVAL (Invalid argument)

Still investigating.

--b.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
  2012-03-13 20:01                             ` J. Bruce Fields
@ 2012-03-13 20:03                                   ` Bernd Schubert
  0 siblings, 0 replies; 81+ messages in thread
From: Bernd Schubert @ 2012-03-13 20:03 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: J. Bruce Fields, Ted Ts'o, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	yong.fan-KloliPT79xf2eFz/2MeuCQ, sandeen-H+wXaHxf7aLQT0dZR+AlfA,
	adilger-KloliPT79xf2eFz/2MeuCQ

On 03/13/2012 09:01 PM, J. Bruce Fields wrote:
> On Mon, Mar 12, 2012 at 06:22:50PM -0400, bfields wrote:
>> On Mon, Mar 12, 2012 at 11:49:21AM -0400, J. Bruce Fields wrote:
>>> On Mon, Mar 12, 2012 at 11:09:12AM -0400, Ted Ts'o wrote:
>>>> On Fri, Mar 09, 2012 at 03:51:48PM -0500, Ted Ts'o wrote:
>>>>>
>>>>> Linus in general doesn't like cross tree merges (or any extraneous
>>>>> merges) unless they are absolutely necessary; but then, he trusts git
>>>>> merges more than many of us do.  :-)
>>>>>
>>>>> I've put the the patch series on a separate patch, with a signed tag, at:
>>>>>
>>>>>       git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git nfs-ext4-premerge
>>>>>
>>>>> Can you confirm that you've pulled it into your tree and it's landed
>>>>> in linux-next?  I probably won't bother pulling it in mine unless
>>>>> there is definitely a merge conflict that I need to resolve.  (I've
>>>>> checked and there do not appear to be any against linux-next as of
>>>>> last night.)
>>>>
>>>> Ping?
>>>
>>> Whoops, sorry.
>>>
>>> Looks perfect, thanks for handling these!
>>>
>>> I'm running them through my usual regression tests
>>
>> Urgh, I'm seeing a failure on the telldir test (part of the "special"
>> connectathon tests).  I haven't looked at what it's trying to do yet.
>> Reproduceable even just on a local filesystem without NFS involved.  If
>> someone wants to look at it, you can just do:
>>
>> 	git clone git://linux-nfs.org/~bfields/cthon04.git
>> 	cd cthon04
>> 	make
>> 	NFS_TESTDIR=/somewhere_on_an_ext4_fs/TMP ./runtests -s
>>
>> (Or after running that, more specifically,
>>
>> 	cd /somewhere_on_an_ext4_fs/TMP/
>> 	./telldir
>>
>> )
>>
>> I'll look at it tomorrow.
>
> Looks like seekdir is just not accepting an offset returned from
> getdents:
>
> 	openat(AT_FDCWD, "telldir-test", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
> 	getdents(3, {...{d_ino=20386, d_off=5728083968307607285, d_reclen=24, d_name="192"}...} = 4848
> 	lseek(3, 5728083968307607285, SEEK_SET) = -1 EINVAL (Invalid argument)
>
> Still investigating.

Thanks! I'm also just going to work on it (just noticed your mail while 
on vacation ...).

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
@ 2012-03-13 20:03                                   ` Bernd Schubert
  0 siblings, 0 replies; 81+ messages in thread
From: Bernd Schubert @ 2012-03-13 20:03 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: J. Bruce Fields, Ted Ts'o, linux-nfs, linux-ext4,
	linux-fsdevel, yong.fan, sandeen, adilger

On 03/13/2012 09:01 PM, J. Bruce Fields wrote:
> On Mon, Mar 12, 2012 at 06:22:50PM -0400, bfields wrote:
>> On Mon, Mar 12, 2012 at 11:49:21AM -0400, J. Bruce Fields wrote:
>>> On Mon, Mar 12, 2012 at 11:09:12AM -0400, Ted Ts'o wrote:
>>>> On Fri, Mar 09, 2012 at 03:51:48PM -0500, Ted Ts'o wrote:
>>>>>
>>>>> Linus in general doesn't like cross tree merges (or any extraneous
>>>>> merges) unless they are absolutely necessary; but then, he trusts git
>>>>> merges more than many of us do.  :-)
>>>>>
>>>>> I've put the the patch series on a separate patch, with a signed tag, at:
>>>>>
>>>>>       git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git nfs-ext4-premerge
>>>>>
>>>>> Can you confirm that you've pulled it into your tree and it's landed
>>>>> in linux-next?  I probably won't bother pulling it in mine unless
>>>>> there is definitely a merge conflict that I need to resolve.  (I've
>>>>> checked and there do not appear to be any against linux-next as of
>>>>> last night.)
>>>>
>>>> Ping?
>>>
>>> Whoops, sorry.
>>>
>>> Looks perfect, thanks for handling these!
>>>
>>> I'm running them through my usual regression tests
>>
>> Urgh, I'm seeing a failure on the telldir test (part of the "special"
>> connectathon tests).  I haven't looked at what it's trying to do yet.
>> Reproduceable even just on a local filesystem without NFS involved.  If
>> someone wants to look at it, you can just do:
>>
>> 	git clone git://linux-nfs.org/~bfields/cthon04.git
>> 	cd cthon04
>> 	make
>> 	NFS_TESTDIR=/somewhere_on_an_ext4_fs/TMP ./runtests -s
>>
>> (Or after running that, more specifically,
>>
>> 	cd /somewhere_on_an_ext4_fs/TMP/
>> 	./telldir
>>
>> )
>>
>> I'll look at it tomorrow.
>
> Looks like seekdir is just not accepting an offset returned from
> getdents:
>
> 	openat(AT_FDCWD, "telldir-test", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
> 	getdents(3, {...{d_ino=20386, d_off=5728083968307607285, d_reclen=24, d_name="192"}...} = 4848
> 	lseek(3, 5728083968307607285, SEEK_SET) = -1 EINVAL (Invalid argument)
>
> Still investigating.

Thanks! I'm also just going to work on it (just noticed your mail while 
on vacation ...).


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
  2012-03-13 20:03                                   ` Bernd Schubert
@ 2012-03-13 20:34                                       ` J. Bruce Fields
  -1 siblings, 0 replies; 81+ messages in thread
From: J. Bruce Fields @ 2012-03-13 20:34 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: J. Bruce Fields, Ted Ts'o, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	yong.fan-KloliPT79xf2eFz/2MeuCQ, sandeen-H+wXaHxf7aLQT0dZR+AlfA,
	adilger-KloliPT79xf2eFz/2MeuCQ

On Tue, Mar 13, 2012 at 09:03:51PM +0100, Bernd Schubert wrote:
> On 03/13/2012 09:01 PM, J. Bruce Fields wrote:
> >On Mon, Mar 12, 2012 at 06:22:50PM -0400, bfields wrote:
> >>On Mon, Mar 12, 2012 at 11:49:21AM -0400, J. Bruce Fields wrote:
> >>>On Mon, Mar 12, 2012 at 11:09:12AM -0400, Ted Ts'o wrote:
> >>>>On Fri, Mar 09, 2012 at 03:51:48PM -0500, Ted Ts'o wrote:
> >>>>>
> >>>>>Linus in general doesn't like cross tree merges (or any extraneous
> >>>>>merges) unless they are absolutely necessary; but then, he trusts git
> >>>>>merges more than many of us do.  :-)
> >>>>>
> >>>>>I've put the the patch series on a separate patch, with a signed tag, at:
> >>>>>
> >>>>>      git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git nfs-ext4-premerge
> >>>>>
> >>>>>Can you confirm that you've pulled it into your tree and it's landed
> >>>>>in linux-next?  I probably won't bother pulling it in mine unless
> >>>>>there is definitely a merge conflict that I need to resolve.  (I've
> >>>>>checked and there do not appear to be any against linux-next as of
> >>>>>last night.)
> >>>>
> >>>>Ping?
> >>>
> >>>Whoops, sorry.
> >>>
> >>>Looks perfect, thanks for handling these!
> >>>
> >>>I'm running them through my usual regression tests
> >>
> >>Urgh, I'm seeing a failure on the telldir test (part of the "special"
> >>connectathon tests).  I haven't looked at what it's trying to do yet.
> >>Reproduceable even just on a local filesystem without NFS involved.  If
> >>someone wants to look at it, you can just do:
> >>
> >>	git clone git://linux-nfs.org/~bfields/cthon04.git
> >>	cd cthon04
> >>	make
> >>	NFS_TESTDIR=/somewhere_on_an_ext4_fs/TMP ./runtests -s
> >>
> >>(Or after running that, more specifically,
> >>
> >>	cd /somewhere_on_an_ext4_fs/TMP/
> >>	./telldir
> >>
> >>)
> >>
> >>I'll look at it tomorrow.
> >
> >Looks like seekdir is just not accepting an offset returned from
> >getdents:
> >
> >	openat(AT_FDCWD, "telldir-test", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
> >	getdents(3, {...{d_ino=20386, d_off=5728083968307607285, d_reclen=24, d_name="192"}...} = 4848
> >	lseek(3, 5728083968307607285, SEEK_SET) = -1 EINVAL (Invalid argument)
> >
> >Still investigating.
> 
> Thanks! I'm also just going to work on it (just noticed your mail
> while on vacation ...).

OK, good.  You can probably figure this out faster than me....

--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
@ 2012-03-13 20:34                                       ` J. Bruce Fields
  0 siblings, 0 replies; 81+ messages in thread
From: J. Bruce Fields @ 2012-03-13 20:34 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: J. Bruce Fields, Ted Ts'o, linux-nfs, linux-ext4,
	linux-fsdevel, yong.fan, sandeen, adilger

On Tue, Mar 13, 2012 at 09:03:51PM +0100, Bernd Schubert wrote:
> On 03/13/2012 09:01 PM, J. Bruce Fields wrote:
> >On Mon, Mar 12, 2012 at 06:22:50PM -0400, bfields wrote:
> >>On Mon, Mar 12, 2012 at 11:49:21AM -0400, J. Bruce Fields wrote:
> >>>On Mon, Mar 12, 2012 at 11:09:12AM -0400, Ted Ts'o wrote:
> >>>>On Fri, Mar 09, 2012 at 03:51:48PM -0500, Ted Ts'o wrote:
> >>>>>
> >>>>>Linus in general doesn't like cross tree merges (or any extraneous
> >>>>>merges) unless they are absolutely necessary; but then, he trusts git
> >>>>>merges more than many of us do.  :-)
> >>>>>
> >>>>>I've put the the patch series on a separate patch, with a signed tag, at:
> >>>>>
> >>>>>      git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git nfs-ext4-premerge
> >>>>>
> >>>>>Can you confirm that you've pulled it into your tree and it's landed
> >>>>>in linux-next?  I probably won't bother pulling it in mine unless
> >>>>>there is definitely a merge conflict that I need to resolve.  (I've
> >>>>>checked and there do not appear to be any against linux-next as of
> >>>>>last night.)
> >>>>
> >>>>Ping?
> >>>
> >>>Whoops, sorry.
> >>>
> >>>Looks perfect, thanks for handling these!
> >>>
> >>>I'm running them through my usual regression tests
> >>
> >>Urgh, I'm seeing a failure on the telldir test (part of the "special"
> >>connectathon tests).  I haven't looked at what it's trying to do yet.
> >>Reproduceable even just on a local filesystem without NFS involved.  If
> >>someone wants to look at it, you can just do:
> >>
> >>	git clone git://linux-nfs.org/~bfields/cthon04.git
> >>	cd cthon04
> >>	make
> >>	NFS_TESTDIR=/somewhere_on_an_ext4_fs/TMP ./runtests -s
> >>
> >>(Or after running that, more specifically,
> >>
> >>	cd /somewhere_on_an_ext4_fs/TMP/
> >>	./telldir
> >>
> >>)
> >>
> >>I'll look at it tomorrow.
> >
> >Looks like seekdir is just not accepting an offset returned from
> >getdents:
> >
> >	openat(AT_FDCWD, "telldir-test", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
> >	getdents(3, {...{d_ino=20386, d_off=5728083968307607285, d_reclen=24, d_name="192"}...} = 4848
> >	lseek(3, 5728083968307607285, SEEK_SET) = -1 EINVAL (Invalid argument)
> >
> >Still investigating.
> 
> Thanks! I'm also just going to work on it (just noticed your mail
> while on vacation ...).

OK, good.  You can probably figure this out faster than me....

--b.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
  2012-03-13 20:34                                       ` J. Bruce Fields
  (?)
@ 2012-03-13 21:09                                       ` Bernd Schubert
  2012-03-13 21:29                                         ` J. Bruce Fields
  -1 siblings, 1 reply; 81+ messages in thread
From: Bernd Schubert @ 2012-03-13 21:09 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Bernd Schubert, J. Bruce Fields, Ted Ts'o, linux-nfs,
	linux-ext4, linux-fsdevel, yong.fan, sandeen, adilger

On 03/13/2012 09:34 PM, J. Bruce Fields wrote:
> On Tue, Mar 13, 2012 at 09:03:51PM +0100, Bernd Schubert wrote:
>> On 03/13/2012 09:01 PM, J. Bruce Fields wrote:
>>> On Mon, Mar 12, 2012 at 06:22:50PM -0400, bfields wrote:
>>>> On Mon, Mar 12, 2012 at 11:49:21AM -0400, J. Bruce Fields wrote:
>>>>> On Mon, Mar 12, 2012 at 11:09:12AM -0400, Ted Ts'o wrote:
>>>>>> On Fri, Mar 09, 2012 at 03:51:48PM -0500, Ted Ts'o wrote:
>>>>>>>
>>>>>>> Linus in general doesn't like cross tree merges (or any extraneous
>>>>>>> merges) unless they are absolutely necessary; but then, he trusts git
>>>>>>> merges more than many of us do.  :-)
>>>>>>>
>>>>>>> I've put the the patch series on a separate patch, with a signed tag, at:
>>>>>>>
>>>>>>>       git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git nfs-ext4-premerge
>>>>>>>
>>>>>>> Can you confirm that you've pulled it into your tree and it's landed
>>>>>>> in linux-next?  I probably won't bother pulling it in mine unless
>>>>>>> there is definitely a merge conflict that I need to resolve.  (I've
>>>>>>> checked and there do not appear to be any against linux-next as of
>>>>>>> last night.)
>>>>>>
>>>>>> Ping?
>>>>>
>>>>> Whoops, sorry.
>>>>>
>>>>> Looks perfect, thanks for handling these!
>>>>>
>>>>> I'm running them through my usual regression tests
>>>>
>>>> Urgh, I'm seeing a failure on the telldir test (part of the "special"
>>>> connectathon tests).  I haven't looked at what it's trying to do yet.
>>>> Reproduceable even just on a local filesystem without NFS involved.  If
>>>> someone wants to look at it, you can just do:
>>>>
>>>> 	git clone git://linux-nfs.org/~bfields/cthon04.git
>>>> 	cd cthon04
>>>> 	make
>>>> 	NFS_TESTDIR=/somewhere_on_an_ext4_fs/TMP ./runtests -s
>>>>
>>>> (Or after running that, more specifically,
>>>>
>>>> 	cd /somewhere_on_an_ext4_fs/TMP/
>>>> 	./telldir
>>>>
>>>> )
>>>>
>>>> I'll look at it tomorrow.
>>>
>>> Looks like seekdir is just not accepting an offset returned from
>>> getdents:
>>>
>>> 	openat(AT_FDCWD, "telldir-test", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
>>> 	getdents(3, {...{d_ino=20386, d_off=5728083968307607285, d_reclen=24, d_name="192"}...} = 4848
>>> 	lseek(3, 5728083968307607285, SEEK_SET) = -1 EINVAL (Invalid argument)
>>>
>>> Still investigating.
>>
>> Thanks! I'm also just going to work on it (just noticed your mail
>> while on vacation ...).
>
> OK, good.  You can probably figure this out faster than me....
>

Hmm, there must have gone something wrong on merging, my own test also 
fails

http://www.pci.uni-heidelberg.de/tc/usr/bernd/downloads/test_seekdir/

(Sorry, it does not say 'failure', but one needs to compare the file 
names and telldir-offset numbers)

I think I will continue in the morning as its already 1 a.m. here.


Cheers,
Bernd

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
  2012-03-13 20:34                                       ` J. Bruce Fields
@ 2012-03-13 21:10                                           ` Ted Ts'o
  -1 siblings, 0 replies; 81+ messages in thread
From: Ted Ts'o @ 2012-03-13 21:10 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Bernd Schubert, J. Bruce Fields,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	yong.fan-KloliPT79xf2eFz/2MeuCQ, sandeen-H+wXaHxf7aLQT0dZR+AlfA,
	adilger-KloliPT79xf2eFz/2MeuCQ

On Tue, Mar 13, 2012 at 04:34:46PM -0400, J. Bruce Fields wrote:
> > 
> > Thanks! I'm also just going to work on it (just noticed your mail
> > while on vacation ...).

Brend, thanks!!

Bruce, random question --- do you know what the copyright and
licensing terms are for the connectathon test suite?  It occurs to me
that it might be good to get some of these tests into xfstests, given
that you mentioned that the telldir test was failing even when run on
a local file system.  That says to me this is a good file system level
test that we should try to get into xfstests if possible.

Getting it into xfstests would also tend to reduce the chance of
regressions, since I always run xfstests before I submit patches (and
in fact I did run xfstests before sending you a pull request Bernd's
patches --- and I'm glad your testing caught the problem when mine
didn't!)

Thanks,

     	       	      	       		- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
@ 2012-03-13 21:10                                           ` Ted Ts'o
  0 siblings, 0 replies; 81+ messages in thread
From: Ted Ts'o @ 2012-03-13 21:10 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Bernd Schubert, J. Bruce Fields, linux-nfs, linux-ext4,
	linux-fsdevel, yong.fan, sandeen, adilger

On Tue, Mar 13, 2012 at 04:34:46PM -0400, J. Bruce Fields wrote:
> > 
> > Thanks! I'm also just going to work on it (just noticed your mail
> > while on vacation ...).

Brend, thanks!!

Bruce, random question --- do you know what the copyright and
licensing terms are for the connectathon test suite?  It occurs to me
that it might be good to get some of these tests into xfstests, given
that you mentioned that the telldir test was failing even when run on
a local file system.  That says to me this is a good file system level
test that we should try to get into xfstests if possible.

Getting it into xfstests would also tend to reduce the chance of
regressions, since I always run xfstests before I submit patches (and
in fact I did run xfstests before sending you a pull request Bernd's
patches --- and I'm glad your testing caught the problem when mine
didn't!)

Thanks,

     	       	      	       		- Ted

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
  2012-03-13 21:10                                           ` Ted Ts'o
@ 2012-03-13 21:27                                               ` J. Bruce Fields
  -1 siblings, 0 replies; 81+ messages in thread
From: J. Bruce Fields @ 2012-03-13 21:27 UTC (permalink / raw)
  To: Ted Ts'o
  Cc: J. Bruce Fields, Bernd Schubert,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	yong.fan-KloliPT79xf2eFz/2MeuCQ, sandeen-H+wXaHxf7aLQT0dZR+AlfA,
	adilger-KloliPT79xf2eFz/2MeuCQ

On Tue, Mar 13, 2012 at 05:10:09PM -0400, Ted Ts'o wrote:
> On Tue, Mar 13, 2012 at 04:34:46PM -0400, J. Bruce Fields wrote:
> > > 
> > > Thanks! I'm also just going to work on it (just noticed your mail
> > > while on vacation ...).
> 
> Brend, thanks!!
> 
> Bruce, random question --- do you know what the copyright and
> licensing terms are for the connectathon test suite?

Alas.  Nobody does!

I'ts unlikely there's any owner who would care enough to make a fuss--so
we keep using the tests, but they don't really get any love.

It's one of those annoyances we argue about once a year or so, and then
forget till next time.

So, maybe one way forward would be to reimplement the connectathon tests
for xfstests?

--b.

> It occurs to me
> that it might be good to get some of these tests into xfstests, given
> that you mentioned that the telldir test was failing even when run on
> a local file system.  That says to me this is a good file system level
> test that we should try to get into xfstests if possible.
> 
> Getting it into xfstests would also tend to reduce the chance of
> regressions, since I always run xfstests before I submit patches (and
> in fact I did run xfstests before sending you a pull request Bernd's
> patches --- and I'm glad your testing caught the problem when mine
> didn't!)
> 
> Thanks,
> 
>      	       	      	       		- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
@ 2012-03-13 21:27                                               ` J. Bruce Fields
  0 siblings, 0 replies; 81+ messages in thread
From: J. Bruce Fields @ 2012-03-13 21:27 UTC (permalink / raw)
  To: Ted Ts'o
  Cc: J. Bruce Fields, Bernd Schubert, linux-nfs, linux-ext4,
	linux-fsdevel, yong.fan, sandeen, adilger

On Tue, Mar 13, 2012 at 05:10:09PM -0400, Ted Ts'o wrote:
> On Tue, Mar 13, 2012 at 04:34:46PM -0400, J. Bruce Fields wrote:
> > > 
> > > Thanks! I'm also just going to work on it (just noticed your mail
> > > while on vacation ...).
> 
> Brend, thanks!!
> 
> Bruce, random question --- do you know what the copyright and
> licensing terms are for the connectathon test suite?

Alas.  Nobody does!

I'ts unlikely there's any owner who would care enough to make a fuss--so
we keep using the tests, but they don't really get any love.

It's one of those annoyances we argue about once a year or so, and then
forget till next time.

So, maybe one way forward would be to reimplement the connectathon tests
for xfstests?

--b.

> It occurs to me
> that it might be good to get some of these tests into xfstests, given
> that you mentioned that the telldir test was failing even when run on
> a local file system.  That says to me this is a good file system level
> test that we should try to get into xfstests if possible.
> 
> Getting it into xfstests would also tend to reduce the chance of
> regressions, since I always run xfstests before I submit patches (and
> in fact I did run xfstests before sending you a pull request Bernd's
> patches --- and I'm glad your testing caught the problem when mine
> didn't!)
> 
> Thanks,
> 
>      	       	      	       		- Ted

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
  2012-03-13 21:09                                       ` Bernd Schubert
@ 2012-03-13 21:29                                         ` J. Bruce Fields
       [not found]                                           ` <20120313212947.GK31995-spRCxval1Z7TsXDwO4sDpg@public.gmane.org>
  0 siblings, 1 reply; 81+ messages in thread
From: J. Bruce Fields @ 2012-03-13 21:29 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: J. Bruce Fields, Bernd Schubert, Ted Ts'o, linux-nfs,
	linux-ext4, linux-fsdevel, yong.fan, sandeen, adilger

On Tue, Mar 13, 2012 at 10:09:05PM +0100, Bernd Schubert wrote:
> Hmm, there must have gone something wrong on merging,

In that case one approach would be to rebase your last-sent patches on
to the same base as Ted's versions, confirm that one still works and the
other doesn't, and do a diff....

> my own test
> also fails
> 
> http://www.pci.uni-heidelberg.de/tc/usr/bernd/downloads/test_seekdir/
> 
> (Sorry, it does not say 'failure', but one needs to compare the file
> names and telldir-offset numbers)
> 
> I think I will continue in the morning as its already 1 a.m. here.

OK, thanks for following up!

--b.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
  2012-03-13 21:29                                         ` J. Bruce Fields
@ 2012-03-14 14:32                                               ` Bernd Schubert
  0 siblings, 0 replies; 81+ messages in thread
From: Bernd Schubert @ 2012-03-14 14:32 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Bernd Schubert, J. Bruce Fields, Ted Ts'o,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	yong.fan-KloliPT79xf2eFz/2MeuCQ, sandeen-H+wXaHxf7aLQT0dZR+AlfA,
	adilger-KloliPT79xf2eFz/2MeuCQ

On 03/13/2012 10:29 PM, J. Bruce Fields wrote:
> On Tue, Mar 13, 2012 at 10:09:05PM +0100, Bernd Schubert wrote:
>> Hmm, there must have gone something wrong on merging,
>
> In that case one approach would be to rebase your last-sent patches on
> to the same base as Ted's versions, confirm that one still works and the
> other doesn't, and do a diff....
>
>> my own test
>> also fails
>>
>> http://www.pci.uni-heidelberg.de/tc/usr/bernd/downloads/test_seekdir/
>>
>> (Sorry, it does not say 'failure', but one needs to compare the file
>> names and telldir-offset numbers)
>>
>> I think I will continue in the morning as its already 1 a.m. here.
>
> OK, thanks for following up!

Took me some time to figure out what is going on and in the end I previously forgot 
another test case - I always tested directories being sufficiently large, so that 
they got the EXT4_INODE_INDEX flag. However, the cython tests failed on a small 
directory, which doesn't have that flag. But then ext4_readdir() always uses the 
dx version, I guess in order to always return the same offset values 
(in the sense of converting a non-dx dir to dx). In ext4_dir_llseek() I only 
tested for the EXT4_INODE_INDEX flag, which is not correct in any case.
So here is the patch, shall I sent an updated version of the previous ext4 patch 
or is this patch sufficient?

Signed-off-by: Bernd Schubert <bernd.schubert-mPn0NPGs4xGatNDF+KUbs4QuADTiUCJX@public.gmane.org>

diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c
index 71a66ff..6a19a35 100644
--- a/fs/ext4/dir.c
+++ b/fs/ext4/dir.c
@@ -44,6 +44,24 @@ static unsigned char get_dtype(struct super_block *sb, int filetype)
 	return (ext4_filetype_table[filetype]);
 }
 
+/**
+ * Check if the given dir-inode refers to an htree indexed directory
+ *
+ * Return 1 if it is a dx dir, 0 if not
+ */
+static int is_dx_dir(struct inode *inode)
+{
+	struct super_block *sb = inode->i_sb;
+
+	if (EXT4_HAS_COMPAT_FEATURE(inode->i_sb,
+		     EXT4_FEATURE_COMPAT_DIR_INDEX) &&
+	    ((ext4_test_inode_flag(inode, EXT4_INODE_INDEX)) ||
+	     ((inode->i_size >> sb->s_blocksize_bits) == 1)))
+		return 1;
+
+	return 0;
+}
+
 /*
  * Return 0 if the directory entry is OK, and 1 if there is a problem
  *
@@ -99,18 +117,13 @@ static int ext4_readdir(struct file *filp,
 	unsigned int offset;
 	int i, stored;
 	struct ext4_dir_entry_2 *de;
-	struct super_block *sb;
 	int err;
 	struct inode *inode = filp->f_path.dentry->d_inode;
+	struct super_block *sb = inode->i_sb;
 	int ret = 0;
 	int dir_has_error = 0;
 
-	sb = inode->i_sb;
-
-	if (EXT4_HAS_COMPAT_FEATURE(inode->i_sb,
-				    EXT4_FEATURE_COMPAT_DIR_INDEX) &&
-	    ((ext4_test_inode_flag(inode, EXT4_INODE_INDEX)) ||
-	     ((inode->i_size >> sb->s_blocksize_bits) == 1))) {
+	if (is_dx_dir(inode)) {
 		err = ext4_dx_readdir(filp, dirent, filldir);
 		if (err != ERR_BAD_DX_DIR) {
 			ret = err;
@@ -308,7 +321,7 @@ loff_t ext4_dir_llseek(struct file *file, loff_t offset, int origin)
 {
 	struct inode *inode = file->f_mapping->host;
 	loff_t ret = -EINVAL;
-	int is_dx_dir = ext4_test_inode_flag(inode, EXT4_INODE_INDEX);
+	int dx_dir = is_dx_dir(inode);
 
 	mutex_lock(&inode->i_mutex);
 
@@ -323,7 +336,7 @@ loff_t ext4_dir_llseek(struct file *file, loff_t offset, int origin)
 
 		/* so only negative offsets are left, does that have a
 		 * meaning for directories at all? */
-		if (is_dx_dir)
+		if (dx_dir)
 			offset += ext4_get_htree_eof(file);
 		else
 			offset += inode->i_size;
@@ -347,7 +360,7 @@ loff_t ext4_dir_llseek(struct file *file, loff_t offset, int origin)
 	if (unlikely(offset < 0))
 		goto out_err;
 
-	if (!is_dx_dir) {
+	if (!dx_dir) {
 		if (offset > inode->i_sb->s_maxbytes)
 			goto out_err;
 	} else if (offset > ext4_get_htree_eof(file))




--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
@ 2012-03-14 14:32                                               ` Bernd Schubert
  0 siblings, 0 replies; 81+ messages in thread
From: Bernd Schubert @ 2012-03-14 14:32 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Bernd Schubert, J. Bruce Fields, Ted Ts'o, linux-nfs,
	linux-ext4, linux-fsdevel, yong.fan, sandeen, adilger

On 03/13/2012 10:29 PM, J. Bruce Fields wrote:
> On Tue, Mar 13, 2012 at 10:09:05PM +0100, Bernd Schubert wrote:
>> Hmm, there must have gone something wrong on merging,
>
> In that case one approach would be to rebase your last-sent patches on
> to the same base as Ted's versions, confirm that one still works and the
> other doesn't, and do a diff....
>
>> my own test
>> also fails
>>
>> http://www.pci.uni-heidelberg.de/tc/usr/bernd/downloads/test_seekdir/
>>
>> (Sorry, it does not say 'failure', but one needs to compare the file
>> names and telldir-offset numbers)
>>
>> I think I will continue in the morning as its already 1 a.m. here.
>
> OK, thanks for following up!

Took me some time to figure out what is going on and in the end I previously forgot 
another test case - I always tested directories being sufficiently large, so that 
they got the EXT4_INODE_INDEX flag. However, the cython tests failed on a small 
directory, which doesn't have that flag. But then ext4_readdir() always uses the 
dx version, I guess in order to always return the same offset values 
(in the sense of converting a non-dx dir to dx). In ext4_dir_llseek() I only 
tested for the EXT4_INODE_INDEX flag, which is not correct in any case.
So here is the patch, shall I sent an updated version of the previous ext4 patch 
or is this patch sufficient?

Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>

diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c
index 71a66ff..6a19a35 100644
--- a/fs/ext4/dir.c
+++ b/fs/ext4/dir.c
@@ -44,6 +44,24 @@ static unsigned char get_dtype(struct super_block *sb, int filetype)
 	return (ext4_filetype_table[filetype]);
 }
 
+/**
+ * Check if the given dir-inode refers to an htree indexed directory
+ *
+ * Return 1 if it is a dx dir, 0 if not
+ */
+static int is_dx_dir(struct inode *inode)
+{
+	struct super_block *sb = inode->i_sb;
+
+	if (EXT4_HAS_COMPAT_FEATURE(inode->i_sb,
+		     EXT4_FEATURE_COMPAT_DIR_INDEX) &&
+	    ((ext4_test_inode_flag(inode, EXT4_INODE_INDEX)) ||
+	     ((inode->i_size >> sb->s_blocksize_bits) == 1)))
+		return 1;
+
+	return 0;
+}
+
 /*
  * Return 0 if the directory entry is OK, and 1 if there is a problem
  *
@@ -99,18 +117,13 @@ static int ext4_readdir(struct file *filp,
 	unsigned int offset;
 	int i, stored;
 	struct ext4_dir_entry_2 *de;
-	struct super_block *sb;
 	int err;
 	struct inode *inode = filp->f_path.dentry->d_inode;
+	struct super_block *sb = inode->i_sb;
 	int ret = 0;
 	int dir_has_error = 0;
 
-	sb = inode->i_sb;
-
-	if (EXT4_HAS_COMPAT_FEATURE(inode->i_sb,
-				    EXT4_FEATURE_COMPAT_DIR_INDEX) &&
-	    ((ext4_test_inode_flag(inode, EXT4_INODE_INDEX)) ||
-	     ((inode->i_size >> sb->s_blocksize_bits) == 1))) {
+	if (is_dx_dir(inode)) {
 		err = ext4_dx_readdir(filp, dirent, filldir);
 		if (err != ERR_BAD_DX_DIR) {
 			ret = err;
@@ -308,7 +321,7 @@ loff_t ext4_dir_llseek(struct file *file, loff_t offset, int origin)
 {
 	struct inode *inode = file->f_mapping->host;
 	loff_t ret = -EINVAL;
-	int is_dx_dir = ext4_test_inode_flag(inode, EXT4_INODE_INDEX);
+	int dx_dir = is_dx_dir(inode);
 
 	mutex_lock(&inode->i_mutex);
 
@@ -323,7 +336,7 @@ loff_t ext4_dir_llseek(struct file *file, loff_t offset, int origin)
 
 		/* so only negative offsets are left, does that have a
 		 * meaning for directories at all? */
-		if (is_dx_dir)
+		if (dx_dir)
 			offset += ext4_get_htree_eof(file);
 		else
 			offset += inode->i_size;
@@ -347,7 +360,7 @@ loff_t ext4_dir_llseek(struct file *file, loff_t offset, int origin)
 	if (unlikely(offset < 0))
 		goto out_err;
 
-	if (!is_dx_dir) {
+	if (!dx_dir) {
 		if (offset > inode->i_sb->s_maxbytes)
 			goto out_err;
 	} else if (offset > ext4_get_htree_eof(file))





^ permalink raw reply related	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
  2012-03-14 14:32                                               ` Bernd Schubert
@ 2012-03-14 16:05                                                   ` J. Bruce Fields
  -1 siblings, 0 replies; 81+ messages in thread
From: J. Bruce Fields @ 2012-03-14 16:05 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: J. Bruce Fields, Bernd Schubert, Ted Ts'o,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	yong.fan-KloliPT79xf2eFz/2MeuCQ, sandeen-H+wXaHxf7aLQT0dZR+AlfA,
	adilger-KloliPT79xf2eFz/2MeuCQ

On Wed, Mar 14, 2012 at 03:32:45PM +0100, Bernd Schubert wrote:
> On 03/13/2012 10:29 PM, J. Bruce Fields wrote:
> > On Tue, Mar 13, 2012 at 10:09:05PM +0100, Bernd Schubert wrote:
> >> Hmm, there must have gone something wrong on merging,
> >
> > In that case one approach would be to rebase your last-sent patches on
> > to the same base as Ted's versions, confirm that one still works and the
> > other doesn't, and do a diff....
> >
> >> my own test
> >> also fails
> >>
> >> http://www.pci.uni-heidelberg.de/tc/usr/bernd/downloads/test_seekdir/
> >>
> >> (Sorry, it does not say 'failure', but one needs to compare the file
> >> names and telldir-offset numbers)
> >>
> >> I think I will continue in the morning as its already 1 a.m. here.
> >
> > OK, thanks for following up!
> 
> Took me some time to figure out what is going on and in the end I previously forgot 
> another test case - I always tested directories being sufficiently large, so that 
> they got the EXT4_INODE_INDEX flag. However, the cython tests failed on a small 
> directory, which doesn't have that flag. But then ext4_readdir() always uses the 
> dx version, I guess in order to always return the same offset values 
> (in the sense of converting a non-dx dir to dx). In ext4_dir_llseek() I only 
> tested for the EXT4_INODE_INDEX flag, which is not correct in any case.
> So here is the patch,

Yep, that passes my usual tests now, thanks!

> shall I sent an updated version of the previous ext4 patch 
> or is this patch sufficient?

Up to Ted.

Ted, I haven't published my merge of your branch, so at least for my
purposes I don't care whether you fold this in or pop another patch on
top.

--b.

> 
> Signed-off-by: Bernd Schubert <bernd.schubert-mPn0NPGs4xGatNDF+KUbs4QuADTiUCJX@public.gmane.org>
> 
> diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c
> index 71a66ff..6a19a35 100644
> --- a/fs/ext4/dir.c
> +++ b/fs/ext4/dir.c
> @@ -44,6 +44,24 @@ static unsigned char get_dtype(struct super_block *sb, int filetype)
>  	return (ext4_filetype_table[filetype]);
>  }
>  
> +/**
> + * Check if the given dir-inode refers to an htree indexed directory
> + *
> + * Return 1 if it is a dx dir, 0 if not
> + */
> +static int is_dx_dir(struct inode *inode)
> +{
> +	struct super_block *sb = inode->i_sb;
> +
> +	if (EXT4_HAS_COMPAT_FEATURE(inode->i_sb,
> +		     EXT4_FEATURE_COMPAT_DIR_INDEX) &&
> +	    ((ext4_test_inode_flag(inode, EXT4_INODE_INDEX)) ||
> +	     ((inode->i_size >> sb->s_blocksize_bits) == 1)))
> +		return 1;
> +
> +	return 0;
> +}
> +
>  /*
>   * Return 0 if the directory entry is OK, and 1 if there is a problem
>   *
> @@ -99,18 +117,13 @@ static int ext4_readdir(struct file *filp,
>  	unsigned int offset;
>  	int i, stored;
>  	struct ext4_dir_entry_2 *de;
> -	struct super_block *sb;
>  	int err;
>  	struct inode *inode = filp->f_path.dentry->d_inode;
> +	struct super_block *sb = inode->i_sb;
>  	int ret = 0;
>  	int dir_has_error = 0;
>  
> -	sb = inode->i_sb;
> -
> -	if (EXT4_HAS_COMPAT_FEATURE(inode->i_sb,
> -				    EXT4_FEATURE_COMPAT_DIR_INDEX) &&
> -	    ((ext4_test_inode_flag(inode, EXT4_INODE_INDEX)) ||
> -	     ((inode->i_size >> sb->s_blocksize_bits) == 1))) {
> +	if (is_dx_dir(inode)) {
>  		err = ext4_dx_readdir(filp, dirent, filldir);
>  		if (err != ERR_BAD_DX_DIR) {
>  			ret = err;
> @@ -308,7 +321,7 @@ loff_t ext4_dir_llseek(struct file *file, loff_t offset, int origin)
>  {
>  	struct inode *inode = file->f_mapping->host;
>  	loff_t ret = -EINVAL;
> -	int is_dx_dir = ext4_test_inode_flag(inode, EXT4_INODE_INDEX);
> +	int dx_dir = is_dx_dir(inode);
>  
>  	mutex_lock(&inode->i_mutex);
>  
> @@ -323,7 +336,7 @@ loff_t ext4_dir_llseek(struct file *file, loff_t offset, int origin)
>  
>  		/* so only negative offsets are left, does that have a
>  		 * meaning for directories at all? */
> -		if (is_dx_dir)
> +		if (dx_dir)
>  			offset += ext4_get_htree_eof(file);
>  		else
>  			offset += inode->i_size;
> @@ -347,7 +360,7 @@ loff_t ext4_dir_llseek(struct file *file, loff_t offset, int origin)
>  	if (unlikely(offset < 0))
>  		goto out_err;
>  
> -	if (!is_dx_dir) {
> +	if (!dx_dir) {
>  		if (offset > inode->i_sb->s_maxbytes)
>  			goto out_err;
>  	} else if (offset > ext4_get_htree_eof(file))
> 
> 
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
@ 2012-03-14 16:05                                                   ` J. Bruce Fields
  0 siblings, 0 replies; 81+ messages in thread
From: J. Bruce Fields @ 2012-03-14 16:05 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: J. Bruce Fields, Bernd Schubert, Ted Ts'o, linux-nfs,
	linux-ext4, linux-fsdevel, yong.fan, sandeen, adilger

On Wed, Mar 14, 2012 at 03:32:45PM +0100, Bernd Schubert wrote:
> On 03/13/2012 10:29 PM, J. Bruce Fields wrote:
> > On Tue, Mar 13, 2012 at 10:09:05PM +0100, Bernd Schubert wrote:
> >> Hmm, there must have gone something wrong on merging,
> >
> > In that case one approach would be to rebase your last-sent patches on
> > to the same base as Ted's versions, confirm that one still works and the
> > other doesn't, and do a diff....
> >
> >> my own test
> >> also fails
> >>
> >> http://www.pci.uni-heidelberg.de/tc/usr/bernd/downloads/test_seekdir/
> >>
> >> (Sorry, it does not say 'failure', but one needs to compare the file
> >> names and telldir-offset numbers)
> >>
> >> I think I will continue in the morning as its already 1 a.m. here.
> >
> > OK, thanks for following up!
> 
> Took me some time to figure out what is going on and in the end I previously forgot 
> another test case - I always tested directories being sufficiently large, so that 
> they got the EXT4_INODE_INDEX flag. However, the cython tests failed on a small 
> directory, which doesn't have that flag. But then ext4_readdir() always uses the 
> dx version, I guess in order to always return the same offset values 
> (in the sense of converting a non-dx dir to dx). In ext4_dir_llseek() I only 
> tested for the EXT4_INODE_INDEX flag, which is not correct in any case.
> So here is the patch,

Yep, that passes my usual tests now, thanks!

> shall I sent an updated version of the previous ext4 patch 
> or is this patch sufficient?

Up to Ted.

Ted, I haven't published my merge of your branch, so at least for my
purposes I don't care whether you fold this in or pop another patch on
top.

--b.

> 
> Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
> 
> diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c
> index 71a66ff..6a19a35 100644
> --- a/fs/ext4/dir.c
> +++ b/fs/ext4/dir.c
> @@ -44,6 +44,24 @@ static unsigned char get_dtype(struct super_block *sb, int filetype)
>  	return (ext4_filetype_table[filetype]);
>  }
>  
> +/**
> + * Check if the given dir-inode refers to an htree indexed directory
> + *
> + * Return 1 if it is a dx dir, 0 if not
> + */
> +static int is_dx_dir(struct inode *inode)
> +{
> +	struct super_block *sb = inode->i_sb;
> +
> +	if (EXT4_HAS_COMPAT_FEATURE(inode->i_sb,
> +		     EXT4_FEATURE_COMPAT_DIR_INDEX) &&
> +	    ((ext4_test_inode_flag(inode, EXT4_INODE_INDEX)) ||
> +	     ((inode->i_size >> sb->s_blocksize_bits) == 1)))
> +		return 1;
> +
> +	return 0;
> +}
> +
>  /*
>   * Return 0 if the directory entry is OK, and 1 if there is a problem
>   *
> @@ -99,18 +117,13 @@ static int ext4_readdir(struct file *filp,
>  	unsigned int offset;
>  	int i, stored;
>  	struct ext4_dir_entry_2 *de;
> -	struct super_block *sb;
>  	int err;
>  	struct inode *inode = filp->f_path.dentry->d_inode;
> +	struct super_block *sb = inode->i_sb;
>  	int ret = 0;
>  	int dir_has_error = 0;
>  
> -	sb = inode->i_sb;
> -
> -	if (EXT4_HAS_COMPAT_FEATURE(inode->i_sb,
> -				    EXT4_FEATURE_COMPAT_DIR_INDEX) &&
> -	    ((ext4_test_inode_flag(inode, EXT4_INODE_INDEX)) ||
> -	     ((inode->i_size >> sb->s_blocksize_bits) == 1))) {
> +	if (is_dx_dir(inode)) {
>  		err = ext4_dx_readdir(filp, dirent, filldir);
>  		if (err != ERR_BAD_DX_DIR) {
>  			ret = err;
> @@ -308,7 +321,7 @@ loff_t ext4_dir_llseek(struct file *file, loff_t offset, int origin)
>  {
>  	struct inode *inode = file->f_mapping->host;
>  	loff_t ret = -EINVAL;
> -	int is_dx_dir = ext4_test_inode_flag(inode, EXT4_INODE_INDEX);
> +	int dx_dir = is_dx_dir(inode);
>  
>  	mutex_lock(&inode->i_mutex);
>  
> @@ -323,7 +336,7 @@ loff_t ext4_dir_llseek(struct file *file, loff_t offset, int origin)
>  
>  		/* so only negative offsets are left, does that have a
>  		 * meaning for directories at all? */
> -		if (is_dx_dir)
> +		if (dx_dir)
>  			offset += ext4_get_htree_eof(file);
>  		else
>  			offset += inode->i_size;
> @@ -347,7 +360,7 @@ loff_t ext4_dir_llseek(struct file *file, loff_t offset, int origin)
>  	if (unlikely(offset < 0))
>  		goto out_err;
>  
> -	if (!is_dx_dir) {
> +	if (!dx_dir) {
>  		if (offset > inode->i_sb->s_maxbytes)
>  			goto out_err;
>  	} else if (offset > ext4_get_htree_eof(file))
> 
> 
> 
> 

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
  2012-03-14 16:05                                                   ` J. Bruce Fields
@ 2012-03-16 21:22                                                       ` Bernd Schubert
  -1 siblings, 0 replies; 81+ messages in thread
From: Bernd Schubert @ 2012-03-16 21:22 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Bernd Schubert, J. Bruce Fields, Ted Ts'o,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	yong.fan-KloliPT79xf2eFz/2MeuCQ, sandeen-H+wXaHxf7aLQT0dZR+AlfA,
	adilger-KloliPT79xf2eFz/2MeuCQ

On 03/14/2012 05:05 PM, J. Bruce Fields wrote:
> On Wed, Mar 14, 2012 at 03:32:45PM +0100, Bernd Schubert wrote:
>> shall I sent an updated version of the previous ext4 patch
>> or is this patch sufficient?
>
> Up to Ted.
>
> Ted, I haven't published my merge of your branch, so at least for my
> purposes I don't care whether you fold this in or pop another patch on
> top.

Ted, could you please tell me how to proceed?


Thanks,
Bernd
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
@ 2012-03-16 21:22                                                       ` Bernd Schubert
  0 siblings, 0 replies; 81+ messages in thread
From: Bernd Schubert @ 2012-03-16 21:22 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Bernd Schubert, J. Bruce Fields, Ted Ts'o, linux-nfs,
	linux-ext4, linux-fsdevel, yong.fan, sandeen, adilger

On 03/14/2012 05:05 PM, J. Bruce Fields wrote:
> On Wed, Mar 14, 2012 at 03:32:45PM +0100, Bernd Schubert wrote:
>> shall I sent an updated version of the previous ext4 patch
>> or is this patch sufficient?
>
> Up to Ted.
>
> Ted, I haven't published my merge of your branch, so at least for my
> purposes I don't care whether you fold this in or pop another patch on
> top.

Ted, could you please tell me how to proceed?


Thanks,
Bernd

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
  2012-03-16 21:22                                                       ` Bernd Schubert
  (?)
@ 2012-03-19  2:54                                                       ` Ted Ts'o
       [not found]                                                         ` <20120319025455.GD31682-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org>
  -1 siblings, 1 reply; 81+ messages in thread
From: Ted Ts'o @ 2012-03-19  2:54 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: J. Bruce Fields, Bernd Schubert, J. Bruce Fields, linux-nfs,
	linux-ext4, linux-fsdevel, yong.fan, sandeen, adilger

On Fri, Mar 16, 2012 at 10:22:51PM +0100, Bernd Schubert wrote:
> On 03/14/2012 05:05 PM, J. Bruce Fields wrote:
> >On Wed, Mar 14, 2012 at 03:32:45PM +0100, Bernd Schubert wrote:
> >>shall I sent an updated version of the previous ext4 patch
> >>or is this patch sufficient?
> >
> >Up to Ted.
> >
> >Ted, I haven't published my merge of your branch, so at least for my
> >purposes I don't care whether you fold this in or pop another patch on
> >top.
> 
> Ted, could you please tell me how to proceed?

I've merged it in.

Bruce, I have an updated tag and branch at:

   git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git nfs-ext4-premerge

Can you give it a try and pull it in if it passes your tests?

    	     	      	       	     	- Ted

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
  2012-03-19  2:54                                                       ` Ted Ts'o
@ 2012-03-19 20:00                                                             ` J. Bruce Fields
  0 siblings, 0 replies; 81+ messages in thread
From: J. Bruce Fields @ 2012-03-19 20:00 UTC (permalink / raw)
  To: Ted Ts'o
  Cc: Bernd Schubert, Bernd Schubert, J. Bruce Fields,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	yong.fan-KloliPT79xf2eFz/2MeuCQ, sandeen-H+wXaHxf7aLQT0dZR+AlfA,
	adilger-KloliPT79xf2eFz/2MeuCQ

On Sun, Mar 18, 2012 at 10:54:55PM -0400, Ted Ts'o wrote:
> On Fri, Mar 16, 2012 at 10:22:51PM +0100, Bernd Schubert wrote:
> > On 03/14/2012 05:05 PM, J. Bruce Fields wrote:
> > >On Wed, Mar 14, 2012 at 03:32:45PM +0100, Bernd Schubert wrote:
> > >>shall I sent an updated version of the previous ext4 patch
> > >>or is this patch sufficient?
> > >
> > >Up to Ted.
> > >
> > >Ted, I haven't published my merge of your branch, so at least for my
> > >purposes I don't care whether you fold this in or pop another patch on
> > >top.
> > 
> > Ted, could you please tell me how to proceed?
> 
> I've merged it in.
> 
> Bruce, I have an updated tag and branch at:
> 
>    git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git nfs-ext4-premerge
> 
> Can you give it a try and pull it in if it passes your tests?

Looks good.

(Merged and pushed out to my git://linux-nfs.org/~bfields/linux.git.)

Sorry that was more of a hassle than it could have been....  It would be
nice to try rewriting the cthon tests for xfstests, if that would help
catch nfs-related bugs sooner.  I don't have the time for now, though.

Merging a topic branch like this looks like it should work fine, anyway.

Thanks!

--b.

commit 62b9510cb373d5722fdaba71d961d8f695acfcd5
Merge: 8546ee5 06effdb
Author: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Date:   Mon Mar 19 12:34:39 2012 -0400

    nfsd: merge cookie collision fixes from ext4 tree
    
    These changes fix readdir loops on ext4 filesystems with dir_index
    turned on.  I'm pulling them from Ted's tree as I'd like to give them
    some extra nfsd testing, and expect to be applying (potentially
    conflicting) patches to the same code before the next merge window.
    
    From the nfs-ext4-premerge branch of
    
    	git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
    
    Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
@ 2012-03-19 20:00                                                             ` J. Bruce Fields
  0 siblings, 0 replies; 81+ messages in thread
From: J. Bruce Fields @ 2012-03-19 20:00 UTC (permalink / raw)
  To: Ted Ts'o
  Cc: Bernd Schubert, Bernd Schubert, J. Bruce Fields, linux-nfs,
	linux-ext4, linux-fsdevel, yong.fan, sandeen, adilger

On Sun, Mar 18, 2012 at 10:54:55PM -0400, Ted Ts'o wrote:
> On Fri, Mar 16, 2012 at 10:22:51PM +0100, Bernd Schubert wrote:
> > On 03/14/2012 05:05 PM, J. Bruce Fields wrote:
> > >On Wed, Mar 14, 2012 at 03:32:45PM +0100, Bernd Schubert wrote:
> > >>shall I sent an updated version of the previous ext4 patch
> > >>or is this patch sufficient?
> > >
> > >Up to Ted.
> > >
> > >Ted, I haven't published my merge of your branch, so at least for my
> > >purposes I don't care whether you fold this in or pop another patch on
> > >top.
> > 
> > Ted, could you please tell me how to proceed?
> 
> I've merged it in.
> 
> Bruce, I have an updated tag and branch at:
> 
>    git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git nfs-ext4-premerge
> 
> Can you give it a try and pull it in if it passes your tests?

Looks good.

(Merged and pushed out to my git://linux-nfs.org/~bfields/linux.git.)

Sorry that was more of a hassle than it could have been....  It would be
nice to try rewriting the cthon tests for xfstests, if that would help
catch nfs-related bugs sooner.  I don't have the time for now, though.

Merging a topic branch like this looks like it should work fine, anyway.

Thanks!

--b.

commit 62b9510cb373d5722fdaba71d961d8f695acfcd5
Merge: 8546ee5 06effdb
Author: J. Bruce Fields <bfields@redhat.com>
Date:   Mon Mar 19 12:34:39 2012 -0400

    nfsd: merge cookie collision fixes from ext4 tree
    
    These changes fix readdir loops on ext4 filesystems with dir_index
    turned on.  I'm pulling them from Ted's tree as I'd like to give them
    some extra nfsd testing, and expect to be applying (potentially
    conflicting) patches to the same code before the next merge window.
    
    From the nfs-ext4-premerge branch of
    
    	git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
  2012-03-19 20:00                                                             ` J. Bruce Fields
@ 2012-03-20  0:10                                                                 ` Ted Ts'o
  -1 siblings, 0 replies; 81+ messages in thread
From: Ted Ts'o @ 2012-03-20  0:10 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Bernd Schubert, Bernd Schubert, J. Bruce Fields,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	yong.fan-KloliPT79xf2eFz/2MeuCQ, sandeen-H+wXaHxf7aLQT0dZR+AlfA,
	adilger-KloliPT79xf2eFz/2MeuCQ

On Mon, Mar 19, 2012 at 04:00:41PM -0400, J. Bruce Fields wrote:
> Sorry that was more of a hassle than it could have been....  It would be
> nice to try rewriting the cthon tests for xfstests, if that would help
> catch nfs-related bugs sooner.  I don't have the time for now, though.

No problem.  I'd love to look at adapting the connectathon tests for
xfstests as well, but I don't have the time either.

I'll put it on an ext4 todo list; maybe someone who is looking to get
into ext4 programming will take this on as a task.

     	  	      	   	     	   - Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
@ 2012-03-20  0:10                                                                 ` Ted Ts'o
  0 siblings, 0 replies; 81+ messages in thread
From: Ted Ts'o @ 2012-03-20  0:10 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Bernd Schubert, Bernd Schubert, J. Bruce Fields, linux-nfs,
	linux-ext4, linux-fsdevel, yong.fan, sandeen, adilger

On Mon, Mar 19, 2012 at 04:00:41PM -0400, J. Bruce Fields wrote:
> Sorry that was more of a hassle than it could have been....  It would be
> nice to try rewriting the cthon tests for xfstests, if that would help
> catch nfs-related bugs sooner.  I don't have the time for now, though.

No problem.  I'd love to look at adapting the connectathon tests for
xfstests as well, but I don't have the time either.

I'll put it on an ext4 todo list; maybe someone who is looking to get
into ext4 programming will take this on as a task.

     	  	      	   	     	   - Ted

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
  2012-03-14 14:32                                               ` Bernd Schubert
@ 2012-04-12 20:49                                                   ` J. Bruce Fields
  -1 siblings, 0 replies; 81+ messages in thread
From: J. Bruce Fields @ 2012-04-12 20:49 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: J. Bruce Fields, Bernd Schubert, Ted Ts'o,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	yong.fan-KloliPT79xf2eFz/2MeuCQ, sandeen-H+wXaHxf7aLQT0dZR+AlfA,
	adilger-KloliPT79xf2eFz/2MeuCQ

On Wed, Mar 14, 2012 at 03:32:45PM +0100, Bernd Schubert wrote:
> On 03/13/2012 10:29 PM, J. Bruce Fields wrote:
> > On Tue, Mar 13, 2012 at 10:09:05PM +0100, Bernd Schubert wrote:
> >> Hmm, there must have gone something wrong on merging,
> >
> > In that case one approach would be to rebase your last-sent patches on
> > to the same base as Ted's versions, confirm that one still works and the
> > other doesn't, and do a diff....
> >
> >> my own test
> >> also fails
> >>
> >> http://www.pci.uni-heidelberg.de/tc/usr/bernd/downloads/test_seekdir/
> >>
> >> (Sorry, it does not say 'failure', but one needs to compare the file
> >> names and telldir-offset numbers)
> >>
> >> I think I will continue in the morning as its already 1 a.m. here.
> >
> > OK, thanks for following up!

Stupid question: is there any fundamental feature of ext4 this depends
on, or would this fix work equally well for fs/ext3?

--b.

> 
> Took me some time to figure out what is going on and in the end I previously forgot 
> another test case - I always tested directories being sufficiently large, so that 
> they got the EXT4_INODE_INDEX flag. However, the cython tests failed on a small 
> directory, which doesn't have that flag. But then ext4_readdir() always uses the 
> dx version, I guess in order to always return the same offset values 
> (in the sense of converting a non-dx dir to dx). In ext4_dir_llseek() I only 
> tested for the EXT4_INODE_INDEX flag, which is not correct in any case.
> So here is the patch, shall I sent an updated version of the previous ext4 patch 
> or is this patch sufficient?
> 
> Signed-off-by: Bernd Schubert <bernd.schubert-mPn0NPGs4xGatNDF+KUbs4QuADTiUCJX@public.gmane.org>
> 
> diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c
> index 71a66ff..6a19a35 100644
> --- a/fs/ext4/dir.c
> +++ b/fs/ext4/dir.c
> @@ -44,6 +44,24 @@ static unsigned char get_dtype(struct super_block *sb, int filetype)
>  	return (ext4_filetype_table[filetype]);
>  }
>  
> +/**
> + * Check if the given dir-inode refers to an htree indexed directory
> + *
> + * Return 1 if it is a dx dir, 0 if not
> + */
> +static int is_dx_dir(struct inode *inode)
> +{
> +	struct super_block *sb = inode->i_sb;
> +
> +	if (EXT4_HAS_COMPAT_FEATURE(inode->i_sb,
> +		     EXT4_FEATURE_COMPAT_DIR_INDEX) &&
> +	    ((ext4_test_inode_flag(inode, EXT4_INODE_INDEX)) ||
> +	     ((inode->i_size >> sb->s_blocksize_bits) == 1)))
> +		return 1;
> +
> +	return 0;
> +}
> +
>  /*
>   * Return 0 if the directory entry is OK, and 1 if there is a problem
>   *
> @@ -99,18 +117,13 @@ static int ext4_readdir(struct file *filp,
>  	unsigned int offset;
>  	int i, stored;
>  	struct ext4_dir_entry_2 *de;
> -	struct super_block *sb;
>  	int err;
>  	struct inode *inode = filp->f_path.dentry->d_inode;
> +	struct super_block *sb = inode->i_sb;
>  	int ret = 0;
>  	int dir_has_error = 0;
>  
> -	sb = inode->i_sb;
> -
> -	if (EXT4_HAS_COMPAT_FEATURE(inode->i_sb,
> -				    EXT4_FEATURE_COMPAT_DIR_INDEX) &&
> -	    ((ext4_test_inode_flag(inode, EXT4_INODE_INDEX)) ||
> -	     ((inode->i_size >> sb->s_blocksize_bits) == 1))) {
> +	if (is_dx_dir(inode)) {
>  		err = ext4_dx_readdir(filp, dirent, filldir);
>  		if (err != ERR_BAD_DX_DIR) {
>  			ret = err;
> @@ -308,7 +321,7 @@ loff_t ext4_dir_llseek(struct file *file, loff_t offset, int origin)
>  {
>  	struct inode *inode = file->f_mapping->host;
>  	loff_t ret = -EINVAL;
> -	int is_dx_dir = ext4_test_inode_flag(inode, EXT4_INODE_INDEX);
> +	int dx_dir = is_dx_dir(inode);
>  
>  	mutex_lock(&inode->i_mutex);
>  
> @@ -323,7 +336,7 @@ loff_t ext4_dir_llseek(struct file *file, loff_t offset, int origin)
>  
>  		/* so only negative offsets are left, does that have a
>  		 * meaning for directories at all? */
> -		if (is_dx_dir)
> +		if (dx_dir)
>  			offset += ext4_get_htree_eof(file);
>  		else
>  			offset += inode->i_size;
> @@ -347,7 +360,7 @@ loff_t ext4_dir_llseek(struct file *file, loff_t offset, int origin)
>  	if (unlikely(offset < 0))
>  		goto out_err;
>  
> -	if (!is_dx_dir) {
> +	if (!dx_dir) {
>  		if (offset > inode->i_sb->s_maxbytes)
>  			goto out_err;
>  	} else if (offset > ext4_get_htree_eof(file))
> 
> 
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
@ 2012-04-12 20:49                                                   ` J. Bruce Fields
  0 siblings, 0 replies; 81+ messages in thread
From: J. Bruce Fields @ 2012-04-12 20:49 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: J. Bruce Fields, Bernd Schubert, Ted Ts'o, linux-nfs,
	linux-ext4, linux-fsdevel, yong.fan, sandeen, adilger

On Wed, Mar 14, 2012 at 03:32:45PM +0100, Bernd Schubert wrote:
> On 03/13/2012 10:29 PM, J. Bruce Fields wrote:
> > On Tue, Mar 13, 2012 at 10:09:05PM +0100, Bernd Schubert wrote:
> >> Hmm, there must have gone something wrong on merging,
> >
> > In that case one approach would be to rebase your last-sent patches on
> > to the same base as Ted's versions, confirm that one still works and the
> > other doesn't, and do a diff....
> >
> >> my own test
> >> also fails
> >>
> >> http://www.pci.uni-heidelberg.de/tc/usr/bernd/downloads/test_seekdir/
> >>
> >> (Sorry, it does not say 'failure', but one needs to compare the file
> >> names and telldir-offset numbers)
> >>
> >> I think I will continue in the morning as its already 1 a.m. here.
> >
> > OK, thanks for following up!

Stupid question: is there any fundamental feature of ext4 this depends
on, or would this fix work equally well for fs/ext3?

--b.

> 
> Took me some time to figure out what is going on and in the end I previously forgot 
> another test case - I always tested directories being sufficiently large, so that 
> they got the EXT4_INODE_INDEX flag. However, the cython tests failed on a small 
> directory, which doesn't have that flag. But then ext4_readdir() always uses the 
> dx version, I guess in order to always return the same offset values 
> (in the sense of converting a non-dx dir to dx). In ext4_dir_llseek() I only 
> tested for the EXT4_INODE_INDEX flag, which is not correct in any case.
> So here is the patch, shall I sent an updated version of the previous ext4 patch 
> or is this patch sufficient?
> 
> Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
> 
> diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c
> index 71a66ff..6a19a35 100644
> --- a/fs/ext4/dir.c
> +++ b/fs/ext4/dir.c
> @@ -44,6 +44,24 @@ static unsigned char get_dtype(struct super_block *sb, int filetype)
>  	return (ext4_filetype_table[filetype]);
>  }
>  
> +/**
> + * Check if the given dir-inode refers to an htree indexed directory
> + *
> + * Return 1 if it is a dx dir, 0 if not
> + */
> +static int is_dx_dir(struct inode *inode)
> +{
> +	struct super_block *sb = inode->i_sb;
> +
> +	if (EXT4_HAS_COMPAT_FEATURE(inode->i_sb,
> +		     EXT4_FEATURE_COMPAT_DIR_INDEX) &&
> +	    ((ext4_test_inode_flag(inode, EXT4_INODE_INDEX)) ||
> +	     ((inode->i_size >> sb->s_blocksize_bits) == 1)))
> +		return 1;
> +
> +	return 0;
> +}
> +
>  /*
>   * Return 0 if the directory entry is OK, and 1 if there is a problem
>   *
> @@ -99,18 +117,13 @@ static int ext4_readdir(struct file *filp,
>  	unsigned int offset;
>  	int i, stored;
>  	struct ext4_dir_entry_2 *de;
> -	struct super_block *sb;
>  	int err;
>  	struct inode *inode = filp->f_path.dentry->d_inode;
> +	struct super_block *sb = inode->i_sb;
>  	int ret = 0;
>  	int dir_has_error = 0;
>  
> -	sb = inode->i_sb;
> -
> -	if (EXT4_HAS_COMPAT_FEATURE(inode->i_sb,
> -				    EXT4_FEATURE_COMPAT_DIR_INDEX) &&
> -	    ((ext4_test_inode_flag(inode, EXT4_INODE_INDEX)) ||
> -	     ((inode->i_size >> sb->s_blocksize_bits) == 1))) {
> +	if (is_dx_dir(inode)) {
>  		err = ext4_dx_readdir(filp, dirent, filldir);
>  		if (err != ERR_BAD_DX_DIR) {
>  			ret = err;
> @@ -308,7 +321,7 @@ loff_t ext4_dir_llseek(struct file *file, loff_t offset, int origin)
>  {
>  	struct inode *inode = file->f_mapping->host;
>  	loff_t ret = -EINVAL;
> -	int is_dx_dir = ext4_test_inode_flag(inode, EXT4_INODE_INDEX);
> +	int dx_dir = is_dx_dir(inode);
>  
>  	mutex_lock(&inode->i_mutex);
>  
> @@ -323,7 +336,7 @@ loff_t ext4_dir_llseek(struct file *file, loff_t offset, int origin)
>  
>  		/* so only negative offsets are left, does that have a
>  		 * meaning for directories at all? */
> -		if (is_dx_dir)
> +		if (dx_dir)
>  			offset += ext4_get_htree_eof(file);
>  		else
>  			offset += inode->i_size;
> @@ -347,7 +360,7 @@ loff_t ext4_dir_llseek(struct file *file, loff_t offset, int origin)
>  	if (unlikely(offset < 0))
>  		goto out_err;
>  
> -	if (!is_dx_dir) {
> +	if (!dx_dir) {
>  		if (offset > inode->i_sb->s_maxbytes)
>  			goto out_err;
>  	} else if (offset > ext4_get_htree_eof(file))
> 
> 
> 
> 

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
  2012-04-12 20:49                                                   ` J. Bruce Fields
@ 2012-04-12 21:22                                                       ` Bernd Schubert
  -1 siblings, 0 replies; 81+ messages in thread
From: Bernd Schubert @ 2012-04-12 21:22 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Bernd Schubert, J. Bruce Fields, Ted Ts'o, Jan Kara,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	yong.fan-KloliPT79xf2eFz/2MeuCQ, sandeen-H+wXaHxf7aLQT0dZR+AlfA,
	adilger-KloliPT79xf2eFz/2MeuCQ

On 04/12/2012 10:49 PM, J. Bruce Fields wrote:
> On Wed, Mar 14, 2012 at 03:32:45PM +0100, Bernd Schubert wrote:
>> On 03/13/2012 10:29 PM, J. Bruce Fields wrote:
>>> On Tue, Mar 13, 2012 at 10:09:05PM +0100, Bernd Schubert wrote:
>>>> Hmm, there must have gone something wrong on merging,
>>>
>>> In that case one approach would be to rebase your last-sent patches on
>>> to the same base as Ted's versions, confirm that one still works and the
>>> other doesn't, and do a diff....
>>>
>>>> my own test
>>>> also fails
>>>>
>>>> http://www.pci.uni-heidelberg.de/tc/usr/bernd/downloads/test_seekdir/
>>>>
>>>> (Sorry, it does not say 'failure', but one needs to compare the file
>>>> names and telldir-offset numbers)
>>>>
>>>> I think I will continue in the morning as its already 1 a.m. here.
>>>
>>> OK, thanks for following up!
> 
> Stupid question: is there any fundamental feature of ext4 this depends
> on, or would this fix work equally well for fs/ext3?

First of all, Bruce and Ted, thanks a lot for your patience with those
patches!

I have not looked into it in detail yet, but it should be possible to
back port it to ext3. I don't think there is anything that would depend
on ext4. I also have ext3 on my todo list, but I first wanted to have it
in ext4 (in the sense of stability of ext3, if there still should be a
bug...) and I think for 3.4 it is too late now anyways. So unless there
are objections, lets target the back port to ext3 for 3.5?


Cheers,
Bernd
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
@ 2012-04-12 21:22                                                       ` Bernd Schubert
  0 siblings, 0 replies; 81+ messages in thread
From: Bernd Schubert @ 2012-04-12 21:22 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Bernd Schubert, J. Bruce Fields, Ted Ts'o, Jan Kara,
	linux-nfs, linux-ext4, linux-fsdevel, yong.fan, sandeen, adilger

On 04/12/2012 10:49 PM, J. Bruce Fields wrote:
> On Wed, Mar 14, 2012 at 03:32:45PM +0100, Bernd Schubert wrote:
>> On 03/13/2012 10:29 PM, J. Bruce Fields wrote:
>>> On Tue, Mar 13, 2012 at 10:09:05PM +0100, Bernd Schubert wrote:
>>>> Hmm, there must have gone something wrong on merging,
>>>
>>> In that case one approach would be to rebase your last-sent patches on
>>> to the same base as Ted's versions, confirm that one still works and the
>>> other doesn't, and do a diff....
>>>
>>>> my own test
>>>> also fails
>>>>
>>>> http://www.pci.uni-heidelberg.de/tc/usr/bernd/downloads/test_seekdir/
>>>>
>>>> (Sorry, it does not say 'failure', but one needs to compare the file
>>>> names and telldir-offset numbers)
>>>>
>>>> I think I will continue in the morning as its already 1 a.m. here.
>>>
>>> OK, thanks for following up!
> 
> Stupid question: is there any fundamental feature of ext4 this depends
> on, or would this fix work equally well for fs/ext3?

First of all, Bruce and Ted, thanks a lot for your patience with those
patches!

I have not looked into it in detail yet, but it should be possible to
back port it to ext3. I don't think there is anything that would depend
on ext4. I also have ext3 on my todo list, but I first wanted to have it
in ext4 (in the sense of stability of ext3, if there still should be a
bug...) and I think for 3.4 it is too late now anyways. So unless there
are objections, lets target the back port to ext3 for 3.5?


Cheers,
Bernd

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
  2012-04-12 21:22                                                       ` Bernd Schubert
@ 2012-04-12 21:25                                                           ` J. Bruce Fields
  -1 siblings, 0 replies; 81+ messages in thread
From: J. Bruce Fields @ 2012-04-12 21:25 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: Bernd Schubert, J. Bruce Fields, Ted Ts'o, Jan Kara,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	yong.fan-KloliPT79xf2eFz/2MeuCQ, sandeen-H+wXaHxf7aLQT0dZR+AlfA,
	adilger-KloliPT79xf2eFz/2MeuCQ

On Thu, Apr 12, 2012 at 11:22:41PM +0200, Bernd Schubert wrote:
> On 04/12/2012 10:49 PM, J. Bruce Fields wrote:
> > Stupid question: is there any fundamental feature of ext4 this depends
> > on, or would this fix work equally well for fs/ext3?
> 
> First of all, Bruce and Ted, thanks a lot for your patience with those
> patches!
> 
> I have not looked into it in detail yet, but it should be possible to
> back port it to ext3. I don't think there is anything that would depend
> on ext4. I also have ext3 on my todo list, but I first wanted to have it
> in ext4 (in the sense of stability of ext3, if there still should be a
> bug...) and I think for 3.4 it is too late now anyways. So unless there
> are objections, lets target the back port to ext3 for 3.5?

That sounds sensible, thanks!

--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open()
@ 2012-04-12 21:25                                                           ` J. Bruce Fields
  0 siblings, 0 replies; 81+ messages in thread
From: J. Bruce Fields @ 2012-04-12 21:25 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: Bernd Schubert, J. Bruce Fields, Ted Ts'o, Jan Kara,
	linux-nfs, linux-ext4, linux-fsdevel, yong.fan, sandeen, adilger

On Thu, Apr 12, 2012 at 11:22:41PM +0200, Bernd Schubert wrote:
> On 04/12/2012 10:49 PM, J. Bruce Fields wrote:
> > Stupid question: is there any fundamental feature of ext4 this depends
> > on, or would this fix work equally well for fs/ext3?
> 
> First of all, Bruce and Ted, thanks a lot for your patience with those
> patches!
> 
> I have not looked into it in detail yet, but it should be possible to
> back port it to ext3. I don't think there is anything that would depend
> on ext4. I also have ext3 on my todo list, but I first wanted to have it
> in ext4 (in the sense of stability of ext3, if there still should be a
> bug...) and I think for 3.4 it is too late now anyways. So unless there
> are objections, lets target the back port to ext3 for 3.5?

That sounds sensible, thanks!

--b.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
  2012-01-09 13:21 ` [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type Bernd Schubert
@ 2012-04-20 20:04       ` Eric Sandeen
       [not found]   ` <20120109132148.2616029.68798.stgit-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
  1 sibling, 0 replies; 81+ messages in thread
From: Eric Sandeen @ 2012-04-20 20:04 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Fan Yong,
	bfields-H+wXaHxf7aLQT0dZR+AlfA, Andreas Dilger

On 1/9/12 7:21 AM, Bernd Schubert wrote:
> From: Fan Yong <yong.fan-KloliPT79xf2eFz/2MeuCQ@public.gmane.org>
> 
> Traditionally ext2/3/4 has returned a 32-bit hash value from llseek()
> to appease NFSv2, which can only handle a 32-bit cookie for seekdir()
> and telldir().  However, this causes problems if there are 32-bit hash
> collisions, since the NFSv2 server can get stuck resending the same
> entries from the directory repeatedly.
> 
> Allow ext4 to return a full 64-bit hash (both major and minor) for
> telldir to decrease the chance of hash collisions.  This still needs
> integration on the NFS side.
> 
> Patch-updated-by: Bernd Schubert <bernd.schubert-mPn0NPGs4xGatNDF+KUbs4QuADTiUCJX@public.gmane.org>
> (blame me if something is not correct)

Bernd, I've merged this to ext3.  Bruce thought maybe you were working
on the same.  Should I send mine?

Also...

> +/*
> + * ext4_dir_llseek() based on generic_file_llseek() to handle both
> + * non-htree and htree directories, where the "offset" is in terms
> + * of the filename hash value instead of the byte offset.
> + *
> + * NOTE: offsets obtained *before* ext4_set_inode_flag(dir, EXT4_INODE_INDEX)
> + *       will be invalid once the directory was converted into a dx directory
> + */
> +loff_t ext4_dir_llseek(struct file *file, loff_t offset, int origin)

ext4_llseek() worries about max offset for direct/indirect vs. extent-mapped
files.  Do we need to worry about the same thing in this function?

-Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
@ 2012-04-20 20:04       ` Eric Sandeen
  0 siblings, 0 replies; 81+ messages in thread
From: Eric Sandeen @ 2012-04-20 20:04 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: linux-nfs, linux-ext4, linux-fsdevel, Fan Yong, bfields, Andreas Dilger

On 1/9/12 7:21 AM, Bernd Schubert wrote:
> From: Fan Yong <yong.fan@whamcloud.com>
> 
> Traditionally ext2/3/4 has returned a 32-bit hash value from llseek()
> to appease NFSv2, which can only handle a 32-bit cookie for seekdir()
> and telldir().  However, this causes problems if there are 32-bit hash
> collisions, since the NFSv2 server can get stuck resending the same
> entries from the directory repeatedly.
> 
> Allow ext4 to return a full 64-bit hash (both major and minor) for
> telldir to decrease the chance of hash collisions.  This still needs
> integration on the NFS side.
> 
> Patch-updated-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
> (blame me if something is not correct)

Bernd, I've merged this to ext3.  Bruce thought maybe you were working
on the same.  Should I send mine?

Also...

> +/*
> + * ext4_dir_llseek() based on generic_file_llseek() to handle both
> + * non-htree and htree directories, where the "offset" is in terms
> + * of the filename hash value instead of the byte offset.
> + *
> + * NOTE: offsets obtained *before* ext4_set_inode_flag(dir, EXT4_INODE_INDEX)
> + *       will be invalid once the directory was converted into a dx directory
> + */
> +loff_t ext4_dir_llseek(struct file *file, loff_t offset, int origin)

ext4_llseek() worries about max offset for direct/indirect vs. extent-mapped
files.  Do we need to worry about the same thing in this function?

-Eric

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
  2012-04-20 20:04       ` Eric Sandeen
@ 2012-04-22 12:51           ` Bernd Schubert
  -1 siblings, 0 replies; 81+ messages in thread
From: Bernd Schubert @ 2012-04-22 12:51 UTC (permalink / raw)
  To: Eric Sandeen
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Fan Yong,
	bfields-H+wXaHxf7aLQT0dZR+AlfA, Andreas Dilger

On 04/20/2012 10:04 PM, Eric Sandeen wrote:
> On 1/9/12 7:21 AM, Bernd Schubert wrote:
>> From: Fan Yong <yong.fan-KloliPT79xf2eFz/2MeuCQ@public.gmane.org>
>>
>> Traditionally ext2/3/4 has returned a 32-bit hash value from llseek()
>> to appease NFSv2, which can only handle a 32-bit cookie for seekdir()
>> and telldir().  However, this causes problems if there are 32-bit hash
>> collisions, since the NFSv2 server can get stuck resending the same
>> entries from the directory repeatedly.
>>
>> Allow ext4 to return a full 64-bit hash (both major and minor) for
>> telldir to decrease the chance of hash collisions.  This still needs
>> integration on the NFS side.
>>
>> Patch-updated-by: Bernd Schubert <bernd.schubert-mPn0NPGs4xGatNDF+KUbs4QuADTiUCJX@public.gmane.org>
>> (blame me if something is not correct)
> 
> Bernd, I've merged this to ext3.  Bruce thought maybe you were working
> on the same.  Should I send mine?

That is perfectly fine with me.

> 
> Also...
> 
>> +/*
>> + * ext4_dir_llseek() based on generic_file_llseek() to handle both
>> + * non-htree and htree directories, where the "offset" is in terms
>> + * of the filename hash value instead of the byte offset.
>> + *
>> + * NOTE: offsets obtained *before* ext4_set_inode_flag(dir, EXT4_INODE_INDEX)
>> + *       will be invalid once the directory was converted into a dx directory
>> + */
>> +loff_t ext4_dir_llseek(struct file *file, loff_t offset, int origin)
> 
> ext4_llseek() worries about max offset for direct/indirect vs. extent-mapped
> files.  Do we need to worry about the same thing in this function?

Hrmm, I just checked it and I think either is wrong. We only have to
care about non-dx directories, so ext4_readdir() applies, which limits
filp->f_pos < inode->i_size.
Going to send a patch tomorrow. Thanks for spotting this!


Cheers,
Bernd



--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
@ 2012-04-22 12:51           ` Bernd Schubert
  0 siblings, 0 replies; 81+ messages in thread
From: Bernd Schubert @ 2012-04-22 12:51 UTC (permalink / raw)
  To: Eric Sandeen
  Cc: linux-nfs, linux-ext4, linux-fsdevel, Fan Yong, bfields, Andreas Dilger

On 04/20/2012 10:04 PM, Eric Sandeen wrote:
> On 1/9/12 7:21 AM, Bernd Schubert wrote:
>> From: Fan Yong <yong.fan@whamcloud.com>
>>
>> Traditionally ext2/3/4 has returned a 32-bit hash value from llseek()
>> to appease NFSv2, which can only handle a 32-bit cookie for seekdir()
>> and telldir().  However, this causes problems if there are 32-bit hash
>> collisions, since the NFSv2 server can get stuck resending the same
>> entries from the directory repeatedly.
>>
>> Allow ext4 to return a full 64-bit hash (both major and minor) for
>> telldir to decrease the chance of hash collisions.  This still needs
>> integration on the NFS side.
>>
>> Patch-updated-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
>> (blame me if something is not correct)
> 
> Bernd, I've merged this to ext3.  Bruce thought maybe you were working
> on the same.  Should I send mine?

That is perfectly fine with me.

> 
> Also...
> 
>> +/*
>> + * ext4_dir_llseek() based on generic_file_llseek() to handle both
>> + * non-htree and htree directories, where the "offset" is in terms
>> + * of the filename hash value instead of the byte offset.
>> + *
>> + * NOTE: offsets obtained *before* ext4_set_inode_flag(dir, EXT4_INODE_INDEX)
>> + *       will be invalid once the directory was converted into a dx directory
>> + */
>> +loff_t ext4_dir_llseek(struct file *file, loff_t offset, int origin)
> 
> ext4_llseek() worries about max offset for direct/indirect vs. extent-mapped
> files.  Do we need to worry about the same thing in this function?

Hrmm, I just checked it and I think either is wrong. We only have to
care about non-dx directories, so ext4_readdir() applies, which limits
filp->f_pos < inode->i_size.
Going to send a patch tomorrow. Thanks for spotting this!


Cheers,
Bernd




^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
  2012-04-22 12:51           ` Bernd Schubert
  (?)
@ 2012-04-23 20:37           ` Eric Sandeen
       [not found]             ` <4F95BD72.6090200-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  -1 siblings, 1 reply; 81+ messages in thread
From: Eric Sandeen @ 2012-04-23 20:37 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: linux-nfs, linux-ext4, linux-fsdevel, Fan Yong, bfields, Andreas Dilger

On 4/22/12 7:51 AM, Bernd Schubert wrote:
> On 04/20/2012 10:04 PM, Eric Sandeen wrote:
>> On 1/9/12 7:21 AM, Bernd Schubert wrote:
>>> From: Fan Yong <yong.fan@whamcloud.com>
>>>
>>> Traditionally ext2/3/4 has returned a 32-bit hash value from llseek()
>>> to appease NFSv2, which can only handle a 32-bit cookie for seekdir()
>>> and telldir().  However, this causes problems if there are 32-bit hash
>>> collisions, since the NFSv2 server can get stuck resending the same
>>> entries from the directory repeatedly.
>>>
>>> Allow ext4 to return a full 64-bit hash (both major and minor) for
>>> telldir to decrease the chance of hash collisions.  This still needs
>>> integration on the NFS side.
>>>
>>> Patch-updated-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
>>> (blame me if something is not correct)
>>
>> Bernd, I've merged this to ext3.  Bruce thought maybe you were working
>> on the same.  Should I send mine?
> 
> That is perfectly fine with me.
> 
>>
>> Also...
>>
>>> +/*
>>> + * ext4_dir_llseek() based on generic_file_llseek() to handle both
>>> + * non-htree and htree directories, where the "offset" is in terms
>>> + * of the filename hash value instead of the byte offset.
>>> + *
>>> + * NOTE: offsets obtained *before* ext4_set_inode_flag(dir, EXT4_INODE_INDEX)
>>> + *       will be invalid once the directory was converted into a dx directory
>>> + */
>>> +loff_t ext4_dir_llseek(struct file *file, loff_t offset, int origin)
>>
>> ext4_llseek() worries about max offset for direct/indirect vs. extent-mapped
>> files.  Do we need to worry about the same thing in this function?
> 
> Hrmm, I just checked it and I think either is wrong. We only have to
> care about non-dx directories, so ext4_readdir() applies, which limits
> filp->f_pos < inode->i_size.
> Going to send a patch tomorrow. Thanks for spotting this!

The other thing I'm wondering is whether, in light of

ef3d0fd27e90f67e35da516dafc1482c82939a60 vfs: do (nearly) lockless generic_file_llseek

taking the i_mutex in ext4_dir_llseek could be a perf regression vs what was there before?  Is there anything about the new function which requires stronger locking?

I may be missing something obvious about the nfs interaction, not sure.

-Eric

> Cheers,
> Bernd
> 
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
  2012-04-23 20:37           ` Eric Sandeen
@ 2012-04-23 20:52                 ` Bernd Schubert
  0 siblings, 0 replies; 81+ messages in thread
From: Bernd Schubert @ 2012-04-23 20:52 UTC (permalink / raw)
  To: Eric Sandeen
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Fan Yong,
	bfields-H+wXaHxf7aLQT0dZR+AlfA, Andreas Dilger

On 04/23/2012 10:37 PM, Eric Sandeen wrote:
> On 4/22/12 7:51 AM, Bernd Schubert wrote:
>> On 04/20/2012 10:04 PM, Eric Sandeen wrote:
>>> On 1/9/12 7:21 AM, Bernd Schubert wrote:
>>>> From: Fan Yong <yong.fan-KloliPT79xf2eFz/2MeuCQ@public.gmane.org>
>>>>
>>>> Traditionally ext2/3/4 has returned a 32-bit hash value from llseek()
>>>> to appease NFSv2, which can only handle a 32-bit cookie for seekdir()
>>>> and telldir().  However, this causes problems if there are 32-bit hash
>>>> collisions, since the NFSv2 server can get stuck resending the same
>>>> entries from the directory repeatedly.
>>>>
>>>> Allow ext4 to return a full 64-bit hash (both major and minor) for
>>>> telldir to decrease the chance of hash collisions.  This still needs
>>>> integration on the NFS side.
>>>>
>>>> Patch-updated-by: Bernd Schubert <bernd.schubert-mPn0NPGs4xGatNDF+KUbs4QuADTiUCJX@public.gmane.org>
>>>> (blame me if something is not correct)
>>>
>>> Bernd, I've merged this to ext3.  Bruce thought maybe you were working
>>> on the same.  Should I send mine?
>>
>> That is perfectly fine with me.
>>
>>>
>>> Also...
>>>
>>>> +/*
>>>> + * ext4_dir_llseek() based on generic_file_llseek() to handle both
>>>> + * non-htree and htree directories, where the "offset" is in terms
>>>> + * of the filename hash value instead of the byte offset.
>>>> + *
>>>> + * NOTE: offsets obtained *before* ext4_set_inode_flag(dir, EXT4_INODE_INDEX)
>>>> + *       will be invalid once the directory was converted into a dx directory
>>>> + */
>>>> +loff_t ext4_dir_llseek(struct file *file, loff_t offset, int origin)
>>>
>>> ext4_llseek() worries about max offset for direct/indirect vs. extent-mapped
>>> files.  Do we need to worry about the same thing in this function?
>>
>> Hrmm, I just checked it and I think either is wrong. We only have to
>> care about non-dx directories, so ext4_readdir() applies, which limits
>> filp->f_pos < inode->i_size.
>> Going to send a patch tomorrow. Thanks for spotting this!
> 
> The other thing I'm wondering is whether, in light of
> 
> ef3d0fd27e90f67e35da516dafc1482c82939a60 vfs: do (nearly) lockless generic_file_llseek
> 
> taking the i_mutex in ext4_dir_llseek could be a perf regression vs what was there before?  Is there anything about the new function which requires stronger locking?
> 
> I may be missing something obvious about the nfs interaction, not sure.
> 

Oh, good point. I was just about to send a small patch, but reading
through the lockless commit will take some time - its already too late
for me for today. Will work on that tomorrow. Thanks again for your review!

Cheers,
Bernd



diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c
index b867862..3a4988e2 100644
--- a/fs/ext4/dir.c
+++ b/fs/ext4/dir.c
@@ -363,7 +363,7 @@ loff_t ext4_dir_llseek(struct file *file, loff_t
offset, int origin)
 		goto out_err;

 	if (!dx_dir) {
-		if (offset > inode->i_sb->s_maxbytes)
+		if (offset > i_size_read(inode))
 			goto out_err;
 	} else if (offset > ext4_get_htree_eof(file))
 		goto out_err;


--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
@ 2012-04-23 20:52                 ` Bernd Schubert
  0 siblings, 0 replies; 81+ messages in thread
From: Bernd Schubert @ 2012-04-23 20:52 UTC (permalink / raw)
  To: Eric Sandeen
  Cc: linux-nfs, linux-ext4, linux-fsdevel, Fan Yong, bfields, Andreas Dilger

On 04/23/2012 10:37 PM, Eric Sandeen wrote:
> On 4/22/12 7:51 AM, Bernd Schubert wrote:
>> On 04/20/2012 10:04 PM, Eric Sandeen wrote:
>>> On 1/9/12 7:21 AM, Bernd Schubert wrote:
>>>> From: Fan Yong <yong.fan@whamcloud.com>
>>>>
>>>> Traditionally ext2/3/4 has returned a 32-bit hash value from llseek()
>>>> to appease NFSv2, which can only handle a 32-bit cookie for seekdir()
>>>> and telldir().  However, this causes problems if there are 32-bit hash
>>>> collisions, since the NFSv2 server can get stuck resending the same
>>>> entries from the directory repeatedly.
>>>>
>>>> Allow ext4 to return a full 64-bit hash (both major and minor) for
>>>> telldir to decrease the chance of hash collisions.  This still needs
>>>> integration on the NFS side.
>>>>
>>>> Patch-updated-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
>>>> (blame me if something is not correct)
>>>
>>> Bernd, I've merged this to ext3.  Bruce thought maybe you were working
>>> on the same.  Should I send mine?
>>
>> That is perfectly fine with me.
>>
>>>
>>> Also...
>>>
>>>> +/*
>>>> + * ext4_dir_llseek() based on generic_file_llseek() to handle both
>>>> + * non-htree and htree directories, where the "offset" is in terms
>>>> + * of the filename hash value instead of the byte offset.
>>>> + *
>>>> + * NOTE: offsets obtained *before* ext4_set_inode_flag(dir, EXT4_INODE_INDEX)
>>>> + *       will be invalid once the directory was converted into a dx directory
>>>> + */
>>>> +loff_t ext4_dir_llseek(struct file *file, loff_t offset, int origin)
>>>
>>> ext4_llseek() worries about max offset for direct/indirect vs. extent-mapped
>>> files.  Do we need to worry about the same thing in this function?
>>
>> Hrmm, I just checked it and I think either is wrong. We only have to
>> care about non-dx directories, so ext4_readdir() applies, which limits
>> filp->f_pos < inode->i_size.
>> Going to send a patch tomorrow. Thanks for spotting this!
> 
> The other thing I'm wondering is whether, in light of
> 
> ef3d0fd27e90f67e35da516dafc1482c82939a60 vfs: do (nearly) lockless generic_file_llseek
> 
> taking the i_mutex in ext4_dir_llseek could be a perf regression vs what was there before?  Is there anything about the new function which requires stronger locking?
> 
> I may be missing something obvious about the nfs interaction, not sure.
> 

Oh, good point. I was just about to send a small patch, but reading
through the lockless commit will take some time - its already too late
for me for today. Will work on that tomorrow. Thanks again for your review!

Cheers,
Bernd



diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c
index b867862..3a4988e2 100644
--- a/fs/ext4/dir.c
+++ b/fs/ext4/dir.c
@@ -363,7 +363,7 @@ loff_t ext4_dir_llseek(struct file *file, loff_t
offset, int origin)
 		goto out_err;

 	if (!dx_dir) {
-		if (offset > inode->i_sb->s_maxbytes)
+		if (offset > i_size_read(inode))
 			goto out_err;
 	} else if (offset > ext4_get_htree_eof(file))
 		goto out_err;



^ permalink raw reply related	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
  2012-04-23 20:52                 ` Bernd Schubert
  (?)
@ 2012-04-23 21:22                 ` Eric Sandeen
  -1 siblings, 0 replies; 81+ messages in thread
From: Eric Sandeen @ 2012-04-23 21:22 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: linux-nfs, linux-ext4, linux-fsdevel, Fan Yong, bfields, Andreas Dilger

On 4/23/12 3:52 PM, Bernd Schubert wrote:
> On 04/23/2012 10:37 PM, Eric Sandeen wrote:
>> On 4/22/12 7:51 AM, Bernd Schubert wrote:
>>> On 04/20/2012 10:04 PM, Eric Sandeen wrote:
>>>> On 1/9/12 7:21 AM, Bernd Schubert wrote:
>>>>> From: Fan Yong <yong.fan@whamcloud.com>
>>>>>
>>>>> Traditionally ext2/3/4 has returned a 32-bit hash value from llseek()
>>>>> to appease NFSv2, which can only handle a 32-bit cookie for seekdir()
>>>>> and telldir().  However, this causes problems if there are 32-bit hash
>>>>> collisions, since the NFSv2 server can get stuck resending the same
>>>>> entries from the directory repeatedly.
>>>>>
>>>>> Allow ext4 to return a full 64-bit hash (both major and minor) for
>>>>> telldir to decrease the chance of hash collisions.  This still needs
>>>>> integration on the NFS side.
>>>>>
>>>>> Patch-updated-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
>>>>> (blame me if something is not correct)
>>>>
>>>> Bernd, I've merged this to ext3.  Bruce thought maybe you were working
>>>> on the same.  Should I send mine?
>>>
>>> That is perfectly fine with me.
>>>
>>>>
>>>> Also...
>>>>
>>>>> +/*
>>>>> + * ext4_dir_llseek() based on generic_file_llseek() to handle both
>>>>> + * non-htree and htree directories, where the "offset" is in terms
>>>>> + * of the filename hash value instead of the byte offset.
>>>>> + *
>>>>> + * NOTE: offsets obtained *before* ext4_set_inode_flag(dir, EXT4_INODE_INDEX)
>>>>> + *       will be invalid once the directory was converted into a dx directory
>>>>> + */
>>>>> +loff_t ext4_dir_llseek(struct file *file, loff_t offset, int origin)
>>>>
>>>> ext4_llseek() worries about max offset for direct/indirect vs. extent-mapped
>>>> files.  Do we need to worry about the same thing in this function?
>>>
>>> Hrmm, I just checked it and I think either is wrong. We only have to
>>> care about non-dx directories, so ext4_readdir() applies, which limits
>>> filp->f_pos < inode->i_size.
>>> Going to send a patch tomorrow. Thanks for spotting this!
>>
>> The other thing I'm wondering is whether, in light of
>>
>> ef3d0fd27e90f67e35da516dafc1482c82939a60 vfs: do (nearly) lockless generic_file_llseek
>>
>> taking the i_mutex in ext4_dir_llseek could be a perf regression vs what was there before?  Is there anything about the new function which requires stronger locking?
>>
>> I may be missing something obvious about the nfs interaction, not sure.
>>
> 
> Oh, good point. I was just about to send a small patch, but reading
> through the lockless commit will take some time - its already too late
> for me for today. Will work on that tomorrow. Thanks again for your review!

Sorry it's so late :(

-Eric

> Cheers,
> Bernd
> 
> 
> 
> diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c
> index b867862..3a4988e2 100644
> --- a/fs/ext4/dir.c
> +++ b/fs/ext4/dir.c
> @@ -363,7 +363,7 @@ loff_t ext4_dir_llseek(struct file *file, loff_t
> offset, int origin)
>  		goto out_err;
> 
>  	if (!dx_dir) {
> -		if (offset > inode->i_sb->s_maxbytes)
> +		if (offset > i_size_read(inode))
>  			goto out_err;
>  	} else if (offset > ext4_get_htree_eof(file))
>  		goto out_err;
> 
> 


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
  2012-04-23 20:52                 ` Bernd Schubert
@ 2012-04-23 22:23                     ` Eric Sandeen
  -1 siblings, 0 replies; 81+ messages in thread
From: Eric Sandeen @ 2012-04-23 22:23 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Fan Yong,
	bfields-H+wXaHxf7aLQT0dZR+AlfA, Andreas Dilger

On 4/23/12 3:52 PM, Bernd Schubert wrote:
> On 04/23/2012 10:37 PM, Eric Sandeen wrote:

...

> diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c
> index b867862..3a4988e2 100644
> --- a/fs/ext4/dir.c
> +++ b/fs/ext4/dir.c
> @@ -363,7 +363,7 @@ loff_t ext4_dir_llseek(struct file *file, loff_t
> offset, int origin)
>  		goto out_err;
> 
>  	if (!dx_dir) {
> -		if (offset > inode->i_sb->s_maxbytes)
> +		if (offset > i_size_read(inode))
>  			goto out_err;
>  	} else if (offset > ext4_get_htree_eof(file))
>  		goto out_err;

I'm curious about the above as well as:

        case SEEK_END:
                if (unlikely(offset > 0))
                        goto out_err; /* not supported for directories */

The previous .llseek handler, and the generic handler for other filesystems, allow seeking past the end of the dir AFAICT.  (not sure why you'd want to, but I don't see that you'd get an error back).

Is there a reason to uniquely exclude it in ext4?  Does that line up with POSIX?

-Eric

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
@ 2012-04-23 22:23                     ` Eric Sandeen
  0 siblings, 0 replies; 81+ messages in thread
From: Eric Sandeen @ 2012-04-23 22:23 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: linux-nfs, linux-ext4, linux-fsdevel, Fan Yong, bfields, Andreas Dilger

On 4/23/12 3:52 PM, Bernd Schubert wrote:
> On 04/23/2012 10:37 PM, Eric Sandeen wrote:

...

> diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c
> index b867862..3a4988e2 100644
> --- a/fs/ext4/dir.c
> +++ b/fs/ext4/dir.c
> @@ -363,7 +363,7 @@ loff_t ext4_dir_llseek(struct file *file, loff_t
> offset, int origin)
>  		goto out_err;
> 
>  	if (!dx_dir) {
> -		if (offset > inode->i_sb->s_maxbytes)
> +		if (offset > i_size_read(inode))
>  			goto out_err;
>  	} else if (offset > ext4_get_htree_eof(file))
>  		goto out_err;

I'm curious about the above as well as:

        case SEEK_END:
                if (unlikely(offset > 0))
                        goto out_err; /* not supported for directories */

The previous .llseek handler, and the generic handler for other filesystems, allow seeking past the end of the dir AFAICT.  (not sure why you'd want to, but I don't see that you'd get an error back).

Is there a reason to uniquely exclude it in ext4?  Does that line up with POSIX?

-Eric


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
  2012-04-23 22:23                     ` Eric Sandeen
  (?)
@ 2012-04-23 22:42                     ` Andreas Dilger
       [not found]                       ` <A754D23B-B946-4E80-ACEA-0E2C2E6FAA2E-KloliPT79xf2eFz/2MeuCQ@public.gmane.org>
  -1 siblings, 1 reply; 81+ messages in thread
From: Andreas Dilger @ 2012-04-23 22:42 UTC (permalink / raw)
  To: Eric Sandeen
  Cc: Bernd Schubert, linux-nfs, linux-ext4, linux-fsdevel, Fan Yong, bfields

On 2012-04-23, at 5:23 PM, Eric Sandeen wrote:
> I'm curious about the above as well as:
> 
>        case SEEK_END:
>                if (unlikely(offset > 0))
>                        goto out_err; /* not supported for directories */
> 
> The previous .llseek handler, and the generic handler for other filesystems, allow seeking past the end of the dir AFAICT.  (not sure why you'd want to, but I don't see that you'd get an error back).
> 
> Is there a reason to uniquely exclude it in ext4?  Does that line up with POSIX?

I don't know what the origin of this was...  I don't think there is a real reason for it except that it doesn't make any sense to do so.

Cheers, Andreas
--
Andreas Dilger                       Whamcloud, Inc.
Principal Lustre Engineer            http://www.whamcloud.com/





^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
  2012-04-23 22:42                     ` Andreas Dilger
@ 2012-04-24 16:10                           ` Bernd Schubert
  0 siblings, 0 replies; 81+ messages in thread
From: Bernd Schubert @ 2012-04-24 16:10 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: Eric Sandeen, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Fan Yong,
	bfields-H+wXaHxf7aLQT0dZR+AlfA

On 04/24/2012 12:42 AM, Andreas Dilger wrote:
> On 2012-04-23, at 5:23 PM, Eric Sandeen wrote:
>> I'm curious about the above as well as:
>>
>>         case SEEK_END:
>>                 if (unlikely(offset>  0))
>>                         goto out_err; /* not supported for directories */
>>
>> The previous .llseek handler, and the generic handler for other filesystems, allow seeking past the end of the dir AFAICT.  (not sure why you'd want to, but I don't see that you'd get an error back).
>>
>> Is there a reason to uniquely exclude it in ext4?  Does that line up with POSIX?
>
> I don't know what the origin of this was...  I don't think there is a real reason for it except that it doesn't make any sense to do so.
>

I think I added that. According to pubs.opengroup.org:
(http://pubs.opengroup.org/onlinepubs/009695399/functions/seekdir.html)

void seekdir(DIR *dirp, long loc);

<quote>

If the value of loc was not obtained from an earlier call to telldir(), 
or if a call to rewinddir() occurred between the call to telldir() and 
the call to seekdir(), the results of subsequent calls to readdir() are 
unspecified.

</quote>


As telldir(), which should correlate to 'case SEEK_CUR' will not provide 
invalid values, the behaviour is undefined.


Also,


case SEEK_END:
[...]
                 if (dx_dir)
                         offset += ext4_get_htree_eof(file);
                 else
                         offset += inode->i_size;
[...]


         if (!dx_dir) {
                 if (offset > inode->i_sb->s_maxbytes)
                         goto out_err;
         } else if (offset > ext4_get_htree_eof(file))
                 goto out_err;




Hence, the additional:

          case SEEK_END:
                  if (unlikely(offset>  0))
                       goto out_err; /* not supported for directories */

	
is just a shortcut to avoid useless calculations.

Unless I missed something, it only remains the question if could break 
existing applications relying on undefined behaviour. However, I have no 
idea how an application might trigger that?


Thanks,
Bernd

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
@ 2012-04-24 16:10                           ` Bernd Schubert
  0 siblings, 0 replies; 81+ messages in thread
From: Bernd Schubert @ 2012-04-24 16:10 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: Eric Sandeen, linux-nfs, linux-ext4, linux-fsdevel, Fan Yong, bfields

On 04/24/2012 12:42 AM, Andreas Dilger wrote:
> On 2012-04-23, at 5:23 PM, Eric Sandeen wrote:
>> I'm curious about the above as well as:
>>
>>         case SEEK_END:
>>                 if (unlikely(offset>  0))
>>                         goto out_err; /* not supported for directories */
>>
>> The previous .llseek handler, and the generic handler for other filesystems, allow seeking past the end of the dir AFAICT.  (not sure why you'd want to, but I don't see that you'd get an error back).
>>
>> Is there a reason to uniquely exclude it in ext4?  Does that line up with POSIX?
>
> I don't know what the origin of this was...  I don't think there is a real reason for it except that it doesn't make any sense to do so.
>

I think I added that. According to pubs.opengroup.org:
(http://pubs.opengroup.org/onlinepubs/009695399/functions/seekdir.html)

void seekdir(DIR *dirp, long loc);

<quote>

If the value of loc was not obtained from an earlier call to telldir(), 
or if a call to rewinddir() occurred between the call to telldir() and 
the call to seekdir(), the results of subsequent calls to readdir() are 
unspecified.

</quote>


As telldir(), which should correlate to 'case SEEK_CUR' will not provide 
invalid values, the behaviour is undefined.


Also,


case SEEK_END:
[...]
                 if (dx_dir)
                         offset += ext4_get_htree_eof(file);
                 else
                         offset += inode->i_size;
[...]


         if (!dx_dir) {
                 if (offset > inode->i_sb->s_maxbytes)
                         goto out_err;
         } else if (offset > ext4_get_htree_eof(file))
                 goto out_err;




Hence, the additional:

          case SEEK_END:
                  if (unlikely(offset>  0))
                       goto out_err; /* not supported for directories */

	
is just a shortcut to avoid useless calculations.

Unless I missed something, it only remains the question if could break 
existing applications relying on undefined behaviour. However, I have no 
idea how an application might trigger that?


Thanks,
Bernd


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
  2012-04-24 16:10                           ` Bernd Schubert
  (?)
@ 2012-04-24 19:21                           ` Eric Sandeen
  2012-04-24 21:07                             ` Bernd Schubert
  2012-04-24 21:28                             ` Andreas Dilger
  -1 siblings, 2 replies; 81+ messages in thread
From: Eric Sandeen @ 2012-04-24 19:21 UTC (permalink / raw)
  To: Bernd Schubert; +Cc: Andreas Dilger, linux-ext4, Fan Yong, bfields

On 4/24/12 11:10 AM, Bernd Schubert wrote:
> On 04/24/2012 12:42 AM, Andreas Dilger wrote:
>> On 2012-04-23, at 5:23 PM, Eric Sandeen wrote:
>>> I'm curious about the above as well as:
>>>
>>>         case SEEK_END:
>>>                 if (unlikely(offset>  0))
>>>                         goto out_err; /* not supported for directories */
>>>
>>> The previous .llseek handler, and the generic handler for other
>>> filesystems, allow seeking past the end of the dir AFAICT. (not
>>> sure why you'd want to, but I don't see that you'd get an error
>>> back).
>>>
>>> Is there a reason to uniquely exclude it in ext4?  Does that line up with POSIX?
>>
>> I don't know what the origin of this was... I don't think there is
>> a real reason for it except that it doesn't make any sense to do
>> so.
>>
> 
> I think I added that. According to pubs.opengroup.org:
> (http://pubs.opengroup.org/onlinepubs/009695399/functions/seekdir.html)
> 
> void seekdir(DIR *dirp, long loc);
> 
> <quote>
> 
> If the value of loc was not obtained from an earlier call to
> telldir(), or if a call to rewinddir() occurred between the call to
> telldir() and the call to seekdir(), the results of subsequent calls
> to readdir() are unspecified.
> 
> </quote>
> 
> 
> As telldir(), which should correlate to 'case SEEK_CUR' will not
> provide invalid values, the behaviour is undefined.
> 
> 
> Also,
> 
> 
> case SEEK_END:
> [...]
>                 if (dx_dir)
>                         offset += ext4_get_htree_eof(file);
>                 else
>                         offset += inode->i_size;
> [...]
> 
> 
>         if (!dx_dir) {
>                 if (offset > inode->i_sb->s_maxbytes)
>                         goto out_err;
>         } else if (offset > ext4_get_htree_eof(file))
>                 goto out_err;
> 
> 
> 
> 
> Hence, the additional:
> 
>          case SEEK_END:
>                  if (unlikely(offset>  0))
>                       goto out_err; /* not supported for directories */
> 
>     
> is just a shortcut to avoid useless calculations.
> 
> Unless I missed something, it only remains the question if could
> break existing applications relying on undefined behaviour. However,
> I have no idea how an application might trigger that?

(other lists removed at this point, this is ext4-specific)

I know I'm being a little pedantic w/ the late review here....

It seems like the only differences between ext4_dir_llseek and the old ext4_llseek are these:

1) For SEEK_END, we now return -EINVAL for a positive offset (i.e. past EOF)
2) For SEEK_END, we seek to ext4_get_htree_eof() not to inode->i_size
3) For SEEK_SET, we impose different limits for max offset
  - s_maxbytes / ext4_get_htree_eof for !dx/dx, vs. s_bitmap_maxbytes/s_maxbytes

Do any of these changes relate to the hash collision problem?  Are any of them uniquely
required for ext4, enough to warrant cut & paste of the vfs llseek code (again?)

What I'm getting at is: what are the reasons that we cannot use generic_file_llseek_size(),
maybe with a new argument to specify a non-standard location for SEEK_END.  Such
a change would require a solid explanation, but it'd probably go in if it meant
one less seek implementation to worry about.

-Eric

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
  2012-04-24 19:21                           ` Eric Sandeen
@ 2012-04-24 21:07                             ` Bernd Schubert
  2012-04-24 22:24                               ` Andreas Dilger
  2012-04-24 21:28                             ` Andreas Dilger
  1 sibling, 1 reply; 81+ messages in thread
From: Bernd Schubert @ 2012-04-24 21:07 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Andreas Dilger, linux-ext4, Fan Yong, bfields

On 04/24/2012 09:21 PM, Eric Sandeen wrote:
> On 4/24/12 11:10 AM, Bernd Schubert wrote:
>> On 04/24/2012 12:42 AM, Andreas Dilger wrote:
>>> On 2012-04-23, at 5:23 PM, Eric Sandeen wrote:
>>>> I'm curious about the above as well as:
>>>>
>>>>         case SEEK_END:
>>>>                 if (unlikely(offset>  0))
>>>>                         goto out_err; /* not supported for directories */
>>>>
>>>> The previous .llseek handler, and the generic handler for other
>>>> filesystems, allow seeking past the end of the dir AFAICT. (not
>>>> sure why you'd want to, but I don't see that you'd get an error
>>>> back).
>>>>
>>>> Is there a reason to uniquely exclude it in ext4?  Does that line up with POSIX?
>>>
>>> I don't know what the origin of this was... I don't think there is
>>> a real reason for it except that it doesn't make any sense to do
>>> so.
>>>
>>
>> I think I added that. According to pubs.opengroup.org:
>> (http://pubs.opengroup.org/onlinepubs/009695399/functions/seekdir.html)
>>
>> void seekdir(DIR *dirp, long loc);
>>
>> <quote>
>>
>> If the value of loc was not obtained from an earlier call to
>> telldir(), or if a call to rewinddir() occurred between the call to
>> telldir() and the call to seekdir(), the results of subsequent calls
>> to readdir() are unspecified.
>>
>> </quote>
>>
>>
>> As telldir(), which should correlate to 'case SEEK_CUR' will not
>> provide invalid values, the behaviour is undefined.
>>
>>
>> Also,
>>
>>
>> case SEEK_END:
>> [...]
>>                 if (dx_dir)
>>                         offset += ext4_get_htree_eof(file);
>>                 else
>>                         offset += inode->i_size;
>> [...]
>>
>>
>>         if (!dx_dir) {
>>                 if (offset > inode->i_sb->s_maxbytes)
>>                         goto out_err;
>>         } else if (offset > ext4_get_htree_eof(file))
>>                 goto out_err;
>>
>>
>>
>>
>> Hence, the additional:
>>
>>          case SEEK_END:
>>                  if (unlikely(offset>  0))
>>                       goto out_err; /* not supported for directories */
>>
>>     
>> is just a shortcut to avoid useless calculations.
>>
>> Unless I missed something, it only remains the question if could
>> break existing applications relying on undefined behaviour. However,
>> I have no idea how an application might trigger that?
> 
> (other lists removed at this point, this is ext4-specific)
> 
> I know I'm being a little pedantic w/ the late review here....

That is fine, lets better be pedantic now than cause trouble to ext4
users...

> 
> It seems like the only differences between ext4_dir_llseek and the old ext4_llseek are these:
> 
> 1) For SEEK_END, we now return -EINVAL for a positive offset (i.e. past EOF)

I definitely introduces that one, as I cannot see how an application
might ever run into it. Especially as ext4 directories cannot shrink. So
if an application tries to exceed the directory size limit, it looks to
me as some of attempt to break something or as an error in the
application. However, if there should be the slightest chance to break
existing applications relying on that, we need to remove that.



I thought about 2) and 3) it on my way home and I think I remembered the
reason for it.

> 2) For SEEK_END, we seek to ext4_get_htree_eof() not to inode->i_size

Lets assume an application wants to seek to the last directory entry. If
it would seek to inode->i_size and then would attempt another readdir
from that offset, we probably would succeed, as inode->i_size is
probably just an arbitrary value in between two hashes, or even smaller
than the very first hash  value, so the next readdir() probably even
would read the very first  directory entry. I think i_size and
ext4_get_htree_eof()  makes a very big difference here.

> 3) For SEEK_SET, we impose different limits for max offset
>   - s_maxbytes / ext4_get_htree_eof for !dx/dx, vs. s_bitmap_maxbytes/s_maxbytes

Its a bit too late for me to check that today (and I'm almost
starving...), but is it possible that s_maxbytes is smaller than
ext4_get_htree_eof? So is possible that valid hash values get larger
than s_maxbytes? I will check that tomorrow morning.

> 
> Do any of these changes relate to the hash collision problem?  Are any of them uniquely
> required for ext4, enough to warrant cut & paste of the vfs llseek code (again?)
> 
> What I'm getting at is: what are the reasons that we cannot use generic_file_llseek_size(),
> maybe with a new argument to specify a non-standard location for SEEK_END.  Such
> a change would require a solid explanation, but it'd probably go in if it meant
> one less seek implementation to worry about.


I think we probably need to extent generic_file_llseek_size() by a
parameter 'max_fs_limit' (well something like that name, I don't find a
better one now) and then it should be possible to use it.


Cheers,
Bernd



^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
  2012-04-24 21:28                             ` Andreas Dilger
@ 2012-04-24 21:26                               ` Eric Sandeen
  0 siblings, 0 replies; 81+ messages in thread
From: Eric Sandeen @ 2012-04-24 21:26 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Bernd Schubert, linux-ext4, Fan Yong, bfields

On 4/24/12 4:28 PM, Andreas Dilger wrote:
> On 2012-04-24, at 2:21 PM, Eric Sandeen wrote:
>> I know I'm being a little pedantic w/ the late review here....
>>
>> It seems like the only differences between ext4_dir_llseek and the old ext4_llseek are these:
>>
>> 1) For SEEK_END, we now return -EINVAL for a positive offset (i.e. past EOF)
>> 2) For SEEK_END, we seek to ext4_get_htree_eof() not to inode->i_size
>> 3) For SEEK_SET, we impose different limits for max offset
>>  - s_maxbytes / ext4_get_htree_eof for !dx/dx, vs. s_bitmap_maxbytes/s_maxbytes
>>
>> Do any of these changes relate to the hash collision problem?  Are any of them uniquely required for ext4, enough to warrant cut & paste of the vfs llseek code (again?)
>>
>> What I'm getting at is: what are the reasons that we cannot use generic_file_llseek_size(), maybe with a new argument to specify a non-standard location for SEEK_END.  Such a change would require a solid explanation, but it'd probably go in if it meant one less seek implementation to worry about.
> 
> So, when we were looking at this code, it makes sense that if dir seek is being done for telldir/seekdir that the parameters for ext4 are hash functions, so they should be compared against hash limits instead of the file size.
> 
> This probably makes sense for other filesystems that use hash cookies instead of byte offsets to have a similar dir seek implementation, but I thought there might be a controversy about this and I'm happy to get it into ext4 as a starting point.

That makes sense... but I think the generic code could be expanded to handle this, if we used the existing size parameter to specify max seekable offset in the dir, and a new parameter to cause SEEK_END to behave in a special way (not depending on i_size?)

-Eric

> Cheers, Andreas
> --
> Andreas Dilger                       Whamcloud, Inc.
> Principal Lustre Engineer            http://www.whamcloud.com/
> 
> 
> 
> 


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
  2012-04-24 19:21                           ` Eric Sandeen
  2012-04-24 21:07                             ` Bernd Schubert
@ 2012-04-24 21:28                             ` Andreas Dilger
  2012-04-24 21:26                               ` Eric Sandeen
  1 sibling, 1 reply; 81+ messages in thread
From: Andreas Dilger @ 2012-04-24 21:28 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Bernd Schubert, linux-ext4, Fan Yong, bfields

On 2012-04-24, at 2:21 PM, Eric Sandeen wrote:
> I know I'm being a little pedantic w/ the late review here....
> 
> It seems like the only differences between ext4_dir_llseek and the old ext4_llseek are these:
> 
> 1) For SEEK_END, we now return -EINVAL for a positive offset (i.e. past EOF)
> 2) For SEEK_END, we seek to ext4_get_htree_eof() not to inode->i_size
> 3) For SEEK_SET, we impose different limits for max offset
>  - s_maxbytes / ext4_get_htree_eof for !dx/dx, vs. s_bitmap_maxbytes/s_maxbytes
> 
> Do any of these changes relate to the hash collision problem?  Are any of them uniquely required for ext4, enough to warrant cut & paste of the vfs llseek code (again?)
> 
> What I'm getting at is: what are the reasons that we cannot use generic_file_llseek_size(), maybe with a new argument to specify a non-standard location for SEEK_END.  Such a change would require a solid explanation, but it'd probably go in if it meant one less seek implementation to worry about.

So, when we were looking at this code, it makes sense that if dir seek is being done for telldir/seekdir that the parameters for ext4 are hash functions, so they should be compared against hash limits instead of the file size.

This probably makes sense for other filesystems that use hash cookies instead of byte offsets to have a similar dir seek implementation, but I thought there might be a controversy about this and I'm happy to get it into ext4 as a starting point.

Cheers, Andreas
--
Andreas Dilger                       Whamcloud, Inc.
Principal Lustre Engineer            http://www.whamcloud.com/





^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
  2012-04-24 21:07                             ` Bernd Schubert
@ 2012-04-24 22:24                               ` Andreas Dilger
  2012-04-25 15:05                                 ` Eric Sandeen
  0 siblings, 1 reply; 81+ messages in thread
From: Andreas Dilger @ 2012-04-24 22:24 UTC (permalink / raw)
  To: Bernd Schubert; +Cc: Eric Sandeen, linux-ext4, Fan Yong, bfields

On 2012-04-24, at 4:07 PM, Bernd Schubert wrote:
>> 1) For SEEK_END, we now return -EINVAL for a positive offset (i.e. past EOF)
> 
> I definitely introduces that one, as I cannot see how an application
> might ever run into it. Especially as ext4 directories cannot shrink. So
> if an application tries to exceed the directory size limit, it looks to
> me as some of attempt to break something or as an error in the
> application. However, if there should be the slightest chance to break
> existing applications relying on that, we need to remove that.

I think the other reason to avoid SEEK_END + n is that since SEEK_END
for a hash offset is (signed) MAX_LONG, so if one seeks beyond that
it will wrap to a negative offset.

Cheers, Andreas
--
Andreas Dilger                       Whamcloud, Inc.
Principal Lustre Engineer            http://www.whamcloud.com/





^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
  2012-04-24 22:24                               ` Andreas Dilger
@ 2012-04-25 15:05                                 ` Eric Sandeen
  2012-04-25 15:12                                   ` Bernd Schubert
  0 siblings, 1 reply; 81+ messages in thread
From: Eric Sandeen @ 2012-04-25 15:05 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Bernd Schubert, linux-ext4, Fan Yong, bfields

On 4/24/12 5:24 PM, Andreas Dilger wrote:
> On 2012-04-24, at 4:07 PM, Bernd Schubert wrote:
>>> 1) For SEEK_END, we now return -EINVAL for a positive offset (i.e. past EOF)
>>
>> I definitely introduces that one, as I cannot see how an application
>> might ever run into it. Especially as ext4 directories cannot shrink. So
>> if an application tries to exceed the directory size limit, it looks to
>> me as some of attempt to break something or as an error in the
>> application. However, if there should be the slightest chance to break
>> existing applications relying on that, we need to remove that.
> 
> I think the other reason to avoid SEEK_END + n is that since SEEK_END
> for a hash offset is (signed) MAX_LONG, so if one seeks beyond that
> it will wrap to a negative offset.

Makes sense.

Wishing this had been done as a separate patch, though, since it's really addressing a separate issue from the $SUBJECT, and could have used specific documentation of the change.  Nitpicky I know, but it helps.

-Eric

> Cheers, Andreas
> --
> Andreas Dilger                       Whamcloud, Inc.
> Principal Lustre Engineer            http://www.whamcloud.com/
> 
> 
> 
> 


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
  2012-04-25 15:05                                 ` Eric Sandeen
@ 2012-04-25 15:12                                   ` Bernd Schubert
  2012-04-25 15:36                                     ` Eric Sandeen
  0 siblings, 1 reply; 81+ messages in thread
From: Bernd Schubert @ 2012-04-25 15:12 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Andreas Dilger, linux-ext4, Fan Yong, bfields

On 04/25/2012 05:05 PM, Eric Sandeen wrote:
> On 4/24/12 5:24 PM, Andreas Dilger wrote:
>> On 2012-04-24, at 4:07 PM, Bernd Schubert wrote:
>>>> 1) For SEEK_END, we now return -EINVAL for a positive offset (i.e. past EOF)
>>>
>>> I definitely introduces that one, as I cannot see how an application
>>> might ever run into it. Especially as ext4 directories cannot shrink. So
>>> if an application tries to exceed the directory size limit, it looks to
>>> me as some of attempt to break something or as an error in the
>>> application. However, if there should be the slightest chance to break
>>> existing applications relying on that, we need to remove that.
>>
>> I think the other reason to avoid SEEK_END + n is that since SEEK_END
>> for a hash offset is (signed) MAX_LONG, so if one seeks beyond that
>> it will wrap to a negative offset.
>
> Makes sense.
>
> Wishing this had been done as a separate patch, though, since it's really addressing a separate issue from the $SUBJECT, and could have used specific documentation of the change.  Nitpicky I know, but it helps.

Sorry, my fault. Maybe we should simply document it in the code?
And how do we proceed in general. Shall I write a patch to use 
generic_file_llseek() and update that function to take more arguments? I 
don't think that would go into 3.4.

Thanks,
Bernd

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type
  2012-04-25 15:12                                   ` Bernd Schubert
@ 2012-04-25 15:36                                     ` Eric Sandeen
  0 siblings, 0 replies; 81+ messages in thread
From: Eric Sandeen @ 2012-04-25 15:36 UTC (permalink / raw)
  To: Bernd Schubert; +Cc: Andreas Dilger, linux-ext4, Fan Yong, bfields

On 4/25/12 10:12 AM, Bernd Schubert wrote:
> On 04/25/2012 05:05 PM, Eric Sandeen wrote:
>> On 4/24/12 5:24 PM, Andreas Dilger wrote:
>>> On 2012-04-24, at 4:07 PM, Bernd Schubert wrote:
>>>>> 1) For SEEK_END, we now return -EINVAL for a positive offset
>>>>> (i.e. past EOF)
>>>> 
>>>> I definitely introduces that one, as I cannot see how an
>>>> application might ever run into it. Especially as ext4
>>>> directories cannot shrink. So if an application tries to exceed
>>>> the directory size limit, it looks to me as some of attempt to
>>>> break something or as an error in the application. However, if
>>>> there should be the slightest chance to break existing
>>>> applications relying on that, we need to remove that.
>>> 
>>> I think the other reason to avoid SEEK_END + n is that since
>>> SEEK_END for a hash offset is (signed) MAX_LONG, so if one seeks
>>> beyond that it will wrap to a negative offset.
>> 
>> Makes sense.
>> 
>> Wishing this had been done as a separate patch, though, since it's
>> really addressing a separate issue from the $SUBJECT, and could
>> have used specific documentation of the change.  Nitpicky I know,
>> but it helps.
> 
> Sorry, my fault. 

No worries, I should have reviewed sooner too :)

> Maybe we should simply document it in the code? And
> how do we proceed in general. Shall I write a patch to use
> generic_file_llseek() and update that function to take more
> arguments? I don't think that would go into 3.4.

Unless there is obviously _wrong_ behavior to be fixed I don't think it's needed for 3.4.

We could maybe do one patch to make it lockless again, if there's good confidence in that, since it's sort of a "regression."

Trying to munge things into the upstream seek function would be post-3.4, I'm sure, if it turns out it can be done at all.

Thanks,
-Eric

> Thanks, Bernd


^ permalink raw reply	[flat|nested] 81+ messages in thread

end of thread, other threads:[~2012-04-25 15:36 UTC | newest]

Thread overview: 81+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-01-09 13:21 [PATCH 0/4] [RESEND] 32/64 bit llseek hashes (v5) Bernd Schubert
2012-01-09 13:21 ` [PATCH 5 1/4] Add new FMODE flags: FMODE_32bithash and FMODE_64bithash Bernd Schubert
2012-01-09 13:21 ` [PATCH 5 2/4] Return 32/64-bit dir name hash according to usage type Bernd Schubert
2012-03-05 15:59   ` Ted Ts'o
     [not found]     ` <20120305155939.GE21356-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org>
2012-03-06  0:40       ` Bernd Schubert
2012-03-06  0:40         ` Bernd Schubert
2012-03-06  2:28         ` Ted Ts'o
     [not found]           ` <20120306022838.GA24323-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org>
2012-03-06  9:59             ` Bernd Schubert
2012-03-06  9:59               ` Bernd Schubert
     [not found]               ` <4F55E01B.3060105-mPn0NPGs4xGatNDF+KUbs4QuADTiUCJX@public.gmane.org>
2012-03-06 15:15                 ` Ted Ts'o
2012-03-06 15:15                   ` Ted Ts'o
     [not found]                   ` <20120306151543.GA32282-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org>
2012-03-07  9:01                     ` Bernd Schubert
2012-03-07  9:01                       ` Bernd Schubert
     [not found]   ` <20120109132148.2616029.68798.stgit-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2012-04-20 20:04     ` Eric Sandeen
2012-04-20 20:04       ` Eric Sandeen
     [not found]       ` <4F91C15B.6070200-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2012-04-22 12:51         ` Bernd Schubert
2012-04-22 12:51           ` Bernd Schubert
2012-04-23 20:37           ` Eric Sandeen
     [not found]             ` <4F95BD72.6090200-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2012-04-23 20:52               ` Bernd Schubert
2012-04-23 20:52                 ` Bernd Schubert
2012-04-23 21:22                 ` Eric Sandeen
     [not found]                 ` <4F95C109.1030401-mPn0NPGs4xGatNDF+KUbs4QuADTiUCJX@public.gmane.org>
2012-04-23 22:23                   ` Eric Sandeen
2012-04-23 22:23                     ` Eric Sandeen
2012-04-23 22:42                     ` Andreas Dilger
     [not found]                       ` <A754D23B-B946-4E80-ACEA-0E2C2E6FAA2E-KloliPT79xf2eFz/2MeuCQ@public.gmane.org>
2012-04-24 16:10                         ` Bernd Schubert
2012-04-24 16:10                           ` Bernd Schubert
2012-04-24 19:21                           ` Eric Sandeen
2012-04-24 21:07                             ` Bernd Schubert
2012-04-24 22:24                               ` Andreas Dilger
2012-04-25 15:05                                 ` Eric Sandeen
2012-04-25 15:12                                   ` Bernd Schubert
2012-04-25 15:36                                     ` Eric Sandeen
2012-04-24 21:28                             ` Andreas Dilger
2012-04-24 21:26                               ` Eric Sandeen
2012-01-09 13:21 ` [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open() Bernd Schubert
2012-01-09 13:21 ` [PATCH 5 4/4] nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes) Bernd Schubert
     [not found]   ` <20120109132158.2616029.30467.stgit-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
     [not found]     ` <20120109132153.2616029.26302.stgit-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2012-03-06  0:08       ` [PATCH 5 3/4] nfsd_open(): rename 'int access' to 'int may_flags' in nfsd_open() Ted Ts'o
2012-03-06  0:08         ` Ted Ts'o
     [not found]         ` <20120306000837.GA17164-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org>
2012-03-06  2:08           ` J. Bruce Fields
2012-03-06  2:08             ` J. Bruce Fields
2012-03-06 15:18             ` Ted Ts'o
2012-03-06 15:28               ` J. Bruce Fields
2012-03-09 20:51                 ` Ted Ts'o
     [not found]                   ` <20120309205148.GB5635-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org>
2012-03-12 15:09                     ` Ted Ts'o
2012-03-12 15:09                       ` Ted Ts'o
     [not found]                       ` <20120312150912.GB12440-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org>
2012-03-12 15:49                         ` J. Bruce Fields
2012-03-12 15:49                           ` J. Bruce Fields
2012-03-12 22:22                           ` J. Bruce Fields
2012-03-13 20:01                             ` J. Bruce Fields
     [not found]                               ` <20120313200117.GA21991-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2012-03-13 20:03                                 ` Bernd Schubert
2012-03-13 20:03                                   ` Bernd Schubert
     [not found]                                   ` <4F5FA827.8020606-mPn0NPGs4xGatNDF+KUbs4QuADTiUCJX@public.gmane.org>
2012-03-13 20:34                                     ` J. Bruce Fields
2012-03-13 20:34                                       ` J. Bruce Fields
2012-03-13 21:09                                       ` Bernd Schubert
2012-03-13 21:29                                         ` J. Bruce Fields
     [not found]                                           ` <20120313212947.GK31995-spRCxval1Z7TsXDwO4sDpg@public.gmane.org>
2012-03-14 14:32                                             ` Bernd Schubert
2012-03-14 14:32                                               ` Bernd Schubert
     [not found]                                               ` <4F60AC0D.9020204-mPn0NPGs4xGatNDF+KUbs4QuADTiUCJX@public.gmane.org>
2012-03-14 16:05                                                 ` J. Bruce Fields
2012-03-14 16:05                                                   ` J. Bruce Fields
     [not found]                                                   ` <20120314160529.GB31194-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2012-03-16 21:22                                                     ` Bernd Schubert
2012-03-16 21:22                                                       ` Bernd Schubert
2012-03-19  2:54                                                       ` Ted Ts'o
     [not found]                                                         ` <20120319025455.GD31682-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org>
2012-03-19 20:00                                                           ` J. Bruce Fields
2012-03-19 20:00                                                             ` J. Bruce Fields
     [not found]                                                             ` <20120319200041.GA25161-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2012-03-20  0:10                                                               ` Ted Ts'o
2012-03-20  0:10                                                                 ` Ted Ts'o
2012-04-12 20:49                                                 ` J. Bruce Fields
2012-04-12 20:49                                                   ` J. Bruce Fields
     [not found]                                                   ` <20120412204948.GE6667-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2012-04-12 21:22                                                     ` Bernd Schubert
2012-04-12 21:22                                                       ` Bernd Schubert
     [not found]                                                       ` <4F8747A1.8060800-97jfqw80gc6171pxa8y+qA@public.gmane.org>
2012-04-12 21:25                                                         ` J. Bruce Fields
2012-04-12 21:25                                                           ` J. Bruce Fields
     [not found]                                       ` <20120313203446.GB21991-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2012-03-13 21:10                                         ` Ted Ts'o
2012-03-13 21:10                                           ` Ted Ts'o
     [not found]                                           ` <20120313211009.GA11969-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org>
2012-03-13 21:27                                             ` J. Bruce Fields
2012-03-13 21:27                                               ` J. Bruce Fields
2012-01-10 11:27 ` [PATCH 0/4] [RESEND] 32/64 bit llseek hashes (v5) Andreas Dilger
2012-01-11 14:48   ` J. Bruce Fields
     [not found]     ` <20120111144827.GA32381-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2012-01-11 15:31       ` Ted Ts'o
2012-01-11 15:31         ` Ted Ts'o
2012-03-05 12:23         ` Bernd Schubert

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.