linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC 0/4] 9p: fix open-unlink-f*syscall bug
@ 2020-09-14  3:37 Jianyong Wu
  2020-09-14  3:37 ` [PATCH RFC 1/4] fs/9p: fix create-unlink-getattr idiom Jianyong Wu
                   ` (4 more replies)
  0 siblings, 5 replies; 24+ messages in thread
From: Jianyong Wu @ 2020-09-14  3:37 UTC (permalink / raw)
  To: ericvh, lucho, asmadeus, v9fs-developer
  Cc: linux-kernel, justin.he, jianyong.wu

open-unlink-f*syscall bug is a well-known bug in 9p, we try to fix the bug
in this patch set.
I take Eric's and Greg's patches which constiute the 1/4 - 3/4 of this patch
set as the main frame of the solution. In patch 4/4, I fix the fid race issue
exists in Greg's patch.

Eric Van Hensbergen (1):
  fs/9p: fix create-unlink-getattr idiom

Greg Kurz (1):
  fs/9p: search open fids first

Jianyong Wu (2):
  fs/9p: track open fids
  9p: fix race issue in fid contention.

 fs/9p/fid.c             | 72 +++++++++++++++++++++++++++++++++++------
 fs/9p/fid.h             | 25 +++++++++++---
 fs/9p/vfs_dentry.c      |  2 +-
 fs/9p/vfs_dir.c         | 20 ++++++++++--
 fs/9p/vfs_file.c        |  3 +-
 fs/9p/vfs_inode.c       | 52 +++++++++++++++++++++--------
 fs/9p/vfs_inode_dotl.c  | 44 ++++++++++++++++---------
 fs/9p/vfs_super.c       |  7 ++--
 fs/9p/xattr.c           | 18 ++++++++---
 include/net/9p/client.h |  8 +++++
 net/9p/client.c         |  7 +++-
 11 files changed, 206 insertions(+), 52 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH RFC 1/4] fs/9p: fix create-unlink-getattr idiom
  2020-09-14  3:37 [PATCH RFC 0/4] 9p: fix open-unlink-f*syscall bug Jianyong Wu
@ 2020-09-14  3:37 ` Jianyong Wu
  2020-09-14  6:00   ` Dominique Martinet
  2020-09-14  3:37 ` [PATCH RFC 2/4] fs/9p: track open fids Jianyong Wu
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 24+ messages in thread
From: Jianyong Wu @ 2020-09-14  3:37 UTC (permalink / raw)
  To: ericvh, lucho, asmadeus, v9fs-developer
  Cc: linux-kernel, justin.he, jianyong.wu, Greg Kurz

From: Eric Van Hensbergen <ericvh@gmail.com>

Fixes several outstanding bug reports of not being able to getattr from an
open file after an unlink.  This patch cleans up transient fids on an unlink
and will search open fids on a client if it detects a dentry that appears to
have been unlinked.  This search is necessary because fstat does not pass fd
information through the VFS API to the filesystem, only the dentry which for
9p has an imperfect match to fids.

Inherent in this patch is also a fix for the qid handling on create/open
which apparently wasn't being set correctly and was necessary for the search
to succeed.

A possible optimization over this fix is to include accounting of open fids
with the inode in the private data (in a similar fashion to the way we track
transient fids with dentries).  This would allow a much quicker search for
a matching open fid.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
(changed v9fs_fid_find_global to v9fs_fid_find_inode in comment)
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>

Change-Id: Ifd5c8cdca8b40216e3e7d021eb6d0afd750096e7
---
 fs/9p/fid.c       | 30 ++++++++++++++++++++++++++++++
 fs/9p/vfs_inode.c |  4 ++++
 net/9p/client.c   |  5 ++++-
 3 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/fs/9p/fid.c b/fs/9p/fid.c
index 3d681a2c2731..3304984c0fad 100644
--- a/fs/9p/fid.c
+++ b/fs/9p/fid.c
@@ -38,6 +38,33 @@ void v9fs_fid_add(struct dentry *dentry, struct p9_fid *fid)
 	spin_unlock(&dentry->d_lock);
 }
 
+/**
+ * v9fs_fid_find_inode - search for a fid off of the client list
+ * @inode: return a fid pointing to a specific inode
+ * @uid: return a fid belonging to the specified user
+ *
+ */
+
+static struct p9_fid *v9fs_fid_find_inode(struct inode *inode, kuid_t uid)
+{
+	struct p9_client *clnt = v9fs_inode2v9ses(inode)->clnt;
+	struct p9_fid *fid, *fidptr, *ret = NULL;
+	unsigned long flags;
+
+	p9_debug(P9_DEBUG_VFS, " inode: %p\n", inode);
+
+	spin_lock_irqsave(&clnt->lock, flags);
+	list_for_each_entry_safe(fid, fidptr, &clnt->fidlist, flist) {
+		if (uid_eq(fid->uid, uid) &&
+		   (inode->i_ino == v9fs_qid2ino(&fid->qid))) {
+			ret = fid;
+			break;
+		}
+	}
+	spin_unlock_irqrestore(&clnt->lock, flags);
+	return ret;
+}
+
 /**
  * v9fs_fid_find - retrieve a fid that belongs to the specified uid
  * @dentry: dentry to look for fid in
@@ -65,6 +92,9 @@ static struct p9_fid *v9fs_fid_find(struct dentry *dentry, kuid_t uid, int any)
 			}
 		}
 		spin_unlock(&dentry->d_lock);
+	} else {
+		if (dentry->d_inode)
+			ret = v9fs_fid_find_inode(dentry->d_inode, uid);
 	}
 
 	return ret;
diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index ae0c38ad1fcb..31c2fddabb82 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -570,6 +570,10 @@ static int v9fs_remove(struct inode *dir, struct dentry *dentry, int flags)
 
 		v9fs_invalidate_inode_attr(inode);
 		v9fs_invalidate_inode_attr(dir);
+
+		/* invalidate all fids associated with dentry */
+		/* NOTE: This will not include open fids */
+		dentry->d_op->d_release(dentry);
 	}
 	return retval;
 }
diff --git a/net/9p/client.c b/net/9p/client.c
index 09f1ec589b80..1a3f72bf45fc 100644
--- a/net/9p/client.c
+++ b/net/9p/client.c
@@ -1219,7 +1219,7 @@ struct p9_fid *p9_client_walk(struct p9_fid *oldfid, uint16_t nwname,
 	if (nwname)
 		memmove(&fid->qid, &wqids[nwqids - 1], sizeof(struct p9_qid));
 	else
-		fid->qid = oldfid->qid;
+		memmove(&fid->qid, &oldfid->qid, sizeof(struct p9_qid));
 
 	kfree(wqids);
 	return fid;
@@ -1272,6 +1272,7 @@ int p9_client_open(struct p9_fid *fid, int mode)
 		p9_is_proto_dotl(clnt) ? "RLOPEN" : "ROPEN",  qid.type,
 		(unsigned long long)qid.path, qid.version, iounit);
 
+	memmove(&fid->qid, &qid, sizeof(struct p9_qid));
 	fid->mode = mode;
 	fid->iounit = iounit;
 
@@ -1317,6 +1318,7 @@ int p9_client_create_dotl(struct p9_fid *ofid, const char *name, u32 flags, u32
 			(unsigned long long)qid->path,
 			qid->version, iounit);
 
+	memmove(&ofid->qid, qid, sizeof(struct p9_qid));
 	ofid->mode = mode;
 	ofid->iounit = iounit;
 
@@ -1362,6 +1364,7 @@ int p9_client_fcreate(struct p9_fid *fid, const char *name, u32 perm, int mode,
 				(unsigned long long)qid.path,
 				qid.version, iounit);
 
+	memmove(&fid->qid, &qid, sizeof(struct p9_qid));
 	fid->mode = mode;
 	fid->iounit = iounit;
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH RFC 2/4] fs/9p: track open fids
  2020-09-14  3:37 [PATCH RFC 0/4] 9p: fix open-unlink-f*syscall bug Jianyong Wu
  2020-09-14  3:37 ` [PATCH RFC 1/4] fs/9p: fix create-unlink-getattr idiom Jianyong Wu
@ 2020-09-14  3:37 ` Jianyong Wu
  2020-09-14  3:37 ` [PATCH RFC 3/4] fs/9p: search open fids first Jianyong Wu
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 24+ messages in thread
From: Jianyong Wu @ 2020-09-14  3:37 UTC (permalink / raw)
  To: ericvh, lucho, asmadeus, v9fs-developer
  Cc: linux-kernel, justin.he, jianyong.wu, Greg Kurz

From: Greg Kurz <gkurz@linux.vnet.ibm.com>

This patch adds accounting of open fids in a list hanging off the i_private
field of the corresponding inode. This allows faster lookups compared to
searching the full 9p client list.

The lookup code is modified accordingly.

Change-Id: I9f1b99d9c7ab36a15534726cce034a8a1443d680
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
---
 fs/9p/fid.c             | 32 +++++++++++++++++++++++---------
 fs/9p/fid.h             |  1 +
 fs/9p/vfs_dir.c         |  3 +++
 fs/9p/vfs_file.c        |  1 +
 fs/9p/vfs_inode.c       |  6 +++++-
 fs/9p/vfs_inode_dotl.c  |  1 +
 include/net/9p/client.h |  1 +
 7 files changed, 35 insertions(+), 10 deletions(-)

diff --git a/fs/9p/fid.c b/fs/9p/fid.c
index 3304984c0fad..d11dd430590d 100644
--- a/fs/9p/fid.c
+++ b/fs/9p/fid.c
@@ -39,7 +39,7 @@ void v9fs_fid_add(struct dentry *dentry, struct p9_fid *fid)
 }
 
 /**
- * v9fs_fid_find_inode - search for a fid off of the client list
+ * v9fs_fid_find_inode - search for an open fid off of the inode list
  * @inode: return a fid pointing to a specific inode
  * @uid: return a fid belonging to the specified user
  *
@@ -47,24 +47,38 @@ void v9fs_fid_add(struct dentry *dentry, struct p9_fid *fid)
 
 static struct p9_fid *v9fs_fid_find_inode(struct inode *inode, kuid_t uid)
 {
-	struct p9_client *clnt = v9fs_inode2v9ses(inode)->clnt;
-	struct p9_fid *fid, *fidptr, *ret = NULL;
-	unsigned long flags;
+	struct hlist_head *h;
+	struct p9_fid *fid, *ret = NULL;
 
 	p9_debug(P9_DEBUG_VFS, " inode: %p\n", inode);
 
-	spin_lock_irqsave(&clnt->lock, flags);
-	list_for_each_entry_safe(fid, fidptr, &clnt->fidlist, flist) {
-		if (uid_eq(fid->uid, uid) &&
-		   (inode->i_ino == v9fs_qid2ino(&fid->qid))) {
+	spin_lock(&inode->i_lock);
+	h = (struct hlist_head *)&inode->i_private;
+	hlist_for_each_entry(fid, h, ilist) {
+		if (uid_eq(fid->uid, uid)) {
 			ret = fid;
 			break;
 		}
 	}
-	spin_unlock_irqrestore(&clnt->lock, flags);
+	spin_unlock(&inode->i_lock);
 	return ret;
 }
 
+/**
+ * v9fs_open_fid_add - add an open fid to an inode
+ * @dentry: inode that the fid is being added to
+ * @fid: fid to add
+ *
+ */
+
+void v9fs_open_fid_add(struct inode *inode, struct p9_fid *fid)
+{
+	spin_lock(&inode->i_lock);
+	hlist_add_head(&fid->ilist, (struct hlist_head *)&inode->i_private);
+	spin_unlock(&inode->i_lock);
+}
+
+
 /**
  * v9fs_fid_find - retrieve a fid that belongs to the specified uid
  * @dentry: dentry to look for fid in
diff --git a/fs/9p/fid.h b/fs/9p/fid.h
index 928b1093f511..dfa11df02818 100644
--- a/fs/9p/fid.h
+++ b/fs/9p/fid.h
@@ -15,6 +15,7 @@ static inline struct p9_fid *v9fs_parent_fid(struct dentry *dentry)
 }
 void v9fs_fid_add(struct dentry *dentry, struct p9_fid *fid);
 struct p9_fid *v9fs_writeback_fid(struct dentry *dentry);
+void v9fs_open_fid_add(struct inode *inode, struct p9_fid *fid);
 static inline struct p9_fid *clone_fid(struct p9_fid *fid)
 {
 	return IS_ERR(fid) ? fid :  p9_client_walk(fid, 0, NULL, 1);
diff --git a/fs/9p/vfs_dir.c b/fs/9p/vfs_dir.c
index 674d22bf4f6f..d82d8a346f86 100644
--- a/fs/9p/vfs_dir.c
+++ b/fs/9p/vfs_dir.c
@@ -210,6 +210,9 @@ int v9fs_dir_release(struct inode *inode, struct file *filp)
 	fid = filp->private_data;
 	p9_debug(P9_DEBUG_VFS, "inode: %p filp: %p fid: %d\n",
 		 inode, filp, fid ? fid->fid : -1);
+	spin_lock(&inode->i_lock);
+	hlist_del(&fid->ilist);
+	spin_unlock(&inode->i_lock);
 	if (fid)
 		p9_client_clunk(fid);
 	return 0;
diff --git a/fs/9p/vfs_file.c b/fs/9p/vfs_file.c
index 92cd1d80218d..b42cc1752cd1 100644
--- a/fs/9p/vfs_file.c
+++ b/fs/9p/vfs_file.c
@@ -96,6 +96,7 @@ int v9fs_file_open(struct inode *inode, struct file *file)
 	mutex_unlock(&v9inode->v_mutex);
 	if (v9ses->cache == CACHE_LOOSE || v9ses->cache == CACHE_FSCACHE)
 		v9fs_cache_inode_set_cookie(inode, file);
+	v9fs_open_fid_add(inode, fid);
 	return 0;
 out_error:
 	p9_client_clunk(file->private_data);
diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 31c2fddabb82..6b243ffcbcf0 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -256,6 +256,7 @@ int v9fs_init_inode(struct v9fs_session_info *v9ses,
 	inode->i_rdev = rdev;
 	inode->i_atime = inode->i_mtime = inode->i_ctime = current_time(inode);
 	inode->i_mapping->a_ops = &v9fs_addr_operations;
+	inode->i_private = NULL;
 
 	switch (mode & S_IFMT) {
 	case S_IFIFO:
@@ -796,6 +797,7 @@ v9fs_vfs_atomic_open(struct inode *dir, struct dentry *dentry,
 	struct v9fs_session_info *v9ses;
 	struct p9_fid *fid, *inode_fid;
 	struct dentry *res = NULL;
+	struct inode *inode;
 
 	if (d_in_lookup(dentry)) {
 		res = v9fs_vfs_lookup(dir, dentry, 0);
@@ -824,7 +826,8 @@ v9fs_vfs_atomic_open(struct inode *dir, struct dentry *dentry,
 	}
 
 	v9fs_invalidate_inode_attr(dir);
-	v9inode = V9FS_I(d_inode(dentry));
+	inode = d_inode(dentry);
+	v9inode = V9FS_I(inode);
 	mutex_lock(&v9inode->v_mutex);
 	if ((v9ses->cache == CACHE_LOOSE || v9ses->cache == CACHE_FSCACHE) &&
 	    !v9inode->writeback_fid &&
@@ -852,6 +855,7 @@ v9fs_vfs_atomic_open(struct inode *dir, struct dentry *dentry,
 	file->private_data = fid;
 	if (v9ses->cache == CACHE_LOOSE || v9ses->cache == CACHE_FSCACHE)
 		v9fs_cache_inode_set_cookie(d_inode(dentry), file);
+	v9fs_open_fid_add(inode, fid);
 
 	file->f_mode |= FMODE_CREATED;
 out:
diff --git a/fs/9p/vfs_inode_dotl.c b/fs/9p/vfs_inode_dotl.c
index 0028eccb665a..08f2e089fb0e 100644
--- a/fs/9p/vfs_inode_dotl.c
+++ b/fs/9p/vfs_inode_dotl.c
@@ -342,6 +342,7 @@ v9fs_vfs_atomic_open_dotl(struct inode *dir, struct dentry *dentry,
 	file->private_data = ofid;
 	if (v9ses->cache == CACHE_LOOSE || v9ses->cache == CACHE_FSCACHE)
 		v9fs_cache_inode_set_cookie(inode, file);
+	v9fs_open_fid_add(inode, ofid);
 	file->f_mode |= FMODE_CREATED;
 out:
 	v9fs_put_acl(dacl, pacl);
diff --git a/include/net/9p/client.h b/include/net/9p/client.h
index dd5b5bd781a4..ce7882da8e86 100644
--- a/include/net/9p/client.h
+++ b/include/net/9p/client.h
@@ -152,6 +152,7 @@ struct p9_fid {
 	void *rdir;
 
 	struct hlist_node dlist;	/* list of all fids attached to a dentry */
+	struct hlist_node ilist;
 };
 
 /**
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH RFC 3/4] fs/9p: search open fids first
  2020-09-14  3:37 [PATCH RFC 0/4] 9p: fix open-unlink-f*syscall bug Jianyong Wu
  2020-09-14  3:37 ` [PATCH RFC 1/4] fs/9p: fix create-unlink-getattr idiom Jianyong Wu
  2020-09-14  3:37 ` [PATCH RFC 2/4] fs/9p: track open fids Jianyong Wu
@ 2020-09-14  3:37 ` Jianyong Wu
  2020-09-14  3:37 ` [PATCH RFC 4/4] 9p: fix race issue in fid contention Jianyong Wu
  2020-09-14  8:35 ` [V9fs-developer] [PATCH RFC 0/4] 9p: fix open-unlink-f*syscall bug Greg Kurz
  4 siblings, 0 replies; 24+ messages in thread
From: Jianyong Wu @ 2020-09-14  3:37 UTC (permalink / raw)
  To: ericvh, lucho, asmadeus, v9fs-developer
  Cc: linux-kernel, justin.he, jianyong.wu, Greg Kurz

From: Greg Kurz <groug@kaod.org>

A previous patch fixed the "create-unlink-getattr" idiom: if getattr is
called on an unlinked file, we try to find an open fid attached to the
corresponding inode.

We have a similar issue with file permissions and setattr:

open("./test.txt", O_RDWR|O_CREAT, 0666) = 4
chmod("./test.txt", 0)                  = 0
truncate("./test.txt", 0)               = -1 EACCES (Permission denied)
ftruncate(4, 0)                         = -1 EACCES (Permission denied)

The failure is expected with truncate() but not with ftruncate().

This happens because the lookup code does find a matching fid in the
dentry list. Unfortunately, this is not an open fid and the server
will be forced to rely on the path name, rather than on an open file
descriptor. This is the case in QEMU: the setattr operation will use
truncate() and fail because of bad write permissions.

This patch changes the logic in the lookup code, so that we consider
open fids first. It gives a chance to the server to match this open
fid to an open file descriptor and use ftruncate() instead of truncate().
This does not change the current behaviour for truncate() and other
path name based syscalls, since file permissions are checked earlier
in the VFS layer.

With this patch, we get:

open("./test.txt", O_RDWR|O_CREAT, 0666) = 4
chmod("./test.txt", 0)                  = 0
truncate("./test.txt", 0)               = -1 EACCES (Permission denied)
ftruncate(4, 0)                         = 0

Change-Id: Icb657359493fc9c06956881551e83c7e1af4f024
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
---
 fs/9p/fid.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/9p/fid.c b/fs/9p/fid.c
index d11dd430590d..0b23b0fe6c51 100644
--- a/fs/9p/fid.c
+++ b/fs/9p/fid.c
@@ -95,8 +95,12 @@ static struct p9_fid *v9fs_fid_find(struct dentry *dentry, kuid_t uid, int any)
 		 dentry, dentry, from_kuid(&init_user_ns, uid),
 		 any);
 	ret = NULL;
+
+	if (d_inode(dentry))
+		ret = v9fs_fid_find_inode(d_inode(dentry), uid);
+
 	/* we'll recheck under lock if there's anything to look in */
-	if (dentry->d_fsdata) {
+	if (!ret && dentry->d_fsdata) {
 		struct hlist_head *h = (struct hlist_head *)&dentry->d_fsdata;
 		spin_lock(&dentry->d_lock);
 		hlist_for_each_entry(fid, h, dlist) {
@@ -106,9 +110,6 @@ static struct p9_fid *v9fs_fid_find(struct dentry *dentry, kuid_t uid, int any)
 			}
 		}
 		spin_unlock(&dentry->d_lock);
-	} else {
-		if (dentry->d_inode)
-			ret = v9fs_fid_find_inode(dentry->d_inode, uid);
 	}
 
 	return ret;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH RFC 4/4] 9p: fix race issue in fid contention.
  2020-09-14  3:37 [PATCH RFC 0/4] 9p: fix open-unlink-f*syscall bug Jianyong Wu
                   ` (2 preceding siblings ...)
  2020-09-14  3:37 ` [PATCH RFC 3/4] fs/9p: search open fids first Jianyong Wu
@ 2020-09-14  3:37 ` Jianyong Wu
  2020-09-14  5:55   ` Dominique Martinet
  2020-09-14  8:35 ` [V9fs-developer] [PATCH RFC 0/4] 9p: fix open-unlink-f*syscall bug Greg Kurz
  4 siblings, 1 reply; 24+ messages in thread
From: Jianyong Wu @ 2020-09-14  3:37 UTC (permalink / raw)
  To: ericvh, lucho, asmadeus, v9fs-developer
  Cc: linux-kernel, justin.he, jianyong.wu

Eric's and Greg's patch offer a mechanism to fix open-unlink-f*syscall
bug in 9p. But there is race issue in fid comtention.
As Greg's patch stores all of fids from opened files into according inode,
so all the lookup fid ops can retrieve fid from inode preferentially. But
there is no mechanism to handle the fid comtention issue. For example,
there are two threads get the same fid in the same time and one of them
clunk the fid before the other thread ready to discard the fid. In this
scenario, it will lead to some fatal problems, even kernel core dump.

I introduce a mechanism to fix this race issue. A counter field introduced
into p9_fid struct to store the reference counter to the fid. When a fid
is allocated from the inode, the counter will increase, and will decrease
at the end of its occupation. It is guaranteed that the fid won't be clunked
before the reference counter go down to 0, then we can avoid the clunked
fid to be used.
As there is no need to retrieve fid from inode in all conditions, a enum value
denotes the source of the fid is introduced to 9p_fid either. So we can only
handle the reference counter as to the fid obtained from inode.

tests:
race issue test from the old test case:
for file in {01..50}; do touch f.${file}; done
seq 1 1000 | xargs -n 1 -P 50 -I{} cat f.* > /dev/null

open-unlink-f*syscall test:
I have tested for f*syscall include: ftruncate fstat fchown fchmod faccessat.

Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
---
 fs/9p/fid.c             | 27 +++++++++++++++++---------
 fs/9p/fid.h             | 24 +++++++++++++++++++----
 fs/9p/vfs_dentry.c      |  2 +-
 fs/9p/vfs_dir.c         | 23 +++++++++++++++++-----
 fs/9p/vfs_file.c        |  2 +-
 fs/9p/vfs_inode.c       | 42 ++++++++++++++++++++++++++++------------
 fs/9p/vfs_inode_dotl.c  | 43 +++++++++++++++++++++++++++--------------
 fs/9p/vfs_super.c       |  7 +++++--
 fs/9p/xattr.c           | 18 +++++++++++++----
 include/net/9p/client.h |  7 +++++++
 net/9p/client.c         |  2 ++
 11 files changed, 144 insertions(+), 53 deletions(-)

diff --git a/fs/9p/fid.c b/fs/9p/fid.c
index 0b23b0fe6c51..fd8cfa4776c9 100644
--- a/fs/9p/fid.c
+++ b/fs/9p/fid.c
@@ -60,6 +60,10 @@ static struct p9_fid *v9fs_fid_find_inode(struct inode *inode, kuid_t uid)
 			break;
 		}
 	}
+	if (ret && !IS_ERR(ret)) {
+		atomic_inc(&ret->count);
+		ret->s = FID_FROM_INODE;
+	}
 	spin_unlock(&inode->i_lock);
 	return ret;
 }
@@ -84,10 +88,13 @@ void v9fs_open_fid_add(struct inode *inode, struct p9_fid *fid)
  * @dentry: dentry to look for fid in
  * @uid: return fid that belongs to the specified user
  * @any: if non-zero, return any fid associated with the dentry
+ * @source: from which of inode or dentry caller want to retrieve fid, 0
+ * denotes dentry and other denotes inode. Only if the f* syscall related
+ * funcs are recommended to set to non-zero.
  *
  */
 
-static struct p9_fid *v9fs_fid_find(struct dentry *dentry, kuid_t uid, int any)
+static struct p9_fid *v9fs_fid_find(struct dentry *dentry, kuid_t uid, int any, int source)
 {
 	struct p9_fid *fid, *ret;
 
@@ -96,7 +103,7 @@ static struct p9_fid *v9fs_fid_find(struct dentry *dentry, kuid_t uid, int any)
 		 any);
 	ret = NULL;
 
-	if (d_inode(dentry))
+	if (d_inode(dentry) && source)
 		ret = v9fs_fid_find_inode(d_inode(dentry), uid);
 
 	/* we'll recheck under lock if there's anything to look in */
@@ -109,6 +116,8 @@ static struct p9_fid *v9fs_fid_find(struct dentry *dentry, kuid_t uid, int any)
 				break;
 			}
 		}
+		if (ret && !IS_ERR(ret))
+			ret->s = FID_FROM_DENTRY;
 		spin_unlock(&dentry->d_lock);
 	}
 
@@ -144,7 +153,7 @@ static int build_path_from_dentry(struct v9fs_session_info *v9ses,
 }
 
 static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
-					       kuid_t uid, int any)
+					       kuid_t uid, int any, int source)
 {
 	struct dentry *ds;
 	const unsigned char **wnames, *uname;
@@ -154,7 +163,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
 
 	v9ses = v9fs_dentry2v9ses(dentry);
 	access = v9ses->flags & V9FS_ACCESS_MASK;
-	fid = v9fs_fid_find(dentry, uid, any);
+	fid = v9fs_fid_find(dentry, uid, any, source);
 	if (fid)
 		return fid;
 	/*
@@ -164,7 +173,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
 	 */
 	down_read(&v9ses->rename_sem);
 	ds = dentry->d_parent;
-	fid = v9fs_fid_find(ds, uid, any);
+	fid = v9fs_fid_find(ds, uid, any, 0);
 	if (fid) {
 		/* Found the parent fid do a lookup with that */
 		fid = p9_client_walk(fid, 1, &dentry->d_name.name, 1);
@@ -173,7 +182,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
 	up_read(&v9ses->rename_sem);
 
 	/* start from the root and try to do a lookup */
-	fid = v9fs_fid_find(dentry->d_sb->s_root, uid, any);
+	fid = v9fs_fid_find(dentry->d_sb->s_root, uid, any, 0);
 	if (!fid) {
 		/* the user is not attached to the fs yet */
 		if (access == V9FS_ACCESS_SINGLE)
@@ -258,7 +267,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
  * the fs yet, attach now and walk from the root.
  */
 
-struct p9_fid *v9fs_fid_lookup(struct dentry *dentry)
+struct p9_fid *v9fs_fid_lookup(struct dentry *dentry, int source)
 {
 	kuid_t uid;
 	int  any, access;
@@ -284,7 +293,7 @@ struct p9_fid *v9fs_fid_lookup(struct dentry *dentry)
 		any = 0;
 		break;
 	}
-	return v9fs_fid_lookup_with_uid(dentry, uid, any);
+	return v9fs_fid_lookup_with_uid(dentry, uid, any, source);
 }
 
 struct p9_fid *v9fs_writeback_fid(struct dentry *dentry)
@@ -292,7 +301,7 @@ struct p9_fid *v9fs_writeback_fid(struct dentry *dentry)
 	int err;
 	struct p9_fid *fid;
 
-	fid = clone_fid(v9fs_fid_lookup_with_uid(dentry, GLOBAL_ROOT_UID, 0));
+	fid = clone_fid(v9fs_fid_lookup_with_uid(dentry, GLOBAL_ROOT_UID, 0, 0));
 	if (IS_ERR(fid))
 		goto error_out;
 	/*
diff --git a/fs/9p/fid.h b/fs/9p/fid.h
index dfa11df02818..bea25fd2b983 100644
--- a/fs/9p/fid.h
+++ b/fs/9p/fid.h
@@ -8,10 +8,26 @@
 #define FS_9P_FID_H
 #include <linux/list.h>
 
-struct p9_fid *v9fs_fid_lookup(struct dentry *dentry);
+struct p9_fid *v9fs_fid_lookup(struct dentry *dentry, int source);
+static inline int fid_atomic_dec(struct p9_fid *fid)
+{
+	if (fid && !IS_ERR(fid) && (fid->s & FID_FROM_INODE)) {
+		if (atomic_sub_return(1, &fid->count) < 0) {
+			pr_err("fid counter should be positive\n");
+			while (atomic_inc_and_test(&fid->count));
+			return -1;
+		}
+	}
+	return 0;
+}
+
 static inline struct p9_fid *v9fs_parent_fid(struct dentry *dentry)
 {
-	return v9fs_fid_lookup(dentry->d_parent);
+	/*
+	 * The "*at syscall" often need parent fd, so let's search
+	 * fid from inode first.
+	 */
+	return v9fs_fid_lookup(dentry->d_parent, 1);
 }
 void v9fs_fid_add(struct dentry *dentry, struct p9_fid *fid);
 struct p9_fid *v9fs_writeback_fid(struct dentry *dentry);
@@ -20,8 +36,8 @@ static inline struct p9_fid *clone_fid(struct p9_fid *fid)
 {
 	return IS_ERR(fid) ? fid :  p9_client_walk(fid, 0, NULL, 1);
 }
-static inline struct p9_fid *v9fs_fid_clone(struct dentry *dentry)
+static inline struct p9_fid *v9fs_fid_clone(struct dentry *dentry, int source)
 {
-	return clone_fid(v9fs_fid_lookup(dentry));
+	return clone_fid(v9fs_fid_lookup(dentry, source));
 }
 #endif
diff --git a/fs/9p/vfs_dentry.c b/fs/9p/vfs_dentry.c
index 7d6f69aefd45..17bb872d7203 100644
--- a/fs/9p/vfs_dentry.c
+++ b/fs/9p/vfs_dentry.c
@@ -76,7 +76,7 @@ static int v9fs_lookup_revalidate(struct dentry *dentry, unsigned int flags)
 	if (v9inode->cache_validity & V9FS_INO_INVALID_ATTR) {
 		int retval;
 		struct v9fs_session_info *v9ses;
-		fid = v9fs_fid_lookup(dentry);
+		fid = v9fs_fid_lookup(dentry, 0);
 		if (IS_ERR(fid))
 			return PTR_ERR(fid);
 
diff --git a/fs/9p/vfs_dir.c b/fs/9p/vfs_dir.c
index d82d8a346f86..f817c6a5fb42 100644
--- a/fs/9p/vfs_dir.c
+++ b/fs/9p/vfs_dir.c
@@ -19,6 +19,7 @@
 #include <linux/idr.h>
 #include <linux/slab.h>
 #include <linux/uio.h>
+#include <linux/delay.h>
 #include <net/9p/9p.h>
 #include <net/9p/client.h>
 
@@ -206,15 +207,27 @@ static int v9fs_dir_readdir_dotl(struct file *file, struct dir_context *ctx)
 int v9fs_dir_release(struct inode *inode, struct file *filp)
 {
 	struct p9_fid *fid;
+	int count;
 
 	fid = filp->private_data;
 	p9_debug(P9_DEBUG_VFS, "inode: %p filp: %p fid: %d\n",
 		 inode, filp, fid ? fid->fid : -1);
-	spin_lock(&inode->i_lock);
-	hlist_del(&fid->ilist);
-	spin_unlock(&inode->i_lock);
-	if (fid)
-		p9_client_clunk(fid);
+	if (fid) {
+		spin_lock(&inode->i_lock);
+		hlist_del(&fid->ilist);
+		spin_unlock(&inode->i_lock);
+		/*
+		 * Here we wait up to 10ms, if the counter won't down to 0
+		 * the fid will be left behind.
+		 */
+		count = 100;
+		while (count > 0 && fid->count.counter > 0) {
+			count--;
+			udelay(100);
+		}
+		if (fid->count.counter <= 0)
+			p9_client_clunk(fid);
+	}
 	return 0;
 }
 
diff --git a/fs/9p/vfs_file.c b/fs/9p/vfs_file.c
index b42cc1752cd1..efc3f5cc1c14 100644
--- a/fs/9p/vfs_file.c
+++ b/fs/9p/vfs_file.c
@@ -59,7 +59,7 @@ int v9fs_file_open(struct inode *inode, struct file *file)
 					v9fs_proto_dotu(v9ses));
 	fid = file->private_data;
 	if (!fid) {
-		fid = v9fs_fid_clone(file_dentry(file));
+		fid = v9fs_fid_clone(file_dentry(file), 0);
 		if (IS_ERR(fid))
 			return PTR_ERR(fid);
 
diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 6b243ffcbcf0..f8718b9b1915 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -551,9 +551,10 @@ static int v9fs_remove(struct inode *dir, struct dentry *dentry, int flags)
 	if (v9fs_proto_dotl(v9ses))
 		retval = p9_client_unlinkat(dfid, dentry->d_name.name,
 					    v9fs_at_to_dotl_flags(flags));
+	fid_atomic_dec(dfid);
 	if (retval == -EOPNOTSUPP) {
 		/* Try the one based on path */
-		v9fid = v9fs_fid_clone(dentry);
+		v9fid = v9fs_fid_clone(dentry, 0);
 		if (IS_ERR(v9fid))
 			return PTR_ERR(v9fid);
 		retval = p9_client_remove(v9fid);
@@ -616,12 +617,14 @@ v9fs_create(struct v9fs_session_info *v9ses, struct inode *dir,
 	if (IS_ERR(ofid)) {
 		err = PTR_ERR(ofid);
 		p9_debug(P9_DEBUG_VFS, "p9_client_walk failed %d\n", err);
+		fid_atomic_dec(dfid);
 		return ERR_PTR(err);
 	}
 
 	err = p9_client_fcreate(ofid, name, perm, mode, extension);
 	if (err < 0) {
 		p9_debug(P9_DEBUG_VFS, "p9_client_fcreate failed %d\n", err);
+		fid_atomic_dec(dfid);
 		goto error;
 	}
 
@@ -633,6 +636,7 @@ v9fs_create(struct v9fs_session_info *v9ses, struct inode *dir,
 			p9_debug(P9_DEBUG_VFS,
 				   "p9_client_walk failed %d\n", err);
 			fid = NULL;
+			fid_atomic_dec(dfid);
 			goto error;
 		}
 		/*
@@ -643,11 +647,13 @@ v9fs_create(struct v9fs_session_info *v9ses, struct inode *dir,
 			err = PTR_ERR(inode);
 			p9_debug(P9_DEBUG_VFS,
 				   "inode creation failed %d\n", err);
+			fid_atomic_dec(dfid);
 			goto error;
 		}
 		v9fs_fid_add(dentry, fid);
 		d_instantiate(dentry, inode);
 	}
+	fid_atomic_dec(dfid);
 	return ofid;
 error:
 	if (ofid)
@@ -760,6 +766,7 @@ struct dentry *v9fs_vfs_lookup(struct inode *dir, struct dentry *dentry,
 	 */
 	name = dentry->d_name.name;
 	fid = p9_client_walk(dfid, 1, &name, 1);
+	fid_atomic_dec(dfid);
 	if (fid == ERR_PTR(-ENOENT))
 		inode = NULL;
 	else if (IS_ERR(fid))
@@ -910,7 +917,7 @@ v9fs_vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 	struct inode *old_inode;
 	struct inode *new_inode;
 	struct v9fs_session_info *v9ses;
-	struct p9_fid *oldfid;
+	struct p9_fid *oldfid, *dfid;
 	struct p9_fid *olddirfid;
 	struct p9_fid *newdirfid;
 	struct p9_wstat wstat;
@@ -923,17 +930,23 @@ v9fs_vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 	old_inode = d_inode(old_dentry);
 	new_inode = d_inode(new_dentry);
 	v9ses = v9fs_inode2v9ses(old_inode);
-	oldfid = v9fs_fid_lookup(old_dentry);
+	oldfid = v9fs_fid_lookup(old_dentry, 1);
 	if (IS_ERR(oldfid))
 		return PTR_ERR(oldfid);
 
-	olddirfid = clone_fid(v9fs_parent_fid(old_dentry));
+	dfid = v9fs_parent_fid(old_dentry);
+	olddirfid = clone_fid(dfid);
+	if (dfid && !IS_ERR(dfid))
+		fid_atomic_dec(dfid);
 	if (IS_ERR(olddirfid)) {
 		retval = PTR_ERR(olddirfid);
 		goto done;
 	}
 
-	newdirfid = clone_fid(v9fs_parent_fid(new_dentry));
+	dfid = v9fs_parent_fid(new_dentry);
+	newdirfid = clone_fid(dfid);
+	if (dfid && !IS_ERR(dfid))
+		fid_atomic_dec(dfid);
 	if (IS_ERR(newdirfid)) {
 		retval = PTR_ERR(newdirfid);
 		goto clunk_olddir;
@@ -990,6 +1003,7 @@ v9fs_vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 	p9_client_clunk(olddirfid);
 
 done:
+	fid_atomic_dec(oldfid);
 	return retval;
 }
 
@@ -1017,11 +1031,12 @@ v9fs_vfs_getattr(const struct path *path, struct kstat *stat,
 		generic_fillattr(d_inode(dentry), stat);
 		return 0;
 	}
-	fid = v9fs_fid_lookup(dentry);
+	fid = v9fs_fid_lookup(dentry, 1);
 	if (IS_ERR(fid))
 		return PTR_ERR(fid);
 
 	st = p9_client_stat(fid);
+	fid_atomic_dec(fid);
 	if (IS_ERR(st))
 		return PTR_ERR(st);
 
@@ -1042,7 +1057,7 @@ v9fs_vfs_getattr(const struct path *path, struct kstat *stat,
 
 static int v9fs_vfs_setattr(struct dentry *dentry, struct iattr *iattr)
 {
-	int retval;
+	int retval, use_dentry = 0;
 	struct v9fs_session_info *v9ses;
 	struct p9_fid *fid = NULL;
 	struct p9_wstat wstat;
@@ -1058,11 +1073,12 @@ static int v9fs_vfs_setattr(struct dentry *dentry, struct iattr *iattr)
 		fid = iattr->ia_file->private_data;
 		WARN_ON(!fid);
 	}
-	if (!fid)
-		fid = v9fs_fid_lookup(dentry);
+	if (!fid) {
+		fid = v9fs_fid_lookup(dentry, 1);
+		use_dentry = 1;
+	}
 	if(IS_ERR(fid))
 		return PTR_ERR(fid);
-
 	v9fs_blank_wstat(&wstat);
 	if (iattr->ia_valid & ATTR_MODE)
 		wstat.mode = unixmode2p9mode(v9ses, iattr->ia_mode);
@@ -1089,6 +1105,8 @@ static int v9fs_vfs_setattr(struct dentry *dentry, struct iattr *iattr)
 		filemap_write_and_wait(d_inode(dentry)->i_mapping);
 
 	retval = p9_client_wstat(fid, &wstat);
+	if (use_dentry)
+		fid_atomic_dec(fid);
 	if (retval < 0)
 		return retval;
 
@@ -1203,7 +1221,7 @@ static const char *v9fs_vfs_get_link(struct dentry *dentry,
 		return ERR_PTR(-ECHILD);
 
 	v9ses = v9fs_dentry2v9ses(dentry);
-	fid = v9fs_fid_lookup(dentry);
+	fid = v9fs_fid_lookup(dentry, 0);
 	p9_debug(P9_DEBUG_VFS, "%pd\n", dentry);
 
 	if (IS_ERR(fid))
@@ -1303,7 +1321,7 @@ v9fs_vfs_link(struct dentry *old_dentry, struct inode *dir,
 	p9_debug(P9_DEBUG_VFS, " %lu,%pd,%pd\n",
 		 dir->i_ino, dentry, old_dentry);
 
-	oldfid = v9fs_fid_clone(old_dentry);
+	oldfid = v9fs_fid_clone(old_dentry, 0);
 	if (IS_ERR(oldfid))
 		return PTR_ERR(oldfid);
 
diff --git a/fs/9p/vfs_inode_dotl.c b/fs/9p/vfs_inode_dotl.c
index 08f2e089fb0e..be68db87ef61 100644
--- a/fs/9p/vfs_inode_dotl.c
+++ b/fs/9p/vfs_inode_dotl.c
@@ -296,6 +296,7 @@ v9fs_vfs_atomic_open_dotl(struct inode *dir, struct dentry *dentry,
 
 	/* instantiate inode and assign the unopened fid to the dentry */
 	fid = p9_client_walk(dfid, 1, &name, 1);
+	fid_atomic_dec(dfid);
 	if (IS_ERR(fid)) {
 		err = PTR_ERR(fid);
 		p9_debug(P9_DEBUG_VFS, "p9_client_walk failed %d\n", err);
@@ -394,7 +395,6 @@ static int v9fs_vfs_mkdir_dotl(struct inode *dir,
 		dfid = NULL;
 		goto error;
 	}
-
 	gid = v9fs_get_fsgid_for_create(dir);
 	mode = omode;
 	/* Update mode based on ACL value */
@@ -452,6 +452,7 @@ static int v9fs_vfs_mkdir_dotl(struct inode *dir,
 	if (fid)
 		p9_client_clunk(fid);
 	v9fs_put_acl(dacl, pacl);
+	fid_atomic_dec(dfid);
 	return err;
 }
 
@@ -470,7 +471,7 @@ v9fs_vfs_getattr_dotl(const struct path *path, struct kstat *stat,
 		generic_fillattr(d_inode(dentry), stat);
 		return 0;
 	}
-	fid = v9fs_fid_lookup(dentry);
+	fid = v9fs_fid_lookup(dentry, 1);
 	if (IS_ERR(fid))
 		return PTR_ERR(fid);
 
@@ -479,6 +480,7 @@ v9fs_vfs_getattr_dotl(const struct path *path, struct kstat *stat,
 	 */
 
 	st = p9_client_getattr_dotl(fid, P9_STATS_ALL);
+	fid_atomic_dec(fid);
 	if (IS_ERR(st))
 		return PTR_ERR(st);
 
@@ -540,7 +542,7 @@ static int v9fs_mapped_iattr_valid(int iattr_valid)
 
 int v9fs_vfs_setattr_dotl(struct dentry *dentry, struct iattr *iattr)
 {
-	int retval;
+	int retval, use_dentry = 0;
 	struct p9_fid *fid = NULL;
 	struct p9_iattr_dotl p9attr;
 	struct inode *inode = d_inode(dentry);
@@ -565,8 +567,10 @@ int v9fs_vfs_setattr_dotl(struct dentry *dentry, struct iattr *iattr)
 		fid = iattr->ia_file->private_data;
 		WARN_ON(!fid);
 	}
-	if (!fid)
-		fid = v9fs_fid_lookup(dentry);
+	if (!fid) {
+		fid = v9fs_fid_lookup(dentry, 1);
+		use_dentry = 1;
+	}
 	if (IS_ERR(fid))
 		return PTR_ERR(fid);
 
@@ -575,8 +579,10 @@ int v9fs_vfs_setattr_dotl(struct dentry *dentry, struct iattr *iattr)
 		filemap_write_and_wait(inode->i_mapping);
 
 	retval = p9_client_setattr(fid, &p9attr);
-	if (retval < 0)
+	if (retval < 0) {
+		fid_atomic_dec(fid);
 		return retval;
+	}
 
 	if ((iattr->ia_valid & ATTR_SIZE) &&
 	    iattr->ia_size != i_size_read(inode))
@@ -588,9 +594,13 @@ int v9fs_vfs_setattr_dotl(struct dentry *dentry, struct iattr *iattr)
 	if (iattr->ia_valid & ATTR_MODE) {
 		/* We also want to update ACL when we update mode bits */
 		retval = v9fs_acl_chmod(inode, fid);
-		if (retval < 0)
+		if (retval < 0) {
+			fid_atomic_dec(fid);
 			return retval;
+		}
 	}
+	if (use_dentry)
+		fid_atomic_dec(fid);
 	return 0;
 }
 
@@ -741,7 +751,7 @@ v9fs_vfs_symlink_dotl(struct inode *dir, struct dentry *dentry,
 error:
 	if (fid)
 		p9_client_clunk(fid);
-
+	fid_atomic_dec(dfid);
 	return err;
 }
 
@@ -769,12 +779,15 @@ v9fs_vfs_link_dotl(struct dentry *old_dentry, struct inode *dir,
 	if (IS_ERR(dfid))
 		return PTR_ERR(dfid);
 
-	oldfid = v9fs_fid_lookup(old_dentry);
-	if (IS_ERR(oldfid))
+	oldfid = v9fs_fid_lookup(old_dentry, 0);
+	if (IS_ERR(oldfid)) {
+		fid_atomic_dec(dfid);
 		return PTR_ERR(oldfid);
+	}
 
 	err = p9_client_link(dfid, oldfid, dentry->d_name.name);
-
+	fid_atomic_dec(dfid);
+	fid_atomic_dec(oldfid);
 	if (err < 0) {
 		p9_debug(P9_DEBUG_VFS, "p9_client_link failed %d\n", err);
 		return err;
@@ -784,10 +797,9 @@ v9fs_vfs_link_dotl(struct dentry *old_dentry, struct inode *dir,
 	if (v9ses->cache == CACHE_LOOSE || v9ses->cache == CACHE_FSCACHE) {
 		/* Get the latest stat info from server. */
 		struct p9_fid *fid;
-		fid = v9fs_fid_lookup(old_dentry);
+		fid = v9fs_fid_lookup(old_dentry, 0);
 		if (IS_ERR(fid))
 			return PTR_ERR(fid);
-
 		v9fs_refresh_inode_dotl(fid, d_inode(old_dentry));
 	}
 	ihold(d_inode(old_dentry));
@@ -830,7 +842,6 @@ v9fs_vfs_mknod_dotl(struct inode *dir, struct dentry *dentry, umode_t omode,
 		dfid = NULL;
 		goto error;
 	}
-
 	gid = v9fs_get_fsgid_for_create(dir);
 	mode = omode;
 	/* Update mode based on ACL value */
@@ -887,6 +898,7 @@ v9fs_vfs_mknod_dotl(struct inode *dir, struct dentry *dentry, umode_t omode,
 	if (fid)
 		p9_client_clunk(fid);
 	v9fs_put_acl(dacl, pacl);
+	fid_atomic_dec(dfid);
 	return err;
 }
 
@@ -911,10 +923,11 @@ v9fs_vfs_get_link_dotl(struct dentry *dentry,
 
 	p9_debug(P9_DEBUG_VFS, "%pd\n", dentry);
 
-	fid = v9fs_fid_lookup(dentry);
+	fid = v9fs_fid_lookup(dentry, 0);
 	if (IS_ERR(fid))
 		return ERR_CAST(fid);
 	retval = p9_client_readlink(fid, &target);
+	fid_atomic_dec(fid);
 	if (retval)
 		return ERR_PTR(retval);
 	set_delayed_call(done, kfree_link, target);
diff --git a/fs/9p/vfs_super.c b/fs/9p/vfs_super.c
index 74df32be4c6a..ef35d4f07c40 100644
--- a/fs/9p/vfs_super.c
+++ b/fs/9p/vfs_super.c
@@ -241,7 +241,7 @@ static int v9fs_statfs(struct dentry *dentry, struct kstatfs *buf)
 	struct p9_rstatfs rs;
 	int res;
 
-	fid = v9fs_fid_lookup(dentry);
+	fid = v9fs_fid_lookup(dentry, 1);
 	if (IS_ERR(fid)) {
 		res = PTR_ERR(fid);
 		goto done;
@@ -262,10 +262,13 @@ static int v9fs_statfs(struct dentry *dentry, struct kstatfs *buf)
 			buf->f_fsid.val[1] = (rs.fsid >> 32) & 0xFFFFFFFFUL;
 			buf->f_namelen = rs.namelen;
 		}
-		if (res != -ENOSYS)
+		if (res != -ENOSYS) {
+			fid_atomic_dec(fid);
 			goto done;
+		}
 	}
 	res = simple_statfs(dentry, buf);
+	fid_atomic_dec(fid);
 done:
 	return res;
 }
diff --git a/fs/9p/xattr.c b/fs/9p/xattr.c
index ac8ff8ca4c11..f4e90a2dc975 100644
--- a/fs/9p/xattr.c
+++ b/fs/9p/xattr.c
@@ -71,14 +71,17 @@ ssize_t v9fs_xattr_get(struct dentry *dentry, const char *name,
 		       void *buffer, size_t buffer_size)
 {
 	struct p9_fid *fid;
+	int ret;
 
 	p9_debug(P9_DEBUG_VFS, "name = %s value_len = %zu\n",
 		 name, buffer_size);
-	fid = v9fs_fid_lookup(dentry);
+	fid = v9fs_fid_lookup(dentry, 1);
 	if (IS_ERR(fid))
 		return PTR_ERR(fid);
+	ret = v9fs_fid_xattr_get(fid, name, buffer, buffer_size);
+	fid_atomic_dec(fid);
 
-	return v9fs_fid_xattr_get(fid, name, buffer, buffer_size);
+	return ret;
 }
 
 /*
@@ -96,8 +99,15 @@ ssize_t v9fs_xattr_get(struct dentry *dentry, const char *name,
 int v9fs_xattr_set(struct dentry *dentry, const char *name,
 		   const void *value, size_t value_len, int flags)
 {
-	struct p9_fid *fid = v9fs_fid_lookup(dentry);
-	return v9fs_fid_xattr_set(fid, name, value, value_len, flags);
+	int ret;
+	struct p9_fid *fid;
+
+	fid  = v9fs_fid_lookup(dentry, 1);
+	if (IS_ERR(fid))
+		return PTR_ERR(fid);
+	ret = v9fs_fid_xattr_set(fid, name, value, value_len, flags);
+	fid_atomic_dec(fid);
+	return ret;
 }
 
 int v9fs_fid_xattr_set(struct p9_fid *fid, const char *name,
diff --git a/include/net/9p/client.h b/include/net/9p/client.h
index ce7882da8e86..831cb1a903b1 100644
--- a/include/net/9p/client.h
+++ b/include/net/9p/client.h
@@ -140,14 +140,21 @@ struct p9_client {
  *
  * TODO: This needs lots of explanation.
  */
+enum fid_source {
+	FID_FROM_OTHER,
+	FID_FROM_INODE,
+	FID_FROM_DENTRY,
+};
 
 struct p9_fid {
 	struct p9_client *clnt;
 	u32 fid;
+	atomic_t count;
 	int mode;
 	struct p9_qid qid;
 	u32 iounit;
 	kuid_t uid;
+	enum fid_source s;
 
 	void *rdir;
 
diff --git a/net/9p/client.c b/net/9p/client.c
index 1a3f72bf45fc..51f1277ba58d 100644
--- a/net/9p/client.c
+++ b/net/9p/client.c
@@ -901,6 +901,8 @@ static struct p9_fid *p9_fid_create(struct p9_client *clnt)
 	fid->clnt = clnt;
 	fid->rdir = NULL;
 	fid->fid = 0;
+	fid->s = 0;
+	atomic_set(&fid->count, 0);
 
 	idr_preload(GFP_KERNEL);
 	spin_lock_irq(&clnt->lock);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH RFC 4/4] 9p: fix race issue in fid contention.
  2020-09-14  3:37 ` [PATCH RFC 4/4] 9p: fix race issue in fid contention Jianyong Wu
@ 2020-09-14  5:55   ` Dominique Martinet
  2020-09-14  6:31     ` [V9fs-developer] " Dominique Martinet
  2020-09-14  7:32     ` Jianyong Wu
  0 siblings, 2 replies; 24+ messages in thread
From: Dominique Martinet @ 2020-09-14  5:55 UTC (permalink / raw)
  To: Jianyong Wu
  Cc: ericvh, lucho, v9fs-developer, linux-kernel, justin.he, Greg Kurz


Thanks for having a look a this!

Jianyong Wu wrote on Mon, Sep 14, 2020:
> Eric's and Greg's patch offer a mechanism to fix open-unlink-f*syscall
> bug in 9p. But there is race issue in fid comtention.
> As Greg's patch stores all of fids from opened files into according inode,
> so all the lookup fid ops can retrieve fid from inode preferentially. But
> there is no mechanism to handle the fid comtention issue. For example,
> there are two threads get the same fid in the same time and one of them
> clunk the fid before the other thread ready to discard the fid. In this
> scenario, it will lead to some fatal problems, even kernel core dump.

Ah, so that's what the problem was. Good job finding the problem!


> I introduce a mechanism to fix this race issue. A counter field introduced
> into p9_fid struct to store the reference counter to the fid. When a fid
> is allocated from the inode, the counter will increase, and will decrease
> at the end of its occupation. It is guaranteed that the fid won't be clunked
> before the reference counter go down to 0, then we can avoid the clunked
> fid to be used.
> As there is no need to retrieve fid from inode in all conditions, a enum value
> denotes the source of the fid is introduced to 9p_fid either. So we can only
> handle the reference counter as to the fid obtained from inode.

If there is no contention then an always-one refcount and an enum are
the same thing.
I'd rather not make a difference but make it a full-fledged refcount
thing; the enum in the code introduces quite a bit of code churn that
doesn't strike me as useful (and I don't like int arguments like this,
but if we can just do away with it there's no need to argue about that)

Not having exceptions for that will also make the code around
fid_atomic_dec much simpler: just have clunk do an atomic dec and only
do the actual clunk if that hit zero, and we should be able to get rid
of that helper?


Timing wise it's a bit awkward but I just dug out the async clunk
mechanism I wrote two years ago, that will conflict with this patch but
might also help a bit I guess?
I should probably have reposted them...


So to recap:
 - Let's try some more straight-forward refcounting: set to 1 on alloc,
increment when it's found in fid.c, decrement in clunk and only send the
actual clunk if counter hit 0

 - Ideally base yourself of my 9p-test branch to have async clunk:
https://github.com/martinetd/linux/commits/9p-test
I've been promising to push it to next this week™ for a couple of weeks
but if something is based on it I won't be able to delay this much
longer, it'll get pushed to 5.10 cycle anyway.
(I'll resend the patches to be clean)

 - (please, no polling 10ms then leaking something!)


Thanks,
-- 
Dominique

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH RFC 1/4] fs/9p: fix create-unlink-getattr idiom
  2020-09-14  3:37 ` [PATCH RFC 1/4] fs/9p: fix create-unlink-getattr idiom Jianyong Wu
@ 2020-09-14  6:00   ` Dominique Martinet
  2020-09-14  8:11     ` Greg Kurz
  0 siblings, 1 reply; 24+ messages in thread
From: Dominique Martinet @ 2020-09-14  6:00 UTC (permalink / raw)
  To: Jianyong Wu, Greg Kurz
  Cc: ericvh, lucho, v9fs-developer, linux-kernel, justin.he

Jianyong Wu wrote on Mon, Sep 14, 2020:
> Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>

Just on a note on that mail: gkurz@linux.vnet.ibm.com has no longer been
working for a while, probably want to update to groug@kaod.org in both
first two patches.
Greg, sorry I had forgotten to fix Cc earlier, can you confirm?


I'll (re)do a proper review of the first three patches again in a bit
but iirc they looked good on paper, will resend a mail if required.

-- 
Dominique

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [V9fs-developer] [PATCH RFC 4/4] 9p: fix race issue in fid contention.
  2020-09-14  5:55   ` Dominique Martinet
@ 2020-09-14  6:31     ` Dominique Martinet
  2020-09-14  7:50       ` Jianyong Wu
  2020-09-14  7:32     ` Jianyong Wu
  1 sibling, 1 reply; 24+ messages in thread
From: Dominique Martinet @ 2020-09-14  6:31 UTC (permalink / raw)
  To: Jianyong Wu
  Cc: lucho, justin.he, ericvh, linux-kernel, v9fs-developer, Greg Kurz

Dominique Martinet wrote on Mon, Sep 14, 2020:
> Jianyong Wu wrote on Mon, Sep 14, 2020:
>  - Ideally base yourself of my 9p-test branch to have async clunk:
> https://github.com/martinetd/linux/commits/9p-test
> I've been promising to push it to next this week™ for a couple of weeks
> but if something is based on it I won't be able to delay this much
> longer, it'll get pushed to 5.10 cycle anyway.
> (I'll resend the patches to be clean)
>
>> tests:
>> race issue test from the old test case:
>> for file in {01..50}; do touch f.${file}; done
>> seq 1 1000 | xargs -n 1 -P 50 -I{} cat f.* > /dev/null

hmpf, so that made me insist a bit on this test on my patch and I see
a problem with that as well. The me from a few years ago was good!

With that said I'll want to work a bit more on this, so feel free to
base off master and I'll deal with rebase if required.

Part of me thinks it's the same bug that will be fixed with refcounting
and I just made it easier to hit, but I'm honestly unsure at this point
and testing would basically mean I just code what I asked you to...

Well, let me know if you want me to do the refcounting, but I'd rather
let you finish what you started. If possible put the patch first in the
series so commits can be tested independently.


Thanks,
-- 
Dominique

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [PATCH RFC 4/4] 9p: fix race issue in fid contention.
  2020-09-14  5:55   ` Dominique Martinet
  2020-09-14  6:31     ` [V9fs-developer] " Dominique Martinet
@ 2020-09-14  7:32     ` Jianyong Wu
  2020-09-14  8:32       ` Dominique Martinet
  1 sibling, 1 reply; 24+ messages in thread
From: Jianyong Wu @ 2020-09-14  7:32 UTC (permalink / raw)
  To: Dominique Martinet
  Cc: ericvh, lucho, v9fs-developer, linux-kernel, Justin He, Greg Kurz

Hi Dominique,

> -----Original Message-----
> From: Dominique Martinet <asmadeus@codewreck.org>
> Sent: Monday, September 14, 2020 1:56 PM
> To: Jianyong Wu <Jianyong.Wu@arm.com>
> Cc: ericvh@gmail.com; lucho@ionkov.net; v9fs-
> developer@lists.sourceforge.net; linux-kernel@vger.kernel.org; Justin He
> <Justin.He@arm.com>; Greg Kurz <gkurz@linux.vnet.ibm.com>
> Subject: Re: [PATCH RFC 4/4] 9p: fix race issue in fid contention.
>
>
> Thanks for having a look a this!
>
> Jianyong Wu wrote on Mon, Sep 14, 2020:
> > Eric's and Greg's patch offer a mechanism to fix open-unlink-f*syscall
> > bug in 9p. But there is race issue in fid comtention.
> > As Greg's patch stores all of fids from opened files into according
> > inode, so all the lookup fid ops can retrieve fid from inode
> > preferentially. But there is no mechanism to handle the fid comtention
> > issue. For example, there are two threads get the same fid in the same
> > time and one of them clunk the fid before the other thread ready to
> > discard the fid. In this scenario, it will lead to some fatal problems, even
> kernel core dump.
>
> Ah, so that's what the problem was. Good job finding the problem!
>
Thanks! Very pleasure.
>
> > I introduce a mechanism to fix this race issue. A counter field
> > introduced into p9_fid struct to store the reference counter to the
> > fid. When a fid is allocated from the inode, the counter will
> > increase, and will decrease at the end of its occupation. It is
> > guaranteed that the fid won't be clunked before the reference counter
> > go down to 0, then we can avoid the clunked fid to be used.
> > As there is no need to retrieve fid from inode in all conditions, a
> > enum value denotes the source of the fid is introduced to 9p_fid
> > either. So we can only handle the reference counter as to the fid obtained
> from inode.
>
> If there is no contention then an always-one refcount and an enum are the
> same thing.
> I'd rather not make a difference but make it a full-fledged refcount thing; the
> enum in the code introduces quite a bit of code churn that doesn't strike me
> as useful (and I don't like int arguments like this, but if we can just do away
> with it there's no need to argue about that)
>
> Not having exceptions for that will also make the code around
> fid_atomic_dec much simpler: just have clunk do an atomic dec and only do
> the actual clunk if that hit zero, and we should be able to get rid of that
> helper?
>
Sorry, I think always-one refcount  won't work at this point, as the fid will be clunked only by
File context itself not the every consumer of every fid. We can't decrease the refcounter at just one
static point. Am I wrong?
This enum value is not functionally necessary, but I think it can reduce the contention of fid, as there are
really lots of scenarios that fid from inode is not necessary.

>
> Timing wise it's a bit awkward but I just dug out the async clunk mechanism I
> wrote two years ago, that will conflict with this patch but might also help a bit
> I guess?
> I should probably have reposted them...
>
Interesting!

>
> So to recap:
>  - Let's try some more straight-forward refcounting: set to 1 on alloc,
> increment when it's found in fid.c, decrement in clunk and only send the
> actual clunk if counter hit 0
>
it may not work, I think.

>  - Ideally base yourself of my 9p-test branch to have async clunk:
> https://github.com/martinetd/linux/commits/9p-test
> I've been promising to push it to next this week™ for a couple of weeks but if
> something is based on it I won't be able to delay this much longer, it'll get
> pushed to 5.10 cycle anyway.
> (I'll resend the patches to be clean)
>
>  - (please, no polling 10ms then leaking something!)
>
Yeah, it will lead fid to leak sometimes, unfortunately,  I'm afraid that the CPU may be stuck here.  we
must wait here (v9fs_dir_release) for the counter down to 0, as this is the only place to release the fid.
That's the problem.

Thanks
Jianyong
> Thanks,
> --
> Dominique
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [V9fs-developer] [PATCH RFC 4/4] 9p: fix race issue in fid contention.
  2020-09-14  6:31     ` [V9fs-developer] " Dominique Martinet
@ 2020-09-14  7:50       ` Jianyong Wu
  0 siblings, 0 replies; 24+ messages in thread
From: Jianyong Wu @ 2020-09-14  7:50 UTC (permalink / raw)
  To: Dominique Martinet
  Cc: lucho, Justin He, ericvh, linux-kernel, v9fs-developer, Greg Kurz

Hi Dominique,

> -----Original Message-----
> From: Dominique Martinet <asmadeus@codewreck.org>
> Sent: Monday, September 14, 2020 2:32 PM
> To: Jianyong Wu <Jianyong.Wu@arm.com>
> Cc: lucho@ionkov.net; Justin He <Justin.He@arm.com>; ericvh@gmail.com;
> linux-kernel@vger.kernel.org; v9fs-developer@lists.sourceforge.net; Greg
> Kurz <groug@kaod.org>
> Subject: Re: [V9fs-developer] [PATCH RFC 4/4] 9p: fix race issue in fid
> contention.
>
> Dominique Martinet wrote on Mon, Sep 14, 2020:
> > Jianyong Wu wrote on Mon, Sep 14, 2020:
> >  - Ideally base yourself of my 9p-test branch to have async clunk:
> > https://github.com/martinetd/linux/commits/9p-test
> > I've been promising to push it to next this week™ for a couple of
> > weeks but if something is based on it I won't be able to delay this
> > much longer, it'll get pushed to 5.10 cycle anyway.
> > (I'll resend the patches to be clean)
> >
> >> tests:
> >> race issue test from the old test case:
> >> for file in {01..50}; do touch f.${file}; done seq 1 1000 | xargs -n
> >> 1 -P 50 -I{} cat f.* > /dev/null
>
> hmpf, so that made me insist a bit on this test on my patch and I see a
> problem with that as well. The me from a few years ago was good!
>
> With that said I'll want to work a bit more on this, so feel free to base off
> master and I'll deal with rebase if required.
>
> Part of me thinks it's the same bug that will be fixed with refcounting and I
> just made it easier to hit, but I'm honestly unsure at this point and testing
> would basically mean I just code what I asked you to...
>
> Well, let me know if you want me to do the refcounting, but I'd rather let you
> finish what you started.

Thanks, I'm happy to work this.
>If possible put the patch first in the series so commits
> can be tested independently.

Ah, this patch depends on the previous patches, how can I put it as the first of the series?

Thanks
Jianyong

> Thanks,
> --
> Dominique
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH RFC 1/4] fs/9p: fix create-unlink-getattr idiom
  2020-09-14  6:00   ` Dominique Martinet
@ 2020-09-14  8:11     ` Greg Kurz
  0 siblings, 0 replies; 24+ messages in thread
From: Greg Kurz @ 2020-09-14  8:11 UTC (permalink / raw)
  To: Dominique Martinet
  Cc: Jianyong Wu, ericvh, lucho, v9fs-developer, linux-kernel, justin.he

On Mon, 14 Sep 2020 08:00:36 +0200
Dominique Martinet <asmadeus@codewreck.org> wrote:

> Jianyong Wu wrote on Mon, Sep 14, 2020:
> > Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> 
> Just on a note on that mail: gkurz@linux.vnet.ibm.com has no longer been
> working for a while, probably want to update to groug@kaod.org in both
> first two patches.
> Greg, sorry I had forgotten to fix Cc earlier, can you confirm?
> 

Np :) and yes I confirm that groug@kaod.org is to be used now.

14a36a435343 ("mailmap: add entry for Greg Kurz")

> 
> I'll (re)do a proper review of the first three patches again in a bit
> but iirc they looked good on paper, will resend a mail if required.
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH RFC 4/4] 9p: fix race issue in fid contention.
  2020-09-14  7:32     ` Jianyong Wu
@ 2020-09-14  8:32       ` Dominique Martinet
  2020-09-14 12:34         ` Jianyong Wu
  2020-09-18  8:57         ` Jianyong Wu
  0 siblings, 2 replies; 24+ messages in thread
From: Dominique Martinet @ 2020-09-14  8:32 UTC (permalink / raw)
  To: Jianyong Wu
  Cc: ericvh, lucho, v9fs-developer, linux-kernel, Justin He, Greg Kurz

Jianyong Wu wrote on Mon, Sep 14, 2020:
> > Not having exceptions for that will also make the code around
> > fid_atomic_dec much simpler: just have clunk do an atomic dec and only do
> > the actual clunk if that hit zero, and we should be able to get rid of that
> > helper?
> 
> Sorry, I think always-one refcount  won't work at this point, as the
> fid will be clunked only by file context itself not the every consumer
> of every fid. We can't decrease the refcounter at just one static
> point.
> Am I wrong?

I don't understand the "We can't decrease the refcounter at just one
static point".
Basically everywhere you added a fid_atomic_dec() will just need to be
changed to clunk -- the basic rule of refcounting is that anywhere you
take a ref you need to put it back.

All these places take a fid to do some RPC already so it's not a problem
to add a clunk and do one more; especially since the "clunk" will just
be just a deref.
For consistency I'd advocate for the kref API as we use that for
requests already; it might be better to rename the clunk calls to
p9_fid_put or something but I think that's more churn than it's
worth....


Is there anywhere you think cannot do that?

> This enum value is not functionally necessary, but I think it can
> reduce the contention of fid, as there are really lots of scenarios
> that fid from inode is not necessary.

I really don't think it makes things slower if done correctly (e.g. no
waiting as currently done but the last deref does the actual clunk), and
would rather keep code simpler unless the difference is big (so would
need to do an actual benchmark of both if you're convinced it helps) --
sorry.

>> If possible put the patch first in the series so commits can be
>> tested independently.
> 
> Ah, this patch depends on the previous patches, how can I put it as
> the first of the series?

Basically build the logic in the first patch, then either apply the
other three as close as they currently are as possible and add the
missing refcalls in a new patch or incorporate the refcounting in them
as well.
It's fine if you keep it it last, that was just a greedy request on my
part to be able to test async clunk more easily ; forget about it.

-- 
Dominique

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [V9fs-developer] [PATCH RFC 0/4] 9p: fix open-unlink-f*syscall bug
  2020-09-14  3:37 [PATCH RFC 0/4] 9p: fix open-unlink-f*syscall bug Jianyong Wu
                   ` (3 preceding siblings ...)
  2020-09-14  3:37 ` [PATCH RFC 4/4] 9p: fix race issue in fid contention Jianyong Wu
@ 2020-09-14  8:35 ` Greg Kurz
  2020-09-14 11:06   ` Christian Schoenebeck
  2020-09-14 12:36   ` Jianyong Wu
  4 siblings, 2 replies; 24+ messages in thread
From: Greg Kurz @ 2020-09-14  8:35 UTC (permalink / raw)
  To: Jianyong Wu
  Cc: ericvh, lucho, asmadeus, v9fs-developer, justin.he, linux-kernel,
	Christian Schoenebeck

On Mon, 14 Sep 2020 11:37:50 +0800
Jianyong Wu <jianyong.wu@arm.com> wrote:

> open-unlink-f*syscall bug is a well-known bug in 9p, we try to fix the bug
> in this patch set.
> I take Eric's and Greg's patches which constiute the 1/4 - 3/4 of this patch
> set as the main frame of the solution. In patch 4/4, I fix the fid race issue
> exists in Greg's patch.
> 

IIRC some patches were needed on the QEMU side as well... I'm spending
less time on 9pfs in QEMU, so Cc'ing the new maintainer:

Christian Schoenebeck <qemu_oss@crudebyte.com>

> Eric Van Hensbergen (1):
>   fs/9p: fix create-unlink-getattr idiom
> 
> Greg Kurz (1):
>   fs/9p: search open fids first
> 
> Jianyong Wu (2):
>   fs/9p: track open fids
>   9p: fix race issue in fid contention.
> 
>  fs/9p/fid.c             | 72 +++++++++++++++++++++++++++++++++++------
>  fs/9p/fid.h             | 25 +++++++++++---
>  fs/9p/vfs_dentry.c      |  2 +-
>  fs/9p/vfs_dir.c         | 20 ++++++++++--
>  fs/9p/vfs_file.c        |  3 +-
>  fs/9p/vfs_inode.c       | 52 +++++++++++++++++++++--------
>  fs/9p/vfs_inode_dotl.c  | 44 ++++++++++++++++---------
>  fs/9p/vfs_super.c       |  7 ++--
>  fs/9p/xattr.c           | 18 ++++++++---
>  include/net/9p/client.h |  8 +++++
>  net/9p/client.c         |  7 +++-
>  11 files changed, 206 insertions(+), 52 deletions(-)
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [V9fs-developer] [PATCH RFC 0/4] 9p: fix open-unlink-f*syscall bug
  2020-09-14  8:35 ` [V9fs-developer] [PATCH RFC 0/4] 9p: fix open-unlink-f*syscall bug Greg Kurz
@ 2020-09-14 11:06   ` Christian Schoenebeck
  2020-09-14 12:43     ` Greg Kurz
  2020-09-14 12:36   ` Jianyong Wu
  1 sibling, 1 reply; 24+ messages in thread
From: Christian Schoenebeck @ 2020-09-14 11:06 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Jianyong Wu, ericvh, lucho, asmadeus, v9fs-developer, justin.he,
	linux-kernel

On Montag, 14. September 2020 10:35:46 CEST Greg Kurz wrote:
> On Mon, 14 Sep 2020 11:37:50 +0800
> 
> Jianyong Wu <jianyong.wu@arm.com> wrote:
> > open-unlink-f*syscall bug is a well-known bug in 9p, we try to fix the bug
> > in this patch set.
> > I take Eric's and Greg's patches which constiute the 1/4 - 3/4 of this
> > patch set as the main frame of the solution. In patch 4/4, I fix the fid
> > race issue exists in Greg's patch.
> 
> IIRC some patches were needed on the QEMU side as well... I'm spending
> less time on 9pfs in QEMU, so Cc'ing the new maintainer:
> 
> Christian Schoenebeck <qemu_oss@crudebyte.com>

AFAICS this is about this old bug report:
https://bugs.launchpad.net/qemu/+bug/1336794

So yes, looks like this also requires changes to the 9pfs 'local' fs driver on 
QEMU side:
https://lists.nongnu.org/archive/html/qemu-devel/2016-06/msg07586.html

Eric, Greg, would there be an easy way to establish QEMU test cases running 
the 9pfs 'local' fs driver? Right now we only have 9pfs qtest cases for QEMU 
which can only use the 'synth' driver, which is not helpful for such kind of 
issues.

Best regards,
Christian Schoenebeck



^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [PATCH RFC 4/4] 9p: fix race issue in fid contention.
  2020-09-14  8:32       ` Dominique Martinet
@ 2020-09-14 12:34         ` Jianyong Wu
  2020-09-18  8:57         ` Jianyong Wu
  1 sibling, 0 replies; 24+ messages in thread
From: Jianyong Wu @ 2020-09-14 12:34 UTC (permalink / raw)
  To: Dominique Martinet
  Cc: ericvh, lucho, v9fs-developer, linux-kernel, Justin He, Greg Kurz

Hi Dominique,

> -----Original Message-----
> From: Dominique Martinet <asmadeus@codewreck.org>
> Sent: Monday, September 14, 2020 4:32 PM
> To: Jianyong Wu <Jianyong.Wu@arm.com>
> Cc: ericvh@gmail.com; lucho@ionkov.net; v9fs-
> developer@lists.sourceforge.net; linux-kernel@vger.kernel.org; Justin He
> <Justin.He@arm.com>; Greg Kurz <groug@kaod.org>
> Subject: Re: [PATCH RFC 4/4] 9p: fix race issue in fid contention.
>
> Jianyong Wu wrote on Mon, Sep 14, 2020:
> > > Not having exceptions for that will also make the code around
> > > fid_atomic_dec much simpler: just have clunk do an atomic dec and
> > > only do the actual clunk if that hit zero, and we should be able to
> > > get rid of that helper?
> >
> > Sorry, I think always-one refcount  won't work at this point, as the
> > fid will be clunked only by file context itself not the every consumer
> > of every fid. We can't decrease the refcounter at just one static
> > point.
> > Am I wrong?
>
> I don't understand the "We can't decrease the refcounter at just one static
> point".
> Basically everywhere you added a fid_atomic_dec() will just need to be
> changed to clunk -- the basic rule of refcounting is that anywhere you take a
> ref you need to put it back.
>
Oh, maybe I just miss your point. It is ok to  put the fid_atomic_dec() into p9_client_clunk() and
Let the clunk replace the refcount dec.

> All these places take a fid to do some RPC already so it's not a problem to add
> a clunk and do one more; especially since the "clunk" will just be just a deref.
> For consistency I'd advocate for the kref API as we use that for requests
> already; it might be better to rename the clunk calls to p9_fid_put or
> something but I think that's more churn than it's worth....
>
Ok, I see.
>
> Is there anywhere you think cannot do that?
>
No.
> > This enum value is not functionally necessary, but I think it can
> > reduce the contention of fid, as there are really lots of scenarios
> > that fid from inode is not necessary.
>
> I really don't think it makes things slower if done correctly (e.g. no waiting as
> currently done but the last deref does the actual clunk), and would rather
> keep code simpler unless the difference is big (so would need to do an actual
> benchmark of both if you're convinced it helps) -- sorry.
>
Ok, fair enough.

> >> If possible put the patch first in the series so commits can be
> >> tested independently.
> >
> > Ah, this patch depends on the previous patches, how can I put it as
> > the first of the series?
>
> Basically build the logic in the first patch, then either apply the other three as
> close as they currently are as possible and add the missing refcalls in a new
> patch or incorporate the refcounting in them as well.
> It's fine if you keep it it last, that was just a greedy request on my part to be
> able to test async clunk more easily ; forget about it.

Ok, keep this in original state is easy for me.

Thanks
Jianyong
>
> --
> Dominique
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [V9fs-developer] [PATCH RFC 0/4] 9p: fix open-unlink-f*syscall bug
  2020-09-14  8:35 ` [V9fs-developer] [PATCH RFC 0/4] 9p: fix open-unlink-f*syscall bug Greg Kurz
  2020-09-14 11:06   ` Christian Schoenebeck
@ 2020-09-14 12:36   ` Jianyong Wu
  1 sibling, 0 replies; 24+ messages in thread
From: Jianyong Wu @ 2020-09-14 12:36 UTC (permalink / raw)
  To: Greg Kurz
  Cc: ericvh, lucho, asmadeus, v9fs-developer, Justin He, linux-kernel,
	Christian Schoenebeck



> -----Original Message-----
> From: Greg Kurz <groug@kaod.org>
> Sent: Monday, September 14, 2020 4:36 PM
> To: Jianyong Wu <Jianyong.Wu@arm.com>
> Cc: ericvh@gmail.com; lucho@ionkov.net; asmadeus@codewreck.org; v9fs-
> developer@lists.sourceforge.net; Justin He <Justin.He@arm.com>; linux-
> kernel@vger.kernel.org; Christian Schoenebeck
> <qemu_oss@crudebyte.com>
> Subject: Re: [V9fs-developer] [PATCH RFC 0/4] 9p: fix open-unlink-f*syscall
> bug
>
> On Mon, 14 Sep 2020 11:37:50 +0800
> Jianyong Wu <jianyong.wu@arm.com> wrote:
>
> > open-unlink-f*syscall bug is a well-known bug in 9p, we try to fix the
> > bug in this patch set.
> > I take Eric's and Greg's patches which constiute the 1/4 - 3/4 of this
> > patch set as the main frame of the solution. In patch 4/4, I fix the
> > fid race issue exists in Greg's patch.
> >
>
> IIRC some patches were needed on the QEMU side as well... I'm spending
> less time on 9pfs in QEMU, so Cc'ing the new maintainer:
>
> Christian Schoenebeck <qemu_oss@crudebyte.com>
>
Ok, very kind of you.

Thanks
Jianyong
> > Eric Van Hensbergen (1):
> >   fs/9p: fix create-unlink-getattr idiom
> >
> > Greg Kurz (1):
> >   fs/9p: search open fids first
> >
> > Jianyong Wu (2):
> >   fs/9p: track open fids
> >   9p: fix race issue in fid contention.
> >
> >  fs/9p/fid.c             | 72 +++++++++++++++++++++++++++++++++++------
> >  fs/9p/fid.h             | 25 +++++++++++---
> >  fs/9p/vfs_dentry.c      |  2 +-
> >  fs/9p/vfs_dir.c         | 20 ++++++++++--
> >  fs/9p/vfs_file.c        |  3 +-
> >  fs/9p/vfs_inode.c       | 52 +++++++++++++++++++++--------
> >  fs/9p/vfs_inode_dotl.c  | 44 ++++++++++++++++---------
> >  fs/9p/vfs_super.c       |  7 ++--
> >  fs/9p/xattr.c           | 18 ++++++++---
> >  include/net/9p/client.h |  8 +++++
> >  net/9p/client.c         |  7 +++-
> >  11 files changed, 206 insertions(+), 52 deletions(-)
> >

IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [V9fs-developer] [PATCH RFC 0/4] 9p: fix open-unlink-f*syscall bug
  2020-09-14 11:06   ` Christian Schoenebeck
@ 2020-09-14 12:43     ` Greg Kurz
  2020-09-14 15:19       ` Christian Schoenebeck
  0 siblings, 1 reply; 24+ messages in thread
From: Greg Kurz @ 2020-09-14 12:43 UTC (permalink / raw)
  To: Christian Schoenebeck
  Cc: Jianyong Wu, ericvh, lucho, asmadeus, v9fs-developer, justin.he,
	linux-kernel

On Mon, 14 Sep 2020 13:06:34 +0200
Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:

> On Montag, 14. September 2020 10:35:46 CEST Greg Kurz wrote:
> > On Mon, 14 Sep 2020 11:37:50 +0800
> > 
> > Jianyong Wu <jianyong.wu@arm.com> wrote:
> > > open-unlink-f*syscall bug is a well-known bug in 9p, we try to fix the bug
> > > in this patch set.
> > > I take Eric's and Greg's patches which constiute the 1/4 - 3/4 of this
> > > patch set as the main frame of the solution. In patch 4/4, I fix the fid
> > > race issue exists in Greg's patch.
> > 
> > IIRC some patches were needed on the QEMU side as well... I'm spending
> > less time on 9pfs in QEMU, so Cc'ing the new maintainer:
> > 
> > Christian Schoenebeck <qemu_oss@crudebyte.com>
> 
> AFAICS this is about this old bug report:
> https://bugs.launchpad.net/qemu/+bug/1336794
> 

Correct.

> So yes, looks like this also requires changes to the 9pfs 'local' fs driver on 
> QEMU side:
> https://lists.nongnu.org/archive/html/qemu-devel/2016-06/msg07586.html
> 
> Eric, Greg, would there be an easy way to establish QEMU test cases running 
> the 9pfs 'local' fs driver? Right now we only have 9pfs qtest cases for QEMU 
> which can only use the 'synth' driver, which is not helpful for such kind of 
> issues.
> 

I guess it's possible to introduce new qtests that start QEMU with
-fsdev local instead of -fsdev synth... I haven't looked in a while
though, so I won't comment on "easy way" ;-)

> Best regards,
> Christian Schoenebeck
> 
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [V9fs-developer] [PATCH RFC 0/4] 9p: fix open-unlink-f*syscall bug
  2020-09-14 12:43     ` Greg Kurz
@ 2020-09-14 15:19       ` Christian Schoenebeck
  2020-09-14 15:46         ` Greg Kurz
  0 siblings, 1 reply; 24+ messages in thread
From: Christian Schoenebeck @ 2020-09-14 15:19 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Jianyong Wu, ericvh, lucho, asmadeus, v9fs-developer, justin.he,
	linux-kernel

On Montag, 14. September 2020 14:43:25 CEST Greg Kurz wrote:
> > So yes, looks like this also requires changes to the 9pfs 'local' fs
> > driver on QEMU side:
> > https://lists.nongnu.org/archive/html/qemu-devel/2016-06/msg07586.html
> > 
> > Eric, Greg, would there be an easy way to establish QEMU test cases
> > running
> > the 9pfs 'local' fs driver? Right now we only have 9pfs qtest cases for
> > QEMU which can only use the 'synth' driver, which is not helpful for such
> > kind of issues.
> 
> I guess it's possible to introduce new qtests that start QEMU with
> -fsdev local instead of -fsdev synth... I haven't looked in a while
> though, so I won't comment on "easy way" ;-)

Makes sense, and I considered that approach as well.

The question is the following: is there a QEMU policy about test cases that 
create/write/read/delete *real* files? I.e. should those test files be written 
to a certain location, and are there measures of sandboxing required?

Best regards,
Christian Schoenebeck



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [V9fs-developer] [PATCH RFC 0/4] 9p: fix open-unlink-f*syscall bug
  2020-09-14 15:19       ` Christian Schoenebeck
@ 2020-09-14 15:46         ` Greg Kurz
  2020-09-16 12:16           ` Greg Kurz
  0 siblings, 1 reply; 24+ messages in thread
From: Greg Kurz @ 2020-09-14 15:46 UTC (permalink / raw)
  To: Christian Schoenebeck
  Cc: Jianyong Wu, ericvh, lucho, asmadeus, v9fs-developer, justin.he,
	linux-kernel

On Mon, 14 Sep 2020 17:19:20 +0200
Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:

> On Montag, 14. September 2020 14:43:25 CEST Greg Kurz wrote:
> > > So yes, looks like this also requires changes to the 9pfs 'local' fs
> > > driver on QEMU side:
> > > https://lists.nongnu.org/archive/html/qemu-devel/2016-06/msg07586.html
> > > 
> > > Eric, Greg, would there be an easy way to establish QEMU test cases
> > > running
> > > the 9pfs 'local' fs driver? Right now we only have 9pfs qtest cases for
> > > QEMU which can only use the 'synth' driver, which is not helpful for such
> > > kind of issues.
> > 
> > I guess it's possible to introduce new qtests that start QEMU with
> > -fsdev local instead of -fsdev synth... I haven't looked in a while
> > though, so I won't comment on "easy way" ;-)
> 
> Makes sense, and I considered that approach as well.
> 
> The question is the following: is there a QEMU policy about test cases that 
> create/write/read/delete *real* files? I.e. should those test files be written 
> to a certain location, and are there measures of sandboxing required?
> 

I don't know. You'll need to figure out by yourself, reading code from
other tests or asking on IRC.

> Best regards,
> Christian Schoenebeck
> 
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [V9fs-developer] [PATCH RFC 0/4] 9p: fix open-unlink-f*syscall bug
  2020-09-14 15:46         ` Greg Kurz
@ 2020-09-16 12:16           ` Greg Kurz
  2020-09-17 10:07             ` Christian Schoenebeck
  0 siblings, 1 reply; 24+ messages in thread
From: Greg Kurz @ 2020-09-16 12:16 UTC (permalink / raw)
  To: Christian Schoenebeck
  Cc: Jianyong Wu, ericvh, lucho, asmadeus, v9fs-developer, justin.he,
	linux-kernel, Thomas Huth

On Mon, 14 Sep 2020 17:46:30 +0200
Greg Kurz <groug@kaod.org> wrote:

> On Mon, 14 Sep 2020 17:19:20 +0200
> Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> 
> > On Montag, 14. September 2020 14:43:25 CEST Greg Kurz wrote:
> > > > So yes, looks like this also requires changes to the 9pfs 'local' fs
> > > > driver on QEMU side:
> > > > https://lists.nongnu.org/archive/html/qemu-devel/2016-06/msg07586.html
> > > > 
> > > > Eric, Greg, would there be an easy way to establish QEMU test cases
> > > > running
> > > > the 9pfs 'local' fs driver? Right now we only have 9pfs qtest cases for
> > > > QEMU which can only use the 'synth' driver, which is not helpful for such
> > > > kind of issues.
> > > 
> > > I guess it's possible to introduce new qtests that start QEMU with
> > > -fsdev local instead of -fsdev synth... I haven't looked in a while
> > > though, so I won't comment on "easy way" ;-)
> > 
> > Makes sense, and I considered that approach as well.
> > 
> > The question is the following: is there a QEMU policy about test cases that 
> > create/write/read/delete *real* files? I.e. should those test files be written 
> > to a certain location, and are there measures of sandboxing required?
> > 
> 
> I don't know. You'll need to figure out by yourself, reading code from
> other tests or asking on IRC.
> 

Maybe Thomas (added in Cc) can give some hints on how test cases should
handle creation/deletion of real files ?

> > Best regards,
> > Christian Schoenebeck
> > 
> > 
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [V9fs-developer] [PATCH RFC 0/4] 9p: fix open-unlink-f*syscall bug
  2020-09-16 12:16           ` Greg Kurz
@ 2020-09-17 10:07             ` Christian Schoenebeck
  0 siblings, 0 replies; 24+ messages in thread
From: Christian Schoenebeck @ 2020-09-17 10:07 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Jianyong Wu, ericvh, lucho, asmadeus, v9fs-developer, justin.he,
	linux-kernel

On Mittwoch, 16. September 2020 14:16:21 CEST Greg Kurz wrote:
> On Mon, 14 Sep 2020 17:46:30 +0200
> 
> Greg Kurz <groug@kaod.org> wrote:
> > On Mon, 14 Sep 2020 17:19:20 +0200
> > 
> > Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> > > On Montag, 14. September 2020 14:43:25 CEST Greg Kurz wrote:
> > > > > So yes, looks like this also requires changes to the 9pfs 'local' fs
> > > > > driver on QEMU side:
> > > > > https://lists.nongnu.org/archive/html/qemu-devel/2016-06/msg07586.ht
> > > > > ml
> > > > > 
> > > > > Eric, Greg, would there be an easy way to establish QEMU test cases
> > > > > running
> > > > > the 9pfs 'local' fs driver? Right now we only have 9pfs qtest cases
> > > > > for
> > > > > QEMU which can only use the 'synth' driver, which is not helpful for
> > > > > such
> > > > > kind of issues.
> > > > 
> > > > I guess it's possible to introduce new qtests that start QEMU with
> > > > -fsdev local instead of -fsdev synth... I haven't looked in a while
> > > > though, so I won't comment on "easy way" ;-)
> > > 
> > > Makes sense, and I considered that approach as well.
> > > 
> > > The question is the following: is there a QEMU policy about test cases
> > > that
> > > create/write/read/delete *real* files? I.e. should those test files be
> > > written to a certain location, and are there measures of sandboxing
> > > required?> 
> > I don't know. You'll need to figure out by yourself, reading code from
> > other tests or asking on IRC.
> 
> Maybe Thomas (added in Cc) can give some hints on how test cases should
> handle creation/deletion of real files ?

Got this QEMU policy issue clarified on qemu-devel in the meantime.

Back on topic: I can lay the ground on QEMU side by establishing a test suite 
for the 9p 'local' driver, including one test case ready to pass for this 
particular unlinked-I/O bug discussed here.

But to be clear: since the proposed patch set is a non-trivial and old one 
(2016), I currently don't have plans to handle the actual bug fix patches by 
myself. So if anyone wants this unlinked issue to be fixed on QEMU side, 
please dedust that patch set and send a v2 the common way to qemu-devel ML for 
further review, and actually test them!

So if anybody wants to do that, let me know, so that I would prepare the test 
suite in the meantime.

Best regards,
Christian Schoenebeck



^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [PATCH RFC 4/4] 9p: fix race issue in fid contention.
  2020-09-14  8:32       ` Dominique Martinet
  2020-09-14 12:34         ` Jianyong Wu
@ 2020-09-18  8:57         ` Jianyong Wu
  2020-09-18  9:34           ` Dominique Martinet
  1 sibling, 1 reply; 24+ messages in thread
From: Jianyong Wu @ 2020-09-18  8:57 UTC (permalink / raw)
  To: Dominique Martinet
  Cc: ericvh, lucho, v9fs-developer, linux-kernel, Justin He, Greg Kurz

Hi Dominique,

> -----Original Message-----
> From: Dominique Martinet <asmadeus@codewreck.org>
> Sent: Monday, September 14, 2020 4:32 PM
> To: Jianyong Wu <Jianyong.Wu@arm.com>
> Cc: ericvh@gmail.com; lucho@ionkov.net; v9fs-
> developer@lists.sourceforge.net; linux-kernel@vger.kernel.org; Justin He
> <Justin.He@arm.com>; Greg Kurz <groug@kaod.org>
> Subject: Re: [PATCH RFC 4/4] 9p: fix race issue in fid contention.
>
> Jianyong Wu wrote on Mon, Sep 14, 2020:
> > > Not having exceptions for that will also make the code around
> > > fid_atomic_dec much simpler: just have clunk do an atomic dec and
> > > only do the actual clunk if that hit zero, and we should be able to
> > > get rid of that helper?
> >
> > Sorry, I think always-one refcount  won't work at this point, as the
> > fid will be clunked only by file context itself not the every consumer
> > of every fid. We can't decrease the refcounter at just one static
> > point.
> > Am I wrong?
>
> I don't understand the "We can't decrease the refcounter at just one static
> point".
> Basically everywhere you added a fid_atomic_dec() will just need to be
> changed to clunk -- the basic rule of refcounting is that anywhere you take a
> ref you need to put it back.
>
> All these places take a fid to do some RPC already so it's not a problem to add
> a clunk and do one more; especially since the "clunk" will just be just a deref.
> For consistency I'd advocate for the kref API as we use that for requests
> already; it might be better to rename the clunk calls to p9_fid_put or
> something but I think that's more churn than it's worth....
>
If we move the counter decrease code into p9_client_clunk and put it instead of fid_atomic_dec, we need delete fid off the inode where it stores in p9_client_clunk.
But there is no way can we acquire the inode in p9_client_clunk. Do you have any idea? I think introduce another parameter for p9_client_clunk
Is not graceful.

Thanks
Jianyong Wu
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH RFC 4/4] 9p: fix race issue in fid contention.
  2020-09-18  8:57         ` Jianyong Wu
@ 2020-09-18  9:34           ` Dominique Martinet
  2020-09-18 10:05             ` Jianyong Wu
  0 siblings, 1 reply; 24+ messages in thread
From: Dominique Martinet @ 2020-09-18  9:34 UTC (permalink / raw)
  To: Jianyong Wu
  Cc: ericvh, lucho, v9fs-developer, linux-kernel, Justin He, Greg Kurz

Jianyong Wu wrote on Fri, Sep 18, 2020:
> If we move the counter decrease code into p9_client_clunk and put it
> instead of fid_atomic_dec, we need delete fid off the inode where it
> stores in p9_client_clunk.
> But there is no way can we acquire the inode in p9_client_clunk. Do
> you have any idea? I think introduce another parameter for
> p9_client_clunk
> Is not graceful.

You cannot write code about the inode in p9_client_clunk, the way the
code is split fs/9p can refer to net/9p but not the other way around
(module-wise, 9p can refer to 9pnet but 9pnet cannot refer to 9p or we
would have cyclic dependencies)

However I don't see what bothers you.
v9fs_dir_release can remove the fid from the inode as it does currently
and just clunk immediately afterwards.


If another user of the fid had gotten the fid from the inode previously,
it has a ref, so the fid will not be actually clunked then but it will
be clunked later when it is done being used -- that is perfectly fine ?

p9_client_clunk should not have to worry about anything in the vfs.

-- 
Dominique

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [PATCH RFC 4/4] 9p: fix race issue in fid contention.
  2020-09-18  9:34           ` Dominique Martinet
@ 2020-09-18 10:05             ` Jianyong Wu
  0 siblings, 0 replies; 24+ messages in thread
From: Jianyong Wu @ 2020-09-18 10:05 UTC (permalink / raw)
  To: Dominique Martinet
  Cc: ericvh, lucho, v9fs-developer, linux-kernel, Justin He, Greg Kurz



> -----Original Message-----
> From: Dominique Martinet <asmadeus@codewreck.org>
> Sent: Friday, September 18, 2020 5:35 PM
> To: Jianyong Wu <Jianyong.Wu@arm.com>
> Cc: ericvh@gmail.com; lucho@ionkov.net; v9fs-
> developer@lists.sourceforge.net; linux-kernel@vger.kernel.org; Justin He
> <Justin.He@arm.com>; Greg Kurz <groug@kaod.org>
> Subject: Re: [PATCH RFC 4/4] 9p: fix race issue in fid contention.
>
> Jianyong Wu wrote on Fri, Sep 18, 2020:
> > If we move the counter decrease code into p9_client_clunk and put it
> > instead of fid_atomic_dec, we need delete fid off the inode where it
> > stores in p9_client_clunk.
> > But there is no way can we acquire the inode in p9_client_clunk. Do
> > you have any idea? I think introduce another parameter for
> > p9_client_clunk Is not graceful.
>
> You cannot write code about the inode in p9_client_clunk, the way the code
> is split fs/9p can refer to net/9p but not the other way around (module-wise,
> 9p can refer to 9pnet but 9pnet cannot refer to 9p or we would have cyclic
> dependencies)
>
> However I don't see what bothers you.
> v9fs_dir_release can remove the fid from the inode as it does currently and
> just clunk immediately afterwards.
>
>
> If another user of the fid had gotten the fid from the inode previously, it has
> a ref, so the fid will not be actually clunked then but it will be clunked later
> when it is done being used -- that is perfectly fine ?
>
> p9_client_clunk should not have to worry about anything in the vfs.
>
Yeah, I see. That's it.

Thanks
Jianyong
> --
> Dominique
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2020-09-18 10:05 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-14  3:37 [PATCH RFC 0/4] 9p: fix open-unlink-f*syscall bug Jianyong Wu
2020-09-14  3:37 ` [PATCH RFC 1/4] fs/9p: fix create-unlink-getattr idiom Jianyong Wu
2020-09-14  6:00   ` Dominique Martinet
2020-09-14  8:11     ` Greg Kurz
2020-09-14  3:37 ` [PATCH RFC 2/4] fs/9p: track open fids Jianyong Wu
2020-09-14  3:37 ` [PATCH RFC 3/4] fs/9p: search open fids first Jianyong Wu
2020-09-14  3:37 ` [PATCH RFC 4/4] 9p: fix race issue in fid contention Jianyong Wu
2020-09-14  5:55   ` Dominique Martinet
2020-09-14  6:31     ` [V9fs-developer] " Dominique Martinet
2020-09-14  7:50       ` Jianyong Wu
2020-09-14  7:32     ` Jianyong Wu
2020-09-14  8:32       ` Dominique Martinet
2020-09-14 12:34         ` Jianyong Wu
2020-09-18  8:57         ` Jianyong Wu
2020-09-18  9:34           ` Dominique Martinet
2020-09-18 10:05             ` Jianyong Wu
2020-09-14  8:35 ` [V9fs-developer] [PATCH RFC 0/4] 9p: fix open-unlink-f*syscall bug Greg Kurz
2020-09-14 11:06   ` Christian Schoenebeck
2020-09-14 12:43     ` Greg Kurz
2020-09-14 15:19       ` Christian Schoenebeck
2020-09-14 15:46         ` Greg Kurz
2020-09-16 12:16           ` Greg Kurz
2020-09-17 10:07             ` Christian Schoenebeck
2020-09-14 12:36   ` Jianyong Wu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).