[Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get

All of lore.kernel.org
 help / color / mirror / Atom feed

* [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas?
@ 2016-10-19  5:19 Eric Ren
  2016-10-19  5:19 ` [Ocfs2-devel] [DRAFT 1/2] ocfs2/dlmglue: keep track of the processes who take/put a cluster lock Eric Ren
                   ` (5 more replies)
  0 siblings, 6 replies; 19+ messages in thread
From: Eric Ren @ 2016-10-19  5:19 UTC (permalink / raw)
  To: ocfs2-devel

Hi all!

Commit 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()")
results in another deadlock as we have discussed in the recent thread:
    https://oss.oracle.com/pipermail/ocfs2-devel/2016-October/012454.html

Before this one, a similiar deadlock has been fixed by Junxiao:
    commit c25a1e0671fb ("ocfs2: fix posix_acl_create deadlock")
    commit 5ee0fbd50fdf ("ocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang")

We are in the situation that we have to avoid recursive cluster locking, but
there is no way to check if a cluster lock has been taken by a precess already.

Mostly, we can avoid recursive locking by writing code carefully. However, as
the deadlock issues have proved out, it's very hard to handle the routines
that are called directly by vfs. For instance:

    const struct inode_operations ocfs2_file_iops = {
            .permission     = ocfs2_permission,
            .get_acl        = ocfs2_iop_get_acl,
            .set_acl        = ocfs2_iop_set_acl,
    };

ocfs2_permission() and ocfs2_iop_get/set_acl() both call ocfs2_inode_lock().
The problem is that the call chain of ocfs2_permission() includes *_acl().

Possibly, there are three solutions I can think of.  The first one is to
implement the inode permission routine for ocfs2 itself, replacing the
existing generic_permission(); this will bring lots of changes and
involve too many trivial vfs functions into ocfs2 code. Frown on this.

The second one is, what I am trying now, to keep track of the processes who
lock/unlock a cluster lock by the following draft patches. But, I quickly
find out that a cluster locking which has been taken by processA can be unlocked
by processB. For example, systemfiles like journal:0000 is locked during mout, and
unlocked during umount. 

The thrid one is to revert that problematic commit! It looks like get/set_acl()
are always been called by other vfs callback like ocfs2_permission(). I think
we can do this if it's true, right? Anyway, I'll try to work out if it's true;-)

Hope for your input to solve this problem;-)

Thanks,
Eric

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Ocfs2-devel] [DRAFT 1/2] ocfs2/dlmglue: keep track of the processes who take/put a cluster lock
  2016-10-19  5:19 [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas? Eric Ren
@ 2016-10-19  5:19 ` Eric Ren
  2016-10-19  5:19 ` [Ocfs2-devel] [DRAFT 2/2] ocfs2: fix deadlock caused by recursive cluster locking Eric Ren
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: Eric Ren @ 2016-10-19  5:19 UTC (permalink / raw)
  To: ocfs2-devel

We are in the situation that we have to avoid recursive cluster locking,
but there is no way to check if a cluster lock has been taken by a
precess already.

Mostly, we can avoid recursive locking by writing code carefully.
However, we found that it's very hard to handle the routines that
are invoked directly by vfs. For instance:

const struct inode_operations ocfs2_file_iops = {
        .permission     = ocfs2_permission,
        .get_acl        = ocfs2_iop_get_acl,
        .set_acl        = ocfs2_iop_set_acl,
};

ocfs2_permission() and ocfs2_iop_get/set_acl() both call ocfs2_inode_lock().
The problem is that the call chain of ocfs2_permission() includes *_acl().

Possibly, there are two solutions I can think of. One way is to
implement the inode permission routine for ocfs2 itself, replacing the
existing generic_permission(); this will bring lots of changes and
involve too many trivial vfs functions into ocfs2 code.

Another way is to keep track of the processes who lock/unlock a cluster
lock. This patch provides ocfs2_is_locked_by_me() for process to check
if the cluster lock has been requested before. This is now only used for
avoiding recursive locking, though it also can help debug cluster locking
issue. Unfortunately, this may incur some performance lost.

Signed-off-by: Eric Ren <zren@suse.com>
---
 fs/ocfs2/dlmglue.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/ocfs2/dlmglue.h | 13 ++++++++++++
 fs/ocfs2/ocfs2.h   |  1 +
 3 files changed, 74 insertions(+)

diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c
index 83d576f..9f91884 100644
--- a/fs/ocfs2/dlmglue.c
+++ b/fs/ocfs2/dlmglue.c
@@ -532,6 +532,7 @@ void ocfs2_lock_res_init_once(struct ocfs2_lock_res *res)
 	init_waitqueue_head(&res->l_event);
 	INIT_LIST_HEAD(&res->l_blocked_list);
 	INIT_LIST_HEAD(&res->l_mask_waiters);
+	INIT_LIST_HEAD(&res->l_holders);
 }
 
 void ocfs2_inode_lock_res_init(struct ocfs2_lock_res *res,
@@ -749,6 +750,48 @@ void ocfs2_lock_res_free(struct ocfs2_lock_res *res)
 	res->l_flags = 0UL;
 }
 
+static inline void ocfs2_add_holder(struct ocfs2_lock_res *lockres)
+{
+	struct ocfs2_holder *oh = kmalloc(sizeof(struct ocfs2_holder), GFP_KERNEL);
+
+	INIT_LIST_HEAD(&oh->oh_list);
+	oh->oh_lockres = lockres;
+	oh->oh_owner_pid =  get_pid(task_pid(current));
+
+	spin_lock(&lockres->l_lock);
+	list_add_tail(&oh->oh_list, &lockres->l_holders);
+	spin_unlock(&lockres->l_lock);
+}
+
+static inline void ocfs2_remove_holder(struct ocfs2_lock_res *lockres,
+				       struct ocfs2_holder *oh)
+{
+	spin_lock(&lockres->l_lock);
+	list_del(&oh->oh_list);
+	spin_unlock(&lockres->l_lock);
+
+	put_pid(oh->oh_owner_pid);
+	kfree(oh);
+}
+
+struct ocfs2_holder *ocfs2_is_locked_by_me(struct ocfs2_lock_res *lockres)
+{
+	struct ocfs2_holder *oh;
+	struct pid *pid;
+
+	spin_lock(&lockres->l_lock);
+	pid = task_pid(current);
+	list_for_each_entry(oh, &lockres->l_holders, oh_list) {
+		if (oh->oh_owner_pid == pid)
+			goto out;
+	}
+	oh = NULL;
+out:
+	spin_unlock(&lockres->l_lock);
+
+	return oh;
+}
+
 static inline void ocfs2_inc_holders(struct ocfs2_lock_res *lockres,
 				     int level)
 {
@@ -1392,6 +1435,7 @@ static int __ocfs2_cluster_lock(struct ocfs2_super *osb,
 	int noqueue_attempted = 0;
 	int dlm_locked = 0;
 	int kick_dc = 0;
+	struct ocfs2_holder *oh;
 
 	if (!(lockres->l_flags & OCFS2_LOCK_INITIALIZED)) {
 		mlog_errno(-EINVAL);
@@ -1403,6 +1447,14 @@ static int __ocfs2_cluster_lock(struct ocfs2_super *osb,
 	if (lockres->l_ops->flags & LOCK_TYPE_USES_LVB)
 		lkm_flags |= DLM_LKF_VALBLK;
 
+	/* This block is just used to check recursive locking now */
+	oh = ocfs2_is_locked_by_me(lockres);
+	if (unlikely(oh))
+		mlog_bug_on_msg(1, "PID(%d) locks on lockres(%s) recursively\n",
+				pid_nr(oh->oh_owner_pid), lockres->l_name);
+	else
+		ocfs2_add_holder(lockres);
+
 again:
 	wait = 0;
 
@@ -1596,6 +1648,14 @@ static void __ocfs2_cluster_unlock(struct ocfs2_super *osb,
 				   unsigned long caller_ip)
 {
 	unsigned long flags;
+	struct ocfs2_holder *oh = ocfs2_is_locked_by_me(lockres);
+
+	/* This block is just used to check recursive locking now */
+	if (unlikely(!oh))
+		mlog_bug_on_msg(1, "PID(%d) unlock lockres(%s) unnecessarily\n",
+				pid_nr(task_pid(current)), lockres->l_name);
+	else
+		ocfs2_remove_holder(lockres, oh);
 
 	spin_lock_irqsave(&lockres->l_lock, flags);
 	ocfs2_dec_holders(lockres, level);
diff --git a/fs/ocfs2/dlmglue.h b/fs/ocfs2/dlmglue.h
index d293a22..3b1d4e7 100644
--- a/fs/ocfs2/dlmglue.h
+++ b/fs/ocfs2/dlmglue.h
@@ -70,6 +70,13 @@ struct ocfs2_orphan_scan_lvb {
 	__be32	lvb_os_seqno;
 };
 
+struct ocfs2_holder {
+	struct list_head oh_list;
+
+	struct ocfs2_lock_res *oh_lockres;
+	struct pid *oh_owner_pid;
+};
+
 /* ocfs2_inode_lock_full() 'arg_flags' flags */
 /* don't wait on recovery. */
 #define OCFS2_META_LOCK_RECOVERY	(0x01)
@@ -170,4 +177,10 @@ void ocfs2_put_dlm_debug(struct ocfs2_dlm_debug *dlm_debug);
 
 /* To set the locking protocol on module initialization */
 void ocfs2_set_locking_protocol(void);
+
+/*
+ * Keep a list of processes that have interest in a lockres.
+ * Note: this is now only uesed for check recursive cluster lock.
+ */
+struct ocfs2_holder *ocfs2_is_locked_by_me(struct ocfs2_lock_res *lockres);
 #endif	/* DLMGLUE_H */
diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h
index e63af7d..12933b8 100644
--- a/fs/ocfs2/ocfs2.h
+++ b/fs/ocfs2/ocfs2.h
@@ -172,6 +172,7 @@ struct ocfs2_lock_res {
 
 	struct list_head         l_blocked_list;
 	struct list_head         l_mask_waiters;
+	struct list_head 	 l_holders;
 
 	unsigned long		 l_flags;
 	char                     l_name[OCFS2_LOCK_ID_MAX_LEN];
-- 
2.6.6

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Ocfs2-devel] [DRAFT 2/2] ocfs2: fix deadlock caused by recursive cluster locking
  2016-10-19  5:19 [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas? Eric Ren
  2016-10-19  5:19 ` [Ocfs2-devel] [DRAFT 1/2] ocfs2/dlmglue: keep track of the processes who take/put a cluster lock Eric Ren
@ 2016-10-19  5:19 ` Eric Ren
  2016-10-31 10:55   ` piaojun
  2016-11-09  4:55   ` Eric Ren
  2016-10-19  6:57 ` [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas? Junxiao Bi
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 19+ messages in thread
From: Eric Ren @ 2016-10-19  5:19 UTC (permalink / raw)
  To: ocfs2-devel

The deadlock issue happens when running discontiguous block
group testing on multiple nodes. The easier way to reproduce
is to do "chmod -R 777 /mnt/ocfs2" things like this on multiple
nodes at the same time by pssh.

This is indeed another deadlock caused by: commit 743b5f1434f5
("ocfs2: take inode lock in ocfs2_iop_set/get_acl()"). The reason
had been explained well by Tariq Saeed in this thread:

https://oss.oracle.com/pipermail/ocfs2-devel/2015-September/011085.html

For this case, the ocfs2_inode_lock() is misused recursively as below:

do_sys_open
 do_filp_open
  path_openat
   may_open
    inode_permission
     __inode_permission
      ocfs2_permission  <====== ocfs2_inode_lock()
       generic_permission
        get_acl
         ocfs2_iop_get_acl  <====== ocfs2_inode_lock()
          ocfs2_inode_lock_full_nested <===== deadlock if a remote EX request
comes between two ocfs2_inode_lock()

Fix by checking if the cluster lock has been acquired aready in the call-chain
path.

Fixes: commit 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()")
Signed-off-by: Eric Ren <zren@suse.com>
---
 fs/ocfs2/acl.c | 39 +++++++++++++++++++++++++++------------
 1 file changed, 27 insertions(+), 12 deletions(-)

diff --git a/fs/ocfs2/acl.c b/fs/ocfs2/acl.c
index bed1fcb..7e3544e 100644
--- a/fs/ocfs2/acl.c
+++ b/fs/ocfs2/acl.c
@@ -283,16 +283,24 @@ int ocfs2_set_acl(handle_t *handle,
 int ocfs2_iop_set_acl(struct inode *inode, struct posix_acl *acl, int type)
 {
 	struct buffer_head *bh = NULL;
+	struct ocfs2_holder *oh;
+	struct ocfs2_lock_res *lockres = &OCFS2_I(inode)->ip_inode_lockres;
 	int status = 0;
 
-	status = ocfs2_inode_lock(inode, &bh, 1);
-	if (status < 0) {
-		if (status != -ENOENT)
-			mlog_errno(status);
-		return status;
+	oh = ocfs2_is_locked_by_me(lockres);
+	if (!oh) {
+		status = ocfs2_inode_lock(inode, &bh, 1);
+		if (status < 0) {
+			if (status != -ENOENT)
+				mlog_errno(status);
+			return status;
+		}
 	}
+
 	status = ocfs2_set_acl(NULL, inode, bh, type, acl, NULL, NULL);
-	ocfs2_inode_unlock(inode, 1);
+
+	if (!oh)
+		ocfs2_inode_unlock(inode, 1);
 	brelse(bh);
 	return status;
 }
@@ -302,21 +310,28 @@ struct posix_acl *ocfs2_iop_get_acl(struct inode *inode, int type)
 	struct ocfs2_super *osb;
 	struct buffer_head *di_bh = NULL;
 	struct posix_acl *acl;
+	struct ocfs2_holder *oh;
+	struct ocfs2_lock_res *lockres = &OCFS2_I(inode)->ip_inode_lockres;
 	int ret;
 
 	osb = OCFS2_SB(inode->i_sb);
 	if (!(osb->s_mount_opt & OCFS2_MOUNT_POSIX_ACL))
 		return NULL;
-	ret = ocfs2_inode_lock(inode, &di_bh, 0);
-	if (ret < 0) {
-		if (ret != -ENOENT)
-			mlog_errno(ret);
-		return ERR_PTR(ret);
+
+	oh = ocfs2_is_locked_by_me(lockres);
+	if (!oh) {
+		ret = ocfs2_inode_lock(inode, &di_bh, 0);
+		if (ret < 0) {
+			if (ret != -ENOENT)
+				mlog_errno(ret);
+			return ERR_PTR(ret);
+		}
 	}
 
 	acl = ocfs2_get_acl_nolock(inode, type, di_bh);
 
-	ocfs2_inode_unlock(inode, 0);
+	if (!oh)
+		ocfs2_inode_unlock(inode, 0);
 	brelse(di_bh);
 	return acl;
 }
-- 
2.6.6

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas?
  2016-10-19  5:19 [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas? Eric Ren
  2016-10-19  5:19 ` [Ocfs2-devel] [DRAFT 1/2] ocfs2/dlmglue: keep track of the processes who take/put a cluster lock Eric Ren
  2016-10-19  5:19 ` [Ocfs2-devel] [DRAFT 2/2] ocfs2: fix deadlock caused by recursive cluster locking Eric Ren
@ 2016-10-19  6:57 ` Junxiao Bi
  2016-10-19  7:46   ` Eric Ren
  2016-10-24  9:13 ` Eric Ren
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Junxiao Bi @ 2016-10-19  6:57 UTC (permalink / raw)
  To: ocfs2-devel

Hi Eric,

> ? 2016?10?19????1:19?Eric Ren <zren@suse.com> ???
> 
> Hi all!
> 
> Commit 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()")
> results in another deadlock as we have discussed in the recent thread:
>    https://oss.oracle.com/pipermail/ocfs2-devel/2016-October/012454.html
> 
> Before this one, a similiar deadlock has been fixed by Junxiao:
>    commit c25a1e0671fb ("ocfs2: fix posix_acl_create deadlock")
>    commit 5ee0fbd50fdf ("ocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang")
> 
> We are in the situation that we have to avoid recursive cluster locking, but
> there is no way to check if a cluster lock has been taken by a precess already.
> 
> Mostly, we can avoid recursive locking by writing code carefully. However, as
> the deadlock issues have proved out, it's very hard to handle the routines
> that are called directly by vfs. For instance:
> 
>    const struct inode_operations ocfs2_file_iops = {
>            .permission     = ocfs2_permission,
>            .get_acl        = ocfs2_iop_get_acl,
>            .set_acl        = ocfs2_iop_set_acl,
>    };
> 
> 
> ocfs2_permission() and ocfs2_iop_get/set_acl() both call ocfs2_inode_lock().
> The problem is that the call chain of ocfs2_permission() includes *_acl().
> 
> Possibly, there are three solutions I can think of.  The first one is to
> implement the inode permission routine for ocfs2 itself, replacing the
> existing generic_permission(); this will bring lots of changes and
> involve too many trivial vfs functions into ocfs2 code. Frown on this.
> 
> The second one is, what I am trying now, to keep track of the processes who
> lock/unlock a cluster lock by the following draft patches. But, I quickly
> find out that a cluster locking which has been taken by processA can be unlocked
> by processB. For example, systemfiles like journal:0000 is locked during mout, and
> unlocked during umount. 
I had ever implemented generic recursive locking support, please check the patch at https://oss.oracle.com/pipermail/ocfs2-devel/2015-December/011408.html <https://oss.oracle.com/pipermail/ocfs2-devel/2015-December/011408.html> , the issue that locking and unlocking in different processes was considered. But it was rejected by Mark as recursive locking is not allowed in ocfs2/kernel . 
> 
> The thrid one is to revert that problematic commit! It looks like get/set_acl()
> are always been called by other vfs callback like ocfs2_permission(). I think
> we can do this if it's true, right? Anyway, I'll try to work out if it's true;-)
Not sure whether get/set_acl() will be called directly by vfs. Even not now, we can?t make sure that in the future. So revert it may be a little risky. But if refactor is complicated, then this maybe the only way we can do.

Thanks,
Junxiao.
> 
> Hope for your input to solve this problem;-)
> 
> Thanks,
> Eric
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://oss.oracle.com/pipermail/ocfs2-devel/attachments/20161019/324b727e/attachment.html 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas?
  2016-10-19  6:57 ` [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas? Junxiao Bi
@ 2016-10-19  7:46   ` Eric Ren
  0 siblings, 0 replies; 19+ messages in thread
From: Eric Ren @ 2016-10-19  7:46 UTC (permalink / raw)
  To: ocfs2-devel

Hi Junxiao,

On 10/19/2016 02:57 PM, Junxiao Bi wrote:
> I had ever implemented generic recursive locking support, please check the patch at https://oss.oracle.com/pipermail/ocfs2-devel/2015-December/011408.html <https://oss.oracle.com/pipermail/ocfs2-devel/2015-December/011408.html> , the issue that locking and unlocking in different processes was considered. But it was rejected by Mark as recursive locking is not allowed in ocfs2/kernel .
Yes, I can remember it. The different point is that I just want to have a function to check 
recursive locking
than supporting recursive locking;-)

Honestly, I cannot understand your patch thoroughly until now.  Back to that time, it's the 
complication of your patch
that concerns me. Besides, looks like the "PR+EX" + "non-block" request cannot be handled well?
>> The thrid one is to revert that problematic commit! It looks like get/set_acl()
>> are always been called by other vfs callback like ocfs2_permission(). I think
>> we can do this if it's true, right? Anyway, I'll try to work out if it's true;-)
> Not sure whether get/set_acl() will be called directly by vfs. Even not now, we can?t make sure that in the future. So revert it may be a little risky. But if refactor is complicated, then this maybe the only way we can do.
Agree. Let's investigate more into it;-)

Thanks,
Junxiao
>
> Thanks,
> Junxiao.
>> Hope for your input to solve this problem;-)
>>
>> Thanks,
>> Eric
>>
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas?
  2016-10-19  5:19 [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas? Eric Ren
                   ` (2 preceding siblings ...)
  2016-10-19  6:57 ` [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas? Junxiao Bi
@ 2016-10-24  9:13 ` Eric Ren
  2016-10-28  6:20   ` Christoph Hellwig
  2016-11-09  4:47 ` Eric Ren
  5 siblings, 0 replies; 19+ messages in thread
From: Eric Ren @ 2016-10-24  9:13 UTC (permalink / raw)
  To: ocfs2-devel

Hi all,


On 10/19/2016 01:19 PM, Eric Ren wrote:
> The thrid one is to revert that problematic commit! It looks like get/set_acl()
> are always been called by other vfs callback like ocfs2_permission(). I think
> we can do this if it's true, right? Anyway, I'll try to work out if it's true;-)
After looking into more, I get to know get/set_acl() can be invoked 
directly from vfs,
for instance:

fsetxattr()
  setxattr()
    vfs_setxattr()
      __vfs_setxattr()
        handler->set(handler, dentry, inode, name, value, size) // 
posix_acl_access_xattr_handler.set = posix_acl_xattr_set
          posix_acl_xattr_set()
             set_posix_acl()
               inode->i_op->set_acl()


So, this problem looks really hard to solve:-/

Eric

>
> Hope for your input to solve this problem;-)
>
> Thanks,
> Eric
>
>
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas?
  2016-10-19  5:19 [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas? Eric Ren
@ 2016-10-28  6:20   ` Christoph Hellwig
  2016-10-19  5:19 ` [Ocfs2-devel] [DRAFT 2/2] ocfs2: fix deadlock caused by recursive cluster locking Eric Ren
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: Christoph Hellwig @ 2016-10-28  6:20 UTC (permalink / raw)
  To: Eric Ren; +Cc: ocfs2-devel, linux-fsdevel

Hi Eric,

I've added linux-fsdevel to the cc list as this should get a bit
broader attention.

On Wed, Oct 19, 2016 at 01:19:40PM +0800, Eric Ren wrote:
> Mostly, we can avoid recursive locking by writing code carefully. However, as
> the deadlock issues have proved out, it's very hard to handle the routines
> that are called directly by vfs. For instance:
> 
>     const struct inode_operations ocfs2_file_iops = {
>             .permission     = ocfs2_permission,
>             .get_acl        = ocfs2_iop_get_acl,
>             .set_acl        = ocfs2_iop_set_acl,
>     };
> 
> 
> ocfs2_permission() and ocfs2_iop_get/set_acl() both call ocfs2_inode_lock().
> The problem is that the call chain of ocfs2_permission() includes *_acl().

What do you actually protect in ocfs2_permission?  It's a trivial
wrapper around generic_permission which just looks at the VFS inode.

I think the right fix is to remove ocfs2_permission entirely and use
the default VFS implementation.  That both solves your locking problem,
and it will also get you RCU lookup instead of dropping out of
RCU mode all the time.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas?
@ 2016-10-28  6:20   ` Christoph Hellwig
  0 siblings, 0 replies; 19+ messages in thread
From: Christoph Hellwig @ 2016-10-28  6:20 UTC (permalink / raw)
  To: Eric Ren; +Cc: ocfs2-devel, linux-fsdevel

Hi Eric,

I've added linux-fsdevel to the cc list as this should get a bit
broader attention.

On Wed, Oct 19, 2016 at 01:19:40PM +0800, Eric Ren wrote:
> Mostly, we can avoid recursive locking by writing code carefully. However, as
> the deadlock issues have proved out, it's very hard to handle the routines
> that are called directly by vfs. For instance:
> 
>     const struct inode_operations ocfs2_file_iops = {
>             .permission     = ocfs2_permission,
>             .get_acl        = ocfs2_iop_get_acl,
>             .set_acl        = ocfs2_iop_set_acl,
>     };
> 
> 
> ocfs2_permission() and ocfs2_iop_get/set_acl() both call ocfs2_inode_lock().
> The problem is that the call chain of ocfs2_permission() includes *_acl().

What do you actually protect in ocfs2_permission?  It's a trivial
wrapper around generic_permission which just looks at the VFS inode.

I think the right fix is to remove ocfs2_permission entirely and use
the default VFS implementation.  That both solves your locking problem,
and it will also get you RCU lookup instead of dropping out of
RCU mode all the time.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas?
  2016-10-28  6:20   ` Christoph Hellwig
@ 2016-10-28  7:06     ` Eric Ren
  -1 siblings, 0 replies; 19+ messages in thread
From: Eric Ren @ 2016-10-28  7:06 UTC (permalink / raw)
  To: Christoph Hellwig, ocfs2-devel, linux-fsdevel, Mark Fasheh


[-- Attachment #1.1: Type: text/plain, Size: 1970 bytes --]

Hi Christoph!

Thanks for your attention.

On 10/28/2016 02:20 PM, Christoph Hellwig wrote:
> Hi Eric,
>
> I've added linux-fsdevel to the cc list as this should get a bit
> broader attention.
>
> On Wed, Oct 19, 2016 at 01:19:40PM +0800, Eric Ren wrote:
>> Mostly, we can avoid recursive locking by writing code carefully. However, as
>> the deadlock issues have proved out, it's very hard to handle the routines
>> that are called directly by vfs. For instance:
>>
>>      const struct inode_operations ocfs2_file_iops = {
>>              .permission     = ocfs2_permission,
>>              .get_acl        = ocfs2_iop_get_acl,
>>              .set_acl        = ocfs2_iop_set_acl,
>>      };
>>
>>
>> ocfs2_permission() and ocfs2_iop_get/set_acl() both call ocfs2_inode_lock().
>> The problem is that the call chain of ocfs2_permission() includes *_acl().
> What do you actually protect in ocfs2_permission?  It's a trivial
> wrapper around generic_permission which just looks at the VFS inode.
Yes, it is.

https://github.com/torvalds/linux/blob/master/fs/ocfs2/file.c#L1321
---
ocfs2_permission
       ocfs2_inode_lock()
generic_permission
ocfs2_inode_unlock
>
> I think the right fix is to remove ocfs2_permission entirely and use
> the default VFS implementation.  That both solves your locking problem,
> and it will also get you RCU lookup instead of dropping out of
> RCU mode all the time.
But, from my understanding, the pair of ocfs2_inode_lock/unlock() is used to prevent any 
concurrent changes
to the permission of the inode on the other cluster node while we are checking on it. It's a 
common  case for cluster
filesystem, such as GFS2: https://github.com/torvalds/linux/blob/master/fs/gfs2/inode.c#L1777

Thanks for your suggestion again!
Eric
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>


[-- Attachment #1.2: Type: text/html, Size: 3429 bytes --]

[-- Attachment #2: Type: text/plain, Size: 151 bytes --]

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas?
@ 2016-10-28  7:06     ` Eric Ren
  0 siblings, 0 replies; 19+ messages in thread
From: Eric Ren @ 2016-10-28  7:06 UTC (permalink / raw)
  To: Christoph Hellwig, ocfs2-devel, linux-fsdevel, Mark Fasheh

Hi Christoph!

Thanks for your attention.

On 10/28/2016 02:20 PM, Christoph Hellwig wrote:
> Hi Eric,
>
> I've added linux-fsdevel to the cc list as this should get a bit
> broader attention.
>
> On Wed, Oct 19, 2016 at 01:19:40PM +0800, Eric Ren wrote:
>> Mostly, we can avoid recursive locking by writing code carefully. However, as
>> the deadlock issues have proved out, it's very hard to handle the routines
>> that are called directly by vfs. For instance:
>>
>>      const struct inode_operations ocfs2_file_iops = {
>>              .permission     = ocfs2_permission,
>>              .get_acl        = ocfs2_iop_get_acl,
>>              .set_acl        = ocfs2_iop_set_acl,
>>      };
>>
>>
>> ocfs2_permission() and ocfs2_iop_get/set_acl() both call ocfs2_inode_lock().
>> The problem is that the call chain of ocfs2_permission() includes *_acl().
> What do you actually protect in ocfs2_permission?  It's a trivial
> wrapper around generic_permission which just looks at the VFS inode.
Yes, it is.

https://github.com/torvalds/linux/blob/master/fs/ocfs2/file.c#L1321
---
ocfs2_permission
       ocfs2_inode_lock()
generic_permission
ocfs2_inode_unlock
>
> I think the right fix is to remove ocfs2_permission entirely and use
> the default VFS implementation.  That both solves your locking problem,
> and it will also get you RCU lookup instead of dropping out of
> RCU mode all the time.
But, from my understanding, the pair of ocfs2_inode_lock/unlock() is used to prevent any 
concurrent changes
to the permission of the inode on the other cluster node while we are checking on it. It's a 
common  case for cluster
filesystem, such as GFS2: https://github.com/torvalds/linux/blob/master/fs/gfs2/inode.c#L1777

Thanks for your suggestion again!
Eric
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://oss.oracle.com/pipermail/ocfs2-devel/attachments/20161028/9a90961d/attachment.html 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Ocfs2-devel] [DRAFT 2/2] ocfs2: fix deadlock caused by recursive cluster locking
  2016-10-19  5:19 ` [Ocfs2-devel] [DRAFT 2/2] ocfs2: fix deadlock caused by recursive cluster locking Eric Ren
@ 2016-10-31 10:55   ` piaojun
  2016-11-01  1:45     ` Eric Ren
  2016-11-09  4:55   ` Eric Ren
  1 sibling, 1 reply; 19+ messages in thread
From: piaojun @ 2016-10-31 10:55 UTC (permalink / raw)
  To: ocfs2-devel

Hi Eric,

On 2016-10-19 13:19, Eric Ren wrote:
> The deadlock issue happens when running discontiguous block
> group testing on multiple nodes. The easier way to reproduce
> is to do "chmod -R 777 /mnt/ocfs2" things like this on multiple
> nodes at the same time by pssh.
> 
> This is indeed another deadlock caused by: commit 743b5f1434f5
> ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()"). The reason
> had been explained well by Tariq Saeed in this thread:
> 
> https://oss.oracle.com/pipermail/ocfs2-devel/2015-September/011085.html
> 
> For this case, the ocfs2_inode_lock() is misused recursively as below:
> 
> do_sys_open
>  do_filp_open
>   path_openat
>    may_open
>     inode_permission
>      __inode_permission
>       ocfs2_permission  <====== ocfs2_inode_lock()
>        generic_permission
>         get_acl
>          ocfs2_iop_get_acl  <====== ocfs2_inode_lock()
>           ocfs2_inode_lock_full_nested <===== deadlock if a remote EX request
Do you mean another node wants to get ex of the inode? or another process?
> comes between two ocfs2_inode_lock()
> 
> Fix by checking if the cluster lock has been acquired aready in the call-chain
> path.
> 
> Fixes: commit 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()")
> Signed-off-by: Eric Ren <zren@suse.com>
> ---
>  fs/ocfs2/acl.c | 39 +++++++++++++++++++++++++++------------
>  1 file changed, 27 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/ocfs2/acl.c b/fs/ocfs2/acl.c
> index bed1fcb..7e3544e 100644
> --- a/fs/ocfs2/acl.c
> +++ b/fs/ocfs2/acl.c
> @@ -283,16 +283,24 @@ int ocfs2_set_acl(handle_t *handle,
>  int ocfs2_iop_set_acl(struct inode *inode, struct posix_acl *acl, int type)
>  {
>  	struct buffer_head *bh = NULL;
> +	struct ocfs2_holder *oh;
> +	struct ocfs2_lock_res *lockres = &OCFS2_I(inode)->ip_inode_lockres;
>  	int status = 0;
>  
> -	status = ocfs2_inode_lock(inode, &bh, 1);
> -	if (status < 0) {
> -		if (status != -ENOENT)
> -			mlog_errno(status);
> -		return status;
> +	oh = ocfs2_is_locked_by_me(lockres);
> +	if (!oh) {
> +		status = ocfs2_inode_lock(inode, &bh, 1);
> +		if (status < 0) {
> +			if (status != -ENOENT)
> +				mlog_errno(status);
> +			return status;
> +		}
>  	}
> +
>  	status = ocfs2_set_acl(NULL, inode, bh, type, acl, NULL, NULL);
> -	ocfs2_inode_unlock(inode, 1);
> +
> +	if (!oh)
> +		ocfs2_inode_unlock(inode, 1);
>  	brelse(bh);
>  	return status;
>  }
> @@ -302,21 +310,28 @@ struct posix_acl *ocfs2_iop_get_acl(struct inode *inode, int type)
>  	struct ocfs2_super *osb;
>  	struct buffer_head *di_bh = NULL;
>  	struct posix_acl *acl;
> +	struct ocfs2_holder *oh;
> +	struct ocfs2_lock_res *lockres = &OCFS2_I(inode)->ip_inode_lockres;
>  	int ret;
>  
>  	osb = OCFS2_SB(inode->i_sb);
>  	if (!(osb->s_mount_opt & OCFS2_MOUNT_POSIX_ACL))
>  		return NULL;
> -	ret = ocfs2_inode_lock(inode, &di_bh, 0);
> -	if (ret < 0) {
> -		if (ret != -ENOENT)
> -			mlog_errno(ret);
> -		return ERR_PTR(ret);
> +
> +	oh = ocfs2_is_locked_by_me(lockres);
> +	if (!oh) {
> +		ret = ocfs2_inode_lock(inode, &di_bh, 0);
> +		if (ret < 0) {
> +			if (ret != -ENOENT)
> +				mlog_errno(ret);
> +			return ERR_PTR(ret);
> +		}
>  	}
>  
>  	acl = ocfs2_get_acl_nolock(inode, type, di_bh);
>  
> -	ocfs2_inode_unlock(inode, 0);
> +	if (!oh)
> +		ocfs2_inode_unlock(inode, 0);
>  	brelse(di_bh);
>  	return acl;
>  }
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Ocfs2-devel] [DRAFT 2/2] ocfs2: fix deadlock caused by recursive cluster locking
  2016-10-31 10:55   ` piaojun
@ 2016-11-01  1:45     ` Eric Ren
  2016-11-10 10:49       ` piaojun
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Ren @ 2016-11-01  1:45 UTC (permalink / raw)
  To: ocfs2-devel

Hi,

On 10/31/2016 06:55 PM, piaojun wrote:
> Hi Eric,
>
> On 2016-10-19 13:19, Eric Ren wrote:
>> The deadlock issue happens when running discontiguous block
>> group testing on multiple nodes. The easier way to reproduce
>> is to do "chmod -R 777 /mnt/ocfs2" things like this on multiple
>> nodes at the same time by pssh.
>>
>> This is indeed another deadlock caused by: commit 743b5f1434f5
>> ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()"). The reason
>> had been explained well by Tariq Saeed in this thread:
>>
>> https://oss.oracle.com/pipermail/ocfs2-devel/2015-September/011085.html
>>
>> For this case, the ocfs2_inode_lock() is misused recursively as below:
>>
>> do_sys_open
>>   do_filp_open
>>    path_openat
>>     may_open
>>      inode_permission
>>       __inode_permission
>>        ocfs2_permission  <====== ocfs2_inode_lock()
>>         generic_permission
>>          get_acl
>>           ocfs2_iop_get_acl  <====== ocfs2_inode_lock()
>>            ocfs2_inode_lock_full_nested <===== deadlock if a remote EX request
> Do you mean another node wants to get ex of the inode? or another process?
Remote EX request means "another node wants to get ex of the inode";-)

Eric
>> comes between two ocfs2_inode_lock()
>>
>> Fix by checking if the cluster lock has been acquired aready in the call-chain
>> path.
>>
>> Fixes: commit 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()")
>> Signed-off-by: Eric Ren <zren@suse.com>
>> ---
>>   fs/ocfs2/acl.c | 39 +++++++++++++++++++++++++++------------
>>   1 file changed, 27 insertions(+), 12 deletions(-)
>>
>> diff --git a/fs/ocfs2/acl.c b/fs/ocfs2/acl.c
>> index bed1fcb..7e3544e 100644
>> --- a/fs/ocfs2/acl.c
>> +++ b/fs/ocfs2/acl.c
>> @@ -283,16 +283,24 @@ int ocfs2_set_acl(handle_t *handle,
>>   int ocfs2_iop_set_acl(struct inode *inode, struct posix_acl *acl, int type)
>>   {
>>   	struct buffer_head *bh = NULL;
>> +	struct ocfs2_holder *oh;
>> +	struct ocfs2_lock_res *lockres = &OCFS2_I(inode)->ip_inode_lockres;
>>   	int status = 0;
>>   
>> -	status = ocfs2_inode_lock(inode, &bh, 1);
>> -	if (status < 0) {
>> -		if (status != -ENOENT)
>> -			mlog_errno(status);
>> -		return status;
>> +	oh = ocfs2_is_locked_by_me(lockres);
>> +	if (!oh) {
>> +		status = ocfs2_inode_lock(inode, &bh, 1);
>> +		if (status < 0) {
>> +			if (status != -ENOENT)
>> +				mlog_errno(status);
>> +			return status;
>> +		}
>>   	}
>> +
>>   	status = ocfs2_set_acl(NULL, inode, bh, type, acl, NULL, NULL);
>> -	ocfs2_inode_unlock(inode, 1);
>> +
>> +	if (!oh)
>> +		ocfs2_inode_unlock(inode, 1);
>>   	brelse(bh);
>>   	return status;
>>   }
>> @@ -302,21 +310,28 @@ struct posix_acl *ocfs2_iop_get_acl(struct inode *inode, int type)
>>   	struct ocfs2_super *osb;
>>   	struct buffer_head *di_bh = NULL;
>>   	struct posix_acl *acl;
>> +	struct ocfs2_holder *oh;
>> +	struct ocfs2_lock_res *lockres = &OCFS2_I(inode)->ip_inode_lockres;
>>   	int ret;
>>   
>>   	osb = OCFS2_SB(inode->i_sb);
>>   	if (!(osb->s_mount_opt & OCFS2_MOUNT_POSIX_ACL))
>>   		return NULL;
>> -	ret = ocfs2_inode_lock(inode, &di_bh, 0);
>> -	if (ret < 0) {
>> -		if (ret != -ENOENT)
>> -			mlog_errno(ret);
>> -		return ERR_PTR(ret);
>> +
>> +	oh = ocfs2_is_locked_by_me(lockres);
>> +	if (!oh) {
>> +		ret = ocfs2_inode_lock(inode, &di_bh, 0);
>> +		if (ret < 0) {
>> +			if (ret != -ENOENT)
>> +				mlog_errno(ret);
>> +			return ERR_PTR(ret);
>> +		}
>>   	}
>>   
>>   	acl = ocfs2_get_acl_nolock(inode, type, di_bh);
>>   
>> -	ocfs2_inode_unlock(inode, 0);
>> +	if (!oh)
>> +		ocfs2_inode_unlock(inode, 0);
>>   	brelse(di_bh);
>>   	return acl;
>>   }
>>
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas?
  2016-10-19  5:19 [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas? Eric Ren
                   ` (4 preceding siblings ...)
  2016-10-28  6:20   ` Christoph Hellwig
@ 2016-11-09  4:47 ` Eric Ren
  5 siblings, 0 replies; 19+ messages in thread
From: Eric Ren @ 2016-11-09  4:47 UTC (permalink / raw)
  To: ocfs2-devel

Hi all,

On 10/19/2016 01:19 PM, Eric Ren wrote:
> ocfs2_permission() and ocfs2_iop_get/set_acl() both call ocfs2_inode_lock().
> The problem is that the call chain of ocfs2_permission() includes *_acl().
>
> Possibly, there are three solutions I can think of.  The first one is to
> implement the inode permission routine for ocfs2 itself, replacing the
> existing generic_permission(); this will bring lots of changes and
> involve too many trivial vfs functions into ocfs2 code. Frown on this.
>
> The second one is, what I am trying now, to keep track of the processes who
> lock/unlock a cluster lock by the following draft patches. But, I quickly
> find out that a cluster locking which has been taken by processA can be unlocked
> by processB. For example, systemfiles like journal:0000 is locked during mout, and
> unlocked during umount.
We can avoid the problem above by:

1) not keeping track of system file inode:

    if (!(OCFS2_I(inode)->ip_flags & OCFS2_INODE_SYSTEM_FILE)) {
        ....
   }

2) only keeping track of inode metadata lockres:

    OCFS2_I(inode)->ip_inode_lockres;

because inode open lockres can also be get/release by different processes.

Eric
>
> The thrid one is to revert that problematic commit! It looks like get/set_acl()
> are always been called by other vfs callback like ocfs2_permission(). I think
> we can do this if it's true, right? Anyway, I'll try to work out if it's true;-)
>
> Hope for your input to solve this problem;-)
>
> Thanks,
> Eric
>
>
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://oss.oracle.com/pipermail/ocfs2-devel/attachments/20161109/df737ced/attachment.html 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Ocfs2-devel] [DRAFT 2/2] ocfs2: fix deadlock caused by recursive cluster locking
  2016-10-19  5:19 ` [Ocfs2-devel] [DRAFT 2/2] ocfs2: fix deadlock caused by recursive cluster locking Eric Ren
  2016-10-31 10:55   ` piaojun
@ 2016-11-09  4:55   ` Eric Ren
  1 sibling, 0 replies; 19+ messages in thread
From: Eric Ren @ 2016-11-09  4:55 UTC (permalink / raw)
  To: ocfs2-devel

Hi all,

On 10/19/2016 01:19 PM, Eric Ren wrote:
> diff --git a/fs/ocfs2/acl.c b/fs/ocfs2/acl.c
> index bed1fcb..7e3544e 100644
> --- a/fs/ocfs2/acl.c
> +++ b/fs/ocfs2/acl.c
> @@ -283,16 +283,24 @@ int ocfs2_set_acl(handle_t *handle,
>   int ocfs2_iop_set_acl(struct inode *inode, struct posix_acl *acl, int type)
>   {
>   	struct buffer_head *bh = NULL;
> +	struct ocfs2_holder *oh;
> +	struct ocfs2_lock_res *lockres = &OCFS2_I(inode)->ip_inode_lockres;
>   	int status = 0;
>   
> -	status = ocfs2_inode_lock(inode, &bh, 1);
> -	if (status < 0) {
> -		if (status != -ENOENT)
> -			mlog_errno(status);
> -		return status;
> +	oh = ocfs2_is_locked_by_me(lockres);
> +	if (!oh) {
> +		status = ocfs2_inode_lock(inode, &bh, 1);
> +		if (status < 0) {
> +			if (status != -ENOENT)
> +				mlog_errno(status);
> +			return status;
> +		}
>   	}
This is wrong. We also depend ocfs2_inode_lock() pass out "bh" for later use.

So, we may need another function something like ocfs2_inode_getbh():
      if (!oh)
         ocfs2_inode_lock();
    else
        ocfs2_inode_getbh();

Eric
> +
>   	status = ocfs2_set_acl(NULL, inode, bh, type, acl, NULL, NULL);
> -	ocfs2_inode_unlock(inode, 1);
> +
> +	if (!oh)
> +		ocfs2_inode_unlock(inode, 1);
>   	brelse(bh);
>   	return status;
>   }
> @@ -302,21 +310,28 @@ struct posix_acl *ocfs2_iop_get_acl(struct inode *inode, int type)
>   	struct ocfs2_super *osb;
>   	struct buffer_head *di_bh = NULL;
>   	struct posix_acl *acl;
> +	struct ocfs2_holder *oh;
> +	struct ocfs2_lock_res *lockres = &OCFS2_I(inode)->ip_inode_lockres;
>   	int ret;
>   
>   	osb = OCFS2_SB(inode->i_sb);
>   	if (!(osb->s_mount_opt & OCFS2_MOUNT_POSIX_ACL))
>   		return NULL;
> -	ret = ocfs2_inode_lock(inode, &di_bh, 0);
> -	if (ret < 0) {
> -		if (ret != -ENOENT)
> -			mlog_errno(ret);
> -		return ERR_PTR(ret);
> +
> +	oh = ocfs2_is_locked_by_me(lockres);
> +	if (!oh) {
> +		ret = ocfs2_inode_lock(inode, &di_bh, 0);
> +		if (ret < 0) {
> +			if (ret != -ENOENT)
> +				mlog_errno(ret);
> +			return ERR_PTR(ret);
> +		}
>   	}
>   
>   	acl = ocfs2_get_acl_nolock(inode, type, di_bh);
>   
> -	ocfs2_inode_unlock(inode, 0);
> +	if (!oh)
> +		ocfs2_inode_unlock(inode, 0);
>   	brelse(di_bh);
>   	return acl;
>   }


-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://oss.oracle.com/pipermail/ocfs2-devel/attachments/20161109/ddb6d0e4/attachment.html 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Ocfs2-devel] [DRAFT 2/2] ocfs2: fix deadlock caused by recursive cluster locking
  2016-11-01  1:45     ` Eric Ren
@ 2016-11-10 10:49       ` piaojun
  2016-11-11  1:56         ` Eric Ren
  0 siblings, 1 reply; 19+ messages in thread
From: piaojun @ 2016-11-10 10:49 UTC (permalink / raw)
  To: ocfs2-devel

Hi Eric,

On 2016-11-1 9:45, Eric Ren wrote:
> Hi,
> 
> On 10/31/2016 06:55 PM, piaojun wrote:
>> Hi Eric,
>>
>> On 2016-10-19 13:19, Eric Ren wrote:
>>> The deadlock issue happens when running discontiguous block
>>> group testing on multiple nodes. The easier way to reproduce
>>> is to do "chmod -R 777 /mnt/ocfs2" things like this on multiple
>>> nodes at the same time by pssh.
>>>
>>> This is indeed another deadlock caused by: commit 743b5f1434f5
>>> ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()"). The reason
>>> had been explained well by Tariq Saeed in this thread:
>>>
>>> https://oss.oracle.com/pipermail/ocfs2-devel/2015-September/011085.html
>>>
>>> For this case, the ocfs2_inode_lock() is misused recursively as below:
>>>
>>> do_sys_open
>>>   do_filp_open
>>>    path_openat
>>>     may_open
>>>      inode_permission
>>>       __inode_permission
>>>        ocfs2_permission  <====== ocfs2_inode_lock()
>>>         generic_permission
>>>          get_acl
>>>           ocfs2_iop_get_acl  <====== ocfs2_inode_lock()
>>>            ocfs2_inode_lock_full_nested <===== deadlock if a remote EX request
>> Do you mean another node wants to get ex of the inode? or another process?
> Remote EX request means "another node wants to get ex of the inode";-)
> 
> Eric
If another node wants to get ex, it will get blocked as this node has
got pr. Why will the ex request make this node get blocked? Expect your
detailed description.

thanks,
Jun
>>> comes between two ocfs2_inode_lock()
>>>
>>> Fix by checking if the cluster lock has been acquired aready in the call-chain
>>> path.
>>>
>>> Fixes: commit 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()")
>>> Signed-off-by: Eric Ren <zren@suse.com>
>>> ---
>>>   fs/ocfs2/acl.c | 39 +++++++++++++++++++++++++++------------
>>>   1 file changed, 27 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/fs/ocfs2/acl.c b/fs/ocfs2/acl.c
>>> index bed1fcb..7e3544e 100644
>>> --- a/fs/ocfs2/acl.c
>>> +++ b/fs/ocfs2/acl.c
>>> @@ -283,16 +283,24 @@ int ocfs2_set_acl(handle_t *handle,
>>>   int ocfs2_iop_set_acl(struct inode *inode, struct posix_acl *acl, int type)
>>>   {
>>>       struct buffer_head *bh = NULL;
>>> +    struct ocfs2_holder *oh;
>>> +    struct ocfs2_lock_res *lockres = &OCFS2_I(inode)->ip_inode_lockres;
>>>       int status = 0;
>>>   -    status = ocfs2_inode_lock(inode, &bh, 1);
>>> -    if (status < 0) {
>>> -        if (status != -ENOENT)
>>> -            mlog_errno(status);
>>> -        return status;
>>> +    oh = ocfs2_is_locked_by_me(lockres);
>>> +    if (!oh) {
>>> +        status = ocfs2_inode_lock(inode, &bh, 1);
>>> +        if (status < 0) {
>>> +            if (status != -ENOENT)
>>> +                mlog_errno(status);
>>> +            return status;
>>> +        }
>>>       }
>>> +
>>>       status = ocfs2_set_acl(NULL, inode, bh, type, acl, NULL, NULL);
>>> -    ocfs2_inode_unlock(inode, 1);
>>> +
>>> +    if (!oh)
>>> +        ocfs2_inode_unlock(inode, 1);
>>>       brelse(bh);
>>>       return status;
>>>   }
>>> @@ -302,21 +310,28 @@ struct posix_acl *ocfs2_iop_get_acl(struct inode *inode, int type)
>>>       struct ocfs2_super *osb;
>>>       struct buffer_head *di_bh = NULL;
>>>       struct posix_acl *acl;
>>> +    struct ocfs2_holder *oh;
>>> +    struct ocfs2_lock_res *lockres = &OCFS2_I(inode)->ip_inode_lockres;
>>>       int ret;
>>>         osb = OCFS2_SB(inode->i_sb);
>>>       if (!(osb->s_mount_opt & OCFS2_MOUNT_POSIX_ACL))
>>>           return NULL;
>>> -    ret = ocfs2_inode_lock(inode, &di_bh, 0);
>>> -    if (ret < 0) {
>>> -        if (ret != -ENOENT)
>>> -            mlog_errno(ret);
>>> -        return ERR_PTR(ret);
>>> +
>>> +    oh = ocfs2_is_locked_by_me(lockres);
>>> +    if (!oh) {
>>> +        ret = ocfs2_inode_lock(inode, &di_bh, 0);
>>> +        if (ret < 0) {
>>> +            if (ret != -ENOENT)
>>> +                mlog_errno(ret);
>>> +            return ERR_PTR(ret);
>>> +        }
>>>       }
>>>         acl = ocfs2_get_acl_nolock(inode, type, di_bh);
>>>   -    ocfs2_inode_unlock(inode, 0);
>>> +    if (!oh)
>>> +        ocfs2_inode_unlock(inode, 0);
>>>       brelse(di_bh);
>>>       return acl;
>>>   }
>>>
>>
> 
> 
> .
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Ocfs2-devel] [DRAFT 2/2] ocfs2: fix deadlock caused by recursive cluster locking
  2016-11-10 10:49       ` piaojun
@ 2016-11-11  1:56         ` Eric Ren
  2016-11-14  5:42           ` piaojun
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Ren @ 2016-11-11  1:56 UTC (permalink / raw)
  To: ocfs2-devel

Hi,

On 11/10/2016 06:49 PM, piaojun wrote:
> Hi Eric,
>
> On 2016-11-1 9:45, Eric Ren wrote:
>> Hi,
>>
>> On 10/31/2016 06:55 PM, piaojun wrote:
>>> Hi Eric,
>>>
>>> On 2016-10-19 13:19, Eric Ren wrote:
>>>> The deadlock issue happens when running discontiguous block
>>>> group testing on multiple nodes. The easier way to reproduce
>>>> is to do "chmod -R 777 /mnt/ocfs2" things like this on multiple
>>>> nodes at the same time by pssh.
>>>>
>>>> This is indeed another deadlock caused by: commit 743b5f1434f5
>>>> ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()"). The reason
>>>> had been explained well by Tariq Saeed in this thread:
>>>>
>>>> https://oss.oracle.com/pipermail/ocfs2-devel/2015-September/011085.html
>>>>
>>>> For this case, the ocfs2_inode_lock() is misused recursively as below:
>>>>
>>>> do_sys_open
>>>>    do_filp_open
>>>>     path_openat
>>>>      may_open
>>>>       inode_permission
>>>>        __inode_permission
>>>>         ocfs2_permission  <====== ocfs2_inode_lock()
>>>>          generic_permission
>>>>           get_acl
>>>>            ocfs2_iop_get_acl  <====== ocfs2_inode_lock()
>>>>             ocfs2_inode_lock_full_nested <===== deadlock if a remote EX request
>>> Do you mean another node wants to get ex of the inode? or another process?
>> Remote EX request means "another node wants to get ex of the inode";-)
>>
>> Eric
> If another node wants to get ex, it will get blocked as this node has
> got pr. Why will the ex request make this node get blocked? Expect your
> detailed description.
Did you look at this link I mentioned above?

OCFS2_LOCK_BLOCKED flag of this lockres is set in BAST (ocfs2_generic_handle_bast) when 
downconvert is needed
on behalf of remote lock request.

The recursive cluster lock (the second one) will be blocked in __ocfs2_cluster_lock() 
because of OCFS2_LOCK_BLOCKED.
But the downconvert cannot be done, why? because there is no chance for the first cluster 
lock on this node to be unlocked -
we blocked ourselves in the code path.

Eric
>
> thanks,
> Jun
>>>> comes between two ocfs2_inode_lock()
>>>>
>>>> Fix by checking if the cluster lock has been acquired aready in the call-chain
>>>> path.
>>>>
>>>> Fixes: commit 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()")
>>>> Signed-off-by: Eric Ren <zren@suse.com>
>>>> ---
>>>>    fs/ocfs2/acl.c | 39 +++++++++++++++++++++++++++------------
>>>>    1 file changed, 27 insertions(+), 12 deletions(-)
>>>>
>>>> diff --git a/fs/ocfs2/acl.c b/fs/ocfs2/acl.c
>>>> index bed1fcb..7e3544e 100644
>>>> --- a/fs/ocfs2/acl.c
>>>> +++ b/fs/ocfs2/acl.c
>>>> @@ -283,16 +283,24 @@ int ocfs2_set_acl(handle_t *handle,
>>>>    int ocfs2_iop_set_acl(struct inode *inode, struct posix_acl *acl, int type)
>>>>    {
>>>>        struct buffer_head *bh = NULL;
>>>> +    struct ocfs2_holder *oh;
>>>> +    struct ocfs2_lock_res *lockres = &OCFS2_I(inode)->ip_inode_lockres;
>>>>        int status = 0;
>>>>    -    status = ocfs2_inode_lock(inode, &bh, 1);
>>>> -    if (status < 0) {
>>>> -        if (status != -ENOENT)
>>>> -            mlog_errno(status);
>>>> -        return status;
>>>> +    oh = ocfs2_is_locked_by_me(lockres);
>>>> +    if (!oh) {
>>>> +        status = ocfs2_inode_lock(inode, &bh, 1);
>>>> +        if (status < 0) {
>>>> +            if (status != -ENOENT)
>>>> +                mlog_errno(status);
>>>> +            return status;
>>>> +        }
>>>>        }
>>>> +
>>>>        status = ocfs2_set_acl(NULL, inode, bh, type, acl, NULL, NULL);
>>>> -    ocfs2_inode_unlock(inode, 1);
>>>> +
>>>> +    if (!oh)
>>>> +        ocfs2_inode_unlock(inode, 1);
>>>>        brelse(bh);
>>>>        return status;
>>>>    }
>>>> @@ -302,21 +310,28 @@ struct posix_acl *ocfs2_iop_get_acl(struct inode *inode, int type)
>>>>        struct ocfs2_super *osb;
>>>>        struct buffer_head *di_bh = NULL;
>>>>        struct posix_acl *acl;
>>>> +    struct ocfs2_holder *oh;
>>>> +    struct ocfs2_lock_res *lockres = &OCFS2_I(inode)->ip_inode_lockres;
>>>>        int ret;
>>>>          osb = OCFS2_SB(inode->i_sb);
>>>>        if (!(osb->s_mount_opt & OCFS2_MOUNT_POSIX_ACL))
>>>>            return NULL;
>>>> -    ret = ocfs2_inode_lock(inode, &di_bh, 0);
>>>> -    if (ret < 0) {
>>>> -        if (ret != -ENOENT)
>>>> -            mlog_errno(ret);
>>>> -        return ERR_PTR(ret);
>>>> +
>>>> +    oh = ocfs2_is_locked_by_me(lockres);
>>>> +    if (!oh) {
>>>> +        ret = ocfs2_inode_lock(inode, &di_bh, 0);
>>>> +        if (ret < 0) {
>>>> +            if (ret != -ENOENT)
>>>> +                mlog_errno(ret);
>>>> +            return ERR_PTR(ret);
>>>> +        }
>>>>        }
>>>>          acl = ocfs2_get_acl_nolock(inode, type, di_bh);
>>>>    -    ocfs2_inode_unlock(inode, 0);
>>>> +    if (!oh)
>>>> +        ocfs2_inode_unlock(inode, 0);
>>>>        brelse(di_bh);
>>>>        return acl;
>>>>    }
>>>>
>>
>> .
>>
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Ocfs2-devel] [DRAFT 2/2] ocfs2: fix deadlock caused by recursive cluster locking
  2016-11-11  1:56         ` Eric Ren
@ 2016-11-14  5:42           ` piaojun
  2016-11-14 10:03             ` Eric Ren
  0 siblings, 1 reply; 19+ messages in thread
From: piaojun @ 2016-11-14  5:42 UTC (permalink / raw)
  To: ocfs2-devel

Hi Eric,

On 2016-11-11 9:56, Eric Ren wrote:
> Hi,
> 
> On 11/10/2016 06:49 PM, piaojun wrote:
>> Hi Eric,
>>
>> On 2016-11-1 9:45, Eric Ren wrote:
>>> Hi,
>>>
>>> On 10/31/2016 06:55 PM, piaojun wrote:
>>>> Hi Eric,
>>>>
>>>> On 2016-10-19 13:19, Eric Ren wrote:
>>>>> The deadlock issue happens when running discontiguous block
>>>>> group testing on multiple nodes. The easier way to reproduce
>>>>> is to do "chmod -R 777 /mnt/ocfs2" things like this on multiple
>>>>> nodes at the same time by pssh.
>>>>>
>>>>> This is indeed another deadlock caused by: commit 743b5f1434f5
>>>>> ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()"). The reason
>>>>> had been explained well by Tariq Saeed in this thread:
>>>>>
>>>>> https://oss.oracle.com/pipermail/ocfs2-devel/2015-September/011085.html
>>>>>
>>>>> For this case, the ocfs2_inode_lock() is misused recursively as below:
>>>>>
>>>>> do_sys_open
>>>>>    do_filp_open
>>>>>     path_openat
>>>>>      may_open
>>>>>       inode_permission
>>>>>        __inode_permission
>>>>>         ocfs2_permission  <====== ocfs2_inode_lock()
>>>>>          generic_permission
>>>>>           get_acl
>>>>>            ocfs2_iop_get_acl  <====== ocfs2_inode_lock()
>>>>>             ocfs2_inode_lock_full_nested <===== deadlock if a remote EX request
>>>> Do you mean another node wants to get ex of the inode? or another process?
>>> Remote EX request means "another node wants to get ex of the inode";-)
>>>
>>> Eric
>> If another node wants to get ex, it will get blocked as this node has
>> got pr. Why will the ex request make this node get blocked? Expect your
>> detailed description.
> Did you look at this link I mentioned above?
> 
> OCFS2_LOCK_BLOCKED flag of this lockres is set in BAST (ocfs2_generic_handle_bast) when downconvert is needed
> on behalf of remote lock request.
> 
> The recursive cluster lock (the second one) will be blocked in __ocfs2_cluster_lock() because of OCFS2_LOCK_BLOCKED.
> But the downconvert cannot be done, why? because there is no chance for the first cluster lock on this node to be unlocked -
> we blocked ourselves in the code path.
> 
> Eric
You clear my doubt. I will look through your solution.

thanks
Jun
>>
>> thanks,
>> Jun
>>>>> comes between two ocfs2_inode_lock()
>>>>>
>>>>> Fix by checking if the cluster lock has been acquired aready in the call-chain
>>>>> path.
>>>>>
>>>>> Fixes: commit 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()")
>>>>> Signed-off-by: Eric Ren <zren@suse.com>
>>>>> ---
>>>>>    fs/ocfs2/acl.c | 39 +++++++++++++++++++++++++++------------
>>>>>    1 file changed, 27 insertions(+), 12 deletions(-)
>>>>>
>>>>> diff --git a/fs/ocfs2/acl.c b/fs/ocfs2/acl.c
>>>>> index bed1fcb..7e3544e 100644
>>>>> --- a/fs/ocfs2/acl.c
>>>>> +++ b/fs/ocfs2/acl.c
>>>>> @@ -283,16 +283,24 @@ int ocfs2_set_acl(handle_t *handle,
>>>>>    int ocfs2_iop_set_acl(struct inode *inode, struct posix_acl *acl, int type)
>>>>>    {
>>>>>        struct buffer_head *bh = NULL;
>>>>> +    struct ocfs2_holder *oh;
>>>>> +    struct ocfs2_lock_res *lockres = &OCFS2_I(inode)->ip_inode_lockres;
>>>>>        int status = 0;
>>>>>    -    status = ocfs2_inode_lock(inode, &bh, 1);
>>>>> -    if (status < 0) {
>>>>> -        if (status != -ENOENT)
>>>>> -            mlog_errno(status);
>>>>> -        return status;
>>>>> +    oh = ocfs2_is_locked_by_me(lockres);
>>>>> +    if (!oh) {
>>>>> +        status = ocfs2_inode_lock(inode, &bh, 1);
>>>>> +        if (status < 0) {
>>>>> +            if (status != -ENOENT)
>>>>> +                mlog_errno(status);
>>>>> +            return status;
>>>>> +        }
>>>>>        }
>>>>> +
>>>>>        status = ocfs2_set_acl(NULL, inode, bh, type, acl, NULL, NULL);
>>>>> -    ocfs2_inode_unlock(inode, 1);
>>>>> +
>>>>> +    if (!oh)
>>>>> +        ocfs2_inode_unlock(inode, 1);
>>>>>        brelse(bh);
>>>>>        return status;
>>>>>    }
>>>>> @@ -302,21 +310,28 @@ struct posix_acl *ocfs2_iop_get_acl(struct inode *inode, int type)
>>>>>        struct ocfs2_super *osb;
>>>>>        struct buffer_head *di_bh = NULL;
>>>>>        struct posix_acl *acl;
>>>>> +    struct ocfs2_holder *oh;
>>>>> +    struct ocfs2_lock_res *lockres = &OCFS2_I(inode)->ip_inode_lockres;
>>>>>        int ret;
>>>>>          osb = OCFS2_SB(inode->i_sb);
>>>>>        if (!(osb->s_mount_opt & OCFS2_MOUNT_POSIX_ACL))
>>>>>            return NULL;
>>>>> -    ret = ocfs2_inode_lock(inode, &di_bh, 0);
>>>>> -    if (ret < 0) {
>>>>> -        if (ret != -ENOENT)
>>>>> -            mlog_errno(ret);
>>>>> -        return ERR_PTR(ret);
>>>>> +
>>>>> +    oh = ocfs2_is_locked_by_me(lockres);
>>>>> +    if (!oh) {
>>>>> +        ret = ocfs2_inode_lock(inode, &di_bh, 0);
>>>>> +        if (ret < 0) {
>>>>> +            if (ret != -ENOENT)
>>>>> +                mlog_errno(ret);
>>>>> +            return ERR_PTR(ret);
>>>>> +        }
>>>>>        }
>>>>>          acl = ocfs2_get_acl_nolock(inode, type, di_bh);
>>>>>    -    ocfs2_inode_unlock(inode, 0);
>>>>> +    if (!oh)
>>>>> +        ocfs2_inode_unlock(inode, 0);
>>>>>        brelse(di_bh);
>>>>>        return acl;
>>>>>    }
>>>>>
>>>
>>> .
>>>
>>
> 
> 
> .
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Ocfs2-devel] [DRAFT 2/2] ocfs2: fix deadlock caused by recursive cluster locking
  2016-11-14  5:42           ` piaojun
@ 2016-11-14 10:03             ` Eric Ren
  2016-11-15  2:13               ` Eric Ren
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Ren @ 2016-11-14 10:03 UTC (permalink / raw)
  To: ocfs2-devel

Hi,

On 11/14/2016 01:42 PM, piaojun wrote:
> Hi Eric,
>
>
> OCFS2_LOCK_BLOCKED flag of this lockres is set in BAST (ocfs2_generic_handle_bast) when downconvert is needed
> on behalf of remote lock request.
>
> The recursive cluster lock (the second one) will be blocked in __ocfs2_cluster_lock() because of OCFS2_LOCK_BLOCKED.
> But the downconvert cannot be done, why? because there is no chance for the first cluster lock on this node to be unlocked -
> we blocked ourselves in the code path.
>
> Eric
> You clear my doubt. I will look through your solution.

Thanks for your attention. Actually, I tried different versions of draft patch locally. 
Either of them can satisfy myself so far.
Some rules I'd like to follow:
1) check and avoid recursive cluster locking, rather than allow it which Junxiao had tried 
before;
2) Just keep track of lock resource that meets the following requirements:
      a. normal inodes (non systemfile);
      b. inode metadata lockres (not open, rw lockres);
why? to avoid more special cluster locking usecases, like journal systemfile, "LOST+FOUND" 
open lockres, that lock/unlock
operations are performed by different processes, making tracking task more tricky.
3) There is another problem if we follow "check + avoid" pattern, which I have mentioned in 
this thread:
"""
This is wrong. We also depend ocfs2_inode_lock() pass out "bh" for later use.

So, we may need another function something like ocfs2_inode_getbh():
      if (!oh)
         ocfs2_inode_lock();
    else
        ocfs2_inode_getbh();
"""

Hope we can work out a nice solution for this tricky issue ;-)

Eric

>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Ocfs2-devel] [DRAFT 2/2] ocfs2: fix deadlock caused by recursive cluster locking
  2016-11-14 10:03             ` Eric Ren
@ 2016-11-15  2:13               ` Eric Ren
  0 siblings, 0 replies; 19+ messages in thread
From: Eric Ren @ 2016-11-15  2:13 UTC (permalink / raw)
  To: ocfs2-devel

Hi,
> Thanks for your attention. Actually, I tried different versions of draft patch locally.
> Either of them can satisfy myself so far.

Sorry, I meat "neither of them".

Eric
> Some rules I'd like to follow:
> 1) check and avoid recursive cluster locking, rather than allow it which Junxiao had tried
> before;
> 2) Just keep track of lock resource that meets the following requirements:
>        a. normal inodes (non systemfile);
>        b. inode metadata lockres (not open, rw lockres);
> why? to avoid more special cluster locking usecases, like journal systemfile, "LOST+FOUND"
> open lockres, that lock/unlock
> operations are performed by different processes, making tracking task more tricky.
> 3) There is another problem if we follow "check + avoid" pattern, which I have mentioned in
> this thread:
> """
> This is wrong. We also depend ocfs2_inode_lock() pass out "bh" for later use.
>
> So, we may need another function something like ocfs2_inode_getbh():
>        if (!oh)
>           ocfs2_inode_lock();
>      else
>          ocfs2_inode_getbh();
> """
>
> Hope we can work out a nice solution for this tricky issue ;-)
>
> Eric
>
>
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2016-11-15  2:13 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-19  5:19 [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas? Eric Ren
2016-10-19  5:19 ` [Ocfs2-devel] [DRAFT 1/2] ocfs2/dlmglue: keep track of the processes who take/put a cluster lock Eric Ren
2016-10-19  5:19 ` [Ocfs2-devel] [DRAFT 2/2] ocfs2: fix deadlock caused by recursive cluster locking Eric Ren
2016-10-31 10:55   ` piaojun
2016-11-01  1:45     ` Eric Ren
2016-11-10 10:49       ` piaojun
2016-11-11  1:56         ` Eric Ren
2016-11-14  5:42           ` piaojun
2016-11-14 10:03             ` Eric Ren
2016-11-15  2:13               ` Eric Ren
2016-11-09  4:55   ` Eric Ren
2016-10-19  6:57 ` [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas? Junxiao Bi
2016-10-19  7:46   ` Eric Ren
2016-10-24  9:13 ` Eric Ren
2016-10-28  6:20 ` Christoph Hellwig
2016-10-28  6:20   ` Christoph Hellwig
2016-10-28  7:06   ` Eric Ren
2016-10-28  7:06     ` [Ocfs2-devel] " Eric Ren
2016-11-09  4:47 ` Eric Ren

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.