ocfs2-devel.lists.linux.dev archive mirror
* [PATCHv3 v6.5-rc2 0/3] fs: dlm: lock cancellation feature
@ 2023-07-18 18:07 Alexander Aring
  2023-07-18 18:07 ` [PATCHv3 v6.5-rc2 1/3] fs: dlm: remove twice newline Alexander Aring
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Alexander Aring @ 2023-07-18 18:07 UTC (permalink / raw)
  To: teigland
  Cc: cluster-devel, aahringo, agruenba, mark, jlbec, joseph.qi, ocfs2-devel

Hi,

this patch series implements a lock cancellation feature. A pending lock
request gets cancelled either when a POSIX F_SETLKW request is
interrupted by a signal (the most common use case) or when lockd sends a
F_CANCELLK request.

As a note: the current nfs handling seems to be broken. This series
assumes that the current behaviour works; future patches will fix the
nfs handling, which seems to be broken anyway.

- Alex

changes since v3:

- drop patch "fs: dlm: ignore DLM_PLOCK_FL_CLOSE flag results"; we assume
  that plock ops with the DLM_PLOCK_FL_CLOSE flag set will never fail.
- make DLM_PLOCK_OP_CANCEL always send a reply back. This is useful when
  the op fails, e.g. an older dlm_controld will return -EINVAL, and it
  allows implementing F_CANCELLK, which does not hold a reference to the
  plock_op instance.
- remove DLM_PLOCK_OP_FLAG_SENT, as it was only an optimization for a
  rare case. The reply that DLM_PLOCK_OP_CANCEL now sends back provides
  the synchronization instead.
- remove DLM_PLOCK_OP_FLAG_INTERRUPTED, as it is no longer necessary:
  since we wait for the reply to DLM_PLOCK_OP_CANCEL, this special case
  no longer needs to be handled.
- add "fs: dlm: remove twice newline", which I spotted while doing nfs
  lockd experiments.

Alexander Aring (3):
  fs: dlm: remove twice newline
  fs: dlm: allow to F_SETLKW getting interrupted
  fs: dlm: fix F_CANCELLK to cancel pending request

 fs/dlm/plock.c                 | 162 ++++++++++++++++++++++++++-------
 fs/gfs2/file.c                 |   9 +-
 fs/ocfs2/stack_user.c          |  13 +--
 include/linux/dlm_plock.h      |   2 +
 include/uapi/linux/dlm_plock.h |   1 +
 5 files changed, 136 insertions(+), 51 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCHv3 v6.5-rc2 1/3] fs: dlm: remove twice newline
  2023-07-18 18:07 [PATCHv3 v6.5-rc2 0/3] fs: dlm: lock cancellation feature Alexander Aring
@ 2023-07-18 18:07 ` Alexander Aring
  2023-07-18 18:07 ` [PATCHv3 v6.5-rc2 2/3] fs: dlm: allow to F_SETLKW getting interrupted Alexander Aring
  2023-07-18 18:07 ` [PATCHv3 v6.5-rc2 3/3] fs: dlm: fix F_CANCELLK to cancel pending request Alexander Aring
  2 siblings, 0 replies; 10+ messages in thread
From: Alexander Aring @ 2023-07-18 18:07 UTC (permalink / raw)
  To: teigland
  Cc: cluster-devel, aahringo, agruenba, mark, jlbec, joseph.qi, ocfs2-devel

This patch removes a newline which log_print() already adds, and joins
the wrapped string, which caused a checkpatch warning.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/dlm/plock.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
index 70a4752ed913..a34f605d8505 100644
--- a/fs/dlm/plock.c
+++ b/fs/dlm/plock.c
@@ -240,8 +240,8 @@ static int dlm_plock_callback(struct plock_op *op)
 	rv = notify(fl, 0);
 	if (rv) {
 		/* XXX: We need to cancel the fs lock here: */
-		log_print("dlm_plock_callback: lock granted after lock request "
-			  "failed; dangling lock!\n");
+		log_print("%s: lock granted after lock request failed; dangling lock!",
+			  __func__);
 		goto out;
 	}
 
-- 
2.31.1



* [PATCHv3 v6.5-rc2 2/3] fs: dlm: allow to F_SETLKW getting interrupted
  2023-07-18 18:07 [PATCHv3 v6.5-rc2 0/3] fs: dlm: lock cancellation feature Alexander Aring
  2023-07-18 18:07 ` [PATCHv3 v6.5-rc2 1/3] fs: dlm: remove twice newline Alexander Aring
@ 2023-07-18 18:07 ` Alexander Aring
  2024-03-25 15:08   ` Andreas Gruenbacher
  2023-07-18 18:07 ` [PATCHv3 v6.5-rc2 3/3] fs: dlm: fix F_CANCELLK to cancel pending request Alexander Aring
  2 siblings, 1 reply; 10+ messages in thread
From: Alexander Aring @ 2023-07-18 18:07 UTC (permalink / raw)
  To: teigland
  Cc: cluster-devel, aahringo, agruenba, mark, jlbec, joseph.qi, ocfs2-devel

This patch implements the dlm plock F_SETLKW interruption feature. If a
blocking POSIX lock request gets interrupted in user space by a signal,
a cancellation request for the not yet granted lock request is sent to
the user space lock manager. The user space lock manager answers either
with zero or a negative errno code. An errno of -ENOENT signals that
there is currently no blocking lock request waiting to be granted; in
that case it was probably too late to request a cancellation and the
pending lock got granted. In any error case we wait until the lock is
granted, since the cancellation failed. This also means that with an
older user space lock manager returning -EINVAL we will wait, because
cancellation is not supported, which should be fine. If a user requires
this feature, the dlm user space should be updated to support lock
request cancellation.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/dlm/plock.c                 | 56 ++++++++++++++++++++++------------
 include/uapi/linux/dlm_plock.h |  1 +
 2 files changed, 37 insertions(+), 20 deletions(-)

diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
index a34f605d8505..a8ffa0760913 100644
--- a/fs/dlm/plock.c
+++ b/fs/dlm/plock.c
@@ -74,30 +74,26 @@ static void send_op(struct plock_op *op)
 	wake_up(&send_wq);
 }
 
-/* If a process was killed while waiting for the only plock on a file,
-   locks_remove_posix will not see any lock on the file so it won't
-   send an unlock-close to us to pass on to userspace to clean up the
-   abandoned waiter.  So, we have to insert the unlock-close when the
-   lock call is interrupted. */
-
-static void do_unlock_close(const struct dlm_plock_info *info)
+static int do_lock_cancel(const struct dlm_plock_info *orig_info)
 {
 	struct plock_op *op;
+	int rv;
 
 	op = kzalloc(sizeof(*op), GFP_NOFS);
 	if (!op)
-		return;
+		return -ENOMEM;
+
+	op->info = *orig_info;
+	op->info.optype = DLM_PLOCK_OP_CANCEL;
+	op->info.wait = 0;
 
-	op->info.optype		= DLM_PLOCK_OP_UNLOCK;
-	op->info.pid		= info->pid;
-	op->info.fsid		= info->fsid;
-	op->info.number		= info->number;
-	op->info.start		= 0;
-	op->info.end		= OFFSET_MAX;
-	op->info.owner		= info->owner;
-
-	op->info.flags |= DLM_PLOCK_FL_CLOSE;
 	send_op(op);
+	wait_event(recv_wq, (op->done != 0));
+
+	rv = op->info.rv;
+
+	dlm_release_plock_op(op);
+	return rv;
 }
 
 int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
@@ -156,7 +152,7 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
 	send_op(op);
 
 	if (op->info.wait) {
-		rv = wait_event_killable(recv_wq, (op->done != 0));
+		rv = wait_event_interruptible(recv_wq, (op->done != 0));
 		if (rv == -ERESTARTSYS) {
 			spin_lock(&ops_lock);
 			/* recheck under ops_lock if we got a done != 0,
@@ -166,17 +162,37 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
 				spin_unlock(&ops_lock);
 				goto do_lock_wait;
 			}
-			list_del(&op->list);
 			spin_unlock(&ops_lock);
 
+			rv = do_lock_cancel(&op->info);
+			switch (rv) {
+			case 0:
+				/* waiter was deleted in user space, answer will never come
+				 * remove original request. The original request must be
+				 * on recv_list because the answer of do_lock_cancel()
+				 * synchronized it.
+				 */
+				spin_lock(&ops_lock);
+				list_del(&op->list);
+				spin_unlock(&ops_lock);
+				rv = -EINTR;
+				break;
+			case -ENOENT:
+				/* cancellation wasn't successful but op should be done */
+				fallthrough;
+			default:
+				/* internal error doing cancel we need to wait */
+				goto wait;
+			}
+
 			log_debug(ls, "%s: wait interrupted %x %llx pid %d",
 				  __func__, ls->ls_global_id,
 				  (unsigned long long)number, op->info.pid);
-			do_unlock_close(&op->info);
 			dlm_release_plock_op(op);
 			goto out;
 		}
 	} else {
+wait:
 		wait_event(recv_wq, (op->done != 0));
 	}
 
diff --git a/include/uapi/linux/dlm_plock.h b/include/uapi/linux/dlm_plock.h
index 63b6c1fd9169..eb66afcac40e 100644
--- a/include/uapi/linux/dlm_plock.h
+++ b/include/uapi/linux/dlm_plock.h
@@ -22,6 +22,7 @@ enum {
 	DLM_PLOCK_OP_LOCK = 1,
 	DLM_PLOCK_OP_UNLOCK,
 	DLM_PLOCK_OP_GET,
+	DLM_PLOCK_OP_CANCEL,
 };
 
 #define DLM_PLOCK_FL_CLOSE 1
-- 
2.31.1



* [PATCHv3 v6.5-rc2 3/3] fs: dlm: fix F_CANCELLK to cancel pending request
  2023-07-18 18:07 [PATCHv3 v6.5-rc2 0/3] fs: dlm: lock cancellation feature Alexander Aring
  2023-07-18 18:07 ` [PATCHv3 v6.5-rc2 1/3] fs: dlm: remove twice newline Alexander Aring
  2023-07-18 18:07 ` [PATCHv3 v6.5-rc2 2/3] fs: dlm: allow to F_SETLKW getting interrupted Alexander Aring
@ 2023-07-18 18:07 ` Alexander Aring
  2023-07-19 18:33   ` Alexander Aring
  2023-07-25 14:26   ` Alexander Aring
  2 siblings, 2 replies; 10+ messages in thread
From: Alexander Aring @ 2023-07-18 18:07 UTC (permalink / raw)
  To: teigland
  Cc: cluster-devel, aahringo, agruenba, mark, jlbec, joseph.qi, ocfs2-devel

This patch fixes the current handling of F_CANCELLK by not just doing an
unlock, as we first need to try to cancel the lock. An unlock makes
sense for a non-blocking lock request, but a blocking lock request needs
to be cancelled while it is not yet granted. This patch fixes this
behaviour by first trying to cancel the lock request and, if that fails,
unlocking the lock, which appears to have been granted already.

Note: the nfs locking handling is currently disabled by commit
40595cdc93ed ("nfs: block notification on fs with its own ->lock").
However, DLM was never updated regarding this change. Future patches
will try to fix lockd lock requests for DLM. This patch currently
assumes the upstream DLM lockd handling is correct.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/dlm/plock.c            | 102 +++++++++++++++++++++++++++++++++-----
 fs/gfs2/file.c            |   9 ++--
 fs/ocfs2/stack_user.c     |  13 ++---
 include/linux/dlm_plock.h |   2 +
 4 files changed, 97 insertions(+), 29 deletions(-)

diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
index a8ffa0760913..84510994b177 100644
--- a/fs/dlm/plock.c
+++ b/fs/dlm/plock.c
@@ -42,6 +42,27 @@ static inline void set_version(struct dlm_plock_info *info)
 	info->version[2] = DLM_PLOCK_VERSION_PATCH;
 }
 
+static struct plock_op *plock_lookup_waiter(const struct dlm_plock_info *info)
+{
+	struct plock_op *op = NULL, *iter;
+
+	list_for_each_entry(iter, &recv_list, list) {
+		if (iter->info.fsid == info->fsid &&
+		    iter->info.number == info->number &&
+		    iter->info.owner == info->owner &&
+		    iter->info.pid == info->pid &&
+		    iter->info.start == info->start &&
+		    iter->info.end == info->end &&
+		    iter->info.ex == info->ex &&
+		    iter->info.wait) {
+			op = iter;
+			break;
+		}
+	}
+
+	return op;
+}
+
 static int check_version(struct dlm_plock_info *info)
 {
 	if ((DLM_PLOCK_VERSION_MAJOR != info->version[0]) ||
@@ -334,6 +355,73 @@ int dlm_posix_unlock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
 }
 EXPORT_SYMBOL_GPL(dlm_posix_unlock);
 
+/*
+ * NOTE: This implementation can only handle async lock requests as
+ * nfs issues them. It cannot handle cancellation of a pending lock
+ * request sitting in wait_event(), but for now nfs is the only local
+ * kernel user.
+ */
+int dlm_posix_cancel(dlm_lockspace_t *lockspace, u64 number, struct file *file,
+		     struct file_lock *fl)
+{
+	struct dlm_plock_info info;
+	struct plock_op *op;
+	struct dlm_ls *ls;
+	int rv;
+
+	/* this only works for async request for now and nfs is the only
+	 * kernel user right now.
+	 */
+	if (WARN_ON_ONCE(!fl->fl_lmops || !fl->fl_lmops->lm_grant))
+		return -EOPNOTSUPP;
+
+	ls = dlm_find_lockspace_local(lockspace);
+	if (!ls)
+		return -EINVAL;
+
+	info.pid = fl->fl_pid;
+	info.ex = (fl->fl_type == F_WRLCK);
+	info.fsid = ls->ls_global_id;
+	dlm_put_lockspace(ls);
+	info.number = number;
+	info.start = fl->fl_start;
+	info.end = fl->fl_end;
+	info.owner = (__u64)fl->fl_pid;
+
+	rv = do_lock_cancel(&info);
+	switch (rv) {
+	case 0:
+		spin_lock(&ops_lock);
+		/* lock request to cancel must be on recv_list because
+		 * do_lock_cancel() synchronizes it.
+		 */
+		op = plock_lookup_waiter(&info);
+		if (WARN_ON_ONCE(!op)) {
+			rv = -ENOLCK;
+			break;
+		}
+
+		list_del(&op->list);
+		spin_unlock(&ops_lock);
+		WARN_ON(op->info.optype != DLM_PLOCK_OP_LOCK);
+		op->data->callback(op->data->fl, -EINTR);
+		dlm_release_plock_op(op);
+		rv = -EINTR;
+		break;
+	case -ENOENT:
+	/* if cancel wasn't successful we probably were too late
+		 * or it was a non-blocking lock request, so just unlock it.
+		 */
+		rv = dlm_posix_unlock(lockspace, number, file, fl);
+		break;
+	default:
+		break;
+	}
+
+	return rv;
+}
+EXPORT_SYMBOL_GPL(dlm_posix_cancel);
+
 int dlm_posix_get(dlm_lockspace_t *lockspace, u64 number, struct file *file,
 		  struct file_lock *fl)
 {
@@ -457,19 +545,7 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count,
 	 */
 	spin_lock(&ops_lock);
 	if (info.wait) {
-		list_for_each_entry(iter, &recv_list, list) {
-			if (iter->info.fsid == info.fsid &&
-			    iter->info.number == info.number &&
-			    iter->info.owner == info.owner &&
-			    iter->info.pid == info.pid &&
-			    iter->info.start == info.start &&
-			    iter->info.end == info.end &&
-			    iter->info.ex == info.ex &&
-			    iter->info.wait) {
-				op = iter;
-				break;
-			}
-		}
+		op = plock_lookup_waiter(&info);
 	} else {
 		list_for_each_entry(iter, &recv_list, list) {
 			if (!iter->info.wait) {
diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index 1bf3c4453516..386eceb2f574 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -1436,17 +1436,14 @@ static int gfs2_lock(struct file *file, int cmd, struct file_lock *fl)
 
 	if (!(fl->fl_flags & FL_POSIX))
 		return -ENOLCK;
-	if (cmd == F_CANCELLK) {
-		/* Hack: */
-		cmd = F_SETLK;
-		fl->fl_type = F_UNLCK;
-	}
 	if (unlikely(gfs2_withdrawn(sdp))) {
 		if (fl->fl_type == F_UNLCK)
 			locks_lock_file_wait(file, fl);
 		return -EIO;
 	}
-	if (IS_GETLK(cmd))
+	if (cmd == F_CANCELLK)
+		return dlm_posix_cancel(ls->ls_dlm, ip->i_no_addr, file, fl);
+	else if (IS_GETLK(cmd))
 		return dlm_posix_get(ls->ls_dlm, ip->i_no_addr, file, fl);
 	else if (fl->fl_type == F_UNLCK)
 		return dlm_posix_unlock(ls->ls_dlm, ip->i_no_addr, file, fl);
diff --git a/fs/ocfs2/stack_user.c b/fs/ocfs2/stack_user.c
index 05d4414d0c33..9b76ee66aeb2 100644
--- a/fs/ocfs2/stack_user.c
+++ b/fs/ocfs2/stack_user.c
@@ -738,18 +738,11 @@ static int user_plock(struct ocfs2_cluster_connection *conn,
 	 *
 	 * Internally, fs/dlm will pass these to a misc device, which
 	 * a userspace daemon will read and write to.
-	 *
-	 * For now, cancel requests (which happen internally only),
-	 * are turned into unlocks. Most of this function taken from
-	 * gfs2_lock.
 	 */
 
-	if (cmd == F_CANCELLK) {
-		cmd = F_SETLK;
-		fl->fl_type = F_UNLCK;
-	}
-
-	if (IS_GETLK(cmd))
+	if (cmd == F_CANCELLK)
+		return dlm_posix_cancel(conn->cc_lockspace, ino, file, fl);
+	else if (IS_GETLK(cmd))
 		return dlm_posix_get(conn->cc_lockspace, ino, file, fl);
 	else if (fl->fl_type == F_UNLCK)
 		return dlm_posix_unlock(conn->cc_lockspace, ino, file, fl);
diff --git a/include/linux/dlm_plock.h b/include/linux/dlm_plock.h
index e6d76e8715a6..15fc856d198c 100644
--- a/include/linux/dlm_plock.h
+++ b/include/linux/dlm_plock.h
@@ -11,6 +11,8 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
 		int cmd, struct file_lock *fl);
 int dlm_posix_unlock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
 		struct file_lock *fl);
+int dlm_posix_cancel(dlm_lockspace_t *lockspace, u64 number, struct file *file,
+		     struct file_lock *fl);
 int dlm_posix_get(dlm_lockspace_t *lockspace, u64 number, struct file *file,
 		struct file_lock *fl);
 #endif
-- 
2.31.1



* Re: [PATCHv3 v6.5-rc2 3/3] fs: dlm: fix F_CANCELLK to cancel pending request
  2023-07-18 18:07 ` [PATCHv3 v6.5-rc2 3/3] fs: dlm: fix F_CANCELLK to cancel pending request Alexander Aring
@ 2023-07-19 18:33   ` Alexander Aring
  2023-07-25 14:26   ` Alexander Aring
  1 sibling, 0 replies; 10+ messages in thread
From: Alexander Aring @ 2023-07-19 18:33 UTC (permalink / raw)
  To: teigland; +Cc: cluster-devel, agruenba, mark, jlbec, joseph.qi, ocfs2-devel

Hi,

On Tue, Jul 18, 2023 at 2:07 PM Alexander Aring <aahringo@redhat.com> wrote:
>
> This patch fixes the current handling of F_CANCELLK by not just doing an
> unlock, as we first need to try to cancel the lock. An unlock makes
> sense for a non-blocking lock request, but a blocking lock request needs
> to be cancelled while it is not yet granted. This patch fixes this
> behaviour by first trying to cancel the lock request and, if that fails,
> unlocking the lock, which appears to have been granted already.
>
> Note: the nfs locking handling is currently disabled by commit
> 40595cdc93ed ("nfs: block notification on fs with its own ->lock").
> However, DLM was never updated regarding this change. Future patches
> will try to fix lockd lock requests for DLM. This patch currently
> assumes the upstream DLM lockd handling is correct.
>
> Signed-off-by: Alexander Aring <aahringo@redhat.com>
> ---
>  fs/dlm/plock.c            | 102 +++++++++++++++++++++++++++++++++-----
>  fs/gfs2/file.c            |   9 ++--
>  fs/ocfs2/stack_user.c     |  13 ++---
>  include/linux/dlm_plock.h |   2 +
>  4 files changed, 97 insertions(+), 29 deletions(-)
>
> diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
> index a8ffa0760913..84510994b177 100644
> --- a/fs/dlm/plock.c
> +++ b/fs/dlm/plock.c
> @@ -42,6 +42,27 @@ static inline void set_version(struct dlm_plock_info *info)
>         info->version[2] = DLM_PLOCK_VERSION_PATCH;
>  }
>
> +static struct plock_op *plock_lookup_waiter(const struct dlm_plock_info *info)
> +{
> +       struct plock_op *op = NULL, *iter;
> +
> +       list_for_each_entry(iter, &recv_list, list) {
> +               if (iter->info.fsid == info->fsid &&
> +                   iter->info.number == info->number &&
> +                   iter->info.owner == info->owner &&
> +                   iter->info.pid == info->pid &&
> +                   iter->info.start == info->start &&
> +                   iter->info.end == info->end &&
> +                   iter->info.ex == info->ex &&
> +                   iter->info.wait) {
> +                       op = iter;
> +                       break;
> +               }
> +       }
> +
> +       return op;
> +}
> +
>  static int check_version(struct dlm_plock_info *info)
>  {
>         if ((DLM_PLOCK_VERSION_MAJOR != info->version[0]) ||
> @@ -334,6 +355,73 @@ int dlm_posix_unlock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
>  }
>  EXPORT_SYMBOL_GPL(dlm_posix_unlock);
>
> +/*
> + * NOTE: This implementation can only handle async lock requests as
> + * nfs issues them. It cannot handle cancellation of a pending lock
> + * request sitting in wait_event(), but for now nfs is the only local
> + * kernel user.
> + */
> +int dlm_posix_cancel(dlm_lockspace_t *lockspace, u64 number, struct file *file,
> +                    struct file_lock *fl)
> +{
> +       struct dlm_plock_info info;
> +       struct plock_op *op;
> +       struct dlm_ls *ls;
> +       int rv;
> +
> +       /* this only works for async request for now and nfs is the only
> +        * kernel user right now.
> +        */
> +       if (WARN_ON_ONCE(!fl->fl_lmops || !fl->fl_lmops->lm_grant))
> +               return -EOPNOTSUPP;
> +
> +       ls = dlm_find_lockspace_local(lockspace);
> +       if (!ls)
> +               return -EINVAL;
> +

a

memset(&info, 0, sizeof(info));

is missing here.

- Alex



* Re: [PATCHv3 v6.5-rc2 3/3] fs: dlm: fix F_CANCELLK to cancel pending request
  2023-07-18 18:07 ` [PATCHv3 v6.5-rc2 3/3] fs: dlm: fix F_CANCELLK to cancel pending request Alexander Aring
  2023-07-19 18:33   ` Alexander Aring
@ 2023-07-25 14:26   ` Alexander Aring
  1 sibling, 0 replies; 10+ messages in thread
From: Alexander Aring @ 2023-07-25 14:26 UTC (permalink / raw)
  To: teigland; +Cc: cluster-devel, agruenba, mark, jlbec, joseph.qi, ocfs2-devel

Hi,

On Tue, Jul 18, 2023 at 2:07 PM Alexander Aring <aahringo@redhat.com> wrote:
>
> This patch fixes the current handling of F_CANCELLK by not just doing an
> unlock, as we first need to try to cancel the lock. An unlock makes
> sense for a non-blocking lock request, but a blocking lock request needs
> to be cancelled while it is not yet granted. This patch fixes this
> behaviour by first trying to cancel the lock request and, if that fails,
> unlocking the lock, which appears to have been granted already.
>
> Note: the nfs locking handling is currently disabled by commit
> 40595cdc93ed ("nfs: block notification on fs with its own ->lock").
> However, DLM was never updated regarding this change. Future patches
> will try to fix lockd lock requests for DLM. This patch currently
> assumes the upstream DLM lockd handling is correct.
>
> Signed-off-by: Alexander Aring <aahringo@redhat.com>
> ---
>  fs/dlm/plock.c            | 102 +++++++++++++++++++++++++++++++++-----
>  fs/gfs2/file.c            |   9 ++--
>  fs/ocfs2/stack_user.c     |  13 ++---
>  include/linux/dlm_plock.h |   2 +
>  4 files changed, 97 insertions(+), 29 deletions(-)
>
> diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
> index a8ffa0760913..84510994b177 100644
> --- a/fs/dlm/plock.c
> +++ b/fs/dlm/plock.c
> @@ -42,6 +42,27 @@ static inline void set_version(struct dlm_plock_info *info)
>         info->version[2] = DLM_PLOCK_VERSION_PATCH;
>  }
>
> +static struct plock_op *plock_lookup_waiter(const struct dlm_plock_info *info)
> +{
> +       struct plock_op *op = NULL, *iter;
> +
> +       list_for_each_entry(iter, &recv_list, list) {
> +               if (iter->info.fsid == info->fsid &&
> +                   iter->info.number == info->number &&
> +                   iter->info.owner == info->owner &&
> +                   iter->info.pid == info->pid &&
> +                   iter->info.start == info->start &&
> +                   iter->info.end == info->end &&
> +                   iter->info.ex == info->ex &&
> +                   iter->info.wait) {
> +                       op = iter;
> +                       break;
> +               }
> +       }
> +
> +       return op;
> +}
> +
>  static int check_version(struct dlm_plock_info *info)
>  {
>         if ((DLM_PLOCK_VERSION_MAJOR != info->version[0]) ||
> @@ -334,6 +355,73 @@ int dlm_posix_unlock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
>  }
>  EXPORT_SYMBOL_GPL(dlm_posix_unlock);
>
> +/*
> + * NOTE: This implementation can only handle async lock requests as
> + * nfs issues them. It cannot handle cancellation of a pending lock
> + * request sitting in wait_event(), but for now nfs is the only local
> + * kernel user.
> + */
> +int dlm_posix_cancel(dlm_lockspace_t *lockspace, u64 number, struct file *file,
> +                    struct file_lock *fl)
> +{
> +       struct dlm_plock_info info;
> +       struct plock_op *op;
> +       struct dlm_ls *ls;
> +       int rv;
> +
> +       /* this only works for async request for now and nfs is the only
> +        * kernel user right now.
> +        */
> +       if (WARN_ON_ONCE(!fl->fl_lmops || !fl->fl_lmops->lm_grant))
> +               return -EOPNOTSUPP;
> +
> +       ls = dlm_find_lockspace_local(lockspace);
> +       if (!ls)
> +               return -EINVAL;
> +
> +       info.pid = fl->fl_pid;
> +       info.ex = (fl->fl_type == F_WRLCK);
> +       info.fsid = ls->ls_global_id;
> +       dlm_put_lockspace(ls);
> +       info.number = number;
> +       info.start = fl->fl_start;
> +       info.end = fl->fl_end;
> +       info.owner = (__u64)fl->fl_pid;
> +
> +       rv = do_lock_cancel(&info);
> +       switch (rv) {
> +       case 0:
> +               spin_lock(&ops_lock);
> +               /* lock request to cancel must be on recv_list because
> +                * do_lock_cancel() synchronizes it.
> +                */
> +               op = plock_lookup_waiter(&info);
> +               if (WARN_ON_ONCE(!op)) {
> +                       rv = -ENOLCK;
> +                       break;

A spin_unlock() is missing here. I will add it to my upcoming patch series.

- Alex



* Re: [PATCHv3 v6.5-rc2 2/3] fs: dlm: allow to F_SETLKW getting interrupted
  2023-07-18 18:07 ` [PATCHv3 v6.5-rc2 2/3] fs: dlm: allow to F_SETLKW getting interrupted Alexander Aring
@ 2024-03-25 15:08   ` Andreas Gruenbacher
  2024-03-26  0:32     ` Alexander Aring
  0 siblings, 1 reply; 10+ messages in thread
From: Andreas Gruenbacher @ 2024-03-25 15:08 UTC (permalink / raw)
  To: Alexander Aring; +Cc: gfs2, teigland, mark, jlbec, joseph.qi, ocfs2-devel

On Tue, Jul 18, 2023 at 8:07 PM Alexander Aring <aahringo@redhat.com> wrote:
> This patch implements the dlm plock F_SETLKW interruption feature. If a
> blocking POSIX lock request gets interrupted in user space by a signal,
> a cancellation request for the not yet granted lock request is sent to
> the user space lock manager. The user space lock manager answers either
> with zero or a negative errno code. An errno of -ENOENT signals that
> there is currently no blocking lock request waiting to be granted; in
> that case it was probably too late to request a cancellation and the
> pending lock got granted. In any error case we wait until the lock is
> granted, since the cancellation failed. This also means that with an
> older user space lock manager returning -EINVAL we will wait, because
> cancellation is not supported, which should be fine. If a user requires
> this feature, the dlm user space should be updated to support lock
> request cancellation.
>
> Signed-off-by: Alexander Aring <aahringo@redhat.com>
> ---
>  fs/dlm/plock.c                 | 56 ++++++++++++++++++++++------------
>  include/uapi/linux/dlm_plock.h |  1 +
>  2 files changed, 37 insertions(+), 20 deletions(-)
>
> diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
> index a34f605d8505..a8ffa0760913 100644
> --- a/fs/dlm/plock.c
> +++ b/fs/dlm/plock.c
> @@ -74,30 +74,26 @@ static void send_op(struct plock_op *op)
>         wake_up(&send_wq);
>  }
>
> -/* If a process was killed while waiting for the only plock on a file,
> -   locks_remove_posix will not see any lock on the file so it won't
> -   send an unlock-close to us to pass on to userspace to clean up the
> -   abandoned waiter.  So, we have to insert the unlock-close when the
> -   lock call is interrupted. */
> -
> -static void do_unlock_close(const struct dlm_plock_info *info)
> +static int do_lock_cancel(const struct dlm_plock_info *orig_info)
>  {
>         struct plock_op *op;
> +       int rv;
>
>         op = kzalloc(sizeof(*op), GFP_NOFS);
>         if (!op)
> -               return;
> +               return -ENOMEM;
> +
> +       op->info = *orig_info;
> +       op->info.optype = DLM_PLOCK_OP_CANCEL;
> +       op->info.wait = 0;
>
> -       op->info.optype         = DLM_PLOCK_OP_UNLOCK;
> -       op->info.pid            = info->pid;
> -       op->info.fsid           = info->fsid;
> -       op->info.number         = info->number;
> -       op->info.start          = 0;
> -       op->info.end            = OFFSET_MAX;
> -       op->info.owner          = info->owner;
> -
> -       op->info.flags |= DLM_PLOCK_FL_CLOSE;
>         send_op(op);
> +       wait_event(recv_wq, (op->done != 0));
> +
> +       rv = op->info.rv;
> +
> +       dlm_release_plock_op(op);
> +       return rv;
>  }
>
>  int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
> @@ -156,7 +152,7 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
>         send_op(op);
>
>         if (op->info.wait) {
> -               rv = wait_event_killable(recv_wq, (op->done != 0));
> +               rv = wait_event_interruptible(recv_wq, (op->done != 0));

It seems that this patch leads to an unnecessary change in behavior
when a fatal signal is received (fatal_signal_pending()): before, the
process would terminate. Now, it will try to cancel the lock, and when
that fails, the process will keep waiting. In case of a fatal signal,
can we skip the cancelling and do what we did before?

>                 if (rv == -ERESTARTSYS) {
>                         spin_lock(&ops_lock);
>                         /* recheck under ops_lock if we got a done != 0,
> @@ -166,17 +162,37 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
>                                 spin_unlock(&ops_lock);
>                                 goto do_lock_wait;
>                         }
> -                       list_del(&op->list);
>                         spin_unlock(&ops_lock);
>
> +                       rv = do_lock_cancel(&op->info);
> +                       switch (rv) {
> +                       case 0:
> +                               /* waiter was deleted in user space, answer will never come
> +                                * remove original request. The original request must be
> +                                * on recv_list because the answer of do_lock_cancel()
> +                                * synchronized it.
> +                                */
> +                               spin_lock(&ops_lock);
> +                               list_del(&op->list);
> +                               spin_unlock(&ops_lock);
> +                               rv = -EINTR;
> +                               break;
> +                       case -ENOENT:
> +                               /* cancellation wasn't successful but op should be done */
> +                               fallthrough;
> +                       default:
> +                               /* internal error doing cancel we need to wait */
> +                               goto wait;
> +                       }
> +
>                         log_debug(ls, "%s: wait interrupted %x %llx pid %d",
>                                   __func__, ls->ls_global_id,
>                                   (unsigned long long)number, op->info.pid);
> -                       do_unlock_close(&op->info);
>                         dlm_release_plock_op(op);
>                         goto out;
>                 }
>         } else {
> +wait:
>                 wait_event(recv_wq, (op->done != 0));
>         }
>
> diff --git a/include/uapi/linux/dlm_plock.h b/include/uapi/linux/dlm_plock.h
> index 63b6c1fd9169..eb66afcac40e 100644
> --- a/include/uapi/linux/dlm_plock.h
> +++ b/include/uapi/linux/dlm_plock.h
> @@ -22,6 +22,7 @@ enum {
>         DLM_PLOCK_OP_LOCK = 1,
>         DLM_PLOCK_OP_UNLOCK,
>         DLM_PLOCK_OP_GET,
> +       DLM_PLOCK_OP_CANCEL,
>  };
>
>  #define DLM_PLOCK_FL_CLOSE 1
> --
> 2.31.1
>

Thanks,
Andreas



* Re: [PATCHv3 v6.5-rc2 2/3] fs: dlm: allow to F_SETLKW getting interrupted
  2024-03-25 15:08   ` Andreas Gruenbacher
@ 2024-03-26  0:32     ` Alexander Aring
  2024-03-26 11:31       ` Andreas Gruenbacher
  0 siblings, 1 reply; 10+ messages in thread
From: Alexander Aring @ 2024-03-26  0:32 UTC (permalink / raw)
  To: Andreas Gruenbacher; +Cc: gfs2, teigland, mark, jlbec, joseph.qi, ocfs2-devel

Hi,

On Mon, Mar 25, 2024 at 11:09 AM Andreas Gruenbacher
<agruenba@redhat.com> wrote:
>
> On Tue, Jul 18, 2023 at 8:07 PM Alexander Aring <aahringo@redhat.com> wrote:
> > This patch implements the dlm plock F_SETLKW interruption feature. If a
> > blocking POSIX lock request gets interrupted in user space by a signal,
> > a cancellation request for the not yet granted lock request is sent to
> > the user space lock manager. The user space lock manager answers either
> > with zero or a negative errno code. An errno of -ENOENT signals that
> > there is currently no blocking lock request waiting to be granted; in
> > that case it was probably too late to request a cancellation and the
> > pending lock got granted. In any error case we wait until the lock is
> > granted, since the cancellation failed. This also means that with an
> > older user space lock manager returning -EINVAL we will wait, because
> > cancellation is not supported, which should be fine. If a user requires
> > this feature, the dlm user space should be updated to support lock
> > request cancellation.
> >
> > Signed-off-by: Alexander Aring <aahringo@redhat.com>
> > ---
> >  fs/dlm/plock.c                 | 56 ++++++++++++++++++++++------------
> >  include/uapi/linux/dlm_plock.h |  1 +
> >  2 files changed, 37 insertions(+), 20 deletions(-)
> >
> > diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
> > index a34f605d8505..a8ffa0760913 100644
> > --- a/fs/dlm/plock.c
> > +++ b/fs/dlm/plock.c
> > @@ -74,30 +74,26 @@ static void send_op(struct plock_op *op)
> >         wake_up(&send_wq);
> >  }
> >
> > -/* If a process was killed while waiting for the only plock on a file,
> > -   locks_remove_posix will not see any lock on the file so it won't
> > -   send an unlock-close to us to pass on to userspace to clean up the
> > -   abandoned waiter.  So, we have to insert the unlock-close when the
> > -   lock call is interrupted. */
> > -
> > -static void do_unlock_close(const struct dlm_plock_info *info)
> > +static int do_lock_cancel(const struct dlm_plock_info *orig_info)
> >  {
> >         struct plock_op *op;
> > +       int rv;
> >
> >         op = kzalloc(sizeof(*op), GFP_NOFS);
> >         if (!op)
> > -               return;
> > +               return -ENOMEM;
> > +
> > +       op->info = *orig_info;
> > +       op->info.optype = DLM_PLOCK_OP_CANCEL;
> > +       op->info.wait = 0;
> >
> > -       op->info.optype         = DLM_PLOCK_OP_UNLOCK;
> > -       op->info.pid            = info->pid;
> > -       op->info.fsid           = info->fsid;
> > -       op->info.number         = info->number;
> > -       op->info.start          = 0;
> > -       op->info.end            = OFFSET_MAX;
> > -       op->info.owner          = info->owner;
> > -
> > -       op->info.flags |= DLM_PLOCK_FL_CLOSE;
> >         send_op(op);
> > +       wait_event(recv_wq, (op->done != 0));
> > +
> > +       rv = op->info.rv;
> > +
> > +       dlm_release_plock_op(op);
> > +       return rv;
> >  }
> >
> >  int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
> > @@ -156,7 +152,7 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
> >         send_op(op);
> >
> >         if (op->info.wait) {
> > -               rv = wait_event_killable(recv_wq, (op->done != 0));
> > +               rv = wait_event_interruptible(recv_wq, (op->done != 0));
>
> It seems that this patch leads to an unnecessary change in behavior
> when a fatal signal is received (fatal_signal_pending()): before, the
> process would terminate. Now, it will try to cancel the lock, and when
> that fails, the process will keep waiting. In case of a fatal signal,
> can we skip the cancelling and do what we did before?

From my understanding, wait_event_interruptible() reacts to everything
that wait_event_killable() does and returns -ERESTARTSYS on a fatal
signal. I even tested it, because wait_event_killable() has an issue;
see the reproducer [0]. The issue was that it cleaned up too many
waiters: the other waiter of the child in F_SETLKW was also cleared,
and it would never get a result back from dlm_controld. I fixed that
with an additional check on the pid in [1], but I have no idea what
other side effects that could have, as FL_CLOSE is also used in other
parts of the DLM plock handling.

I rechecked the behaviour with the cancellation feature, sent SIGKILL,
and the issue was gone without changing anything in user space. The
only case in which we do not get the old (killable) behaviour, which
had the issue mentioned above, is when the dlm_controld version is too
old. To avoid running into this known issue we then just do a
wait_event(), which does not have those problems.

The mentioned "cancellation fails" case does not mean that cancelling
the lock failed; it means dlm_controld behaved unexpectedly, e.g. we
received -EINVAL because dlm_controld does not understand the op. Only
then do we fall back to wait_event().

- Alex

[0] https://gitlab.com/netcoder/ltp/-/blob/dlm_fcntl_owner_testcase/testcases/kernel/syscalls/fcntl/fcntl44.c
[1] https://pagure.io/dlm/blob/main/f/dlm_controld/plock.c#_655


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCHv3 v6.5-rc2 2/3] fs: dlm: allow to F_SETLKW getting interrupted
  2024-03-26  0:32     ` Alexander Aring
@ 2024-03-26 11:31       ` Andreas Gruenbacher
  2024-03-26 13:02         ` Alexander Aring
  0 siblings, 1 reply; 10+ messages in thread
From: Andreas Gruenbacher @ 2024-03-26 11:31 UTC (permalink / raw)
  To: Alexander Aring; +Cc: gfs2, teigland, mark, jlbec, joseph.qi, ocfs2-devel

On Tue, Mar 26, 2024 at 1:32 AM Alexander Aring <aahringo@redhat.com> wrote:
> Hi,
>
> On Mon, Mar 25, 2024 at 11:09 AM Andreas Gruenbacher
> <agruenba@redhat.com> wrote:
> >
> > [...]
> >
> > It seems that this patch leads to an unnecessary change in behavior
> > when a fatal signal is received (fatal_signal_pending()): before, the
> > process would terminate. Now, it will try to cancel the lock, and when
> > that fails, the process will keep waiting. In case of a fatal signal,
> > can we skip the cancelling and do what we did before?
>
> From my understanding, wait_event_interruptible() reacts to everything
> that wait_event_killable() does and returns -ERESTARTSYS on a fatal
> signal. I even tested it, because wait_event_killable() has an issue;
> see the reproducer [0]. The issue was that it cleaned up too many
> waiters: the other waiter of the child in F_SETLKW was also cleared,
> and it would never get a result back from dlm_controld. I fixed that
> with an additional check on the pid in [1], but I have no idea what
> other side effects that could have, as FL_CLOSE is also used in other
> parts of the DLM plock handling.
>
> I rechecked the behaviour with the cancellation feature, sent SIGKILL,
> and the issue was gone without changing anything in user space. The
> only case in which we do not get the old (killable) behaviour, which
> had the issue mentioned above, is when the dlm_controld version is too
> old. To avoid running into this known issue we then just do a
> wait_event(), which does not have those problems.
>
> The mentioned "cancellation fails" case does not mean that cancelling
> the lock failed; it means dlm_controld behaved unexpectedly, e.g. we
> received -EINVAL because dlm_controld does not understand the op. Only
> then do we fall back to wait_event().

What happens on a system that has this kernel change, but doesn't have
the corresponding dlm_controld change for DLM_PLOCK_OP_CANCEL support?
In this scenario, when a process is killed with SIGKILL, the kernel
will send a DLM_PLOCK_OP_CANCEL request to dlm_controld. dlm_controld
doesn't understand the DLM_PLOCK_OP_CANCEL request, so I assume that
it will return a value other than 0 or -ENOENT. As a consequence, we
will end up in wait_event(), which isn't interruptible. So before this
kernel change, the process would be killable, but with this kernel
change, it isn't killable anymore.

I'm worried about this scenario because it isn't entirely unrealistic
for a system to end up with this kernel change but without the
corresponding user-space change.

Andreas


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCHv3 v6.5-rc2 2/3] fs: dlm: allow to F_SETLKW getting interrupted
  2024-03-26 11:31       ` Andreas Gruenbacher
@ 2024-03-26 13:02         ` Alexander Aring
  0 siblings, 0 replies; 10+ messages in thread
From: Alexander Aring @ 2024-03-26 13:02 UTC (permalink / raw)
  To: Andreas Gruenbacher; +Cc: gfs2, teigland, mark, jlbec, joseph.qi, ocfs2-devel

Hi,

On Tue, Mar 26, 2024 at 7:31 AM Andreas Gruenbacher <agruenba@redhat.com> wrote:
>
> On Tue, Mar 26, 2024 at 1:32 AM Alexander Aring <aahringo@redhat.com> wrote:
> > Hi,
> >
> > On Mon, Mar 25, 2024 at 11:09 AM Andreas Gruenbacher
> > <agruenba@redhat.com> wrote:
> > >
> > > [...]
> > >
> > > It seems that this patch leads to an unnecessary change in behavior
> > > when a fatal signal is received (fatal_signal_pending()): before, the
> > > process would terminate. Now, it will try to cancel the lock, and when
> > > that fails, the process will keep waiting. In case of a fatal signal,
> > > can we skip the cancelling and do what we did before?
> >
> > [...]
>
> What happens on a system that has this kernel change, but doesn't have
> the corresponding dlm_controld change for DLM_PLOCK_OP_CANCEL support?
> In this scenario, when a process is killed with SIGKILL, the kernel
> will send a DLM_PLOCK_OP_CANCEL request to dlm_controld. dlm_controld
> doesn't understand the DLM_PLOCK_OP_CANCEL request, so I assume that
> it will return a value other than 0 or -ENOENT. As a consequence, we
> will end up in wait_event(), which isn't interruptible. So before this
> kernel change, the process would be killable, but with this kernel
> change, it isn't killable anymore.
>
> I'm worried about this scenario because it isn't entirely unrealistic
> for a system to end up with this kernel change but without the
> corresponding user-space change.
>

As I said before, the previous behaviour had broken cases, and I am
more worried about those. If somebody runs into this case, they will
notice that the process is not killable anymore, but will then figure
out that they need to upgrade dlm_controld.

- Alex


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2024-03-26 13:02 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-18 18:07 [PATCHv3 v6.5-rc2 0/3] fs: dlm: lock cancellation feature Alexander Aring
2023-07-18 18:07 ` [PATCHv3 v6.5-rc2 1/3] fs: dlm: remove twice newline Alexander Aring
2023-07-18 18:07 ` [PATCHv3 v6.5-rc2 2/3] fs: dlm: allow to F_SETLKW getting interrupted Alexander Aring
2024-03-25 15:08   ` Andreas Gruenbacher
2024-03-26  0:32     ` Alexander Aring
2024-03-26 11:31       ` Andreas Gruenbacher
2024-03-26 13:02         ` Alexander Aring
2023-07-18 18:07 ` [PATCHv3 v6.5-rc2 3/3] fs: dlm: fix F_CANCELLK to cancel pending request Alexander Aring
2023-07-19 18:33   ` Alexander Aring
2023-07-25 14:26   ` Alexander Aring

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).