* [PATCH dlm/next 0/3] fs: dlm: some callback fixes
From: Alexander Aring @ 2022-07-21 21:53 UTC
  To: teigland; +Cc: cluster-devel, stable, aahringo

Hi,

I am currently looking a little deeper into the callback handling of
dlm and I think we have some issues there, especially with the
refcounting and the per-lkb queue_work(). I have some local branches
that rework this area, moving away from the per-lkb lkb_callbacks[]
array and using a queue to handle callbacks. However, these are the
issues I have found so far and they should be fixed.

- Alex

Alexander Aring (3):
  fs: dlm: fix race between test_bit() and queue_work()
  fs: dlm: avoid double list_add() for lkb->lkb_cb_list
  fs: dlm: fix refcount handling for dlm_add_cb()

 fs/dlm/ast.c | 28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)

-- 
2.31.1


* [PATCH dlm/next 1/3] fs: dlm: fix race between test_bit() and queue_work()
From: Alexander Aring @ 2022-07-21 21:53 UTC
  To: teigland; +Cc: cluster-devel, stable, aahringo

This patch fixes a race by taking ls_cb_mutex around both the set_bit()
and the test_bit() with its conditional code blocks for LSFL_CB_DELAY.

The function dlm_callback_suspend() is meant to stop all callbacks and
flush all currently queued ones. The set_bit() alone is not enough
because a queue_work() can still run after the workqueue was flushed.
To avoid a queue_work() after the set_bit(), we surround both with the
ls_cb_mutex lock.
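
The problematic interleaving looks roughly like this (an illustrative
sketch, not an observed trace):

  CPU0 (dlm_add_cb)                  CPU1 (dlm_callback_suspend)
  ---------------------------------  ----------------------------------
  test_bit(LSFL_CB_DELAY) -> false
                                     set_bit(LSFL_CB_DELAY)
                                     flush_workqueue(ls->ls_callback_wq)
  queue_work(&lkb->lkb_cb_work)

The work is queued after the flush, so dlm_callback_suspend() returns
while a callback is still pending. With both sides under ls_cb_mutex,
the dlm_add_cb() path either queues the work before the flush or sees
LSFL_CB_DELAY and adds the lkb to ls_cb_delay instead.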

Cc: stable@vger.kernel.org
Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/dlm/ast.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/dlm/ast.c b/fs/dlm/ast.c
index 19ef136f9e4f..a44cc42b6317 100644
--- a/fs/dlm/ast.c
+++ b/fs/dlm/ast.c
@@ -200,13 +200,13 @@ void dlm_add_cb(struct dlm_lkb *lkb, uint32_t flags, int mode, int status,
 	if (!prev_seq) {
 		kref_get(&lkb->lkb_ref);
 
+		mutex_lock(&ls->ls_cb_mutex);
 		if (test_bit(LSFL_CB_DELAY, &ls->ls_flags)) {
-			mutex_lock(&ls->ls_cb_mutex);
 			list_add(&lkb->lkb_cb_list, &ls->ls_cb_delay);
-			mutex_unlock(&ls->ls_cb_mutex);
 		} else {
 			queue_work(ls->ls_callback_wq, &lkb->lkb_cb_work);
 		}
+		mutex_unlock(&ls->ls_cb_mutex);
 	}
  out:
 	mutex_unlock(&lkb->lkb_cb_mutex);
@@ -288,7 +288,9 @@ void dlm_callback_stop(struct dlm_ls *ls)
 
 void dlm_callback_suspend(struct dlm_ls *ls)
 {
+	mutex_lock(&ls->ls_cb_mutex);
 	set_bit(LSFL_CB_DELAY, &ls->ls_flags);
+	mutex_unlock(&ls->ls_cb_mutex);
 
 	if (ls->ls_callback_wq)
 		flush_workqueue(ls->ls_callback_wq);
-- 
2.31.1


* [PATCH dlm/next 2/3] fs: dlm: avoid double list_add() for lkb->lkb_cb_list
From: Alexander Aring @ 2022-07-21 21:53 UTC
  To: teigland; +Cc: cluster-devel, stable, aahringo

dlm_add_cb() can run multiple times for an lkb to add a callback,
calling list_add() to defer the queue_work() until later. If it is
called multiple times while test_bit(LSFL_CB_DELAY, &ls->ls_flags) is
true, we need to ensure the lkb is added to the list only once, or the
list gets corrupted. The list holder lkb->lkb_cb_list is managed with
list_del_init()/INIT_LIST_HEAD(), so list_empty() can be used to check
whether it is currently not part of a list.
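
As a minimal illustration of that invariant (assuming the entry is only
ever removed with list_del_init(), as this code does):

	list_del_init(&lkb->lkb_cb_list);  /* entry points back to itself */
	...
	if (list_empty(&lkb->lkb_cb_list)) /* true: not on any list */
		list_add(&lkb->lkb_cb_list, &ls->ls_cb_delay);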

Cc: stable@vger.kernel.org
Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/dlm/ast.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/dlm/ast.c b/fs/dlm/ast.c
index a44cc42b6317..0271796d36b1 100644
--- a/fs/dlm/ast.c
+++ b/fs/dlm/ast.c
@@ -202,7 +202,8 @@ void dlm_add_cb(struct dlm_lkb *lkb, uint32_t flags, int mode, int status,
 
 		mutex_lock(&ls->ls_cb_mutex);
 		if (test_bit(LSFL_CB_DELAY, &ls->ls_flags)) {
-			list_add(&lkb->lkb_cb_list, &ls->ls_cb_delay);
+			if (list_empty(&lkb->lkb_cb_list))
+				list_add(&lkb->lkb_cb_list, &ls->ls_cb_delay);
 		} else {
 			queue_work(ls->ls_callback_wq, &lkb->lkb_cb_work);
 		}
-- 
2.31.1


* [PATCH dlm/next 3/3] fs: dlm: fix refcount handling for dlm_add_cb()
From: Alexander Aring @ 2022-07-21 21:53 UTC
  To: teigland; +Cc: cluster-devel, stable, aahringo

Each time dlm_add_cb() queues work or adds the lkb to the
ls->ls_cb_delay list for queuing later, it increments a refcount.
However, if the work is already queued or the lkb is already on the
list, we need to revert that refcount increment. dlm_add_cb() can be
called multiple times before the related dlm_callback_work() work
function runs and drops the reference with a put call. This patch
reverts the kref_get() when necessary, in the cases where the work or
list entry is already queued.

In case of dlm_callback_resume() we need to ensure that the
LSFL_CB_DELAY bit is cleared only after all ls->ls_cb_delay lkbs are
queued for work. As the ls->ls_cb_delay list exists to defer queuing
work for later, a work item should never already be queued at that
point; if it is, we emit a warning.
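
The resulting get/put pairing is then roughly (a simplified sketch of
the patched dlm_add_cb() path):

	kref_get(&lkb->lkb_ref);  /* taken optimistically */
	...
	/* queued stays true if the lkb was already on ls_cb_delay or
	 * queue_work() returned false (work already pending) */
	if (queued)
		dlm_put_lkb(lkb);  /* revert the optimistic get */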

Cc: stable@vger.kernel.org
Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/dlm/ast.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/fs/dlm/ast.c b/fs/dlm/ast.c
index 0271796d36b1..68e09ed8234e 100644
--- a/fs/dlm/ast.c
+++ b/fs/dlm/ast.c
@@ -177,6 +177,7 @@ void dlm_add_cb(struct dlm_lkb *lkb, uint32_t flags, int mode, int status,
 {
 	struct dlm_ls *ls = lkb->lkb_resource->res_ls;
 	uint64_t new_seq, prev_seq;
+	bool queued = true;
 	int rv;
 
 	spin_lock(&dlm_cb_seq_spin);
@@ -202,13 +203,19 @@ void dlm_add_cb(struct dlm_lkb *lkb, uint32_t flags, int mode, int status,
 
 		mutex_lock(&ls->ls_cb_mutex);
 		if (test_bit(LSFL_CB_DELAY, &ls->ls_flags)) {
-			if (list_empty(&lkb->lkb_cb_list))
+			if (list_empty(&lkb->lkb_cb_list)) {
 				list_add(&lkb->lkb_cb_list, &ls->ls_cb_delay);
+				queued = false;
+			}
 		} else {
-			queue_work(ls->ls_callback_wq, &lkb->lkb_cb_work);
+			queued = !queue_work(ls->ls_callback_wq, &lkb->lkb_cb_work);
 		}
 		mutex_unlock(&ls->ls_cb_mutex);
+
+		if (queued)
+			dlm_put_lkb(lkb);
 	}
+
  out:
 	mutex_unlock(&lkb->lkb_cb_mutex);
 }
@@ -303,9 +310,7 @@ void dlm_callback_resume(struct dlm_ls *ls)
 {
 	struct dlm_lkb *lkb, *safe;
 	int count = 0, sum = 0;
-	bool empty;
-
-	clear_bit(LSFL_CB_DELAY, &ls->ls_flags);
+	bool empty, queued;
 
 	if (!ls->ls_callback_wq)
 		return;
@@ -314,12 +319,16 @@ void dlm_callback_resume(struct dlm_ls *ls)
 	mutex_lock(&ls->ls_cb_mutex);
 	list_for_each_entry_safe(lkb, safe, &ls->ls_cb_delay, lkb_cb_list) {
 		list_del_init(&lkb->lkb_cb_list);
-		queue_work(ls->ls_callback_wq, &lkb->lkb_cb_work);
+		queued = queue_work(ls->ls_callback_wq, &lkb->lkb_cb_work);
+		WARN_ON_ONCE(!queued);
+
 		count++;
 		if (count == MAX_CB_QUEUE)
 			break;
 	}
 	empty = list_empty(&ls->ls_cb_delay);
+	if (empty)
+		clear_bit(LSFL_CB_DELAY, &ls->ls_flags);
 	mutex_unlock(&ls->ls_cb_mutex);
 
 	sum += count;
-- 
2.31.1


* Re: [PATCH dlm/next 3/3] fs: dlm: fix refcount handling for dlm_add_cb()
From: Alexander Aring @ 2022-07-21 22:08 UTC
  To: David Teigland; +Cc: cluster-devel, stable

Hi,

On Thu, Jul 21, 2022 at 5:53 PM Alexander Aring <aahringo@redhat.com> wrote:
>
> Each time dlm_add_cb() queues work or adds the lkb to the
> ls->ls_cb_delay list for queuing later, it increments a refcount.
> However, if the work is already queued or the lkb is already on the
> list, we need to revert that refcount increment. dlm_add_cb() can be
> called multiple times before the related dlm_callback_work() work
> function runs and drops the reference with a put call. This patch
> reverts the kref_get() when necessary, in the cases where the work or
> list entry is already queued.
>
> In case of dlm_callback_resume() we need to ensure that the
> LSFL_CB_DELAY bit is cleared only after all ls->ls_cb_delay lkbs are
> queued for work. As the ls->ls_cb_delay list exists to defer queuing
> work for later, a work item should never already be queued at that
> point; if it is, we emit a warning.
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Alexander Aring <aahringo@redhat.com>
> ---
>  fs/dlm/ast.c | 21 +++++++++++++++------
>  1 file changed, 15 insertions(+), 6 deletions(-)
>
> diff --git a/fs/dlm/ast.c b/fs/dlm/ast.c
> index 0271796d36b1..68e09ed8234e 100644
> --- a/fs/dlm/ast.c
> +++ b/fs/dlm/ast.c
> @@ -177,6 +177,7 @@ void dlm_add_cb(struct dlm_lkb *lkb, uint32_t flags, int mode, int status,
>  {
>         struct dlm_ls *ls = lkb->lkb_resource->res_ls;
>         uint64_t new_seq, prev_seq;
> +       bool queued = true;
>         int rv;
>
>         spin_lock(&dlm_cb_seq_spin);
> @@ -202,13 +203,19 @@ void dlm_add_cb(struct dlm_lkb *lkb, uint32_t flags, int mode, int status,
>
>                 mutex_lock(&ls->ls_cb_mutex);
>                 if (test_bit(LSFL_CB_DELAY, &ls->ls_flags)) {
> -                       if (list_empty(&lkb->lkb_cb_list))
> +                       if (list_empty(&lkb->lkb_cb_list)) {
>                                 list_add(&lkb->lkb_cb_list, &ls->ls_cb_delay);
> +                               queued = false;
> +                       }
>                 } else {
> -                       queue_work(ls->ls_callback_wq, &lkb->lkb_cb_work);
> +                       queued = !queue_work(ls->ls_callback_wq, &lkb->lkb_cb_work);
>                 }
>                 mutex_unlock(&ls->ls_cb_mutex);
> +
> +               if (queued)
> +                       dlm_put_lkb(lkb);
>         }
> +
>   out:
>         mutex_unlock(&lkb->lkb_cb_mutex);
>  }
> @@ -303,9 +310,7 @@ void dlm_callback_resume(struct dlm_ls *ls)
>  {
>         struct dlm_lkb *lkb, *safe;
>         int count = 0, sum = 0;
> -       bool empty;
> -
> -       clear_bit(LSFL_CB_DELAY, &ls->ls_flags);
> +       bool empty, queued;
>
>         if (!ls->ls_callback_wq)
>                 return;
> @@ -314,12 +319,16 @@ void dlm_callback_resume(struct dlm_ls *ls)
>         mutex_lock(&ls->ls_cb_mutex);
>         list_for_each_entry_safe(lkb, safe, &ls->ls_cb_delay, lkb_cb_list) {
>                 list_del_init(&lkb->lkb_cb_list);
> -               queue_work(ls->ls_callback_wq, &lkb->lkb_cb_work);
> +               queued = queue_work(ls->ls_callback_wq, &lkb->lkb_cb_work);
> +               WARN_ON_ONCE(!queued);

grml, that should be "queued = !queue_work(ls->ls_callback_wq,
&lkb->lkb_cb_work);" and then "WARN_ON_ONCE(queued);" to follow the
same convention as above in dlm_add_cb(), where queued being true means
the work was already queued.
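
That is, the hunk would then read:

	queued = !queue_work(ls->ls_callback_wq, &lkb->lkb_cb_work);
	WARN_ON_ONCE(queued);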

- Alex


* Re: [PATCH dlm/next 3/3] fs: dlm: fix refcount handling for dlm_add_cb()
From: Alexander Aring @ 2022-07-22  0:09 UTC
  To: David Teigland; +Cc: cluster-devel, stable

Hi,

On Thu, Jul 21, 2022 at 5:53 PM Alexander Aring <aahringo@redhat.com> wrote:
>
> Each time dlm_add_cb() queues work or adds the lkb to the
> ls->ls_cb_delay list for queuing later, it increments a refcount.
> However, if the work is already queued or the lkb is already on the
> list, we need to revert that refcount increment. dlm_add_cb() can be
> called multiple times before the related dlm_callback_work() work
> function runs and drops the reference with a put call. This patch
> reverts the kref_get() when necessary, in the cases where the work or
> list entry is already queued.
>
> In case of dlm_callback_resume() we need to ensure that the
> LSFL_CB_DELAY bit is cleared only after all ls->ls_cb_delay lkbs are
> queued for work. As the ls->ls_cb_delay list exists to defer queuing
> work for later, a work item should never already be queued at that
> point; if it is, we emit a warning.
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Alexander Aring <aahringo@redhat.com>
> ---
>  fs/dlm/ast.c | 21 +++++++++++++++------
>  1 file changed, 15 insertions(+), 6 deletions(-)
>
> diff --git a/fs/dlm/ast.c b/fs/dlm/ast.c
> index 0271796d36b1..68e09ed8234e 100644
> --- a/fs/dlm/ast.c
> +++ b/fs/dlm/ast.c
> @@ -177,6 +177,7 @@ void dlm_add_cb(struct dlm_lkb *lkb, uint32_t flags, int mode, int status,
>  {
>         struct dlm_ls *ls = lkb->lkb_resource->res_ls;
>         uint64_t new_seq, prev_seq;
> +       bool queued = true;
>         int rv;
>
>         spin_lock(&dlm_cb_seq_spin);
> @@ -202,13 +203,19 @@ void dlm_add_cb(struct dlm_lkb *lkb, uint32_t flags, int mode, int status,
>
>                 mutex_lock(&ls->ls_cb_mutex);
>                 if (test_bit(LSFL_CB_DELAY, &ls->ls_flags)) {
> -                       if (list_empty(&lkb->lkb_cb_list))
> +                       if (list_empty(&lkb->lkb_cb_list)) {
>                                 list_add(&lkb->lkb_cb_list, &ls->ls_cb_delay);
> +                               queued = false;
> +                       }
>                 } else {
> -                       queue_work(ls->ls_callback_wq, &lkb->lkb_cb_work);
> +                       queued = !queue_work(ls->ls_callback_wq, &lkb->lkb_cb_work);
>                 }
>                 mutex_unlock(&ls->ls_cb_mutex);
> +
> +               if (queued)
> +                       dlm_put_lkb(lkb);
>         }

I will drop this patch and 2/3; there is nothing wrong here, as the
code checks if (!prev_seq) beforehand, and when prev_seq is non-zero it
already behaves as if the lkb were queued for list_add() and
queue_work() (I believe)... However, personally I am not happy with how
this currently works, because the code smells like it forgot to take
care of whether the lkb is already queued or not. It only works because
of other lkb-specific handling that dlm_add_lkb_callback() and
dlm_callback_work() do regarding lkb->lkb_callbacks[0].seq - if
dlm_add_lkb_callback() ended in an "unlikely" error, nothing would be
queued anymore.
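
For reference, the guard I mean looks roughly like this in the current
dlm_add_cb() (simplified):

	prev_seq = lkb->lkb_callbacks[0].seq; /* non-zero: callbacks pending */
	...
	if (!prev_seq) {
		/* first pending callback: take the ref and queue once */
		kref_get(&lkb->lkb_ref);
		...
	}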

Patch 1/3 is still valid; the race it fixes could cause problems.

- Alex

