All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] xfs: lock for removing item from cil->xc_cil in xlog_cil_push
       [not found] <929991500.42327290.1412926226461.JavaMail.zimbra@redhat.com>
@ 2014-10-10  7:31 ` Xu Wang
  2014-10-10 12:07   ` Brian Foster
  0 siblings, 1 reply; 3+ messages in thread
From: Xu Wang @ 2014-10-10  7:31 UTC (permalink / raw)
  To: xfs

From 59c6babab7c4ce8708036018143d2acc1477cc7f Mon Sep 17 00:00:00 2001
From: George Wang <xuw@redhat.com>
Date: Fri, 10 Oct 2014 15:02:14 +0800
Subject: [PATCH] [PATCH] xfs: lock for removing item from cil->xc_cil in
 xlog_cil_push

There is a race condition when xlog_cil_force_lsn and xlog_cil_push for
cil->xc_cil items. When function xlog_cil_push_now called in
xlog_cil_force_lsn, and the xlog_cil_push run in workqueue just reaches the point for
setting cil->xc_current_sequence, the xlog_cil_force_lsn got the lock
and check the "sequence == cil->xc_current_sequence &&
!list_empty(&cil->xc_cil)" for restart. The statement "sequence ==
cil->xc_current_sequence" is true, but the cil->xc_cil is empty
according to the xlog_cil_push removed the items in cil->xc_cil without
lock protectionl. So the function xlog_cil_force_lsn will return
NULLCOMMITLSN, which means the cil log for current sequence will not be
submitted to disk. And if there is no more operations(for example, when
unmount fs, this situation happened in
xfs_unmountfs->xfs_log_sbcount->xfs_trans_commit), the xfs will hang up.

Signed-off-by: George Wang <xuw@redhat.com>
---
 fs/xfs/xfs_log_cil.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
index f6b79e5..97ba527 100644
--- a/fs/xfs/xfs_log_cil.c
+++ b/fs/xfs/xfs_log_cil.c
@@ -478,6 +478,8 @@ xlog_cil_push(
     */
    lv = NULL;
    num_iovecs = 0;
+
+   spin_lock(&cil->xc_push_lock);
    while (!list_empty(&cil->xc_cil)) {
        struct xfs_log_item *item;

@@ -530,7 +532,6 @@ xlog_cil_push(
     * against the current sequence in log forces without risking
     * deferencing a freed context pointer.
     */
-   spin_lock(&cil->xc_push_lock);
    cil->xc_current_sequence = new_ctx->sequence;
    list_add(&ctx->committing, &cil->xc_committing);
    spin_unlock(&cil->xc_push_lock);
-- 
1.9.3


-- 
George Wang 王旭

Kernel Quantity Engineer
Red Hat Software (Beijing) Co.,Ltd
IRC:xuw
Tel:+86-010-62608041
Phone:15901231579
9/F, Tower C, Raycom

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] xfs: lock for removing item from cil->xc_cil in xlog_cil_push
  2014-10-10  7:31 ` [PATCH] xfs: lock for removing item from cil->xc_cil in xlog_cil_push Xu Wang
@ 2014-10-10 12:07   ` Brian Foster
  2014-10-11  4:19     ` Eric Sandeen
  0 siblings, 1 reply; 3+ messages in thread
From: Brian Foster @ 2014-10-10 12:07 UTC (permalink / raw)
  To: Xu Wang; +Cc: xfs

On Fri, Oct 10, 2014 at 03:31:46AM -0400, Xu Wang wrote:
> From 59c6babab7c4ce8708036018143d2acc1477cc7f Mon Sep 17 00:00:00 2001
> From: George Wang <xuw@redhat.com>
> Date: Fri, 10 Oct 2014 15:02:14 +0800
> Subject: [PATCH] [PATCH] xfs: lock for removing item from cil->xc_cil in
>  xlog_cil_push
> 
> There is a race condition when xlog_cil_force_lsn and xlog_cil_push for
> cil->xc_cil items. When function xlog_cil_push_now called in
> xlog_cil_force_lsn, and the xlog_cil_push run in workqueue just reaches the point for
> setting cil->xc_current_sequence, the xlog_cil_force_lsn got the lock
> and check the "sequence == cil->xc_current_sequence &&
> !list_empty(&cil->xc_cil)" for restart. The statement "sequence ==
> cil->xc_current_sequence" is true, but the cil->xc_cil is empty
> according to the xlog_cil_push removed the items in cil->xc_cil without
> lock protectionl. So the function xlog_cil_force_lsn will return
> NULLCOMMITLSN, which means the cil log for current sequence will not be
> submitted to disk. And if there is no more operations(for example, when
> unmount fs, this situation happened in
> xfs_unmountfs->xfs_log_sbcount->xfs_trans_commit), the xfs will hang up.
> 
> Signed-off-by: George Wang <xuw@redhat.com>
> ---

Isn't this fixed by the following commit?

8af3dcd3 xfs: xlog_cil_force_lsn doesn't always wait correctly

With that patch, xlog_cil_push() adds the ctx to the committing list
before draining the ctx and updating the current sequence number.
xlog_cil_force_lsn() walks the committing list and checks the sequence
number and ctx list, all under the push lock.

That means that xlog_cil_force_lsn() should see the ctx either on the
committing list and wait for it (commit_lsn != NULL), or not on the
committing list at all. If it's not on the committing list yet,
xlog_cil_push() won't have either drained the ctx or updated the
sequence number, as it adds to the committing list first and that
requires the push lock (which xlog_cil_force_lsn() holds and doesn't
release until after the restart check).

Am I missing something? Do you reproduce a problem with the latest tree
that includes the above patch?

Brian

>  fs/xfs/xfs_log_cil.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
> index f6b79e5..97ba527 100644
> --- a/fs/xfs/xfs_log_cil.c
> +++ b/fs/xfs/xfs_log_cil.c
> @@ -478,6 +478,8 @@ xlog_cil_push(
>      */
>     lv = NULL;
>     num_iovecs = 0;
> +
> +   spin_lock(&cil->xc_push_lock);
>     while (!list_empty(&cil->xc_cil)) {
>         struct xfs_log_item *item;
> 
> @@ -530,7 +532,6 @@ xlog_cil_push(
>      * against the current sequence in log forces without risking
>      * deferencing a freed context pointer.
>      */
> -   spin_lock(&cil->xc_push_lock);
>     cil->xc_current_sequence = new_ctx->sequence;
>     list_add(&ctx->committing, &cil->xc_committing);
>     spin_unlock(&cil->xc_push_lock);
> -- 
> 1.9.3
> 
> 
> -- 
> George Wang 王旭
> 
> Kernel Quantity Engineer
> Red Hat Software (Beijing) Co.,Ltd
> IRC:xuw
> Tel:+86-010-62608041
> Phone:15901231579
> 9/F, Tower C, Raycom
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] xfs: lock for removing item from cil->xc_cil in xlog_cil_push
  2014-10-10 12:07   ` Brian Foster
@ 2014-10-11  4:19     ` Eric Sandeen
  0 siblings, 0 replies; 3+ messages in thread
From: Eric Sandeen @ 2014-10-11  4:19 UTC (permalink / raw)
  To: Brian Foster, Xu Wang; +Cc: xfs

On 10/10/14 7:07 AM, Brian Foster wrote:
 
> Isn't this fixed by the following commit?
> 
> 8af3dcd3 xfs: xlog_cil_force_lsn doesn't always wait correctly
> 
> With that patch, xlog_cil_push() adds the ctx to the committing list
> before draining the ctx and updating the current sequence number.
> xlog_cil_force_lsn() walks the committing list and checks the sequence
> number and ctx list, all under the push lock.
> 
> That means that xlog_cil_force_lsn() should see the ctx either on the
> committing list and wait for it (commit_lsn != NULL), or not on the
> committing list at all. If it's not on the committing list yet,
> xlog_cil_push() won't have either drained the ctx or updated the
> sequence number, as it adds to the committing list first and that
> requires the push lock (which xlog_cil_force_lsn() holds and doesn't
> release until after the restart check).
> 
> Am I missing something? Do you reproduce a problem with the latest tree
> that includes the above patch?

We clarified that Dave's patch does fix the problem; this was a Red Hat
bug, and I hadn't put the upstream commit into it yet, sorry about that.

George, thanks for working on it!

-Eric

> Brian

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2014-10-11  4:19 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <929991500.42327290.1412926226461.JavaMail.zimbra@redhat.com>
2014-10-10  7:31 ` [PATCH] xfs: lock for removing item from cil->xc_cil in xlog_cil_push Xu Wang
2014-10-10 12:07   ` Brian Foster
2014-10-11  4:19     ` Eric Sandeen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.