From: Austin Schuh <austin@peloton-tech.com>
To: Dave Chinner <david@fromorbit.com>
Cc: xfs <xfs@oss.sgi.com>
Subject: Re: XFS crash?
Date: Mon, 23 Jun 2014 13:05:54 -0700
Message-ID: <CANGgnMZ8OwzfBj5m9H7c6q2yahGhU7oFZLsJfVxnWoqZExkZmQ@mail.gmail.com>
In-Reply-To: <CANGgnMZqQc_NeaDpO_aX+bndmHrQ9VWo9mkfxhPBkRD-J=N6sQ@mail.gmail.com>


Hi Dave,

On Tue, May 13, 2014 at 10:11 AM, Austin Schuh <austin@peloton-tech.com>
wrote:

> On Tue, May 13, 2014 at 2:03 AM, Dave Chinner <david@fromorbit.com> wrote:
> > On Tue, May 13, 2014 at 12:02:18AM -0700, Austin Schuh wrote:
> >> On Mon, May 12, 2014 at 11:39 PM, Dave Chinner <david@fromorbit.com>
> wrote:
> >> > On Mon, May 12, 2014 at 09:03:48PM -0700, Austin Schuh wrote:
> >> >> On Mon, May 12, 2014 at 8:46 PM, Dave Chinner <david@fromorbit.com>
> wrote:
> >> >> > On Mon, May 12, 2014 at 06:29:28PM -0700, Austin Schuh wrote:
> >> >> >> On Wed, Mar 5, 2014 at 4:53 PM, Austin Schuh <
> austin@peloton-tech.com> wrote:
> >> >> >> > Hi Dave,
> >> >> >> >
> >> >> >> > On Wed, Mar 5, 2014 at 3:35 PM, Dave Chinner <
> david@fromorbit.com> wrote:
> >> >> >> >> On Wed, Mar 05, 2014 at 03:08:16PM -0800, Austin Schuh wrote:
> >> >> >> >>> Howdy,
> >> >> >> >>>
> >> >> >> >>> I'm running a config_preempt_rt patched version of the
> 3.10.11 kernel,
> >> >> >> >>> and I'm seeing a couple lockups and crashes which I think are
> related
> >> >> >> >>> to XFS.
> >> >> >> >>
> >> >> >> >> I think they are more likely related to RT issues....
> >> >> >> >>
> >> >> >> >
> >> >> >> > That very well may be true.
> >> >> >> >
> >> >> >> >> Cheers,
> >> >> >> >>
> >> >> >> >> Dave.
> >> >> >> >> --
> >> >> >> >> Dave Chinner
> >> >> >>
> >> >> >> I had the issue reproduce itself today with just the main SSD
> >> >> >> installed.  This was on a new machine that was built this morning.
> >> >> >> There is a lot less going on in this trace than the previous one.
> >> >> >
> >> >> > The three blocked threads:
> >> >> >
> >> >> >         1. kworker running IO completion waiting on an inode lock,
> >> >> >            holding locked pages.
> >> >> >         2. kworker running writeback flusher work waiting for a
> page lock
> >> >> >         3. direct flush work waiting for allocation, holding page
> >> >> >            locks and the inode lock.
> >> >> >
> >> >> > What's the kworker thread running the allocation work doing?
> >> >> >
> >> >> > You might need to run `echo w > /proc/sysrq-trigger` to get this
> >> >> > information...
> >> >>
> >> >> I was able to reproduce the lockup.  I ran `echo w >
> >> >> /proc/sysrq-trigger` per your suggestion.  I don't know how to figure
> >> >> out what the kworker thread is doing, but I'll happily do it if you
> >> >> can give me some guidance.
> >> >
> >> > There isn't a worker thread blocked doing an allocation in that
> >> > dump, so it doesn't shed any light on the problem at all. try
> >> > `echo l > /proc/sysrq-trigger`, followed by `echo t >
> >> > /proc/sysrq-trigger` so we can see all the processes running on CPUs
> >> > and all the processes in the system...
> >> >
> >> > Cheers,
> >> >
> >> > Dave.
> >>
> >> Attached is the output of the two commands you asked for.
> >
> > Nothing there. There's lots of processes waiting for allocation to
> > run, and no kworkers running allocation work. This looks more
> > like a rt-kernel workqueue issue, not an XFS problem.
> >
> > FWIW, it would be really helpful if you compiled your kernels with
> > frame pointers enabled - the stack traces are much more precise and
> > readable (i.e. gets rid of all the false/stale entries) and that
> > helps understanding where things are stuck immensely.
> >
> > Cheers,
> >
> > Dave.
>
> Thanks Dave.
>
> I'll go check with the rt-kernel guys and take it from there.  Thanks
> for the frame pointers suggestion.  I'll make that change the next
> time I build a kernel.
>
> Austin
>

I found one bug in XFS, which I fixed, and I've uncovered something else that
I'm not completely sure how to fix.

In xfs_bmapi_allocate, you create a completion, and use that to wait until
the work has finished.  Occasionally, I'm seeing a case where I get a
context switch after complete() has been called, but before the
workqueue has finished its internal bookkeeping.  This results in
the work being deleted before the workqueue is done using it, corrupting
the internal data structures.  I fixed it by waiting using flush_work and
removing the completion entirely.

--- a/fs/xfs/xfs_bmap_util.c        2014-06-23 12:59:10.008678410 -0700
+++ b/fs/xfs/xfs_bmap_util.c      2014-06-23 12:59:14.116678239 -0700
@@ -263,7 +263,6 @@
        current_set_flags_nested(&pflags, PF_FSTRANS);

        args->result = __xfs_bmapi_allocate(args);
-       complete(args->done);

        current_restore_flags_nested(&pflags, PF_FSTRANS);
 }
@@ -277,16 +276,13 @@
 xfs_bmapi_allocate(
        struct xfs_bmalloca     *args)
 {
-       DECLARE_COMPLETION_ONSTACK(done);
-
        if (!args->stack_switch)
                return __xfs_bmapi_allocate(args);


-       args->done = &done;
        INIT_WORK_ONSTACK(&args->work, xfs_bmapi_allocate_worker);
        queue_work(xfs_alloc_wq, &args->work);
-       wait_for_completion(&done);
+       flush_work(&args->work);
        destroy_work_on_stack(&args->work);
        return args->result;
 }
--- a/fs/xfs/xfs_bmap_util.h        2014-06-23 12:59:10.008678410 -0700
+++ b/fs/xfs/xfs_bmap_util.h      2014-06-23 12:59:11.708678340 -0700
@@ -57,7 +57,6 @@
        char                    conv;   /* overwriting unwritten extents */
        char                    stack_switch;
        int                     flags;
-       struct completion       *done;
        struct work_struct      work;
        int                     result;
 };
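
Stated generically, the pattern I ended up with looks like the sketch
below.  The names here are made up (my_args, my_worker, do_the_allocation,
my_wq); this is only meant to show why I think flush_work() is the right
way to wait on an on-stack work item, not the actual XFS code:

#include <linux/kernel.h>
#include <linux/workqueue.h>

/* Hypothetical names throughout; not the actual XFS code. */
struct my_args {
	struct work_struct	work;
	int			result;
};

static struct workqueue_struct	*my_wq;	/* assume alloc_workqueue() elsewhere */
static int do_the_allocation(struct my_args *args);	/* stand-in for the real worker body */

static void my_worker(struct work_struct *work)
{
	struct my_args *args = container_of(work, struct my_args, work);

	args->result = do_the_allocation(args);
	/*
	 * No complete() here.  The waiter must not be able to reuse or
	 * destroy the on-stack work item until the workqueue code is
	 * entirely finished with it; flush_work() in the caller is what
	 * provides that ordering.
	 */
}

static int my_alloc(struct my_args *args)
{
	INIT_WORK_ONSTACK(&args->work, my_worker);
	queue_work(my_wq, &args->work);
	flush_work(&args->work);	/* returns once the work item is idle */
	destroy_work_on_stack(&args->work);
	return args->result;
}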

I enabled event tracing (and added a new event which lists the number of
workers running in a queue whenever that is changed).
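
For the stock tracepoints, this is roughly what I did (from memory, and
assuming tracefs is mounted at /sys/kernel/debug); the workqueue_nr_running
event is my own addition and isn't shown here:

# enable the standard workqueue and xfs inode-lock tracepoints
echo 1 > /sys/kernel/debug/tracing/events/workqueue/enable
echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_ilock/enable
echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_ilock_nowait/enable
echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_iunlock/enable
cat /sys/kernel/debug/tracing/trace > trace.txt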

To me, it looks like work is queued from irq/44-ahci-273 that will
acquire an inode lock.  scp-3986 then acquires that lock and queues more
work.  That second work item ends up queued behind the work from
irq/44-ahci-273 in the same pool.  The first work item blocks waiting on
the lock, and scp-3986 won't finish and release that lock until the second
work item runs, so the whole thing deadlocks.

Any ideas on how to deal with this?  I think we need a separate pool to
make sure that xfs_bmapi_allocate_worker runs in its own thread and can't
get stuck behind the completion work.  A rough sketch of what I mean is
below the trace.

     irq/44-ahci-273   [000] ....1.5    76.340300: workqueue_queue_work: work struct=ffff880406a6b998 function=xfs_end_io workqueue=ffff88040af82000 pool=ffff88042da63fc0 req_cpu=512 cpu=0
     irq/44-ahci-273   [000] ....1.5    76.340301: workqueue_activate_work: work struct ffff880406a6b998

             scp-3986  [000] ....1..    76.342711: xfs_ilock_nowait: dev 8:5 ino 0x9794 lock_addr ffff880409fe0090 flags ILOCK_SHARED caller xfs_map_blocks
             scp-3986  [000] ....1..    76.342714: xfs_iunlock: dev 8:5 ino 0x9794 lock_addr ffff880409fe0090 flags ILOCK_SHARED caller xfs_map_blocks
             scp-3986  [000] ....1..    76.342722: xfs_ilock: dev 8:5 ino 0x9794 lock_addr ffff880409fe0090 flags ILOCK_EXCL caller xfs_iomap_write_allocate
             scp-3986  [000] ....1.3    76.342729: workqueue_queue_work: work struct=ffff8800a5b8d900 function=xfs_bmapi_allocate_worker workqueue=ffff88040bb36000 pool=ffff88042da63fc0 req_cpu=512 cpu=0
             scp-3986  [000] ....1.3    76.342730: workqueue_activate_work: work struct ffff8800a5b8d900
             scp-3986  [000] ....1.4    76.342754: workqueue_queue_work: work struct=ffff88040a6a01c8 function=blk_delay_work workqueue=ffff88040c9f4a00 pool=ffff88042da644c0 req_cpu=512 cpu=0
             scp-3986  [000] ....1.4    76.342755: workqueue_activate_work: work struct ffff88040a6a01c8

     kworker/0:2-757   [000] ....1.1    76.342868: workqueue_nr_running: pool=ffff88042da63fc0 nr_running=1
     kworker/0:2-757   [000] ....1..    76.342869: workqueue_execute_start: work struct ffff880406a6b998: function xfs_end_io
     kworker/0:2-757   [000] ....1..    76.342870: xfs_ilock: dev 8:5 ino 0x9794 lock_addr ffff880409fe0090 flags ILOCK_EXCL caller xfs_setfilesize

     irq/44-ahci-273   [003] ....1.5    76.419037: workqueue_queue_work: work struct=ffff8800aebc3588 function=xfs_end_io workqueue=ffff88040af82000 pool=ffff88042dbe3fc0 req_cpu=512 cpu=3
     irq/44-ahci-273   [003] ....1.5    76.419038: workqueue_activate_work: work struct ffff8800aebc3588

     kworker/3:1-119   [003] ....1.1    76.419105: workqueue_nr_running: pool=ffff88042dbe3fc0 nr_running=1
     kworker/3:1-119   [003] ....1..    76.419106: workqueue_execute_start: work struct ffff8800aebc3588: function xfs_end_io
     kworker/3:1-119   [003] ....1..    76.419107: xfs_ilock: dev 8:5 ino 0x9794 lock_addr ffff880409fe0090 flags ILOCK_EXCL caller xfs_setfilesize

     irq/44-ahci-273   [002] ....1.5    76.426120: workqueue_queue_work: work struct=ffff880406a6b930 function=xfs_end_io workqueue=ffff88040af82000 pool=ffff88042db63fc0 req_cpu=512 cpu=2
     irq/44-ahci-273   [002] .N..1.5    76.426127: workqueue_activate_work: work struct ffff880406a6b930
     kworker/2:1-72    [002] ....1.1    76.426242: workqueue_nr_running: pool=ffff88042db63fc0 nr_running=1
     kworker/2:1-72    [002] ....1..    76.426243: workqueue_execute_start: work struct ffff880406a6b930: function xfs_end_io
     kworker/2:1-72    [002] ....1..    76.426244: xfs_ilock: dev 8:5 ino 0x9794 lock_addr ffff880409fe0090 flags ILOCK_EXCL caller xfs_setfilesize
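
To make the idea concrete, here is the kind of change I'm picturing.  It
is only a sketch, and the flag choice is a guess on my part (WQ_HIGHPRI
should move the allocation work into the separate per-cpu highpri worker
pool); whether that's the right approach is exactly what I'm asking:

/*
 * Sketch only, wherever xfs_alloc_wq gets created: run the allocation
 * work from the highpri worker pool so it can't end up queued behind
 * the xfs_end_io completion work in the normal per-cpu pool.
 */
xfs_alloc_wq = alloc_workqueue("xfsalloc", WQ_MEM_RECLAIM | WQ_HIGHPRI, 0);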

Thanks,
  Austin

