From: Justin Piszcz <jpiszcz@lucidpixels.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org,
	xfs@oss.sgi.com, Alan Piszcz <ap@solarrain.com>
Subject: Re: 2.6.31+2.6.31.4: XFS - All I/O locks up to D-state after 24-48 hours (sysrq-t+w available)
Date: Wed, 21 Oct 2009 06:19:54 -0400 (EDT)
Message-ID: <alpine.DEB.2.00.0910210618210.10288@p34.internal.lan>
In-Reply-To: <alpine.DEB.2.00.0910200431290.21878@p34.internal.lan>



On Tue, 20 Oct 2009, Justin Piszcz wrote:


>
>
> On Tue, 20 Oct 2009, Dave Chinner wrote:
>
>> On Mon, Oct 19, 2009 at 06:18:58AM -0400, Justin Piszcz wrote:
>>> On Mon, 19 Oct 2009, Dave Chinner wrote:
>>>> On Sun, Oct 18, 2009 at 04:17:42PM -0400, Justin Piszcz wrote:
>>>>> It has happened again, all sysrq-X output was saved this time.
>>>> .....
>>>> 
>>>> All pointing to log IO not completing.
>>>> 
>> ....
>>> So far I do not have a reproducible test case,
>> 
>> Ok. What sort of load is being placed on the machine?
> Hello, generally the load is low, it mainly serves out some samba shares.
>
>> 
>> It appears that both the xfslogd and the xfsdatad on CPU 0 are in
>> the running state but don't appear to be consuming any significant
>> CPU time. If they remain like this then I think that means they are
>> stuck waiting on the run queue.  Do these XFS threads always appear
>> like this when the hang occurs? If so, is there something else that
>> is hogging CPU 0 preventing these threads from getting the CPU?
> Yes, the XFS threads show up like this each time the kernel crashed.  So far
> with 2.6.30.9, after ~48hrs+, it has not crashed.  So the problem appears to
> have been introduced between 2.6.30.9 and 2.6.31.x.  Any recommendations
> on how to catch this bug w/certain options enabled/etc?
>
>
>> 
>> Cheers,
>> 
>> Dave.
>> -- 
>> Dave Chinner
>> david@fromorbit.com
>> 
>

Uptime with 2.6.30.9:

  06:18:41 up 2 days, 14:10, 14 users,  load average: 0.41, 0.21, 0.07

No issues yet, so the problem first appeared in 2.6.31.x.

Any further recommendations on how to debug this issue?  BTW: Do you view this
as an XFS bug or an MD/VFS-layer issue, based on the logs/output thus far?

Justin.

