From: Jeffrey Hundstad <jeffrey.hundstad@mnsu.edu>
To: Jakob Oestergaard <jakob@unthought.net>
Cc: Christoph Hellwig <hch@infradead.org>,
David Greaves <david@dgreaves.com>, Jan Kasprzak <kas@fi.muni.cz>,
linux-kernel@vger.kernel.org, kruty@fi.muni.cz
Subject: journaled filesystems -- known instability; Was: XFS: inode with st_mode == 0
Date: Mon, 17 Jan 2005 15:31:59 -0600
Message-ID: <41EC2ECF.6010701@mnsu.edu>
In-Reply-To: <20050117100746.GI347@unthought.net>
For more of this, look up the subjects:
    Bad things happening to journaled filesystem machines
    Oops in kjournald
and posts from the author:
    Anders Saaby
I also can't keep a recent 2.6 or 2.6*-ac* kernel up for more than a few
hours on a machine under real load. Perhaps those of us with the problem
need to talk to the powers that be and come up with a strategy for making
a report they can use. My guess is that what we're sending now isn't
usable.
--
jeffrey hundstad
Jakob Oestergaard wrote:
>On Sun, Jan 16, 2005 at 01:51:12PM +0000, Christoph Hellwig wrote:
>
>
>>On Fri, Jan 14, 2005 at 07:23:09PM +0100, Jakob Oestergaard wrote:
>>
>>
>>>So apart from the general well known instability problems that will
>>>occur when you actually start *using* the system, there should be no
>>>
>>>
>>What known instabilities?
>>
>>
>
>Where should I begin? ;)
>
>Most of the following have already been posted to LKML - primarily by
>Anders (as@cohaesio.com) - it seems that no one cares, but I'll repost a
>summary that Anders sent me below:
>
>-------
>Scenario 1: Mailservers:
> 2.6.10 (~24-40 hours uptime):
> Running ext3 on mailqueue:
>
><SNIP>
>Unable to handle kernel NULL pointer dereference at virtual address 00000004
>printing eip:
>c018a095
>*pde = 00000000
>Oops: 0002 [#1]
>SMP
>Modules linked in: nfs e1000 iptable_nat ipt_connlimit rtc
>CPU: 2
>EIP: 0060:[<c018a095>] Not tainted
>EFLAGS: 00010286 (2.6.8.1)
>EIP is at journal_commit_transaction+0x535/0x10e5
>eax: cac1e26c ebx: 00000000 ecx: f7cec400 edx: f7cec400
>esi: f65f3000 edi: cac1e26c ebp: f65f3000 esp: f65f3dc0
>ds: 007b es: 007b ss: 0068
>Process kjournald (pid: 174, threadinfo=f65f3000 task=c2308b70)
>Stack: f65f3e64 00000000 00000000 00000000 00000000 00000000 f7cec400 cda565fc
> 0000149a 00000004 f65f3e48 c01132d8 00000002 c202ad20 00000001 f65f3e5c
> c202ad20 c202ad20 00000002 00000001 0000001e 01c1af60 f65f3e68 c0407dc0
>Call Trace:
> [<c01132d8>] scheduler_tick+0x468/0x470
> [<c01127b5>] find_busiest_group+0x105/0x310
> [<c011db8e>] del_timer_sync+0x7e/0xa0
> [<c018cd4d>] kjournald+0xbd/0x230
> [<c0114b10>] autoremove_wake_function+0x0/0x40
> [<c0114b10>] autoremove_wake_function+0x0/0x40
> [<c0103f16>] ret_from_fork+0x6/0x14
> [<c018cc70>] commit_timeout+0x0/0x10
> [<c018cc90>] kjournald+0x0/0x230
> [<c01024bd>] kernel_thread_helper+0x5/0x18
>Code: f0 ff 43 04 8b 03 83 e0 04 74 4c 8b 8c 24 b8 01 00 00 c6 81
> <2>SoftDog: Initiating system reboot
></SNIP>
>
>-------
>Scenario 2: Mailservers:
> Running XFS on mailqueue:
>
><SNIP>
>Filesystem "sdb1": xfs_trans_delete_ail: attempting to delete a log item that
>is not in the AIL
>xfs_force_shutdown(sdb1,0x8) called from line 382 of file
>fs/xfs/xfs_trans_ail.c. Return address = 0xc0216a56
>@Linux version 2.6.9 (root@mail1.domain.tld) (gcc version 2.96 20000731 (Red
>Hat Linux 7.3 2.96-113)) #1 SMP Tue Oct 19 16:04:55 CEST 2004
></SNIP>
>
>
>=======
>Resolution to the mailserver problem:
> 2.4.28 is perfectly stable on these machines.
>
>-------
>Scenario 3: Webservers:
>
> 2.6.10, 2.6.10-ac8 (~3-12 hours uptime):
>
> <SNIP>
> Unable to handle kernel paging request
> <2>SoftDog: Initiating system reboot.
> <SNIP>
> (No more...) :(
>
>=======
>Resolution to the webserver problem:
> 2.4.28/2.4.29-rc2 are stable here
>
>-------
>Scenario 4: Storageservers:
> 2.6.8.1:
> Oopses after ~5-10 hours with SMP on. Cannot find the actual Oopses
>anymore, and 2.6.8+ haven't been tested as we cannot afford any more downtime on
>these servers.
>
>
>=======
>Resolution to the storage server problem:
> 2.6.8.1 UP is stable (but oopses regularly after memory allocation
> failures)
>
>
>
>Hardware on all servers: IBM x335 and x345.
>
>Mentioned errors seen on a total of 17 servers.
>
>
>
Thread overview: 29+ messages
2004-12-09 12:59 XFS: inode with st_mode == 0 Jan Kasprzak
2004-12-09 13:53 ` Jakob Oestergaard
2004-12-09 14:07 ` Jan Kasprzak
2004-12-09 21:54 ` Christoph Hellwig
2004-12-14 23:40 ` Jakob Oestergaard
2004-12-21 18:43 ` Jan Kasprzak
2004-12-22 8:41 ` Jakob Oestergaard
2004-12-22 18:23 ` Christoph Hellwig
2004-12-23 15:01 ` Jakob Oestergaard
2005-01-04 8:48 ` Jakob Oestergaard
2005-01-05 11:34 ` Christoph Hellwig
2005-01-14 18:14 ` David Greaves
2005-01-14 18:23 ` Jakob Oestergaard
2005-01-15 2:09 ` Nathan Scott
2005-01-17 0:53 ` Jakob Oestergaard
2005-01-16 13:51 ` Christoph Hellwig
2005-01-17 10:07 ` Jakob Oestergaard
2005-01-17 11:55 ` Jan-Frode Myklebust
2005-01-17 13:48 ` Anders Saaby
2005-01-17 21:31 ` Jeffrey Hundstad [this message]
2005-01-17 20:54 ` journaled filesystems -- known instability; Was: " Alan Cox
2005-01-20 22:30 ` Jeffrey E. Hundstad
2005-01-25 12:47 ` Stephen C. Tweedie
2005-01-25 15:09 ` Jeffrey Hundstad
2005-01-25 15:37 ` Stephen C. Tweedie
2005-01-28 20:15 ` Jeffrey E. Hundstad
2005-01-28 21:00 ` Stephen C. Tweedie
2005-01-28 21:06 ` Jeffrey E. Hundstad
2005-01-18 11:45 ` Jan Kasprzak