linux-kernel.vger.kernel.org archive mirror
* oops with dual xeon 2.8ghz  4gb ram +smp, software raid, lvm, and xfs
@ 2004-11-22 19:06 Phil Dier
  2004-11-23  0:17 ` Andrew Morton
  0 siblings, 1 reply; 28+ messages in thread
From: Phil Dier @ 2004-11-22 19:06 UTC (permalink / raw)
  To: Kernel Mailing List

Hi,

I'm setting up a storage array with Linux, software RAID, LVM, and XFS,
but I keep getting oopses during heavy I/O. I've been able to reproduce
this with 2.6.6, 2.6.8.1, 2.6.9, and 2.6.10-rc2-bk4. I have dual Xeon
2.8s with 4 GB of RAM. I'm using Adaptec and Fusion MPT SCSI devices
(more details at the link below). Connected are 2 Ultra160 SCSI
JBODs w/ 2 disks apiece. I'm using RAID 10 (or should it be 01?) mirrored
stripes.

Due to its size, I've posted my debug info at this location (I've included
output from all of the above kernels):

<http://www.icglink.com/cluster-debug-info.html> (~235kb)

Please let me know if I've left anything out that would help in locating
the source of the problem.  I'm very willing to try out any patches/config
changes.

please cc me on any replies, as I am not subscribed to the list...

Thanks,

--

Phil Dier (ICGLink.com -- 615 370-1530 x733)

/* vim:set noai nocindent ts=8 sw=8: */


* Re: oops with dual xeon 2.8ghz  4gb ram +smp, software raid, lvm, and xfs
  2004-11-22 19:06 oops with dual xeon 2.8ghz 4gb ram +smp, software raid, lvm, and xfs Phil Dier
@ 2004-11-23  0:17 ` Andrew Morton
  2004-11-23 15:37   ` Phil Dier
                     ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Andrew Morton @ 2004-11-23  0:17 UTC (permalink / raw)
  To: Phil Dier; +Cc: linux-kernel

Phil Dier <phil@dier.us> wrote:
>
> I'm setting up a storage array with Linux, software RAID, LVM, and XFS,
> but I keep getting oopses during heavy I/O. I've been able to reproduce
> this with 2.6.6, 2.6.8.1, 2.6.9, and 2.6.10-rc2-bk4. I have dual Xeon
> 2.8s with 4 GB of RAM. I'm using Adaptec and Fusion MPT SCSI devices
> (more details at the link below). Connected are 2 Ultra160 SCSI
> JBODs w/ 2 disks apiece. I'm using RAID 10 (or should it be 01?) mirrored
> stripes.
> 
> Due to its size, I've posted my debug info at this location (I've included
> output from all of the above kernels):
> 
> <http://www.icglink.com/cluster-debug-info.html> (~235kb)

yow.  The dread combination of XFS, LVM, software RAID and bloaty scsi
drivers.  Looks like a stack overrun.

Can you rebuild the kernel with CONFIG_4KSTACKS=n?



* Re: oops with dual xeon 2.8ghz  4gb ram +smp, software raid, lvm, and xfs
  2004-11-23  0:17 ` Andrew Morton
@ 2004-11-23 15:37   ` Phil Dier
  2004-11-23 17:02     ` Jakob Oestergaard
  2004-11-24 15:45   ` Phil Dier
  2004-11-24 23:12   ` Neil Brown
  2 siblings, 1 reply; 28+ messages in thread
From: Phil Dier @ 2004-11-23 15:37 UTC (permalink / raw)
  To: linux-kernel; +Cc: Scott Holdren, ziggy, Jack Massari

On Mon, 22 Nov 2004 16:17:25 -0800
Andrew Morton <akpm@osdl.org> wrote:

> yow. The dread combination of XFS, LVM, software RAID and bloaty scsi
> drivers. Looks like a stack overrun.
>
> Can you rebuild the kernel with CONFIG_4KSTACKS=n?
>

Thanks for the suggestion.. I'm doing a burn-in right now with 8k
stacks, and so far, so good.

I'm building this system with stability and flexibility foremost in
mind. Am I foolish in using all of these technologies with a new-ish
version of 2.6? Is there a particular version that would be better
suited for my application? Any other suggestions you (or anyone else
on the list) could give regarding stability would be greatly appreciated.

Thanks,

--

Phil Dier (ICGLink.com -- 615 370-1530 x733)

/* vim:set noai nocindent ts=8 sw=8: */


* Re: oops with dual xeon 2.8ghz  4gb ram +smp, software raid, lvm, and xfs
  2004-11-23 15:37   ` Phil Dier
@ 2004-11-23 17:02     ` Jakob Oestergaard
  2004-11-23 18:29       ` Phil Dier
  2004-11-23 22:39       ` Christoph Hellwig
  0 siblings, 2 replies; 28+ messages in thread
From: Jakob Oestergaard @ 2004-11-23 17:02 UTC (permalink / raw)
  To: Phil Dier; +Cc: linux-kernel, Scott Holdren, ziggy, Jack Massari

On Tue, Nov 23, 2004 at 09:37:44AM -0600, Phil Dier wrote:
...
> I'm building this system with stability and flexibility foremost in
> mind. Am I foolish in using all of these technologies with a new-ish
> version of 2.6? Is there a particular version that would be better
> suited for my application? Any other suggestions you (or anyone else
> on the list) could give regarding stability would be greatly appreciated.

If you'll be exporting via NFS, it seems that there are still problems
with XFS+NFS.

With SMP, what I see is that sometimes a directory might decide that
it's a file - but I can't delete it, because it isn't 'empty' (it's
still somehow a directory).  Waiting a day or two, the system will
change its mind back to letting the directory be a directory. Sometimes
modes will be fscked up as well - a regular file can change owner, or it
can change modes from '-rw-rw---' to '?---------'.    Weird stuff, no
way to reproduce it reliably.

With UP, I know someone who's seeing stale handles reported by the NFS
server. The only known workaround is to stat the directories in question
on the *server* side - a little bash with 'while true; do sleep 5; ls -l
/directory; done' will do the trick.

All of what I describe here are production environments - so it sucks to
have these kinds of problems.  Some of it can be reproduced (the stale
handle errors), and some of it can't.

I guess the good news would be that I don't know of any problems with
XFS+LVM+MD if you do not export the FS via NFS    :)

That is, if you run 2.6.9.  Any earlier kernel will b0rk your XFS under
load.

-- 

 / jakob



* Re: oops with dual xeon 2.8ghz  4gb ram +smp, software raid, lvm, and xfs
  2004-11-23 17:02     ` Jakob Oestergaard
@ 2004-11-23 18:29       ` Phil Dier
  2004-11-23 22:39       ` Christoph Hellwig
  1 sibling, 0 replies; 28+ messages in thread
From: Phil Dier @ 2004-11-23 18:29 UTC (permalink / raw)
  To: linux-kernel; +Cc: scott, ziggy, webmaster

On Tue, 23 Nov 2004 18:02:23 +0100
Jakob Oestergaard <jakob@unthought.net> wrote:

> If you'll be exporting via NFS, it seems that there are still problems
> with XFS+NFS.
> 
> With SMP, what I see is that sometimes a directory might decide that
> it's a file - but I can't delete it, because it isn't 'empty' (it's
> still somehow a directory).  Waiting a day or two, the system will
> change its mind back to letting the directory be a directory. Sometimes
> modes will be fscked up as well - a regular file can change owner, or it
> can change modes from '-rw-rw---' to '?---------'.    Weird stuff, no
> way to reproduce it reliably.
> 
> With UP, I know someone who's seeing stale handles reported by the NFS
> server. The only known workaround is to stat the directories in question
> on the *server* side - a little bash with 'while true; do sleep 5; ls -l
> /directory; done' will do the trick.
> 
> All of what I describe here are production environments - so it sucks to
> have these kinds of problems.  Some of it can be reproduced (the stale
> handle errors), and some of it can't.
> 
> I guess the good news would be that I don't know of any problems with
> XFS+LVM+MD if you do not export the FS via NFS    :)
> 
> That is, if you run 2.6.9.  Any earlier kernel will b0rk your XFS under
> load.

Thanks for the tips, Jakob.

I *will* be exporting via NFS, so this is definitely good to know. I've
been looking at using jfs and reiser as well, but some preliminary
benchmarks suggested that xfs was the best performer for the kind of
workload that I'm anticipating. I guess xfs is out of the question now,
as I definitely don't want to deal with weird interactions like that.

Can anyone speak on the stability of (reiser|jfs|other) with nfs? My
biggest requirements are online resizing and stability (ext3 online
resize is still beta IIRC, but I wouldn't be opposed to using it if
someone could tell me otherwise); speed would be nice, but I'm willing
to sacrifice speed for the sake of reliability.

I'm personally using lvm + reiser + nfs without consequence on my
fileserver at home, but it's not seeing nearly the loads that this box
is going to see.

Thanks again,
--

Phil Dier (ICGLink.com -- 615 370-1530 x733)

/* vim:set noai nocindent ts=8 sw=8: */


* Re: oops with dual xeon 2.8ghz  4gb ram +smp, software raid, lvm, and xfs
  2004-11-23 17:02     ` Jakob Oestergaard
  2004-11-23 18:29       ` Phil Dier
@ 2004-11-23 22:39       ` Christoph Hellwig
  2004-11-23 22:56         ` Jakob Oestergaard
  2004-11-30 17:37         ` Phil Dier
  1 sibling, 2 replies; 28+ messages in thread
From: Christoph Hellwig @ 2004-11-23 22:39 UTC (permalink / raw)
  To: Jakob Oestergaard, Phil Dier, linux-kernel, Scott Holdren, ziggy,
	Jack Massari

On Tue, Nov 23, 2004 at 06:02:23PM +0100, Jakob Oestergaard wrote:
> With SMP, what I see is that sometimes a directory might decide that
> it's a file - but I can't delete it, because it isn't 'empty' (it's
> still somehow a directory).  Waiting a day or two, the system will
> change its mind back to letting the directory be a directory. Sometimes
> modes will be fscked up as well - a regular file can change owner, or it
> can change modes from '-rw-rw---' to '?---------'.    Weird stuff, no
> way to reproduce it reliably.

Actually I can reproduce it reliably by running nfs_fsstress.sh for a
looong time.  The problem is that in the current XFS code the inode
generation counter starts at 0, but higher level code uses that as
a wildcard for any possible generation, so you may get a newly created
file for a stale NFS file handle of a deleted file with the same inode
number.

The patch below fixes it for me:


Index: fs/xfs/xfs_inode.c
===================================================================
RCS file: /cvs/linux-2.6-xfs/fs/xfs/xfs_inode.c,v
retrieving revision 1.406
diff -u -p -r1.406 xfs_inode.c
--- fs/xfs/xfs_inode.c	27 Oct 2004 12:06:24 -0000	1.406
+++ fs/xfs/xfs_inode.c	23 Nov 2004 20:40:56 -0000
@@ -1224,9 +1224,16 @@ xfs_ialloc(
 	ip->i_d.di_nextents = 0;
 	ASSERT(ip->i_d.di_nblocks == 0);
 	xfs_ichgtime(ip, XFS_ICHGTIME_CHG|XFS_ICHGTIME_ACC|XFS_ICHGTIME_MOD);
+
 	/*
-	 * di_gen will have been taken care of in xfs_iread.
+	 * Bump the generation count so no one will confuse us with an
+	 * earlier incarnations of this inode.
+	 *
+	 * Done early to skip generation 0, which is used as a wildcard
+	 * by higher level code.
 	 */
+	ip->i_d.di_gen++;
+
 	ip->i_d.di_extsize = 0;
 	ip->i_d.di_dmevmask = 0;
 	ip->i_d.di_dmstate = 0;
@@ -2370,11 +2377,6 @@ xfs_ifree(
 		XFS_IFORK_DSIZE(ip) / (uint)sizeof(xfs_bmbt_rec_t);
 	ip->i_d.di_format = XFS_DINODE_FMT_EXTENTS;
 	ip->i_d.di_aformat = XFS_DINODE_FMT_EXTENTS;
-	/*
-	 * Bump the generation count so no one will be confused
-	 * by reincarnations of this inode.
-	 */
-	ip->i_d.di_gen++;
 	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
 
 	if (delete) {
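
The generation-counter problem described above can be modelled in a few
lines of userspace C. This is only a toy sketch under stated assumptions,
not XFS or nfsd code: the toy_* names, the single inode slot, and the
bump_policy switch are all made up for illustration. It compares bumping
the generation when the inode is freed (the old behaviour, where the first
file lives at generation 0) with bumping it when the inode is allocated
(what the patch does).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy inode slot: only the generation counter matters here. */
struct toy_inode {
	uint32_t gen;     /* survives free/realloc of the same inode number */
	bool     in_use;
};

/* Toy NFS-style file handle: the inode number is implied, gen is stored. */
struct toy_handle {
	uint32_t gen;
};

/* Hypothetical switch between the pre-patch and post-patch behaviour. */
enum bump_policy { BUMP_ON_FREE, BUMP_ON_ALLOC };

static struct toy_handle toy_alloc(struct toy_inode *ip, enum bump_policy p)
{
	assert(!ip->in_use);
	if (p == BUMP_ON_ALLOC)
		ip->gen++;            /* first user never sees generation 0 */
	ip->in_use = true;
	return (struct toy_handle){ .gen = ip->gen };
}

static void toy_free(struct toy_inode *ip, enum bump_policy p)
{
	assert(ip->in_use);
	if (p == BUMP_ON_FREE)
		ip->gen++;
	ip->in_use = false;
}

/* Handle lookup in which generation 0 means "match any generation". */
static bool toy_handle_valid(const struct toy_inode *ip, struct toy_handle h)
{
	if (!ip->in_use)
		return false;
	return h.gen == 0 || h.gen == ip->gen;
}

/* Create a file, delete it, reuse the inode: is the old handle accepted? */
static bool stale_handle_accepted(enum bump_policy p)
{
	struct toy_inode slot = { .gen = 0, .in_use = false };
	struct toy_handle old = toy_alloc(&slot, p);

	toy_free(&slot, p);
	(void)toy_alloc(&slot, p);    /* a new file reuses the inode number */
	return toy_handle_valid(&slot, old);
}
```

With BUMP_ON_FREE the first handle carries generation 0, so it keeps
matching after the inode is reused; with BUMP_ON_ALLOC the old handle
carries generation 1 and is rejected once the slot moves on to 2.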


* Re: oops with dual xeon 2.8ghz  4gb ram +smp, software raid, lvm, and xfs
  2004-11-23 22:39       ` Christoph Hellwig
@ 2004-11-23 22:56         ` Jakob Oestergaard
  2004-11-23 23:12           ` Christoph Hellwig
  2004-11-30 17:37         ` Phil Dier
  1 sibling, 1 reply; 28+ messages in thread
From: Jakob Oestergaard @ 2004-11-23 22:56 UTC (permalink / raw)
  To: Christoph Hellwig, Phil Dier, linux-kernel, Scott Holdren, ziggy,
	Jack Massari

On Tue, Nov 23, 2004 at 10:39:35PM +0000, Christoph Hellwig wrote:
> On Tue, Nov 23, 2004 at 06:02:23PM +0100, Jakob Oestergaard wrote:
> > With SMP, what I see is that sometimes a directory might decide that
> > it's a file - but I can't delete it, because it isn't 'empty' (it's
> > still somehow a directory).  Waiting a day or two, the system will
> > change its mind back to letting the directory be a directory. Sometimes
> > modes will be fscked up as well - a regular file can change owner, or it
> > can change modes from '-rw-rw---' to '?---------'.    Weird stuff, no
> > way to reproduce it reliably.
> 
> Actually I can reproduce it reliably by running nfs_fsstress.sh for a
> looong time.  The problem is that in the current XFS code the inode
> generation counter starts at 0, but higher level code uses that as
> a wildcard for any possible generation, so you may get a newly created
> file for a stale NFS file handle of a deleted file with the same inode
> number.
> 
> The patch below fixes it for me:

Very nice!

Is that patch on its way into mainline kernels, or is it waiting for
more test data ?

I could apply it and test it here if that would help (?)

-- 

 / jakob



* Re: oops with dual xeon 2.8ghz  4gb ram +smp, software raid, lvm, and xfs
  2004-11-23 22:56         ` Jakob Oestergaard
@ 2004-11-23 23:12           ` Christoph Hellwig
  0 siblings, 0 replies; 28+ messages in thread
From: Christoph Hellwig @ 2004-11-23 23:12 UTC (permalink / raw)
  To: Jakob Oestergaard, Phil Dier, linux-kernel, Scott Holdren, ziggy,
	Jack Massari

On Tue, Nov 23, 2004 at 11:56:50PM +0100, Jakob Oestergaard wrote:
> Very nice!
> 
> Is that patch on its way into mainline kernels, or is it waiting for
> more test data ?
> 
> I could apply it and test it here if that would help (?)

It's waiting for review right now, but should go into mainline fairly
soon.  Additional testing is of course always welcome.


* Re: oops with dual xeon 2.8ghz  4gb ram +smp, software raid, lvm, and xfs
  2004-11-23  0:17 ` Andrew Morton
  2004-11-23 15:37   ` Phil Dier
@ 2004-11-24 15:45   ` Phil Dier
  2004-11-24 16:56     ` Christoph Hellwig
  2004-11-24 23:12     ` Andrew Morton
  2004-11-24 23:12   ` Neil Brown
  2 siblings, 2 replies; 28+ messages in thread
From: Phil Dier @ 2004-11-24 15:45 UTC (permalink / raw)
  To: linux-kernel

On Mon, 22 Nov 2004 16:17:25 -0800
Andrew Morton <akpm@osdl.org> wrote:

> Phil Dier <phil@dier.us> wrote:
> >
> > I'm setting up a storage array with Linux, software RAID, LVM, and XFS,
> > but I keep getting oopses during heavy I/O. I've been able to reproduce
> > this with 2.6.6, 2.6.8.1, 2.6.9, and 2.6.10-rc2-bk4. I have dual Xeon
> > 2.8s with 4 GB of RAM. I'm using Adaptec and Fusion MPT SCSI devices
> > (more details at the link below). Connected are 2 Ultra160 SCSI
> > JBODs w/ 2 disks apiece. I'm using RAID 10 (or should it be 01?) mirrored
> > stripes.
> > 
> > Due to its size, I've posted my debug info at this location (I've included
> > output from all of the above kernels):
> > 
> > <http://www.icglink.com/cluster-debug-info.html> (~235kb)
> 
> yow.  The dread combination of XFS, LVM, software RAID and bloaty scsi
> drivers.  Looks like a stack overrun.
> 
> Can you rebuild the kernel with CONFIG_4KSTACKS=n?
> 


Looks like 8k stacks did the trick, at least for the oops. Now I'm
seeing the stuff below.

I got a ton more of this with jfs and xfs, but it seems much less with
reiser. Should I be worried, or is this something I can safely ignore?
It doesn't lock the system..  Could files be getting corrupted?


Nov 23 17:38:20 calculon swapper: page allocation failure. order:0, mode:0x20
Nov 23 17:38:20 calculon [<c013c854>] __alloc_pages+0x1b9/0x35e
Nov 23 17:38:20 calculon [<c013ca1e>] __get_free_pages+0x25/0x3f
Nov 23 17:38:20 calculon [<c013fccb>] kmem_getpages+0x21/0xc9
Nov 23 17:38:20 calculon [<c0140813>] alloc_slabmgmt+0x55/0x5f
Nov 23 17:38:20 calculon [<c0140992>] cache_grow+0xab/0x14d
Nov 23 17:38:20 calculon [<c0140ba8>] cache_alloc_refill+0x174/0x219
Nov 23 17:38:20 calculon [<c0140ffe>] __kmalloc+0x85/0x8c
Nov 23 17:38:20 calculon [<c03f4f89>] alloc_skb+0x47/0xe0
Nov 23 17:38:20 calculon [<c032ebe5>] e1000_alloc_rx_buffers+0x44/0xe3
Nov 23 17:38:20 calculon [<c032e8e0>] e1000_clean_rx_irq+0x189/0x44a
Nov 23 17:38:20 calculon [<c032e4f2>] e1000_intr+0x36/0x83
Nov 23 17:38:20 calculon [<c0107899>] handle_IRQ_event+0x31/0x65
Nov 23 17:38:20 calculon [<c0107c19>] do_IRQ+0xb0/0x15f
Nov 23 17:38:20 calculon [<c0105a68>] common_interrupt+0x18/0x20
Nov 23 17:38:20 calculon [<c010301e>] default_idle+0x0/0x2c
Nov 23 17:38:20 calculon [<c0103047>] default_idle+0x29/0x2c
Nov 23 17:38:20 calculon [<c01030bc>] cpu_idle+0x3f/0x58
Nov 23 17:38:20 calculon swapper: page allocation failure. order:0, mode:0x20
Nov 23 17:38:20 calculon [<c013c854>] __alloc_pages+0x1b9/0x35e
Nov 23 17:38:20 calculon [<c013ca1e>] __get_free_pages+0x25/0x3f
Nov 23 17:38:20 calculon [<c013fccb>] kmem_getpages+0x21/0xc9
Nov 23 17:38:20 calculon [<c0140992>] cache_grow+0xab/0x14d
Nov 23 17:38:20 calculon [<c0140ba8>] cache_alloc_refill+0x174/0x219
Nov 23 17:38:20 calculon [<c0140ffe>] __kmalloc+0x85/0x8c
Nov 23 17:38:20 calculon [<c03f4f89>] alloc_skb+0x47/0xe0
Nov 23 17:38:20 calculon [<c032ebe5>] e1000_alloc_rx_buffers+0x44/0xe3
Nov 23 17:38:20 calculon [<c032e8e0>] e1000_clean_rx_irq+0x189/0x44a
Nov 23 17:38:20 calculon [<c032e4f2>] e1000_intr+0x36/0x83
Nov 23 17:38:20 calculon [<c0107899>] handle_IRQ_event+0x31/0x65
Nov 23 17:38:20 calculon [<c0107c19>] do_IRQ+0xb0/0x15f
Nov 23 17:38:20 calculon [<c0105a68>] common_interrupt+0x18/0x20
Nov 23 17:38:20 calculon [<c010301e>] default_idle+0x0/0x2c
Nov 23 17:38:20 calculon [<c0103047>] default_idle+0x29/0x2c
Nov 23 17:38:20 calculon [<c01030bc>] cpu_idle+0x3f/0x58
Nov 23 17:38:20 calculon swapper: page allocation failure. order:0, mode:0x20
Nov 23 17:38:20 calculon [<c013c854>] __alloc_pages+0x1b9/0x35e
Nov 23 17:38:20 calculon [<c013ca1e>] __get_free_pages+0x25/0x3f
Nov 23 17:38:20 calculon [<c013fccb>] kmem_getpages+0x21/0xc9
Nov 23 17:38:20 calculon [<c0140992>] cache_grow+0xab/0x14d
Nov 23 17:38:20 calculon [<c0140ba8>] cache_alloc_refill+0x174/0x219
Nov 23 17:38:20 calculon [<c0140ffe>] __kmalloc+0x85/0x8c
Nov 23 17:38:20 calculon [<c03f4f89>] alloc_skb+0x47/0xe0
Nov 23 17:38:20 calculon [<c032ebe5>] e1000_alloc_rx_buffers+0x44/0xe3
Nov 23 17:38:20 calculon [<c032e8e0>] e1000_clean_rx_irq+0x189/0x44a
Nov 23 17:38:20 calculon [<c032e4f2>] e1000_intr+0x36/0x83
Nov 23 17:38:20 calculon [<c0107899>] handle_IRQ_event+0x31/0x65
Nov 23 17:38:20 calculon [<c0107c19>] do_IRQ+0xb0/0x15f
Nov 23 17:38:20 calculon [<c0105a68>] common_interrupt+0x18/0x20
Nov 23 17:38:20 calculon [<c010301e>] default_idle+0x0/0x2c
Nov 23 17:38:20 calculon [<c0103047>] default_idle+0x29/0x2c
Nov 23 17:38:20 calculon [<c01030bc>] cpu_idle+0x3f/0x58
Nov 23 17:38:20 calculon swapper: page allocation failure. order:0, mode:0x20
Nov 23 17:38:20 calculon [<c013c854>] __alloc_pages+0x1b9/0x35e
Nov 23 17:38:20 calculon [<c013ca1e>] __get_free_pages+0x25/0x3f
Nov 23 17:38:20 calculon [<c013fccb>] kmem_getpages+0x21/0xc9
Nov 23 17:38:20 calculon [<c0140992>] cache_grow+0xab/0x14d
Nov 23 17:38:20 calculon [<c0140ba8>] cache_alloc_refill+0x174/0x219
Nov 23 17:38:20 calculon [<c0140ffe>] __kmalloc+0x85/0x8c
Nov 23 17:38:20 calculon [<c03f4f89>] alloc_skb+0x47/0xe0
Nov 23 17:38:20 calculon [<c032ebe5>] e1000_alloc_rx_buffers+0x44/0xe3
Nov 23 17:38:20 calculon [<c032e8e0>] e1000_clean_rx_irq+0x189/0x44a
Nov 23 17:38:20 calculon [<c0140ff0>] __kmalloc+0x77/0x8c
Nov 23 17:38:20 calculon [<c032e4f2>] e1000_intr+0x36/0x83
Nov 23 17:38:20 calculon [<c03f5020>] alloc_skb+0xde/0xe0
Nov 23 17:38:20 calculon [<c0107899>] handle_IRQ_event+0x31/0x65
Nov 23 17:38:20 calculon [<c0107c19>] do_IRQ+0xb0/0x15f
Nov 23 17:38:20 calculon [<c0105a68>] common_interrupt+0x18/0x20
Nov 23 17:38:20 calculon [<c03fb243>] net_rx_action+0x62/0xf6
Nov 23 17:38:20 calculon [<c0121beb>] __do_softirq+0xb7/0xc6
Nov 23 17:38:20 calculon [<c0121c27>] do_softirq+0x2d/0x2f
Nov 23 17:38:20 calculon [<c0107c8d>] do_IRQ+0x124/0x15f
Nov 23 17:38:20 calculon [<c0105a68>] common_interrupt+0x18/0x20
Nov 23 17:38:20 calculon [<c010301e>] default_idle+0x0/0x2c
Nov 23 17:38:20 calculon [<c0103047>] default_idle+0x29/0x2c
Nov 23 17:38:20 calculon [<c01030bc>] cpu_idle+0x3f/0x58

Nov 24 01:18:09 calculon swapper: page allocation failure. order:0, mode:0x20
Nov 24 01:18:09 calculon [<c013c854>] __alloc_pages+0x1b9/0x35e
Nov 24 01:18:09 calculon [<c040ce57>] ip_local_deliver_finish+0x0/0x181
Nov 24 01:18:09 calculon [<c013ca1e>] __get_free_pages+0x25/0x3f
Nov 24 01:18:09 calculon [<c013fccb>] kmem_getpages+0x21/0xc9
Nov 24 01:18:09 calculon [<c0140813>] alloc_slabmgmt+0x55/0x5f
Nov 24 01:18:09 calculon [<c0140992>] cache_grow+0xab/0x14d
Nov 24 01:18:09 calculon [<c0140ba8>] cache_alloc_refill+0x174/0x219
Nov 24 01:18:09 calculon [<c0140ffe>] __kmalloc+0x85/0x8c
Nov 24 01:18:09 calculon [<c03f4f89>] alloc_skb+0x47/0xe0
Nov 24 01:18:09 calculon [<c032ebe5>] e1000_alloc_rx_buffers+0x44/0xe3
Nov 24 01:18:09 calculon [<c032e8e0>] e1000_clean_rx_irq+0x189/0x44a
Nov 24 01:18:09 calculon [<c012d45d>] rcu_check_quiescent_state+0x78/0x8e
Nov 24 01:18:09 calculon [<c032e4f2>] e1000_intr+0x36/0x83
Nov 24 01:18:09 calculon [<c0107899>] handle_IRQ_event+0x31/0x65
Nov 24 01:18:09 calculon [<c0107c19>] do_IRQ+0xb0/0x15f
Nov 24 01:18:09 calculon [<c0105a68>] common_interrupt+0x18/0x20
Nov 24 01:18:09 calculon [<c010301e>] default_idle+0x0/0x2c
Nov 24 01:18:09 calculon [<c0103047>] default_idle+0x29/0x2c
Nov 24 01:18:09 calculon [<c01030bc>] cpu_idle+0x3f/0x58
Nov 24 01:18:09 calculon swapper: page allocation failure. order:0, mode:0x20
Nov 24 01:18:09 calculon [<c013c854>] __alloc_pages+0x1b9/0x35e
Nov 24 01:18:09 calculon [<c013ca1e>] __get_free_pages+0x25/0x3f
Nov 24 01:18:09 calculon [<c013fccb>] kmem_getpages+0x21/0xc9
Nov 24 01:18:09 calculon [<c0140992>] cache_grow+0xab/0x14d
Nov 24 01:18:09 calculon [<c0140ba8>] cache_alloc_refill+0x174/0x219
Nov 24 01:18:09 calculon [<c0140ffe>] __kmalloc+0x85/0x8c
Nov 24 01:18:09 calculon [<c03f4f89>] alloc_skb+0x47/0xe0
Nov 24 01:18:09 calculon [<c032ebe5>] e1000_alloc_rx_buffers+0x44/0xe3
Nov 24 01:18:09 calculon [<c032e8e0>] e1000_clean_rx_irq+0x189/0x44a
Nov 24 01:18:09 calculon [<c012d45d>] rcu_check_quiescent_state+0x78/0x8e
Nov 24 01:18:09 calculon [<c032e4f2>] e1000_intr+0x36/0x83
Nov 24 01:18:09 calculon [<c0107899>] handle_IRQ_event+0x31/0x65
Nov 24 01:18:09 calculon [<c0107c19>] do_IRQ+0xb0/0x15f
Nov 24 01:18:09 calculon [<c0105a68>] common_interrupt+0x18/0x20
Nov 24 01:18:09 calculon [<c010301e>] default_idle+0x0/0x2c
Nov 24 01:18:09 calculon [<c0103047>] default_idle+0x29/0x2c
Nov 24 01:18:09 calculon [<c01030bc>] cpu_idle+0x3f/0x58



-- 

Phil Dier (ICGLink.com -- 615 370-1530 x733)

/* vim:set noai nocindent ts=8 sw=8: */


* Re: oops with dual xeon 2.8ghz  4gb ram +smp, software raid, lvm, and xfs
  2004-11-24 15:45   ` Phil Dier
@ 2004-11-24 16:56     ` Christoph Hellwig
  2004-11-24 23:12     ` Andrew Morton
  1 sibling, 0 replies; 28+ messages in thread
From: Christoph Hellwig @ 2004-11-24 16:56 UTC (permalink / raw)
  To: Phil Dier; +Cc: linux-kernel

On Wed, Nov 24, 2004 at 09:45:49AM -0600, Phil Dier wrote:
> Looks like 8k stacks did the trick, at least for the oops. Now I'm
> seeing the stuff below.
> 
> I got a ton more of this with jfs and xfs, but it seems much less with
> reiser. Should I be worried, or is this something I can safely ignore?
> It doesn't lock the system..  Could files be getting corrupted?
> 
> 
> Nov 23 17:38:20 calculon swapper: page allocation failure. order:0, mode:0x20

This is pretty harmless.  It just means the NIC driver couldn't allocate as
much memory in the RX path as it wanted.  Try increasing
/proc/sys/vm/min_free_kbytes to make the warnings go away and get fewer
packet drops.
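
To make the "order:0, mode:0x20" part of the warning concrete, here is a
small userspace sketch that decodes it. The flag values and the 4 KiB page
size are assumptions (they match my recollection of i386 and the 2.6.9-era
include/linux/gfp.h, so check your own tree), and the TOY_* and function
names are hypothetical. The point is that order 0 is a single page and
that 0x20 carries no "may wait" bit, so the allocation comes from a context
that cannot sleep to reclaim memory, which is why a larger free reserve
helps.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096UL   /* assumption: i386, 4 KiB pages */

/* Assumed 2.6.9-era GFP bits; verify against include/linux/gfp.h. */
#define TOY_GFP_WAIT 0x10u     /* caller may sleep and reclaim */
#define TOY_GFP_HIGH 0x20u     /* may dip into emergency reserves */

/* An order-N request asks for 2^N physically contiguous pages. */
static unsigned long alloc_bytes(unsigned int order)
{
	assert(order < 20);    /* keep the shift well inside unsigned long */
	return TOY_PAGE_SIZE << order;
}

/* Nonzero if the allocation was allowed to sleep. */
static int may_sleep(unsigned int mode)
{
	return (mode & TOY_GFP_WAIT) != 0;
}

/* Pull "order:N, mode:0xM" out of a log line; returns 0 on success. */
static int parse_failure(const char *line, unsigned int *order,
			 unsigned int *mode)
{
	const char *p = strstr(line, "order:");

	if (!p)
		return -1;
	return sscanf(p, "order:%u, mode:%x", order, mode) == 2 ? 0 : -1;
}
```

Feeding it the first line of the trace gives order 0 (4096 bytes under the
assumptions above) and a mode for which may_sleep() is false.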



* Re: oops with dual xeon 2.8ghz  4gb ram +smp, software raid, lvm, and xfs
  2004-11-23  0:17 ` Andrew Morton
  2004-11-23 15:37   ` Phil Dier
  2004-11-24 15:45   ` Phil Dier
@ 2004-11-24 23:12   ` Neil Brown
  2004-11-24 23:50     ` Andrew Morton
  2 siblings, 1 reply; 28+ messages in thread
From: Neil Brown @ 2004-11-24 23:12 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Phil Dier, Jens Axboe, linux-kernel

On Monday November 22, akpm@osdl.org wrote:
> > <http://www.icglink.com/cluster-debug-info.html> (~235kb)
> 
> yow.  The dread combination of XFS, LVM, software RAID and bloaty scsi
> drivers.  Looks like a stack overrun.
> 
> Can you rebuild the kernel with CONFIG_4KSTACKS=n?
> 

Would the following (untested-but-seems-to-compile -
explanation-of-concept) patch be at all reasonable to avoid stack
depth problems with stacked block devices, or is adding stuff to
task_struct frowned upon? 

NeilBrown

==============================================
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>

### Diffstat output
 ./drivers/block/ll_rw_blk.c |   38 +++++++++++++++++++++++++++++++++++++-
 ./include/linux/sched.h     |    3 +++
 2 files changed, 40 insertions(+), 1 deletion(-)

diff ./drivers/block/ll_rw_blk.c~current~ ./drivers/block/ll_rw_blk.c
--- ./drivers/block/ll_rw_blk.c~current~	2004-11-16 15:55:55.000000000 +1100
+++ ./drivers/block/ll_rw_blk.c	2004-11-25 10:05:14.000000000 +1100
@@ -2609,7 +2609,7 @@ static inline void block_wait_queue_runn
  * bi_sector for remaps as it sees fit.  So the values of these fields
  * should NOT be depended on after the call to generic_make_request.
  */
-void generic_make_request(struct bio *bio)
+static inline void __generic_make_request(struct bio *bio)
 {
 	request_queue_t *q;
 	sector_t maxsector;
@@ -2686,6 +2686,42 @@ end_io:
 	} while (ret);
 }
 
+/*
+ * We only want one ->make_request_fn to be active at a time, 
+ * else stack usage with stacked devices could be a problem.
+ * So use current->bio_{list,tail} to keep a list of requests
+ * submitted by a make_request_fn function.
+ * current->bio_tail is also used as a flag to say if
+ * generic_make_request is currently active in this task or not.
+ * If it is NULL, then no make_request is active.  If it is non-NULL,
+ * then a make_request is active, and new requests should be added
+ * at the tail
+ */
+void generic_make_request(struct bio *bio)
+{
+	if (current->bio_tail) {
+		/* make_request is active */
+		*(current->bio_tail) = bio;
+		bio->bi_next = NULL;
+		current->bio_tail = &bio->bi_next;
+		return;
+	}
+	/* not active yet, make it active */
+	current->bio_list = NULL;
+	current->bio_tail = & current->bio_list;
+	__generic_make_request(bio);
+	while (current->bio_list) {
+		bio = current->bio_list;
+		current->bio_list = bio->bi_next;
+		if (bio->bi_next == NULL)
+			current->bio_tail = &current->bio_list;
+		else
+			bio->bi_next = NULL;
+		__generic_make_request(bio);
+	}
+	current->bio_tail = NULL; /* deactivate */
+}
+	
 EXPORT_SYMBOL(generic_make_request);
 
 /**

diff ./include/linux/sched.h~current~ ./include/linux/sched.h
--- ./include/linux/sched.h~current~	2004-11-25 09:57:07.000000000 +1100
+++ ./include/linux/sched.h	2004-11-25 09:57:34.000000000 +1100
@@ -649,6 +649,9 @@ struct task_struct {
 
 /* journalling filesystem info */
 	void *journal_info;
+	
+/* stacked block device info */
+	struct bio *bio_list, **bio_tail;
 
 /* VM state */
 	struct reclaim_state *reclaim_state;
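
The idea in the patch above can be tried out in userspace. The following
is a toy model only, under stated assumptions: toy_bio, the level field,
and the depth counters are invented for illustration, and two file-scope
variables stand in for the fields the patch adds to task_struct. Each
"layer" resubmits the bio to the layer below, the way a stacked md/dm
device would, and the model records how deeply the make_request calls
actually nest.

```c
#include <assert.h>
#include <stddef.h>

struct toy_bio {
	int level;               /* stacked layers still below this bio */
	struct toy_bio *next;    /* plays the role of bi_next */
};

/* Stand-ins for current->bio_list and current->bio_tail. */
static struct toy_bio *bio_list;
static struct toy_bio **bio_tail;   /* NULL: no make_request active */

static int depth, max_depth, completed;

static void toy_generic_make_request(struct toy_bio *bio);

/* One layer's ->make_request_fn: remap, then resubmit to the layer below. */
static void toy_make_request_fn(struct toy_bio *bio)
{
	depth++;
	if (depth > max_depth)
		max_depth = depth;
	if (bio->level > 0) {
		bio->level--;
		toy_generic_make_request(bio);
	} else {
		completed++;     /* reached the bottom device */
	}
	depth--;
}

static void toy_generic_make_request(struct toy_bio *bio)
{
	if (bio_tail) {
		/* Already active in this task: queue instead of recursing. */
		*bio_tail = bio;
		bio->next = NULL;
		bio_tail = &bio->next;
		return;
	}
	/* Not active yet: become the one active caller and drain the list. */
	bio_list = NULL;
	bio_tail = &bio_list;
	toy_make_request_fn(bio);
	while (bio_list) {
		bio = bio_list;
		bio_list = bio->next;
		if (bio->next == NULL)
			bio_tail = &bio_list;
		else
			bio->next = NULL;
		toy_make_request_fn(bio);
	}
	bio_tail = NULL;         /* deactivate */
}

/* Push one bio through `layers` stacked devices; report nesting depth. */
static int max_depth_for_stack(int layers)
{
	struct toy_bio bio = { .level = layers, .next = NULL };

	assert(layers >= 0);
	depth = max_depth = completed = 0;
	toy_generic_make_request(&bio);
	return max_depth;
}
```

In this model the nesting depth stays at 1 however many layers are
stacked, because every resubmission is deferred to the loop in the
outermost call rather than growing the stack.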


* Re: oops with dual xeon 2.8ghz  4gb ram +smp, software raid, lvm, and xfs
  2004-11-24 15:45   ` Phil Dier
  2004-11-24 16:56     ` Christoph Hellwig
@ 2004-11-24 23:12     ` Andrew Morton
  2004-11-25  0:48       ` Phil Dier
  2004-11-28 11:29       ` David Greaves
  1 sibling, 2 replies; 28+ messages in thread
From: Andrew Morton @ 2004-11-24 23:12 UTC (permalink / raw)
  To: Phil Dier; +Cc: linux-kernel

Phil Dier <phil@dier.us> wrote:
>
> > Can you rebuild the kernel with CONFIG_4KSTACKS=n?
> > 
> 
> 
> Looks like 8k stacks did the trick, at least for the oops. Now I'm
> seeing the stuff below.
> 
> I got a ton more of this with jfs and xfs, but it seems much less with
> reiser. Should I be worried, or is this something I can safely ignore?
> It doesn't lock the system..  Could files be getting corrupted?
> 
> 
> Nov 23 17:38:20 calculon swapper: page allocation failure. order:0, mode:0x20
> Nov 23 17:38:20 calculon [<c013c854>] __alloc_pages+0x1b9/0x35e
> Nov 23 17:38:20 calculon [<c013ca1e>] __get_free_pages+0x25/0x3f
> Nov 23 17:38:20 calculon [<c013fccb>] kmem_getpages+0x21/0xc9
> Nov 23 17:38:20 calculon [<c0140813>] alloc_slabmgmt+0x55/0x5f
> Nov 23 17:38:20 calculon [<c0140992>] cache_grow+0xab/0x14d
> Nov 23 17:38:20 calculon [<c0140ba8>] cache_alloc_refill+0x174/0x219
> Nov 23 17:38:20 calculon [<c0140ffe>] __kmalloc+0x85/0x8c
> Nov 23 17:38:20 calculon [<c03f4f89>] alloc_skb+0x47/0xe0
> Nov 23 17:38:20 calculon [<c032ebe5>] e1000_alloc_rx_buffers+0x44/0xe3

You didn't mention the kernel version.  2.6.9 had problems in this area, so
2.6.10-rc2 should be better.  And there are post-2.6.10-rc2 fixes which
will provide more headroom.



* Re: oops with dual xeon 2.8ghz  4gb ram +smp, software raid, lvm, and xfs
  2004-11-24 23:12   ` Neil Brown
@ 2004-11-24 23:50     ` Andrew Morton
  2004-11-25  0:14       ` Neil Brown
  0 siblings, 1 reply; 28+ messages in thread
From: Andrew Morton @ 2004-11-24 23:50 UTC (permalink / raw)
  To: Neil Brown; +Cc: phil, axboe, linux-kernel

Neil Brown <neilb@cse.unsw.edu.au> wrote:
>
> Would the following (untested-but-seems-to-compile -
> explanation-of-concept) patch be at all reasonable to avoid stack
> depth problems with stacked block devices, or is adding stuff to
> task_struct frowned upon? 

It's always a tradeoff - we've put things in task_struct before to get
around sticky situations.  Certainly, removing potentially unbounded stack
utilisation is a worthwhile thing to do.

The patch bends my brain a bit.  Shouldn't the queueing happen in
submit_bio()?

Is bi_next free in there?  If anyone tries to do synchronous I/O things
will get stuck.



* Re: oops with dual xeon 2.8ghz  4gb ram +smp, software raid, lvm, and xfs
  2004-11-24 23:50     ` Andrew Morton
@ 2004-11-25  0:14       ` Neil Brown
  2004-11-25  1:05         ` Andrew Morton
  2004-11-25  6:57         ` Jens Axboe
  0 siblings, 2 replies; 28+ messages in thread
From: Neil Brown @ 2004-11-25  0:14 UTC (permalink / raw)
  To: Andrew Morton; +Cc: phil, axboe, linux-kernel

On Wednesday November 24, akpm@osdl.org wrote:
> Neil Brown <neilb@cse.unsw.edu.au> wrote:
> >
> > Would the following (untested-but-seems-to-compile -
> > explanation-of-concept) patch be at all reasonable to avoid stack
> > depth problems with stacked block devices, or is adding stuff to
> > task_struct frowned upon? 
> 
> It's always a tradeoff - we've put things in task_struct before to get
> around sticky situations.  Certainly, removing potentially unbounded stack
> utilisation is a worthwhile thing to do.
> 
> The patch bends my brain a bit.

Recursion is like that (... like recursion, that is :-).

>                                   Shouldn't the queueing happen in
> submit_bio()?

Both md and dm call generic_make_request rather than submit_bio to
start IO on slaves, so it wouldn't work in submit_bio.  If dm and md
were changed to use submit_bio, then the counts (page-in, page-out)
would be quite different...

> 
> Is bi_next free in there?  If anyone tries to do synchronous I/O things
> will get stuck.

It is my understanding that bi_next is free.  It is available for use
by ->make_request_fn and below. __make_request uses it for chaining
bio's together  into a request.  raid5 uses it for other things.

If a ->make_request_fn did synchronous IO things would definitely get
unstuck.   But I don't think they should and doubt if they do (md
certainly doesn't).
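
For illustration only, and not the patch itself: a minimal userspace
sketch of the concept, assuming made-up names throughout (fake_bio,
fake_task, fake_submit and fake_make_request are all hypothetical).
While one submission is active for a task, nested submissions are
chained through bi_next and handled by the outermost loop, so the
nesting depth stays constant however many devices are stacked.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bio: only what the sketch needs. */
struct fake_bio {
	struct fake_bio *bi_next;	/* used to chain pending bios */
	int level;			/* 0 = real device, >0 = stacked device */
	struct fake_bio *child;		/* bio this level passes to the level below */
};

/* Hypothetical stand-in for the per-task state the concept needs. */
struct fake_task {
	struct fake_bio *head, **tail;	/* pending list */
	int active;			/* already inside fake_submit()? */
	int depth, max_depth;		/* nesting of fake_make_request() */
	int completed;			/* bios that reached level 0 */
};

static void fake_submit(struct fake_task *t, struct fake_bio *bio);

/* Plays the role of a stacking driver's ->make_request_fn. */
static void fake_make_request(struct fake_task *t, struct fake_bio *bio)
{
	t->depth++;
	if (t->depth > t->max_depth)
		t->max_depth = t->depth;
	if (bio->level == 0)
		t->completed++;			/* bottom device: I/O "done" */
	else
		fake_submit(t, bio->child);	/* would recurse without the list */
	t->depth--;
}

static void fake_submit(struct fake_task *t, struct fake_bio *bio)
{
	if (t->active) {
		/* Nested call: append to the pending list and return. */
		bio->bi_next = NULL;
		*t->tail = bio;
		t->tail = &bio->bi_next;
		return;
	}

	t->active = 1;
	t->head = NULL;
	t->tail = &t->head;
	do {
		fake_make_request(t, bio);
		/* Pop the next pending bio, if a nested call queued one. */
		bio = t->head;
		if (bio) {
			t->head = bio->bi_next;
			if (!t->head)
				t->tail = &t->head;
			bio->bi_next = NULL;
		}
	} while (bio);
	t->active = 0;
}
```

With a three-level stack (say fs on dm on md on disk) the bio is
handed down one level per loop iteration rather than one level per
stack frame.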

NeilBrown

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: oops with dual xeon 2.8ghz  4gb ram +smp, software raid, lvm, and xfs
  2004-11-24 23:12     ` Andrew Morton
@ 2004-11-25  0:48       ` Phil Dier
  2004-11-28 11:29       ` David Greaves
  1 sibling, 0 replies; 28+ messages in thread
From: Phil Dier @ 2004-11-25  0:48 UTC (permalink / raw)
  To: linux-kernel

On Wed, 24 Nov 2004 15:12:34 -0800
Andrew Morton <akpm@osdl.org> wrote:

> You didn't mention the kernel version. 2.6.9 had problems in this
> area, so 2.6.10-rc2 should be better. And there are post-2.6.10-rc2
> fixes which will provide more headroom.
>

Sorry, yes, it is 2.6.9 that I'm using atm. I pushed
/proc/sys/vm/min_free_kbytes up to 2048 (it was at 987 or something)
as Christoph suggested and so far, so good. It was such an infrequent
thing though, it's hard to tell if it did any good. I left some stuff
hammering on the array to run over the holiday break, so hopefully any
bad stuff will shake out. I'll give 2.6.10-rc2+ a whirl when I get back
on Monday.


Thanks everyone,

Phil

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: oops with dual xeon 2.8ghz  4gb ram +smp, software raid, lvm, and xfs
  2004-11-25  0:14       ` Neil Brown
@ 2004-11-25  1:05         ` Andrew Morton
  2004-11-25  6:57         ` Jens Axboe
  1 sibling, 0 replies; 28+ messages in thread
From: Andrew Morton @ 2004-11-25  1:05 UTC (permalink / raw)
  To: Neil Brown; +Cc: phil, axboe, linux-kernel

Neil Brown <neilb@cse.unsw.edu.au> wrote:
>
> If a ->make_request_fn did synchronous IO things would definitely get
>  unstuck.   But I don't think they should and doubt if they do (md
>  certainly doesn't).

generic_make_request() can block in get_request_wait(), but I can't
immediately think of a way in which that can deadlock things, especially if
each level is using a distinct queue.

It could certainly deadlock if a higher-level make_request() caller
required allocation of two or more requests at a lower level - all we'd
need is N/2 processes each trying to allocate two requests.  But such a
lockup could happen in the current code anyway..  

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: oops with dual xeon 2.8ghz  4gb ram +smp, software raid, lvm, and xfs
  2004-11-25  0:14       ` Neil Brown
  2004-11-25  1:05         ` Andrew Morton
@ 2004-11-25  6:57         ` Jens Axboe
  2004-11-25  7:08           ` Andrew Morton
  1 sibling, 1 reply; 28+ messages in thread
From: Jens Axboe @ 2004-11-25  6:57 UTC (permalink / raw)
  To: Neil Brown; +Cc: Andrew Morton, phil, linux-kernel

On Thu, Nov 25 2004, Neil Brown wrote:
> On Wednesday November 24, akpm@osdl.org wrote:
> > Neil Brown <neilb@cse.unsw.edu.au> wrote:
> > >
> > > Would the following (untested-but-seems-to-compile -
> > > explanation-of-concept) patch be at all reasonable to avoid stack
> > > depth problems with stacked block devices, or is adding stuff to
> > > task_struct frowned upon? 
> > 
> > It's always a tradeoff - we've put things in task_struct before to get
> > around sticky situations.  Certainly, removing potentially unbounded stack
> > utilisation is a worthwhile thing to do.
> > 
> > The patch bends my brain a bit.
> 
> Recursion is like that (... like recursion, that is :-).

Pardon my ignorance, but where is the bug that called for something like
this? I can't say I love the idea of adding a bio list structure to the
tasklist, it feels pretty hacky. generic_make_request() doesn't really
use that much stack, if you just kill the BDEVNAME_SIZE struct.

===== drivers/block/ll_rw_blk.c 1.280 vs edited =====
--- 1.280/drivers/block/ll_rw_blk.c	2004-11-15 11:21:40 +01:00
+++ edited/drivers/block/ll_rw_blk.c	2004-11-25 07:56:10 +01:00
@@ -67,6 +67,11 @@
 EXPORT_SYMBOL(blk_max_low_pfn);
 EXPORT_SYMBOL(blk_max_pfn);
 
+struct b_name {
+	char b[BDEVNAME_SIZE];
+};
+static DEFINE_PER_CPU(struct b_name, b_cpu_name);
+
 /* Amount of time in which a process may batch requests */
 #define BLK_BATCH_TIME	(HZ/50UL)
 
@@ -2622,19 +2627,21 @@
 
 		if (maxsector < nr_sectors ||
 		    maxsector - nr_sectors < sector) {
-			char b[BDEVNAME_SIZE];
+			struct b_name *bn = &get_cpu_var(b_cpu_name);
+
 			/* This may well happen - the kernel calls
 			 * bread() without checking the size of the
 			 * device, e.g., when mounting a device. */
 			printk(KERN_INFO
 			       "attempt to access beyond end of device\n");
 			printk(KERN_INFO "%s: rw=%ld, want=%Lu, limit=%Lu\n",
-			       bdevname(bio->bi_bdev, b),
+			       bdevname(bio->bi_bdev, bn->b),
 			       bio->bi_rw,
 			       (unsigned long long) sector + nr_sectors,
 			       (long long) maxsector);
 
 			set_bit(BIO_EOF, &bio->bi_flags);
+			put_cpu_var(bn);
 			goto end_io;
 		}
 	}

> >                                   Shouldn't the queueing happen in
> > submit_bio()?
> 
> Both md and dm call generic_make_request rather than submit_bio to
> start IO on slaves, so it wouldn't work in submit_bio.  If dm and md
> > were changed to use submit_bio, then the counts (page-in, page-out)
> would be quite different...

generic_make_request() has always been where the unstacking has
happened, so yeah submit_bio() would not work.

> > 
> > Is bi_next free in there?  If anyone tries to do synchronous I/O things
> > will get stuck.
> 
> It is my understanding that bi_next is free.  It is available for use
> by ->make_request_fn and below. __make_request uses it for chaining
> bio's together  into a request.  raid5 uses it for other things.

That's correct, bi_next is only used for request chaining. So it's
available for free use by the stacking drivers up until they call
make_request on a bio.

> If a ->make_request_fn did synchronous IO things would definitely get
> unstuck.   But I don't think they should and doubt if they do (md
> certainly doesn't).

There's nothing guaranteeing that a make_request would not do sync io.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: oops with dual xeon 2.8ghz  4gb ram +smp, software raid, lvm, and xfs
  2004-11-25  6:57         ` Jens Axboe
@ 2004-11-25  7:08           ` Andrew Morton
  2004-11-25  7:11             ` Jens Axboe
  0 siblings, 1 reply; 28+ messages in thread
From: Andrew Morton @ 2004-11-25  7:08 UTC (permalink / raw)
  To: Jens Axboe; +Cc: neilb, phil, linux-kernel

Jens Axboe <axboe@suse.de> wrote:
>
> On Thu, Nov 25 2004, Neil Brown wrote:
> > On Wednesday November 24, akpm@osdl.org wrote:
> > > Neil Brown <neilb@cse.unsw.edu.au> wrote:
> > > >
> > > > Would the following (untested-but-seems-to-compile -
> > > > explanation-of-concept) patch be at all reasonable to avoid stack
> > > > depth problems with stacked block devices, or is adding stuff to
> > > > task_struct frowned upon? 
> > > 
> > > It's always a tradeoff - we've put things in task_struct before to get
> > > around sticky situations.  Certainly, removing potentially unbounded stack
> > > utilisation is a worthwhile thing to do.
> > > 
> > > The patch bends my brain a bit.
> > 
> > Recursion is like that (... like recursion, that is :-).
> 
> Pardon my ignorance, but where is the bug that called for something like
> this?

Well there was an xfs-on-raid-on-lvm stack overrun reported, but the
general problem we're addressing here is that stacking drivers can cause
arbitrary amounts of kernel stack windup.

> I can't say I love the idea of adding a bio list structure to the
> tasklist, it feels pretty hacky. generic_make_request() doesn't really
> use that much stack, if you just kill the BDEVNAME_SIZE struct.

Looks like a sensible thing to do, although it would be tidier to move the
whole thing into a separate function, no?


--- 25/drivers/block/ll_rw_blk.c~generic_make_request-stack-savings	2004-11-24 23:03:06.347778648 -0800
+++ 25-akpm/drivers/block/ll_rw_blk.c	2004-11-24 23:07:39.798207864 -0800
@@ -2584,6 +2584,20 @@ static inline void block_wait_queue_runn
 	}
 }
 
+static void handle_bad_sector(struct bio *bio)
+{
+	char b[BDEVNAME_SIZE];
+
+	printk(KERN_INFO "attempt to access beyond end of device\n");
+	printk(KERN_INFO "%s: rw=%ld, want=%Lu, limit=%Lu\n",
+			bdevname(bio->bi_bdev, b),
+			bio->bi_rw,
+			(unsigned long long)bio->bi_sector + bio_sectors(bio),
+			(long long)(bio->bi_bdev->bd_inode->i_size >> 9));
+
+	set_bit(BIO_EOF, &bio->bi_flags);
+}
+
 /**
  * generic_make_request: hand a buffer to its device driver for I/O
  * @bio:  The bio describing the location in memory and on the device.
@@ -2620,21 +2634,13 @@ void generic_make_request(struct bio *bi
 	if (maxsector) {
 		sector_t sector = bio->bi_sector;
 
-		if (maxsector < nr_sectors ||
-		    maxsector - nr_sectors < sector) {
-			char b[BDEVNAME_SIZE];
-			/* This may well happen - the kernel calls
-			 * bread() without checking the size of the
-			 * device, e.g., when mounting a device. */
-			printk(KERN_INFO
-			       "attempt to access beyond end of device\n");
-			printk(KERN_INFO "%s: rw=%ld, want=%Lu, limit=%Lu\n",
-			       bdevname(bio->bi_bdev, b),
-			       bio->bi_rw,
-			       (unsigned long long) sector + nr_sectors,
-			       (long long) maxsector);
-
-			set_bit(BIO_EOF, &bio->bi_flags);
+		if (maxsector < nr_sectors || maxsector - nr_sectors < sector) {
+			/*
+			 * This may well happen - the kernel calls bread()
+			 * without checking the size of the device, e.g., when
+			 * mounting a device.
+			 */
+			handle_bad_sector(bio);
 			goto end_io;
 		}
 	}
_
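
As an aside on the range test that both versions of the patch leave
untouched: it is written as a subtraction so that it cannot wrap.  A
minimal userspace sketch, assuming a 64-bit sector_t (as with
CONFIG_LBD=y); the function names below are made up for illustration
and do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sector_t;	/* assumption: 64-bit sector numbers */

/* Same shape as the test in generic_make_request(): no addition, so no wrap. */
static int beyond_end_of_device(sector_t maxsector, sector_t sector,
				sector_t nr_sectors)
{
	return maxsector < nr_sectors || maxsector - nr_sectors < sector;
}

/* The obvious-looking alternative: sector + nr_sectors can wrap around. */
static int naive_beyond_end(sector_t maxsector, sector_t sector,
			    sector_t nr_sectors)
{
	return sector + nr_sectors > maxsector;
}
```

For a start sector near the top of the range, the naive form wraps to
a small number and lets the request through, while the subtraction
form rejects it.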


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: oops with dual xeon 2.8ghz  4gb ram +smp, software raid, lvm, and xfs
  2004-11-25  7:08           ` Andrew Morton
@ 2004-11-25  7:11             ` Jens Axboe
  0 siblings, 0 replies; 28+ messages in thread
From: Jens Axboe @ 2004-11-25  7:11 UTC (permalink / raw)
  To: Andrew Morton; +Cc: neilb, phil, linux-kernel

On Wed, Nov 24 2004, Andrew Morton wrote:
> Jens Axboe <axboe@suse.de> wrote:
> >
> > On Thu, Nov 25 2004, Neil Brown wrote:
> > > On Wednesday November 24, akpm@osdl.org wrote:
> > > > Neil Brown <neilb@cse.unsw.edu.au> wrote:
> > > > >
> > > > > Would the following (untested-but-seems-to-compile -
> > > > > explanation-of-concept) patch be at all reasonable to avoid stack
> > > > > depth problems with stacked block devices, or is adding stuff to
> > > > > task_struct frowned upon? 
> > > > 
> > > > It's always a tradeoff - we've put things in task_struct before to get
> > > > around sticky situations.  Certainly, removing potentially unbounded stack
> > > > utilisation is a worthwhile thing to do.
> > > > 
> > > > The patch bends my brain a bit.
> > > 
> > > Recursion is like that (... like recursion, that is :-).
> > 
> > Pardon my ignorance, but where is the bug that called for something like
> > this?
> 
> Well there was an xfs-on-raid-on-lvm stack overrun reported, but the
> general problem we're addressing here is that stacking drivers can cause
> arbitrary amounts of kernel stack windup.

Ok. Without b[] on the stack locally, I don't think it's an issue.

> > I can't say I love the idea of adding a bio list structure to the
> > tasklist, it feels pretty hacky. generic_make_request() doesn't really
> > use that much stack, if you just kill the BDEVNAME_SIZE struct.
> 
> Looks like a sensible thing to do, although it would be tidier to move the
> whole thing into a separate function, no?

Yep, works for me.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: oops with dual xeon 2.8ghz  4gb ram +smp, software raid, lvm, and xfs
  2004-11-24 23:12     ` Andrew Morton
  2004-11-25  0:48       ` Phil Dier
@ 2004-11-28 11:29       ` David Greaves
  2004-11-28 18:27         ` Andrew Morton
  1 sibling, 1 reply; 28+ messages in thread
From: David Greaves @ 2004-11-28 11:29 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Phil Dier, linux-kernel

Andrew Morton wrote:

>Phil Dier <phil@dier.us> wrote:
>
>>>Can you rebuild the kernel with CONFIG_4KSTACKS=n?
>>>
>>Looks like 8k stacks did the trick, at least for the oops. Now I'm
>>seeing the stuff below.
>>
>>I got a ton more of this with jfs and xfs, but it seems much less with
>>reiser. Should I be worried, or is this something I can safely ignore?
>>It doesn't lock the system..  Could files be getting corrupted?
>>
>>
>>Nov 23 17:38:20 calculon swapper: page allocation failure. order:0, mode:0x20
>>Nov 23 17:38:20 calculon [<c013c854>] __alloc_pages+0x1b9/0x35e
>>Nov 23 17:38:20 calculon [<c013ca1e>] __get_free_pages+0x25/0x3f
>>Nov 23 17:38:20 calculon [<c013fccb>] kmem_getpages+0x21/0xc9
>>Nov 23 17:38:20 calculon [<c0140813>] alloc_slabmgmt+0x55/0x5f
>>Nov 23 17:38:20 calculon [<c0140992>] cache_grow+0xab/0x14d
>>Nov 23 17:38:20 calculon [<c0140ba8>] cache_alloc_refill+0x174/0x219
>>Nov 23 17:38:20 calculon [<c0140ffe>] __kmalloc+0x85/0x8c
>>Nov 23 17:38:20 calculon [<c03f4f89>] alloc_skb+0x47/0xe0
>>Nov 23 17:38:20 calculon [<c032ebe5>] e1000_alloc_rx_buffers+0x44/0xe3
>>
>
>You didn't mention the kernel version.  2.6.9 had problems in this area, so
>2.6.10-rc2 should be better.  And there are post-2.6.10-rc2 fixes which
>will provide more headroom.

Hi
I have a system that's running 2.6.10rc2
It has libata sata_promise + sata_sil drives in an md raid5 array that's 
used by lvm2 and then xfs; then exported via nfs.
I saw this thread, upgraded to 2.6.10rc2 and I'm posting this in case 
it's related (it's hard to tell)

This oops happened whilst the box was quiet

Hopefully relevant config bits:
Single processor
echo 16384 > /proc/sys/vm/min_free_kbytes
CONFIG_4KSTACKS=n
I've done a memtest.
I haven't applied the inode patch - I'm usually writing a single 1-3Gb 
file whilst reading another.

Can I help by providing anything else?

Nov 28 09:05:03 cu kernel: Unable to handle kernel paging request at virtual address 00100104
Nov 28 09:05:03 cu kernel:  printing eip:
Nov 28 09:05:03 cu kernel: c0139a62
Nov 28 09:05:03 cu kernel: *pde = 00000000
Nov 28 09:05:03 cu kernel: Oops: 0002 [#1]
Nov 28 09:05:03 cu kernel: Modules linked in: nfs af_packet ipv6 e100 mii usblp uhci_hcd usbcore nfsd exportfs lockd sunrpc sk98lin unix
Nov 28 09:05:03 cu kernel: CPU:    0
Nov 28 09:05:03 cu kernel: EIP:    0060:[cache_alloc_refill+210/528]    Not tainted VLI
Nov 28 09:05:03 cu kernel: EFLAGS: 00010046   (2.6.10-rc2)
Nov 28 09:05:03 cu kernel: EIP is at cache_alloc_refill+0xd2/0x210
Nov 28 09:05:03 cu kernel: eax: 00100100   ebx: dffe2a00   ecx: ffffffff   edx: dffe3a6c
Nov 28 09:05:03 cu kernel: esi: c6118020   edi: c6118038   ebp: dffe2a10   esp: dd627e40
Nov 28 09:05:03 cu kernel: ds: 007b   es: 007b   ss: 0068
Nov 28 09:05:03 cu kernel: Process nfsd (pid: 2230, threadinfo=dd626000 task=df1a7a00)
Nov 28 09:05:03 cu kernel: Stack: 0000002c 00000008 ca45dcbc c6118038 dffe3a6c dffe3a74 00000296 ca45dcbc
Nov 28 09:05:03 cu kernel:        d12c7b7c 00000000 c0139d8e dffe3a60 000000d0 fffffff4 c0162d8c dffe3a60
Nov 28 09:05:03 cu kernel:        000000d0 dd627ee4 d12c7b7c 00000000 c015922d d12c7b7c fffffff4 ca45dcbc
Nov 28 09:05:03 cu kernel: Call Trace:
Nov 28 09:05:03 cu kernel:  [kmem_cache_alloc+62/64] kmem_cache_alloc+0x3e/0x40
Nov 28 09:05:03 cu kernel:  [d_alloc+28/416] d_alloc+0x1c/0x1a0
Nov 28 09:05:03 cu kernel:  [cached_lookup+125/144] cached_lookup+0x7d/0x90
Nov 28 09:05:03 cu kernel:  [__lookup_hash+139/224] __lookup_hash+0x8b/0xe0
Nov 28 09:05:03 cu kernel:  [lookup_hash+31/48] lookup_hash+0x1f/0x30
Nov 28 09:05:03 cu kernel:  [lookup_one_len+97/112] lookup_one_len+0x61/0x70
Nov 28 09:05:03 cu kernel:  [pg0+550179216/1069196288] nfsd_lookup+0x110/0x490 [nfsd]
Nov 28 09:05:03 cu kernel:  [pg0+550211681/1069196288] nfsd3_proc_lookup+0xa1/0xe0 [nfsd]
Nov 28 09:05:03 cu kernel:  [pg0+550167977/1069196288] nfsd_dispatch+0xd9/0x230 [nfsd]
Nov 28 09:05:03 cu kernel:  [pg0+550042452/1069196288] svc_process+0x4a4/0x690 [sunrpc]
Nov 28 09:05:03 cu kernel:  [default_wake_function+0/32] default_wake_function+0x0/0x20
Nov 28 09:05:03 cu kernel:  [pg0+550167404/1069196288] nfsd+0x18c/0x2f0 [nfsd]
Nov 28 09:05:03 cu kernel:  [pg0+550167008/1069196288] nfsd+0x0/0x2f0 [nfsd]
Nov 28 09:05:03 cu kernel:  [kernel_thread_helper+5/20] kernel_thread_helper+0x5/0x14
Nov 28 09:05:03 cu kernel: Code: 8b 56 10 0f b7 46 14 42 89 56 10 8b 7c 24 0c 0f b7 04 47 66 89 46 14 8b 44 24 2c 3b 50 3c 73 06 49 83 f9 ff 75 c3 8b 56 04 8b 06 <89> 50 04 89 02 c7 46 04 00 02 20 00 66 83 7e 14 ff c7 06 00 01


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: oops with dual xeon 2.8ghz  4gb ram +smp, software raid, lvm, and xfs
  2004-11-28 11:29       ` David Greaves
@ 2004-11-28 18:27         ` Andrew Morton
  2004-12-08  9:03           ` David Greaves
  0 siblings, 1 reply; 28+ messages in thread
From: Andrew Morton @ 2004-11-28 18:27 UTC (permalink / raw)
  To: David Greaves; +Cc: phil, linux-kernel

David Greaves <david@dgreaves.com> wrote:
>
> ...
> I have a system that's running 2.6.10rc2
> It has libata sata_promise + sata_sil drives in an md raid5 array that's 
> used by lvm2 and then xfs; then exported via nfs.
> I saw this thread, upgraded to 2.6.10rc2 and I'm posting this in case 
> it's related (it's hard to tell)
> 
> This oops happened whilst the box was quiet
> 
> Hopefully relevant config bits:
> Single processor
> echo 16384 > /proc/sys/vm/min_free_kbytes
> CONFIG_4KSTACKS=n
> I've done a memtest.
> I haven't applied the inode patch - I'm usually writing a single 1-3Gb 
> file whilst reading another.
> 
> Can I help by providing anything else?
> 
> Nov 28 09:05:03 cu kernel: Unable to handle kernel paging request at virtual address 00100104

That's the list_del() poisoning pattern.
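
To spell that out with a small sketch: list_del() overwrites the
pointers of a removed entry with LIST_POISON1 (0x00100100) and
LIST_POISON2 (0x00200200).  A second unlink of the same entry then
writes to next->prev, i.e. one pointer past LIST_POISON1, which on a
32-bit box is the 00100104 in the oops (note eax: 00100100 in the
register dump).  The helper name below is hypothetical; the address
is only computed, never dereferenced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Poison values list_del() stores into a removed entry. */
#define LIST_POISON1 ((uintptr_t)0x00100100)	/* stored in ->next */
#define LIST_POISON2 ((uintptr_t)0x00200200)	/* stored in ->prev */

/*
 * Address written by "next->prev = prev" when next == LIST_POISON1.
 * ptr_size is the pointer size of the machine that oopsed (4 on i386),
 * since ->prev sits exactly one pointer after ->next.
 */
static uintptr_t second_unlink_fault_addr(uintptr_t ptr_size)
{
	return LIST_POISON1 + ptr_size;
}
```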

> Nov 28 09:05:03 cu kernel:  printing eip:
> Nov 28 09:05:03 cu kernel: c0139a62
> Nov 28 09:05:03 cu kernel: *pde = 00000000
> Nov 28 09:05:03 cu kernel: Oops: 0002 [#1]
> Nov 28 09:05:03 cu kernel: Modules linked in: nfs af_packet ipv6 e100 mii usblp uhci_hcd usbcore nfsd exportfs lockd sunrpc sk98lin unix
> Nov 28 09:05:03 cu kernel: CPU:    0
> Nov 28 09:05:03 cu kernel: EIP:    0060:[cache_alloc_refill+210/528]    Not tainted VLI
> Nov 28 09:05:03 cu kernel: EFLAGS: 00010046   (2.6.10-rc2)
> Nov 28 09:05:03 cu kernel: EIP is at cache_alloc_refill+0xd2/0x210
> Nov 28 09:05:03 cu kernel: eax: 00100100   ebx: dffe2a00   ecx: ffffffff   edx: dffe3a6c
> Nov 28 09:05:03 cu kernel: esi: c6118020   edi: c6118038   ebp: dffe2a10   esp: dd627e40
> Nov 28 09:05:03 cu kernel: ds: 007b   es: 007b   ss: 0068
> Nov 28 09:05:03 cu kernel: Process nfsd (pid: 2230, threadinfo=dd626000 task=df1a7a00)
> Nov 28 09:05:03 cu kernel: Stack: 0000002c 00000008 ca45dcbc c6118038 dffe3a6c dffe3a74 00000296 ca45dcbc
> Nov 28 09:05:03 cu kernel:        d12c7b7c 00000000 c0139d8e dffe3a60 000000d0 fffffff4 c0162d8c dffe3a60
> Nov 28 09:05:03 cu kernel:        000000d0 dd627ee4 d12c7b7c 00000000 c015922d d12c7b7c fffffff4 ca45dcbc
> Nov 28 09:05:03 cu kernel: Call Trace:
> Nov 28 09:05:03 cu kernel:  [kmem_cache_alloc+62/64] kmem_cache_alloc+0x3e/0x40
> Nov 28 09:05:03 cu kernel:  [d_alloc+28/416] d_alloc+0x1c/0x1a0
> Nov 28 09:05:03 cu kernel:  [cached_lookup+125/144] cached_lookup+0x7d/0x90
> Nov 28 09:05:03 cu kernel:  [__lookup_hash+139/224] __lookup_hash+0x8b/0xe0
> Nov 28 09:05:03 cu kernel:  [lookup_hash+31/48] lookup_hash+0x1f/0x30
> Nov 28 09:05:03 cu kernel:  [lookup_one_len+97/112] lookup_one_len+0x61/0x70
> Nov 28 09:05:03 cu kernel:  [pg0+550179216/1069196288] nfsd_lookup+0x110/0x490 [nfsd]
> Nov 28 09:05:03 cu kernel:  [pg0+550211681/1069196288] nfsd3_proc_lookup+0xa1/0xe0 [nfsd]
> Nov 28 09:05:03 cu kernel:  [pg0+550167977/1069196288] nfsd_dispatch+0xd9/0x230 [nfsd]
> Nov 28 09:05:03 cu kernel:  [pg0+550042452/1069196288] svc_process+0x4a4/0x690 [sunrpc]
> Nov 28 09:05:03 cu kernel:  [default_wake_function+0/32] default_wake_function+0x0/0x20
> Nov 28 09:05:03 cu kernel:  [pg0+550167404/1069196288] nfsd+0x18c/0x2f0 [nfsd]
> Nov 28 09:05:03 cu kernel:  [pg0+550167008/1069196288] nfsd+0x0/0x2f0 [nfsd]
> Nov 28 09:05:03 cu kernel:  [kernel_thread_helper+5/20]

It appears that the dentry cache's slab freelists have become corrupted. 
Odd, because everyone uses that code a lot.  I'd suggest that you enable
CONFIG_DEBUG_SLAB, see if that catches anything.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: oops with dual xeon 2.8ghz  4gb ram +smp, software raid, lvm, and xfs
  2004-11-23 22:39       ` Christoph Hellwig
  2004-11-23 22:56         ` Jakob Oestergaard
@ 2004-11-30 17:37         ` Phil Dier
  1 sibling, 0 replies; 28+ messages in thread
From: Phil Dier @ 2004-11-30 17:37 UTC (permalink / raw)
  To: linux-kernel

Using the patch below with nfs_fsstress.sh results in this oops:

Unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
00000000
*pde = 00000000
Oops: 0000 [#1]
SMP 
Modules linked in:
CPU:    1
EIP:    0060:[<00000000>]    Not tainted VLI
EFLAGS: 00010286   (2.6.9) 
EIP is at 0x0
eax: c05b2be0   ebx: fffffff4   ecx: f590e744   edx: f590e744
esi: f03508e0   edi: f2f418c0   ebp: 00000000   esp: f7bafeac
ds: 007b   es: 007b   ss: 0068
Process nfsd (pid: 6095, threadinfo=f7bae000 task=f70260b0)
Stack: c01638b6 f03508e0 f2f418c0 00000000 ffffffff f6f860d9 c3183204 f6f860c8 
       c0163905 f7bafee8 f590e710 00000000 c0163970 f7bafee8 f590e710 b28e88ba 
       00000011 f6f860c8 00000011 f590e710 00000011 c01f4f16 f6f860c8 f590e710 
Call Trace:
 [<c01638b6>] __lookup_hash+0xa6/0xd6
 [<c0163905>] lookup_hash+0x1f/0x23
 [<c0163970>] lookup_one_len+0x67/0x74
 [<c01f4f16>] nfsd_lookup+0x115/0x4be
 [<c01fd791>] nfsd3_proc_lookup+0xa1/0xe0
 [<c01f23c7>] nfsd_dispatch+0xd9/0x1fa
 [<c043adda>] svc_process+0x56a/0x784
 [<c0119d71>] default_wake_function+0x0/0x12
 [<c01f2148>] nfsd+0x1f3/0x399
 [<c01f1f55>] nfsd+0x0/0x399
 [<c0103271>] kernel_thread_helper+0x5/0xb
Code:  Bad EIP value.

Here is my .config:

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.9
# Tue Nov 30 10:20:05 2004
#
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_CLEAN_COMPILE=y

#
# General setup
#
CONFIG_LOCALVERSION=""
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_POSIX_MQUEUE=y
# CONFIG_BSD_PROCESS_ACCT is not set
CONFIG_SYSCTL=y
CONFIG_AUDIT=y
CONFIG_AUDITSYSCALL=y
CONFIG_LOG_BUF_SHIFT=15
CONFIG_HOTPLUG=y
# CONFIG_IKCONFIG is not set
# CONFIG_EMBEDDED is not set
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SHMEM=y
# CONFIG_TINY_SHMEM is not set

#
# Loadable module support
#
CONFIG_MODULES=y
# CONFIG_MODULE_UNLOAD is not set
CONFIG_OBSOLETE_MODPARM=y
# CONFIG_MODVERSIONS is not set
CONFIG_KMOD=y

#
# Processor type and features
#
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
CONFIG_MPENTIUM4=y
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=7
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_HPET_TIMER=y
# CONFIG_HPET_EMULATE_RTC is not set
CONFIG_SMP=y
CONFIG_NR_CPUS=4
CONFIG_SCHED_SMT=y
# CONFIG_PREEMPT is not set
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_TSC=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_NONFATAL=y
CONFIG_X86_MCE_P4THERMAL=y
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
# CONFIG_MICROCODE is not set
# CONFIG_X86_MSR is not set
# CONFIG_X86_CPUID is not set

#
# Firmware Drivers
#
# CONFIG_EDD is not set
# CONFIG_NOHIGHMEM is not set
CONFIG_HIGHMEM4G=y
# CONFIG_HIGHMEM64G is not set
CONFIG_HIGHMEM=y
CONFIG_HIGHPTE=y
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
# CONFIG_EFI is not set
CONFIG_IRQBALANCE=y
CONFIG_HAVE_DEC_LOCK=y
# CONFIG_REGPARM is not set

#
# Power management options (ACPI, APM)
#
CONFIG_PM=y
# CONFIG_PM_DEBUG is not set
# CONFIG_SOFTWARE_SUSPEND is not set

#
# ACPI (Advanced Configuration and Power Interface) Support
#
CONFIG_ACPI=y
CONFIG_ACPI_BOOT=y
CONFIG_ACPI_INTERPRETER=y
CONFIG_ACPI_SLEEP=y
CONFIG_ACPI_SLEEP_PROC_FS=y
CONFIG_ACPI_AC=y
CONFIG_ACPI_BATTERY=y
CONFIG_ACPI_BUTTON=y
CONFIG_ACPI_FAN=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_THERMAL=y
# CONFIG_ACPI_ASUS is not set
# CONFIG_ACPI_TOSHIBA is not set
CONFIG_ACPI_BLACKLIST_YEAR=0
# CONFIG_ACPI_DEBUG is not set
CONFIG_ACPI_BUS=y
CONFIG_ACPI_EC=y
CONFIG_ACPI_POWER=y
CONFIG_ACPI_PCI=y
CONFIG_ACPI_SYSTEM=y
# CONFIG_X86_PM_TIMER is not set

#
# APM (Advanced Power Management) BIOS Support
#
# CONFIG_APM is not set

#
# CPU Frequency scaling
#
# CONFIG_CPU_FREQ is not set

#
# Bus options (PCI, PCMCIA, EISA, MCA, ISA)
#
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GOMMCONFIG is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
# CONFIG_PCI_MSI is not set
CONFIG_PCI_LEGACY_PROC=y
CONFIG_PCI_NAMES=y
CONFIG_ISA=y
# CONFIG_EISA is not set
# CONFIG_MCA is not set
# CONFIG_SCx200 is not set

#
# PCMCIA/CardBus support
#
# CONFIG_PCMCIA is not set
CONFIG_PCMCIA_PROBE=y

#
# PCI Hotplug Support
#
# CONFIG_HOTPLUG_PCI is not set

#
# Executable file formats
#
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_AOUT=y
CONFIG_BINFMT_MISC=y

#
# Device Drivers
#

#
# Generic Driver Options
#
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FW_LOADER=m
# CONFIG_DEBUG_DRIVER is not set

#
# Memory Technology Devices (MTD)
#
# CONFIG_MTD is not set

#
# Parallel port support
#
CONFIG_PARPORT=y
CONFIG_PARPORT_PC=y
CONFIG_PARPORT_PC_CML1=y
# CONFIG_PARPORT_SERIAL is not set
# CONFIG_PARPORT_PC_FIFO is not set
# CONFIG_PARPORT_PC_SUPERIO is not set
# CONFIG_PARPORT_OTHER is not set
# CONFIG_PARPORT_1284 is not set

#
# Plug and Play support
#
CONFIG_PNP=y
# CONFIG_PNP_DEBUG is not set

#
# Protocols
#
# CONFIG_ISAPNP is not set
# CONFIG_PNPBIOS is not set

#
# Block devices
#
CONFIG_BLK_DEV_FD=y
# CONFIG_BLK_DEV_XD is not set
# CONFIG_PARIDE is not set
# CONFIG_BLK_CPQ_DA is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_BLK_DEV_DAC960 is not set
# CONFIG_BLK_DEV_UMEM is not set
CONFIG_BLK_DEV_LOOP=y
# CONFIG_BLK_DEV_CRYPTOLOOP is not set
# CONFIG_BLK_DEV_NBD is not set
# CONFIG_BLK_DEV_SX8 is not set
# CONFIG_BLK_DEV_UB is not set
# CONFIG_BLK_DEV_RAM is not set
CONFIG_LBD=y

#
# ATA/ATAPI/MFM/RLL support
#
CONFIG_IDE=y
CONFIG_BLK_DEV_IDE=y

#
# Please see Documentation/ide.txt for help/info on IDE drives
#
# CONFIG_BLK_DEV_IDE_SATA is not set
# CONFIG_BLK_DEV_HD_IDE is not set
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_IDEDISK_MULTI_MODE=y
CONFIG_BLK_DEV_IDECD=y
# CONFIG_BLK_DEV_IDETAPE is not set
# CONFIG_BLK_DEV_IDEFLOPPY is not set
# CONFIG_BLK_DEV_IDESCSI is not set
# CONFIG_IDE_TASK_IOCTL is not set
CONFIG_IDE_TASKFILE_IO=y

#
# IDE chipset support/bugfixes
#
CONFIG_IDE_GENERIC=y
# CONFIG_BLK_DEV_CMD640 is not set
# CONFIG_BLK_DEV_IDEPNP is not set
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_SHARE_IRQ=y
# CONFIG_BLK_DEV_OFFBOARD is not set
CONFIG_BLK_DEV_GENERIC=y
# CONFIG_BLK_DEV_OPTI621 is not set
# CONFIG_BLK_DEV_RZ1000 is not set
CONFIG_BLK_DEV_IDEDMA_PCI=y
# CONFIG_BLK_DEV_IDEDMA_FORCED is not set
CONFIG_IDEDMA_PCI_AUTO=y
# CONFIG_IDEDMA_ONLYDISK is not set
# CONFIG_BLK_DEV_AEC62XX is not set
# CONFIG_BLK_DEV_ALI15X3 is not set
# CONFIG_BLK_DEV_AMD74XX is not set
# CONFIG_BLK_DEV_ATIIXP is not set
# CONFIG_BLK_DEV_CMD64X is not set
# CONFIG_BLK_DEV_TRIFLEX is not set
# CONFIG_BLK_DEV_CY82C693 is not set
# CONFIG_BLK_DEV_CS5520 is not set
# CONFIG_BLK_DEV_CS5530 is not set
# CONFIG_BLK_DEV_HPT34X is not set
# CONFIG_BLK_DEV_HPT366 is not set
# CONFIG_BLK_DEV_SC1200 is not set
CONFIG_BLK_DEV_PIIX=y
# CONFIG_BLK_DEV_NS87415 is not set
# CONFIG_BLK_DEV_PDC202XX_OLD is not set
# CONFIG_BLK_DEV_PDC202XX_NEW is not set
# CONFIG_BLK_DEV_SVWKS is not set
# CONFIG_BLK_DEV_SIIMAGE is not set
# CONFIG_BLK_DEV_SIS5513 is not set
# CONFIG_BLK_DEV_SLC90E66 is not set
# CONFIG_BLK_DEV_TRM290 is not set
# CONFIG_BLK_DEV_VIA82CXXX is not set
# CONFIG_IDE_ARM is not set
# CONFIG_IDE_CHIPSETS is not set
CONFIG_BLK_DEV_IDEDMA=y
# CONFIG_IDEDMA_IVB is not set
CONFIG_IDEDMA_AUTO=y
# CONFIG_BLK_DEV_HD is not set

#
# SCSI device support
#
CONFIG_SCSI=y
CONFIG_SCSI_PROC_FS=y

#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=y
# CONFIG_CHR_DEV_ST is not set
# CONFIG_CHR_DEV_OSST is not set
# CONFIG_BLK_DEV_SR is not set
CONFIG_CHR_DEV_SG=y

#
# Some SCSI devices (e.g. CD jukebox) support multiple LUNs
#
# CONFIG_SCSI_MULTI_LUN is not set
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_LOGGING=y

#
# SCSI Transport Attributes
#
# CONFIG_SCSI_SPI_ATTRS is not set
# CONFIG_SCSI_FC_ATTRS is not set

#
# SCSI low-level drivers
#
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
# CONFIG_SCSI_3W_9XXX is not set
# CONFIG_SCSI_7000FASST is not set
# CONFIG_SCSI_ACARD is not set
# CONFIG_SCSI_AHA152X is not set
# CONFIG_SCSI_AHA1542 is not set
# CONFIG_SCSI_AACRAID is not set
# CONFIG_SCSI_AIC7XXX is not set
# CONFIG_SCSI_AIC7XXX_OLD is not set
CONFIG_SCSI_AIC79XX=y
CONFIG_AIC79XX_CMDS_PER_DEVICE=32
CONFIG_AIC79XX_RESET_DELAY_MS=15000
# CONFIG_AIC79XX_ENABLE_RD_STRM is not set
# CONFIG_AIC79XX_DEBUG_ENABLE is not set
CONFIG_AIC79XX_DEBUG_MASK=0
# CONFIG_AIC79XX_REG_PRETTY_PRINT is not set
# CONFIG_SCSI_DPT_I2O is not set
# CONFIG_SCSI_IN2000 is not set
CONFIG_MEGARAID_NEWGEN=y
CONFIG_MEGARAID_MM=y
CONFIG_MEGARAID_MAILBOX=y
# CONFIG_SCSI_SATA is not set
# CONFIG_SCSI_BUSLOGIC is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_DTC3280 is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_EATA_PIO is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
# CONFIG_SCSI_GDTH is not set
# CONFIG_SCSI_GENERIC_NCR5380 is not set
# CONFIG_SCSI_GENERIC_NCR5380_MMIO is not set
# CONFIG_SCSI_IPS is not set
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_PPA is not set
# CONFIG_SCSI_IMM is not set
# CONFIG_SCSI_NCR53C406A is not set
# CONFIG_SCSI_SYM53C8XX_2 is not set
# CONFIG_SCSI_IPR is not set
# CONFIG_SCSI_PAS16 is not set
# CONFIG_SCSI_PSI240I is not set
# CONFIG_SCSI_QLOGIC_FAS is not set
# CONFIG_SCSI_QLOGIC_ISP is not set
# CONFIG_SCSI_QLOGIC_FC is not set
# CONFIG_SCSI_QLOGIC_1280 is not set
CONFIG_SCSI_QLA2XXX=y
# CONFIG_SCSI_QLA21XX is not set
# CONFIG_SCSI_QLA22XX is not set
# CONFIG_SCSI_QLA2300 is not set
# CONFIG_SCSI_QLA2322 is not set
# CONFIG_SCSI_QLA6312 is not set
# CONFIG_SCSI_QLA6322 is not set
# CONFIG_SCSI_SYM53C416 is not set
# CONFIG_SCSI_DC395x is not set
# CONFIG_SCSI_DC390T is not set
# CONFIG_SCSI_T128 is not set
# CONFIG_SCSI_U14_34F is not set
# CONFIG_SCSI_ULTRASTOR is not set
# CONFIG_SCSI_NSP32 is not set
# CONFIG_SCSI_DEBUG is not set

#
# Old CD-ROM drivers (not SCSI, not IDE)
#
# CONFIG_CD_NO_IDESCSI is not set

#
# Multi-device support (RAID and LVM)
#
CONFIG_MD=y
CONFIG_BLK_DEV_MD=y
# CONFIG_MD_LINEAR is not set
CONFIG_MD_RAID0=y
CONFIG_MD_RAID1=y
CONFIG_MD_RAID10=y
# CONFIG_MD_RAID5 is not set
# CONFIG_MD_RAID6 is not set
# CONFIG_MD_MULTIPATH is not set
CONFIG_BLK_DEV_DM=y
# CONFIG_DM_CRYPT is not set
CONFIG_DM_SNAPSHOT=y
CONFIG_DM_MIRROR=y
CONFIG_DM_ZERO=y

#
# Fusion MPT device support
#
CONFIG_FUSION=y
CONFIG_FUSION_MAX_SGE=40
# CONFIG_FUSION_CTL is not set

#
# IEEE 1394 (FireWire) support
#
# CONFIG_IEEE1394 is not set

#
# I2O device support
#
# CONFIG_I2O is not set

#
# Networking support
#
CONFIG_NET=y

#
# Networking options
#
CONFIG_PACKET=y
# CONFIG_PACKET_MMAP is not set
# CONFIG_NETLINK_DEV is not set
CONFIG_UNIX=y
# CONFIG_NET_KEY is not set
CONFIG_INET=y
# CONFIG_IP_MULTICAST is not set
# CONFIG_IP_ADVANCED_ROUTER is not set
# CONFIG_IP_PNP is not set
# CONFIG_NET_IPIP is not set
# CONFIG_NET_IPGRE is not set
# CONFIG_ARPD is not set
# CONFIG_SYN_COOKIES is not set
# CONFIG_INET_AH is not set
# CONFIG_INET_ESP is not set
# CONFIG_INET_IPCOMP is not set
# CONFIG_INET_TUNNEL is not set
# CONFIG_IPV6 is not set
# CONFIG_NETFILTER is not set

#
# SCTP Configuration (EXPERIMENTAL)
#
# CONFIG_IP_SCTP is not set
# CONFIG_ATM is not set
# CONFIG_BRIDGE is not set
# CONFIG_VLAN_8021Q is not set
# CONFIG_DECNET is not set
# CONFIG_LLC2 is not set
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_NET_DIVERT is not set
# CONFIG_ECONET is not set
# CONFIG_WAN_ROUTER is not set
# CONFIG_NET_HW_FLOWCONTROL is not set

#
# QoS and/or fair queueing
#
# CONFIG_NET_SCHED is not set
# CONFIG_NET_CLS_ROUTE is not set

#
# Network testing
#
# CONFIG_NET_PKTGEN is not set
# CONFIG_NETPOLL is not set
# CONFIG_NET_POLL_CONTROLLER is not set
# CONFIG_HAMRADIO is not set
# CONFIG_IRDA is not set
# CONFIG_BT is not set
CONFIG_NETDEVICES=y
CONFIG_DUMMY=y
# CONFIG_BONDING is not set
# CONFIG_EQUALIZER is not set
# CONFIG_TUN is not set
# CONFIG_NET_SB1000 is not set

#
# ARCnet devices
#
# CONFIG_ARCNET is not set

#
# Ethernet (10 or 100Mbit)
#
CONFIG_NET_ETHERNET=y
CONFIG_MII=y
# CONFIG_HAPPYMEAL is not set
# CONFIG_SUNGEM is not set
# CONFIG_NET_VENDOR_3COM is not set
# CONFIG_LANCE is not set
# CONFIG_NET_VENDOR_SMC is not set
# CONFIG_NET_VENDOR_RACAL is not set

#
# Tulip family network device support
#
# CONFIG_NET_TULIP is not set
# CONFIG_AT1700 is not set
# CONFIG_DEPCA is not set
# CONFIG_HP100 is not set
# CONFIG_NET_ISA is not set
CONFIG_NET_PCI=y
# CONFIG_PCNET32 is not set
# CONFIG_AMD8111_ETH is not set
# CONFIG_ADAPTEC_STARFIRE is not set
# CONFIG_AC3200 is not set
# CONFIG_APRICOT is not set
# CONFIG_B44 is not set
# CONFIG_FORCEDETH is not set
# CONFIG_CS89x0 is not set
# CONFIG_DGRS is not set
# CONFIG_EEPRO100 is not set
CONFIG_E100=y
# CONFIG_E100_NAPI is not set
# CONFIG_FEALNX is not set
# CONFIG_NATSEMI is not set
# CONFIG_NE2K_PCI is not set
# CONFIG_8139CP is not set
# CONFIG_8139TOO is not set
# CONFIG_SIS900 is not set
# CONFIG_EPIC100 is not set
# CONFIG_SUNDANCE is not set
# CONFIG_TLAN is not set
# CONFIG_VIA_RHINE is not set
# CONFIG_VIA_VELOCITY is not set
# CONFIG_NET_POCKET is not set

#
# Ethernet (1000 Mbit)
#
# CONFIG_ACENIC is not set
# CONFIG_DL2K is not set
CONFIG_E1000=y
# CONFIG_E1000_NAPI is not set
# CONFIG_NS83820 is not set
# CONFIG_HAMACHI is not set
# CONFIG_YELLOWFIN is not set
# CONFIG_R8169 is not set
# CONFIG_SK98LIN is not set
# CONFIG_TIGON3 is not set

#
# Ethernet (10000 Mbit)
#
# CONFIG_IXGB is not set
CONFIG_S2IO=m
# CONFIG_S2IO_NAPI is not set

#
# Token Ring devices
#
# CONFIG_TR is not set

#
# Wireless LAN (non-hamradio)
#
# CONFIG_NET_RADIO is not set

#
# Wan interfaces
#
# CONFIG_WAN is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
# CONFIG_PLIP is not set
# CONFIG_PPP is not set
# CONFIG_SLIP is not set
# CONFIG_NET_FC is not set
# CONFIG_SHAPER is not set
# CONFIG_NETCONSOLE is not set

#
# ISDN subsystem
#
# CONFIG_ISDN is not set

#
# Telephony Support
#
# CONFIG_PHONE is not set

#
# Input device support
#
CONFIG_INPUT=y

#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
# CONFIG_INPUT_JOYDEV is not set
# CONFIG_INPUT_TSDEV is not set
CONFIG_INPUT_EVDEV=y
# CONFIG_INPUT_EVBUG is not set

#
# Input I/O drivers
#
# CONFIG_GAMEPORT is not set
CONFIG_SOUND_GAMEPORT=y
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
# CONFIG_SERIO_SERPORT is not set
# CONFIG_SERIO_CT82C710 is not set
# CONFIG_SERIO_PARKBD is not set
# CONFIG_SERIO_PCIPS2 is not set
# CONFIG_SERIO_RAW is not set

#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_LKKBD is not set
# CONFIG_KEYBOARD_XTKBD is not set
# CONFIG_KEYBOARD_NEWTON is not set
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
# CONFIG_MOUSE_SERIAL is not set
# CONFIG_MOUSE_INPORT is not set
# CONFIG_MOUSE_LOGIBM is not set
# CONFIG_MOUSE_PC110PAD is not set
# CONFIG_MOUSE_VSXXXAA is not set
# CONFIG_INPUT_JOYSTICK is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
# CONFIG_INPUT_MISC is not set

#
# Character devices
#
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
# CONFIG_SERIAL_NONSTANDARD is not set

#
# Serial drivers
#
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_CONSOLE=y
# CONFIG_SERIAL_8250_ACPI is not set
CONFIG_SERIAL_8250_NR_UARTS=4
# CONFIG_SERIAL_8250_EXTENDED is not set

#
# Non-8250 serial port support
#
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
CONFIG_UNIX98_PTYS=y
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=256
# CONFIG_PRINTER is not set
# CONFIG_PPDEV is not set
# CONFIG_TIPAR is not set

#
# IPMI
#
# CONFIG_IPMI_HANDLER is not set

#
# Watchdog Cards
#
# CONFIG_WATCHDOG is not set
CONFIG_HW_RANDOM=y
# CONFIG_NVRAM is not set
CONFIG_RTC=y
# CONFIG_DTLK is not set
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set
# CONFIG_SONYPI is not set

#
# Ftape, the floppy tape device driver
#
# CONFIG_AGP is not set
# CONFIG_DRM is not set
# CONFIG_MWAVE is not set
# CONFIG_RAW_DRIVER is not set
# CONFIG_HPET is not set
# CONFIG_HANGCHECK_TIMER is not set

#
# I2C support
#
# CONFIG_I2C is not set

#
# Dallas's 1-wire bus
#
# CONFIG_W1 is not set

#
# Misc devices
#
# CONFIG_IBM_ASM is not set

#
# Multimedia devices
#
# CONFIG_VIDEO_DEV is not set

#
# Digital Video Broadcasting Devices
#
# CONFIG_DVB is not set

#
# Graphics support
#
# CONFIG_FB is not set
# CONFIG_VIDEO_SELECT is not set

#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
# CONFIG_MDA_CONSOLE is not set
CONFIG_DUMMY_CONSOLE=y

#
# Sound
#
# CONFIG_SOUND is not set

#
# USB support
#
CONFIG_USB=y
# CONFIG_USB_DEBUG is not set

#
# Miscellaneous USB options
#
CONFIG_USB_DEVICEFS=y
# CONFIG_USB_BANDWIDTH is not set
# CONFIG_USB_DYNAMIC_MINORS is not set
# CONFIG_USB_SUSPEND is not set
# CONFIG_USB_OTG is not set

#
# USB Host Controller Drivers
#
# CONFIG_USB_EHCI_HCD is not set
CONFIG_USB_OHCI_HCD=y
CONFIG_USB_UHCI_HCD=y

#
# USB Device Class drivers
#
# CONFIG_USB_BLUETOOTH_TTY is not set
# CONFIG_USB_ACM is not set
CONFIG_USB_PRINTER=y
CONFIG_USB_STORAGE=y
# CONFIG_USB_STORAGE_DEBUG is not set
# CONFIG_USB_STORAGE_RW_DETECT is not set
CONFIG_USB_STORAGE_DATAFAB=y
CONFIG_USB_STORAGE_FREECOM=y
CONFIG_USB_STORAGE_ISD200=y
CONFIG_USB_STORAGE_DPCM=y
# CONFIG_USB_STORAGE_HP8200e is not set
CONFIG_USB_STORAGE_SDDR09=y
CONFIG_USB_STORAGE_SDDR55=y
CONFIG_USB_STORAGE_JUMPSHOT=y

#
# USB Human Interface Devices (HID)
#
CONFIG_USB_HID=y
CONFIG_USB_HIDINPUT=y
# CONFIG_HID_FF is not set
CONFIG_USB_HIDDEV=y
# CONFIG_USB_AIPTEK is not set
# CONFIG_USB_WACOM is not set
# CONFIG_USB_KBTAB is not set
# CONFIG_USB_POWERMATE is not set
# CONFIG_USB_MTOUCH is not set
# CONFIG_USB_EGALAX is not set
# CONFIG_USB_XPAD is not set
# CONFIG_USB_ATI_REMOTE is not set

#
# USB Imaging devices
#
# CONFIG_USB_MDC800 is not set
# CONFIG_USB_MICROTEK is not set
# CONFIG_USB_HPUSBSCSI is not set

#
# USB Multimedia devices
#
# CONFIG_USB_DABUSB is not set

#
# Video4Linux support is needed for USB Multimedia device support
#

#
# USB Network adaptors
#
# CONFIG_USB_CATC is not set
# CONFIG_USB_KAWETH is not set
# CONFIG_USB_PEGASUS is not set
# CONFIG_USB_RTL8150 is not set
# CONFIG_USB_USBNET is not set

#
# USB port drivers
#
# CONFIG_USB_USS720 is not set

#
# USB Serial Converter support
#
# CONFIG_USB_SERIAL is not set

#
# USB Miscellaneous drivers
#
# CONFIG_USB_EMI62 is not set
# CONFIG_USB_EMI26 is not set
# CONFIG_USB_TIGL is not set
# CONFIG_USB_AUERSWALD is not set
# CONFIG_USB_RIO500 is not set
# CONFIG_USB_LEGOTOWER is not set
# CONFIG_USB_LCD is not set
# CONFIG_USB_LED is not set
# CONFIG_USB_CYTHERM is not set
# CONFIG_USB_PHIDGETSERVO is not set
# CONFIG_USB_TEST is not set

#
# USB Gadget Support
#
# CONFIG_USB_GADGET is not set

#
# File systems
#
CONFIG_EXT2_FS=y
# CONFIG_EXT2_FS_XATTR is not set
CONFIG_EXT3_FS=y
# CONFIG_EXT3_FS_XATTR is not set
CONFIG_JBD=y
# CONFIG_JBD_DEBUG is not set
CONFIG_REISERFS_FS=y
# CONFIG_REISERFS_CHECK is not set
# CONFIG_REISERFS_PROC_INFO is not set
# CONFIG_REISERFS_FS_XATTR is not set
CONFIG_JFS_FS=y
# CONFIG_JFS_POSIX_ACL is not set
# CONFIG_JFS_DEBUG is not set
# CONFIG_JFS_STATISTICS is not set
CONFIG_XFS_FS=y
# CONFIG_XFS_RT is not set
# CONFIG_XFS_QUOTA is not set
# CONFIG_XFS_SECURITY is not set
# CONFIG_XFS_POSIX_ACL is not set
# CONFIG_MINIX_FS is not set
# CONFIG_ROMFS_FS is not set
# CONFIG_QUOTA is not set
# CONFIG_AUTOFS_FS is not set
# CONFIG_AUTOFS4_FS is not set

#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
# CONFIG_ZISOFS is not set
CONFIG_UDF_FS=y
CONFIG_UDF_NLS=y

#
# DOS/FAT/NT Filesystems
#
CONFIG_FAT_FS=y
CONFIG_MSDOS_FS=y
CONFIG_VFAT_FS=y
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
# CONFIG_NTFS_FS is not set

#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_SYSFS=y
# CONFIG_DEVFS_FS is not set
# CONFIG_DEVPTS_FS_XATTR is not set
CONFIG_TMPFS=y
# CONFIG_HUGETLBFS is not set
# CONFIG_HUGETLB_PAGE is not set
CONFIG_RAMFS=y

#
# Miscellaneous filesystems
#
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_HFSPLUS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
# CONFIG_CRAMFS is not set
# CONFIG_VXFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set

#
# Network File Systems
#
CONFIG_NFS_FS=y
CONFIG_NFS_V3=y
# CONFIG_NFS_V4 is not set
# CONFIG_NFS_DIRECTIO is not set
CONFIG_NFSD=y
CONFIG_NFSD_V3=y
# CONFIG_NFSD_V4 is not set
CONFIG_NFSD_TCP=y
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y
CONFIG_EXPORTFS=y
CONFIG_SUNRPC=y
# CONFIG_RPCSEC_GSS_KRB5 is not set
# CONFIG_RPCSEC_GSS_SPKM3 is not set
CONFIG_SMB_FS=y
# CONFIG_SMB_NLS_DEFAULT is not set
# CONFIG_CIFS is not set
# CONFIG_NCP_FS is not set
# CONFIG_CODA_FS is not set
# CONFIG_AFS_FS is not set

#
# Partition Types
#
# CONFIG_PARTITION_ADVANCED is not set
CONFIG_MSDOS_PARTITION=y

#
# Native Language Support
#
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_NLS_CODEPAGE_437=y
# CONFIG_NLS_CODEPAGE_737 is not set
# CONFIG_NLS_CODEPAGE_775 is not set
# CONFIG_NLS_CODEPAGE_850 is not set
# CONFIG_NLS_CODEPAGE_852 is not set
# CONFIG_NLS_CODEPAGE_855 is not set
# CONFIG_NLS_CODEPAGE_857 is not set
# CONFIG_NLS_CODEPAGE_860 is not set
# CONFIG_NLS_CODEPAGE_861 is not set
# CONFIG_NLS_CODEPAGE_862 is not set
# CONFIG_NLS_CODEPAGE_863 is not set
# CONFIG_NLS_CODEPAGE_864 is not set
# CONFIG_NLS_CODEPAGE_865 is not set
# CONFIG_NLS_CODEPAGE_866 is not set
# CONFIG_NLS_CODEPAGE_869 is not set
# CONFIG_NLS_CODEPAGE_936 is not set
# CONFIG_NLS_CODEPAGE_950 is not set
# CONFIG_NLS_CODEPAGE_932 is not set
# CONFIG_NLS_CODEPAGE_949 is not set
# CONFIG_NLS_CODEPAGE_874 is not set
# CONFIG_NLS_ISO8859_8 is not set
# CONFIG_NLS_CODEPAGE_1250 is not set
# CONFIG_NLS_CODEPAGE_1251 is not set
# CONFIG_NLS_ASCII is not set
CONFIG_NLS_ISO8859_1=y
# CONFIG_NLS_ISO8859_2 is not set
# CONFIG_NLS_ISO8859_3 is not set
# CONFIG_NLS_ISO8859_4 is not set
# CONFIG_NLS_ISO8859_5 is not set
# CONFIG_NLS_ISO8859_6 is not set
# CONFIG_NLS_ISO8859_7 is not set
# CONFIG_NLS_ISO8859_9 is not set
# CONFIG_NLS_ISO8859_13 is not set
# CONFIG_NLS_ISO8859_14 is not set
# CONFIG_NLS_ISO8859_15 is not set
# CONFIG_NLS_KOI8_R is not set
# CONFIG_NLS_KOI8_U is not set
# CONFIG_NLS_UTF8 is not set

#
# Profiling support
#
# CONFIG_PROFILING is not set

#
# Kernel hacking
#
CONFIG_DEBUG_KERNEL=y
CONFIG_MAGIC_SYSRQ=y
# CONFIG_DEBUG_SLAB is not set
# CONFIG_DEBUG_SPINLOCK is not set
# CONFIG_DEBUG_SPINLOCK_SLEEP is not set
# CONFIG_DEBUG_HIGHMEM is not set
# CONFIG_DEBUG_INFO is not set
# CONFIG_FRAME_POINTER is not set
CONFIG_EARLY_PRINTK=y
CONFIG_DEBUG_STACKOVERFLOW=y
# CONFIG_KPROBES is not set
CONFIG_DEBUG_STACK_USAGE=y
# CONFIG_DEBUG_PAGEALLOC is not set
# CONFIG_4KSTACKS is not set
# CONFIG_SCHEDSTATS is not set
CONFIG_X86_FIND_SMP_CONFIG=y
CONFIG_X86_MPPARSE=y

#
# Security options
#
# CONFIG_SECURITY is not set

#
# Cryptographic options
#
CONFIG_CRYPTO=y
# CONFIG_CRYPTO_HMAC is not set
# CONFIG_CRYPTO_NULL is not set
# CONFIG_CRYPTO_MD4 is not set
# CONFIG_CRYPTO_MD5 is not set
# CONFIG_CRYPTO_SHA1 is not set
# CONFIG_CRYPTO_SHA256 is not set
# CONFIG_CRYPTO_SHA512 is not set
# CONFIG_CRYPTO_WP512 is not set
# CONFIG_CRYPTO_DES is not set
# CONFIG_CRYPTO_BLOWFISH is not set
# CONFIG_CRYPTO_TWOFISH is not set
# CONFIG_CRYPTO_SERPENT is not set
# CONFIG_CRYPTO_AES_586 is not set
# CONFIG_CRYPTO_CAST5 is not set
# CONFIG_CRYPTO_CAST6 is not set
# CONFIG_CRYPTO_TEA is not set
# CONFIG_CRYPTO_ARC4 is not set
# CONFIG_CRYPTO_KHAZAD is not set
# CONFIG_CRYPTO_DEFLATE is not set
# CONFIG_CRYPTO_MICHAEL_MIC is not set
# CONFIG_CRYPTO_CRC32C is not set
# CONFIG_CRYPTO_TEST is not set

#
# Library routines
#
# CONFIG_CRC_CCITT is not set
CONFIG_CRC32=y
CONFIG_LIBCRC32C=m
CONFIG_X86_SMP=y
CONFIG_X86_HT=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_X86_TRAMPOLINE=y
CONFIG_PC=y

On Tue, 23 Nov 2004 22:39:35 +0000
Christoph Hellwig <hch@infradead.org> wrote:

> Actually I can reproduce it reliably by running nfs_fsstress.sh for a
> looong time.  The problem is that in the current XFS code the inode
> generation counter starts at 0, but higher level code uses that as
> a wildcard for any possible generation, so you may get a newly created
> file for a stale NFS file handle of a deleted file with the same inode
> number.
> 
> The patch below fixes it for me:
> 
> 
> Index: fs/xfs/xfs_inode.c
> ===================================================================
> RCS file: /cvs/linux-2.6-xfs/fs/xfs/xfs_inode.c,v
> retrieving revision 1.406
> diff -u -p -r1.406 xfs_inode.c
> --- fs/xfs/xfs_inode.c	27 Oct 2004 12:06:24 -0000	1.406
> +++ fs/xfs/xfs_inode.c	23 Nov 2004 20:40:56 -0000
> @@ -1224,9 +1224,16 @@ xfs_ialloc(
>  	ip->i_d.di_nextents = 0;
>  	ASSERT(ip->i_d.di_nblocks == 0);
>  	xfs_ichgtime(ip, XFS_ICHGTIME_CHG|XFS_ICHGTIME_ACC|XFS_ICHGTIME_MOD);
> +
>  	/*
> -	 * di_gen will have been taken care of in xfs_iread.
> +	 * Bump the generation count so no one will confuse us with an
> +	 * earlier incarnations of this inode.
> +	 *
> +	 * Done early to skip generation 0, which is used as a wildcard
> +	 * by higher level code.
>  	 */
> +	ip->i_d.di_gen++;
> +
>  	ip->i_d.di_extsize = 0;
>  	ip->i_d.di_dmevmask = 0;
>  	ip->i_d.di_dmstate = 0;
> @@ -2370,11 +2377,6 @@ xfs_ifree(
>  		XFS_IFORK_DSIZE(ip) / (uint)sizeof(xfs_bmbt_rec_t);
>  	ip->i_d.di_format = XFS_DINODE_FMT_EXTENTS;
>  	ip->i_d.di_aformat = XFS_DINODE_FMT_EXTENTS;
> -	/*
> -	 * Bump the generation count so no one will be confused
> -	 * by reincarnations of this inode.
> -	 */
> -	ip->i_d.di_gen++;
>  	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
>  
>  	if (delete) {


-- 

Phil Dier (ICGLink.com -- 615 370-1530 x733)

/* vim:set noai nocindent ts=8 sw=8: */
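
The failure mode Christoph describes in the quoted text can be sketched in
userspace C. This is not XFS or NFS code: the toy_inode and toy_handle types
and every function name below are hypothetical, and the rule that a handle
generation of 0 matches any inode generation is an assumption taken from his
description of the higher-level code. The sketch compares bumping the
generation when the inode is freed (old behaviour) with bumping it when the
inode is allocated (patched behaviour).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for an on-disk inode and an NFS file handle. */
struct toy_inode  { uint64_t ino; uint32_t gen; bool in_use; };
struct toy_handle { uint64_t ino; uint32_t gen; };

/* Assumed lookup rule: generation 0 in the handle acts as a wildcard. */
static bool handle_matches(const struct toy_handle *fh,
                           const struct toy_inode *ip)
{
    if (!ip->in_use || fh->ino != ip->ino)
        return false;
    return fh->gen == 0 || fh->gen == ip->gen;
}

/* Build a handle for a live inode, copying its current generation. */
static struct toy_handle make_handle(const struct toy_inode *ip)
{
    struct toy_handle fh = { ip->ino, ip->gen };
    assert(ip->in_use);
    return fh;
}

/* Old behaviour: generation is bumped when the inode is freed. */
static void alloc_old(struct toy_inode *ip) { ip->in_use = true; }
static void free_old(struct toy_inode *ip)  { ip->in_use = false; ip->gen++; }

/* Patched behaviour: generation is bumped when the inode is allocated. */
static void alloc_new(struct toy_inode *ip) { ip->in_use = true; ip->gen++; }
static void free_new(struct toy_inode *ip)  { ip->in_use = false; }

/*
 * Create a file, take a handle, delete the file, then reuse the inode
 * number for a new file. Returns true if the stale handle still resolves.
 */
static bool stale_handle_resolves(bool patched)
{
    struct toy_inode ip = { 42, 0, false };
    struct toy_handle stale;

    if (patched) alloc_new(&ip); else alloc_old(&ip);
    stale = make_handle(&ip);
    if (patched) free_new(&ip); else free_old(&ip);
    if (patched) alloc_new(&ip); else alloc_old(&ip);

    return handle_matches(&stale, &ip);
}
```

With the old ordering the first file gets generation 0, so its handle becomes
a wildcard and still matches the recycled inode. With the patched ordering no
live inode in this model has generation 0, and the stale handle is rejected.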


* Re: oops with dual xeon 2.8ghz  4gb ram +smp, software raid, lvm, and xfs
  2004-11-28 18:27         ` Andrew Morton
@ 2004-12-08  9:03           ` David Greaves
  2004-12-08  9:15             ` Andrew Morton
  0 siblings, 1 reply; 28+ messages in thread
From: David Greaves @ 2004-12-08  9:03 UTC (permalink / raw)
  To: Andrew Morton; +Cc: phil, linux-kernel

Andrew Morton wrote:

>David Greaves <david@dgreaves.com> wrote:
>  
>
>>...
>>I have a system that's running 2.6.10rc2
>>It has libata sata_promise + sata_sil drives in an md raid5 array that's 
>>used by lvm2 and then xfs; then exported via nfs.
>>I saw this thread, upgraded to 2.6.10rc2 and I'm posting this in case 
>>it's related (it's hard to tell)
>>
>>This oops happened whilst the box was quiet
>>
>>Hopefully relevant config bits:
>>Single processor
>>echo 16384 > /proc/sys/vm/min_free_kbytes
>>CONFIG_4KSTACKS=n
>>I've done a memtest.
>>I haven't applied the inode patch - I'm usually writing a single 1-3GB 
>>file whilst reading another.
>>
>>Can I help by providing anything else?
>>
>>Nov 28 09:05:03 cu kernel: Unable to handle kernel paging request at 
>>virtual address 00100104
>>    
>>
>
>That's the list_del() poisoning pattern.
>  
>
<snip old log>

>It appears that the dentry cache's slab freelists have become corrupted. 
>Odd, because everyone uses that code a lot.  I'd suggest that you enable
>CONFIG_DEBUG_SLAB, see if that catches anything.
>  
>
Thanks for the reply Andrew.

I did as you suggested and it's been fine until I got this last night.

Dec  8 06:50:04 cu kernel: slab: Internal list corruption detected in cache 'vm_area_struct'(41), slabp cfedd000(13). Hexdump:
Dec  8 06:50:04 cu kernel:
Dec  8 06:50:04 cu kernel: 000: 00 01 10 00 00 02 20 00 6c 00 00 00 6c d0 ed cf
Dec  8 06:50:04 cu kernel: 010: 0d 00 00 00 11 00 14 08 1a 00 fe ff 0a 00 06 00
Dec  8 06:50:04 cu kernel: 020: fe ff fe ff 02 00 fe ff 22 00 21 00 18 00 27 00
Dec  8 06:50:04 cu kernel: 030: ff ff fe ff fe ff 03 00 00 00 19 00 03 00 fe ff
Dec  8 06:50:04 cu kernel: 040: fe ff 08 00 fe ff fe ff 1c 00 10 00 15 00 fe ff
Dec  8 06:50:04 cu kernel: 050: 25 00 12 00 fe ff
Dec  8 06:50:04 cu kernel: ------------[ cut here ]------------
Dec  8 06:50:04 cu kernel: kernel BUG at mm/slab.c:1947!
Dec  8 06:50:04 cu kernel: invalid operand: 0000 [#1]
Dec  8 06:50:04 cu kernel: Modules linked in: nfs af_packet ipv6 e100 mii usblp uhci_hcd usbcore nfsd exportfs lockd sunrpc sk98lin unix
Dec  8 06:50:04 cu kernel: CPU:    0
Dec  8 06:50:04 cu kernel: EIP:    0060:[check_slabp+180/240]    Not tainted VLI
Dec  8 06:50:04 cu kernel: EFLAGS: 00010092   (2.6.10-rc2cu-041128-02)
Dec  8 06:50:04 cu kernel: EIP is at check_slabp+0xb4/0xf0
Dec  8 06:50:04 cu kernel: eax: 00000001   ebx: 00000056   ecx: 00000082   edx: 0000898d
Dec  8 06:50:04 cu kernel: esi: cfedd000   edi: dffe9960   ebp: cfedd018   esp: c1f3bca8
Dec  8 06:50:04 cu kernel: ds: 007b   es: 007b   ss: 0068
Dec  8 06:50:04 cu kernel: Process munin-node (pid: 6456, threadinfo=c1f3a000 task=c32dea00)
Dec  8 06:50:04 cu kernel: Stack: c0352d03 000000ff 00000029 cfedd000 0000000d cfedd000 0000001b cfedda8c
Dec  8 06:50:04 cu kernel:        c013aa19 dffe9960 cfedd000 00000000 dffe996c dffe997c 0000000c 00000010
Dec  8 06:50:04 cu kernel:        dffe9960 c094ba2c dffea728 c013ab2b dffe9960 dffe65e8 00000010 dffe65e8
Dec  8 06:50:04 cu kernel: Call Trace:
Dec  8 06:50:04 cu kernel:  [free_block+153/336] free_block+0x99/0x150
Dec  8 06:50:04 cu kernel:  [cache_flusharray+91/304] cache_flusharray+0x5b/0x130
Dec  8 06:50:04 cu kernel:  [kmem_cache_free+122/128] kmem_cache_free+0x7a/0x80
Dec  8 06:50:04 cu kernel:  [remove_vm_struct+94/128] remove_vm_struct+0x5e/0x80
Dec  8 06:50:04 cu kernel:  [remove_vm_struct+94/128] remove_vm_struct+0x5e/0x80
Dec  8 06:50:04 cu kernel:  [exit_mmap+284/320] exit_mmap+0x11c/0x140
Dec  8 06:50:04 cu kernel:  [mmput+44/128] mmput+0x2c/0x80
Dec  8 06:50:04 cu kernel:  [exec_mmap+121/240] exec_mmap+0x79/0xf0
Dec  8 06:50:04 cu kernel:  [flush_old_exec+202/1616] flush_old_exec+0xca/0x650
Dec  8 06:50:04 cu kernel:  [kernel_read+80/96] kernel_read+0x50/0x60
Dec  8 06:50:04 cu kernel:  [load_elf_binary+827/3184] load_elf_binary+0x33b/0xc70
Dec  8 06:50:04 cu kernel:  [get_empty_filp+70/208] get_empty_filp+0x46/0xd0
Dec  8 06:50:04 cu kernel:  [autoremove_wake_function+0/96] autoremove_wake_function+0x0/0x60
Dec  8 06:50:04 cu kernel:  [kernel_read+80/96] kernel_read+0x50/0x60
Dec  8 06:50:04 cu kernel:  [search_binary_handler+93/432] search_binary_handler+0x5d/0x1b0
Dec  8 06:50:04 cu kernel:  [load_script+520/576] load_script+0x208/0x240
Dec  8 06:50:04 cu kernel:  [__alloc_pages+458/864] __alloc_pages+0x1ca/0x360
Dec  8 06:50:04 cu kernel:  [copy_from_user+66/128] copy_from_user+0x42/0x80
Dec  8 06:50:04 cu kernel:  [copy_strings+392/512] copy_strings+0x188/0x200
Dec  8 06:50:04 cu kernel:  [search_binary_handler+93/432] search_binary_handler+0x5d/0x1b0
Dec  8 06:50:04 cu kernel:  [do_execve+409/528] do_execve+0x199/0x210
Dec  8 06:50:04 cu kernel:  [sys_execve+66/128] sys_execve+0x42/0x80
Dec  8 06:50:04 cu kernel:  [syscall_call+7/11] syscall_call+0x7/0xb
Dec  8 06:50:04 cu kernel: Code: b6 04 33 43 c7 04 24 94 59 34 c0 89 44 24 04 e8 23 d7 fd ff 8b 47 3c 8d 44 00 04 39 c3 72 db c7 04 24 03 2d 35 c0 e8 0c d7 fd ff <0f> 0b 9b 07 1e 58 34 c0 83 c4 14 5b 5e 5f c3 89 5c 24 04 c7 04

Additional info:
when the machine started I got three:
  swapper: page allocation failure. order:1, mode:0x20
before I could:
  echo 16384 > /proc/sys/vm/min_free_kbytes

Anything else you'd like me to try?

David
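
The start of the slab hexdump in this message can be decoded by hand, and the
short C sketch below does the same arithmetic. It assumes a little-endian
32-bit build and assumes that the dumped slab descriptor begins with a
struct list_head (next, prev) followed by two 32-bit fields and then an
in-use counter at offset 0x10; that layout is an assumption for illustration,
not something stated in the thread. The LIST_POISON1 and LIST_POISON2 values
are the ones the 2.6 list code writes into a deleted entry. The helper names
le32_at and list_head_is_poisoned are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LIST_POISON1 0x00100100u  /* written to entry->next by list_del() */
#define LIST_POISON2 0x00200200u  /* written to entry->prev by list_del() */

/* First 0x18 bytes of the hexdump from the slab corruption report. */
static const uint8_t slab_dump[] = {
    0x00, 0x01, 0x10, 0x00,  0x00, 0x02, 0x20, 0x00,
    0x6c, 0x00, 0x00, 0x00,  0x6c, 0xd0, 0xed, 0xcf,
    0x0d, 0x00, 0x00, 0x00,  0x11, 0x00, 0x14, 0x08,
};

/* Read a 32-bit little-endian word at byte offset off. */
static uint32_t le32_at(const uint8_t *buf, size_t len, size_t off)
{
    assert(off + 4 <= len);
    return (uint32_t)buf[off]
         | (uint32_t)buf[off + 1] << 8
         | (uint32_t)buf[off + 2] << 16
         | (uint32_t)buf[off + 3] << 24;
}

/* True if the first two words look like a list_head hit by list_del(). */
static int list_head_is_poisoned(const uint8_t *buf, size_t len)
{
    return le32_at(buf, len, 0) == LIST_POISON1 &&
           le32_at(buf, len, 4) == LIST_POISON2;
}
```

Under those assumptions the first two words are exactly the two poison
values, and the word at offset 0x10 is 13, which agrees with the "(13)"
printed after the slabp address. The fault address in the earlier oops,
00100104, is LIST_POISON1 + 4, which on a 32-bit build is where the prev
field of a list_head at the poison address would sit.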



* Re: oops with dual xeon 2.8ghz  4gb ram +smp, software raid, lvm, and xfs
  2004-12-08  9:03           ` David Greaves
@ 2004-12-08  9:15             ` Andrew Morton
  2004-12-09  3:50               ` Nigel Cunningham
  0 siblings, 1 reply; 28+ messages in thread
From: Andrew Morton @ 2004-12-08  9:15 UTC (permalink / raw)
  To: David Greaves; +Cc: phil, linux-kernel

David Greaves <david@dgreaves.com> wrote:
>
> I did as you suggested and it's been fine until I got this last night.
> 
>  Dec  8 06:50:04 cu kernel: slab: Internal list corruption detected in 
>  cache 'vm_area_struct'(41), slabp cfedd000(13).

That's totally different from the previous oops (it was in dcache).

I'd be suspecting either a random memory scribble or flakey hardware.  It
could well be the latter if you're not using any unusual
drivers/filesystems/etc.



* Re: oops with dual xeon 2.8ghz  4gb ram +smp, software raid, lvm, and xfs
  2004-12-08  9:15             ` Andrew Morton
@ 2004-12-09  3:50               ` Nigel Cunningham
  0 siblings, 0 replies; 28+ messages in thread
From: Nigel Cunningham @ 2004-12-09  3:50 UTC (permalink / raw)
  To: Andrew Morton; +Cc: David Greaves, phil, Linux Kernel Mailing List

Hi Andrew.

On Wed, 2004-12-08 at 20:15, Andrew Morton wrote:
> David Greaves <david@dgreaves.com> wrote:
> >
> > I did as you suggested and it's been fine until I got this last night.
> > 
> >  Dec  8 06:50:04 cu kernel: slab: Internal list corruption detected in 
> >  cache 'vm_area_struct'(41), slabp cfedd000(13).
> 
> That's totally different from the previous oops (it was in dcache).
> 
> I'd be suspecting either a random memory scribble or flakey hardware.  It
> could well be the latter if you're not using any unusual
> drivers/filesystems/etc.

I'm seeing similar things occasionally with 2.6.9+kgdb+suspend2+Win4Lin
on ht/preempt/regparm/4gb highmem. I've come to the conclusion it's
probably not directly suspend (I can do 100 cycles on the trot), but
haven't been able to reliably reproduce it. The corruption is always in
fs related data, but apart from that seems random. Seeing the mention of
being unable to allocate a page at the bottom of David's email makes me
wonder if the difficulties with memory freeing are triggering some code
that's not properly handling failed page allocations.

Regards,

Nigel
-- 
Nigel Cunningham
Pastoral Worker
Christian Reformed Church of Tuggeranong
PO Box 1004, Tuggeranong, ACT 2901

You see, at just the right time, when we were still powerless, Christ
died for the ungodly.		-- Romans 5:6



* Re: oops with dual xeon 2.8ghz  4gb ram +smp, software raid, lvm,  and xfs
       [not found]           ` <34m5a-61Z-9@gated-at.bofh.it>
@ 2004-11-25 11:07             ` Andi Kleen
  0 siblings, 0 replies; 28+ messages in thread
From: Andi Kleen @ 2004-11-25 11:07 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

Andrew Morton <akpm@osdl.org> writes:

> > I can't say I love the idea of adding a bio list structure to the
> > tasklist, it feels pretty hacky. generic_make_request() doesn't really
> > use that much stack, if you just kill the BDEVNAME_SIZE struct.
> 
> Looks like a sensible thing to do, although it would be tidier to move the
> whole thing into a separate function, no?
> 
> 
> --- 25/drivers/block/ll_rw_blk.c~generic_make_request-stack-savings	2004-11-24 23:03:06.347778648 -0800
> +++ 25-akpm/drivers/block/ll_rw_blk.c	2004-11-24 23:07:39.798207864 -0800
> @@ -2584,6 +2584,20 @@ static inline void block_wait_queue_runn
>  	}
>  }
>  
> +static void handle_bad_sector(struct bio *bio)

You need to mark it noinline, otherwise a unit-at-a-time gcc (3.4+) 
will happily inline it anyways.

-Andi
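
Andi's point can be shown with a small userspace sketch, assuming GCC or
Clang. The rarely taken error path needs a name buffer on the stack; keeping
that path in its own function marked noinline keeps the buffer out of the
caller's stack frame even when the compiler would otherwise inline the helper.
All names here (NAME_BUF_SIZE, report_bad_sector, submit_sector,
bad_sector_reports) are hypothetical and only stand in for the kernel code
under discussion; the kernel spells the attribute with its own noinline
macro.

```c
#include <assert.h>
#include <stdio.h>

#define NAME_BUF_SIZE 32  /* hypothetical stand-in for BDEVNAME_SIZE */

static int bad_sector_reports;  /* counts error reports, for illustration */

/*
 * Cold path: the name buffer lives only in this function's frame.
 * noinline stops the compiler from merging it into submit_sector().
 */
static __attribute__((noinline)) void
report_bad_sector(int dev, unsigned long long sector, unsigned long long limit)
{
    char name[NAME_BUF_SIZE];

    snprintf(name, sizeof(name), "dev%d", dev);
    fprintf(stderr, "attempt to access beyond end of device %s: "
            "want=%llu, limit=%llu\n", name, sector, limit);
    bad_sector_reports++;
}

/* Hot path: no large locals, so its frame stays small. */
static int submit_sector(int dev, unsigned long long sector,
                         unsigned long long limit)
{
    assert(limit > 0);
    if (sector >= limit) {
        report_bad_sector(dev, sector, limit);
        return -1;
    }
    return 0;
}
```

The saving matters most when the hot function can be re-entered through
stacked block devices, since every level of nesting pays for the caller's
frame but only the failing level pays for the helper's.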


* Re: oops with dual xeon 2.8ghz  4gb ram +smp, software raid, lvm, and xfs
@ 2004-11-24  9:28 Anders Saaby
  0 siblings, 0 replies; 28+ messages in thread
From: Anders Saaby @ 2004-11-24  9:28 UTC (permalink / raw)
  To: linux-kernel; +Cc: Phil Dier, Jakob Oestergaard, Christoph Hellwig

Hi Phil,

I have some hands-on experience with this kind of setup. I am working with a 
setup fairly similar to yours:

Two (UP) Xeon servers, each with ~1TB of SCSI RAID, running XFS and exporting 
via NFS on 2.6.8.1. They serve ~18,000 home directories under heavy load, 24/7.

I have seen quite a lot of oopses on these servers, but now have a stable 
setup.

Here are the highlights:

< 2.6.8.1 (+/- various patches): XFS b0rks after a short period of heavy load. 
(tried several different setups and patches from SGI) 
2.6.8.1 (+/- various patches): SMP+XFS+NFS Oops's after ~1 Hour under heavy 
load.
2.6.8.1 (without patches): UP+XFS+NFS has now been running stable for 56 days, 
06h 24m. :)

I haven't tried 2.6.9 on these servers yet because of the stale filehandles 
issue, and I have no urge to break a stable setup. I am not 100% sure whether 
the issue with weird changes to files that Jakob talked about was introduced 
in 2.6.9, but I'm not seeing it on my setup.

So, bottom line: 2.6.8.1 running on a single-CPU machine does the trick for 
me. (And who needs a lot of CPU powah on an NFS server? :))

Regarding ext3: this filesystem also seems to be b0rked on at least the 
newer 2.6.x kernels. We have some mailservers which until two days ago oopsed 
on ext3. These now run XFS and the errors seem to be gone. Don't get me 
wrong here: I have never seen ext3 oopses on a low-load server, only under 
heavy load (and SMP).

A snippet of one of the ext3 oopses (you will see several people here on LKML 
having the same or a similar problem):
<SNIP>
Unable to handle kernel NULL pointer dereference at virtual address 0000000c
printing eip:
c018b2f5
*pde = 00000000
Oops: 0002 [#1]
SMP
Modules linked in: nfs e1000 iptable_nat rtc
CPU:    2
EIP:    0060:[<c018b2f5>]    Not tainted VLI
EFLAGS: 00010286   (2.6.9)
EIP is at journal_commit_transaction+0x545/0x11b0
eax: d971826c   ebx: 00000000   ecx: e489eefc   edx: 00000014
esi: d971826c   edi: f7406000   ebp: ea0a6f80   esp: f7407d8c
ds: 007b   es: 007b   ss: 0068
Process kjournald (pid: 177, threadinfo=f7406000 task=f7df63b0)
Stack: 03afe6b2 c2157478 f7407e40 f7406000 c2157414 00000000 00000000 00000000
       00000000 00000000 e489ebfc cd61056c 000010e8 01c2bf60 c040e020 00000000
       f7406000 0000001e f7407e1c c0412f80 00000008 f7407e5c c01134e3 f7407e1c
Call Trace:
 [<c01134e3>] find_busiest_group+0xf3/0x300
 [<c0113799>] find_busiest_queue+0xa9/0xd0
 [<c0115620>] autoremove_wake_function+0x0/0x40
 [<c0115620>] autoremove_wake_function+0x0/0x40
 [<c018e0e1>] kjournald+0xc1/0x230
 [<c0115620>] autoremove_wake_function+0x0/0x40
 [<c0112ba3>] finish_task_switch+0x33/0x70
 [<c0115620>] autoremove_wake_function+0x0/0x40
 [<c0103ff6>] ret_from_fork+0x6/0x14
 [<c018e000>] commit_timeout+0x0/0x10
 [<c018e020>] kjournald+0x0/0x230
 [<c010253d>] kernel_thread_helper+0x5/0x18
Code: 00 89 f0 e8 5e e1 17 00 83 c4 14 8b 45 18 85 c0 0f 84 49 01 00 00 bf 00 
e0 ff ff 21 e7 89 f6 8d bc 27 00 00 00 00 8b 70 20 8b 1e <f0> ff 43 0c 8b 03 
83 e0 04 74 4e 8b 94 24
 e8 01 00 00 8d 82 c0
</SNIP>
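
The faulting instruction in the Code: line above can be cross-checked against
the register dump. The bytes marked as the fault point are f0 ff 43 0c: f0 is
the LOCK prefix, ff with a ModRM reg field of 0 is INC on a 32-bit operand,
and ModRM byte 43 with displacement 0c selects memory at ebx + 0x0c. The C
sketch below redoes that calculation; the type and function names are
hypothetical and it handles only the one addressing form needed here (mod = 1,
8-bit displacement, no SIB byte).

```c
#include <assert.h>
#include <stdint.h>

/* Register numbering used by the ModRM rm field on 32-bit x86. */
enum { EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI };

struct modrm { unsigned mod, reg, rm; };

/* Split a ModRM byte into its mod (2 bits), reg (3 bits) and rm (3 bits). */
static struct modrm decode_modrm(uint8_t byte)
{
    struct modrm m = { (unsigned)(byte >> 6), (byte >> 3) & 7u, byte & 7u };
    return m;
}

/* Effective address for the mod == 1 form: base register + signed disp8. */
static uint32_t disp8_address(uint8_t modrm_byte, uint8_t disp8,
                              const uint32_t regs[8])
{
    struct modrm m = decode_modrm(modrm_byte);

    assert(m.mod == 1 && m.rm != ESP);  /* rm == ESP would mean a SIB byte */
    return regs[m.rm] + (uint32_t)(int32_t)(int8_t)disp8;
}

/* General-purpose registers as printed in the oops, in ModRM order. */
static const uint32_t oops_regs[8] = {
    0xd971826cu, 0xe489eefcu, 0x00000014u, 0x00000000u,
    0xf7407d8cu, 0xea0a6f80u, 0xd971826cu, 0xf7406000u,
};
```

With ebx equal to 0 the computed address is 0000000c, which matches the
reported fault address. That is consistent with an atomic increment of a
field at offset 12 in a structure reached through a NULL pointer, although
the oops alone does not say which structure that is.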

Phil Dier wrote:

> 
> Thanks for the tips, Jakob.
> 
> I *will* be exporting via NFS, so this is definitely good to know. I've
> been looking at using jfs and reiser as well, but some preliminary
> benchmarks suggested that xfs was the best performer for the kind of
> workload that I'm anticipating. I guess xfs is out of the question now,
> as I definitely don't want to deal with weird interactions like that.
> 
> Can anyone speak on the stability of (reiser|jfs|other) with nfs? My
> biggest requirements are online resizing and stability (ext3 online
> resize is still beta IIRC, but I wouldn't be opposed to using it if
> someone could tell me otherwise); speed would be nice, but I'm willing
> to sacrifice speed for the sake of reliability.
> 
> I'm personally using lvm + reiser + nfs without consequence on my
> fileserver at home, but it's not seeing nearly the loads that this box
> is going to see.
> 

-- 
Med venlig hilsen - Best regards - Meilleures salutations

Anders Saaby
Systems Engineer
------------------------------------------------
Cohaesio A/S - Maglebjergvej 5D - DK-2800 Lyngby
Phone: +45 45 880 888 - Fax: +45 45 880 777
Mail: as@cohaesio.com - http://www.cohaesio.com
------------------------------------------------


* Re: oops with dual xeon 2.8ghz 4gb ram +smp, software raid, lvm, and xfs
@ 2004-11-23 20:48 Joerg Sommrey
  0 siblings, 0 replies; 28+ messages in thread
From: Joerg Sommrey @ 2004-11-23 20:48 UTC (permalink / raw)
  To: Linux kernel mailing list; +Cc: Phil Dier

>Hi,
>
>I'm setting up a storage array with Linux, software RAID, LVM, and XFS,
>but I keep getting oopses during heavy I/O. I've been able to reproduce
>this with 2.6.6, 2.6.8.1, 2.6.9, and 2.6.10-rc2-bk4. I have dual xeon
>2.8s with 4gb of ram. I'm using adaptec and a fusion mpt scsi devices
>(more details in the following link). Connected are 2 ultra160 scsi
>jbods w/ 2 disks apiece. I'm using raid 10 (or should it be 01?) mirrored 
>stripes.

This looks very interesting.  My setup is somewhat similar:
linux 2.6.9-ac8, SMP (2 x Athlon), Adaptec 2940UW + Promise SATA150 TX4,
4K stacks, software RAID, LVM and XFS.  The symptoms are different,
however.  When creating snapshots I sometimes get errors like this
(which I'm unable to reproduce):

Nov 21 03:00:48 bear kernel:  [__alloc_pages+457/912] __alloc_pages+0x1c9/0x390
Nov 21 03:00:48 bear kernel:  [check_poison_obj+47/480] check_poison_obj+0x2f/0x1e0
Nov 21 03:00:48 bear kernel:  [__get_free_pages+37/64] __get_free_pages+0x25/0x40
Nov 21 03:00:48 bear kernel:  [kmem_getpages+33/208] kmem_getpages+0x21/0xd0
Nov 21 03:00:48 bear kernel:  [dbg_redzone1+21/48] dbg_redzone1+0x15/0x30
Nov 21 03:00:48 bear kernel:  [cache_grow+176/352] cache_grow+0xb0/0x160
Nov 21 03:00:48 bear kernel:  [cache_alloc_refill+428/640] cache_alloc_refill+0x1ac/0x280
Nov 21 03:00:48 bear kernel:  [__kmalloc+188/240] __kmalloc+0xbc/0xf0
Nov 21 03:00:48 bear kernel:  [mempool_resize+156/416] mempool_resize+0x9c/0x1a0
Nov 21 03:00:48 bear kernel:  [resize_pool+100/224] resize_pool+0x64/0xe0
Nov 21 03:00:48 bear kernel:  [dm_create_persistent+40/320] dm_create_persistent+0x28/0x140
Nov 21 03:00:48 bear kernel:  [snapshot_ctr+804/912] snapshot_ctr+0x324/0x390
Nov 21 03:00:48 bear kernel:  [dm_table_add_target+262/432] dm_table_add_target+0x106/0x1b0
Nov 21 03:00:48 bear kernel:  [populate_table+130/224] populate_table+0x82/0xe0
Nov 21 03:00:48 bear kernel:  [table_load+104/320] table_load+0x68/0x140
Nov 21 03:00:48 bear kernel:  [ctl_ioctl+241/336] ctl_ioctl+0xf1/0x150
Nov 21 03:00:48 bear kernel:  [table_load+0/320] table_load+0x0/0x140
Nov 21 03:00:48 bear kernel:  [sys_ioctl+253/640] sys_ioctl+0xfd/0x280
Nov 21 03:00:48 bear kernel:  [syscall_call+7/11] syscall_call+0x7/0xb
Nov 21 03:00:48 bear kernel: device-mapper: : Couldn't create exception store
Nov 21 03:00:48 bear kernel:
Nov 21 03:00:48 bear kernel: device-mapper: error adding target to table
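
Each trace line above names its symbol twice: once in brackets with a
decimal offset/size, and once with the same values in hex.  A minimal
Python sketch for pulling the call chain out of such a log follows.  The
regular expression and the helper name parse_trace_line are illustrative
assumptions based only on the lines shown here, not part of any kernel
tool.

```python
import re

# Matches one trace entry as it appears in this log, e.g.
#   [__alloc_pages+457/912] __alloc_pages+0x1c9/0x390
# (pattern inferred from the lines in this message; an assumption)
TRACE_RE = re.compile(
    r"\[(?P<sym>\w+)\+(?P<off>\d+)/(?P<size>\d+)\]\s+"
    r"(?P<sym2>\w+)\+0x(?P<hoff>[0-9a-fA-F]+)/0x(?P<hsize>[0-9a-fA-F]+)"
)


def parse_trace_line(line):
    """Return (symbol, offset, size) for a trace line, or None.

    None is returned for non-trace lines and for lines whose decimal
    and hex halves disagree.
    """
    m = TRACE_RE.search(line)
    if m is None:
        return None
    off = int(m.group("off"))
    size = int(m.group("size"))
    same_symbol = m.group("sym") == m.group("sym2")
    same_offset = off == int(m.group("hoff"), 16)
    same_size = size == int(m.group("hsize"), 16)
    if not (same_symbol and same_offset and same_size):
        return None
    return m.group("sym"), off, size


def call_chain(lines):
    """Collect the symbol names from all trace lines, in log order."""
    parsed = (parse_trace_line(line) for line in lines)
    return [entry[0] for entry in parsed if entry is not None]
```

Running call_chain over both traces in this message makes it easier to
see which frames they share (snapshot_ctr, resize_pool, mempool_resize)
and where they differ.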

When I tried the sample script (modified according to the size of the
FS) nothing happened at first.  But creating snapshots in parallel
locked up the system after a while. (Hard lockup, no diagnostic data
available.)

I then switched to 8K stacks. Running the sample and creating snapshots
didn't lock up, but resulted in an error similar to the one shown above:

Nov 23 20:46:33 bear kernel: lvcreate: page allocation failure. order:0, mode:0xd0
Nov 23 20:46:33 bear kernel:  [__alloc_pages+457/912] __alloc_pages+0x1c9/0x390
Nov 23 20:46:33 bear kernel:  [__get_free_pages+37/64] __get_free_pages+0x25/0x40
Nov 23 20:46:33 bear kernel:  [kmem_getpages+33/208] kmem_getpages+0x21/0xd0
Nov 23 20:46:33 bear kernel:  [cache_grow+176/352] cache_grow+0xb0/0x160
Nov 23 20:46:33 bear kernel:  [check_slabp+24/240] check_slabp+0x18/0xf0
Nov 23 20:46:33 bear kernel:  [cache_alloc_refill+428/640] cache_alloc_refill+0x1ac/0x280
Nov 23 20:46:33 bear kernel:  [dbg_redzone1+21/48] dbg_redzone1+0x15/0x30
Nov 23 20:46:33 bear kernel:  [cache_alloc_debugcheck_after+65/368] cache_alloc_debugcheck_after+0x41/0x170
Nov 23 20:46:33 bear kernel:  [kmem_cache_alloc+149/192] kmem_cache_alloc+0x95/0xc0
Nov 23 20:46:33 bear kernel:  [alloc_io+34/48] alloc_io+0x22/0x30
Nov 23 20:46:33 bear kernel:  [alloc_io+34/48] alloc_io+0x22/0x30
Nov 23 20:46:33 bear kernel:  [mempool_resize+284/416] mempool_resize+0x11c/0x1a0
Nov 23 20:46:34 bear kernel:  [resize_pool+100/224] resize_pool+0x64/0xe0
Nov 23 20:46:34 bear kernel:  [kcopyd_client_create+134/208] kcopyd_client_create+0x86/0xd0
Nov 23 20:46:34 bear kernel:  [snapshot_ctr+700/912] snapshot_ctr+0x2bc/0x390
Nov 23 20:46:34 bear kernel:  [dm_table_add_target+262/432] dm_table_add_target+0x106/0x1b0
Nov 23 20:46:34 bear kernel:  [populate_table+130/224] populate_table+0x82/0xe0
Nov 23 20:46:34 bear kernel:  [table_load+104/320] table_load+0x68/0x140
Nov 23 20:46:34 bear kernel:  [ctl_ioctl+241/336] ctl_ioctl+0xf1/0x150
Nov 23 20:46:34 bear kernel:  [table_load+0/320] table_load+0x0/0x140
Nov 23 20:46:35 bear kernel:  [sys_ioctl+253/640] sys_ioctl+0xfd/0x280
Nov 23 20:46:36 bear kernel:  [syscall_call+7/11] syscall_call+0x7/0xb
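
As a reading aid for the "order:0, mode:0xd0" line above: order:0 is a
single-page request, and mode is the gfp allocation mask.  The sketch
below decodes that mask.  The flag values are an assumption taken from
the 2.6.9-era include/linux/gfp.h and differ in other kernel versions;
decode_gfp is a hypothetical helper name.

```python
# gfp flag bits as assumed for a 2.6.9-era kernel (include/linux/gfp.h).
# These values changed in later kernels, so check the matching source.
GFP_FLAGS = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
    0x100: "__GFP_COLD",
    0x200: "__GFP_NOWARN",
    0x400: "__GFP_REPEAT",
    0x800: "__GFP_NOFAIL",
    0x1000: "__GFP_NORETRY",
    0x2000: "__GFP_NO_GROW",
    0x4000: "__GFP_COMP",
}


def decode_gfp(mode):
    """Return the names of the flag bits set in mode, lowest bit first.

    Bits not present in GFP_FLAGS are reported as a single hex string
    at the end of the list.
    """
    names = [name for bit, name in sorted(GFP_FLAGS.items()) if mode & bit]
    known_mask = sum(GFP_FLAGS)  # keys are distinct bits, so sum == OR
    leftover = mode & ~known_mask
    if leftover:
        names.append(hex(leftover))
    return names


print(decode_gfp(0xD0))
```

Under these assumed values, 0xd0 decodes to __GFP_WAIT | __GFP_IO |
__GFP_FS, which that header defines as GFP_KERNEL: an ordinary sleeping
allocation of one page that failed.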

Maybe these issues are not related.  If they are, I'd be glad to help
with some additional testing.

-jo


^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2004-12-09  3:51 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-11-22 19:06 oops with dual xeon 2.8ghz 4gb ram +smp, software raid, lvm, and xfs Phil Dier
2004-11-23  0:17 ` Andrew Morton
2004-11-23 15:37   ` Phil Dier
2004-11-23 17:02     ` Jakob Oestergaard
2004-11-23 18:29       ` Phil Dier
2004-11-23 22:39       ` Christoph Hellwig
2004-11-23 22:56         ` Jakob Oestergaard
2004-11-23 23:12           ` Christoph Hellwig
2004-11-30 17:37         ` Phil Dier
2004-11-24 15:45   ` Phil Dier
2004-11-24 16:56     ` Christoph Hellwig
2004-11-24 23:12     ` Andrew Morton
2004-11-25  0:48       ` Phil Dier
2004-11-28 11:29       ` David Greaves
2004-11-28 18:27         ` Andrew Morton
2004-12-08  9:03           ` David Greaves
2004-12-08  9:15             ` Andrew Morton
2004-12-09  3:50               ` Nigel Cunningham
2004-11-24 23:12   ` Neil Brown
2004-11-24 23:50     ` Andrew Morton
2004-11-25  0:14       ` Neil Brown
2004-11-25  1:05         ` Andrew Morton
2004-11-25  6:57         ` Jens Axboe
2004-11-25  7:08           ` Andrew Morton
2004-11-25  7:11             ` Jens Axboe
2004-11-23 20:48 Joerg Sommrey
2004-11-24  9:28 Anders Saaby
     [not found] <33rTj-1VZ-13@gated-at.bofh.it>
     [not found] ` <33wJq-633-25@gated-at.bofh.it>
     [not found]   ` <34fwL-P1-21@gated-at.bofh.it>
     [not found]     ` <34fGp-V2-9@gated-at.bofh.it>
     [not found]       ` <34fGp-V2-7@gated-at.bofh.it>
     [not found]         ` <34lVr-5WH-1@gated-at.bofh.it>
     [not found]           ` <34m5a-61Z-9@gated-at.bofh.it>
2004-11-25 11:07             ` Andi Kleen
