From: Nathan Scott <nathans@sgi.com>
To: Mihai RUSU <dizzy@roedu.net>, Linus Torvalds <torvalds@osdl.org>
Cc: Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Jens Axboe <axboe@suse.de>, Neil Brown <neilb@cse.unsw.edu.au>
Subject: Re: kernel BUG at mm/filemap.c:332!
Date: Fri, 5 Dec 2003 08:16:11 +1100
Message-ID: <20031204211611.GA567@frodo>
In-Reply-To: <Pine.LNX.4.56L0.0312041849250.10045@ahriman.bucharest.roedu.net>

On Thu, Dec 04, 2003 at 07:26:38PM +0200, Mihai RUSU wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Hi Linus
> 
> First of all thanks for the answer!
> 
> On Thu, 4 Dec 2003, Linus Torvalds wrote:
> 
> > 
> > Nathan,
> >  you're not off the hook yet. This is a smoking gun on XFS, and this time
> > with a big clue: large directories, and a low-memory situation.
> 
> Sorry to have misled you guys in the first post. After rebooting the
> machine I have some more information: the actual directory size is only
> a few hundred entries (~400), not thousands as I previously speculated.

Still, it's a clue - all metadata I/O in XFS goes through the
pagebuf code, so we're looking in the right place.

> However I have some more useful (I hope) information about the subject.
> Before rebooting I wanted to first install a do_brk()-patched 2.4.21-xfs
> kernel with lilo. Unfortunately lilo got stuck in an fsync() call after
> printing that it had added all kernel images to the MBR as configured in
> lilo.conf. I had no problem booting the new do_brk()-fixed kernel
> afterwards, so lilo seems to have done its job; I don't know why it got
> stuck in fsync().

Was your filesystem near full?  There was a 2.4 deadlock fixed
recently which could be what you hit there.

> After power on, one colleague complained that a file he had worked on a
> couple of minutes before I took the machine down had NULL bytes instead of
> actual content. I know that "dirty" data gets flushed to disk every 30
> seconds so this seems a little bit strange (in general I know that XFS
> leaves NULL bytes in files modified just before an unclean reboot, but this
> file was modified some 5 minutes before the "hard" reboot).

You'll want a more recent 2.4 XFS kernel, I suspect - Steve made
several improvements in this area a while back.

> > Also, this time the config file doesn't have any MD/RAID support according
> > to the attachment:
> > 
> > 	# Multi-device support (RAID and LVM)
> > 	#
> > 	# CONFIG_MD is not set
> > 
> > so it looks like the XFS and MD issues really are totally unrelated.

Sure does.

> > Mihai: the oops itself is in this case not very telling, since it's just a
> > result of corruption of some fundamental data structures (probably
> > somebody using a page cache page after having free'd it - and it probably
> > only shows up when memory gets low and pages have to be cleaned). Can you
> > tell Nathan more about the filesystem setup (block size, as much as
> > possible about the affected directory, etc).
> 
> Ok.
> 
> $ xfs_info /var
> meta-data=/var                   isize=256    agcount=18, agsize=262144 blks
> data     =                       bsize=4096   blocks=4482127, imaxpct=25
>          =                       sunit=0      swidth=0 blks, unwritten=0
> naming   =version 2              bsize=4096  
> log      =internal               bsize=4096   blocks=1200
> realtime =none                   extsz=65536  blocks=0, rtextents=0

OK, looks like a default mkfs then (with an old-ish mkfs binary)?
Newer mkfs versions will give you a better AG layout, and unwritten
extents would be turned on - not relevant to this problem at all though.
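
For reference, the geometry in the xfs_info output quoted above can be
worked out with a little arithmetic. This is only a sketch: the numbers
are copied from that output, while the constant and function names are
made up for illustration and are not part of any XFS tool.

```python
# Values copied from the xfs_info output quoted above.
BLOCK_SIZE = 4096          # data bsize, in bytes
AG_COUNT = 18              # agcount
AG_SIZE_BLOCKS = 262144    # agsize, in filesystem blocks
TOTAL_BLOCKS = 4482127     # data blocks
LOG_BLOCKS = 1200          # internal log blocks


def geometry():
    """Derive byte sizes and the last-AG size from the raw block counts."""
    full_ags = TOTAL_BLOCKS // AG_SIZE_BLOCKS
    return {
        "ag_bytes": AG_SIZE_BLOCKS * BLOCK_SIZE,
        "total_bytes": TOTAL_BLOCKS * BLOCK_SIZE,
        "log_bytes": LOG_BLOCKS * BLOCK_SIZE,
        "full_ags": full_ags,
        # Blocks left over for the final, partial allocation group.
        "last_ag_blocks": TOTAL_BLOCKS - full_ags * AG_SIZE_BLOCKS,
    }


if __name__ == "__main__":
    for name, value in geometry().items():
        print(f"{name:>15} = {value}")
```

With these inputs each full AG is exactly 1 GiB, there are 17 full AGs,
and the 18th holds the remaining 25679 blocks (about 100 MiB).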

An "ls -ld" and "xfs_bmap -v" on the directory would also provide
me a bit more info to work with -- thanks!

I have a few ideas about what this might be, let me stew on those
for a bit and try a few things.

cheers.

-- 
Nathan
