linux-kernel.vger.kernel.org archive mirror
* fsck out of memory
@ 2003-02-07 15:17 kernel
  2003-02-07 17:07 ` Stephan van Hienen
  2003-02-07 17:24 ` fsck out of memory Andreas Dilger
  0 siblings, 2 replies; 21+ messages in thread
From: kernel @ 2003-02-07 15:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-raid

i'm trying to run e2fsck after a system hang
after 1 hour of running (70% done), with memory usage of about 128M,
i get these errors in dmesg :

Out of Memory: Killed process 732 (fsck.ext2).
Out of Memory: Killed process 732 (fsck.ext2).
Out of Memory: Killed process 732 (fsck.ext2).
Out of Memory: Killed process 732 (fsck.ext2).

and this goes on for some pages

top gives me this info for fsck.ext2 :

  732 root       9   0  592M 465M  2068 S    64.7 92.6   6:31 fsck.ext2

Mem:   514360K av,  512176K used,    2184K free,       0K shrd,     564K buff
Swap:  136544K av,  136544K used,       0K free                    3120K cache

system has 512 megabytes of memory and 128 MB of swap (it is only a fileserver
and never needed more swap)

I really wonder if there is something wrong with e2fsck ?
does it really need that much memory ?
(fsck on 2.2TB /dev/md0)

it was putting a lot of info on the screen (for some minutes) :
Duplicate/bad block in inode ... / ... ...  ... ... ...
(and scrolling in real fast speed)

e2fsprogs version 1.27 with kernel 2.4.20 (+lbd patch)
i tried upgrading e2fsprogs to 1.32 (latest version), but this doesn't
help

any hints ? (maybe a way to disable the enormous output from
'Duplicate/bad block in inode ..' ?)

(also, why does it say 'Killed' when the process stays running? (otherwise
it couldn't be killed multiple times...))

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: fsck out of memory
  2003-02-07 15:17 fsck out of memory kernel
@ 2003-02-07 17:07 ` Stephan van Hienen
  2003-02-07 17:28   ` Andreas Dilger
  2003-02-07 17:24 ` fsck out of memory Andreas Dilger
  1 sibling, 1 reply; 21+ messages in thread
From: Stephan van Hienen @ 2003-02-07 17:07 UTC (permalink / raw)
  To: kernel; +Cc: linux-kernel, linux-raid

ok added some swap space (4 gigabyte)

usage was about 2.5GB

till aborted :

d0:  64450554/dev/md0:  64450555/dev/md0:  64450556/dev/md0:
64450557/dev/md0:  64450558/dev/md0:  64450559/dev/md0:  64450560/dev/md0:
64450561/dev/md0:  64450562/dev/md0:  64450563/dev/md0:  64450564/dev/md0:
64450565/dev/md0:  64450566/dev/md0:  64450567/dev/md0:  64450568/dev/md0:
64450569/dev/md0:  64450570/dev/md0:  64450571/dev/md0:  64450572/dev/md0:
64450573/dev/md0:  64450574/dev/md0:  64450575/dev/md0:  64450576/dev/md0:
64450577/dev/md0:  64450578/dev/md0:  64450579/dev/md0:  64450580/dev/md0:
64450581/dev/md0:  64450582/dev/md0:  64450583/dev/md0:  64450584/dev/md0:
64450585/dev/md0:  64450586/dev/md0:  64450587/dev/md0:  64450588/dev/md0:
64450589/dev/md0:  64450590e2fsck: Can't allocate block element

e2fsck: aborted
/dev/md0: 153834/76922880 files (9.3% non-contiguous), 181680730/615381536
blocks

any hints ?
(i really would like to get back a clean fs (with ext3 journal))


* Re: fsck out of memory
  2003-02-07 15:17 fsck out of memory kernel
  2003-02-07 17:07 ` Stephan van Hienen
@ 2003-02-07 17:24 ` Andreas Dilger
  1 sibling, 0 replies; 21+ messages in thread
From: Andreas Dilger @ 2003-02-07 17:24 UTC (permalink / raw)
  To: kernel; +Cc: linux-kernel, linux-raid

On Feb 07, 2003  16:17 +0100, kernel@ddx.a2000.nu wrote:
> i'm trying to run e2fsck after a system hang
> after 1 hour of running (70% done), with memory usage of about 128M,
> i get these errors in dmesg :
> 
> Out of Memory: Killed process 732 (fsck.ext2).
> Out of Memory: Killed process 732 (fsck.ext2).
> Out of Memory: Killed process 732 (fsck.ext2).
> Out of Memory: Killed process 732 (fsck.ext2).
> 
> I really wonder if there is something wrong with e2fsck ?
> does it really need that much memory ?
> (fsck on 2.2TB /dev/md0)

I don't think many people have run e2fsck on such a large filesystem
before when there are lots of problems.  It is entirely possible that
you need so much memory for such a large filesystem.  I would suggest
creating a larger swap file temporarily (on some other partition) so
that e2fsck can complete.

It _may_ be that e2fsck could reduce memory consumption somewhere (or
enable a "use less memory but run slowly" heuristic), but that isn't
very likely, and if it were possible it would be very slow.

Regarding the "use fsck.ext3" response - ignore it, it is incorrect.
There is no difference at all between fsck.ext2, fsck.ext3, e2fsck.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/



* Re: fsck out of memory
  2003-02-07 17:07 ` Stephan van Hienen
@ 2003-02-07 17:28   ` Andreas Dilger
  2003-02-09 10:08     ` Stephan van Hienen
  0 siblings, 1 reply; 21+ messages in thread
From: Andreas Dilger @ 2003-02-07 17:28 UTC (permalink / raw)
  To: Stephan van Hienen
  Cc: linux-kernel, linux-raid, ext2-devel, Theodore Ts'o

On Feb 07, 2003  18:07 +0100, Stephan van Hienen wrote:
> ok added some swap space (4 gigabyte)
> 
> usage was about 2.5GB
> 
> till aborted :
> 
> d0:  64450554/dev/md0:  64450555/dev/md0:  64450556/dev/md0:
> 64450557/dev/md0:  64450558/dev/md0:  64450559/dev/md0:  64450560/dev/md0:
> 64450561/dev/md0:  64450562/dev/md0:  64450563/dev/md0:  64450564/dev/md0:
> 64450565/dev/md0:  64450566/dev/md0:  64450567/dev/md0:  64450568/dev/md0:
> 64450569/dev/md0:  64450570/dev/md0:  64450571/dev/md0:  64450572/dev/md0:
> 64450573/dev/md0:  64450574/dev/md0:  64450575/dev/md0:  64450576/dev/md0:
> 64450577/dev/md0:  64450578/dev/md0:  64450579/dev/md0:  64450580/dev/md0:
> 64450581/dev/md0:  64450582/dev/md0:  64450583/dev/md0:  64450584/dev/md0:
> 64450585/dev/md0:  64450586/dev/md0:  64450587/dev/md0:  64450588/dev/md0:
> 64450589/dev/md0:  64450590e2fsck: Can't allocate block element
> 
> e2fsck: aborted
> /dev/md0: 153834/76922880 files (9.3% non-contiguous), 181680730/615381536
> blocks
> 
> any hints ?
> (i really would like to get back a clean fs (with ext3 journal))

Hmm, I don't think that will be easy...  By default e2fsck will load all
of the inode blocks into memory (pretty sure at least), and if you have
76922880 inodes that is 9.6GB of memory, which you can't allocate from a
single process on i386 no matter how much swap you have.  2.5GB sounds
about right for the maximum amount of memory one can allocate.
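
As a rough check of the arithmetic above, the sketch below multiplies the inode count by an on-disk inode size. The 128-byte inode size and the 3 GiB user address space on i386 are assumptions for illustration (the thread states neither), and whether e2fsck really holds all inode blocks in memory is itself only a guess in this message. Under those assumptions the total lands in the same ballpark as the figure quoted above:

```python
# Rough estimate only. INODE_SIZE is an assumption (128-byte on-disk inodes);
# USER_ADDRESS_SPACE assumes the common 3 GiB user / 1 GiB kernel split on i386.
INODE_SIZE = 128
USER_ADDRESS_SPACE = 3 * 2**30


def inode_table_bytes(inode_count: int, inode_size: int = INODE_SIZE) -> int:
    """Bytes needed to hold every on-disk inode at once."""
    return inode_count * inode_size


needed = inode_table_bytes(76922880)  # inode count from the e2fsck summary
print(f"inode tables: {needed / 2**30:.1f} GiB")
print(f"fits in one i386 process: {needed <= USER_ADDRESS_SPACE}")
```

Either way the total is several times what a single 32-bit process can address, so adding swap alone would not help.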

Ted, any suggestions?

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/



* Re: fsck out of memory
  2003-02-07 17:28   ` Andreas Dilger
@ 2003-02-09 10:08     ` Stephan van Hienen
  2003-02-09 10:32       ` Stephan van Hienen
                         ` (3 more replies)
  0 siblings, 4 replies; 21+ messages in thread
From: Stephan van Hienen @ 2003-02-09 10:08 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: linux-kernel, linux-raid, ext2-devel, Theodore Ts'o, peter, tbm

On Fri, 7 Feb 2003, Andreas Dilger wrote:

> Hmm, I don't think that will be easy...  By default e2fsck will load all
> of the inode blocks into memory (pretty sure at least), and if you have
> 76922880 inodes that is 9.6GB of memory, which you can't allocate from a
> single process on i386 no matter how much swap you have.  2.5GB sounds
> about right for the maximum amount of memory one can allocate.

hmms the data is not critical yet (i was just testing this server)
i really wonder why the crash was there in the first place

thing i found in /var/log/messages :

Feb  7 04:18:15 storage kernel: EXT3-fs error (device md(9,0)): ext3_new_block: Allocating block in system zone - block = 536875638
Feb  7 04:18:15 storage kernel: EXT3-fs error (device md(9,0)): ext3_new_block: Allocating block in system zone - block = 536875639

doesn't look ok to me (and explains the crash?)

makes me wonder if this can have to do with the lbd (to allow 2TB+ devices)
patch ? or is this something else?
(if it can be related to the lbd patch, i will remove 2 hd's from the
array (but i would prefer not to))

also, to avoid this happening again (not being able to fsck my filesystem),
what settings can i use for creating a large filesystem on /dev/md0 ?
(what is the maximum workable number of inodes?)

i did this :

mke2fs -j -m 0  -b 4096 -i 4096 -R stride=16



* Re: fsck out of memory
  2003-02-09 10:08     ` Stephan van Hienen
@ 2003-02-09 10:32       ` Stephan van Hienen
  2003-02-09 20:08       ` Peter Chubb
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 21+ messages in thread
From: Stephan van Hienen @ 2003-02-09 10:32 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: linux-kernel, linux-raid, ext2-devel, Theodore Ts'o, peter, tbm

and yesterday i had another crash (i was using /dev/md0 mounted without
fsck; it ran ok for about 24h, only 1 dir was not accessible (it was
created at the time of the previous crash))

this time there is a bit more info in /var/log/messages :

Feb  8 20:11:27 storage kernel: EXT2-fs error (device md(9,0)):
ext2_new_block: Allocating block in system zone - block = 536871063
Feb  8 20:11:27 storage kernel: EXT2-fs error (device md(9,0)):
ext2_new_block: Allocating block in system zone - block = 536871065
Feb  8 20:11:27 storage kernel: EXT2-fs error (device md(9,0)):
ext2_new_block: Allocating block in system zone - block = 536871071
Feb  8 20:11:27 storage kernel: EXT2-fs error (device md(9,0)):
ext2_new_block: Allocating block in system zone - block = 536871079
Feb  8 20:11:27 storage kernel: EXT2-fs error (device md(9,0)):
ext2_new_block: Allocating block in system zone - block = 536871081
Feb  8 20:11:27 storage kernel: EXT2-fs error (device md(9,0)):
ext2_new_block: Allocating block in system zone - block = 536871083
Feb  8 20:11:27 storage kernel: EXT2-fs error (device md(9,0)):
ext2_new_block: Allocating block in system zone - block = 536871085
Feb  8 20:11:27 storage kernel: EXT2-fs error (device md(9,0)):
ext2_new_block: Allocating block in system zone - block = 536871087
Feb  8 20:11:27 storage kernel: EXT2-fs error (device md(9,0)):
ext2_new_block: Allocating block in system zone - block = 536871095
Feb  8 20:11:27 storage kernel: EXT2-fs error (device md(9,0)):
ext2_new_block: Allocating block in system zone - block = 536871103
Feb  8 20:11:27 storage kernel: EXT2-fs error (device md(9,0)):
ext2_new_block: Allocating block in system zone - block = 536871108
Feb  8 20:11:27 storage kernel: EXT2-fs error (device md(9,0)):
ext2_new_block: Allocating block in system zone - block = 536871114
Feb  8 20:11:27 storage kernel: EXT2-fs error (device md(9,0)):
ext2_new_block: Allocating block in system zone - block = 536871119
Feb  8 20:11:27 storage kernel: EXT2-fs error (device md(9,0)):
ext2_new_block: Allocating block in system zone - block = 536871121
Feb  8 20:11:27 storage kernel: EXT2-fs error (device md(9,0)):
ext2_new_block: Allocating block in system zone - block = 536871123
Feb  8 20:11:27 storage kernel: EXT2-fs error (device md(9,0)):
ext2_new_block: Allocating block in system zone - block = 536871127
Feb  8 20:11:27 storage kernel: EXT2-fs error (device md(9,0)):
ext2_new_block: Allocating block in system zone - block = 536871129
Feb  8 20:11:27 storage kernel: EXT2-fs error (device md(9,0)):
ext2_new_block: Allocating block in system zone - block = 536871135
Feb  8 20:11:27 storage kernel: EXT2-fs error (device md(9,0)):
ext2_new_block: Allocating block in system zone - block = 536871143
..
..(a few minutes of the same messages (only a different block))
..
Feb  8 20:19:12 storage kernel: EXT2-fs error (device md(9,0)):
ext2_new_block: Allocating block in system zone - block = 540606715
Feb  8 20:19:12 storage kernel: EXT2-fs error (device md(9,0)):
ext2_new_block: Allocating block in system zone - block = 540606717
Feb  8 20:19:36 storage kernel: raid5: multiple 1 requests for sector
2064432
Feb  8 20:22:17 storage kernel: raid5: multiple 0 requests for sector
14094488
Feb  8 20:29:12 storage kernel: raid5: multiple 0 requests for sector
3670152
Feb  8 20:29:12 storage kernel: raid5: multiple 0 requests for sector
306783480
Feb  8 20:29:13 storage last message repeated 4 times
Feb  8 20:29:13 storage kernel: raid5: multiple 0 requests for sector
3670152
Feb  8 20:29:13 storage last message repeated 2 times
Feb  8 20:29:13 storage kernel: raid5: multiple 0 requests for sector
306783480
Feb  8 20:29:13 storage last message repeated 6 times
Feb  8 20:29:13 storage kernel: raid5: multiple 0 requests for sector
3670152
Feb  8 20:29:13 storage kernel: raid5: multiple 0 requests for sector
306783480
Feb  8 20:29:13 storage kernel: raid5: multiple 0 requests for sector 6496
Feb  8 20:29:13 storage last message repeated 5 times
Feb  8 20:29:13 storage kernel: raid5: multiple 0 requests for sector
25792
Feb  8 20:29:13 storage last message repeated 5 times
Feb  8 20:29:14 storage kernel: raid5: multiple 0 requests for sector
306783480
Feb  8 20:29:14 storage kernel: raid5: multiple 0 requests for sector
3670152
Feb  8 20:29:14 storage last message repeated 2 times
Feb  8 20:29:14 storage kernel: raid5: multiple 0 requests for sector
306783480
Feb  8 20:29:14 storage last message repeated 4 times
Feb  8 20:29:15 storage kernel: raid5: multiple 0 requests for sector
3670152
Feb  8 20:29:15 storage kernel: raid5: multiple 0 requests for sector
306783480
Feb  8 20:29:15 storage last message repeated 2 times
Feb  8 20:29:16 storage kernel: raid5: multiple 0 requests for sector
3670152
Feb  8 20:29:16 storage kernel: raid5: multiple 0 requests for sector
306783480
Feb  8 20:29:16 storage last message repeated 2 times
Feb  8 20:29:16 storage kernel: raid5: multiple 0 requests for sector 6496
Feb  8 20:29:16 storage last message repeated 2 times
Feb  8 20:29:16 storage kernel: raid5: multiple 0 requests for sector
25792
Feb  8 20:29:16 storage last message repeated 2 times
Feb  8 20:29:16 storage kernel: raid5: multiple 0 requests for sector
306783480
Feb  8 20:29:16 storage kernel: raid5: multiple 0 requests for sector
306783480
Feb  8 20:29:16 storage kernel: raid5: multiple 0 requests for sector
3670152
Feb  8 20:29:16 storage kernel: raid5: multiple 0 requests for sector
306783480
Feb  8 20:29:16 storage last message repeated 6 times
Feb  8 20:29:17 storage kernel: raid5: multiple 0 requests for sector
3670152
Feb  8 20:29:17 storage kernel: raid5: multiple 0 requests for sector
306783480
Feb  8 20:29:18 storage last message repeated 4 times
Feb  8 20:29:18 storage kernel: raid5: multiple 0 requests for sector
3670152
Feb  8 20:29:18 storage kernel: raid5: multiple 0 requests for sector
3670152
Feb  8 20:29:18 storage kernel: raid5: multiple 0 requests for sector
306783480
Feb  8 20:29:18 storage last message repeated 2 times
Feb  8 20:29:19 storage kernel: raid5: multiple 0 requests for sector
306783232
Feb  8 20:29:19 storage kernel: raid5: multiple 0 requests for sector
306783480
Feb  8 20:29:19 storage last message repeated 2 times
Feb  8 20:29:19 storage kernel: raid5: multiple 0 requests for sector
9587064
Feb  8 20:29:19 storage kernel: raid5: multiple 0 requests for sector
3670152
Feb  8 20:29:19 storage kernel: raid5: multiple 0 requests for sector
306783480
Feb  8 20:29:19 storage last message repeated 8 times
Feb  8 20:29:19 storage kernel: raid5: multiple 0 requests for sector
3670152
Feb  8 20:29:19 storage kernel: raid5: multiple 0 requests for sector
306783480
Feb  8 20:29:19 storage last message repeated 2 times
Feb  8 20:29:21 storage kernel: raid5: multiple 0 requests for sector
3670152
Feb  8 20:29:21 storage kernel: raid5: multiple 0 requests for sector
306783480
Feb  8 20:29:21 storage last message repeated 4 times
Feb  8 20:29:21 storage kernel: raid5: multiple 0 requests for sector
3670152
Feb  8 20:29:31 storage kernel: raid5: multiple 0 requests for sector
3670152
Feb  8 20:29:31 storage kernel: raid5: multiple 0 requests for sector
306783480
Feb  8 20:29:32 storage kernel: raid5: multiple 0 requests for sector
3670152
Feb  8 20:29:32 storage kernel: raid5: multiple 0 requests for sector
306783480
Feb  8 20:29:32 storage kernel: raid5: multiple 0 requests for sector 6496
Feb  8 20:29:32 storage last message repeated 5 times
Feb  8 20:29:32 storage kernel: raid5: multiple 0 requests for sector
25792

(at that time i did a powerdown (reboot was not possible))


* Re: fsck out of memory
  2003-02-09 10:08     ` Stephan van Hienen
  2003-02-09 10:32       ` Stephan van Hienen
@ 2003-02-09 20:08       ` Peter Chubb
  2003-02-10 11:28         ` Stephan van Hienen
  2003-02-09 20:31       ` Andreas Dilger
  2003-02-10 22:44       ` [Ext2-devel] Re: fsck out of memory Stephen C. Tweedie
  3 siblings, 1 reply; 21+ messages in thread
From: Peter Chubb @ 2003-02-09 20:08 UTC (permalink / raw)
  To: Stephan van Hienen
  Cc: Andreas Dilger, linux-kernel, linux-raid, ext2-devel,
	Theodore Ts'o, peter, tbm

>>>>> "Stephan" == Stephan van Hienen <raid@a2000.nu> writes:

Stephan> makes me wonder if this can have to do with the lbd (to allow
Stephan> 2TB+ devices) patch ? or is this something else?  (if it can
Stephan> be related to the lbd patch, i will remove 2 hd's from the
Stephan> array (but i don't prefer this option))

I haven't tested ext[23] with that large a system on IA32 (I stopped
at 2.4TB, and that was on Linux 2.5).  The 2.4 LBD patch was basically
backported from the 2.5.9 version (the last tested version before Al
Viro's rewrite of the block device and partitioning code).  Differences in
ext[23] between 2.4.20 and 2.5.9 may not have been allowed for
properly.

I'll have a look when I'm in at work today.

Is there any reason why you're sticking with the 2.4 kernel and ext3?
XFS has been used (on SGI systems) for much longer with large disk
arrays, and I'd expect (linux-specific bugs aside) it to be a more
mature product for this application.

Peter C



* Re: fsck out of memory
  2003-02-09 10:08     ` Stephan van Hienen
  2003-02-09 10:32       ` Stephan van Hienen
  2003-02-09 20:08       ` Peter Chubb
@ 2003-02-09 20:31       ` Andreas Dilger
  2003-02-10 12:01         ` Stephan van Hienen
  2003-02-10 22:44       ` [Ext2-devel] Re: fsck out of memory Stephen C. Tweedie
  3 siblings, 1 reply; 21+ messages in thread
From: Andreas Dilger @ 2003-02-09 20:31 UTC (permalink / raw)
  To: Stephan van Hienen
  Cc: linux-kernel, linux-raid, ext2-devel, Theodore Ts'o, peter, tbm

On Feb 09, 2003  11:08 +0100, Stephan van Hienen wrote:
> makes me wonder if this can have to do with the lbd (to allow 2TB+ devices)
> patch ? or is this something else?
> (if it can be related to the lbd patch, i will remove 2 hd's from the
> array (but i don't prefer this option))

Now that you mention this, I believe that there were some fixes to the ext2/3
code to not overflow some calculations, but I don't recall the specifics.  It
sure seems unusual to have such easy-to-reproduce errors.

> mke2fs -j -m 0  -b 4096 -i 4096 -R stride=16

Do you expect to have so many small files in this huge filesystem?
Basically, the "-i" parameter is telling mke2fs what you think the
average file size will be, so 4kB is very small.
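
To make the effect of "-i" concrete, here is a rough estimate of how many inodes a given bytes-per-inode ratio asks for on a filesystem of this size. This is a sketch under assumptions: it uses the block count from the e2fsck summary earlier in the thread and ignores the per-group rounding and limits that mke2fs applies, so the real inode count will differ from these numbers.

```python
# Naive estimate: one inode per `bytes_per_inode` bytes of filesystem space.
# BLOCK_COUNT is taken from the e2fsck summary in this thread; the helper
# name is made up for illustration and is not part of e2fsprogs.
BLOCK_SIZE = 4096
BLOCK_COUNT = 615381536


def estimated_inodes(bytes_per_inode: int) -> int:
    """Approximate inode count requested by `mke2fs -i bytes_per_inode`."""
    return (BLOCK_COUNT * BLOCK_SIZE) // bytes_per_inode


for ratio in (4096, 65536, 1048576):
    print(f"-i {ratio:>7}: about {estimated_inodes(ratio)} inodes")
```

A larger ratio means fewer inodes, smaller inode tables, and less work for e2fsck, at the cost of a lower limit on the number of files.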

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/



* Re: fsck out of memory
  2003-02-09 20:08       ` Peter Chubb
@ 2003-02-10 11:28         ` Stephan van Hienen
  0 siblings, 0 replies; 21+ messages in thread
From: Stephan van Hienen @ 2003-02-10 11:28 UTC (permalink / raw)
  To: Peter Chubb
  Cc: Andreas Dilger, linux-kernel, linux-raid, ext2-devel,
	Theodore Ts'o, tbm

On Mon, 10 Feb 2003, Peter Chubb wrote:

> Is there any reason why you're sticking with the 2.4 kernel and ext3?
> XFS has been used (on SGI systems) for much longer with large disk
> arrays, and I'd expect (linux-specific bugs aside) it to be a more
> mature product for this application.

i used ext2/3 on all my servers
never checked out xfs or reiserfs, so i don't really want to try them out on
an important server, but if it's better to switch to something else..... ?


* Re: fsck out of memory
  2003-02-09 20:31       ` Andreas Dilger
@ 2003-02-10 12:01         ` Stephan van Hienen
       [not found]           ` <3E479CC7.B170D5C7@aitel.hist.no>
  0 siblings, 1 reply; 21+ messages in thread
From: Stephan van Hienen @ 2003-02-10 12:01 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: linux-kernel, linux-raid, ext2-devel, Theodore Ts'o, peter, tbm

On Sun, 9 Feb 2003, Andreas Dilger wrote:

> > mke2fs -j -m 0  -b 4096 -i 4096 -R stride=16
>
> Do you expect to have so many small files in this huge filesystem?
> Basically, the "-i" parameter is telling mke2fs what you think the
> average file size will be, so 4kB is very small.
not really, i thought the -b was telling this ?
i think average filesize should be somewhere from 1-5 megabyte
(zipfiles few megabyte/videofiles (can be a few gigabyte)/installation
files for programmes)

i also wonder what kind of chunk-size i need to use
i use 64k now, but i wonder if 256k (or something bigger?) would be better
(does chunk size make a difference in performance between a 4-disk raid5 and a 15-disk raid5 ?)
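
For reference, the stride=16 passed to mke2fs above is consistent with a 64k chunk and 4k blocks, taking stride to be the RAID chunk size expressed in filesystem blocks. A minimal sketch of that relationship follows; the helper name is hypothetical, and it says nothing about which chunk size performs better.

```python
def stride_for(chunk_size_bytes: int, block_size_bytes: int = 4096) -> int:
    """Number of filesystem blocks per RAID chunk (the mke2fs stride value)."""
    if chunk_size_bytes % block_size_bytes:
        raise ValueError("chunk size should be a multiple of the block size")
    return chunk_size_bytes // block_size_bytes


print(stride_for(64 * 1024))   # current 64k chunk
print(stride_for(256 * 1024))  # the 256k chunk being considered
```

If the chunk size were changed, the stride given to mke2fs would need to change with it.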


* Re: [Ext2-devel] Re: fsck out of memory
  2003-02-09 10:08     ` Stephan van Hienen
                         ` (2 preceding siblings ...)
  2003-02-09 20:31       ` Andreas Dilger
@ 2003-02-10 22:44       ` Stephen C. Tweedie
  2003-02-11 13:11         ` Stephan van Hienen
  3 siblings, 1 reply; 21+ messages in thread
From: Stephen C. Tweedie @ 2003-02-10 22:44 UTC (permalink / raw)
  To: Stephan van Hienen
  Cc: Andreas Dilger, linux-kernel, linux-raid, ext2-devel,
	Theodore Ts'o, peter, tbm

Hi,

On Sun, 2003-02-09 at 10:08, Stephan van Hienen wrote:

> Feb  7 04:18:15 storage kernel: EXT3-fs error (device md(9,0)):
> ext3_new_block:
> Allocating block in system zone - block = 536875638

That looks like it could be a block wrap, amongst other possible causes.

> makes me wonder if this can have to do with the lbd (to allow 2TB+ devices)
> patch ? or is this something else?

Well, that's the most likely candidate, because it's the least tested
component.  Are you using Ben LaHaise's LBD fixes for the md devices? 
Without those, md and lvm are not LBD-safe.

Cheers,
 Stephen



* Re: [Ext2-devel] Re: fsck out of memory
  2003-02-10 22:44       ` [Ext2-devel] Re: fsck out of memory Stephen C. Tweedie
@ 2003-02-11 13:11         ` Stephan van Hienen
  2003-02-11 14:33           ` Stephen C. Tweedie
  0 siblings, 1 reply; 21+ messages in thread
From: Stephan van Hienen @ 2003-02-11 13:11 UTC (permalink / raw)
  To: Stephen C. Tweedie
  Cc: Andreas Dilger, linux-kernel, linux-raid, ext2-devel,
	Theodore Ts'o, peter, tbm

On Mon, 10 Feb 2003, Stephen C. Tweedie wrote:

> On Sun, 2003-02-09 at 10:08, Stephan van Hienen wrote:
>
> > Feb  7 04:18:15 storage kernel: EXT3-fs error (device md(9,0)):
> > ext3_new_block:
> > Allocating block in system zone - block = 536875638
>
> That looks like it could be a block wrap, amongst other possible causes.
hmms and this means ?


>
> > makes me wonder if this can have to do with the lbd (to allow 2TB+ devices)
> > patch ? or is this something else?
>
> Well, that's the most likely candidate, because it's the least tested
> component.  Are you using Ben LaHaise's LBD fixes for the md devices?
> Without those, md and lvm are not LBD-safe.
where can i find these lbd fixes for md ?


* Re: [Ext2-devel] Re: fsck out of memory
  2003-02-11 13:11         ` Stephan van Hienen
@ 2003-02-11 14:33           ` Stephen C. Tweedie
       [not found]             ` <1044993857.6640.7.camel@plokta.s8.com>
  0 siblings, 1 reply; 21+ messages in thread
From: Stephen C. Tweedie @ 2003-02-11 14:33 UTC (permalink / raw)
  To: Stephan van Hienen
  Cc: Andreas Dilger, linux-kernel, linux-raid, ext2-devel,
	Theodore Ts'o, peter, tbm, Stephen Tweedie

Hi,

On Tue, 2003-02-11 at 13:11, Stephan van Hienen wrote:
> > On Sun, 2003-02-09 at 10:08, Stephan van Hienen wrote:
> >
> > > Feb  7 04:18:15 storage kernel: EXT3-fs error (device md(9,0)):
> > > ext3_new_block:
> > > Allocating block in system zone - block = 536875638
> >
> > That looks like it could be a block wrap, amongst other possible causes.
> hmms and this means ?

One possible cause here is that some component of the system has wrapped
the block number round at 2TB, rather than correctly going beyond 2TB,
resulting in the wrong block being picked up as a bitmap block.
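
To illustrate the kind of wrap described above (a sketch under assumptions, not a diagnosis): if a 4k block number is converted to a 512-byte sector number and only the low 32 bits are kept, any block at or beyond 2TB maps back to the start of the device. The 32-bit truncation point and the helper name are assumptions for illustration; where such a truncation might actually happen is not established in this thread.

```python
SECTOR_SIZE = 512
BLOCK_SIZE = 4096
SECTORS_PER_BLOCK = BLOCK_SIZE // SECTOR_SIZE


def wrapped_block(block: int) -> int:
    """Block that would be hit if the sector number were cut to 32 bits."""
    sector = block * SECTORS_PER_BLOCK
    truncated_sector = sector & 0xFFFFFFFF  # keep only the low 32 bits
    return truncated_sector // SECTORS_PER_BLOCK


# Block number taken from the EXT3-fs error quoted above.
print(wrapped_block(536875638))
# A block below the 2TB boundary is unaffected.
print(wrapped_block(4726))
```

Under this assumption a write aimed just past 2TB would land in the first block groups, where bitmaps and inode tables live, which fits the "system zone" wording of the error.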

> > Well, that's the most likely candidate, because it's the least tested
> > component.  Are you using Ben LaHaise's LBD fixes for the md devices?
> > Without those, md and lvm are not LBD-safe.
> where can i find these lbd fixes for md ?

I've no idea.  Ben has some lb patches up at

  http://people.redhat.com/bcrl/lb/

but there's nothing broken out against the latest lbd diffs.

Cheers,
 Stephen



* Re:2TB+ fs ext3 (was fsck out of memory)
       [not found]           ` <3E479CC7.B170D5C7@aitel.hist.no>
@ 2003-02-11 16:14             ` Stephan van Hienen
  0 siblings, 0 replies; 21+ messages in thread
From: Stephan van Hienen @ 2003-02-11 16:14 UTC (permalink / raw)
  To: Helge Hafting
  Cc: Andreas Dilger, linux-kernel, linux-raid, ext2-devel,
	Theodore Ts'o, peter, bernard

On Mon, 10 Feb 2003, Helge Hafting wrote:

> For 1MB average filesize use -i 1048576

tried to mke2fs with '-i 1048576' :

----
]# mke2fs -i 1048576 /dev/md0 -R stride=16 -m 0 -j
mke2fs 1.32 (09-Nov-2002)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
2403840 inodes, 615381536 blocks
0 blocks (0.00%) reserved for the super user
First data block=0
18780 block groups
32768 blocks per group, 32768 fragments per group
128 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
        102400000, 214990848, 512000000, 550731776

Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 23 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
----
dmesg output :

md: mke2fs(pid 10600) used obsolete MD ioctl, upgrade your software to use
new ioctls.
----
then i try to mount :

----
]# mount /dev/md0  /raid/
mount: wrong fs type, bad option, bad superblock on /dev/md0,
       or too many mounted file systems
----
dmesg output :

raid5: switching cache buffer size, 4096 --> 1024
raid5: switching cache buffer size, 1024 --> 4096
EXT3-fs error (device md(9,0)): ext3_check_descriptors: Block bitmap for
group 4480 not in group (block 15)!
EXT3-fs: group descriptors corrupted !
----

and e2fsck :

----
e2fsck /dev/md0
e2fsck 1.32 (09-Nov-2002)
Group descriptors look bad... trying backup blocks...
Block bitmap for group 6528 is not in group.  (block 15)
Relocate<y>? yes

Inode bitmap for group 6528 is not in group.  (block 3145728)
Relocate<y>? yes

Inode table for group 6528 is not in group.  (block 0)
WARNING: SEVERE DATA LOSS POSSIBLE.
Relocate<y>? yes

Block bitmap for group 6529 is not in group.  (block 0)
Relocate<y>? yes

Inode bitmap for group 6529 is not in group.  (block 0)
Relocate<y>? yes

Inode table for group 6529 is not in group.  (block 0)
WARNING: SEVERE DATA LOSS POSSIBLE.
Relocate<y>? yes

Block bitmap for group 6530 is not in group.  (block 0)
Relocate<y>? yes

Inode bitmap for group 6530 is not in group.  (block 0)
Relocate<y>? yes

Inode table for group 6530 is not in group.  (block 0)
WARNING: SEVERE DATA LOSS POSSIBLE.
Relocate<y>? yes

.....


* Re:2TB+ fs ext3 (was fsck out of memory)
       [not found]             ` <1044993857.6640.7.camel@plokta.s8.com>
@ 2003-02-11 20:32               ` Stephan van Hienen
  0 siblings, 0 replies; 21+ messages in thread
From: Stephan van Hienen @ 2003-02-11 20:32 UTC (permalink / raw)
  To: Bryan O'Sullivan; +Cc: linux-raid, linux-kernel, ext2-devel, tbm, peter

On Tue, 11 Feb 2003, Bryan O'Sullivan wrote:

> Ugh, that stuff is ancient.
>
> Peter Chubb has a backport of his (very thorough) 2.5 patch to support
> large block devices on 32-bit platforms, against much newer kernels
> (e.g. 2.4.20) than Ben's stuff.  His site seems to be down, but you
> should be able to get it from somewhere under
> http://www.gelato.unsw.edu.au/~peterc/
correct link : http://www.gelato.unsw.edu.au/patches-index.html
(but down)

i used this patch, but i got this comment :

> Well, that's the most likely candidate, because it's the least tested
> component.  Are you using Ben LaHaise's LBD fixes for the md devices?
> Without those, md and lvm are not LBD-safe.

makes me wonder if i need another patch besides the 'Peter Chubb patch'
when using md raid ?

> I haven't used Peter's patch, but a similar patch, developed
> independently, definitely allows ext3 filesystems of up to 8TB in size
> to work fine on x86, under 2.4.

look at my other posts, i can't create/work with a 2348 Gigabyte /dev/md0


* fsck out of memory
@ 2003-02-11  2:53 Marco C. Mason
  0 siblings, 0 replies; 21+ messages in thread
From: Marco C. Mason @ 2003-02-11  2:53 UTC (permalink / raw)
  To: linux-kernel, raid

Stephan--

I don't know if anyone mentioned it or not, but the block addresses in your
error messages appear suspiciously close to 2^29.  I'm suspecting an internal
overflow in a calculation somewhere...
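
A quick way to check that observation is to subtract 2^29 from a few of the block numbers logged earlier in the thread. The sketch below assumes the 4096-byte block size used when the filesystem was created; the list holds only a small sample of the logged values.

```python
BLOCK_SIZE = 4096
BOUNDARY = 2**29  # 2^29 blocks of 4096 bytes is exactly 2 TiB

# A sample of block numbers from the EXT2-fs/EXT3-fs errors in this thread.
logged_blocks = [536875638, 536871063, 540606717]

# Distance of each logged block above the 2^29 boundary.
offsets = [block - BOUNDARY for block in logged_blocks]

print(f"boundary in bytes: {BOUNDARY * BLOCK_SIZE}")
for block, offset in zip(logged_blocks, offsets):
    print(f"block {block} is {offset} blocks past 2^29")
```

All of the sampled blocks sit above the boundary, which is what one would expect if the trouble starts once allocations cross 2 TiB.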

--marco




* Re: fsck out of memory
  2003-02-07 16:20       ` Wim Vinckier
@ 2003-02-07 17:08         ` Stephan van Hienen
  0 siblings, 0 replies; 21+ messages in thread
From: Stephan van Hienen @ 2003-02-07 17:08 UTC (permalink / raw)
  To: Wim Vinckier; +Cc: linux-raid, linux-kernel

On Fri, 7 Feb 2003, Wim Vinckier wrote:

> I would really use fsck.ext3...  I guess it will give a lot less errors...

fsck.ext3 = fsck.ext2

]# fsck.ext3 /dev/md0
e2fsck 1.32 (09-Nov-2002)



* Re: fsck out of memory
  2003-02-07 16:15     ` Stephan van Hienen
@ 2003-02-07 16:20       ` Wim Vinckier
  2003-02-07 17:08         ` Stephan van Hienen
  0 siblings, 1 reply; 21+ messages in thread
From: Wim Vinckier @ 2003-02-07 16:20 UTC (permalink / raw)
  To: Stephan van Hienen; +Cc: linux-raid, linux-kernel

On Fri, 7 Feb 2003, Stephan van Hienen wrote:

> On Fri, 7 Feb 2003, Wim Vinckier wrote:
>
> > I'm just wondering why you are using ext2 instead of ext3 or reiserfs...
> i'm running ext3
> but crash was heavy enough for removing the journal info :(
>

I would really use fsck.ext3...  I guess it will give a lot less errors...

> > I would still give it a try to boot my system without mounting the raid so
> > you just can wait until the raid is synchronized.  Once this is ready,
> i can mount the filesystem, but get errors on accessing some files
> so i prefer to run fsck on it (and restore the journal info)
>
> > you can check your raid.  BTW, I had two fans blowing air over my
> > harddisks but I got the crash because I used the normal flat IDE-cable...
> > I suppose you really checked the heat of the disks?
> yes i did
>

------------------------------------------------------------------------
Wim VINCKIER
Wim-Raid@tisnix.be                                         ICQ 100545109
------------------------------------------------------------------------
'Windows 98 or better required' said the box... so I installed linux



* Re: fsck out of memory
  2003-02-07 16:09   ` Wim Vinckier
@ 2003-02-07 16:15     ` Stephan van Hienen
  2003-02-07 16:20       ` Wim Vinckier
  0 siblings, 1 reply; 21+ messages in thread
From: Stephan van Hienen @ 2003-02-07 16:15 UTC (permalink / raw)
  To: Wim Vinckier; +Cc: kernel, linux-raid, linux-kernel

On Fri, 7 Feb 2003, Wim Vinckier wrote:

> I'm just wondering why you are using ext2 instead of ext3 or reiserfs...
I'm running ext3,
but the crash was bad enough to remove the journal info :(

> I would still try booting the system without mounting the raid, so
> you can just wait until the raid is synchronized.  Once that is done,
I can mount the filesystem, but I get errors when accessing some files,
so I prefer to run fsck on it (and restore the journal info)

> you can check your raid.  BTW, I had two fans blowing air over my
> hard disks, but I got the crash because I used a normal flat IDE cable...
> I suppose you really checked the heat of the disks?
Yes, I did.



* Re: fsck out of memory
  2003-02-07 15:50 ` kernel
@ 2003-02-07 16:09   ` Wim Vinckier
  2003-02-07 16:15     ` Stephan van Hienen
  0 siblings, 1 reply; 21+ messages in thread
From: Wim Vinckier @ 2003-02-07 16:09 UTC (permalink / raw)
  To: kernel; +Cc: linux-raid, linux-kernel

On Fri, 7 Feb 2003 kernel@ddx.a2000.nu wrote:

> On Fri, 7 Feb 2003, Wim Vinckier wrote:
>
> > I've got an equivalent problem with my server.  After a long search, it
> > seemed to be a heating problem.  The ventilation wasn't good enough to
>
> The disks are not warm at all;
> there are three 6000 rpm fans blowing air over them.
>

I'm just wondering why you are using ext2 instead of ext3 or reiserfs...
I would still try booting the system without mounting the raid, so
you can just wait until the raid is synchronized.  Once that is done,
you can check your raid.  BTW, I had two fans blowing air over my
hard disks, but I got the crash because I used a normal flat IDE cable...
I suppose you really checked the heat of the disks?

Wim.
------------------------------------------------------------------------
Wim VINCKIER
Wim-Raid@tisnix.be                                         ICQ 100545109
------------------------------------------------------------------------
'Windows 98 or better required' said the box... so I installed linux



* Re: fsck out of memory
       [not found] <Pine.LNX.4.33.0302071637050.11484-100000@nooks.wimpunk.com>
@ 2003-02-07 15:50 ` kernel
  2003-02-07 16:09   ` Wim Vinckier
  0 siblings, 1 reply; 21+ messages in thread
From: kernel @ 2003-02-07 15:50 UTC (permalink / raw)
  To: Wim Vinckier; +Cc: linux-raid, linux-kernel

On Fri, 7 Feb 2003, Wim Vinckier wrote:

> I've got an equivalent problem with my server.  After a long search, it
> seemed to be a heating problem.  The ventilation wasn't good enough to

The disks are not warm at all;
there are three 6000 rpm fans blowing air over them.


end of thread, other threads:[~2003-02-11 20:23 UTC | newest]

Thread overview: 21+ messages
2003-02-07 15:17 fsck out of memory kernel
2003-02-07 17:07 ` Stephan van Hienen
2003-02-07 17:28   ` Andreas Dilger
2003-02-09 10:08     ` Stephan van Hienen
2003-02-09 10:32       ` Stephan van Hienen
2003-02-09 20:08       ` Peter Chubb
2003-02-10 11:28         ` Stephan van Hienen
2003-02-09 20:31       ` Andreas Dilger
2003-02-10 12:01         ` Stephan van Hienen
     [not found]           ` <3E479CC7.B170D5C7@aitel.hist.no>
2003-02-11 16:14             ` Re:2TB+ fs ext3 (was fsck out of memory) Stephan van Hienen
2003-02-10 22:44       ` [Ext2-devel] Re: fsck out of memory Stephen C. Tweedie
2003-02-11 13:11         ` Stephan van Hienen
2003-02-11 14:33           ` Stephen C. Tweedie
     [not found]             ` <1044993857.6640.7.camel@plokta.s8.com>
2003-02-11 20:32               ` Re:2TB+ fs ext3 (was fsck out of memory) Stephan van Hienen
2003-02-07 17:24 ` fsck out of memory Andreas Dilger
     [not found] <Pine.LNX.4.33.0302071637050.11484-100000@nooks.wimpunk.com>
2003-02-07 15:50 ` kernel
2003-02-07 16:09   ` Wim Vinckier
2003-02-07 16:15     ` Stephan van Hienen
2003-02-07 16:20       ` Wim Vinckier
2003-02-07 17:08         ` Stephan van Hienen
2003-02-11  2:53 Marco C. Mason
