linux-kernel.vger.kernel.org archive mirror
* filesystem bug?
@ 2003-12-15  9:25 Tsuchiya Yoshihiro
  2003-12-15  9:55 ` bert hubert
                   ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Tsuchiya Yoshihiro @ 2003-12-15  9:25 UTC
  To: linux-kernel

Hi,

Ext2 and Ext3 filesystems become inconsistent when I run a
simple test program on my system.

My test program is a script that extracts a tar+gzip archive
twice and compares the two trees, then removes one of the trees,
extracts again, and compares again. A very simple test.

The following is an Ext2 result; the inode is filled with zeros.
I think the inode has become a bad inode.

----
[root@dell04 tsuchiya]# ls -l /mnt/foo/ae/dir0/mozilla/layout/html/tests/table/bugs/bug2757.html
ls: /mnt/foo/ae/dir0/mozilla/layout/html/tests/table/bugs/bug2757.html: Input/output error


debugfs:  stat foo/ae/dir0/mozilla/layout/html/tests/table/bugs/bug2757.html
Inode: 1935297   Type: bad type    Mode:  0000   Flags: 0x0   Generation: 0
User:     0   Group:     0   Size: 0
File ACL: 0    Directory ACL: 0
Links: 0   Blockcount: 0
Fragment:  Address: 0    Number: 0    Size: 0
ctime: 0x00000000 -- Thu Jan  1 09:00:00 1970
atime: 0x00000000 -- Thu Jan  1 09:00:00 1970
mtime: 0x00000000 -- Thu Jan  1 09:00:00 1970
BLOCKS:

/dev/sda4 on /mnt type ext2 (rw)
----
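
The zero timestamps in the debugfs output are shown as 09:00:00 rather
than 00:00:00, presumably because they were printed in the machine's
local time zone. A minimal sketch of that arithmetic follows, assuming
GNU date and assuming the machine was set to Japan time (UTC+9), which
the output suggests but does not state. decode_epoch is a hypothetical
helper name.

```shell
#!/bin/bash
# decode_epoch HEX TZ: print a hex epoch value in the layout debugfs uses.
# Assumption: GNU date (for -d @SECONDS). JST-9 is POSIX TZ syntax for UTC+9.
decode_epoch() {
    local seconds=$(( $1 ))     # bash arithmetic accepts the 0x prefix
    LC_ALL=C TZ=$2 date -d "@$seconds" '+%a %b %e %H:%M:%S %Y'
}

decode_epoch 0x00000000 JST-9   # prints: Thu Jan  1 09:00:00 1970
```

So the three timestamps are simply zero, consistent with the rest of the
zero-filled inode.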

I saw the same thing on Ext3 before.

I use Red Hat 9, whose kernel is 2.4.20-8, and I also tried
2.4.20-19.9 (the Red Hat kernel patch RPM).

I want to know whether it is a Red Hat kernel problem or a generic
Ext problem, and in which version it is fixed.


The mkfs parameters are just the defaults of /sbin/mkfs.ext2 and mkfs.ext3,
and I use a DELL 1650's internal SCSI disks for this test:

----
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.8
        <Adaptec aic7899 Ultra160 SCSI adapter>
        aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs

scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.8
        <Adaptec aic7899 Ultra160 SCSI adapter>
        aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs

blk: queue dfceb214, I/O limit 4095Mb (mask 0xffffffff)
  Vendor: SEAGATE   Model: ST336607LC        Rev: DS04
  Type:   Direct-Access                      ANSI SCSI revision: 03
blk: queue dfceb414, I/O limit 4095Mb (mask 0xffffffff)
  Vendor: SEAGATE   Model: ST336753LC        Rev: DX03
  Type:   Direct-Access                      ANSI SCSI revision: 03
----

My script is attached below. I use Mozilla's tar archive.
Edit the first three variable assignments for your setup.

In the example above, the inode structure was zeroed out, and
sometimes the data area was corrupted. I also saw an inode overwritten
by a deleted inode (one with nlink=0 and i_dtime set).
My feeling is that the affected buffers were reused for some other
purpose and overwritten without the proper buffer lock being held.
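
A minimal sketch of how further affected paths could be listed after a
run: walk the test tree and print every path whose metadata cannot be
read, which is how the "Input/output error" above shows up. It assumes
GNU find and stat; find_unstatable is a hypothetical name, and find may
itself report some of these errors on stderr instead of passing the path
along.

```shell
#!/bin/bash
# find_unstatable ROOT: print each path under ROOT for which stat fails.
# Assumption: GNU find and stat. stat does not follow symlinks here, so a
# dangling symlink is not reported.
find_unstatable() {
    local root=$1 path
    find "$root" -print0 |
    while IFS= read -r -d '' path; do
        stat -- "$path" > /dev/null 2>&1 || printf '%s\n' "$path"
    done
}
```

For example, `find_unstatable /mnt/foo` would be run once the test has
stopped. An empty result only means that no path failed stat at that
moment.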

Here is my script:
---
#!/bin/bash

TARGETPREFIX=/mnt/foo   # filesystem that will be tested
MOZSRC=/home/tsuchiya/src/mozilla-source-1.3.tar.gz     # tgz used for test
RDIR="/tmp/xcresult"    # result directory

function _xtract+compare {
        echo "extracting directory to be compared against for $1"
        TARGETDIR=$TARGETPREFIX/$1
        mkdir -p $TARGETDIR
        cd $TARGETDIR
        tar zxf $MOZSRC
        echo "$1 done .... now the job is started."
        RESULTS=$RDIR/$1

        echo "test result will be stored under $RESULTS"
        mkdir -p $RESULTS;

        for ((i=0; i < 100000; i++))
        do
                echo "$1:$i-th trial"

                echo "test dir is $TARGETDIR";
                mkdir -p $TARGETDIR;

                cd $TARGETDIR
                mkdir dir$i
                cd dir$i
                tar zxf $MOZSRC
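                # compare the fresh extraction with the reference tree
                diff -rq $TARGETDIR/mozilla mozilla > $RESULTS/dir$i.result 2>&1
                # size of the diff log in bytes (field 5 of ls -l)
                DIFFSIZE=`ls -l $RESULTS/dir$i.result | awk '{print $5}'`
                # quoted so that an empty value counts as a failure, not a test error
                if [ "$DIFFSIZE" != 0 ];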
                then
                        echo "something wrong happened at $1:$i-th trial "
                        exit;
                else
                        rm $RESULTS/dir$i.result
                        echo "test $1:$i-th passed"
                fi
                # remove this trial's tree in the background while the next trial starts
                rm -rf mozilla &
        done
}

for target in aa ab ac ad ae # af ag ah ai aj ak al am an
do
        _xtract+compare $target $RDIR &
done

---
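
The script above decides pass or fail from the size of the diff log as
reported by ls -l. A hedged alternative, sketched here, uses the exit
status of diff directly: 0 when the trees match, 1 when they differ, 2
on trouble such as an unreadable file. compare_trees is a hypothetical
helper and not part of the original script.

```shell
#!/bin/bash
# compare_trees REFERENCE CANDIDATE LOGFILE
# Hypothetical helper: returns the exit status of diff -rq
# (0 = identical, 1 = different, 2 = trouble) and keeps its output in LOGFILE.
compare_trees() {
    local reference=$1 candidate=$2 logfile=$3
    diff -rq "$reference" "$candidate" > "$logfile" 2>&1
}
```

In the loop this could replace the DIFFSIZE check, for example as
`if compare_trees "$TARGETDIR/mozilla" mozilla "$RESULTS/dir$i.result"; then ... fi`.
Any non-zero status would then count as a failed trial, including the
case where diff itself hits an I/O error.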


Any information would be appreciated.

Thanks,
Yoshi
---
Yoshihiro Tsuchiya




* Re: filesystem bug?
@ 2003-12-26 13:22 土屋芳浩
  2003-12-26 14:30 ` dlion
  0 siblings, 1 reply; 30+ messages in thread
From: 土屋芳浩 @ 2003-12-26 13:22 UTC
  To: dlion2004; +Cc: tsuchiya, linux-kernel

Hi,

 >I got other errors on the ext3 filesystem, including:
 >1. missing file
 >2. corrupted file
 >but when I used fsck.ext3 to check the ramdisk, the result was clean.

Dlion, what did the corrupted file look like?
(Its file size, number of blocks, etc.)

Thanks,
Yoshi
--
Yoshihiro Tsuchiya


* Re: filesystem bug?
@ 2003-12-27 14:35 Tsuchiya Yoshihiro
       [not found] ` <Pine.LNX.4.58L.0312301556380.23875@logos.cnet>
  0 siblings, 1 reply; 30+ messages in thread
From: Tsuchiya Yoshihiro @ 2003-12-27 14:35 UTC
  To: linux-kernel; +Cc: dlion2004


Hi,

 >1. some corrupted files are truncated to 0 bytes. Blockcount is 0.
 >
 >2. some corrupted files are truncated; the result is a shorter file.
 >the new size is a multiple of the block size.

I have seen these before, though

 >3. maybe all corrupted files' mtime is exactly the same
 >wrong value. It should be around 2003.12.26 21:30:00, but
 >is 2002.05.12 12:00:48 (hex value is 0x3cdde8f0). ctime
 >and atime are correct. The system's clock time is unchanged.
 >
 >4. it seems that the corrupted files tend to exist in the same
 >directory.

I was not aware of these. Thank you.
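
A minimal sketch of how a single file could be checked against the
symptoms quoted above (zero size, size an exact multiple of the block
size, and the one suspicious mtime value). It assumes GNU stat;
classify_file is a hypothetical name, and the 4096-byte default block
size is an assumption, since the thread does not give the actual value.

```shell
#!/bin/bash
# Hypothetical helper, not from the thread. Assumption: GNU stat.
SUSPECT_MTIME=$((0x3cdde8f0))   # 1021176048 seconds since the epoch

# classify_file FILE [BLOCKSIZE]
# Prints "empty", "block-multiple" or "other" for the size, then
# "suspect-mtime" on a second line if the mtime equals SUSPECT_MTIME.
classify_file() {
    local file=$1 blocksize=${2:-4096} size mtime
    size=$(stat -c %s -- "$file") || return 2
    mtime=$(stat -c %Y -- "$file") || return 2
    if [ "$size" -eq 0 ]; then
        echo "empty"
    elif [ $((size % blocksize)) -eq 0 ]; then
        echo "block-multiple"
    else
        echo "other"
    fi
    [ "$mtime" -eq "$SUSPECT_MTIME" ] && echo "suspect-mtime"
    return 0
}
```

With GNU date, `date -u -d @1021176048 '+%Y-%m-%d %H:%M:%S'` prints
2002-05-12 04:00:48, so the quoted 12:00:48 would be consistent with a
clock set to UTC+8, although the message does not say which time zone
was in use.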

Yoshi
---
Yoshihiro Tsuchiya 


* Re: filesystem bug?
@ 2004-01-16  2:59 Tsuchiya Yoshihiro
  2004-01-16 12:29 ` Stephen C. Tweedie
  0 siblings, 1 reply; 30+ messages in thread
From: Tsuchiya Yoshihiro @ 2004-01-16  2:59 UTC
  To: linux-kernel; +Cc: Stephen C. Tweedie


Hi Stephen,

>Now, I can't tell from this whether it's a bash bug or an exit/signal
> bug, but it doesn't look like a filesystem problem for now. I'm going
> to try with a different shell to see if that helps.

I tried with /bin/zsh, and it seems you are right. The script
has been working fine for about 2 hours.

So next I will try to track down the EIO (inode corruption) problem.

Thank you so much,

Yoshi

--
Yoshihiro Tsuchiya





Thread overview: 30+ messages
2003-12-15  9:25 filesystem bug? Tsuchiya Yoshihiro
2003-12-15  9:55 ` bert hubert
2003-12-16 13:44 ` Stephen C. Tweedie
2003-12-16 21:40   ` Bryan Whitehead
2003-12-16 21:50     ` Bryan Whitehead
2003-12-16 23:31     ` Tsuchiya Yoshihiro
2003-12-16 23:40       ` viro
2003-12-17  0:12         ` Tsuchiya Yoshihiro
2003-12-17 23:24       ` Tsuchiya Yoshihiro
2003-12-18 21:29         ` Stephen C. Tweedie
2003-12-21 23:15           ` Tsuchiya Yoshihiro
2003-12-22  1:54             ` Tsuchiya Yoshihiro
2003-12-22  4:30             ` Tsuchiya Yoshihiro
2003-12-22 12:03               ` Stephen C. Tweedie
2003-12-24  1:48                 ` Tsuchiya Yoshihiro
2003-12-24 23:09                   ` Tsuchiya Yoshihiro
2004-01-15  6:38                     ` Tsuchiya Yoshihiro
2003-12-26  9:59 ` dlion
2003-12-26 12:27   ` dlion
2003-12-26 13:22 土屋芳浩
2003-12-26 14:30 ` dlion
2003-12-28  8:26   ` dlion
2003-12-27 14:35 Tsuchiya Yoshihiro
     [not found] ` <Pine.LNX.4.58L.0312301556380.23875@logos.cnet>
     [not found]   ` <74964CA8-3B50-11D8-B879-00039341E01A@ybb.ne.jp>
     [not found]     ` <1074109164.4538.8.camel@sisko.scot.redhat.com>
     [not found]       ` <0586254E-46DA-11D8-B45E-00039341E01A@ybb.ne.jp>
2004-01-15 22:38         ` Stephen C. Tweedie
2004-01-16  2:59 Tsuchiya Yoshihiro
2004-01-16 12:29 ` Stephen C. Tweedie
2004-01-19  7:52   ` Tsuchiya Yoshihiro
2004-01-19 13:12     ` Stephen C. Tweedie
2004-01-20  8:36       ` Tsuchiya Yoshihiro
2004-01-20 16:27         ` Stephen C. Tweedie
