linux-kernel.vger.kernel.org archive mirror
* ext2fs corruption again
@ 2001-09-15 11:14 Kristian Peters
  2001-09-15 12:21 ` Roeland Th. Jansen
  2001-09-15 15:48 ` Mohammad A. Haque
  0 siblings, 2 replies; 14+ messages in thread
From: Kristian Peters @ 2001-09-15 11:14 UTC
  To: linux-kernel

Hello.

About 3 weeks ago I sent a report that I was getting very strange kernel error messages.

I changed my harddrive to IBM 75 GB because someone said that IBM's 40 GB
harddisks are not very stable.

Today I got these from the kernel (with the new hd):

Sep 15 10:01:58 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4215
Sep 15 10:01:58 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4217
Sep 15 10:01:59 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4234
Sep 15 10:01:59 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4236
Sep 15 10:01:59 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4239
Sep 15 10:02:03 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4847
Sep 15 10:02:03 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4848
Sep 15 10:02:03 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4852
Sep 15 10:02:03 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4855
Sep 15 10:02:06 adlib kernel: EXT2-fs error (device ide0(3,5)): ext2_new_block:
Allocating block in system zone - block = 174
Sep 15 10:02:06 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: Freeing blocks in system zones - Block = 179, count = 3
Sep 15 10:02:09 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4839

Then I ran e2fsck on that device (hda5), and the errors occurred again after the
check (and a complete reboot):

Sep 15 10:10:38 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4163
Sep 15 10:10:38 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4166
Sep 15 10:10:38 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4131
Sep 15 10:10:38 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4132
Sep 15 10:10:38 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4155
Sep 15 10:10:38 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4156
Sep 15 10:10:38 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4157
Sep 15 10:10:38 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4161
Sep 15 10:10:38 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4162
Sep 15 10:10:43 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 716
Sep 15 10:10:43 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 717
Sep 15 10:10:43 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 720
Sep 15 10:10:43 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 723
Sep 15 10:10:43 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 724
Sep 15 10:10:43 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 725
Sep 15 10:10:43 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 726
Sep 15 10:10:43 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 727
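
For readers following the logs: a minimal Python sketch of how the wrapped EXT2-fs messages above could be rejoined and tallied. The names `parse_errors` and `ide_name` are hypothetical helpers, not part of any tool mentioned in this thread. The device mapping assumes the classic IDE numbering (major 3, minors 0-63 for hda, 64-127 for hdb), which is why ide0(3,5) corresponds to hda5.

```python
import re

# One EXT2-fs error entry; the message may continue on the next (wrapped) line.
ENTRY_RE = re.compile(
    r"EXT2-fs error \(device \w+\((\d+),(\d+)\)\):\s*"  # e.g. device ide0(3,5)
    r"(\w+):\s*"                                         # reporting function
    r"(.+?)(?=\n\w{3} +\d+ \d\d:\d\d:\d\d |\Z)",         # text up to next timestamp
    re.S,
)


def ide_name(major, minor):
    """Map an old-style (major, minor) pair on the first IDE channel to a name."""
    if major != 3 or not 0 <= minor < 128:
        raise ValueError("only major 3 (first IDE channel) is handled in this sketch")
    disk = "hda" if minor < 64 else "hdb"
    part = minor % 64
    return disk if part == 0 else f"{disk}{part}"


def parse_errors(text):
    """Yield (device, function, message) for each EXT2-fs error in a log excerpt."""
    for m in ENTRY_RE.finditer(text):
        major, minor = int(m.group(1)), int(m.group(2))
        message = " ".join(m.group(4).split())  # collapse the mail line wrapping
        yield ide_name(major, minor), m.group(3), message
```

Fed the excerpts above, this should give one tuple per error, all on hda5, which makes it easier to see which functions and block ranges keep recurring.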

I've written down all e2fsck messages by hand. ;-) And I compared them.

The following messages from e2fsck are always the same, on both the old and the
new hd:

Duplicate/bad block(s) in inode:  97: 643
Duplicate/bad block(s) in inode: 100: 649
Duplicate/bad block(s) in inode: 101: 650 651
Duplicate/bad block(s) in inode: 102: 652
Duplicate/bad block(s) in inode: 103: 653 656
Duplicate/bad block(s) in inode: 104: 659 660
Duplicate/bad block(s) in inode: 105: 661 662 663 664 665 666
Duplicate/bad block(s) in inode: 106: 667 668
Duplicate/bad block(s) in inode: 107: 669 671
Duplicate/bad block(s) in inode: 108: 672 673 674
Duplicate/bad block(s) in inode: 110: 678

Inodes 643-678 are always connected to faults.

The following files are always involved in these errors:
/var/log/wtmp
/var/log/messages

The old hd was hda: IBM-DTLA-305040, ATA DISK drive. The new is: hda:
IBM-DTLA-307075, ATA DISK drive.

hdparm says:
    Model=IBM-DTLA-307075, FwRev=TXAOA50C, SerialNo=YSDYSFN9998
    Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
    RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=40
    BuffType=DualPortCache, BuffSize=1916kB, MaxMultSect=16, MultSect=8
    CurCHS=17475/15/63, CurSects=-78446341, LBA=yes, LBAsects=150136560
    IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
    PIO modes: pio0 pio1 pio2 pio3 pio4
    DMA modes: mdma0 mdma1 mdma2 udma0 udma1 *udma2
    AdvancedPM=yes: disabled (255)
    Drive Supports : ATA/ATAPI-5 T13 1321D revision 1 : ATA-2 ATA-3 ATA-4 ATA-5

I currently use linux 2.4.9 and e2fsprogs 1.23 and fileutils-4.1 and a modified
RedHat 6.2. These errors only occurred with linux>=2.4.5-ac11.

I might say this is definitely an error with ext2 !

Kristian

·· · · reach me :: · ·· ·· ·  · ·· · ··  · ··· · ·
                            :: http://www.korseby.net
                            :: http://www.tomlab.de
kristian@korseby.net ....::



* Re: ext2fs corruption again
  2001-09-15 11:14 ext2fs corruption again Kristian Peters
@ 2001-09-15 12:21 ` Roeland Th. Jansen
  2001-09-15 14:49   ` Kristian
  2001-09-15 15:48 ` Mohammad A. Haque
  1 sibling, 1 reply; 14+ messages in thread
From: Roeland Th. Jansen @ 2001-09-15 12:21 UTC
  To: Kristian Peters; +Cc: linux-kernel

On Sat, Sep 15, 2001 at 01:14:32PM +0200, Kristian Peters wrote:
> I changed my harddrive to IBM 75 GB because someone said that IBM's 40 GB
> harddisks are not very stable.

[....] 

> I might say this is definitely an error with ext2 !

I might say that you could have tried a drive of a completely different make. who
says that the 75 GB IBM is ok ?

maybe the same design mistake was made or so. just a thought. 

options to see if something weird isn't due to hardware could be to not
use dma, change pio modes, i.e. relax the drive settings.

not that I say IBM's drive is bad, it's just a thought.

-- 
Grobbebol's Home                   |  Don't give in to spammers.   -o)
http://www.xs4all.nl/~bengel       | Use your real e-mail address   /\
Linux 2.2.16 SMP 2x466MHz / 256 MB |        on Usenet.             _\_v  


* Re: ext2fs corruption again
  2001-09-15 12:21 ` Roeland Th. Jansen
@ 2001-09-15 14:49   ` Kristian
  2001-09-15 21:04     ` Mike Fedyk
  0 siblings, 1 reply; 14+ messages in thread
From: Kristian @ 2001-09-15 14:49 UTC
  To: Roeland Th. Jansen, David Weinehall; +Cc: linux-kernel

Roeland Th. Jansen wrote:
> not that I say IBM's drive is bad, it's just a thought.

They are bad if it's hardware-related. The big one is manufactured in Hungary, 
the other one in Thailand.

The error occurred again.

Here are the new errors. Maybe you can see some pattern in them.

Sep 15 16:16:23 adlib kernel: EXT2-fs error (device ide0(3,5)): 
ext2_free_blocks: bit already cleared for block (5412-5427)

e2fsck reported the following on that device (hda5):

++ entries are new with this check
-- entries only appeared earlier

Duplicate/bad block(s) in inode:  97: 643 +644+
Duplicate/bad block(s) in inode:  98: +647+
Duplicate/bad block(s) in inode:  99: +648+
Duplicate/bad block(s) in inode: 100: 649
Duplicate/bad block(s) in inode: 101: 650 651
Duplicate/bad block(s) in inode: 102: 652
Duplicate/bad block(s) in inode: 103: 653 656 +657+
Duplicate/bad block(s) in inode: 104: +658+ 659 660
Duplicate/bad block(s) in inode: 105: 661 662 663 664 665 666
Duplicate/bad block(s) in inode: 106: 667 -668-
Duplicate/bad block(s) in inode: 107: 669 -671-
-Duplicate/bad block(s) in inode: 108: 672 673 674-
-Duplicate/bad block(s) in inode: 110: 678-

767011: 647 648 649 650 651 652 653 654 671
832166: 655 656 657 658 659 660 661 662
832170: 643 644
832178: 663 664 665 666 667

832178 is /var/log/boot.log
832170 is /var/log/wtmp
832166 is /var/log/messages
767011 is /home/tisi/syslog

Only syslog-related files are affected.
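
The hand comparison of two e2fsck runs (the ++ and -- markers above) can be sketched in Python as a set difference per inode. `parse_report` and `diff_reports` are hypothetical names; the parser only assumes the "in inode: N: blocks" layout used in this mail and expects clean lines without the +/- markers.

```python
import re

# Matches "... in inode: 97: 643 644" regardless of the words before "in inode".
LINE_RE = re.compile(r"in inode:\s*(\d+):\s*((?:\d+\s*)+)")


def parse_report(text):
    """Turn hand-copied e2fsck lines into {inode: set_of_block_numbers}."""
    report = {}
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m:
            report[int(m.group(1))] = {int(b) for b in m.group(2).split()}
    return report


def diff_reports(old, new):
    """Return {inode: (added_blocks, vanished_blocks)} for inodes that changed."""
    changes = {}
    for inode in sorted(old.keys() | new.keys()):
        before = old.get(inode, set())
        after = new.get(inode, set())
        added, gone = after - before, before - after
        if added or gone:
            changes[inode] = (sorted(added), sorted(gone))
    return changes
```

Entries that never show up in the diff are the stable ones, which are arguably the more interesting set when the same numbers recur on two different disks.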

syslog is configured to accept logs from other machines. Maybe these strange 
errors were caused by the network card or its driver? I own an eepro100. Just a 
thought...

These errors have occurred since 2.4.5; that's why I think it's software-related.

I'll try to use 'hdparm -d1 -X33 /dev/hda' and other modes to see if it occurs 
again. But testing could take some time. It appears roughly every second day.
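
As a reading aid for the -X values: a small Python sketch that decodes the numeric argument, assuming the encoding described in the hdparm man page (8+n for PIO mode n, 32+n for multiword DMA mode n, 64+n for UltraDMA mode n). `decode_xfer_mode` is a hypothetical helper; check the table against the man page of the hdparm version in use.

```python
def decode_xfer_mode(value):
    """Translate an hdparm -X argument into a readable transfer mode name."""
    if value == 0:
        return "default PIO mode"
    if value == 1:
        return "default PIO mode, IORDY disabled"
    if 8 <= value <= 15:
        return f"pio{value - 8}"      # PIO with flow control
    if 32 <= value <= 39:
        return f"mdma{value - 32}"    # multiword DMA
    if 64 <= value <= 71:
        return f"udma{value - 64}"    # UltraDMA
    raise ValueError(f"unrecognised transfer mode value: {value}")
```

Under that assumption, -X33 selects mdma1, a step down from the *udma2 that the hdparm output in the first mail marks as active, and -X66 would select udma2 again.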

Kristian

·· · · reach me :: · ·· ·· ·  · ·· · ··  · ··· · ·
                          :: http://www.korseby.net
                          :: http://www.tomlab.de
kristian@korseby.net ....::



* Re: ext2fs corruption again
  2001-09-15 11:14 ext2fs corruption again Kristian Peters
  2001-09-15 12:21 ` Roeland Th. Jansen
@ 2001-09-15 15:48 ` Mohammad A. Haque
  2001-09-15 16:19   ` Kristian Peters
  1 sibling, 1 reply; 14+ messages in thread
From: Mohammad A. Haque @ 2001-09-15 15:48 UTC
  To: Kristian Peters; +Cc: linux-kernel

Kristian Peters wrote:
> About 3 weeks ago I sent a report that I was getting very strange kernel error messages.
> 
> I changed my harddrive to IBM 75 GB because someone said that IBM's 40 GB
> harddisks are not very stable.
> 
...
> 
> The following files are always involved in these errors:
> /var/log/wtmp
> /var/log/messages
> 
> The old hd was hda: IBM-DTLA-305040, ATA DISK drive. The new is: hda:
> IBM-DTLA-307075, ATA DISK drive.
> 
...
> 
> I currently use linux 2.4.9 and e2fsprogs 1.23 and fileutils-4.1 and a modified
> RedHat 6.2. These errors only occurred with linux>=2.4.5-ac11.
> 
> I might say this is definitely an error with ext2 !


You need to provide more information such as what kind of motherboard or
ide chipset you are using. 

-- 

=====================================================================
Mohammad A. Haque                              http://www.haque.net/ 
                                               mhaque@haque.net

  "Alcohol and calculus don't mix.             Project Lead
   Don't drink and derive." --Unknown          http://wm.themes.org/
                                               batmanppc@themes.org
=====================================================================


* Re: ext2fs corruption again
  2001-09-15 15:48 ` Mohammad A. Haque
@ 2001-09-15 16:19   ` Kristian Peters
  0 siblings, 0 replies; 14+ messages in thread
From: Kristian Peters @ 2001-09-15 16:19 UTC
  To: Mohammad A. Haque; +Cc: linux-kernel

*Soooorrrry for sending the first message of this thread 3 times... But the 
mailing list still returns my mails...*

Mohammad A. Haque wrote:
> You need to provide more information such as what kind of motherboard or
> ide chipset you are using. 

Ok. I already did that before and will do it again. ;-)

As you'll see I have no VIA chipset.

[root@adlib /root]# cat /proc/interrupts
            CPU0
   0:     424127          XT-PIC  timer
   1:      20206          XT-PIC  keyboard
   2:          0          XT-PIC  cascade
   8:          1          XT-PIC  rtc
  11:     186306          XT-PIC  es1371, bttv, eth0
  12:      94563          XT-PIC  PS/2 Mouse
  14:     117603          XT-PIC  ide0
  15:         17          XT-PIC  ide1
NMI:          0
ERR:          0
[root@adlib /root]# cat /proc/pci
PCI devices found:
   Bus  0, device   0, function  0:
     Host bridge: Intel Corporation 440BX/ZX - 82443BX/ZX Host bridge (rev 3).
       Master Capable.  Latency=64.
       Prefetchable 32 bit memory at 0x44000000 [0x47ffffff].
   Bus  0, device   1, function  0:
     PCI bridge: Intel Corporation 440BX/ZX - 82443BX/ZX AGP bridge (rev 3).
       Master Capable.  Latency=64.  Min Gnt=140.
   Bus  0, device  14, function  0:
     Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 5).
       IRQ 11.
       Master Capable.  Latency=66.  Min Gnt=8.Max Lat=56.
       Prefetchable 32 bit memory at 0x48100000 [0x48100fff].
       I/O at 0x1000 [0x101f].
       Non-prefetchable 32 bit memory at 0x48000000 [0x480fffff].
   Bus  0, device  15, function  0:
     Multimedia audio controller: Ensoniq ES1371 [AudioPCI-97] (rev 6).
       IRQ 11.
       Master Capable.  Latency=64.  Min Gnt=12.Max Lat=128.
       I/O at 0x1080 [0x10bf].
   Bus  0, device  16, function  0:
     Multimedia video controller: Brooktree Corporation Bt878 (rev 17).
       IRQ 11.
       Master Capable.  Latency=66.  Min Gnt=16.Max Lat=40.
       Prefetchable 32 bit memory at 0x48200000 [0x48200fff].
   Bus  0, device  16, function  1:
     Multimedia controller: Brooktree Corporation Bt878 (rev 17).
       IRQ 11.
       Master Capable.  Latency=66.  Min Gnt=4.Max Lat=255.
       Prefetchable 32 bit memory at 0x48300000 [0x48300fff].
   Bus  0, device  20, function  0:
     ISA bridge: Intel Corporation 82371AB PIIX4 ISA (rev 2).
   Bus  0, device  20, function  1:
     IDE interface: Intel Corporation 82371AB PIIX4 IDE (rev 1).
       Master Capable.  Latency=64.
       I/O at 0x1040 [0x104f].
   Bus  0, device  20, function  2:
     USB Controller: Intel Corporation 82371AB PIIX4 USB (rev 1).
       IRQ 11.
       Master Capable.  Latency=64.
       I/O at 0x1020 [0x103f].
   Bus  0, device  20, function  3:
     Bridge: Intel Corporation 82371AB PIIX4 ACPI (rev 2).
       IRQ 9.
   Bus  1, device   0, function  0:
     VGA compatible controller: Matrox Graphics, Inc. MGA G400 AGP (rev 130).
       IRQ 11.
       Master Capable.  Latency=64.  Min Gnt=16.Max Lat=32.
       Prefetchable 32 bit memory at 0x42000000 [0x43ffffff].
       Non-prefetchable 32 bit memory at 0x40800000 [0x40803fff].
       Non-prefetchable 32 bit memory at 0x40000000 [0x407fffff].
[root@adlib /root]# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 8
model name      : Pentium III (Coppermine)
stepping        : 3
cpu MHz         : 597.416
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips        : 1192.75
[root@adlib /root]# cat /proc/filesystems
nodev   proc
nodev   sockfs
nodev   tmpfs
nodev   shm
nodev   pipefs
         ext2
         msdos
         vfat
         iso9660
nodev   devfs
         umsdos
nodev   devpts
nodev   nfs
[root@adlib /root]# cat /proc/dma
  4: cascade

Kristian

·· · · reach me :: · ·· ·· ·  · ·· · ··  · ··· · ·
                          :: http://www.korseby.net
                          :: http://www.tomlab.de
kristian@korseby.net ....::



* Re: ext2fs corruption again
  2001-09-15 14:49   ` Kristian
@ 2001-09-15 21:04     ` Mike Fedyk
  0 siblings, 0 replies; 14+ messages in thread
From: Mike Fedyk @ 2001-09-15 21:04 UTC
  To: linux-kernel

On Sat, Sep 15, 2001 at 04:49:22PM +0200, Kristian wrote:
> I'll try to use 'hdparm -d1 -X33 /dev/hda' and other modes to see if it occurs 
> again. But testing could take some time. It appears roughly every second day.
> 
> Kristian
> 

If it takes that long, then make sure you start with 'hdparm -d0 /dev/hda'.

Mike


* Re: ext2fs corruption again
  2001-09-15 23:51     ` Andreas Dilger
@ 2001-09-16 19:33       ` Pavel Machek
  0 siblings, 0 replies; 14+ messages in thread
From: Pavel Machek @ 2001-09-16 19:33 UTC
  To: linux-kernel

Hi!

> > Install crc loop device, and if disk does silent errors, you'll know.
> 
> Where do you store the CRCs?  It appears that they are written to another
> block device.  Also, how do you initialize the CRC table for an existing
> filesystem?

You just read the device in special mode. That forces it to recompute
crc-s.
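
Going by transfer_crc() in the patch posted in this thread, the "special mode" appears to be a loop device set up with a non-empty key (lo_encrypt_key_size != 0): a mismatch on read then rewrites the stored checksum instead of failing. A rough userspace model in Python follows. `ChecksumStore` is a hypothetical name, a dict stands in for the auxiliary block device, and zlib.crc32 is only a stand-in for the kernel's csum_partial_copy_nocheck().

```python
import zlib


class ChecksumStore:
    """Toy model of the read/write/repair decisions made by the crc loop transfer."""

    def __init__(self):
        self.sums = {}  # block number -> stored checksum (stand-in for the aux device)

    def transfer(self, cmd, block, data, repair_mode=False):
        """Return True if the transfer would succeed, False if it would fail."""
        cksum = zlib.crc32(data)  # stand-in checksum, not the kernel routine
        if cmd == "write":
            self.sums[block] = cksum      # writes always record the new checksum
            return True
        if self.sums.get(block) == cksum:
            return True                   # read matches what was recorded
        if repair_mode:
            self.sums[block] = cksum      # "special mode": adopt the current data
            return True
        return False                      # normal mode: report the mismatch
```

Reading every block once with repair_mode=True corresponds to initialising the checksum table for an existing filesystem; later reads in normal mode then flag any block whose contents changed without going through the loop device.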

> What would make this considerably more useful is to be able to write the
> CRCs into a regular file, as it would be a bit of a pain to have a partition
> for each CRC loop device to store the CRCs in.

With -o loop, you can turn a regular file into a block device ;-).

> Otherwise, it looks very useful, and could be handy in tracking down
> reports like this where it is unclear where the data corruption is.
								Pavel
-- 
I'm pavel@ucw.cz. "In my country we have almost anarchy and I don't care."
Panos Katsaloulis describing me w.r.t. patents at discuss@linmodems.org


* Re: ext2fs corruption again
  2001-09-15 22:19   ` Pavel Machek
@ 2001-09-15 23:51     ` Andreas Dilger
  2001-09-16 19:33       ` Pavel Machek
  0 siblings, 1 reply; 14+ messages in thread
From: Andreas Dilger @ 2001-09-15 23:51 UTC
  To: Pavel Machek; +Cc: linux-kernel

On Sep 16, 2001  00:19 +0200, Pavel Machek wrote:
> Install crc loop device, and if disk does silent errors, you'll know.

Where do you store the CRCs?  It appears that they are written to another
block device.  Also, how do you initialize the CRC table for an existing
filesystem?

What would make this considerably more useful is to be able to write the
CRCs into a regular file, as it would be a bit of a pain to have a partition
for each CRC loop device to store the CRCs in.

Otherwise, it looks very useful, and could be handy in tracking down
reports like this where it is unclear where the data corruption is.

Cheers, Andreas
-- 
Andreas Dilger  \ "If a man ate a pound of pasta and a pound of antipasto,
                 \  would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/               -- Dogbert



* Re: ext2fs corruption again
  2001-09-15 12:42 ` David Weinehall
  2001-09-15 17:19   ` David Rees
@ 2001-09-15 22:19   ` Pavel Machek
  2001-09-15 23:51     ` Andreas Dilger
  1 sibling, 1 reply; 14+ messages in thread
From: Pavel Machek @ 2001-09-15 22:19 UTC
  To: David Weinehall, Kristian; +Cc: linux-kernel

Hi!

> > Hello.
> > 
> > About 3 weeks ago I sent a report that I was getting very strange kernel
> > error messages.
> > 
> > I changed my harddrive to IBM 75 GB because someone said that IBM's 40 GB 
> > harddisks are not very stable.
> 
> Just to get it out of the way, can you open your computer and check
> what country the disk is manufactured in? There have been some complaints
> on this list about IBM-disks fabricated in Hungary.

Install crc loop device, and if disk does silent errors, you'll know.
								Pavel

--- clean/drivers/block/loop.c	Sun Jul  8 23:26:37 2001
+++ linux/drivers/block/loop.c	Sun Jul  8 23:08:02 2001
@@ -69,6 +69,7 @@
 #include <linux/slab.h>
 
 #include <asm/uaccess.h>
+#include <asm/checksum.h>
 
 #include <linux/loop.h>		
 
@@ -107,7 +108,6 @@
 		in = loop_buf;
 		out = raw_buf;
 	}
-
 	key = lo->lo_encrypt_key;
 	keysize = lo->lo_encrypt_key_size;
 	for (i = 0; i < size; i++)
@@ -115,11 +115,106 @@
 	return 0;
 }
 
-static int none_status(struct loop_device *lo, struct loop_info *info)
+#define ID printk(KERN_ERR "crc: info about (%s, %d, %d) ", kdevname(lo->lo_device), real_block, blksize);
+
+
+static int transfer_crc(struct loop_device *lo, int cmd, char *raw_buf,
+		  char *loop_buf, int size, int real_block)
 {
+	struct buffer_head *bh;
+	int blksize = 1024, nsect;	/* Size of block on auxiliary media */
+	int cksum;
+	u32 *data;
+	nsect = blksize / 4;
+
+	if (!lo->second_device) {
+		ID; printk( "reading from not-yet-setup crc device can result in armageddon. Don't try again.\n" );
+		return -1;
+	}
+	bh = getblk(lo->second_device, 1+real_block/nsect, blksize);
+	if (!bh) {
+		ID; printk( "getblk returned NULL.\n" );
+		return -1;
+	}
+	if (!buffer_uptodate(bh)) {
+		ll_rw_block(READ, 1, &bh);
+		wait_on_buffer(bh);
+		if (!buffer_uptodate(bh)) {
+			ID; printk(  "could not read block with CRC\n" );
+			goto error;
+		}
+	}
+
+	data = (u32 *) bh->b_data;
+	if (cmd == READ)
+		cksum = csum_partial_copy_nocheck(raw_buf, loop_buf, size, 0);
+	else
+		cksum = csum_partial_copy_nocheck(loop_buf, raw_buf, size, 0);
+
+	if (cmd == READ) {
+		if (le32_to_cpu(data[real_block%nsect]) != cksum) {
+			if (lo->lo_encrypt_key_size == 0) {	/* Normal mode */
+				ID; printk( "wrong checksum reading, is %x, should be %x\n", cksum, le32_to_cpu(data[real_block%nsect]) );
+				goto error;
+			} else { 
+				ID; printk( "wrong checksum repairing, setting to %x\n", cksum );
+				goto repair;
+			}
+		}
+	} else {
+	repair:
+		data[real_block%nsect] = cpu_to_le32(cksum);
+		mark_buffer_uptodate(bh, 1);
+		mark_buffer_dirty(bh);
+	}
+
+	brelse(bh);
 	return 0;
+error:
+	brelse(bh);
+	return -1;
+
 }
 
+static int ioctl_crc(struct loop_device *lo, int cmd, unsigned long arg)
+{
+	struct file	*file;
+	struct inode	*inode;
+	int error;
+
+	printk( "Entering ioctl_crc\n" );
+	if (cmd != LOOP_CRC_SET_FD)
+		return -EINVAL;
+
+	error = -EBADF;
+	file = fget(arg);
+	if (!file)
+		return error;
+
+	error = -EINVAL;
+	inode = file->f_dentry->d_inode;
+	if (!inode) {
+		printk(KERN_ERR "ioctl_crc: NULL inode?!?\n");
+		goto out;
+	}
+
+	if (S_ISBLK(inode->i_mode)) {
+		error = blkdev_open(inode, file);
+		lo->second_device = inode->i_rdev;
+		printk( "loop_crc: Registered device %x\n", lo->second_device );
+		return error;
+	} else {
+	out:
+		fput(file);
+		return -EINVAL;
+	}
+}
+
+static int none_status(struct loop_device *lo, struct loop_info *info)
+{
+	return 0;
+} 
+
 static int xor_status(struct loop_device *lo, struct loop_info *info)
 {
 	if (info->lo_encrypt_key_size <= 0)
@@ -139,10 +234,19 @@
 	init: xor_status
 }; 	
 
+struct loop_func_table crc_funcs = { 
+	number: LO_CRYPT_CRC,
+	transfer: transfer_crc,
+	init: none_status,
+	ioctl: ioctl_crc
+}; 	
+
 /* xfer_funcs[0] is special - its release function is never called */ 
 struct loop_func_table *xfer_funcs[MAX_LO_CRYPT] = {
 	&none_funcs,
-	&xor_funcs  
+	&xor_funcs,
+	NULL, NULL, NULL, NULL, NULL,
+	&crc_funcs,
 };
 
 #define MAX_DISK_SIZE 1024*1024*1024
@@ -728,6 +832,7 @@
 	lo->transfer = NULL;
 	lo->ioctl = NULL;
 	lo->lo_device = 0;
+	lo->second_device = 0;
 	lo->lo_encrypt_type = 0;
 	lo->lo_offset = 0;
 	lo->lo_encrypt_key_size = 0;
--- clean/include/linux/loop.h	Wed Aug 29 01:23:55 2001
+++ linux/include/linux/loop.h	Sun Aug 26 18:27:56 2001
@@ -28,6 +28,7 @@
 	int		lo_number;
 	int		lo_refcnt;
 	kdev_t		lo_device;
+	kdev_t		second_device;
 	int		lo_offset;
 	int		lo_encrypt_type;
 	int		lo_encrypt_key_size;
@@ -119,6 +120,7 @@
 #define LO_CRYPT_BLOW     4
 #define LO_CRYPT_CAST128  5
 #define LO_CRYPT_IDEA     6
+#define LO_CRYPT_CRC	  7
 #define LO_CRYPT_DUMMY    9
 #define LO_CRYPT_SKIPJACK 10
 #define MAX_LO_CRYPT	20
@@ -150,5 +152,6 @@
 #define LOOP_CLR_FD	0x4C01
 #define LOOP_SET_STATUS	0x4C02
 #define LOOP_GET_STATUS	0x4C03
+#define LOOP_CRC_SET_FD 0x4C04
 
 #endif
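
A worked example of where the patch above keeps each checksum, as a Python sketch. The constants come from transfer_crc() (blksize = 1024, four bytes per checksum, block 0 of the auxiliary device skipped); `checksum_location` and `aux_blocks_needed` are hypothetical helper names, and real_block is taken in whatever unit loop.c passes to the transfer function.

```python
BLKSIZE = 1024          # block size on the auxiliary device, as in the patch
NSECT = BLKSIZE // 4    # 32-bit checksums per auxiliary block (256)


def checksum_location(real_block):
    """Return (aux_block, slot, byte_offset) holding the checksum of real_block."""
    aux_block = 1 + real_block // NSECT   # matches getblk(..., 1+real_block/nsect, ...)
    slot = real_block % NSECT             # matches data[real_block%nsect]
    return aux_block, slot, slot * 4


def aux_blocks_needed(total_blocks):
    """Auxiliary blocks required to cover total_blocks, counting the skipped block 0."""
    return 1 + -(-total_blocks // NSECT)  # 1 + ceil(total_blocks / NSECT)
```

For instance, block 4215 from the logs in this thread would have its checksum in slot 119 of auxiliary block 17, and each auxiliary block covers 256 data blocks, so the second device can be far smaller than the one being checked.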


-- 
I'm pavel@ucw.cz. "In my country we have almost anarchy and I don't care."
Panos Katsaloulis describing me w.r.t. patents at discuss@linmodems.org


* Re: ext2fs corruption again
  2001-09-15 17:19   ` David Rees
@ 2001-09-15 17:55     ` Kristian Peters
  0 siblings, 0 replies; 14+ messages in thread
From: Kristian Peters @ 2001-09-15 17:55 UTC
  To: David Rees; +Cc: linux-kernel

David Rees wrote:
> It's not just the disks made in Hungary, I've had 3 IBM drives go bad on me
> in the last week after 3-4 months of operation: 2 15GB 75GXPs made in
> Thailand (bad sectors), 1 40GB 40GV also made in Thailand (started making
> bad scratching noise, BIOS wouldn't detect it after that).  Still have a
> number of the 75GXPs in service, but I'm keeping my eye on them.
> 
> Kristian's problem looks like it could be hardware problems of some sort
> leading to corruption.

I think so. Someone sent me a link describing how the 75 GB drives in particular 
cause such severe corruption.

But these drives seem to have these errors from the beginning. I only took the 
75 GB drive out of its packaging yesterday. Mostly the errors occurred when the 
disk had been totally off for a moment, and only on my root partition.

Is it possible to find out which file currently owns a specific inode?

Thanks anyway.
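
On the inode question: existing tools can already do this (`find / -xdev -inum N` on a mounted filesystem, or debugfs with its ncheck command on the device). The same idea as a minimal Python sketch, where `find_by_inode` is a hypothetical helper that walks one filesystem and compares inode numbers:

```python
import os


def _same_dev(path, dev):
    """True if path can be stat'ed and lives on device dev."""
    try:
        return os.lstat(path).st_dev == dev
    except OSError:
        return False


def find_by_inode(root, inode):
    """Return all paths below root, on root's filesystem, with the given inode number."""
    root_dev = os.lstat(root).st_dev
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into other mounted filesystems (like find -xdev).
        dirnames[:] = [d for d in dirnames
                       if _same_dev(os.path.join(dirpath, d), root_dev)]
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # entry vanished or is unreadable
            if st.st_dev == root_dev and st.st_ino == inode:
                matches.append(path)
    return matches
```

Several paths can come back for one inode when the file has hard links. A call such as find_by_inode("/", 832170) would be one way to confirm a mapping like the wtmp one quoted earlier in the thread, assuming it runs on the affected machine.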

Kristian
·· · · reach me :: · ·· ·· ·  · ·· · ··  · ··· · ·
                          :: http://www.korseby.net
                          :: http://www.tomlab.de
kristian@korseby.net ....::




* Re: ext2fs corruption again
  2001-09-15 12:42 ` David Weinehall
@ 2001-09-15 17:19   ` David Rees
  2001-09-15 17:55     ` Kristian Peters
  2001-09-15 22:19   ` Pavel Machek
  1 sibling, 1 reply; 14+ messages in thread
From: David Rees @ 2001-09-15 17:19 UTC
  To: linux-kernel

On Sat, Sep 15, 2001 at 02:42:36PM +0200, David Weinehall wrote:
> On Sat, Sep 15, 2001 at 10:46:36AM +0200, Kristian wrote:
> > Hello.
> > 
> > About 3 weeks ago I sent a report that I was getting very strange kernel
> > error messages.
> > 
> > I changed my harddrive to IBM 75 GB because someone said that IBM's 40 GB 
> > harddisks are not very stable.
> 
> Just to get it out of the way, can you open your computer and check
> what country the disk is manufactured in? There have been some complaints
> on this list about IBM-disks fabricated in Hungary.

It's not just the disks made in Hungary, I've had 3 IBM drives go bad on me
in the last week after 3-4 months of operation: 2 15GB 75GXPs made in
Thailand (bad sectors), 1 40GB 40GV also made in Thailand (started making
bad scratching noise, BIOS wouldn't detect it after that).  Still have a
number of the 75GXPs in service, but I'm keeping my eye on them.

Kristian's problem looks like it could be hardware problems of some sort
leading to corruption.

-Dave


* Re: ext2fs corruption again
  2001-09-15  8:46 Kristian
@ 2001-09-15 12:42 ` David Weinehall
  2001-09-15 17:19   ` David Rees
  2001-09-15 22:19   ` Pavel Machek
  0 siblings, 2 replies; 14+ messages in thread
From: David Weinehall @ 2001-09-15 12:42 UTC
  To: Kristian; +Cc: linux-kernel

On Sat, Sep 15, 2001 at 10:46:36AM +0200, Kristian wrote:
> Hello.
> 
> About 3 weeks ago I sent a report that I was getting very strange kernel
> error messages.
> 
> I changed my harddrive to IBM 75 GB because someone said that IBM's 40 GB 
> harddisks are not very stable.

Just to get it out of the way, can you open your computer and check
what country the disk is manufactured in? There have been some complaints
on this list about IBM-disks fabricated in Hungary.


/David Weinehall
  _                                                                 _
 // David Weinehall <tao@acc.umu.se> /> Northern lights wander      \\
//  Project MCA Linux hacker        //  Dance across the winter sky //
\>  http://www.acc.umu.se/~tao/    </   Full colour fire           </


* ext2fs corruption again
@ 2001-09-15 10:44 Kristian
  0 siblings, 0 replies; 14+ messages in thread
From: Kristian @ 2001-09-15 10:44 UTC (permalink / raw)
  To: linux-kernel

Hello.

For about 3 weeks I sent a report that I've got very strange kernel error messages.

I changed my harddrive to IBM 75 GB because someone said that IBM's 40 GB
harddisks are not very stable.

Today I've got these from the kernel (with the new hd):

Sep 15 10:01:58 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4215
Sep 15 10:01:58 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4217
Sep 15 10:01:59 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4234
Sep 15 10:01:59 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4236
Sep 15 10:01:59 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4239
Sep 15 10:02:03 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4847
Sep 15 10:02:03 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4848
Sep 15 10:02:03 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4852
Sep 15 10:02:03 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4855
Sep 15 10:02:06 adlib kernel: EXT2-fs error (device ide0(3,5)): ext2_new_block:
Allocating block in system zone - block = 174
Sep 15 10:02:06 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: Freeing blocks in system zones - Block = 179, count = 3
Sep 15 10:02:09 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4839

Then I did an e2fsck on that device (hda5) and the errors occured after the
check (and a complete reboot) again:

Sep 15 10:10:38 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4163
Sep 15 10:10:38 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4166
Sep 15 10:10:38 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4131
Sep 15 10:10:38 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4132
Sep 15 10:10:38 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4155
Sep 15 10:10:38 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4156
Sep 15 10:10:38 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4157
Sep 15 10:10:38 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4161
Sep 15 10:10:38 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 4162
Sep 15 10:10:43 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 716
Sep 15 10:10:43 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 717
Sep 15 10:10:43 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 720
Sep 15 10:10:43 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 723
Sep 15 10:10:43 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 724
Sep 15 10:10:43 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 725
Sep 15 10:10:43 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 726
Sep 15 10:10:43 adlib kernel: EXT2-fs error (device ide0(3,5)):
ext2_free_blocks: bit already cleared for block 727

I've written down all e2fsck messages by hand. ;-) And I compared them.

The following messages from e2fsck are always the same, on both the old and
the new hd. Here they are:

Duplicate/bad block(s) in inode:  97: 643
Duplicate/bad block(s) in inode: 100: 649
Duplicate/bad block(s) in inode: 101: 650 651
Duplicate/bad block(s) in inode: 102: 652
Duplicate/bad block(s) in inode: 103: 653 656
Duplicate/bad block(s) in inode: 104: 659 660
Duplicate/bad block(s) in inode: 105: 661 662 663 664 665 666
Duplicate/bad block(s) in inode: 106: 667 668
Duplicate/bad block(s) in inode: 107: 669 671
Duplicate/bad block(s) in inode: 108: 672 673 674
Duplicate/bad block(s) in inode: 110: 678

Blocks 643-678 (in inodes 97-110) are always connected to faults.
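
The range can also be checked without copying numbers by hand. What follows is only a sketch, not part of the original report: it assumes each e2fsck line above has the shape "inode: <inode number>: <block numbers>", and the names parse_dup_blocks and LINE_RE are made up for illustration.

```python
import re

# Assumed line shape, taken from the transcribed messages:
#   "... in inode: <inode>: <block> <block> ..."
# The pattern accepts the hand-typed "bock" spelling as well as "block".
LINE_RE = re.compile(r"Duplicate/bad bl?ock\(s\) in inode:\s*(\d+):\s*([\d ]+)")

def parse_dup_blocks(text):
    """Return {inode: [block, ...]} for every matching line in text."""
    result = {}
    for match in LINE_RE.finditer(text):
        inode = int(match.group(1))
        blocks = [int(b) for b in match.group(2).split()]
        result.setdefault(inode, []).extend(blocks)
    return result

# Three of the lines from the report, used as sample input.
sample = """\
Duplicate/bad bock(s) in inode:  97: 643
Duplicate/bad bock(s) in inode: 101: 650 651
Duplicate/bad bock(s) in inode: 110: 678
"""

by_inode = parse_dup_blocks(sample)
all_blocks = sorted(b for blocks in by_inode.values() for b in blocks)
print(by_inode)
print(min(all_blocks), max(all_blocks))
```

Running the same function over the e2fsck output from the old and the new disk and comparing the two dictionaries would show whether the affected numbers really are identical.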

The following files are always involved in these errors:
/var/log/wtmp
/var/log/messages

The old hd was hda: IBM-DTLA-305040, ATA DISK drive. The new one is hda:
IBM-DTLA-307075, ATA DISK drive.

hdparm says:
   Model=IBM-DTLA-307075, FwRev=TXAOA50C, SerialNo=YSDYSFN9998
   Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
   RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=40
   BuffType=DualPortCache, BuffSize=1916kB, MaxMultSect=16, MultSect=8
   CurCHS=17475/15/63, CurSects=-78446341, LBA=yes, LBAsects=150136560
   IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
   PIO modes: pio0 pio1 pio2 pio3 pio4
   DMA modes: mdma0 mdma1 mdma2 udma0 udma1 *udma2
   AdvancedPM=yes: disabled (255)
   Drive Supports : ATA/ATAPI-5 T13 1321D revision 1 : ATA-2 ATA-3 ATA-4 ATA-5

I currently use linux 2.4.9, e2fsprogs 1.23, fileutils-4.1 and a modified
RedHat 6.2. These errors have only occurred with linux >= 2.4.5-ac11.

I would say this is definitely an error in ext2!

Kristian

·· · · reach me :: · ·· ·· ·  · ·· · ··  · ··· · ·
                           :: http://www.korseby.net
                           :: http://www.tomlab.de
kristian@korseby.net ....::




* ext2fs corruption again
@ 2001-09-15  8:46 Kristian
  2001-09-15 12:42 ` David Weinehall
  0 siblings, 1 reply; 14+ messages in thread
From: Kristian @ 2001-09-15  8:46 UTC (permalink / raw)
  To: linux-kernel

Hello.

About 3 weeks ago I sent a report that I was getting very strange kernel error messages.

I replaced my hard drive with an IBM 75 GB model because someone said that
IBM's 40 GB hard disks are not very stable.

Today I got these from the kernel (with the new hd):

Sep 15 10:01:58 adlib kernel: EXT2-fs error (device ide0(3,5)): 
ext2_free_blocks: bit already cleared for block 4215
Sep 15 10:01:58 adlib kernel: EXT2-fs error (device ide0(3,5)): 
ext2_free_blocks: bit already cleared for block 4217
Sep 15 10:01:59 adlib kernel: EXT2-fs error (device ide0(3,5)): 
ext2_free_blocks: bit already cleared for block 4234
Sep 15 10:01:59 adlib kernel: EXT2-fs error (device ide0(3,5)): 
ext2_free_blocks: bit already cleared for block 4236
Sep 15 10:01:59 adlib kernel: EXT2-fs error (device ide0(3,5)): 
ext2_free_blocks: bit already cleared for block 4239
Sep 15 10:02:03 adlib kernel: EXT2-fs error (device ide0(3,5)): 
ext2_free_blocks: bit already cleared for block 4847
Sep 15 10:02:03 adlib kernel: EXT2-fs error (device ide0(3,5)): 
ext2_free_blocks: bit already cleared for block 4848
Sep 15 10:02:03 adlib kernel: EXT2-fs error (device ide0(3,5)): 
ext2_free_blocks: bit already cleared for block 4852
Sep 15 10:02:03 adlib kernel: EXT2-fs error (device ide0(3,5)): 
ext2_free_blocks: bit already cleared for block 4855
Sep 15 10:02:06 adlib kernel: EXT2-fs error (device ide0(3,5)): ext2_new_block: 
Allocating block in system zone - block = 174
Sep 15 10:02:06 adlib kernel: EXT2-fs error (device ide0(3,5)): 
ext2_free_blocks: Freeing blocks in system zones - Block = 179, count = 3
Sep 15 10:02:09 adlib kernel: EXT2-fs error (device ide0(3,5)): 
ext2_free_blocks: bit already cleared for block 4839

Then I ran e2fsck on that device (hda5), and after the check (and a complete
reboot) the errors occurred again:

Sep 15 10:10:38 adlib kernel: EXT2-fs error (device ide0(3,5)): 
ext2_free_blocks: bit already cleared for block 4163
Sep 15 10:10:38 adlib kernel: EXT2-fs error (device ide0(3,5)): 
ext2_free_blocks: bit already cleared for block 4166
Sep 15 10:10:38 adlib kernel: EXT2-fs error (device ide0(3,5)): 
ext2_free_blocks: bit already cleared for block 4131
Sep 15 10:10:38 adlib kernel: EXT2-fs error (device ide0(3,5)): 
ext2_free_blocks: bit already cleared for block 4132
Sep 15 10:10:38 adlib kernel: EXT2-fs error (device ide0(3,5)): 
ext2_free_blocks: bit already cleared for block 4155
Sep 15 10:10:38 adlib kernel: EXT2-fs error (device ide0(3,5)): 
ext2_free_blocks: bit already cleared for block 4156
Sep 15 10:10:38 adlib kernel: EXT2-fs error (device ide0(3,5)): 
ext2_free_blocks: bit already cleared for block 4157
Sep 15 10:10:38 adlib kernel: EXT2-fs error (device ide0(3,5)): 
ext2_free_blocks: bit already cleared for block 4161
Sep 15 10:10:38 adlib kernel: EXT2-fs error (device ide0(3,5)): 
ext2_free_blocks: bit already cleared for block 4162
Sep 15 10:10:43 adlib kernel: EXT2-fs error (device ide0(3,5)): 
ext2_free_blocks: bit already cleared for block 716
Sep 15 10:10:43 adlib kernel: EXT2-fs error (device ide0(3,5)): 
ext2_free_blocks: bit already cleared for block 717
Sep 15 10:10:43 adlib kernel: EXT2-fs error (device ide0(3,5)): 
ext2_free_blocks: bit already cleared for block 720
Sep 15 10:10:43 adlib kernel: EXT2-fs error (device ide0(3,5)): 
ext2_free_blocks: bit already cleared for block 723
Sep 15 10:10:43 adlib kernel: EXT2-fs error (device ide0(3,5)): 
ext2_free_blocks: bit already cleared for block 724
Sep 15 10:10:43 adlib kernel: EXT2-fs error (device ide0(3,5)): 
ext2_free_blocks: bit already cleared for block 725
Sep 15 10:10:43 adlib kernel: EXT2-fs error (device ide0(3,5)): 
ext2_free_blocks: bit already cleared for block 726
Sep 15 10:10:43 adlib kernel: EXT2-fs error (device ide0(3,5)): 
ext2_free_blocks: bit already cleared for block 727

I've written down all e2fsck messages by hand. ;-) And I compared them.

The following messages from e2fsck are always the same, on both the old and
the new hd. Here they are:

Duplicate/bad block(s) in inode:  97: 643
Duplicate/bad block(s) in inode: 100: 649
Duplicate/bad block(s) in inode: 101: 650 651
Duplicate/bad block(s) in inode: 102: 652
Duplicate/bad block(s) in inode: 103: 653 656
Duplicate/bad block(s) in inode: 104: 659 660
Duplicate/bad block(s) in inode: 105: 661 662 663 664 665 666
Duplicate/bad block(s) in inode: 106: 667 668
Duplicate/bad block(s) in inode: 107: 669 671
Duplicate/bad block(s) in inode: 108: 672 673 674
Duplicate/bad block(s) in inode: 110: 678

Blocks 643-678 (in inodes 97-110) are always connected to faults.

The following files are always involved in these errors:
/var/log/wtmp
/var/log/messages

The old hd was hda: IBM-DTLA-305040, ATA DISK drive. The new one is hda:
IBM-DTLA-307075, ATA DISK drive.

hdparm says:
  Model=IBM-DTLA-307075, FwRev=TXAOA50C, SerialNo=YSDYSFN9998
  Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
  RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=40
  BuffType=DualPortCache, BuffSize=1916kB, MaxMultSect=16, MultSect=8
  CurCHS=17475/15/63, CurSects=-78446341, LBA=yes, LBAsects=150136560
  IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
  PIO modes: pio0 pio1 pio2 pio3 pio4
  DMA modes: mdma0 mdma1 mdma2 udma0 udma1 *udma2
  AdvancedPM=yes: disabled (255)
  Drive Supports : ATA/ATAPI-5 T13 1321D revision 1 : ATA-2 ATA-3 ATA-4 ATA-5

I currently use linux 2.4.9, e2fsprogs 1.23, fileutils-4.1 and a modified
RedHat 6.2. These errors have only occurred with linux >= 2.4.5-ac11.

I would say this is definitely an error in ext2!

Kristian

·· · · reach me :: · ·· ·· ·  · ·· · ··  · ··· · ·
                          :: http://www.korseby.net
                          :: http://www.tomlab.de
kristian@korseby.net ....::



Thread overview: 14+ messages
2001-09-15 11:14 ext2fs corruption again Kristian Peters
2001-09-15 12:21 ` Roeland Th. Jansen
2001-09-15 14:49   ` Kristian
2001-09-15 21:04     ` Mike Fedyk
2001-09-15 15:48 ` Mohammad A. Haque
2001-09-15 16:19   ` Kristian Peters
  -- strict thread matches above, loose matches on Subject: below --
2001-09-15 10:44 Kristian
2001-09-15  8:46 Kristian
2001-09-15 12:42 ` David Weinehall
2001-09-15 17:19   ` David Rees
2001-09-15 17:55     ` Kristian Peters
2001-09-15 22:19   ` Pavel Machek
2001-09-15 23:51     ` Andreas Dilger
2001-09-16 19:33       ` Pavel Machek
