* Something corrupts raid5 disks slightly during reboot
@ 2003-10-31 19:08 Ville Herva
  2003-11-01  1:41 ` Jeffrey E. Hundstad
  0 siblings, 1 reply; 19+ messages in thread
From: Ville Herva @ 2003-10-31 19:08 UTC
To: linux-kernel

I've been experiencing strange corruption on a raid5 volume for some time.
Basically, after unmounting the filesystem, I can mount it again without
problems. I can also raidstop the raid device in between and all is still
fine:

  > umount /dev/md4; mount /dev/md4
  - no corruption
  > umount /dev/md4; raidstop /dev/md4; raidstart /dev/md4; mount /dev/md4
  - no corruption

But after a reboot, the filesystem is corrupted:

  > mount /dev/md4
  EXT2-fs error (device md(9,4)): ext2_check_descriptors: Block bitmap for
  group 17 not in group (block 0)!
  EXT2-fs: group descriptors corrupted !

(This is recoverable with e2fsck.)

The array consists of three 80GB Samsung disks in raid5 mode, but I
experienced this problem with two of the disks in raid0 mode, too. The raid
consists of raw disks hdb,hdc,hdg (rather than partitions hdb1,hdc1,hdg1).

On the same box I have three other raid arrays on different disks, all of
which consist of partitions. These do not show corruption on boot.

I made a little experiment and saved the first megabyte of hd[bcg] between
umount,mount and umount,raidstop,raidstart,mount operations. They did not
change.

Then I did umount,raidstop and rebooted. After boot, the beginning of hdb
was intact, but hdc and hdg had been tampered with. (Unfortunately,
raidstart was automatically run on boot, but I did raidstop as the first
thing.)

I narrowed the difference down to bytes between 1060-1080 on hdc:

root@linux:/scratch>od -x hdc_bytes-1060-1080_before_boot
0000000 1e1e 00d0 000d 00d0 752e 4264 7714 3fa2
0000020 0002 0014
root@linux:/scratch>od -x hdc_bytes-1060-1080_after_boot
0000000 1e1e 00d0 000d 00d0 75ff 4264 7427 3fa2
0000020 0003 0014

On hdg, this range differed too:

root@linux:/scratch>od -x hdg_bytes-1060-1080_before_boot
0000000 8000 0000 8000 0000 7526 3fa2 7539 3fa2
0000020 0002 0014
root@linux:/scratch>od -x hdg_bytes-1060-1080_after_boot
0000000 8000 0000 8000 0000 75f7 3fa2 760a 3fa2
0000020 0003 0014

But there was an additional difference somewhere between 1kB and 5kB that
wasn't there on hdc.

When I copied the saved 1MB blocks back in place, the fs mounted without
problems.

AFAIK, the first 512 bytes on each disk should be the raid superblock and
the next 512 may be the ext2 superblock. I assume 1060-1080 falls into the
group descriptor table that gets corrupted.

It may be something in userspace that corrupts the disks, but I cannot think
what it could be.

Right now, the kernel is 2.2.25-secure + patches, but earlier 2.2.x kernels
exhibited this as well. These include the newest raid 0.90 patches for 2.2.

Any ideas what might cause this or how to debug this further?


-- v --

v@iki.fi
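
Editorial sketch, not part of the original message: one way to read the two
hdg dumps above is against the standard ext2 superblock layout. This assumes
(and these are assumptions, not facts from the thread) that the start of hdg
maps directly to the start of the filesystem, and that the dump covers
0-based disk offsets 1060..1079, i.e. superblock offsets 36..55, where ext2
keeps s_frags_per_group, s_inodes_per_group, s_mtime, s_wtime, s_mnt_count
and s_max_mnt_count. The helper names are hypothetical. Under that reading
the window would lie in the superblock rather than the group descriptor
table, and the changed words would be two timestamps from 2003-10-31 and a
mount count going from 2 to 3.

```shell
#!/bin/bash
# od -x prints 16-bit words in host order (little-endian on x86), so a
# 32-bit little-endian field appears as "low word, high word".
u32() { echo $(( 0x$2 * 65536 + 0x$1 )); }
u16() { echo $(( 0x$1 )); }

# decode_window W1 .. W10: the ten od -x words of one 20-byte dump.
# Field names follow the ext2 superblock layout; treating this window as
# superblock offsets 36..55 is an assumption.
decode_window() {
  echo "s_frags_per_group=$(u32 "$1" "$2")"
  echo "s_inodes_per_group=$(u32 "$3" "$4")"
  echo "s_mtime=$(u32 "$5" "$6")"
  echo "s_wtime=$(u32 "$7" "$8")"
  echo "s_mnt_count=$(u16 "$9")"
  echo "s_max_mnt_count=$(u16 "${10}")"
}

echo "first hdg dump:"
decode_window 8000 0000 8000 0000 7526 3fa2 7539 3fa2 0002 0014
echo "second hdg dump:"
decode_window 8000 0000 8000 0000 75f7 3fa2 760a 3fa2 0003 0014
```

If the assumption holds, the difference between the two dumps would be what
a mount of the filesystem writes. It would not by itself explain the
descriptor error reported by the kernel.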
* Re: Something corrupts raid5 disks slightly during reboot
  2003-10-31 19:08 Something corrupts raid5 disks slightly during reboot Ville Herva
@ 2003-11-01  1:41 ` Jeffrey E. Hundstad
  2003-11-01  1:57   ` Mike Fedyk
  2003-11-01  8:27   ` ide write cache issue? [Re: Something corrupts raid5 disks slightly during reboot] Ville Herva
  0 siblings, 2 replies; 19+ messages in thread
From: Jeffrey E. Hundstad @ 2003-11-01  1:41 UTC
To: Ville Herva; +Cc: linux-kernel

Try:

hdparm -W0 /dev/hdX

for each of your ide drives. This turns off write-caching which is
usually a bad thing with ide drives anyway.

Ville Herva wrote:

> I've been experiencing strange corruption on a raid5 volume for some time.
> Basically, after unmounting the filesystem, I can mount it again without
> problems. I can also raidstop the raid device in between and all is still
> fine:
>
> > umount /dev/md4; mount /dev/md4
>   - no corruption
> > umount /dev/md4; raidstop /dev/md4; raidstart /dev/md4; mount /dev/md4
>   - no corruption
>
> But after a reboot, the filesystem is corrupted:
>
> > mount /dev/md4
>   EXT2-fs error (device md(9,4)): ext2_check_descriptors: Block bitmap for
>   group 17 not in group (block 0)!
>   EXT2-fs: group descriptors corrupted !
>
> (This is recoverable with e2fsck.)
>
> The array consists of three 80GB Samsung disks in raid5 mode, but I
> experienced this problem with two of the disks in raid0 mode, too. The raid
> consists of raw disks hdb,hdc,hdg (rather than partitions hdb1,hdc1,hdg1).
>
> On the same box I have three other raid arrays on different disks, all of
> which consist of partitions. These do not show corruption on boot.
>
> I made a little experiment and saved the first megabyte of hd[bcg] between
> umount,mount and umount,raidstop,raidstart,mount operations. They did not
> change.
>
> Then I did umount,raidstop and rebooted. After boot, the beginning of hdb
> was intact, but hdc and hdg had been tampered with.
> (Unfortunately, raidstart was automatically run on boot, but I did
> raidstop as the first thing.)
>
> I narrowed the difference down to bytes between 1060-1080 on hdc:
>
> root@linux:/scratch>od -x hdc_bytes-1060-1080_before_boot
> 0000000 1e1e 00d0 000d 00d0 752e 4264 7714 3fa2
> 0000020 0002 0014
> root@linux:/scratch>od -x hdc_bytes-1060-1080_after_boot
> 0000000 1e1e 00d0 000d 00d0 75ff 4264 7427 3fa2
> 0000020 0003 0014
>
> On hdg, this range differed too:
>
> root@linux:/scratch>od -x hdg_bytes-1060-1080_before_boot
> 0000000 8000 0000 8000 0000 7526 3fa2 7539 3fa2
> 0000020 0002 0014
> root@linux:/scratch>od -x hdg_bytes-1060-1080_after_boot
> 0000000 8000 0000 8000 0000 75f7 3fa2 760a 3fa2
> 0000020 0003 0014
>
> But there was an additional difference somewhere between 1kB and 5kB that
> wasn't there on hdc.
>
> When I copied the saved 1MB blocks back in place, the fs mounted without
> problems.
>
> AFAIK, the first 512 bytes on each disk should be the raid superblock and
> the next 512 may be the ext2 superblock. I assume 1060-1080 falls into the
> group descriptor table that gets corrupted.
>
> It may be something in userspace that corrupts the disks, but I cannot think
> what it could be.
>
> Right now, the kernel is 2.2.25-secure + patches, but earlier 2.2.x kernels
> exhibited this as well. These include the newest raid 0.90 patches for 2.2.
>
> Any ideas what might cause this or how to debug this further?
>
>
> -- v --
>
> v@iki.fi
* Re: Something corrupts raid5 disks slightly during reboot
  2003-11-01  1:41 ` Jeffrey E. Hundstad
@ 2003-11-01  1:57   ` Mike Fedyk
  2003-11-01  8:33     ` Ville Herva
  2003-11-01  8:27   ` ide write cache issue? [Re: Something corrupts raid5 disks slightly during reboot] Ville Herva
  1 sibling, 1 reply; 19+ messages in thread
From: Mike Fedyk @ 2003-11-01  1:57 UTC
To: Jeffrey E. Hundstad; +Cc: Ville Herva, linux-kernel

On Fri, Oct 31, 2003 at 07:41:30PM -0600, Jeffrey E. Hundstad wrote:
> Try:
>
> hdparm -W0 /dev/hdX
>
> for each of your ide drives. This turns off write-caching which is
> usually a bad thing with ide drives anyway.
>

Also try installing smartmontools, and run smartctl -a on each of the
drives. It might tell you one of the drives is going bad...
* Re: Something corrupts raid5 disks slightly during reboot
  2003-11-01  1:57   ` Mike Fedyk
@ 2003-11-01  8:33     ` Ville Herva
  0 siblings, 0 replies; 19+ messages in thread
From: Ville Herva @ 2003-11-01  8:33 UTC
To: linux-kernel

On Fri, Oct 31, 2003 at 05:57:33PM -0800, you [Mike Fedyk] wrote:
> On Fri, Oct 31, 2003 at 07:41:30PM -0600, Jeffrey E. Hundstad wrote:
> > Try:
> >
> > hdparm -W0 /dev/hdX
> >
> > for each of your ide drives. This turns off write-caching which is
> > usually a bad thing with ide drives anyway.
> >
>
> Also try installing smartmontools, and run smartctl -a on each of the
> drives. It might tell you one of the drives is going bad...

I am monitoring all my drives with smart constantly, and they haven't shown
any symptoms. The corruption only happens upon reboot, which is quite a
rare event for a server.

Also, I find that smart rarely gives many useful warnings beforehand when a
drive is about to fail. And when the drive fails I usually get a good dose
of UncorrectableErrors in the log, not silent corruption (and I've seen a
lot of drives fail ;( ). Silent corruption is usually caused by the chipset
or driver (seen that, too ;( ), but it has usually happened under stress,
not when nothing much is being written to the drive.


-- v --

v@iki.fi
* ide write cache issue? [Re: Something corrupts raid5 disks slightly during reboot]
  2003-11-01  1:41 ` Jeffrey E. Hundstad
  2003-11-01  1:57   ` Mike Fedyk
@ 2003-11-01  8:27   ` Ville Herva
  2003-11-01 15:56     ` Willy Tarreau
  1 sibling, 1 reply; 19+ messages in thread
From: Ville Herva @ 2003-11-01  8:27 UTC
To: Jeffrey E. Hundstad; +Cc: linux-kernel

On Fri, Oct 31, 2003 at 07:41:30PM -0600, you [Jeffrey E. Hundstad] wrote:
> Try:
>
> hdparm -W0 /dev/hdX
>
> for each of your ide drives. This turns off write-caching which is
> usually a bad thing with ide drives anyway.

According to hdparm, write caching is indeed enabled for all the drives.

I would find it somewhat odd if this were the cause, though. Before reboot,
the drives had not been written to for quite a while (the fs had been
unmounted and the raid array had been stopped.)

I suppose it _is_ possible that the drives were updating the ext2 superblock
from their write cache when power went off. The md5sum of the first 1MB of
the drives was probably in sync before reboot because I got it from the
kernel's cache (or the drive's cache), although the up-to-date data had not
been written onto the platter yet. Also, as this is a raid5 array, one of
the drives could have been clean because the ext2 superblock (that I assume
was being updated) is physically located on only two of the drives.

I can try to turn off write caching well before the next reboot. I don't
suppose there is a way to boot so that write caching would be off all the
time - the best I can do is turn it off early in boot scripts, no?

Does anyone know if there is a crucial write caching / flushing fix in
2.4/2.6 that hasn't been merged into 2.2 (I am using the newest 2.4 ide
backport from Krzysztof Olędzki (ide-2.2.21-06162002)).

I don't suppose there is a way to explicitly flush the IDE drive write
cache from user space?

Or is this likely to be a drive firmware problem (kernel tries to flush the
drives, but they don't do it early enough?)
How long do ide drives normally hold data in write cache if they are idle?
The drives are SAMSUNG SV8004H, FwRev=QR100-07, fwiw.

Turning off write caching permanently doesn't sound inviting though, as
it'll probably ruin the raid performance completely...


-- v --

v@iki.fi
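
Editorial sketch, not part of the original message: the "turn it off early
in boot scripts" idea could look roughly like the fragment below. Only the
`hdparm -W0` invocation comes from the thread; the function name, the
overridable HDPARM variable and the choice of drives are assumptions made
for illustration.

```shell
#!/bin/bash
# HDPARM can be overridden (e.g. for a dry run); defaults to the real tool.
HDPARM=${HDPARM:-hdparm}

# disable_write_cache DEV...: issue "hdparm -W0" for each given drive and
# warn, without aborting, if a drive rejects the request.
disable_write_cache() {
  local dev
  for dev in "$@"; do
    "$HDPARM" -W0 "$dev" \
      || echo "warning: could not disable write cache on $dev" >&2
  done
}

# Example call for an early boot script, run as root (commented out here
# so that the fragment has no side effects when sourced):
#   disable_write_cache /dev/hdb /dev/hdc /dev/hdg
```

Placing the call before any filesystem on those drives is mounted read-write
would keep the window with caching enabled as short as a boot script can
make it.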
* Re: ide write cache issue? [Re: Something corrupts raid5 disks slightly during reboot]
  2003-11-01  8:27   ` ide write cache issue? [Re: Something corrupts raid5 disks slightly during reboot] Ville Herva
@ 2003-11-01 15:56     ` Willy Tarreau
  2003-11-01 18:25       ` Ville Herva
  0 siblings, 1 reply; 19+ messages in thread
From: Willy Tarreau @ 2003-11-01 15:56 UTC
To: Ville Herva, linux-kernel

Hi Ville,

do you have the ability to reboot this beast on a DOS floppy equipped with a
disk editor or even debug ? It would tell you whether it's the IDE
initialization or shutdown which harms the disks.

BTW, it may even be your bios which believes for an unknown reason that it
has to write to the partition table which is not one.

just my 2 cents,
Willy
* Re: ide write cache issue? [Re: Something corrupts raid5 disks slightly during reboot]
  2003-11-01 15:56     ` Willy Tarreau
@ 2003-11-01 18:25       ` Ville Herva
  2003-11-01 19:01         ` Willy Tarreau
  0 siblings, 1 reply; 19+ messages in thread
From: Ville Herva @ 2003-11-01 18:25 UTC
To: Willy Tarreau; +Cc: linux-kernel

On Sat, Nov 01, 2003 at 04:56:04PM +0100, you [Willy Tarreau] wrote:
> Hi Ville,
>
> do you have the ability to reboot this beast on a DOS floppy equipped with a
> disk editor or even debug ?

I have been planning (as someone else suggested) to boot to a different
kernel, but unfortunately I think my off-the-shelf solution, knoppix, won't
do as it probably includes raid autodetection in its kernel, and I'd rather
rule raidstart out as well.

Is there anything special in booting to DOS instead of a different linux
kernel, other than that it would rule out some strange kernel bug that is
present in 2.2 and 2.4?

> BTW, it may even be your bios which believes for an unknown reason that it
> has to write to the partition table which is not one.

Yes, but I find it unlikely. The partition table is within the first 512
bytes and the corruption was in bytes 1060-1080. Also, one of the corrupted
disks is on i815 and another is on HPT370.

BTW: the corruption happens on warm reboots (running reboot command), not
just on power off / on.


-- v --

v@iki.fi
* Re: ide write cache issue? [Re: Something corrupts raid5 disks slightly during reboot]
  2003-11-01 18:25       ` Ville Herva
@ 2003-11-01 19:01         ` Willy Tarreau
  2003-11-01 21:02           ` Ville Herva
  2004-01-02 19:42           ` Something corrupts raid5 disks slightly during reboot Ville Herva
  0 siblings, 2 replies; 19+ messages in thread
From: Willy Tarreau @ 2003-11-01 19:01 UTC
To: Ville Herva, linux-kernel

On Sat, Nov 01, 2003 at 08:25:18PM +0200, Ville Herva wrote:

> Is there anything special in booting to DOS instead of a different linux
> kernel, other than that it would rule out some strange kernel bug that is
> present in 2.2 and 2.4?

No, it was just to quickly confirm or deny the fact that it's the kernel
which causes the problem. It could have been a long standing bug in the IDE
or partition code, and which is present in several kernels. But as you say
that it affects two different controllers, there's little chance that it's
caused by anything except linux itself. Then, the reboot on DOS will only
tell you if the drives were corrupted at startup or at shutdown.

> Yes, but I find it unlikely. The partition table is within the first 512
> bytes and the corruption was in bytes 1060-1080. Also, one of the corrupted
> disks is on i815 and another is on HPT370.

I agree, but I proposed it just because it was simple to test.

> BTW: the corruption happens on warm reboots (running reboot command), not
> just on power off / on.

OK, but the BIOS scans your disks even during warm reboots. Though I don't
think it comes from there because of your two different controllers.

Willy
* Re: ide write cache issue? [Re: Something corrupts raid5 disks slightly during reboot]
  2003-11-01 19:01         ` Willy Tarreau
@ 2003-11-01 21:02           ` Ville Herva
  2003-11-02  6:05             ` Andre Hedrick
  2004-01-02 19:42           ` Something corrupts raid5 disks slightly during reboot Ville Herva
  1 sibling, 1 reply; 19+ messages in thread
From: Ville Herva @ 2003-11-01 21:02 UTC
To: Willy Tarreau; +Cc: linux-kernel

On Sat, Nov 01, 2003 at 08:01:14PM +0100, you [Willy Tarreau] wrote:
> On Sat, Nov 01, 2003 at 08:25:18PM +0200, Ville Herva wrote:
>
> > Is there anything special in booting to DOS instead of a different linux
> > kernel, other than that it would rule out some strange kernel bug that is
> > present in 2.2 and 2.4?
>
> No, it was just to quickly confirm or deny the fact that it's the kernel
> which causes the problem. It could have been a long standing bug in the IDE
> or partition code, and which is present in several kernels.

I vaguely recall some ide write cache flushing code was fixed some time ago,
but I can't find it in the archives. Maybe I dreamed that up. But I still
wonder why an otherwise idle drive would hold the data in write cache for so
long (several minutes.)

> But as you say that it affects two different controllers, there's little
> chance that it's caused by anything except linux itself.

Unless the drive is buggy wrt. flushing its write cache. But I think it's
quite a distant possibility.

> Then, the reboot on DOS will only tell you if the drives were corrupted at
> startup or at shutdown.

Yep. I'll try to find the moment to boot the beast into something other than
the current kernel / distro (it could in theory be something in userspace,
though I cannot think what).

> > BTW: the corruption happens on warm reboots (running reboot command), not
> > just on power off / on.
>
> OK, but the BIOS scans your disks even during warm reboots.

True, I mainly made this note because I hadn't mentioned it before in the
thread, and I thought it might have some relevance wrt. possible ide write
caching problems. I didn't mean it as a response to the BIOS theory.


-- v --

v@iki.fi
* Re: ide write cache issue? [Re: Something corrupts raid5 disks slightly during reboot]
  2003-11-01 21:02           ` Ville Herva
@ 2003-11-02  6:05             ` Andre Hedrick
  2003-11-02  8:28               ` Ville Herva
  0 siblings, 1 reply; 19+ messages in thread
From: Andre Hedrick @ 2003-11-02  6:05 UTC
To: Ville Herva; +Cc: Willy Tarreau, linux-kernel


I added the flush code to flush a drive in several places but it got
pulled and munged.

The original model was to flush each time a device was closed, when any
partition mount point was released, and called by notifier.

In a minimal partition count of 1, you had at least two flushes before
shutdown or reboot.

So it was not the code because I fixed it, but then again I am retiring
from formal maintainership.

Cheers,

Andre Hedrick
LAD Storage Consulting Group

On Sat, 1 Nov 2003, Ville Herva wrote:

> On Sat, Nov 01, 2003 at 08:01:14PM +0100, you [Willy Tarreau] wrote:
> > On Sat, Nov 01, 2003 at 08:25:18PM +0200, Ville Herva wrote:
> >
> > > Is there anything special in booting to DOS instead of a different linux
> > > kernel, other than that it would rule out some strange kernel bug that is
> > > present in 2.2 and 2.4?
> >
> > No, it was just to quickly confirm or deny the fact that it's the kernel
> > which causes the problem. It could have been a long standing bug in the IDE
> > or partition code, and which is present in several kernels.
>
> I vaguely recall some ide write cache flushing code was fixed some time ago,
> but I can't find it in the archives. Maybe I dreamed that up. But I still
> wonder why an otherwise idle drive would hold the data in write cache for so
> long (several minutes.)
>
> > But as you say that it affects two different controllers, there's little
> > chance that it's caused by anything except linux itself.
>
> Unless the drive is buggy wrt. flushing its write cache. But I think it's
> quite a distant possibility.
>
> > Then, the reboot on DOS will only tell you if the drives were corrupted at
> > startup or at shutdown.
>
> Yep. I'll try to find the moment to boot the beast into something other than
> the current kernel / distro (it could in theory be something in userspace,
> though I cannot think what).
>
> > > BTW: the corruption happens on warm reboots (running reboot command), not
> > > just on power off / on.
> >
> > OK, but the BIOS scans your disks even during warm reboots.
>
> True, I mainly made this note because I hadn't mentioned it before in the
> thread, and I thought it might have some relevance wrt. possible ide write
> caching problems. I didn't mean it as a response to the BIOS theory.
>
>
> -- v --
>
> v@iki.fi
* Re: ide write cache issue? [Re: Something corrupts raid5 disks slightly during reboot]
  2003-11-02  6:05             ` Andre Hedrick
@ 2003-11-02  8:28               ` Ville Herva
  2003-11-02 20:57                 ` Matthias Andree
  2003-11-03  5:34                 ` Andre Hedrick
  0 siblings, 2 replies; 19+ messages in thread
From: Ville Herva @ 2003-11-02  8:28 UTC
To: Andre Hedrick; +Cc: linux-kernel

On Sat, Nov 01, 2003 at 10:05:31PM -0800, you [Andre Hedrick] wrote:
>
> I added the flush code to flush a drive in several places but it got
> pulled and munged.
>
> The original model was to flush each time a device was closed, when any
> partition mount point was released, and called by notifier.
>
> In a minimal partition count of 1, you had at least two flushes before
> shutdown or reboot.
>
> So it was not the code because I fixed it, but then again I am retiring
> from formal maintainership.

Thanks, Andre :(.

As an^Wthe IDE expert, can you clarify a few points:

 - How long can the unwritten data linger in the drive cache if the drive
   is otherwise idle? (Without an explicit flush and with write caching
   enabled.)
   I had unmounted the fs and raidstopped the md minutes before the boot.

 - Can this corruption happen on warmboot or only on poweroff?

 - What kind of corruption can one see if the boot takes place "too fast"
   and the drive hasn't got enough time to flush its cache?


-- v --

v@iki.fi
* Re: ide write cache issue? [Re: Something corrupts raid5 disks slightly during reboot]
  2003-11-02  8:28               ` Ville Herva
@ 2003-11-02 20:57                 ` Matthias Andree
  2003-11-03  5:34                 ` Andre Hedrick
  1 sibling, 0 replies; 19+ messages in thread
From: Matthias Andree @ 2003-11-02 20:57 UTC
To: linux-kernel; +Cc: Ville Herva, Andre Hedrick

On Sun, 02 Nov 2003, Ville Herva wrote:

> As an^Wthe IDE expert, can you clarify a few points:
>
>  - How long can the unwritten data linger in the drive cache if the drive
>    is otherwise idle? (Without an explicit flush and with write caching
>    enabled.)

Several seconds. This is usually detailed in the OEM integrator manual, at
least it used to be for several IBM and Fujitsu drives when I looked two
years ago. Drives usually start flushing cached data before they go idle,
and some drives guarantee maximum times before data hits the disk.

IIRC, Fujitsu MAH drives (SCSI though, not ATA) for instance guarantee not
to cache data for longer than 3 s, even if that means interrupting
reordering writes and hits write performance adversely (because it might
involve seeks). I seem to recall some IBM ATA drive claimed 15 s, but don't
quote me on that, I don't even recall if that was 2.5" or 3.5".

I don't recall the exact wording, so it may mean that the drive will not
VOLUNTARILY DELAY the write for more than 3 s. It's quite hard to write
4,096 scattered blocks on individual cylinders in 3 s even on 10,025/min
drives and requires knowing the block offset from the current rotational
angle of the platter... I wonder if drive firmware makes such scheduling
efforts.

>    I had unmounted the fs and raidstopped the md minutes before the boot.

Ugly if it still corrupts. :-(

>  - Can this corruption happen on warmboot or only on poweroff?

On ATA drives, the cache contents must persist across soft or hard reset
(warmboot).

>  - What kind of corruption can one see if the boot takes place "too fast"
>    and the drive hasn't got enough time to flush its cache?

None with intact drives and bug-free firmware (I doubt such a thing exists).
Anyways, on powering down or with firmware bugs, anything is possible.

--
Matthias Andree

Encrypt your mail: my GnuPG key ID is 0x052E7D95
* Re: ide write cache issue? [Re: Something corrupts raid5 disks slightly during reboot]
  2003-11-02  8:28               ` Ville Herva
  2003-11-02 20:57                 ` Matthias Andree
@ 2003-11-03  5:34                 ` Andre Hedrick
  2003-11-03  6:38                   ` Ville Herva
  1 sibling, 1 reply; 19+ messages in thread
From: Andre Hedrick @ 2003-11-03  5:34 UTC
To: Ville Herva; +Cc: linux-kernel

On Sun, 2 Nov 2003, Ville Herva wrote:

> On Sat, Nov 01, 2003 at 10:05:31PM -0800, you [Andre Hedrick] wrote:
> >
> > I added the flush code to flush a drive in several places but it got
> > pulled and munged.
> >
> > The original model was to flush each time a device was closed, when any
> > partition mount point was released, and called by notifier.
> >
> > In a minimal partition count of 1, you had at least two flushes before
> > shutdown or reboot.
> >
> > So it was not the code because I fixed it, but then again I am retiring
> > from formal maintainership.
>
> Thanks, Andre :(.
>
> As an^Wthe IDE expert, can you clarify a few points:
>
>  - How long can the unwritten data linger in the drive cache if the drive
>    is otherwise idle? (Without an explicit flush and with write caching
>    enabled.)

Basically forever, until a read is issued to a range of lba's which starts
smaller than the uncommitted content's lba, and includes the content in
question. Or if a flush cache or disable write-back cache is issued.

>    I had unmounted the fs and raidstopped the md minutes before the boot.

The problem imho, is a break down of fundamental cascading callers.

Unmount MD -> flush MD

MD is a fakie device :-/

MD fakie calls for flush of R_DEV's

Likewise unloading or stopping MD operations should repeat regardless of
mount or not.

>  - Can this corruption happen on warmboot or only on poweroff?

Given POST (assume x86 for only a brief moment) will issue execute
diagnostics to hunt for signatures on the ribbon, that basically wacks
the content. Cool cycle obviously wacks the buffer.

>  - What kind of corruption can one see if the boot takes place "too fast"
>    and the drive hasn't got enough time to flush its cache?

erm, I am lost with the above.
Flush Cache is a hold and wait on completion, period.
However, a cache error at this point is a wasted effort to attempt
recovery.

Not sure I helped or not ...

Cheers,

Andre
* Re: ide write cache issue? [Re: Something corrupts raid5 disks slightly during reboot]
  2003-11-03  5:34                 ` Andre Hedrick
@ 2003-11-03  6:38                   ` Ville Herva
  0 siblings, 0 replies; 19+ messages in thread
From: Ville Herva @ 2003-11-03  6:38 UTC
To: Andre Hedrick; +Cc: linux-kernel

On Sun, Nov 02, 2003 at 09:34:30PM -0800, you [Andre Hedrick] wrote:
>
> >  - How long can the unwritten data linger in the drive cache if the drive
> >    is otherwise idle? (Without an explicit flush and with write caching
> >    enabled.)
>
> Basically forever, until a read is issued to a range of lba's which starts
> smaller than the uncommitted content's lba, and includes the content in
> question. Or if a flush cache or disable write-back cache is issued.

Huh. Sounds stunning. I mean if the drive is otherwise idle, why would it
hold the data in cache without trying to write it onto the platter? But
I'll take your word for it.

> >    I had unmounted the fs and raidstopped the md minutes before the boot.
>
> The problem imho, is a break down of fundamental cascading callers.
>
> Unmount MD -> flush MD
>
> MD is a fakie device :-/
>
> MD fakie calls for flush of R_DEV's
>
> Likewise unloading or stopping MD operations should repeat regardless of
> mount or not.

Yep.

You wouldn't happen to know if it could make a difference if the md
consists of raw devices (hdb,hdc,hdg) instead of partitions
(hdb1,hdc1,hdg1) wrt. how and when the IDE flushes get triggered? Is there
code that does it for partitions but is lacking for whole devices?

(The other MDs on the same box that consist of partitions do not get
corrupted, but they are on Maxtors, not Samsungs.)

> >  - Can this corruption happen on warmboot or only on poweroff?
>
> Given POST (assume x86 for only a brief moment) will issue execute

x86 in this case, yes.

> diagnostics to hunt for signatures on the ribbon, that basically wacks
> the content. Cool cycle obviously wacks the buffer.

Ack.

> >  - What kind of corruption can one see if the boot takes place "too fast"
> >    and the drive hasn't got enough time to flush its cache?
>
> erm, I am lost with the above.
> Flush Cache is a hold and wait on completion, period.
> However, a cache error at this point is a wasted effort to attempt
> recovery.

I meant: if the drive does not flush its cache before reboot, is one likely
to see the sectors either up-to-date or having the old data? Or can one see
half-written or otherwise corrupted sectors?

The corruption I saw didn't look like the sector just had the old data, but
I'm not sure. Then again, this may very well be something completely
unrelated to ide write caching.

> Not sure I helped or not ...

Yes you did, thanks!


-- v --

v@iki.fi
* Re: Something corrupts raid5 disks slightly during reboot
  2003-11-01 19:01         ` Willy Tarreau
  2003-11-01 21:02           ` Ville Herva
@ 2004-01-02 19:42           ` Ville Herva
  2004-01-02 20:02             ` Ville Herva
  2004-01-14 14:46             ` Ville Herva
  1 sibling, 2 replies; 19+ messages in thread
From: Ville Herva @ 2004-01-02 19:42 UTC
To: linux-kernel; +Cc: Willy Tarreau

Summary:

I've been experiencing strange corruption on a raid5 volume for some time.
The kernel is 2.2.x + RAID-0.90 patch. Fs is ext2 (+e2compr).

After unmounting the filesystem, I can mount it again without problems.
I can also raidstop the raid device in between and all is still fine:

  > umount /dev/md4; mount /dev/md4
  - no corruption
  > umount /dev/md4; raidstop /dev/md4; raidstart /dev/md4; mount /dev/md4
  - no corruption

But after a reboot, the filesystem is corrupted - a few bytes differ in the
beginning of /dev/md4 between 1k and 5k.

See the threads
http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=utf-8&threadm=MMYt.4B2.1%40gated-at.bofh.it&rnum=1&prev=/groups%3Fnum%3D50%26hl%3Den%26lr%3D%26ie%3DUTF-8%26oe%3Dutf-8%26q%3DSomething%2Bcorrupts%2Braid5%2Bdisks%2Bslightly%2Bduring%2Breboot%26sa%3DN%26tab%3Dwg
http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=utf-8&threadm=MZsH.72R.5%40gated-at.bofh.it&rnum=4&prev=/groups%3Fnum%3D50%26hl%3Den%26lr%3D%26ie%3DUTF-8%26oe%3Dutf-8%26q%3DSomething%2Bcorrupts%2Braid5%2Bdisks%2Bslightly%2Bduring%2Breboot%26sa%3DN%26tab%3Dwg
for details.

I did some further research.

First I thought this was an artifact of using a "non-normal" blocksize on
the fs, 4096 bytes. (The other raid partitions I have on the system are
1024 and do not get corrupted.) Also the corrupting fs is on raid5 on bare
disks (hdb+hdc+hdg), while the others are on partitions (hda1+hdd1+hdf1 and
so on.)

I tried to reproduce this under vmware with 3-disk raid5 (hda+hdb+hdd) using
4096-byte ext2 and the exact same kernel.
Initially, I thought I was able to trigger it by mounting the fs while a
raid rebuild was in progress. The kernel spat this:

  set_blocksize: b_count 1, dev md(9,4), block 15642112, from c014c3fb
  set_blocksize: b_count 1, dev md(9,4), block 15642113, from c014c3fb
  set_blocksize: b_count 1, dev md(9,4), block 15642114, from c014c3fb
  ...
  set_blocksize: b_count 2, dev md(9,4), block 15642367, from c014c3fb
  md4: blocksize changed during read
  nr_blocks changed to 64 (blocksize 4096, j 3910528, max_blocks 39091968)

and fsck reported problems, but only once (the set_blocksize stuff appeared
each time). It seems the "set_blocksize" outpouring is a known issue, and
not severe:
http://www.ussg.iu.edu/hypermail/linux/kernel/0110.1/0493.html
The fsck errors were probably just a side-effect of the unclean shutdown I
used to force a raid rebuild.

After the failed vmware experiment, I tried to isolate when exactly the
corruption happens, shutdown or boot. Also, in the mentioned threads, people
had suggested turning off the write cache of the IDE disk.

I found out that the difference (corruption) is usually on three bytes on
/dev/hdg, but sometimes on /dev/hdc, too. (/dev/md4 = hdb+hdc+hdg; hdb&hdc
are on i810, hdg is on hpt370).

First, I did

  umount /dev/md4
  raidstop /dev/md4
  head -c 50k /dev/hdg > /save/hdg
  reboot

To rule out kernel raid autodetect and raid code in general, I booted
2.2.25-1-secure with "single init=/bin/bash raid=noautodetect".

Did
  head -c50k /dev/hdg | cmp -l /save/hdg
Three bytes differed:

     4641    0   35
     4642    0  205
     4643    0   10
  bytepos after before
           boot   boot

wrote the original stuff back:
  dd if=/save/hdg of=/dev/hdg
  sync
  hdparm -W0 /dev/hdg
  sync
  reboot

Booted 2.2.25-1-secure with "single init=/bin/bash raid=noautodetect" again.

Did
  head -c50k /dev/hdg | cmp -l /save/hdg
The same three bytes differed again.

Wrote the stuff back, sync'ed, did hdparm, and powered off. Still, the
bytes differed on next boot.
Then I booted 2.4.21-jam1 with "single init=/bin/bash raid=noautodetect" (I
happened to have 2.4.21-jam1 compiled with suitable drivers at hand).
Wrote the same stuff back with dd, synced, turned ide cache off.
Booted 2.4.21-jam1 with "single init=/bin/bash raid=noautodetect" again.
Did the diff; the three bytes differed again.

Note that sometimes a few bytes on hdc differed, too. Usually it was just the
three hdg bytes.

So this is not a 2.2 kernel issue. I very much doubt it's a kernel issue at
all. Unless it is a bug in kernel partition detection that is still present
in 2.4.x.

I tried to turn off the ide write cache with hdparm -W0, so it shouldn't
be a write caching issue.

If it's a bios issue, it's really a strange one, since it affects both disks
on i810 ide and on hpt370. The disks have no partition table, though, which
_could_ confuse the bios.

Any ideas? Who the heck could write to those three bytes, and why?


-- v --

v@iki.fi
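One way to read the three offsets reported by cmp: on an ext2 filesystem with
4096-byte blocks, the group descriptor table starts at block 1 (byte 4096),
each descriptor is 32 bytes, and the first 4-byte little-endian field of a
descriptor is the block bitmap pointer. The sketch below only does that
arithmetic. It assumes that offsets on hdg equal offsets in the filesystem
(which depends on the raid5 layout and disk order, not stated in the thread),
that the group size is the usual 32768 blocks, and that the unchanged byte
4644 is 0. The variable names are illustrative.

```shell
#!/bin/sh
# Assumptions, none confirmed in the thread: 4096-byte ext2 blocks, 32-byte
# group descriptors starting at block 1, 32768 blocks per group, and hdg
# byte offsets equal to filesystem byte offsets.
BLOCK_SIZE=4096
DESC_SIZE=32
BLOCKS_PER_GROUP=32768

first=4641                    # first differing byte from cmp -l (1-based)
off=$((first - 1))            # 0-based byte offset
rel=$((off - BLOCK_SIZE))     # offset into the group descriptor table
group=$((rel / DESC_SIZE))    # index of the descriptor that was hit
field=$((rel % DESC_SIZE))    # 0 means the block bitmap pointer

# "Before boot" bytes 4641..4643 are octal 35, 205, 10; byte 4644 is
# assumed to be 0. Assemble them as a little-endian 32-bit value.
bitmap=$((035 + 0205 * 256 + 010 * 65536))

group_start=$((group * BLOCKS_PER_GROUP))
group_end=$((group_start + BLOCKS_PER_GROUP - 1))

echo "group=$group field_offset=$field"
echo "old bitmap pointer=$bitmap, group blocks $group_start..$group_end"
```

Under these assumptions the offsets fall on the block bitmap pointer of
group 17, the same group named in the ext2 error at the start of the thread,
and the old value lies inside that group's block range. That would be
consistent with the pointer being zeroed during the reboot, but it says
nothing about what wrote to it.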
* Re: Something corrupts raid5 disks slightly during reboot
  2004-01-02 19:42 ` Something corrupts raid5 disks slightly during reboot Ville Herva
@ 2004-01-02 20:02   ` Ville Herva
  2004-01-14 14:46   ` Ville Herva
  1 sibling, 0 replies; 19+ messages in thread
From: Ville Herva @ 2004-01-02 20:02 UTC
To: linux-kernel; +Cc: Willy Tarreau

> So this is not a 2.2 kernel issue. I very much doubt it's a kernel issue at
> all. Unless it is a bug in kernel partition detection that is still present
> in 2.4.x.

Short addition: in the earlier thread, it was suggested to inspect the disk
with another OS (DOS, Windows, something else) to rule out the Linux kernel
completely. I couldn't easily find anything that boots from cd or preferably
from floppy (since I don't have a cdrom attached due to ide cable shortage)
*and* supports the HPT370 ide controller /dev/hdg is connected to. If I find
something that fits the bill, I'll give it a shot.


-- v --

v@iki.fi
* Re: Something corrupts raid5 disks slightly during reboot
  2004-01-02 19:42 ` Something corrupts raid5 disks slightly during reboot Ville Herva
  2004-01-02 20:02   ` Ville Herva
@ 2004-01-14 14:46   ` Ville Herva
  2004-01-14 22:22     ` Willy Tarreau
  1 sibling, 1 reply; 19+ messages in thread
From: Ville Herva @ 2004-01-14 14:46 UTC
To: linux-kernel, Willy Tarreau

On Fri, Jan 02, 2004 at 09:42:00PM +0200, you [Ville Herva] wrote:
> Summary:
>
> I've been experiencing strange corruption on a raid5 volume for some time.
> The kernel is 2.2.x + RAID-0.90 patch. Fs is ext2 (+e2compr). After
> unmounting the filesystem, I can mount it again without problems. I can also
> raidstop the raid device in between and all is still fine:
>
>   > umount /dev/md4; mount /dev/md4
>   - no corruption
>   > umount /dev/md4; raidstop /dev/md4; raidstart /dev/md4; mount /dev/md4
>   - no corruption
>
> But after a reboot, the filesystem is corrupted - a few bytes differ in the
> beginning of /dev/md4 between 1k and 5k.
>
> See the threads
>   http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=utf-8&threadm=MMYt.4B2.1%40gated-at.bofh.it&rnum=1&prev=/groups%3Fnum%3D50%26hl%3Den%26lr%3D%26ie%3DUTF-8%26oe%3Dutf-8%26q%3DSomething%2Bcorrupts%2Braid5%2Bdisks%2Bslightly%2Bduring%2Breboot%26sa%3DN%26tab%3Dwg
>   http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=utf-8&threadm=MZsH.72R.5%40gated-at.bofh.it&rnum=4&prev=/groups%3Fnum%3D50%26hl%3Den%26lr%3D%26ie%3DUTF-8%26oe%3Dutf-8%26q%3DSomething%2Bcorrupts%2Braid5%2Bdisks%2Bslightly%2Bduring%2Breboot%26sa%3DN%26tab%3Dwg
> for details.

(...)

> I found out that the difference (corruption) is usually on three bytes on
> /dev/hdg, but sometimes on /dev/hdc, too. (/dev/md4 = hdb+hdc+hdg; hdb&hdc
> are on i810, hdg is on hpt370).
>
> First, I did
>   umount /dev/md4
>   raidstop /dev/md4
>   head -c 50k /dev/hdg > /save/hdg
>   reboot
>
> To rule out kernel raid autodetect and raid code in general, I
> booted 2.2.25-1-secure with "single init=/bin/bash raid=noautodetect".
> Did
>   head -c50k /dev/hdg | cmp -l /save/hdg
> Three bytes differed:
>   4641      0     35
>   4642      0    205
>   4643      0     10
>   bytepos   after before
>             boot  boot
>
> I wrote the original stuff back:
>   dd if=/save/hdg of=/dev/hdg
>   sync
>   hdparm -W0 /dev/hdg
>   sync
>   reboot
>
> Booted 2.2.25-1-secure with "single init=/bin/bash raid=noautodetect"
> again.
> Did
>   head -c50k /dev/hdg | cmp -l /save/hdg
> The same three bytes differed again.
> Wrote the stuff back, sync'ed, did hdparm, and powered off. Still, the
> bytes differed on next boot.
>
> Then I booted 2.4.21-jam1 with "single init=/bin/bash raid=noautodetect" (I
> happened to have 2.4.21-jam1 compiled with suitable drivers at hand).
> Wrote the same stuff back with dd, synced, turned ide cache off.
> Booted 2.4.21-jam1 with "single init=/bin/bash raid=noautodetect" again.
> Did the diff; the three bytes differed again.
>
> Note that sometimes a few bytes on hdc differed, too. Usually it was just the
> three hdg bytes.
>
> So this is not a 2.2 kernel issue. I very much doubt it's a kernel issue at
> all. Unless it is a bug in kernel partition detection that is still present
> in 2.4.x.
>
> I tried to turn off the ide write cache with hdparm -W0, so it shouldn't
> be a write caching issue.
>
> If it's a bios issue, it's really a strange one, since it affects both disks
> on i810 ide and on hpt370. The disks have no partition table, though, which
> _could_ confuse the bios.
Addition:

 - I tried booting from 2.6.1 single user mode to 2.6.1 single user
   mode (booting with sysrq-b to avoid shutdown process):
   -> The corruption on /dev/hdg happens like with 2.2 and 2.4

 - I booted from 2.6.1 single user mode to 2.6.1 single user
   mode with kexec patch to avoid entering BIOS in between
   -> The corruption DOES NOT happen

I'm pretty much out of ideas.


-- v --

v@iki.fi
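The two reboot paths compared in the addition above differ only in whether
firmware and the boot loader run between the two kernels. The sketch below
prints the commands for each path instead of running them, since either one
reboots the machine at once. The function name reboot_plan, the kernel image
path, and the kexec-tools style invocation (kexec -l / kexec -e) are
assumptions; the thread only says that sysrq-b and a kexec patch were used.

```shell
#!/bin/sh
# Dry-run helper (hypothetical, not from the thread): it only prints commands.
KERNEL=${KERNEL:-/boot/vmlinuz}    # assumed image path
CMDLINE='single init=/bin/bash raid=noautodetect'

reboot_plan() {
    case "$1" in
        bios)
            # SysRq-b reboots immediately and skips the shutdown scripts;
            # BIOS, option ROMs and the boot loader run before the kernel.
            echo "sync"
            echo "echo b > /proc/sysrq-trigger"
            ;;
        kexec)
            # Load the next kernel and jump straight into it, so no
            # firmware or boot loader code runs in between.
            echo "sync"
            echo "kexec -l $KERNEL --append=\"$CMDLINE\""
            echo "kexec -e"
            ;;
        *)
            echo "usage: reboot_plan bios|kexec" >&2
            return 1
            ;;
    esac
}

reboot_plan bios
reboot_plan kexec
```

Taking a snapshot of the first 50k of /dev/hdg before each path and comparing
it afterwards gives the two results listed in the addition.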
* Re: Something corrupts raid5 disks slightly during reboot
  2004-01-14 14:46 ` Ville Herva
@ 2004-01-14 22:22   ` Willy Tarreau
  2004-01-14 22:46     ` Ville Herva
  0 siblings, 1 reply; 19+ messages in thread
From: Willy Tarreau @ 2004-01-14 22:22 UTC
To: Ville Herva, linux-kernel

Hi Ville,

On Wed, Jan 14, 2004 at 04:46:46PM +0200, Ville Herva wrote:

>  - I tried booting from 2.6.1 single user mode to 2.6.1 single user
>    mode (booting with sysrq-b to avoid shutdown process):
>    -> The corruption on /dev/hdg happens like with 2.2 and 2.4
>
>  - I booted from 2.6.1 single user mode to 2.6.1 single user
>    mode with kexec patch to avoid entering BIOS in between
>    -> The corruption DOES NOT happen
>
> I'm pretty much out of ideas.

To me, it proves that the bios triggers the problem. It could also be in the
device enumeration functions or device initialization that it does this
thing. Perhaps even a more nasty thing such as a pending DMA write which
completes during a device reset.

That's very odd anyway. I don't quite remember well all your setup. Have you
tried enabling/disabling shadow ram/caching on bios regions to check if a
faster/slower code execution in the bios changes something? Also do it on
additional ROMs if you have an onboard bios on your secondary controller.

I'm also getting stuck without any other idea :-/

Regards,
Willy
* Re: Something corrupts raid5 disks slightly during reboot
  2004-01-14 22:22 ` Willy Tarreau
@ 2004-01-14 22:46   ` Ville Herva
  0 siblings, 0 replies; 19+ messages in thread
From: Ville Herva @ 2004-01-14 22:46 UTC
To: Willy Tarreau; +Cc: linux-kernel

On Wed, Jan 14, 2004 at 11:22:14PM +0100, you [Willy Tarreau] wrote:
> Hi Ville,
>
> On Wed, Jan 14, 2004 at 04:46:46PM +0200, Ville Herva wrote:
>
> >  - I tried booting from 2.6.1 single user mode to 2.6.1 single user
> >    mode (booting with sysrq-b to avoid shutdown process):
> >    -> The corruption on /dev/hdg happens like with 2.2 and 2.4
> >
> >  - I booted from 2.6.1 single user mode to 2.6.1 single user
> >    mode with kexec patch to avoid entering BIOS in between
> >    -> The corruption DOES NOT happen
> >
> > I'm pretty much out of ideas.
>
> To me, it proves that the bios triggers the problem.

Or lilo. Abit BIOS, Adaptec SCSI BIOS, Highpoint HPT370 BIOS and lilo are
the only pieces of code that get executed between power on and the kernel.

Unfortunately, I was unable to rule that (unlikely) alternative out just
yet, because I found out that the box doesn't have a working floppy either
(the cdrom is not plugged in because of lack of cables - I guess I miswired
the floppy drive too when I last messed with the power cables.)

This is also why I didn't try your DOS disk on the box. It seems its
diskedit can recognize at least scsi disks, so it could well handle the disk
on the Highpoint controller, too. Anyway, thanks for that (and reminding me
how rusty my French is - and has always been :).

I plan to try booting from floppy without lilo, and to try the DOS editor,
when I next open the box and can fix the floppy wiring. It's a server so I
don't take it down all the time...

> It could also be in the device enumeration functions or device
> initialization that it does this thing. Perhaps even a more nasty thing
> such as a pending DMA write which completes during a device reset.
Something like that crossed my mind initially, but waiting >10min between
the write and boot didn't help, nor did "hdparm -W 0"...

> That's very odd anyway. I don't quite remember well all your setup.

http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=utf-8&threadm=MMYt.4B2.1%40gated-at.bofh.it&rnum=1&prev=/groups%3Fnum%3D50%26hl%3Den%26lr%3D%26ie%3DUTF-8%26oe%3Dutf-8%26q%3DSomething%2Bcorrupts%2Braid5%2Bdisks%2Bslightly%2Bduring%2Breboot%26sa%3DN%26tab%3Dwg
gives some details. Basically it's an Abit ST6R mobo (i815 and HPT370 IDEs),
with three Maxtor 250GB disks (root and first data fs) and three Samsung
80GB disks (second data fs). One of the Samsungs on the HPT370 is the one
that exhibits the corruption.

> Have you tried enabling/disabling shadow ram/caching on bios regions to
> check if a faster/slower code execution in the bios changes something?

No. I could try that.

> Also do it on additional ROMs if you have an onboard bios on your
> secondary controller.

Ok, if only I can manage to find such options in the BIOS.

> I'm also getting stuck without any other idea :-/

No wonder. So far you have been most helpful - big thanks for that.

PS: Again, the next round of results will only be in after some time - as I
said, I'll need to wait for a suitable reboot time for the box... Sorry for
the trickle.


-- v --

v@iki.fi