Mild filesystem corruption on ext4 (no journal)

From: Alan Jenkins @ 2009-06-05 10:49 UTC
  To: linux-ext4, Linux Kernel Mailing List

Hi,

I run ext4 without a journal on my cheap netbook with a 4 gig SSD.  I 
suspect "without a journal" is significant; I don't think I'm doing 
anything else strange.

When I upgrade libc from 2.7 (debian stable) to 2.9 (debian unstable), 
the locale breaks on every reboot, and I have to repair it by running 
locale-gen.  This happened just now when I upgraded only libc, in order 
to play with signalfd().  It also happened before, when I upgraded the 
entire machine to debian unstable (which I later reverted).

The problem is that /usr/lib/locale/locale-archive gets corrupted when I 
reboot.  The exact corruption differs with each reboot (i.e. the md5sum 
differs).  Last time, the first ~70K was overwritten with data from 
xorg.log and my web browsing history.  I have copies of the original and 
corrupted state which I can send.  The full file is 1.3 megs, but I can 
limit it to the first 70K, since that's all that was corrupted.
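
The extent of the damage can be checked by comparing the saved original 
against a corrupted copy.  A minimal sketch in Python, assuming both 
copies fit in memory; the function name diff_ranges and the two 
command-line file arguments are illustrative only:

```python
import sys


def diff_ranges(a: bytes, b: bytes):
    """Return half-open (start, end) byte ranges where a and b differ."""
    ranges = []
    start = None
    n = min(len(a), len(b))
    for i in range(n):
        if a[i] != b[i]:
            if start is None:
                start = i          # a differing run begins here
        elif start is not None:
            ranges.append((start, i))
            start = None
    end = n
    if len(a) != len(b):
        # Treat any length difference as a differing tail.
        if start is None:
            start = n
        end = max(len(a), len(b))
    if start is not None:
        ranges.append((start, end))
    return ranges


if __name__ == "__main__":
    # Usage: script.py ORIGINAL CORRUPTED
    with open(sys.argv[1], "rb") as f:
        original = f.read()
    with open(sys.argv[2], "rb") as f:
        corrupted = f.read()
    for start, end in diff_ranges(original, corrupted):
        print(f"bytes {start}..{end - 1} differ ({end - start} bytes)")
```

If only the start of the file is affected, the reported ranges should 
all fall within roughly the first 70K.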

To try and rule out a faulty userspace program, I marked the file as 
read-only (chmod a-w) and immutable (chattr +i).  After a reboot, the 
file was still read-only and immutable, yet it still became corrupted.

Also, I ran md5sum in the shutdown scripts, after mounting the root 
filesystem read-only (which is also preceded by a sync in a different 
script).  This showed that the file did not appear corrupted at this 
point.  (Though maybe it was OK in the page cache, but corrupted on disk.)
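
One way to narrow down the page-cache question is to ask the kernel to 
drop the file's cached pages before checksumming, so the read is more 
likely to come from disk.  A rough sketch in Python, assuming 
os.posix_fadvise is available; the function name md5_of_file and its 
drop_cache parameter are made up for illustration.  The hint is advisory 
only: dirty pages and pages still mapped by a process may stay cached, 
so a matching checksum does not prove the on-disk copy is intact.

```python
import hashlib
import os


def md5_of_file(path, drop_cache=False, chunk_size=1 << 20):
    """Return the hex MD5 of a file, optionally hinting a cache drop first."""
    fd = os.open(path, os.O_RDONLY)
    try:
        if drop_cache and hasattr(os, "posix_fadvise"):
            try:
                # Advisory hint: discard cached pages for the whole file
                # (offset 0, length 0 means "to end of file").
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
            except OSError:
                pass  # hint not supported here; fall back to a plain read
        digest = hashlib.md5()
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            digest.update(chunk)
        return digest.hexdigest()
    finally:
        os.close(fd)


if __name__ == "__main__":
    path = "/usr/lib/locale/locale-archive"
    print("as read:         ", md5_of_file(path))
    print("after cache hint:", md5_of_file(path, drop_cache=True))
```

If the two checksums differ, the cached and on-disk contents disagree.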

The locale-archive file is read by the libc locale routines using 
mmap().  The mapping is read only and is not modified.  It seems likely 
that some process has it mapped when the kernel shuts down.

I tried reproducing this by writing a minimal daemon which maps a copy 
of the locale-archive file, and starting it just before the filesystem 
is remounted read-only.  It didn't work though; this copy of the 
locale-archive file remained uncorrupted.
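
For anyone wanting to try the same experiment, here is a sketch of such 
a test program in Python.  It is not the daemon described above: it 
stays in the foreground rather than daemonizing, the function name 
map_read_only is illustrative, and the mapping flags Python uses for 
ACCESS_READ are not necessarily the ones libc uses for the real 
locale-archive.

```python
import mmap
import signal
import sys


def map_read_only(path):
    """Map the whole file read-only and touch one byte per page."""
    with open(path, "rb") as f:
        # Length 0 maps the entire file; ACCESS_READ forbids writes.
        mapping = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # The mapping remains valid after the file object is closed.
    checksum = 0
    for offset in range(0, len(mapping), mmap.PAGESIZE):
        checksum ^= mapping[offset]   # read access faults the page in
    return mapping, checksum


if __name__ == "__main__":
    # Usage: script.py PATH_TO_COPY_OF_LOCALE_ARCHIVE
    mapping, _ = map_read_only(sys.argv[1])
    # Keep the mapping alive until a signal arrives, e.g. at shutdown.
    signal.pause()
```

Touching every page makes sure the whole file is actually resident in 
the mapping, rather than only mapped lazily, before shutdown begins.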

I forced a fsck on boot, and the filesystem was reported to be clean.  I 
am currently running with e2fsprogs v1.41.6 (from debian unstable), and 
a custom-built kernel, 2.6.30-rc7.

Thanks in advance!
Alan
