* dm-crypt is broken and causes massive data corruption
@ 2006-05-08 17:20 Tillmann Steinbrecher
2006-05-08 17:57 ` [dm-crypt] " Simpson, Brett
2006-05-09 19:04 ` Alasdair G Kergon
0 siblings, 2 replies; 8+ messages in thread
From: Tillmann Steinbrecher @ 2006-05-08 17:20 UTC (permalink / raw)
To: linux-kernel, dm-crypt
Hi,
it's been many months that dm-crypt has been broken, and is known to
cause massive data corruption.
Various people have noticed this, have lost data and wasted many hours
trying to find the reason, and still NOTHING is being done about it. The
problem seems to occur only in conjunction with RAID (dm-crypt on top of
RAID) (or possibly it occurs only in conjunction with large
filesystems). I've had issues with that for many months as well, trying
to eliminate other possible reasons. There are none.
Let's say this loud and clear:
dm-crypt causes data corruption. Yet it is not even marked as
"EXPERIMENTAL" in the kernel config, when in fact it's more than just
experimental, it's "DANGEROUS/BROKEN".
Here are some more reports:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=336153
(That was for 2.6.8, but the problems are still the same in recent
kernel versions)
http://www.ubuntuforums.org/showthread.php?t=170304
(Similar config, similar problem - this time with 2.6.12 and 2.6.15)
http://episteme.arstechnica.com/groupee/forums/a/tpc/f/96509133/m/282007248731/r/224008458731
(Again the same constellation, and the same problem.)
http://marc.theaimsgroup.com/?l=linux-kernel&m=114664786711245&w=2
(Same config, same problem. This time with 2.6.16!)
BTW the problem seems to be independent of the filesystem used;
however, some filesystems seem to be more robust than others against
this type of corruption. With ext3, the filesystem would mess itself up
within hours on my system. With XFS, massive corruption (all data lost)
had occurred after a few weeks. With ReiserFS 3, occasional problems that
were fixable using reiserfsck --rebuild-tree occurred.
Sorry for the rant. But I think this is an important issue that needs to
be addressed ASAP, before even more people lose their data. Keep in mind
that crypto filesystems are typically used for systems where the data is
sensitive and important! Something must be done about it - in the worst
case, removing dm-crypt from the mainline kernel.
Please CC replies to me, as I'm not subscribed to either linux-kernel or
dm-crypt.
bye,
Tillmann
--
Dipl.-Ing. Tillmann Steinbrecher http://www.igd.fhg.de/~tsteinbr/
Cognitive Computing & Medical Imaging
Fraunhofer IGD, Fraunhoferstr. 5, D-64283 Darmstadt, Germany
All opinions are mine and not those of my employer.
* Re: [dm-crypt] dm-crypt is broken and causes massive data corruption
2006-05-08 17:20 dm-crypt is broken and causes massive data corruption Tillmann Steinbrecher
@ 2006-05-08 17:57 ` Simpson, Brett
2006-05-08 18:27 ` Christophe Saout
2006-05-09 19:04 ` Alasdair G Kergon
1 sibling, 1 reply; 8+ messages in thread
From: Simpson, Brett @ 2006-05-08 17:57 UTC (permalink / raw)
To: dm-crypt; +Cc: linux-kernel, Tillmann Steinbrecher
On Mon, 2006-05-08 at 19:20 +0200, Tillmann Steinbrecher wrote:
> it's been many months that dm-crypt has been broken, and is known to
> cause massive data corruption.
>
> Various people have noticed this, have lost data and wasted many hours
> trying to find the reason, and still NOTHING is being done about it. The
> problem seems to occur only in conjunction with RAID (dm-crypt on top of
> RAID) (or possibly it occurs only in conjunction with large
> filesystems). I've had issues with that for many months as well, trying
> to eliminate other possible reasons. There are none.
I've been running Gentoo for over a month with a 54GB ext3 filesystem via
dm-crypt on an IDE drive. No problems so far.
I've used Gentoo-sources 2.6.16-r1 and vanilla kernels 2.6.17-rc1
through rc3.
I've been using cryptsetup-1.0.1-i686-pc-linux-gnu-static and have it in
my initrd so I can mount my root partition.
Brett
* Re: [dm-crypt] dm-crypt is broken and causes massive data corruption
2006-05-08 17:57 ` [dm-crypt] " Simpson, Brett
@ 2006-05-08 18:27 ` Christophe Saout
0 siblings, 0 replies; 8+ messages in thread
From: Christophe Saout @ 2006-05-08 18:27 UTC (permalink / raw)
To: bart; +Cc: dm-crypt, linux-kernel, Tillmann Steinbrecher
On Monday, 08.05.2006 at 13:57 -0400, Simpson, Brett wrote:
> I've been running Gentoo for over a month with a 54GB ext3 filesystem via
> dm-crypt on an IDE drive. No problems so far.
It's a problem with dm-crypt on top of md. I'm trying to figure out
what's going on there.
* Re: [dm-crypt] dm-crypt is broken and causes massive data corruption
2006-05-08 17:20 dm-crypt is broken and causes massive data corruption Tillmann Steinbrecher
2006-05-08 17:57 ` [dm-crypt] " Simpson, Brett
@ 2006-05-09 19:04 ` Alasdair G Kergon
2006-05-11 15:15 ` Paul Slootman
1 sibling, 1 reply; 8+ messages in thread
From: Alasdair G Kergon @ 2006-05-09 19:04 UTC (permalink / raw)
To: Tillmann Steinbrecher; +Cc: linux-kernel, dm-crypt
On Mon, May 08, 2006 at 07:20:12PM +0200, Tillmann Steinbrecher wrote:
> it's been many months that dm-crypt has been broken, and is known to
> cause massive data corruption.
> Various people have noticed this, have lost data and wasted many hours
> trying to find the reason, and still NOTHING is being done about it.
Perhaps that's because it wasn't until last week that the upstream
maintainers heard of these problems?
So far there isn't much in the way of controlled experiments, but:
All the reports agree the problem is independent of filesystem.
One thread suggests only filesystem metadata is corrupted, not file
data, and wonders if something's going wrong with (unsupported) write
barriers.
Another report said dm-crypt over raid5 failed while raid5
over dm-crypt worked.
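
A minimal sketch of how the missing controlled experiments could be enumerated, assuming one test run per combination of filesystem and storage stack. The filesystem and stack names are taken from the reports earlier in this thread; the function name and the dictionary layout are illustrative assumptions, not an existing test harness.

```python
from itertools import product

# Filesystems named in the original report.
FILESYSTEMS = ["ext3", "XFS", "ReiserFS 3"]

# Storage stacks mentioned in the thread so far (descriptions only;
# setting them up is outside the scope of this sketch).
STACKS = [
    "dm-crypt on a plain IDE drive",
    "dm-crypt over raid5",
    "raid5 over dm-crypt",
]


def experiment_matrix(filesystems=FILESYSTEMS, stacks=STACKS):
    """Return one entry per (stack, filesystem) combination to test."""
    return [
        {"stack": stack, "filesystem": fs}
        for stack, fs in product(stacks, filesystems)
    ]


for run in experiment_matrix():
    print(f"{run['stack']:32s} {run['filesystem']}")
```

Running every combination with the same workload would show whether the failures follow the stack order, the filesystem, or neither.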
Alasdair
--
agk@redhat.com
* Re: [dm-crypt] dm-crypt is broken and causes massive data corruption
2006-05-09 19:04 ` Alasdair G Kergon
@ 2006-05-11 15:15 ` Paul Slootman
2006-05-11 15:42 ` Andrea Gelmini
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Paul Slootman @ 2006-05-11 15:15 UTC (permalink / raw)
To: linux-kernel
Alasdair G Kergon <agk@redhat.com> wrote:
>On Mon, May 08, 2006 at 07:20:12PM +0200, Tillmann Steinbrecher wrote:
>> it's been many months that dm-crypt has been broken, and is known to
>> cause massive data corruption.
>So far there isn't much in the way of controlled experiments, but:
>
> All the reports agree the problem is independent of filesystem.
>
> One thread suggests only filesystem metadata is corrupted, not file
> data, and wonders if something's going wrong with (unsupported) write
> barriers.
>
> Another report said dm-crypt over raid5 failed while raid5
> over dm-crypt worked.
A data point:
I'm running my /home on reiserfs3 over dm-crypt over lvm over raid5 for
at least a year now, without any problems. Currently running 2.6.13.4
(that's my "stable" work system...).
Paul Slootman
* Re: [dm-crypt] dm-crypt is broken and causes massive data corruption
2006-05-11 15:15 ` Paul Slootman
@ 2006-05-11 15:42 ` Andrea Gelmini
2006-05-11 23:17 ` Christian Schmidt
2006-05-12 21:47 ` Dan Merillat
2 siblings, 0 replies; 8+ messages in thread
From: Andrea Gelmini @ 2006-05-11 15:42 UTC (permalink / raw)
To: Paul Slootman; +Cc: linux-kernel
On Thu, May 11, 2006 at 03:15:29PM +0000, Paul Slootman wrote:
> A data point:
>
> I'm running my /home on reiserfs3 over dm-crypt over lvm over raid5 for
> at least a year now, without any problems. Currently running 2.6.13.4
> (that's my "stable" work system...).
It seems the write pattern is important... I can reproduce the corruption
by copying gigabytes of data from a locally attached IDE disk. Do you
write mostly from the network or from slow devices?
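
A minimal sketch of a write-pattern test along these lines, assuming the filesystem on the encrypted device is already mounted and writable. It writes deterministic pseudo-random blocks to a file, forces them to disk, then re-reads and compares them. The function names, block size, and target path are illustrative assumptions; this is not the procedure anyone in the thread actually used.

```python
import hashlib
import os

BLOCK_SIZE = 1024 * 1024  # 1 MiB per block (arbitrary choice)


def block_data(index, size=BLOCK_SIZE):
    """Return deterministic pseudo-random bytes for block number `index`."""
    return hashlib.shake_256(str(index).encode()).digest(size)


def write_pattern(path, blocks, size=BLOCK_SIZE):
    """Write `blocks` blocks of known data to `path` and flush them to disk."""
    with open(path, "wb") as f:
        for i in range(blocks):
            f.write(block_data(i, size))
        f.flush()
        os.fsync(f.fileno())


def verify_pattern(path, blocks, size=BLOCK_SIZE):
    """Return the indices of blocks whose content no longer matches."""
    bad = []
    with open(path, "rb") as f:
        for i in range(blocks):
            if f.read(size) != block_data(i, size):
                bad.append(i)
    return bad
```

For example, `write_pattern("/mnt/crypt/test.bin", 4096)` followed later by `verify_pattern("/mnt/crypt/test.bin", 4096)` would cover about 4 GiB (the mount point is hypothetical). A read-back immediately after the write may be served from the page cache, so remounting the filesystem between the two steps is advisable if the goal is to exercise the device itself.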
ciao,
gelma
* Re: [dm-crypt] dm-crypt is broken and causes massive data corruption
2006-05-11 15:15 ` Paul Slootman
2006-05-11 15:42 ` Andrea Gelmini
@ 2006-05-11 23:17 ` Christian Schmidt
2006-05-12 21:47 ` Dan Merillat
2 siblings, 0 replies; 8+ messages in thread
From: Christian Schmidt @ 2006-05-11 23:17 UTC (permalink / raw)
To: linux-kernel; +Cc: tsteinbr
Paul Slootman wrote:
> Alasdair G Kergon <agk@redhat.com> wrote:
>> On Mon, May 08, 2006 at 07:20:12PM +0200, Tillmann Steinbrecher wrote:
>>> it's been many months that dm-crypt has been broken, and is known to
>>> cause massive data corruption.
>
>> So far there isn't much in the way of controlled experiments, but:
>>
>> All the reports agree the problem is independent of filesystem.
>>
>> One thread suggests only filesystem metadata is corrupted, not file
>> data, and wonders if something's going wrong with (unsupported) write
>> barriers.
>>
>> Another report said dm-crypt over raid5 failed while raid5
>> over dm-crypt worked.
>
> A data point:
>
> I'm running my /home on reiserfs3 over dm-crypt over lvm over raid5 for
> at least a year now, without any problems. Currently running 2.6.13.4
> (that's my "stable" work system...).
Just so you know,
I'm running dm-crypt on top of raid-5 as well. Kernels ranging from
Gentoo's hardened 2.6.11 to 2.6.15.X with the Gentoo patchset on AMD64.
The RAID has been running since February 2005 with >1TB and survived a
disk failure with rebuild.
Cipher module was aes, now the asm-accelerated x86_64 version. The
filesystem is ext-3. Survived several hard lockups (damn cheap SATA
controllers hanging if a drive passes out), an LV/filesystem resize, and
feeding with GBytes of data in a row (at max ~30MByte/s to 2-3 files in
parallel).
Just re-checked the filesystem: no metadata errors found. I remember
checking the CRC of several bigger archives when I had to replace a
drive two months ago, and I couldn't find any problems then.
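
A minimal sketch of how such a CRC check could be repeated systematically, assuming a directory tree whose files are not expected to change between two runs. It records a CRC-32 per file and reports files whose checksum differs later. The function names are illustrative assumptions, not the tool that was actually used for the check described above.

```python
import os
import zlib


def file_crc32(path, chunk_size=1024 * 1024):
    """Compute the CRC-32 of a file, reading it in chunks."""
    crc = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def crc_manifest(root):
    """Map each file's path (relative to `root`) to its CRC-32."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = file_crc32(full)
    return manifest


def changed_files(before, after):
    """Return paths from `before` that are missing or differ in `after`."""
    return sorted(p for p in before if after.get(p) != before[p])
```

Taking one manifest before a heavy write workload and another afterwards, then passing both to `changed_files`, would show whether file data (as opposed to filesystem metadata) was damaged.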
Best regards,
Christian
* Re: [dm-crypt] dm-crypt is broken and causes massive data corruption
2006-05-11 15:15 ` Paul Slootman
2006-05-11 15:42 ` Andrea Gelmini
2006-05-11 23:17 ` Christian Schmidt
@ 2006-05-12 21:47 ` Dan Merillat
2 siblings, 0 replies; 8+ messages in thread
From: Dan Merillat @ 2006-05-12 21:47 UTC (permalink / raw)
To: linux-kernel
On 5/11/06, Paul Slootman <paul+nospam@wurtel.net> wrote:
> A data point:
>
> I'm running my /home on reiserfs3 over dm-crypt over lvm over raid5 for
> at least a year now, without any problems. Currently running 2.6.13.4
> (that's my "stable" work system...).
Datapoint:
Linux fileserver 2.6.15.6 #1 PREEMPT Wed Mar 8 20:26:55 EST 2006
x86_64 GNU/Linux
CONFIG_MD_RAID5=y
CONFIG_BLK_DEV_DM=y
CONFIG_DM_SNAPSHOT=y
CONFIG_CRYPTO_AES_X86_64=y
encrypted logical volume on a raid-5 MD on 4 SATA drives, mounted reiser3.
aes-cbc-plain
It's worked through multiple kernels, and through the move from 32 to
64 bits. 2.6.11 (64-bit), 2.6.10 (64-bit), 2.6.8 (32-bit) is the kernel
history I have so far. I'm not sure when I switched from cryptoloop to
dm-crypt, though; it was at least before May '05.
I'm not running dm-crypt directly on MD, though, the stack is
SATA->MD->DM->DM-crypt->reiser3. That may be the difference.
I've got plenty of free space, I could make a ~75GB encrypted
partition and run any sort of write pattern test/filesystem you want
me to try.