* Bug#403426: kernel corrupts LUKS partition header on arm
       [not found] <20061217022906.2434.60658.reportbug@LKG8A754B.example.org>
@ 2006-12-20 16:15 ` Martin Michlmayr
  2006-12-29 10:52   ` Clemens Fruhwirth
  0 siblings, 1 reply; 19+ messages in thread
From: Martin Michlmayr @ 2006-12-20 16:15 UTC (permalink / raw)
  To: dm-devel, Clemens Fruhwirth; +Cc: Brian Brunswick, 403426

We're seeing corruption of LUKS partition headers on ARM.  I've
confirmed this on two different ARM platforms (IXP4xx and IOP32x) and
with 2.6.17 and 2.6.18.

Basically, when you create a LUKS partition on a PC and then connect
it to an ARM box and open it, you get an "automatic header conversion
from 0.99 to 0.991 triggered" message and afterwards the LUKS
partition header is corrupted.

Here are steps to reproduce this:

On the PC:

27340:tbm@deprecation: ~] sudo cryptsetup luksFormat /dev/sda6

WARNING!
========
This will overwrite data on /dev/sda6 irrevocably.

Are you sure? (Type uppercase yes): YES
Enter LUKS passphrase:
Verify passphrase:
Command successful.
27348:tbm@deprecation: ~] sudo cryptsetup luksOpen /dev/sda6 x
Enter LUKS passphrase:
key slot 0 unlocked.
Command successful.
27351:tbm@deprecation: ~] sudo cryptsetup luksClose x
27352:tbm@deprecation: ~]

Connect the drive to the ARM box:

debian:~# cryptsetup luksOpen /dev/sdb6 x
Enter LUKS passphrase:
automatic header conversion from 0.99 to 0.991 triggered
[here it appears to hang; I press ctrl-c]
debian:~# cryptsetup luksOpen /dev/sdb6 x
Enter LUKS passphrase:
unknown hash spec in phdrEnter LUKS passphrase:
unknown hash spec in phdrEnter LUKS passphrase:
unknown hash spec in phdrCommand failed: No key available with this passphrase.

debian:~#

The original bug report with some more info:

* Brian Brunswick <bdb-reportbug@forbidden.co.uk> [2006-12-17 02:29]:
> Package: linux-image-2.6.18-3-ixp4xx
> Version: 2.6.18-8
> Severity: critical
> Justification: causes serious data loss
> 
> This is on an NSLU2, I wanted to use it to access some disks that I
> had used previously on another system that had encrypted partitions.
> However, when I tried cryptsetup luksOpen, I got an "automatic header
> conversion from 0.99 to 0.991 triggered" message, and then an infinite
> loop. Trying the same partition on the other system, now I get the
> same thing - its header is corrupted. Luckily, I'm paranoid and had a
> backup of the LUKS header! If I didn't have this, the whole
> partition's data would probably be lost.
> 
> Here's the result of some tests - and it's stranger than you think...
> 
> Used cryptsetup luksFormat on another system to set up a partition.
> 
> LKG8A754B:~# uname -a
> Linux LKG8A754B 2.6.18-3-ixp4xx #1 Mon Dec 11 17:20:00 UTC 2006 armv5tel GNU/Linux
> 
> !!! ... after some experiments corrupting it... now I restore it again
> and decide to strace things... !!!
> 
> LKG8A754B:~# dd < sde4-luks-header-backup  > /dev/sde4
> 150+0 records in
> 150+0 records out
> 76800 bytes (77 kB) copied, 0.0148216 seconds, 5.2 MB/s
> LKG8A754B:~# strace 2>trace cryptsetup luksOpen /dev/sde4 sde4
> Enter LUKS passphrase: 
> key slot 0 unlocked.
> 
> !!! Hey, that just worked !!!
> 
> LKG8A754B:~# ls -l /dev/mapper/sde4
> brw-rw---- 1 root disk 254, 0 Dec 17 02:00 /dev/mapper/sde4
> LKG8A754B:~# cmp -l /dev/sde4 sde4-luks-header-backup 
> cmp: EOF on sde4-luks-header-backup
> 
> !!! But: !!!
> 
> LKG8A754B:~# cryptsetup remove sde4
> !!! no strace !!!
> LKG8A754B:~# cryptsetup luksOpen /dev/sde4 sde4
> Enter LUKS passphrase: 
> automatic header conversion from 0.99 to 0.991 triggered
> <ctrl-c>
> LKG8A754B:~# cmp -l /dev/sde4 sde4-luks-header-backup        
>   168   0  12
>   513 114   0
>   514 125   0
>   515 113   0
>   516 123   0
>   517 272   0
>   518 276   0
>   520   1   0
>   521 141   0
>   522 145   0
>   523 163   0
>   539   0   3
>   540   0  10
>   543   0  17
>   544   0 240
>   553 143   0
>   554 142   0
>   555 143   0
>   556  55   0
>   557 145   0
>   558 163   0
>   559 163   0
>   560 151   0
>   561 166   0
>   562  72   0
>   563 163   0
>   564 150   0
>   565 141   0
>   566  62   0
>   567  65   0
>   568  66   0
>   585 163   0
>   586 150   0
>   587 141   3
>   588  61 210
>   591   0  17
>   592   0 240
> cmp: EOF on sde4-luks-header-backup
> LKG8A754B:~# 
> 
> !!!
> 
> It's done something like overwrite the second sector of the header with
> the first one. I had a look at the cryptsetup code, and the conversion
> message is triggered by it finding the wrong state code for the
> passphrase slot - so the data has been overwritten by the time it's got
> there.
> 
> This is reliable - it always seems to corrupt it without strace,
> and always works with strace!???
> 
> Err....
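
The cmp -l listing quoted above can be read more easily with a small
sketch. It assumes 512-byte sectors; the helper names are made up for
illustration and are not part of cryptsetup. cmp -l prints 1-based byte
offsets and octal byte values, so offsets 513 to 592 fall into the
second sector, and the device bytes at 513 to 518 (114 125 113 123 272
276) decode to 'L' 'U' 'K' 'S' 0xba 0xbe, which looks like the LUKS
magic that normally sits at the very start of the header.

```c
#include <assert.h>
#include <stdlib.h>

#define SECTOR_SIZE 512UL	/* assumption: 512-byte sectors */

/* cmp -l prints 1-based byte offsets; return the 0-based sector index. */
unsigned long cmp_offset_to_sector(unsigned long offset)
{
	assert(offset >= 1);
	return (offset - 1) / SECTOR_SIZE;
}

/* cmp -l prints byte values in octal; convert one such field to a byte.
 * Returns -1 if the value does not fit into a byte. */
int cmp_octal_to_byte(const char *field)
{
	long v = strtol(field, NULL, 8);

	return (v < 0 || v > 255) ? -1 : (int)v;
}
```

For example, cmp_offset_to_sector(168) is sector 0 and
cmp_offset_to_sector(513) is sector 1.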

According to the original bug reporter, cryptsetup hangs when it shows
"automatic header conversion from 0.99 to 0.991 triggered".  However,
I tried it with a small partition now (about 90 MB) and it doesn't hang.
I got:

foobar:~# cryptsetup luksOpen /dev/sda5 x
Enter LUKS passphrase:
automatic header conversion from 0.99 to 0.991 triggereddevice-mapper: crypt: Device lookup failed
device-mapper: error adding target to table
device-mapper: device doesn't appear to be in the dev hash table.
Failed to setup dm-crypt key mapping.
Check kernel for support for the aes-cbc-essiv:sha256 cipher spec and verify that /dev/sda5 contains at least 133 sectors.
Failed to read from key storage
Enter LUKS passphrase:
unknown hash spec in phdrEnter LUKS passphrase:

-- 
Martin Michlmayr
http://www.cyrius.com/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Bug#403426: kernel corrupts LUKS partition header on arm
  2006-12-20 16:15 ` Bug#403426: kernel corrupts LUKS partition header on arm Martin Michlmayr
@ 2006-12-29 10:52   ` Clemens Fruhwirth
  2006-12-29 16:38     ` Martin Michlmayr
  2006-12-29 20:24     ` Martin Michlmayr
  0 siblings, 2 replies; 19+ messages in thread
From: Clemens Fruhwirth @ 2006-12-29 10:52 UTC (permalink / raw)
  To: Martin Michlmayr; +Cc: dm-devel, Brian Brunswick, 403426

At Wed, 20 Dec 2006 17:15:29 +0100,
Martin Michlmayr <tbm@cyrius.com> wrote:

> We're seeing corruption of LUKS partition headers on ARM.  I've
> confirmed this on two different ARM platforms (IXP4xx and IOP32x) and
> with 2.6.17 and 2.6.18.
> 
> Basically, when you create a LUKS partition on a PC and then connect
> it to an ARM box and open it, you get an "automatic header conversion
> from 0.99 to 0.991 triggered" message and afterwards the LUKS
> partition header is corrupted.

Please try the version from subversion
http://luks.endorphin.org/svn/cryptsetup 

I just kicked this conversion routine as it is for pre-1.0 releases
and guess there is no single deployment that will ever need it.

This won't change the bug itself, but it won't corrupt your partition
anymore. It just fails. 

> > It's done something like overwrite the second sector of the header with
> > the first one. I had a look at the cryptsetup code, and the conversion
> > message is triggered by it finding the wrong state code for the
> > passphrase slot - so the data has been overwritten by the time it's got
> > there.

That looks right.

A good amount of staring out of the window drew my attention to
(read|write|write_lseek)_blockwise in utils.c.

Reading from a file descriptor opened with O_DIRECT requires
blockwise reading into an aligned memory segment. That's the reason
for all the magic in these routines.
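
The alignment requirement can be illustrated with a minimal sketch. It
assumes POSIX posix_memalign() instead of cryptsetup's own
aligned_malloc() helper, and the name alloc_sector_buffer is
hypothetical.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not from cryptsetup): allocate a buffer whose
 * address is a multiple of the sector size, as O_DIRECT I/O expects.
 * 'bsize' is assumed to be a power of two and a multiple of
 * sizeof(void *), e.g. 512.  Returns NULL on failure. */
void *alloc_sector_buffer(size_t bsize, size_t nsectors)
{
	void *p = NULL;

	if (bsize == 0 || nsectors == 0 || nsectors > SIZE_MAX / bsize)
		return NULL;
	if (posix_memalign(&p, bsize, bsize * nsectors) != 0)
		return NULL;
	assert(((uintptr_t)p % bsize) == 0);	/* guaranteed by posix_memalign */
	return p;				/* release with free() */
}
```

The read length and the file offset need the same alignment, which is
why the blockwise routines always transfer whole sectors through such
a buffer and then copy out only the part that was asked for.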

Looking at read_blockwise, r=read(fd,buf,size) might just return a
short read, that is r<size. But the read_blockwise routine never
handles that case. For some reason ARM might behave differently from
other archs here.

I just added the r!=bsize case to error checking and an error message
as well.
	while(count) {
		r = read(fd,padbuf,bsize);
		if(r < 0 || r != bsize) {
			fprintf(stderr, "read failed in read_blockwise.\n");
			goto out;
		}
		step = count<bsize?count:bsize;
		memcpy(buf,padbuf,step);
		buf += step;
		count -= step;
	}

The changes are also in subversion.

Please try.
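
An alternative to failing on a short read is to retry until the
requested amount has arrived. The following is only a sketch of that
idea, not the code in subversion; read_full is a hypothetical name.
Note that with O_DIRECT a resumed read at an unaligned buffer position
may itself be rejected, so the sketch illustrates the retry loop for
ordinary descriptors.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling read() until 'len' bytes have
 * arrived, end of file is reached, or a real error occurs.
 * Returns the number of bytes read (less than 'len' only at EOF),
 * or -1 with errno set. */
ssize_t read_full(int fd, void *buf, size_t len)
{
	char *p = buf;
	size_t done = 0;

	assert(buf != NULL || len == 0);
	while (done < len) {
		ssize_t r = read(fd, p + done, len - done);

		if (r < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, try again */
			return -1;
		}
		if (r == 0)
			break;			/* end of file */
		done += (size_t)r;
	}
	return (ssize_t)done;
}
```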
-- 
Fruhwirth Clemens - http://clemens.endorphin.org 
for robots: sp4mtrap@endorphin.org

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Bug#403426: kernel corrupts LUKS partition header on arm
  2006-12-29 10:52   ` Clemens Fruhwirth
@ 2006-12-29 16:38     ` Martin Michlmayr
  2006-12-29 20:24     ` Martin Michlmayr
  1 sibling, 0 replies; 19+ messages in thread
From: Martin Michlmayr @ 2006-12-29 16:38 UTC (permalink / raw)
  To: Clemens Fruhwirth; +Cc: 403426, dm-devel, Brian Brunswick

* Clemens Fruhwirth <clemens@endorphin.org> [2006-12-29 11:52]:
> I just added the r!=bsize case to error checking and an error message
> as well.
...
> The changes are also in subversion.

This particular change didn't make any difference.  I still get the
header conversion message when I only apply the patch to utils.c.


#! /bin/sh /usr/share/dpatch/dpatch-run
## 02_fix_arm.dpatch by Clemens Fruhwirth <clemens@endorphin.org>
##
## DP: Add error checking to read_blockwise for short reads.
## DP: Commit a patch that fixes http://bugs.debian.org/403075

@DPATCH@
Index: lib/utils.c
===================================================================
--- cryptsetup-1.0.4~/lib/utils.c	(revision 1)
+++ cryptsetup-1.0.4/lib/utils.c	(working copy)
@@ -151,8 +151,10 @@
 static int sector_size(int fd) 
 {
 	int bsize;
-	ioctl(fd,BLKSSZGET, &bsize);
-	return bsize;
+	if (ioctl(fd,BLKSSZGET, &bsize) < 0)
+		return -EINVAL;
+	else
+		return bsize;
 }
 
 int sector_size_for_device(const char *device)
@@ -171,8 +173,11 @@
 	char *padbuf; char *padbuf_base;
 	char *buf = (char *)orig_buf;
 	int r;
-	int hangover; int solid; int bsize = sector_size(fd);
+	int hangover; int solid; int bsize;
 
+	if ((bsize = sector_size(fd)) < 0)
+		return bsize;
+
 	hangover = count % bsize;
 	solid = count - hangover;
 
@@ -209,15 +214,20 @@
 	char *buf = (char *)orig_buf;
 	int r;
 	int step;
-	int bsize = sector_size(fd);
+	int bsize;
 
+	if ((bsize = sector_size(fd)) < 0)
+		return bsize;
+
 	padbuf = aligned_malloc(&padbuf_base, bsize, bsize);
 	if(padbuf == NULL) return -ENOMEM;
 
 	while(count) {
 		r = read(fd,padbuf,bsize);
-		if(r < 0) goto out;
-		
+		if(r < 0 || r != bsize) {
+			fprintf(stderr, "read failed in read_blockwise.\n");
+			goto out;
+		}
 		step = count<bsize?count:bsize;
 		memcpy(buf,padbuf,step);
 		buf += step;
@@ -242,6 +252,9 @@
 	int frontHang = offset % bsize;
 	int r;
 
+	if (bsize < 0)
+		return bsize;
+
 	lseek(fd, offset - frontHang, SEEK_SET);
 	if(offset % bsize) {
 		int innerCount = count<bsize?count:bsize;

-- 
Martin Michlmayr
http://www.cyrius.com/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Bug#403426: kernel corrupts LUKS partition header on arm
  2006-12-29 10:52   ` Clemens Fruhwirth
  2006-12-29 16:38     ` Martin Michlmayr
@ 2006-12-29 20:24     ` Martin Michlmayr
  2006-12-30 10:50       ` Clemens Fruhwirth
  1 sibling, 1 reply; 19+ messages in thread
From: Martin Michlmayr @ 2006-12-29 20:24 UTC (permalink / raw)
  To: Clemens Fruhwirth; +Cc: 403426, dm-devel, Brian Brunswick

[-- Attachment #1: Type: text/plain, Size: 2945 bytes --]

* Clemens Fruhwirth <clemens@endorphin.org> [2006-12-29 11:52]:
> Please try the version from subversion
> http://luks.endorphin.org/svn/cryptsetup

With 1.0.4 plus the attached 2 patches from SVN I no longer get any
corruption but I also cannot access my encrypted data.  Is there
anything else I should try?


foobar:~# cryptsetup luksOpen /dev/sda5 x
Enter LUKS passphrase:
device-mapper: table: 254:0: crypt: Device lookup failed
device-mapper: ioctl: error adding target to table
device-mapper: ioctl: device doesn't appear to be in the dev hash table.
Failed to setup dm-crypt key mapping.
Check kernel for support for the aes-cbc-essiv:sha256 cipher spec and verify that /dev/sda5 contains at least 133 sectors.
Failed to read from key storage
Enter LUKS passphrase:
device-mapper: table: 254:0: crypt: Device lookup failed
device-mapper: ioctl: error adding target to table
device-mapper: ioctl: device doesn't appear to be in the dev hash table.
Failed to setup dm-crypt key mapping.
Check kernel for support for the aes-cbc-essiv:sha256 cipher spec and verify that /dev/sda5 contains at least 133 sectors.
Failed to read from key storage
Enter LUKS passphrase:
device-mapper: table: 254:0: crypt: Device lookup failed
device-mapper: ioctl: error adding target to table
device-mapper: ioctl: device doesn't appear to be in the dev hash table.
Failed to setup dm-crypt key mapping.
Check kernel for support for the aes-cbc-essiv:sha256 cipher spec and verify that /dev/sda5 contains at least 133 sectors.
Failed to read from key storage
Command failed: No key available with this passphrase.

foobar:~# cryptsetup luksOpen /dev/sda5 x
Enter LUKS passphrase:
device-mapper: table: 254:0: crypt: Device lookup failed
device-mapper: ioctl: error adding target to table
device-mapper: ioctl: device doesn't appear to be in the dev hash table.
Failed to setup dm-crypt key mapping.
Check kernel for support for the aes-cbc-essiv:sha256 cipher spec and verify that /dev/sda5 contains at least 133 sectors.
Failed to read from key storage
Enter LUKS passphrase:
device-mapper: table: 254:0: crypt: Device lookup failed
device-mapper: ioctl: error adding target to table
device-mapper: ioctl: device doesn't appear to be in the dev hash table.
Failed to setup dm-crypt key mapping.
Check kernel for support for the aes-cbc-essiv:sha256 cipher spec and verify that /dev/sda5 contains at least 133 sectors.
Failed to read from key storage
Enter LUKS passphrase:
device-mapper: table: 254:0: crypt: Device lookup failed
device-mapper: ioctl: error adding target to table
device-mapper: ioctl: device doesn't appear to be in the dev hash table.
Failed to setup dm-crypt key mapping.
Check kernel for support for the aes-cbc-essiv:sha256 cipher spec and verify that /dev/sda5 contains at least 133 sectors.
Failed to read from key storage
Command failed: No key available with this passphrase.

foobar:~#

-- 
Martin Michlmayr
http://www.cyrius.com/

[-- Attachment #2: 02_fix_arm.dpatch --]
[-- Type: text/plain, Size: 1725 bytes --]

#! /bin/sh /usr/share/dpatch/dpatch-run
## 02_fix_arm.dpatch by Clemens Fruhwirth <clemens@endorphin.org>
##
## DP: Add error checking to read_blockwise for short reads.
## DP: Commit a patch that fixes http://bugs.debian.org/403075

@DPATCH@
Index: lib/utils.c
===================================================================
--- cryptsetup-1.0.4~/lib/utils.c	(revision 1)
+++ cryptsetup-1.0.4/lib/utils.c	(working copy)
@@ -151,8 +151,10 @@
 static int sector_size(int fd) 
 {
 	int bsize;
-	ioctl(fd,BLKSSZGET, &bsize);
-	return bsize;
+	if (ioctl(fd,BLKSSZGET, &bsize) < 0)
+		return -EINVAL;
+	else
+		return bsize;
 }
 
 int sector_size_for_device(const char *device)
@@ -171,8 +173,11 @@
 	char *padbuf; char *padbuf_base;
 	char *buf = (char *)orig_buf;
 	int r;
-	int hangover; int solid; int bsize = sector_size(fd);
+	int hangover; int solid; int bsize;
 
+	if ((bsize = sector_size(fd)) < 0)
+		return bsize;
+
 	hangover = count % bsize;
 	solid = count - hangover;
 
@@ -209,15 +214,20 @@
 	char *buf = (char *)orig_buf;
 	int r;
 	int step;
-	int bsize = sector_size(fd);
+	int bsize;
 
+	if ((bsize = sector_size(fd)) < 0)
+		return bsize;
+
 	padbuf = aligned_malloc(&padbuf_base, bsize, bsize);
 	if(padbuf == NULL) return -ENOMEM;
 
 	while(count) {
 		r = read(fd,padbuf,bsize);
-		if(r < 0) goto out;
-		
+		if(r < 0 || r != bsize) {
+			fprintf(stderr, "read failed in read_blockwise.\n");
+			goto out;
+		}
 		step = count<bsize?count:bsize;
 		memcpy(buf,padbuf,step);
 		buf += step;
@@ -242,6 +252,9 @@
 	int frontHang = offset % bsize;
 	int r;
 
+	if (bsize < 0)
+		return bsize;
+
 	lseek(fd, offset - frontHang, SEEK_SET);
 	if(offset % bsize) {
 		int innerCount = count<bsize?count:bsize;

[-- Attachment #3: 03_no_header_conv.dpatch --]
[-- Type: text/plain, Size: 1398 bytes --]

#! /bin/sh /usr/share/dpatch/dpatch-run
## 03_no_header_conv.patch by Clemens Fruhwirth <clemens@endorphin.org>
##
## DP: Kick ancient version header conversion.

@DPATCH@
Index: luks/keymanage.c
===================================================================
--- a/luks/keymanage.c	(revision 19)
+++ b/luks/keymanage.c	(working copy)
@@ -67,14 +67,6 @@
 	return mk;
 }
 
-static inline void convert_V99toV991(char const *device, struct luks_phdr *hdr) {
-	struct luks_phdr tmp_phdr;
-	fputs(_("automatic header conversion from 0.99 to 0.991 triggered"), stderr);
-	hdr->mkDigestIterations = ntohs(htonl(hdr->mkDigestIterations));
-	memcpy(&tmp_phdr, hdr, sizeof(struct luks_phdr));
-	LUKS_write_phdr(device, &tmp_phdr); 
-}
-
 int LUKS_read_phdr(const char *device, struct luks_phdr *hdr)
 {
 	int devfd = 0; 
@@ -109,14 +101,6 @@
 			hdr->keyblock[i].passwordIterations = ntohl(hdr->keyblock[i].passwordIterations);
 			hdr->keyblock[i].keyMaterialOffset  = ntohl(hdr->keyblock[i].keyMaterialOffset);
 			hdr->keyblock[i].stripes            = ntohl(hdr->keyblock[i].stripes);
-
-			if(hdr->keyblock[i].active == LUKS_KEY_DISABLED_OLD) {
-				hdr->keyblock[i].active = LUKS_KEY_DISABLED;
-				convert_V99toV991(device, hdr);
-			} else if(hdr->keyblock[i].active == LUKS_KEY_ENABLED_OLD) {
-				hdr->keyblock[i].active = LUKS_KEY_ENABLED;
-				convert_V99toV991(device, hdr);
-			}
 		}
 	}
 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Bug#403426: kernel corrupts LUKS partition header on arm
  2006-12-29 20:24     ` Martin Michlmayr
@ 2006-12-30 10:50       ` Clemens Fruhwirth
  2006-12-30 13:13         ` Martin Michlmayr
  0 siblings, 1 reply; 19+ messages in thread
From: Clemens Fruhwirth @ 2006-12-30 10:50 UTC (permalink / raw)
  To: Martin Michlmayr; +Cc: dm-devel, Brian Brunswick, 403426

At Fri, 29 Dec 2006 21:24:34 +0100,
Martin Michlmayr <tbm@cyrius.com> wrote:
> 
> * Clemens Fruhwirth <clemens@endorphin.org> [2006-12-29 11:52]:
> > Please try the version from subversion
> > http://luks.endorphin.org/svn/cryptsetup
> 
> With 1.0.4 plus the attached 2 patches from SVN I no longer get any
> corruption but I also cannot access my encrypted data.  

That's good :)

> Is there anything else I should try?
> foobar:~# cryptsetup luksOpen /dev/sda5 x
> Enter LUKS passphrase:
> device-mapper: table: 254:0: crypt: Device lookup failed
> device-mapper: ioctl: error adding target to table
> device-mapper: ioctl: device doesn't appear to be in the dev hash table.
> Failed to setup dm-crypt key mapping.
> Check kernel for support for the aes-cbc-essiv:sha256 cipher spec and verify that /dev/sda5 contains at least 133 sectors.
> Failed to read from key storage

Are you sure we don't see any device mapper problems here? You might
just go over to a box where LUKS works (when the LUKS partition does
not contain anything security relevant to you), and do

dmsetup table mappingname > dm-table-file

open it up, and find the 7th entry that says something like x:y
(major:minor device number of your underlying device). Replace this
with the correct device path on the ARM box, copy that file over to
the ARM box, and set it up with

dmsetup create mappingname dm-table-file

If that does not work, we are seeing a dm-crypt layer problem here.
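
The edit of the 7th entry can also be sketched in code. This is a
minimal illustration, assuming the table line has the
whitespace-separated layout that dmsetup prints for a crypt target;
replace_backing_device is a hypothetical name, and the table line
contains the key, so handle it with care.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy one line of `dmsetup table` output into
 * 'out', replacing the 7th whitespace-separated field (the backing
 * device, given as major:minor) with 'dev'.  Fields in the result are
 * separated by single spaces.  Returns 0 on success, -1 if the line
 * has fewer than 7 fields or 'out' is too small. */
int replace_backing_device(const char *line, const char *dev,
			   char *out, size_t outlen)
{
	const char *p = line;
	size_t used = 0;
	int field = 0;

	assert(line != NULL && dev != NULL && out != NULL);
	while (*p) {
		size_t n = strcspn(p, " \t\n");	/* length of this field */

		if (n == 0) {			/* skip a separator */
			p++;
			continue;
		}
		field++;
		const char *src = (field == 7) ? dev : p;
		size_t len = (field == 7) ? strlen(dev) : n;

		/* room for a separator, the field and the final NUL */
		if (used + len + 2 > outlen)
			return -1;
		if (used)
			out[used++] = ' ';
		memcpy(out + used, src, len);
		used += len;
		p += n;
	}
	if (field < 7)
		return -1;
	out[used] = '\0';
	return 0;
}
```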

> Enter LUKS passphrase:

I just committed a patch that prevents passphrase retries after I/O
errors.
-- 
Fruhwirth Clemens - http://clemens.endorphin.org 
for robots: sp4mtrap@endorphin.org

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Bug#403426: kernel corrupts LUKS partition header on arm
  2006-12-30 10:50       ` Clemens Fruhwirth
@ 2006-12-30 13:13         ` Martin Michlmayr
  2007-01-02 17:00           ` Clemens Fruhwirth
  0 siblings, 1 reply; 19+ messages in thread
From: Martin Michlmayr @ 2006-12-30 13:13 UTC (permalink / raw)
  To: Clemens Fruhwirth; +Cc: 403426, dm-devel, Brian Brunswick

* Clemens Fruhwirth <clemens@endorphin.org> [2006-12-30 11:50]:
> > Is there anything else I should try?
> > foobar:~# cryptsetup luksOpen /dev/sda5 x
> > Enter LUKS passphrase:
> > device-mapper: table: 254:0: crypt: Device lookup failed

Strange.  I haven't changed cryptsetup but now I don't get this
message anymore.  I simply get:

foobar:~# cryptsetup luksOpen /dev/sdb2 x
Enter LUKS passphrase:
Enter LUKS passphrase:
Enter LUKS passphrase:
Command failed: No key available with this passphrase.

But I'm sure I've typed the passphrase correctly, and it works on my
x86 box (on which I created the LUKS partition).

> dmsetup create mappingname dm-table-file
> 
> If that does not work, we are seeing a dm-crypt layer problem here.

This works:

foobar:~# cat map
0 1954808 crypt aes-cbc-essiv:sha256 a0671e89e060742caf3f9a3e7d762b24 0 /dev/sdb2 1032
foobar:~# dmsetup create x map
foobar:~# mount /dev/mapper/x /mnt
foobar:~# ls -l /mnt
total 20
drwx------ 2 root root 16384 Dec 30  2006 lost+found
-rw-r--r-- 1 root root     5 Dec 30  2006 x
foobar:~#

-- 
Martin Michlmayr
http://www.cyrius.com/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Bug#403426: kernel corrupts LUKS partition header on arm
  2006-12-30 13:13         ` Martin Michlmayr
@ 2007-01-02 17:00           ` Clemens Fruhwirth
  2007-01-02 18:04             ` Martin Michlmayr
  0 siblings, 1 reply; 19+ messages in thread
From: Clemens Fruhwirth @ 2007-01-02 17:00 UTC (permalink / raw)
  To: Martin Michlmayr; +Cc: 403426, dm-devel, Brian Brunswick

At Sat, 30 Dec 2006 14:13:42 +0100,
Martin Michlmayr <tbm@cyrius.com> wrote:
> 
> * Clemens Fruhwirth <clemens@endorphin.org> [2006-12-30 11:50]:
> > > Is there anything else I should try?
> > > foobar:~# cryptsetup luksOpen /dev/sda5 x
> > > Enter LUKS passphrase:
> > > device-mapper: table: 254:0: crypt: Device lookup failed
> 
> Strange.  I haven't changed cryptsetup but now I don't get this
> message anymore.  I simply get:
> 
> foobar:~# cryptsetup luksOpen /dev/sdb2 x
> Enter LUKS passphrase:
> Enter LUKS passphrase:
> Enter LUKS passphrase:
> Command failed: No key available with this passphrase.
> 
> But I'm sure I've typed the passphrase correctly, and it works on my
> x86 box (on which I created the LUKS partition).

Does luksDump report the same things on both architectures?
--
Fruhwirth Clemens - http://clemens.endorphin.org 
for robots: sp4mtrap@endorphin.org

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Bug#403426: kernel corrupts LUKS partition header on arm
  2007-01-02 17:00           ` Clemens Fruhwirth
@ 2007-01-02 18:04             ` Martin Michlmayr
  2007-01-02 18:34               ` Clemens Fruhwirth
  2007-01-03 16:59               ` Clemens Fruhwirth
  0 siblings, 2 replies; 19+ messages in thread
From: Martin Michlmayr @ 2007-01-02 18:04 UTC (permalink / raw)
  To: Clemens Fruhwirth; +Cc: 403426, dm-devel, Brian Brunswick

* Clemens Fruhwirth <clemens@endorphin.org> [2007-01-02 18:00]:
> Does luksDump report the same things on both architectures?

Yes.
-- 
Martin Michlmayr
http://www.cyrius.com/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Bug#403426: kernel corrupts LUKS partition header on arm
  2007-01-02 18:04             ` Martin Michlmayr
@ 2007-01-02 18:34               ` Clemens Fruhwirth
  2007-01-03 16:59               ` Clemens Fruhwirth
  1 sibling, 0 replies; 19+ messages in thread
From: Clemens Fruhwirth @ 2007-01-02 18:34 UTC (permalink / raw)
  To: Martin Michlmayr; +Cc: 403426, dm-devel, Brian Brunswick

At Tue, 2 Jan 2007 19:04:41 +0100,
Martin Michlmayr <tbm@cyrius.com> wrote:
> 
> * Clemens Fruhwirth <clemens@endorphin.org> [2007-01-02 18:00]:
> > Does luksDump report the same things on both architectures?
> 
> Yes.

Strange. Can I somehow gain access to your test box? I'm not entirely
out of guesses, but it's not efficient to communicate them.

PGP key is on subkeys.pgp.net
pub   1024D/31092E4E 2003-02-09
      Key fingerprint = 0039 316D D85C 1FB3 4A39  10A2 5BBB 2BF4 3109 2E4E
uid                  Fruhwirth Clemens <clemens@endorphin.org>
-- 
Fruhwirth Clemens - http://clemens.endorphin.org 
for robots: sp4mtrap@endorphin.org

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Bug#403426: kernel corrupts LUKS partition header on arm
  2007-01-02 18:04             ` Martin Michlmayr
  2007-01-02 18:34               ` Clemens Fruhwirth
@ 2007-01-03 16:59               ` Clemens Fruhwirth
  2007-01-03 19:14                 ` Martin Michlmayr
  1 sibling, 1 reply; 19+ messages in thread
From: Clemens Fruhwirth @ 2007-01-03 16:59 UTC (permalink / raw)
  To: Martin Michlmayr; +Cc: 403426, dm-devel, Brian Brunswick

At Tue, 2 Jan 2007 19:04:41 +0100,
Martin Michlmayr <tbm@cyrius.com> wrote:
> 
> * Clemens Fruhwirth <clemens@endorphin.org> [2007-01-02 18:00]:
> > Does luksDump report the same things on both architectures?
> 
> Yes.

After a bit of debugging on Gordon's slug, I found out that we have
some kind of read race/read corruption when reading the encrypted
master key from a key slot.

If you get into luks/keyencryption.c:LUKS_decrypt_from_storage and replace
	return LUKS_endec_template(dst,dstLength,hdr,key,keyLength, device, sector, backend, read_blockwise, O_RDONLY);
by something like
        int r=LUKS_endec_template(dst,dstLength,hdr,key,keyLength, device, sector, backend, read_blockwise, O_RDONLY);
        while(dstLength) {
                hexprint(dst, 32);
                dstLength-=32;
                dst+=32;
                printf("\n");
        }
        return r;
you get the whole decrypted content dumped to stdout. Just to give you
a brief idea of how master key decryption from a key slot works:
cryptsetup derives a user key, sets up a temporary dm-crypt mapping
with that user key, and starts to read encrypted content from the
underlying device via the temporary dm-crypt mapping.

The problem: The decrypted content is _different_ for two identical
runs. When you use the debugging snippet from above, you can see that
the corruptions seem to be displaced copies of other parts of the
content. For instance, if every character below is a 32-byte block,
then we would see:

a b c d e f 
a b c d c f

I have no idea why 32-byte blocks in particular. There seems to be no
regular pattern in where the corruption occurs, for instance every
x'th block, but there is a regular pattern in which block is
duplicated at the point of the corruption, namely the block 16
positions earlier. (Notice 16*32 bytes = 512 bytes = sector size.
Welcome to Linux number mysticism.)
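
The pattern described above can be checked mechanically. The following
is a minimal sketch, assuming one known-good dump and one suspect dump
of the same region are available in memory; the function name and the
choice to compare against the good dump are assumptions made for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK 32	/* granularity of the observed corruption */
#define LAG   16	/* 16 * 32 bytes = 512 bytes = one sector */

/* Compare a known-good and a suspect dump in 32-byte blocks.  Count
 * the blocks of 'bad' that differ from 'good' and are identical to
 * the good block LAG positions earlier.  The index of the first such
 * block is stored in '*first' if 'first' is not NULL. */
size_t count_displaced_blocks(const unsigned char *good,
			      const unsigned char *bad,
			      size_t len, size_t *first)
{
	size_t hits = 0;

	assert(good != NULL && bad != NULL);
	for (size_t i = LAG; (i + 1) * BLOCK <= len; i++) {
		const unsigned char *g = good + i * BLOCK;
		const unsigned char *b = bad + i * BLOCK;

		if (memcmp(g, b, BLOCK) == 0)
			continue;		/* block is intact */
		if (memcmp(b, good + (i - LAG) * BLOCK, BLOCK) == 0) {
			if (hits == 0 && first != NULL)
				*first = i;
			hits++;
		}
	}
	return hits;
}
```

A non-zero result on real dumps would support the idea that whole
32-byte pieces are being taken from one sector earlier.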

As far as I understand page caching comes after dm-crypt, so maybe we
have some kind of cache corruption here?

One more thing: The situation stabilizes under strace. Using strace
usually lets you open the LUKS partition.
-- 
Fruhwirth Clemens - http://clemens.endorphin.org 
for robots: sp4mtrap@endorphin.org

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Bug#403426: kernel corrupts LUKS partition header on arm
  2007-01-03 16:59               ` Clemens Fruhwirth
@ 2007-01-03 19:14                 ` Martin Michlmayr
  2007-01-03 19:32                   ` Clemens Fruhwirth
  0 siblings, 1 reply; 19+ messages in thread
From: Martin Michlmayr @ 2007-01-03 19:14 UTC (permalink / raw)
  To: Clemens Fruhwirth; +Cc: 403426, dm-devel, Brian Brunswick

* Clemens Fruhwirth <clemens@endorphin.org> [2007-01-03 17:59]:
> After a bit of debugging on Gordon's slug, I found out that we have
> some kind of read race/read corruption when reading the encrypted
> master key from a key slot.
...
> As far as I understand page caching comes after dm-crypt, so maybe we
> have some kind of cache corruption here?

Do you think this is related to http://lkml.org/lkml/2006/12/21/157 ?
I just applied the two patches from that thread and successfully ran
'cryptsetup luksClose' on ARM.  Would the lack of __flush_anon_page()
on ARM explain the corruption you've observed?
-- 
Martin Michlmayr
http://www.cyrius.com/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Bug#403426: kernel corrupts LUKS partition header on arm
  2007-01-03 19:14                 ` Martin Michlmayr
@ 2007-01-03 19:32                   ` Clemens Fruhwirth
  2007-01-03 19:37                     ` Martin Michlmayr
  0 siblings, 1 reply; 19+ messages in thread
From: Clemens Fruhwirth @ 2007-01-03 19:32 UTC (permalink / raw)
  To: Martin Michlmayr; +Cc: 403426, dm-devel, Brian Brunswick

At Wed, 3 Jan 2007 20:14:42 +0100,
Martin Michlmayr <tbm@cyrius.com> wrote:
> 
> * Clemens Fruhwirth <clemens@endorphin.org> [2007-01-03 17:59]:
> > After a bit of debugging on Gordon's slug, I found out that we have
> > some kind of read race/read corruption when reading the encrypted
> > master key from a key slot.
> ...
> > As far as I understand page caching comes after dm-crypt, so maybe we
> > have some kind of cache corruption here?
> 
> Do you think this is related to http://lkml.org/lkml/2006/12/21/157 ?

I'm sorry, I have no idea.

> I just applied the two patches from that thread and successfully ran
> 'cryptsetup luksClose' on ARM.  

"cryptsetup luksClose" is just an alias for "cryptsetup remove". This
should never fail. What's with luksOpen after the patches?

> Would the lack of __flush_anon_page() on ARM explain the corruption
> you've observed?

Again, I'm not familiar with this.
-- 
Fruhwirth Clemens - http://clemens.endorphin.org 
for robots: sp4mtrap@endorphin.org

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Bug#403426: kernel corrupts LUKS partition header on arm
  2007-01-03 19:32                   ` Clemens Fruhwirth
@ 2007-01-03 19:37                     ` Martin Michlmayr
  2007-01-04 11:56                       ` Clemens Fruhwirth
  0 siblings, 1 reply; 19+ messages in thread
From: Martin Michlmayr @ 2007-01-03 19:37 UTC (permalink / raw)
  To: Clemens Fruhwirth; +Cc: 403426, dm-devel, Brian Brunswick

* Clemens Fruhwirth <clemens@endorphin.org> [2007-01-03 20:32]:
> > I just applied the two patches from that thread and successfully ran
> > 'cryptsetup luksClose' on ARM.
> 
> "cryptsetup luksClose" is just an alias for "cryptsetup remove". This
> should never fail. What's with luksOpen after the patches?

Sorry, I meant to say that luksOpen worked with the patches.  I simply
copy&pasted the wrong line.

> > Would the lack of __flush_anon_page() on ARM explain the corruption
> > you've observed?
> Again, I'm not familiar with this.

Well, it seems to fix the problem and according to the thread on lkml
the lack of flush_anon_page() on ARM is associated with some
corruption.  At least FUSE doesn't work on ARM without those patches,
so it seems likely that luksOpen is also affected.
-- 
Martin Michlmayr
http://www.cyrius.com/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Bug#403426: kernel corrupts LUKS partition header on arm
  2007-01-03 19:37                     ` Martin Michlmayr
@ 2007-01-04 11:56                       ` Clemens Fruhwirth
  2007-01-04 15:09                         ` Martin Michlmayr
  0 siblings, 1 reply; 19+ messages in thread
From: Clemens Fruhwirth @ 2007-01-04 11:56 UTC (permalink / raw)
  To: Martin Michlmayr; +Cc: 403426, dm-devel, Brian Brunswick

At Wed, 3 Jan 2007 20:37:22 +0100,
Martin Michlmayr <tbm@cyrius.com> wrote:
> 
> Well, it seems to fix the problem and according to the thread on lkml
> the lack of flush_anon_page() on ARM is associated with some
> corruption.  At least FUSE doesn't work on ARM without those patches,
> so it seems likely that luksOpen is also affected.

So, can we close the bug against cryptsetup in this case?
Maybe someone else can verify that?
-- 
Fruhwirth Clemens - http://clemens.endorphin.org 
for robots: sp4mtrap@endorphin.org

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Bug#403426: kernel corrupts LUKS partition header on arm
  2007-01-04 11:56                       ` Clemens Fruhwirth
@ 2007-01-04 15:09                         ` Martin Michlmayr
  2007-01-05  8:36                           ` Gordon Farquharson
  0 siblings, 1 reply; 19+ messages in thread
From: Martin Michlmayr @ 2007-01-04 15:09 UTC (permalink / raw)
  To: Clemens Fruhwirth; +Cc: 403426, dm-devel, Brian Brunswick, Gordon Farquharson

* Clemens Fruhwirth <clemens@endorphin.org> [2007-01-04 12:56]:
> > corruption.  At least FUSE doesn't work on ARM without those patches,
> > so it seems likely that luksOpen is also affected.
> 
> So, can we close the bug against cryptsetup in this case?

#403426 is really about the header corruption which you have fixed in
SVN.  It should be closed when the Debian maintainers make a new
upload with that fix.

> Maybe someone else can verify that?

CCing Gordon. :)
-- 
Martin Michlmayr
http://www.cyrius.com/

* Bug#403426: kernel corrupts LUKS partition header on arm
  2007-01-04 15:09                         ` Martin Michlmayr
@ 2007-01-05  8:36                           ` Gordon Farquharson
  2007-01-05  9:59                             ` Martin Michlmayr
  2007-01-07  5:47                             ` Gordon Farquharson
  0 siblings, 2 replies; 19+ messages in thread
From: Gordon Farquharson @ 2007-01-05  8:36 UTC (permalink / raw)
  To: Martin Michlmayr; +Cc: Clemens Fruhwirth, dm-devel, Brian Brunswick, 403426

On 1/4/07, Martin Michlmayr <tbm@cyrius.com> wrote:

> * Clemens Fruhwirth <clemens@endorphin.org> [2007-01-04 12:56]:
> > So, can we close the bug against cryptsetup in this case?
>
> #403426 is really about the header corruption which you have fixed in
> SVN.  It should be closed when the Debian maintainers make a new
> upload with that fix.
>
> > Maybe someone else can verify that?
>
> CCing Gordon. :)

Ok, so here are some interesting results...

I am able to access the LUKS partition on the NSLU2 running 2.6.18
from subversion (which includes flush_anon_page-generic.patch and
flush_anon_page-arm.patch) with both cryptsetup-1.0.4-8 (the latest
version in testing) and cryptsetup-1.0.4-8 plus 02_fix_arm.dpatch and
03_no_header_conv.dpatch that were posted to this thread.

$ sudo cryptsetup luksOpen /dev/sdb3 testfs
Enter LUKS passphrase:
key slot 0 unlocked.
Command successful.
gordon@LKG7102D7:~$ sudo mount /dev/mapper/testfs /mnt/tmp
gordon@LKG7102D7:~$ sudo umount /mnt/tmp
gordon@LKG7102D7:~$ sudo cryptsetup luksClose testfs

However, I have found that I am unable to access the LUKS partition
when the system is under heavy load and swapping.

$ sudo cryptsetup luksOpen /dev/sdb3 testfs
Enter LUKS passphrase:
Enter LUKS passphrase:
Enter LUKS passphrase:
Command failed: No key available with this passphrase.

gordon@LKG7102D7:~$ uptime
 00:22:23 up 16 min,  2 users,  load average: 3.01, 1.85, 0.93
gordon@LKG7102D7:~$ free
             total       used       free     shared    buffers     cached
Mem:         29988      28908       1080          0        172       3028
-/+ buffers/cache:      25708       4280
Swap:        88316      67508      20808

Once the system load decreases and the swapping stops, I am able to
access the LUKS partition again. This behaviour is very repeatable.
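
One way to tell whether such a failure leaves the on-disk header damaged
(as in the original report) or is purely transient would be to checksum
and parse the partition header before and after a failed luksOpen. The
following is only a sketch and was not run as part of this thread. It
assumes the LUKS1 on-disk layout described in the LUKS format
specification; the names parse_phdr and phdr_digest are made up for
illustration.

```python
import hashlib
import struct

# LUKS1 partition header fields, all big-endian (assumed from the LUKS
# on-disk format specification):
#   magic[6], version u16, cipher-name[32], cipher-mode[32], hash-spec[32],
#   payload-offset u32, key-bytes u32, mk-digest[20], mk-digest-salt[32],
#   mk-digest-iter u32, uuid[40]
# Eight 48-byte key slots follow these fields.
PHDR = struct.Struct(">6sH32s32s32sII20s32sI40s")
LUKS_MAGIC = b"LUKS\xba\xbe"
PHDR_SIZE = PHDR.size + 8 * 48  # 208 + 384 = 592 bytes


def _text(raw):
    """Decode a NUL-padded ASCII field."""
    return raw.split(b"\0", 1)[0].decode("ascii", "replace")


def parse_phdr(buf):
    """Return the readable header fields; raise ValueError on a bad magic."""
    fields = PHDR.unpack_from(buf)
    if fields[0] != LUKS_MAGIC:
        raise ValueError("not a LUKS1 header")
    return {
        "version": fields[1],
        "cipher": _text(fields[2]),
        "mode": _text(fields[3]),
        "hash": _text(fields[4]),
        "payload_offset": fields[5],
        "key_bytes": fields[6],
        "uuid": _text(fields[10]),
    }


def phdr_digest(buf):
    """SHA-256 over the header and key-slot table, for before/after comparison."""
    return hashlib.sha256(buf[:PHDR_SIZE]).hexdigest()
```

To use it, one would read the first PHDR_SIZE bytes of the partition (for
example /dev/sdb3 above, as root) before and after the failing luksOpen
and compare the two digests. An unchanged digest and a sane "hash" field
would suggest the header itself is intact and the failure lies elsewhere.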

Martin, I wonder if this has anything to do with the virtual memory
bug in the kernel that we experienced with apt. It could be that this
bug existed before 2.6.19 but was much harder to trigger (e.g. see
http://lkml.org/lkml/2007/1/3/285). It would be interesting to try
accessing a LUKS partition under heavy load while running 2.6.20-git,
but that will have to wait until the weekend for me to test it.

Gordon

-- 
Gordon Farquharson

* Re: Bug#403426: kernel corrupts LUKS partition header on arm
  2007-01-05  8:36                           ` Gordon Farquharson
@ 2007-01-05  9:59                             ` Martin Michlmayr
  2007-01-06  6:38                               ` Gordon Farquharson
  2007-01-07  5:47                             ` Gordon Farquharson
  1 sibling, 1 reply; 19+ messages in thread
From: Martin Michlmayr @ 2007-01-05  9:59 UTC (permalink / raw)
  To: Gordon Farquharson; +Cc: Clemens Fruhwirth, dm-devel, Brian Brunswick, 403426

* Gordon Farquharson <gordonfarquharson@gmail.com> [2007-01-05 01:36]:
> However, I have found that I am unable to access the LUKS partition
> when the system is under heavy load and swapping.

Interesting.  Can you check whether you see the same problems with
FUSE (see #402876)?

> Martin, I wonder if this has anything to do with the virtual memory
> bug in the kernel that we experienced with apt. It could be that this
> bug existed before 2.6.19 but was much harder to trigger (e.g. see

I don't know.  I'm aware this bug has been around for a while (but
hard to trigger), but I'd be cautious about attributing every bug we
see to it.  Of course it's possible that this is the problem, but
somehow I doubt it.
-- 
Martin Michlmayr
http://www.cyrius.com/

* Bug#403426: kernel corrupts LUKS partition header on arm
  2007-01-05  9:59                             ` Martin Michlmayr
@ 2007-01-06  6:38                               ` Gordon Farquharson
  0 siblings, 0 replies; 19+ messages in thread
From: Gordon Farquharson @ 2007-01-06  6:38 UTC (permalink / raw)
  To: Martin Michlmayr; +Cc: Clemens Fruhwirth, dm-devel, Brian Brunswick, 403426

On 1/5/07, Martin Michlmayr <tbm@cyrius.com> wrote:
> * Gordon Farquharson <gordonfarquharson@gmail.com> [2007-01-05 01:36]:
> > However, I have found that I am unable to access the LUKS partition
> > when the system is under heavy load and swapping.
>
> Interesting.  Can you check whether you see the same problems with
> FUSE (see #402876)?

FUSE works with 2.6.18-9 (checked out from subversion on 2007.01.04)
under heavy load and swapping.

gordon@LKG7102D7:~$ encfs /home/gordon/.encrypted /home/gordon/encrypted
EncFS Password:
gordon@LKG7102D7:~$ ls -l encrypted/
total 4
-rw-r--r-- 1 gordon gordon 13 2007-01-05 22:40 example.txt
gordon@LKG7102D7:~$ cat encrypted/example.txt
Some text...
gordon@LKG7102D7:~$ fusermount -u /home/gordon/encrypted
gordon@LKG7102D7:~$ uptime
 23:33:58 up 22:37,  2 users,  load average: 3.38, 2.27, 1.08
gordon@LKG7102D7:~$ free
             total       used       free     shared    buffers     cached
Mem:         29988      28876       1112          0        172       1840
-/+ buffers/cache:      26864       3124
Swap:        88316      63124      25192
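
For reading the free output above: the "-/+ buffers/cache" row is derived
from the "Mem:" row by moving buffers and page cache from "used" to
"free". A small sketch of that arithmetic follows; the function names are
invented for illustration and the formula is my understanding of how
older procps versions of free compute that row.

```python
def buffers_cache_row(used, free, buffers, cached):
    """Return (used, free) with buffers and cache counted as free memory."""
    reclaimable = buffers + cached
    return used - reclaimable, free + reclaimable


def swap_used_percent(total, used):
    """Share of swap in use, as a percentage."""
    return 100.0 * used / total


# Values in KiB, copied from the "Mem:" and "Swap:" rows above.
row = buffers_cache_row(used=28876, free=1112, buffers=172, cached=1840)
swap = swap_used_percent(total=88316, used=63124)
```

With these numbers, row matches the 26864 / 3124 figures printed by free,
and a bit over 70% of swap is in use, so the box is under real memory
pressure during the test.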

Gordon

-- 
Gordon Farquharson

* Bug#403426: kernel corrupts LUKS partition header on arm
  2007-01-05  8:36                           ` Gordon Farquharson
  2007-01-05  9:59                             ` Martin Michlmayr
@ 2007-01-07  5:47                             ` Gordon Farquharson
  1 sibling, 0 replies; 19+ messages in thread
From: Gordon Farquharson @ 2007-01-07  5:47 UTC (permalink / raw)
  To: Martin Michlmayr; +Cc: Clemens Fruhwirth, dm-devel, Brian Brunswick, 403426

On 1/5/07, Gordon Farquharson <gordonfarquharson@gmail.com> wrote:

> However, I have found that I am unable to access the LUKS partition
> when the system is under heavy load and swapping.
>
> Martin, I wonder if this has anything to do with the virtual memory
> bug in the kernel that we experienced with apt. It could be that this
> bug existed before 2.6.19 but was much harder to trigger (e.g. see
> http://lkml.org/lkml/2007/1/3/285). It would be interesting to try
> accessing a LUKS partition under heavy load while running 2.6.20-git,
> but that will have to wait until the weekend for me to test it.

Ok, I tested 2.6.19-1~experimental.1 +

   vm-fix-nasty-and-subtle-race-in-shared-mmap-ed-page-writeback.patch
   flush_anon_page-generic.patch
   flush_anon_page-arm.patch

and I am still unable to access the LUKS partition under heavy load
and swapping, so it appears that something else is causing this
problem.

Gordon

-- 
Gordon Farquharson

Thread overview: 19+ messages
     [not found] <20061217022906.2434.60658.reportbug@LKG8A754B.example.org>
2006-12-20 16:15 ` Bug#403426: kernel corrupts LUKS partition header on arm Martin Michlmayr
2006-12-29 10:52   ` Clemens Fruhwirth
2006-12-29 16:38     ` Martin Michlmayr
2006-12-29 20:24     ` Martin Michlmayr
2006-12-30 10:50       ` Clemens Fruhwirth
2006-12-30 13:13         ` Martin Michlmayr
2007-01-02 17:00           ` Clemens Fruhwirth
2007-01-02 18:04             ` Martin Michlmayr
2007-01-02 18:34               ` Clemens Fruhwirth
2007-01-03 16:59               ` Clemens Fruhwirth
2007-01-03 19:14                 ` Martin Michlmayr
2007-01-03 19:32                   ` Clemens Fruhwirth
2007-01-03 19:37                     ` Martin Michlmayr
2007-01-04 11:56                       ` Clemens Fruhwirth
2007-01-04 15:09                         ` Martin Michlmayr
2007-01-05  8:36                           ` Gordon Farquharson
2007-01-05  9:59                             ` Martin Michlmayr
2007-01-06  6:38                               ` Gordon Farquharson
2007-01-07  5:47                             ` Gordon Farquharson
