* Bug#403426: kernel corrupts LUKS partition header on arm
  [not found] <20061217022906.2434.60658.reportbug@LKG8A754B.example.org>
From: Martin Michlmayr @ 2006-12-20 16:15 UTC
To: dm-devel, Clemens Fruhwirth; +Cc: Brian Brunswick, 403426

We're seeing corruption of LUKS partition headers on ARM.  I've
confirmed this on two different ARM platforms (IXP4xx and IOP32x) and
with 2.6.17 and 2.6.18.

Basically, when you create a LUKS partition on a PC and then connect
it to an ARM box and open it, you get an "automatic header conversion
from 0.99 to 0.991 triggered" message and afterwards the LUKS
partition header is corrupted.

Here are steps to reproduce this:

On the PC:

27340:tbm@deprecation: ~] sudo cryptsetup luksFormat /dev/sda6

WARNING!
========
This will overwrite data on /dev/sda6 irrevocably.

Are you sure? (Type uppercase yes): YES
Enter LUKS passphrase:
Verify passphrase:
Command successful.
27348:tbm@deprecation: ~] sudo cryptsetup luksOpen /dev/sda6 x
Enter LUKS passphrase:
key slot 0 unlocked.
Command successful.
27351:tbm@deprecation: ~] sudo cryptsetup luksClose x
27352:tbm@deprecation: ~]

Connect the drive to the ARM box:

debian:~# cryptsetup luksOpen /dev/sdb6 x
Enter LUKS passphrase:
automatic header conversion from 0.99 to 0.991 triggered
[here it appears to hang; I press ctrl-c]
debian:~# cryptsetup luksOpen /dev/sdb6 x
Enter LUKS passphrase:
unknown hash spec in phdrEnter LUKS passphrase:
unknown hash spec in phdrEnter LUKS passphrase:
unknown hash spec in phdrCommand failed: No key available with this passphrase.
debian:~#

The original bug report with some more info:

* Brian Brunswick <bdb-reportbug@forbidden.co.uk> [2006-12-17 02:29]:
> Package: linux-image-2.6.18-3-ixp4xx
> Version: 2.6.18-8
> Severity: critical
> Justification: causes serious data loss
>
> This is on an NSLU2, I wanted to use it to access some disks that I
> had used previously on another system that had encrypted partitions.
> However, when I tried cryptsetup luksOpen, I got an "automatic header
> conversion from 0.99 to 0.991 triggered" message, and then an infinite
> loop. Trying the same partition on the other system, now I get the
> same thing - its header is corrupted. Luckily, I'm paranoid and had a
> backup of the LUKS header! If I didn't have this, the whole
> partition's data would probably be lost.
>
> Here's the result of some tests - and it's stranger than you think...
>
> Used cryptsetup luksFormat on another system to set up a partition..
>
> LKG8A754B:~# uname -a
> Linux LKG8A754B 2.6.18-3-ixp4xx #1 Mon Dec 11 17:20:00 UTC 2006 armv5tel GNU/Linux
>
> !!! ... after some experiments corrupting it... now I restore it again
> and decide to strace things... !!!
>
> LKG8A754B:~# dd < sde4-luks-header-backup > /dev/sde4
> 150+0 records in
> 150+0 records out
> 76800 bytes (77 kB) copied, 0.0148216 seconds, 5.2 MB/s
> LKG8A754B:~# strace 2>trace cryptsetup luksOpen /dev/sde4 sde4
> Enter LUKS passphrase:
> key slot 0 unlocked.
>
> !!! Hey, that just worked !!!
>
> LKG8A754B:~# ls -l /dev/mapper/sde4
> brw-rw---- 1 root disk 254, 0 Dec 17 02:00 /dev/mapper/sde4
> LKG8A754B:~# cmp -l /dev/sde4 sde4-luks-header-backup
> cmp: EOF on sde4-luks-header-backup
>
> !!! But: !!!
>
> LKG8A754B:~# cryptsetup remove sde4
>
> !!! no strace !!!
>
> LKG8A754B:~# cryptsetup luksOpen /dev/sde4 sde4
> Enter LUKS passphrase:
> automatic header conversion from 0.99 to 0.991 triggered
> <ctrl-c>
> LKG8A754B:~# cmp -l /dev/sde4 sde4-luks-header-backup
> 168   0  12
> 513 114   0
> 514 125   0
> 515 113   0
> 516 123   0
> 517 272   0
> 518 276   0
> 520   1   0
> 521 141   0
> 522 145   0
> 523 163   0
> 539   0   3
> 540   0  10
> 543   0  17
> 544   0 240
> 553 143   0
> 554 142   0
> 555 143   0
> 556  55   0
> 557 145   0
> 558 163   0
> 559 163   0
> 560 151   0
> 561 166   0
> 562  72   0
> 563 163   0
> 564 150   0
> 565 141   0
> 566  62   0
> 567  65   0
> 568  66   0
> 585 163   0
> 586 150   0
> 587 141   3
> 588  61 210
> 591   0  17
> 592   0 240
> cmp: EOF on sde4-luks-header-backup
> LKG8A754B:~#
>
> !!!
>
> It's done something like overwrite the second sector of the header with
> the first one. I had a look at the cryptsetup code, and the conversion
> message is triggered by it finding the wrong state code for the
> passphrase slot - so the data has been overwritten by the time it's got
> there.
>
> This is reliable - it always seems to corrupt it without strace,
> always works with!???
>
> Err....

According to the original bug reporter, cryptsetup hangs when it shows
"automatic header conversion from 0.99 to 0.991 triggered".  However,
I tried it with a small partition now (ca 90 MB) and it doesn't hang.
I got:

foobar:~# cryptsetup luksOpen /dev/sda5 x
Enter LUKS passphrase:
automatic header conversion from 0.99 to 0.991 triggereddevice-mapper: crypt: Device lookup failed
device-mapper: error adding target to table
device-mapper: device doesn't appear to be in the dev hash table.
Failed to setup dm-crypt key mapping.
Check kernel for support for the aes-cbc-essiv:sha256 cipher spec and verify that /dev/sda5 contains at least 133 sectors.
Failed to read from key storage
Enter LUKS passphrase:
unknown hash spec in phdrEnter LUKS passphrase:

--
Martin Michlmayr
http://www.cyrius.com/

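A note on reading the `cmp -l` listing in Brian's report: cmp prints
the byte offset in decimal (counting from 1) and the two byte values
in octal, first file first.  Decoding the /dev/sde4 column at offsets
513-518 gives "LUKS" followed by 0xBA 0xBE, which matches the LUKS
header magic, and offsets 521-523 decode to "aes".  That is consistent
with the first sector having been written over the second one.  The
following is only a small sketch of that decoding; the helper names
cmp_octal_to_byte and decode_cmp_fields are made up for illustration
and are not part of cryptsetup or cmp.

```c
#include <stddef.h>
#include <stdlib.h>

/* Decode one value from the 2nd or 3rd column of `cmp -l` output.
 * cmp prints byte values in octal, so "114" means 0114 == 76 == 'L'.
 * (Hypothetical helper, for illustration only.) */
static unsigned char cmp_octal_to_byte(const char *field)
{
	return (unsigned char)strtol(field, NULL, 8);
}

/* Decode n octal fields into a NUL-terminated string.
 * `out` must have room for n + 1 bytes. */
static void decode_cmp_fields(const char *const *fields, size_t n, char *out)
{
	for (size_t i = 0; i < n; i++)
		out[i] = (char)cmp_octal_to_byte(fields[i]);
	out[n] = '\0';
}
```

Feeding it the values listed at offsets 553-568 decodes the
cipher mode string in the same way.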
* Bug#403426: kernel corrupts LUKS partition header on arm
From: Clemens Fruhwirth @ 2006-12-29 10:52 UTC
To: Martin Michlmayr; +Cc: dm-devel, Brian Brunswick, 403426

At Wed, 20 Dec 2006 17:15:29 +0100,
Martin Michlmayr <tbm@cyrius.com> wrote:

> We're seeing corruption of LUKS partition headers on ARM.  I've
> confirmed this on two different ARM platforms (IXP4xx and IOP32x) and
> with 2.6.17 and 2.6.18.
>
> Basically, when you create a LUKS partition on a PC and then connect
> it to an ARM box and open it, you get an "automatic header conversion
> from 0.99 to 0.991 triggered" message and afterwards the LUKS
> partition header is corrupted.

Please try the version from subversion
http://luks.endorphin.org/svn/cryptsetup

I just removed this conversion routine, as it is for pre-1.0 releases
and I guess there is not a single deployment that will ever need it.
This won't change the bug itself, but it won't corrupt your partition
anymore. It just fails.

> > It's done something like overwrite the second sector of the header with
> > the first one. I had a look at the cryptsetup code, and the conversion
> > message is triggered by it finding the wrong state code for the
> > passphrase slot - so the data has been overwritten by the time it's got
> > there.

That looks right.

A good amount of staring out of the window drew my attention to
(read|write|write_lseek)_blockwise in util.c. Reading from a file
descriptor opened with O_DIRECT requires blockwise reading into an
aligned memory segment. That's the reason for all the magic in these
routines.

Looking at read_blockwise, r=read(fd,buf,size) might just return a
short read, that is r<size. But the read_blockwise routine never
covers that case. For some reason ARM might behave differently from
other archs here.

I just added the r!=bsize case to error checking and an error message
as well.

	while(count) {
		r = read(fd,padbuf,bsize);
		if(r < 0 || r != bsize) {
			fprintf(stderr, "read failed in read_blockwise.\n");
			goto out;
		}
		step = count<bsize?count:bsize;
		memcpy(buf,padbuf,step);
		buf += step;
		count -= step;
	}

The changes are also in subversion. Please try.

--
Fruhwirth Clemens - http://clemens.endorphin.org
for robots: sp4mtrap@endorphin.org

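For comparison: the change described above treats a short read as an
error.  A different way to cover the r<size case is to keep calling
read() until a whole block has arrived.  The sketch below shows that
alternative; it is not what the subversion change does, and the name
read_full_block is hypothetical.  It also assumes an ordinary file
descriptor: with O_DIRECT, resuming at buf + done may violate the
alignment requirements mentioned above, so it would need more care
there.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Read exactly bsize bytes into buf, retrying on short reads and EINTR.
 * Returns bsize on success, a smaller count if EOF is hit first,
 * or -1 on a read error (errno is left as set by read()).
 * Hypothetical helper, not part of cryptsetup. */
static ssize_t read_full_block(int fd, char *buf, size_t bsize)
{
	size_t done = 0;

	while (done < bsize) {
		ssize_t r = read(fd, buf + done, bsize - done);

		if (r < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, try again */
			return -1;
		}
		if (r == 0)
			break;			/* EOF before a full block */
		done += (size_t)r;
	}
	return (ssize_t)done;
}
```

A caller would then compare the return value against bsize and fail
only when the block is genuinely incomplete.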
* Re: Bug#403426: kernel corrupts LUKS partition header on arm
From: Martin Michlmayr @ 2006-12-29 16:38 UTC
To: Clemens Fruhwirth; +Cc: 403426, dm-devel, Brian Brunswick

* Clemens Fruhwirth <clemens@endorphin.org> [2006-12-29 11:52]:
> I just added the r!=bsize case to error checking and an error message
> as well.
...
> The changes are also in subversion.

This particular change didn't make any difference.  I still get the
header conversion message when I only apply the patch from utils.c.

#! /bin/sh /usr/share/dpatch/dpatch-run
## 02_fix_arm.dpatch by Clemens Fruhwirth <clemens@endorphin.org>
##
## DP: Add error checking to read_blockwise for short reads.
## DP: Commit a patch that fixes http://bugs.debian.org/403075

@DPATCH@

Index: lib/utils.c
===================================================================
--- cryptsetup-1.0.4~/lib/utils.c (revision 1)
+++ cryptsetup-1.0.4/lib/utils.c (working copy)
@@ -151,8 +151,10 @@
 static int sector_size(int fd)
 {
 	int bsize;
-	ioctl(fd,BLKSSZGET, &bsize);
-	return bsize;
+	if (ioctl(fd,BLKSSZGET, &bsize) < 0)
+		return -EINVAL;
+	else
+		return bsize;
 }
 
 int sector_size_for_device(const char *device)
@@ -171,8 +173,11 @@
 	char *padbuf; char *padbuf_base;
 	char *buf = (char *)orig_buf;
 	int r;
-	int hangover; int solid; int bsize = sector_size(fd);
+	int hangover; int solid; int bsize;
 
+	if ((bsize = sector_size(fd)) < 0)
+		return bsize;
+
 	hangover = count % bsize;
 	solid = count - hangover;
 
@@ -209,15 +214,20 @@
 	char *buf = (char *)orig_buf;
 	int r;
 	int step;
-	int bsize = sector_size(fd);
+	int bsize;
 
+	if ((bsize = sector_size(fd)) < 0)
+		return bsize;
+
 	padbuf = aligned_malloc(&padbuf_base, bsize, bsize);
 	if(padbuf == NULL) return -ENOMEM;
 
 	while(count) {
 		r = read(fd,padbuf,bsize);
-		if(r < 0) goto out;
-
+		if(r < 0 || r != bsize) {
+			fprintf(stderr, "read failed in read_blockwise.\n");
+			goto out;
+		}
 		step = count<bsize?count:bsize;
 		memcpy(buf,padbuf,step);
 		buf += step;
@@ -242,6 +252,9 @@
 	int frontHang = offset % bsize;
 	int r;
 
+	if (bsize < 0)
+		return bsize;
+
 	lseek(fd, offset - frontHang, SEEK_SET);
 	if(offset % bsize) {
 		int innerCount = count<bsize?count:bsize;

--
Martin Michlmayr
http://www.cyrius.com/

* Re: Bug#403426: kernel corrupts LUKS partition header on arm
From: Martin Michlmayr @ 2006-12-29 20:24 UTC
To: Clemens Fruhwirth; +Cc: 403426, dm-devel, Brian Brunswick

[-- Attachment #1: Type: text/plain, Size: 2945 bytes --]

* Clemens Fruhwirth <clemens@endorphin.org> [2006-12-29 11:52]:
> Please try the version from subversion
> http://luks.endorphin.org/svn/cryptsetup

With 1.0.4 plus the attached 2 patches from SVN I no longer get any
corruption but I also cannot access my encrypted data.  Is there
anything else I should try?

foobar:~# cryptsetup luksOpen /dev/sda5 x
Enter LUKS passphrase:
device-mapper: table: 254:0: crypt: Device lookup failed
device-mapper: ioctl: error adding target to table
device-mapper: ioctl: device doesn't appear to be in the dev hash table.
Failed to setup dm-crypt key mapping.
Check kernel for support for the aes-cbc-essiv:sha256 cipher spec and verify that /dev/sda5 contains at least 133 sectors.
Failed to read from key storage
Enter LUKS passphrase:
device-mapper: table: 254:0: crypt: Device lookup failed
device-mapper: ioctl: error adding target to table
device-mapper: ioctl: device doesn't appear to be in the dev hash table.
Failed to setup dm-crypt key mapping.
Check kernel for support for the aes-cbc-essiv:sha256 cipher spec and verify that /dev/sda5 contains at least 133 sectors.
Failed to read from key storage
Enter LUKS passphrase:
device-mapper: table: 254:0: crypt: Device lookup failed
device-mapper: ioctl: error adding target to table
device-mapper: ioctl: device doesn't appear to be in the dev hash table.
Failed to setup dm-crypt key mapping.
Check kernel for support for the aes-cbc-essiv:sha256 cipher spec and verify that /dev/sda5 contains at least 133 sectors.
Failed to read from key storage
Command failed: No key available with this passphrase.
foobar:~# cryptsetup luksOpen /dev/sda5 x
Enter LUKS passphrase:
device-mapper: table: 254:0: crypt: Device lookup failed
device-mapper: ioctl: error adding target to table
device-mapper: ioctl: device doesn't appear to be in the dev hash table.
Failed to setup dm-crypt key mapping.
Check kernel for support for the aes-cbc-essiv:sha256 cipher spec and verify that /dev/sda5 contains at least 133 sectors.
Failed to read from key storage
Enter LUKS passphrase:
device-mapper: table: 254:0: crypt: Device lookup failed
device-mapper: ioctl: error adding target to table
device-mapper: ioctl: device doesn't appear to be in the dev hash table.
Failed to setup dm-crypt key mapping.
Check kernel for support for the aes-cbc-essiv:sha256 cipher spec and verify that /dev/sda5 contains at least 133 sectors.
Failed to read from key storage
Enter LUKS passphrase:
device-mapper: table: 254:0: crypt: Device lookup failed
device-mapper: ioctl: error adding target to table
device-mapper: ioctl: device doesn't appear to be in the dev hash table.
Failed to setup dm-crypt key mapping.
Check kernel for support for the aes-cbc-essiv:sha256 cipher spec and verify that /dev/sda5 contains at least 133 sectors.
Failed to read from key storage
Command failed: No key available with this passphrase.
foobar:~#

--
Martin Michlmayr
http://www.cyrius.com/

[-- Attachment #2: 02_fix_arm.dpatch --]
[-- Type: text/plain, Size: 1725 bytes --]

#! /bin/sh /usr/share/dpatch/dpatch-run
## 02_fix_arm.dpatch by Clemens Fruhwirth <clemens@endorphin.org>
##
## DP: Add error checking to read_blockwise for short reads.
## DP: Commit a patch that fixes http://bugs.debian.org/403075

@DPATCH@

Index: lib/utils.c
===================================================================
--- cryptsetup-1.0.4~/lib/utils.c (revision 1)
+++ cryptsetup-1.0.4/lib/utils.c (working copy)
@@ -151,8 +151,10 @@
 static int sector_size(int fd)
 {
 	int bsize;
-	ioctl(fd,BLKSSZGET, &bsize);
-	return bsize;
+	if (ioctl(fd,BLKSSZGET, &bsize) < 0)
+		return -EINVAL;
+	else
+		return bsize;
 }
 
 int sector_size_for_device(const char *device)
@@ -171,8 +173,11 @@
 	char *padbuf; char *padbuf_base;
 	char *buf = (char *)orig_buf;
 	int r;
-	int hangover; int solid; int bsize = sector_size(fd);
+	int hangover; int solid; int bsize;
 
+	if ((bsize = sector_size(fd)) < 0)
+		return bsize;
+
 	hangover = count % bsize;
 	solid = count - hangover;
 
@@ -209,15 +214,20 @@
 	char *buf = (char *)orig_buf;
 	int r;
 	int step;
-	int bsize = sector_size(fd);
+	int bsize;
 
+	if ((bsize = sector_size(fd)) < 0)
+		return bsize;
+
 	padbuf = aligned_malloc(&padbuf_base, bsize, bsize);
 	if(padbuf == NULL) return -ENOMEM;
 
 	while(count) {
 		r = read(fd,padbuf,bsize);
-		if(r < 0) goto out;
-
+		if(r < 0 || r != bsize) {
+			fprintf(stderr, "read failed in read_blockwise.\n");
+			goto out;
+		}
 		step = count<bsize?count:bsize;
 		memcpy(buf,padbuf,step);
 		buf += step;
@@ -242,6 +252,9 @@
 	int frontHang = offset % bsize;
 	int r;
 
+	if (bsize < 0)
+		return bsize;
+
 	lseek(fd, offset - frontHang, SEEK_SET);
 	if(offset % bsize) {
 		int innerCount = count<bsize?count:bsize;

[-- Attachment #3: 03_no_header_conv.dpatch --]
[-- Type: text/plain, Size: 1398 bytes --]

#! /bin/sh /usr/share/dpatch/dpatch-run
## 03_no_header_conv.patch by Clemens Fruhwirth <clemens@endorphin.org>
##
## DP: Kick ancient version header conversion.

@DPATCH@

Index: luks/keymanage.c
===================================================================
--- a/luks/keymanage.c (revision 19)
+++ b/luks/keymanage.c (working copy)
@@ -67,14 +67,6 @@
 	return mk;
 }
 
-static inline void convert_V99toV991(char const *device, struct luks_phdr *hdr) {
-	struct luks_phdr tmp_phdr;
-	fputs(_("automatic header conversion from 0.99 to 0.991 triggered"), stderr);
-	hdr->mkDigestIterations = ntohs(htonl(hdr->mkDigestIterations));
-	memcpy(&tmp_phdr, hdr, sizeof(struct luks_phdr));
-	LUKS_write_phdr(device, &tmp_phdr);
-}
-
 int LUKS_read_phdr(const char *device, struct luks_phdr *hdr)
 {
 	int devfd = 0;
@@ -109,14 +101,6 @@
 			hdr->keyblock[i].passwordIterations = ntohl(hdr->keyblock[i].passwordIterations);
 			hdr->keyblock[i].keyMaterialOffset = ntohl(hdr->keyblock[i].keyMaterialOffset);
 			hdr->keyblock[i].stripes = ntohl(hdr->keyblock[i].stripes);
-
-			if(hdr->keyblock[i].active == LUKS_KEY_DISABLED_OLD) {
-				hdr->keyblock[i].active = LUKS_KEY_DISABLED;
-				convert_V99toV991(device, hdr);
-			} else if(hdr->keyblock[i].active == LUKS_KEY_ENABLED_OLD) {
-				hdr->keyblock[i].active = LUKS_KEY_ENABLED;
-				convert_V99toV991(device, hdr);
-			}
 		}
 	}
 

[-- Attachment #4: Type: text/plain, Size: 0 bytes --]

* Bug#403426: kernel corrupts LUKS partition header on arm
From: Clemens Fruhwirth @ 2006-12-30 10:50 UTC
To: Martin Michlmayr; +Cc: dm-devel, Brian Brunswick, 403426

At Fri, 29 Dec 2006 21:24:34 +0100,
Martin Michlmayr <tbm@cyrius.com> wrote:
>
> * Clemens Fruhwirth <clemens@endorphin.org> [2006-12-29 11:52]:
> > Please try the version from subversion
> > http://luks.endorphin.org/svn/cryptsetup
>
> With 1.0.4 plus the attached 2 patches from SVN I no longer get any
> corruption but I also cannot access my encrypted data.

That's good :)

> Is there anything else I should try?

> foobar:~# cryptsetup luksOpen /dev/sda5 x
> Enter LUKS passphrase:
> device-mapper: table: 254:0: crypt: Device lookup failed
> device-mapper: ioctl: error adding target to table
> device-mapper: ioctl: device doesn't appear to be in the dev hash table.
> Failed to setup dm-crypt key mapping.
> Check kernel for support for the aes-cbc-essiv:sha256 cipher spec and verify that /dev/sda5 contains at least 133 sectors.
> Failed to read from key storage

Are you sure we don't see any device mapper problems here? You might
just go over to a box where LUKS works (provided the LUKS partition
does not contain anything security relevant to you), and do

dmsetup table mappingname > dm-table-file

Open it up, and find the 7th entry that says something like x:y
(major:minor device number of your underlying device). Replace this
with the correct device path on the ARM box, copy that file over to
the ARM box and set it up with

dmsetup create mappingname dm-table-file

If that does not work, we are seeing a dm-crypt layer problem here.

> Enter LUKS passphrase:

I just committed a patch that prevents password retrying on I/O
errors.

--
Fruhwirth Clemens - http://clemens.endorphin.org
for robots: sp4mtrap@endorphin.org

* Re: Bug#403426: kernel corrupts LUKS partition header on arm
From: Martin Michlmayr @ 2006-12-30 13:13 UTC
To: Clemens Fruhwirth; +Cc: 403426, dm-devel, Brian Brunswick

* Clemens Fruhwirth <clemens@endorphin.org> [2006-12-30 11:50]:
> > Is there anything else I should try?
> > foobar:~# cryptsetup luksOpen /dev/sda5 x
> > Enter LUKS passphrase:
> > device-mapper: table: 254:0: crypt: Device lookup failed

Strange.  I haven't changed cryptsetup but now I don't get this
message anymore.  I simply get:

foobar:~# cryptsetup luksOpen /dev/sdb2 x
Enter LUKS passphrase:
Enter LUKS passphrase:
Enter LUKS passphrase:
Command failed: No key available with this passphrase.

But I'm sure I've typed the passphrase correctly, and it works on my
x86 box (on which I created the LUKS partition).

> dmsetup create mappingname dm-table-file
>
> If that does not work, we are seeing a dm-crypt layer problem here.

This works:

foobar:~# cat map
0 1954808 crypt aes-cbc-essiv:sha256 a0671e89e060742caf3f9a3e7d762b24 0 /dev/sdb2 1032
foobar:~# dmsetup create x map
foobar:~# mount /dev/mapper/x /mnt
foobar:~# ls -l /mnt
total 20
drwx------ 2 root root 16384 Dec 30  2006 lost+found
-rw-r--r-- 1 root root     5 Dec 30  2006 x
foobar:~#

--
Martin Michlmayr
http://www.cyrius.com/

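For reference, the fields of the crypt table line in the map file
above are: start sector, length, target name, cipher spec, key,
IV offset, backing device, and offset into that device.  The backing
device is the 7th field, which is the entry that has to be rewritten
when a table is carried from one box to another.  The sketch below
does that edit programmatically.  The function name
replace_table_field is hypothetical, and the "8:18" in the usage note
is a made-up major:minor value, not one taken from this thread.

```c
#include <stddef.h>
#include <string.h>

/* Copy a whitespace-separated table line into `out`, substituting
 * `replacement` for field number `field_no` (1-based).  Fields are
 * re-joined with single spaces.  Returns 0 on success, -1 if the field
 * does not exist or `out` is too small.
 * Hypothetical helper, not part of dmsetup or cryptsetup. */
static int replace_table_field(const char *line, int field_no,
			       const char *replacement,
			       char *out, size_t outlen)
{
	const char *p = line;
	size_t used = 0;
	int idx = 0, replaced = 0;

	if (outlen == 0)
		return -1;
	out[0] = '\0';

	while (*p) {
		size_t len = strcspn(p, " \t\n");	/* length of this token */

		if (len > 0) {
			const char *tok = p;
			size_t toklen = len;

			if (++idx == field_no) {
				tok = replacement;
				toklen = strlen(replacement);
				replaced = 1;
			}
			/* need room for separator, token and NUL */
			if (used + toklen + 2 > outlen)
				return -1;
			if (used > 0)
				out[used++] = ' ';
			memcpy(out + used, tok, toklen);
			used += toklen;
			out[used] = '\0';
		}
		p += len;
		p += strspn(p, " \t\n");		/* skip separators */
	}
	return replaced ? 0 : -1;
}
```

Called with field_no 7 and "/dev/sdb2" on a line whose 7th field is
"8:18", it yields a line of the same shape as the map file shown above.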
* Re: Bug#403426: kernel corrupts LUKS partition header on arm
From: Clemens Fruhwirth @ 2007-01-02 17:00 UTC
To: Martin Michlmayr; +Cc: 403426, dm-devel, Brian Brunswick

At Sat, 30 Dec 2006 14:13:42 +0100,
Martin Michlmayr <tbm@cyrius.com> wrote:
>
> * Clemens Fruhwirth <clemens@endorphin.org> [2006-12-30 11:50]:
> > > Is there anything else I should try?
> > > foobar:~# cryptsetup luksOpen /dev/sda5 x
> > > Enter LUKS passphrase:
> > > device-mapper: table: 254:0: crypt: Device lookup failed
>
> Strange.  I haven't changed cryptsetup but now I don't get this
> message anymore.  I simply get:
>
> foobar:~# cryptsetup luksOpen /dev/sdb2 x
> Enter LUKS passphrase:
> Enter LUKS passphrase:
> Enter LUKS passphrase:
> Command failed: No key available with this passphrase.
>
> But I'm sure I've typed the passphrase correctly, and it works on my
> x86 box (on which I created the LUKS partition).

Does luksDump report the same things on both architectures?

--
Fruhwirth Clemens - http://clemens.endorphin.org
for robots: sp4mtrap@endorphin.org

* Re: Bug#403426: kernel corrupts LUKS partition header on arm
From: Martin Michlmayr @ 2007-01-02 18:04 UTC
To: Clemens Fruhwirth; +Cc: 403426, dm-devel, Brian Brunswick

* Clemens Fruhwirth <clemens@endorphin.org> [2007-01-02 18:00]:
> Does luksDump report the same things on both architectures?

Yes.

--
Martin Michlmayr
http://www.cyrius.com/

* Re: Bug#403426: kernel corrupts LUKS partition header on arm
From: Clemens Fruhwirth @ 2007-01-02 18:34 UTC
To: Martin Michlmayr; +Cc: 403426, dm-devel, Brian Brunswick

At Tue, 2 Jan 2007 19:04:41 +0100,
Martin Michlmayr <tbm@cyrius.com> wrote:
>
> * Clemens Fruhwirth <clemens@endorphin.org> [2007-01-02 18:00]:
> > Does luksDump report the same things on both architectures?
>
> Yes.

Strange. Can I somehow gain access to your test box? I'm not entirely
out of guesses, but it's not efficient to communicate them.

PGP key is on subkeys.pgp.net

pub   1024D/31092E4E 2003-02-09
      Key fingerprint = 0039 316D D85C 1FB3 4A39  10A2 5BBB 2BF4 3109 2E4E
uid                  Fruhwirth Clemens <clemens@endorphin.org>

--
Fruhwirth Clemens - http://clemens.endorphin.org
for robots: sp4mtrap@endorphin.org

* Re: Bug#403426: kernel corrupts LUKS partition header on arm
From: Clemens Fruhwirth @ 2007-01-03 16:59 UTC
To: Martin Michlmayr; +Cc: 403426, dm-devel, Brian Brunswick

At Tue, 2 Jan 2007 19:04:41 +0100,
Martin Michlmayr <tbm@cyrius.com> wrote:
>
> * Clemens Fruhwirth <clemens@endorphin.org> [2007-01-02 18:00]:
> > Does luksDump report the same things on both architectures?
>
> Yes.

After a bit of debugging on Gordon's slug, I found out that we have
some kind of read race/read corruption when reading the encrypted
master key from a key slot. If you go into
luks/keyencryption.c:LUKS_decrypt_from_storage and replace

	return LUKS_endec_template(dst,dstLength,hdr,key,keyLength, device,
				   sector, backend, read_blockwise, O_RDONLY);

by something like

	int r=LUKS_endec_template(dst,dstLength,hdr,key,keyLength, device,
				  sector, backend, read_blockwise, O_RDONLY);
	while(dstLength) {
		hexprint(dst, 32);
		dstLength-=32;
		dst+=32;
		printf("\n");
	}
	return r;

you get the whole decrypted content dumped to stdout.

Just to give you a brief idea of how master key decryption from a key
slot works: cryptsetup derives a user key, sets up a temporary
dm-crypt mapping with that user key, and starts to read encrypted
content from the underlying device via the temporary dm-crypt mapping.

The problem: The decrypted content is _different_ for two identical
runs. When you use the debugging snippet from above you can see that
the corruptions seem to be displaced copies of other parts of the
content. For instance, if every character below is a 32-byte block,
then we would see:

a b c d e f
a b c d c f

I have no idea why 32-byte blocks in particular. There seems to be no
regular pattern in where the corruption occurs, for instance every
x'th block, but there is a regular pattern in which block is
duplicated at the point of the corruption, namely the block 16
positions earlier. (Notice 16*32 bytes = 512 bytes = sector size.
Welcome to linux number mysticism.)

As far as I understand, page caching comes after dm-crypt, so maybe we
have some kind of cache corruption here?

One more thing: The situation stabilizes under strace. Using strace
usually lets you open the LUKS partition.

--
Fruhwirth Clemens - http://clemens.endorphin.org
for robots: sp4mtrap@endorphin.org

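The pattern described above (a corrupted 32-byte block being a copy of
the block 16 positions earlier, i.e. one 512-byte sector back) can be
checked mechanically on two dumps of the same key material.  The
following is only a sketch under assumptions: the function name
count_displaced_blocks is hypothetical, `good` is taken to be a dump
from a run known to be correct (for example one made under strace),
and the displaced copy is compared against the earlier block of the
corrupted dump itself, which the thread does not spell out.

```c
#include <stddef.h>
#include <string.h>

#define BLOCK 32	/* block size observed in the dumps */
#define LAG   16	/* 16 * 32 bytes = 512 bytes = one sector */

/* Count blocks in `bad` that differ from `good` and are byte-identical
 * to the block LAG positions earlier in `bad`, i.e. look like a copy
 * displaced by one sector.  Both buffers must be `len` bytes long.
 * Hypothetical helper, for illustration only. */
static size_t count_displaced_blocks(const unsigned char *good,
				     const unsigned char *bad, size_t len)
{
	size_t hits = 0;

	for (size_t i = LAG; (i + 1) * BLOCK <= len; i++) {
		const unsigned char *g = good + i * BLOCK;
		const unsigned char *b = bad + i * BLOCK;

		if (memcmp(g, b, BLOCK) != 0 &&
		    memcmp(b, bad + (i - LAG) * BLOCK, BLOCK) == 0)
			hits++;
	}
	return hits;
}
```

If every differing block is counted as a hit, the corruption matches
the one-sector displacement; differing blocks that are not counted
would point to something else.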
* Re: Bug#403426: kernel corrupts LUKS partition header on arm
From: Martin Michlmayr @ 2007-01-03 19:14 UTC
To: Clemens Fruhwirth; +Cc: 403426, dm-devel, Brian Brunswick

* Clemens Fruhwirth <clemens@endorphin.org> [2007-01-03 17:59]:
> After a bit of debugging on Gordon's slug, I found out that we have
> some kind of read race/read corruption when reading the encrypted
> master key from a key slot.
...
> As far as I understand page caching comes after dm-crypt, so maybe we
> have some kind of cache corruption here?

Do you think this is related to http://lkml.org/lkml/2006/12/21/157

I just applied the two patches from that thread and successfully ran
'cryptsetup luksClose' on ARM.

Would the lack of __flush_anon_page() on ARM explain the corruption
you've observed?

--
Martin Michlmayr
http://www.cyrius.com/

* Re: Bug#403426: kernel corrupts LUKS partition header on arm
From: Clemens Fruhwirth @ 2007-01-03 19:32 UTC
To: Martin Michlmayr; +Cc: 403426, dm-devel, Brian Brunswick

At Wed, 3 Jan 2007 20:14:42 +0100,
Martin Michlmayr <tbm@cyrius.com> wrote:
>
> * Clemens Fruhwirth <clemens@endorphin.org> [2007-01-03 17:59]:
> > After a bit of debugging on Gordon's slug, I found out that we have
> > some kind of read race/read corruption when reading the encrypted
> > master key from a key slot.
> ...
> > As far as I understand page caching comes after dm-crypt, so maybe we
> > have some kind of cache corruption here?
>
> Do you think this is related to http://lkml.org/lkml/2006/12/21/157

I'm sorry, I have no idea.

> I just applied the two patches from that thread and successfully ran
> 'cryptsetup luksClose' on ARM.

"cryptsetup luksClose" is just an alias for "cryptsetup remove". This
should never fail. What happens with luksOpen after the patches?

> Would the lack of __flush_anon_page() on ARM explain the corruption
> you've observed?

Again, I'm not familiar with this.

--
Fruhwirth Clemens - http://clemens.endorphin.org
for robots: sp4mtrap@endorphin.org

* Re: Bug#403426: kernel corrupts LUKS partition header on arm
From: Martin Michlmayr @ 2007-01-03 19:37 UTC
To: Clemens Fruhwirth; +Cc: 403426, dm-devel, Brian Brunswick

* Clemens Fruhwirth <clemens@endorphin.org> [2007-01-03 20:32]:
> > I just applied the two patches from that thread and successfully ran
> > 'cryptsetup luksClose' on ARM.
>
> "cryptsetup luksClose" is just an alias for "cryptsetup remove". This
> should never fail. What's with luksOpen after the patches?

Sorry, I meant to say that luksOpen worked with the patches.  I simply
copy&pasted the wrong line.

> > Would the lack of __flush_anon_page() on ARM explain the corruption
> > you've observed?
> Again, I'm not familiar with this.

Well, it seems to fix the problem and according to the thread on lkml
the lack of flush_anon_page() on ARM is associated with some
corruption.  At least FUSE doesn't work on ARM without those patches,
so it seems likely that luksOpen is also affected.

--
Martin Michlmayr
http://www.cyrius.com/

* Re: Bug#403426: kernel corrupts LUKS partition header on arm
From: Clemens Fruhwirth @ 2007-01-04 11:56 UTC
To: Martin Michlmayr; +Cc: 403426, dm-devel, Brian Brunswick

At Wed, 3 Jan 2007 20:37:22 +0100,
Martin Michlmayr <tbm@cyrius.com> wrote:
>
> Well, it seems to fix the problem and according to the thread on lkml
> the lack of flush_anon_page() on ARM is associated with some
> corruption.  At least FUSE doesn't work on ARM without those patches,
> so it seems likely that luksOpen is also affected.

So, can we close the bug against cryptsetup in this case?

Maybe someone else can verify that?

--
Fruhwirth Clemens - http://clemens.endorphin.org
for robots: sp4mtrap@endorphin.org

* Re: Bug#403426: kernel corrupts LUKS partition header on arm
From: Martin Michlmayr @ 2007-01-04 15:09 UTC
To: Clemens Fruhwirth; +Cc: 403426, dm-devel, Brian Brunswick, Gordon Farquharson

* Clemens Fruhwirth <clemens@endorphin.org> [2007-01-04 12:56]:
> > corruption.  At least FUSE doesn't work on ARM without those patches,
> > so it seems likely that luksOpen is also affected.
>
> So, can we close the bug against cryptsetup in this case?

#403426 is really about the header corruption which you have fixed in
SVN.  It should be closed when the Debian maintainers make a new
upload with that fix.

> Maybe someone else can verify that?

CCing Gordon. :)

--
Martin Michlmayr
http://www.cyrius.com/

* Bug#403426: kernel corrupts LUKS partition header on arm 2007-01-04 15:09 ` Martin Michlmayr @ 2007-01-05 8:36 ` Gordon Farquharson 2007-01-05 9:59 ` Martin Michlmayr 2007-01-07 5:47 ` Gordon Farquharson 0 siblings, 2 replies; 19+ messages in thread From: Gordon Farquharson @ 2007-01-05 8:36 UTC (permalink / raw) To: Martin Michlmayr; +Cc: Clemens Fruhwirth, dm-devel, Brian Brunswick, 403426 On 1/4/07, Martin Michlmayr <tbm@cyrius.com> wrote: > * Clemens Fruhwirth <clemens@endorphin.org> [2007-01-04 12:56]: > > So, can we close the bug against cryptsetup in this case? > > #403426 is really about the header corruption which you have fixed in > SVN. It should be closed when the Debian maintainers make a new > upload with that fix. > > > Maybe someone else can verify that? > > CCing Gordon. :) Ok, so here are some interesting results... I am able to access the LUKS partition on the NSLU2 running 2.6.18 from subversion (which includes flush_anon_page-generic.patch and flush_anon_page-arm.patch) with both cryptsetup-1.0.4-8 (the latest version in testing) and cryptsetup-1.0.4-8 plus 02_fix_arm.dpatch and 03_no_header_conv.dpatch that were posted to this thread. $ sudo cryptsetup luksOpen /dev/sdb3 testfs Enter LUKS passphrase: key slot 0 unlocked. Command successful. gordon@LKG7102D7:~$ sudo mount /dev/mapper/testfs /mnt/tmp gordon@LKG7102D7:~$ sudo umount /mnt/tmp gordon@LKG7102D7:~$ sudo cryptsetup luksClose testfs However, I have found that I am unable to access the LUKS partition when the system is under heavy load and swapping. $ sudo cryptsetup luksOpen /dev/sdb3 testfs Enter LUKS passphrase: Enter LUKS passphrase: Enter LUKS passphrase: Command failed: No key available with this passphrase. 
gordon@LKG7102D7:~$ uptime 00:22:23 up 16 min, 2 users, load average: 3.01, 1.85, 0.93 gordon@LKG7102D7:~$ free total used free shared buffers cached Mem: 29988 28908 1080 0 172 3028 -/+ buffers/cache: 25708 4280 Swap: 88316 67508 20808 Once the system load decreases and the swapping stops, I am able to access the LUKS partition again. This behaviour is very repeatable. Martin, I wonder if this has anything to do with the virtual memory bug in the kernel that we experienced with apt. It could be that this bug existed before 2.6.19 but was much harder to trigger (e.g. see http://lkml.org/lkml/2007/1/3/285). It would be interesting to try accessing a LUKS partition under heavy load while running 2.6.20-git, but that will have to wait until the weekend for me to test it. Gordon -- Gordon Farquharson ^ permalink raw reply [flat|nested] 19+ messages in thread
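[Editorial note] The `free` output quoted in the report above can be reduced to a single figure when comparing test runs. The following is only a minimal sketch, not something posted to the thread: `swap_used_pct` is a hypothetical helper name, and it assumes the classic `free` layout shown above, where the `Swap:` row lists total, used and free in that order.

```shell
# Hypothetical helper (not from the thread): read `free` output on stdin
# and print the percentage of swap in use, truncated to an integer.
# Assumes the "Swap: total used free" column order shown in the report.
swap_used_pct() {
    awk '$1 == "Swap:" && $2 > 0 { printf "%d\n", ($3 * 100) / $2 }'
}
```

A possible use is `free | swap_used_pct`, run just before each `cryptsetup luksOpen` attempt so that failures can be lined up against swap pressure. Fed the figures from the report (88316 total, 67508 used), it works out to roughly three quarters of swap in use.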
* Re: Bug#403426: kernel corrupts LUKS partition header on arm 2007-01-05 8:36 ` Gordon Farquharson @ 2007-01-05 9:59 ` Martin Michlmayr 2007-01-06 6:38 ` Gordon Farquharson 2007-01-07 5:47 ` Gordon Farquharson 1 sibling, 1 reply; 19+ messages in thread From: Martin Michlmayr @ 2007-01-05 9:59 UTC (permalink / raw) To: Gordon Farquharson; +Cc: Clemens Fruhwirth, dm-devel, Brian Brunswick, 403426 * Gordon Farquharson <gordonfarquharson@gmail.com> [2007-01-05 01:36]: > However, I have found that I am unable to access the LUKS partition > when the system is under heavy load and swapping. Interesting. Can you check whether you see the same problems with FUSE (see #402876)? > Martin, I wonder if this has anything to do with the virtual memory > bug in the kernel that we experienced with apt. It could be that this > bug existed before 2.6.19 but was much harder to trigger (e.g. see I don't know. I'm aware this bug has been around for a while (but hard to trigger) but I'd be cautious to attribute every bug we see to it. Of course it's possible that this is the problem but somehow I doubt it. -- Martin Michlmayr http://www.cyrius.com/ ^ permalink raw reply [flat|nested] 19+ messages in thread
* Bug#403426: kernel corrupts LUKS partition header on arm 2007-01-05 9:59 ` Martin Michlmayr @ 2007-01-06 6:38 ` Gordon Farquharson 0 siblings, 0 replies; 19+ messages in thread From: Gordon Farquharson @ 2007-01-06 6:38 UTC (permalink / raw) To: Martin Michlmayr; +Cc: Clemens Fruhwirth, dm-devel, Brian Brunswick, 403426 On 1/5/07, Martin Michlmayr <tbm@cyrius.com> wrote: > * Gordon Farquharson <gordonfarquharson@gmail.com> [2007-01-05 01:36]: > > However, I have found that I am unable to access the LUKS partition > > when the system is under heavy load and swapping. > > Interesting. Can you check whether you see the same problems with > FUSE (see #402876)? FUSE works with 2.6.18-9 (checked out from subversion on 2007.01.04) under heavy load and swapping. gordon@LKG7102D7:~$ encfs /home/gordon/.encrypted /home/gordon/encrypted EncFS Password: gordon@LKG7102D7:~$ ls -l encrypted/ total 4 -rw-r--r-- 1 gordon gordon 13 2007-01-05 22:40 example.txt gordon@LKG7102D7:~$ cat encrypted/example.txt Some text... gordon@LKG7102D7:~$ fusermount -u /home/gordon/encrypted gordon@LKG7102D7:~$ uptime 23:33:58 up 22:37, 2 users, load average: 3.38, 2.27, 1.08 gordon@LKG7102D7:~$ free total used free shared buffers cached Mem: 29988 28876 1112 0 172 1840 -/+ buffers/cache: 26864 3124 Swap: 88316 63124 25192 Gordon -- Gordon Farquharson ^ permalink raw reply [flat|nested] 19+ messages in thread
* Bug#403426: kernel corrupts LUKS partition header on arm 2007-01-05 8:36 ` Gordon Farquharson 2007-01-05 9:59 ` Martin Michlmayr @ 2007-01-07 5:47 ` Gordon Farquharson 1 sibling, 0 replies; 19+ messages in thread From: Gordon Farquharson @ 2007-01-07 5:47 UTC (permalink / raw) To: Martin Michlmayr; +Cc: Clemens Fruhwirth, dm-devel, Brian Brunswick, 403426 On 1/5/07, Gordon Farquharson <gordonfarquharson@gmail.com> wrote: > However, I have found that I am unable to access the LUKS partition > when the system is under heavy load and swapping. > > Martin, I wonder if this has anything to do with the virtual memory > bug in the kernel that we experienced with apt. It could be that this > bug existed before 2.6.19 but was much harder to trigger (e.g. see > http://lkml.org/lkml/2007/1/3/285). It would be interesting to try > accessing a LUKS partition under heavy load while running 2.6.20-git, > but that will have to wait until the weekend for me to test it. Ok, I tested 2.6.19-1~experimental.1 + vm-fix-nasty-and-subtle-race-in-shared-mmap-ed-page-writeback.patch flush_anon_page-generic.patch flush_anon_page-arm.patch and I am still unable to access the LUKS partition under heavy load and swapping, so it appears that there is something else causing this problem. Gordon -- Gordon Farquharson ^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2007-01-07 5:47 UTC | newest]

Thread overview: 19+ messages
[not found] <20061217022906.2434.60658.reportbug@LKG8A754B.example.org>
2006-12-20 16:15 ` Bug#403426: kernel corrupts LUKS partition header on arm Martin Michlmayr
2006-12-29 10:52 ` Clemens Fruhwirth
2006-12-29 16:38 ` Martin Michlmayr
2006-12-29 20:24 ` Martin Michlmayr
2006-12-30 10:50 ` Clemens Fruhwirth
2006-12-30 13:13 ` Martin Michlmayr
2007-01-02 17:00 ` Clemens Fruhwirth
2007-01-02 18:04 ` Martin Michlmayr
2007-01-02 18:34 ` Clemens Fruhwirth
2007-01-03 16:59 ` Clemens Fruhwirth
2007-01-03 19:14 ` Martin Michlmayr
2007-01-03 19:32 ` Clemens Fruhwirth
2007-01-03 19:37 ` Martin Michlmayr
2007-01-04 11:56 ` Clemens Fruhwirth
2007-01-04 15:09 ` Martin Michlmayr
2007-01-05 8:36 ` Gordon Farquharson
2007-01-05 9:59 ` Martin Michlmayr
2007-01-06 6:38 ` Gordon Farquharson
2007-01-07 5:47 ` Gordon Farquharson