* [dm-crypt] what touches the LUKS header?
@ 2010-08-07 21:06 epvdm
  2010-08-08  0:53 ` Arno Wagner
  0 siblings, 1 reply; 6+ messages in thread
From: epvdm @ 2010-08-07 21:06 UTC (permalink / raw)
  To: dm-crypt


I'm trying to discover what happened to a LUKS-encrypted partition that has
become impossible to unlock following a reboot.  The partition is an md mirror
of two disk partitions.

I'm assuming that something has managed to corrupt the LUKS header, as trying
to unlock gives "No key available with this passphrase." I'm quite confident
the passphrase is correct, as it worked after a reboot a few days ago, and I'm
irresponsibly consistent in my passphrase choices, sad to say. :)

I have a bunch of mirror copies of the partition, and the LUKS header data on
all of them (from luksHeaderBackup) is identical. luksDump shows what looks
like plausible data:

Version:       	1
Cipher name:   	aes
Cipher mode:   	cbc-essiv:sha256
Hash spec:     	sha1
Payload offset:	1032
MK bits:       	128
MK digest:     	a4 5e 29 97 a6 01 0c c8 f7 be ad e2 75 18 19 07 0b a2 e9 cd 
MK salt:       	<...>
MK iterations: 	10
UUID:          	eab03b35-6945-4608-9ae3-14c4bda8c8df

Key Slot 0: ENABLED
	Iterations:         	250772
	Salt:               	<...>
	Key material offset:	8
	AF stripes:            	4000
Key Slot 1: DISABLED
...

All the necessary kernel modules, etc, are in place, as far as I can tell. 
Again, it was mounted and working until a recent crash and reboot. 

I'm wondering if the header area ever gets written to under normal operation
such that a crash could have left it corrupted, or if it's only written when
modifying keyslots, etc... 

There are other partitions on the disks which don't appear to have become
corrupted. My main suspect so far is that md had something to do with it, but
it seems odd that it would selectively corrupt a small area of disk that
wouldn't have been written recently and that's well off in the middle of the
drive, without doing wider damage.

Naturally I'd be thrilled to find some way of recovering the data but I'm 
more concerned at the moment with finding out why it's suddenly stopped 
working in the first place. 

I don't believe corruption to other parts of the volume, outside of the header
area, could cause it to fail to unlock like this; is that correct? 

thanks, 
eric

* Re: [dm-crypt] what touches the LUKS header?
  2010-08-07 21:06 [dm-crypt] what touches the LUKS header? epvdm
@ 2010-08-08  0:53 ` Arno Wagner
  2010-08-08  1:48   ` epvdm
  0 siblings, 1 reply; 6+ messages in thread
From: Arno Wagner @ 2010-08-08  0:53 UTC (permalink / raw)
  To: dm-crypt

On Sat, Aug 07, 2010 at 02:06:59PM -0700, epvdm@limpoc.com wrote:
> 
> I'm trying to discover what happened to a Luks encrypted partition that's
> become impossible to unlock following a reboot.  The partition is a md
> mirror of two disk partitions.
> 
> I'm assuming that something has managed to corrupt the LUKS header, as
> trying to unlock gives "No key available with this passphrase." I'm quite
> confident the passphrase is correct, as it worked after a reboot a few
> days ago, and i'm irresponsibly consistent in my passphrase choices, sad
> to say. :)
> 
> I have a bunch of mirror copies of the partition and the luks header data on
> all of them (from luksHeaderBackup) is identical. 

So this is an n-way (n>2) RAID1?

> luksDump shows what looks 
> like plausible data: 
> 
> Version:       	1
> Cipher name:   	aes
> Cipher mode:   	cbc-essiv:sha256
> Hash spec:     	sha1
> Payload offset:	1032
> MK bits:       	128
> MK digest:     	a4 5e 29 97 a6 01 0c c8 f7 be ad e2 75 18 19 07 0b a2 e9 cd 
> MK salt:       	<...>
> MK iterations: 	10
> UUID:          	eab03b35-6945-4608-9ae3-14c4bda8c8df
> 
> Key Slot 0: ENABLED
> 	Iterations:         	250772
> 	Salt:               	<...>
> 	Key material offset:	8
> 	AF stripes:            	4000
> Key Slot 1: DISABLED
> ...
> 
> All the necessary kernel modules, etc, are in place, as far as I can tell. 
> Again, it was mounted and working until a recent crash and reboot. 
> 
> I'm wondering if the header area ever gets written to under normal operation
> such that a crash could have left it corrupted, or if it's only written when
> modifying keyslots, etc... 

As far as I am aware, nothing gets written unless you change
a key-slot.

> There are other partitions on the disks which don't appear to have become
> corrupted. My main suspect so far is that md had something to do with it,
> but it seems odd that it'd selective corrupt a small area of disk that
> wouldn't have been written recently, that's well off in the middle of the
> drive, and not damage more widely.
> 
> Naturally I'd be thrilled to find some way of recovering the data but I'm 
> more concerned at the moment with finding out why it's suddenly stopped 
> working in the first place. 
> 
> I don't believe corruption to other parts of the volume, outside of the
> header area, could cause it to fail to unlock like this; is that correct?

AFAIK the above error will also happen if the key-slot data has been corrupted.
As you can see from the FAQ, every key-slot is about 128kB in size. Any bit
changed in the key-stripes will make the key unrecoverable.
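
To make that failure mode concrete: as I understand the LUKS1 format, the
last step of unlocking compares a PBKDF2 digest of the recovered master key
against the "MK digest" shown by luksDump, using the "MK salt",
"MK iterations" and "Hash spec" fields. If the key-slot is damaged, the
recovered key is different, the digest no longer matches, and the passphrase
is rejected. The Python sketch below models only that final comparison. The
key and salt are random stand-ins (not values from this thread), and the
key-slot decryption and anti-forensic merge that come before it are left out.

```python
import hashlib
import os


def mk_digest(candidate_mk: bytes, mk_salt: bytes, iterations: int) -> bytes:
    """PBKDF2-HMAC-SHA1 digest of a candidate master key (20 bytes, as in luksDump)."""
    return hashlib.pbkdf2_hmac("sha1", candidate_mk, mk_salt, iterations, dklen=20)


# Hypothetical stand-ins: a random 128-bit master key and a 32-byte salt.
master_key = os.urandom(16)
mk_salt = os.urandom(32)
stored_digest = mk_digest(master_key, mk_salt, 10)  # "MK iterations: 10"

# Simulate damage by flipping one bit of the candidate key. On a real volume
# the damage would sit in the key-stripes and spread over the whole key.
damaged_key = bytes([master_key[0] ^ 0x01]) + master_key[1:]

print(mk_digest(master_key, mk_salt, 10) == stored_digest)   # intact key: match
print(mk_digest(damaged_key, mk_salt, 10) == stored_digest)  # damaged key: no match
```

The digest only says "match" or "no match". It gives no hint about which
bits are wrong, which is why a damaged key-slot cannot be repaired from the
header alone.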

I have been using md-RAID for a long time, also in a 3-way RAID1
configuration, and never had any corruption that I know of. For
the time being I would rule that out, especially if the data area on all
mirrors is equal. I think you should compare the header, key-slots and
key-stripes though. One way to do that would be to use 'cmp' on the
raw devices and see where the first differing byte is.
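
For anyone who prefers to script the comparison, here is a minimal Python
sketch of the same idea as 'cmp', limited to the first N bytes of two files
or block devices. The function name and the device paths in the usage
comment are made up for illustration. Reading raw devices normally needs
root, and this only reads, it never writes.

```python
def first_difference(path_a, path_b, limit, chunk=64 * 1024):
    """Return the 0-based offset of the first differing byte within the
    first `limit` bytes of two files or block devices, or None if equal."""
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while offset < limit:
            want = min(chunk, limit - offset)
            block_a = a.read(want)
            block_b = b.read(want)
            if block_a != block_b:
                for i, (x, y) in enumerate(zip(block_a, block_b)):
                    if x != y:
                        return offset + i
                # No differing byte in the overlap: one input is shorter.
                return offset + min(len(block_a), len(block_b))
            if not block_a:
                break  # both inputs ended before `limit`
            offset += len(block_a)
    return None


# Hypothetical usage on two mirror members:
#   first_difference("/dev/sdX1", "/dev/sdY1", 1024 * 1024 + 4096)
```

Note that 'cmp' reports byte positions starting at 1, while this sketch
counts from 0.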

The one case in which md will ever corrupt something is when
you get a manual mapping of the RAIDed area wrong and the
RAID superblock happens to fall into a data area.

There is a second possibility: keyboard input problems. I know
it sounds stupid, but try every character and symbol you have in
your passphrase and see whether it echoes correctly.

Arno
-- 
Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email: arno@wagner.name 
GnuPG:  ID: 1E25338F  FP: 0C30 5782 9D93 F785 E79C  0296 797F 6B50 1E25 338F
----
Cuddly UI's are the manifestation of wishful thinking. -- Dylan Evans

If it's in the news, don't worry about it.  The very definition of 
"news" is "something that hardly ever happens." -- Bruce Schneier 

* Re: [dm-crypt] what touches the LUKS header?
  2010-08-08  0:53 ` Arno Wagner
@ 2010-08-08  1:48   ` epvdm
  2010-08-08  3:57     ` Arno Wagner
  0 siblings, 1 reply; 6+ messages in thread
From: epvdm @ 2010-08-08  1:48 UTC (permalink / raw)
  To: dm-crypt

On Sun, Aug 08, 2010 at 02:53:47AM +0200, Arno Wagner wrote:
> 
> So this is an n-way (n>2) RAID1?
> 

It's a 2-way RAID1, but I've (prior to the failure, of course) pulled copies
of it by breaking one member out and resyncing a new blank drive to it.
I've done this many times in the past without trouble, but I'm willing to
accept that it could be bad...


> > I'm wondering if the header area ever gets written to under normal operation
> > such that a crash could have left it corrupted, or if it's only written when
> > modifying keyslots, etc... 
> 
> As far as I am aware of, nothing gets written unless you change
> a key-slot.

That makes sense, and it's what I expected. 

> AFAIK the above error will also happen if the key has been corrupted.
> As you can see from the FAQ, every key is about 128kB in size. Any bit 
> changed in the key-stripes will result in unrecoverability.
> 
> I have been using md-RAID for a long time, also in a 3-way RAID1
> configuration and never had any corruption that I know of. For 
> the time I would rule that out, especially if the data area on all
> mirrors is equal. I think you should compare the  header, keyslots and
> key-stripes though. One way to do that would be to use 'cmp' on the 
> raw devices and see where the first different byte is.


> The one possibility when md will ever corrupt something is when
> you get a manual mapping of the RAIDed area wrong and the 
> RAID superblock happens to fall into a data area.
> 
> There is a second possibility: Keyboard input problems. I know
> it sounds stupid, but try every character and symbol you have in
> your passphrase and see whether it echos right.

Oh, certainly. I spent a long time on this before even looking into other
possibilities. I put the disks on another machine to test, and tried with
the passphrase in a keyfile, loaded with --key-file, with and without
trailing cr/lf, as well as typing the passphrase in the clear and cut-n-pasting
it into the cryptsetup prompt. 
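
The keyfile variants described above can be generated in one go. This is a
minimal Python sketch with made-up file names. As far as I know, cryptsetup
uses the content of a --key-file as-is, so a trailing LF or CR/LF becomes
part of the passphrase, which is why all three variants are worth trying.
The files hold the passphrase in the clear, so they are created with mode
0600 and are best kept on a tmpfs and deleted afterwards.

```python
import os


def write_keyfile_variants(passphrase: str, directory: str) -> list:
    """Write the passphrase with no terminator, with LF and with CR/LF.

    Returns the paths of the three files (the names are arbitrary).
    """
    variants = {"key_plain": b"", "key_lf": b"\n", "key_crlf": b"\r\n"}
    paths = []
    for name, terminator in variants.items():
        path = os.path.join(directory, name)
        # Create the file readable and writable by the owner only.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        with os.fdopen(fd, "wb") as f:
            f.write(passphrase.encode("utf-8") + terminator)
        paths.append(path)
    return paths


# Each resulting file can then be tried in turn, for example:
#   cryptsetup --key-file <path> luksOpen <device> <name>
```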

For what it's worth, the partitions are identical at least for a few gigabytes
in. Though I haven't compared the whole 900+ GB, I assume 3 or 4 GB should be
more than enough to cover any possible key material. So whatever corruption
has happened would seem to have been above the disk level.

Here are a couple of questions. First, how do I determine the total extent
of the partition in which corruption could cause this problem, i.e., the header
and all key material? And second, is that area sparse, or should it all be
filled in? I was thinking of looking through it manually, trying to find
patterns of data that might have been dropped on top of it from buffer cache
or elsewhere, for instance readable text, RAID or filesystem superblocks,
magic numbers of common executable or other file types, etc. This could at
least provide a clue. But if the area is sparse and might normally contain
data that was already on the raw partition before it was luksFormatted, it
would be more difficult.

Thanks very much for your help, btw.

eric

* Re: [dm-crypt] what touches the LUKS header?
  2010-08-08  1:48   ` epvdm
@ 2010-08-08  3:57     ` Arno Wagner
  2010-08-09 23:04       ` epvdm
  0 siblings, 1 reply; 6+ messages in thread
From: Arno Wagner @ 2010-08-08  3:57 UTC (permalink / raw)
  To: dm-crypt


Ok, first read the relevant FAQ items. They really help. (I hope)

On Sat, Aug 07, 2010 at 06:48:39PM -0700, epvdm@limpoc.com wrote:
> On Sun, Aug 08, 2010 at 02:53:47AM +0200, Arno Wagner wrote:
> > 
> > So this is an n-way (n>2) RAID1?
> > 
> 
> It's a 2-way raid1, but i've (prior to the failure, of course) pulled copies 
> of it by breaking one member out, and resyncing a new blank drive to it. 
> I've done this many times in the past without trouble but I'm willing to 
> accept that it could be bad... 

Possible, but unlikely, I think. Let's look at other things first.
  
> > > I'm wondering if the header area ever gets written to under normal operation
> > > such that a crash could have left it corrupted, or if it's only written when
> > > modifying keyslots, etc... 
> > 
> > As far as I am aware of, nothing gets written unless you change
> > a key-slot.
> 
> That makes sense, and it's what I expected. 
> 
> > AFAIK the above error will also happen if the key has been corrupted.
> > As you can see from the FAQ, every key is about 128kB in size. Any bit 
> > changed in the key-stripes will result in unrecoverability.
> > 
> > I have been using md-RAID for a long time, also in a 3-way RAID1
> > configuration and never had any corruption that I know of. For 
> > the time I would rule that out, especially if the data area on all
> > mirrors is equal. I think you should compare the  header, keyslots and
> > key-stripes though. One way to do that would be to use 'cmp' on the 
> > raw devices and see where the first different byte is.
> 
> 
> > The one possibility when md will ever corrupt something is when
> > you get a manual mapping of the RAIDed area wrong and the 
> > RAID superblock happens to fall into a data area.
> > 
> > There is a second possibility: Keyboard input problems. I know
> > it sounds stupid, but try every character and symbol you have in
> > your passphrase and see whether it echos right.
> 
> Oh, certainly. I spent a long time on this before even looking into other
> possibilities. I put the disks on another machine to test, and tried with
> the passphrase in a keyfile, loaded with --key-file, with and without
> trailing cr/lf, as well as typing the passphrase in the clear and cut-n-pasting
> it into the cryptsetup prompt. 

Ok. Have you tried one of your backups for comparison as well? 
They should work. Just for completeness...

Incidentally, your backups should contain a good header + key-slots,
so copying them over should repair any possible damage. See the
FAQ item on making header backups. But don't do that yet; compare
the first 1MiB+4096B of a backup and a live disk first. Any header
or key-slot corruption should show up as a difference. If there is no
difference, then you have some other problem.

> for what it's worth, the partitions are identical at least for a few gigabytes
> in. Though I haven't compared the whole 900+ GB, I assume 3 or 4 GB should be
> more than enough to cover any possible key material. So whatever corruption
> has happened would seem to have been above the disk level. 

1MiB+4096B is enough to cover header and all keyslots. Hmm.
 
> here's a couple of questions - first, how do I determine the total extent
> of the partition in which corruption could cause this problem; i.e, header,
> all key material? 

Not a partition, just the first 1MiB+4096B. That area is not shown
in the decrypted device; the decrypted device starts at the sectors
right after it. This is also documented in the FAQ.
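
The extent of that area can also be worked out from the luksDump fields. The
Python sketch below assumes the LUKS1 layout as I understand it: 8 key-slots,
each holding (MK bits / 8) x (AF stripes) bytes of key material, with every
slot rounded up to a multiple of 8 sectors of 512 bytes. The function name
and the alignment value are assumptions. The "Payload offset" printed by
luksDump is the authoritative number and serves as a cross-check.

```python
SECTOR = 512  # bytes per sector, the unit used by the luksDump offsets


def luks1_keyslot_layout(mk_bits, af_stripes, first_offset, slots=8, align=8):
    """Return (material_bytes, [(slot, start_sector, end_sector)], area_end_sector).

    end_sector is exclusive; area_end_sector is where the payload would start.
    """
    material_bytes = (mk_bits // 8) * af_stripes
    material_sectors = -(-material_bytes // SECTOR)        # round up
    slot_sectors = -(-material_sectors // align) * align   # round up to alignment
    layout = [
        (i,
         first_offset + i * slot_sectors,
         first_offset + i * slot_sectors + material_sectors)
        for i in range(slots)
    ]
    return material_bytes, layout, first_offset + slots * slot_sectors


# Values from the luksDump output at the top of this thread:
# MK bits 128, AF stripes 4000, key material offset 8.
material, layout, end = luks1_keyslot_layout(128, 4000, 8)
print(material)            # 64000 bytes of key material per slot
print(layout[0])           # (0, 8, 133): slot 0 occupies sectors 8..132
print(end, end * SECTOR)   # 1032 528384
```

The result of 1032 sectors agrees with "Payload offset: 1032" in the dump,
so for this particular volume the header and all key-slots end at 528384
bytes (516 KiB). The same arithmetic with a 256-bit key gives 2056 sectors,
which is the 1MiB+4096B figure. Comparing the first 1MiB+4096B is therefore
more than enough in either case.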

The problem could also happen, I think, if one of the salts
got corrupted. But I would need to try that to be sure. Apart
from that, the key-slots are the main suspect.

> And second, is that area sparse, or should it all be
> filled in. 

Mostly key data, but the key-stripes do not quite fill the 128kiB
allocated for each. See FAQ.

> I was thinking of looking through it manually trying to find 
> patterns of data that might have been dropped on top of it from buffer cache
> or elsewhere, for instance readable text, raid or filesystem superblocks,
> magic numbers of common executable or other file types, etc. This could at 
> least provide a clue. But if the area is sparse and might normally contain 
> data that was already on the raw partition before it was luksFormatted, it 
> would be more difficult.

No, this is a good idea. But do the comparison with the header and
key-slots on a working backup disk first. See FAQ item
"What does the on-disk structure of LUKS look like?"
for the exact length and position of the key-slots. A key-slot consists
of tightly packed (no spacer or unused space) anti-forensic stripes
and looks like encrypted data, i.e. "random". If you want to get a
feel for it, FAQ item "How do I use LUKS with a loop-device?" gives
instructions on how to set up LUKS on a file via the loop-device.
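
Since key material should look random, one rough way to do the manual scan is
to flag sectors that do not. The Python sketch below computes the byte
entropy and the share of printable ASCII for each 512-byte sector and reports
the low-entropy ones. The threshold of 7.0 bits per byte is a guess that
separates random data from text or zero-filled sectors in simple cases, not a
calibrated value, and the file name in the usage comment is made up.

```python
import math
from collections import Counter

SECTOR = 512


def sector_stats(block: bytes):
    """Return (entropy in bits per byte, fraction of printable ASCII) for one block."""
    counts = Counter(block)
    n = len(block)
    entropy = sum((c / n) * math.log2(n / c) for c in counts.values())
    printable = sum(1 for b in block if 32 <= b < 127 or b in (9, 10, 13)) / n
    return entropy, printable


def suspicious_sectors(data: bytes, min_entropy: float = 7.0):
    """Yield (sector_number, entropy, printable_fraction) for low-entropy sectors."""
    for offset in range(0, len(data), SECTOR):
        block = data[offset:offset + SECTOR]
        entropy, printable = sector_stats(block)
        if entropy < min_entropy:
            yield offset // SECTOR, entropy, printable


# Hypothetical usage on a header backup file:
#   with open("header-backup.img", "rb") as f:
#       for sector, entropy, printable in suspicious_sectors(f.read()):
#           print(sector, round(entropy, 2), round(printable, 2))
```

A flagged sector is only a problem if it lies inside the range of an enabled
key-slot. The first sectors hold the header structure itself and are not
expected to look random.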

> thanks very much for your help,  btw. 

You are welcome.

Sorry for pointing to the FAQ so often; it really gives you most
of the info you need. The current copy was posted on this list today,
or is on the web at

  http://code.google.com/p/cryptsetup/wiki/FrequentlyAskedQuestions

Arno

* Re: [dm-crypt] what touches the LUKS header?
  2010-08-08  3:57     ` Arno Wagner
@ 2010-08-09 23:04       ` epvdm
  2010-08-09 23:35         ` Arno Wagner
  0 siblings, 1 reply; 6+ messages in thread
From: epvdm @ 2010-08-09 23:04 UTC (permalink / raw)
  To: dm-crypt

On Sun, Aug 08, 2010 at 05:57:26AM +0200, Arno Wagner wrote:
> > Oh, certainly. I spent a long time on this before even looking into other
> > possibilities. I put the disks on another machine to test, and tried with
> > the passphrase in a keyfile, loaded with --key-file, with and without
> > trailing cr/lf, as well as typing the passphrase in the clear and cut-n-pasting
> > it into the cryptsetup prompt. 
> 
> Ok. Have you tried one of your backups for comparison as well? 
> They should work. Just for completeness...
> 
> Incidentally, your backups should contain a good header + key-slots, 
> so copying them over should repair any possible damage. See
> FAQ item on making header backups. But don't do that yet, compare
> the first 1MiB+4096B of a backup and a live disk first. Any header
> or key-slot corruption should show up as difference. If there is no 
> difference, then you have some other problem.

The "real" backups are taken from the mounted filesystem, so they don't contain
the LUKS key material. The mirror-copies I have were all made over a short period
of time and display the same problem, suggesting that the damage happened some
time before that and wasn't noticed until the reboot. 

> > for what it's worth, the partitions are identical at least for a few gigabytes
> > in. Though I haven't compared the whole 900+ GB, I assume 3 or 4 GB should be
> > more than enough to cover any possible key material. So whatever corruption
> > has happened would seem to have been above the disk level. 
> 
> 1MiB+4096B is enough to cover header and all keyslots. Hmm.
>  
> > here's a couple of questions - first, how do I determine the total extent
> > of the partition in which corruption could cause this problem; i.e, header,
> > all key material? 
> 
> Not a partition. Just the first 1MiB+4096B. They are not shown
> in the decrypted device, the decrypted device is the sectors right 
> after that. Also documented in the FAQ.
> 
> The problem could also happen, I think, if one of the salts
> got corrupted. But I would need to try that to be sure. Apart
> from that, the key-slots are the main suspect.
> 
> > And second, is that area sparse, or should it all be
> > filled in. 
> 
> Mostly key data, but the key-stripes do not quite fill the 128kiB
> allocated for each. See FAQ.
> 
> > I was thinking of looking through it manually trying to find 
> > patterns of data that might have been dropped on top of it from buffer cache
> > or elsewhere, for instance readable text, raid or filesystem superblocks,
> > magic numbers of common executable or other file types, etc. This could at 
> > least provide a clue. But if the area is sparse and might normally contain 
> > data that was already on the raw partition before it was luksFormatted, it 
> > would be more difficult.
> 
> No, this is a good idea. But do the comparison with the header and 
> key-slots on a working backup disk first. See FAQ item 
> "What does the on-disk structure of LUKS look like?" 
> for exact length and position of the key-slots. A key-slot consists 
> of tightly packed (no spacer or unused space) anti-forensic stripes 
> and looks like encrypted data, i.e. "random". If you want to get a 
> feel for it, FAQ item "How do I use LUKS with a loop-device?" gives 
> instructions how to do LUKS on a file via the loop-device.

This is interesting. Looking through the first 1MiB+4096B I see quite a lot
of material that is obviously not key material, e.g. text, perl snippets, and
other stuff one would ordinarily see lying around a linux system disk. Now,
there was only ever a single LUKS keyslot in use, so if the space dedicated
to the rest of them does not get initialized, it could be that I am just seeing
what was on the disk before LUKS was initialized. However, it could also
be bits of other areas of the disk, or buffer cache, that got written to the
keyslot areas.

> > thanks very much for your help,  btw. 
> 
> You are welcome.
> 
> Sorry for pointing to the FAQ so often, it really gives you most 
> of the info you need. Current copy posted on this list today or 
> on the web at
> 
>   http://code.google.com/p/cryptsetup/wiki/FrequentlyAskedQuestions
> 

The FAQ is very helpful; sorry I missed a few parts such as the size of the key
area. :) 

Thanks,
eric

* Re: [dm-crypt] what touches the LUKS header?
  2010-08-09 23:04       ` epvdm
@ 2010-08-09 23:35         ` Arno Wagner
  0 siblings, 0 replies; 6+ messages in thread
From: Arno Wagner @ 2010-08-09 23:35 UTC (permalink / raw)
  To: dm-crypt

On Mon, Aug 09, 2010 at 04:04:04PM -0700, epvdm@limpoc.com wrote:
> On Sun, Aug 08, 2010 at 05:57:26AM +0200, Arno Wagner wrote:
> > > Oh, certainly. I spent a long time on this before even looking into other
> > > possibilities. I put the disks on another machine to test, and tried with
> > > the passphrase in a keyfile, loaded with --key-file, with and without
> > > trailing cr/lf, as well as typing the passphrase in the clear and cut-n-pasting
> > > it into the cryptsetup prompt. 
> > 
> > Ok. Have you tried one of your backups for comparison as well? 
> > They should work. Just for completeness...
> > 
> > Incidentally, your backups should contain a good header + key-slots, 
> > so copying them over should repair any possible damage. See
> > FAQ item on making header backups. But don't do that yet, compare
> > the first 1MiB+4096B of a backup and a live disk first. Any header
> > or key-slot corruption should show up as difference. If there is no 
> > difference, then you have some other problem.

> 
> The "real" backups are taken from the mounted filesystem, so they don't
> contain the LUKS key material. The mirror-copies I have were all made over
> a short period of time and display the same problem, suggesting that the
> damage happened some time before that and wasn't noticed until the reboot.

I see. A pity.

[...] 
> > No, this is a good idea. But do the comparison with the header and 
> > key-slots on a working backup disk first. See FAQ item 
> > "What does the on-disk structure of LUKS look like?" 
> > for exact length and position of the key-slots. A key-slot consists 
> > of tightly packed (no spacer or unused space) anti-forensic stripes 
> > and looks like encrypted data, i.e. "random". If you want to get a 
> > feel for it, FAQ item "How do I use LUKS with a loop-device?" gives 
> > instructions how to do LUKS on a file via the loop-device.
> 
> This is interesting. Looking through the first 1MiB+4096B I see quite a
> lot of material that is obviously not key material - i.e, text, perl
> snippets, and other stuff one would ordinarily see lying around a linux
> system disk. Now, there was only ever a single LUKS keyslot in use, so if
> the space dedicated to the rest of them does not get initialized, it
> could be that I am just seeing what was on the disk before LUKS was
> initialized. However, it could also be bits of other areas of the disk, or
> buffer cache, that got written to the keyslot areas.

The space for unused key-slots does not get initialized. So for you the
first 128kiB (header plus key-slot 0) would be the relevant area.

> > > thanks very much for your help,  btw. 
> > 
> > You are welcome.
> > 
> > Sorry for pointing to the FAQ so often, it really gives you most 
> > of the info you need. Current copy posted on this list today or 
> > on the web at
> > 
> >   http://code.google.com/p/cryptsetup/wiki/FrequentlyAskedQuestions
> > 
> 
> The FAQ is very helpful; sorry I missed a few parts such as the 
>  size of the key area. :) 

It has gotten a bit long, admittedly.

Arno
