* [REGRESSION] Unable to unlock encrypted disk starting with kernel 5.19-rc1+
@ 2022-06-28  5:13 Alexandre Messier
  2022-06-28  9:20 ` Borislav Petkov
  0 siblings, 1 reply; 7+ messages in thread
From: Alexandre Messier @ 2022-06-28  5:13 UTC
  To: linux-kernel
  Cc: tglx, Andrew.Cooper3, mingo, bp, dave.hansen, x86, regressions,
	Alexandre Messier

Hello,

I tested 5.19-rc4 on my system that is currently running 5.18.0, and came
across an issue when unlocking the encrypted rootfs disk at startup. The error
message is:

device-mapper: reload ioctl on nvme0n1p3_crypt (254:0) failed: No such file or directory

The kernel log shows:

device-mapper: table: 254:0: crypt: Error allocating crypto tfm (-ENOENT)
device-mapper: ioctl: error adding target to table

I tested the earlier 5.19 release candidates as well, and the issue started happening with 5.19-rc1.
A bisection between 5.18.0 and 5.19-rc1 identifies the following commit:

8ad7e8f69695 ("x86/fpu/xsave: Support XSAVEC in the kernel")

I reverted that commit on top of 5.19-rc4, and unlocking the encrypted disk
works again.

Some more information about the system:
- CPU is AMD Ryzen 7 5700G
- Userspace is Debian Sid
- The encrypted disk setup is a default encrypted rootfs, as configured by the
  standard Debian installer

Please let me know if more information is needed, or if some tests are needed
to be run.

Thanks,
Alex

#regzbot introduced 8ad7e8f69695

* Re: [REGRESSION] Unable to unlock encrypted disk starting with kernel 5.19-rc1+
  2022-06-28  5:13 [REGRESSION] Unable to unlock encrypted disk starting with kernel 5.19-rc1+ Alexandre Messier
@ 2022-06-28  9:20 ` Borislav Petkov
  2022-06-28 16:52   ` Dave Hansen
  2022-06-28 21:31   ` Alexandre Messier
  0 siblings, 2 replies; 7+ messages in thread
From: Borislav Petkov @ 2022-06-28  9:20 UTC
  To: Alexandre Messier
  Cc: linux-kernel, tglx, Andrew.Cooper3, mingo, dave.hansen, x86, regressions

On Tue, Jun 28, 2022 at 01:13:30AM -0400, Alexandre Messier wrote:
> Please let me know if more information is needed, or if some tests are needed
> to be run.

Yeah, pls send /proc/cpuinfo and full dmesg - privately is fine too.

Also, it would be lovely if I were able to reproduce this on a machine
here but mine doesn't have a crypto rootfs.

Perhaps you can point me to the exact instructions you're running to
decrypt your rootfs and I can try to create a usb crypto disk and try to
reproduce it with them...

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

* Re: [REGRESSION] Unable to unlock encrypted disk starting with kernel 5.19-rc1+
  2022-06-28  9:20 ` Borislav Petkov
@ 2022-06-28 16:52   ` Dave Hansen
  2022-06-28 21:31   ` Alexandre Messier
  1 sibling, 0 replies; 7+ messages in thread
From: Dave Hansen @ 2022-06-28 16:52 UTC
  To: Borislav Petkov, Alexandre Messier
  Cc: linux-kernel, tglx, Andrew.Cooper3, mingo, dave.hansen, x86, regressions

First of all, thank you for bisecting this!  I know those are a lot of work.

That XSAVEC patch modifies the AVX register save/restore code.  There is
a set of x86 AES acceleration instructions called AES-NI.  Those
instructions operate on the SSE/AVX vector registers (XMM/YMM).  So,
there is at least a plausible connection between that patch and your
symptoms.  But, I don't think anyone's been able to reproduce what
you're seeing yet.

The kernel XSAVE buffer formats also differ slightly between AMD and
Intel.  That *should* be OK, but it might explain why I can't reproduce
this.

If you get a chance, could you apply this (ugly hackish) patch to the
userspace 'cryptsetup' utility and run it?

	https://sr71.net/~dave/intel/cryptsetup-memcmp.patch

On Ubuntu at least, it was as simple as:

	apt-get source cryptsetup
	apt-get build-dep cryptsetup
	cd cryptsetup-1.6.6
	./configure
	make

Then I could run:

	./src/cryptsetup benchmark --cipher=aes-xts --key-size=512
and
	./src/cryptsetup benchmark --cipher=aes-xts --key-size=256

With that patch applied, you should see some output like:

# ./src/cryptsetup benchmark --cipher=aes-xts --key-size=512
# Tests are approximate using memory only (no storage IO).
memcmp12: 0
memcmp23: 0
memcmp13: 0
memcmp12: -173
memcmp23: 173
memcmp13: 0
#  Algorithm | Key |  Encryption |  Decryption
     aes-xts   512b  4592.2 MiB/s  4192.0 MiB/s

The "memcmp13:" lines should both be 0.  That means that an encryption
and decryption cycle didn't change the data.  You *might* have to run
this in a loop if there's some kind of bad timing involved in triggering
the bug.

If you see a "memcmp13:" with something other than 0, that will narrow
things down, and it means we'll have a pretty quick reproducer that
doesn't involve LUKS, which should speed things along.
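
For the "run this in a loop" case, a minimal sketch of such a loop might
look like the following.  It assumes the patched binary prints
"memcmp13: N" lines exactly as in the sample output above.  The names
check_memcmp13 and loop_benchmark and the CRYPTSETUP variable are
hypothetical, not part of cryptsetup or of the patch.

```shell
#!/bin/sh
# Hypothetical helper: read benchmark output on stdin and fail if any
# "memcmp13:" line carries a nonzero value, or if no such line appears
# (for example because the binary is unpatched or the cipher is missing).
check_memcmp13() {
	awk '
		/^memcmp13:/ { seen = 1; if ($2 != 0) bad = 1 }
		END { if (bad || !seen) exit 1 }
	'
}

# Hypothetical helper: run the benchmark repeatedly (default 100 runs)
# and stop at the first run whose output does not pass check_memcmp13.
# CRYPTSETUP defaults to the locally built binary from the build steps.
loop_benchmark() {
	runs=${1:-100}
	i=1
	while [ "$i" -le "$runs" ]; do
		if ! "${CRYPTSETUP:-./src/cryptsetup}" benchmark \
			--cipher=aes-xts --key-size=512 2>&1 | check_memcmp13; then
			echo "run $i: memcmp13 mismatch or missing output"
			return 1
		fi
		i=$((i + 1))
	done
	echo "no mismatch in $runs runs"
}
```

After sourcing the file from the build directory, "loop_benchmark 50"
would run the benchmark up to 50 times and stop at the first failure.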

* Re: [REGRESSION] Unable to unlock encrypted disk starting with kernel 5.19-rc1+
  2022-06-28  9:20 ` Borislav Petkov
  2022-06-28 16:52   ` Dave Hansen
@ 2022-06-28 21:31   ` Alexandre Messier
  2022-06-28 22:59     ` Thomas Gleixner
  1 sibling, 1 reply; 7+ messages in thread
From: Alexandre Messier @ 2022-06-28 21:31 UTC
  To: Borislav Petkov
  Cc: linux-kernel, tglx, Andrew.Cooper3, mingo, dave.hansen, x86, regressions

On 2022-06-28 05:20, Borislav Petkov wrote:
> On Tue, Jun 28, 2022 at 01:13:30AM -0400, Alexandre Messier wrote:
>> Please let me know if more information is needed, or if some tests are needed
>> to be run.
> 
> Yeah, pls send /proc/cpuinfo and full dmesg - privately is fine too.

Here is the cpuinfo output:

processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 25
model		: 80
model name	: AMD Ryzen 7 5700G with Radeon Graphics
stepping	: 0
microcode	: 0xa50000c
cpu MHz		: 3514.072
cache size	: 512 KB
physical id	: 0
siblings	: 16
core id		: 0
cpu cores	: 8
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 16
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
                  pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext
                  fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl
                  nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq
                  monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave
                  avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm
                  sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce
                  topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb
                  cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall
                  fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed
                  adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1
                  xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local
                  clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv
                  svm_lock nrip_save tsc_scale vmcb_clean flushbyasid
                  decodeassists pausefilter pfthreshold avic v_vmsave_vmload
                  vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid
                  overflow_recov succor smca fsrm
bugs		: sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass
bogomips	: 7585.33
TLB size	: 2560 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 48 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14]

And here is the dmesg output of 5.19-rc4 without the revert (taken from the
initramfs). I put it on a paste service since it is too big for email:

  https://paste.debian.net/1245491/

> 
> Also, it would be lovely if I were able to reproduce this on a machine
> here but mine doesn't have a crypto rootfs.
> 
> Perhaps you can point me to the exact instructions you're running to
> decrypt your rootfs and I can try to create a usb crypto disk and try to
> reproduce it with them...

I set up an unencrypted Debian installation on another drive to be able to run
cryptsetup commands in userspace while running 5.19-rc4, and was able to see the
issue. In an up-to-date Debian Sid installation (important, more on this below),
running these commands makes it possible to reproduce the issue:

  dd if=/dev/zero bs=1M count=20 of=./test.img
  sudo cryptsetup luksFormat ./test.img
  sudo cryptsetup luksOpen ./test.img test_crypt

The "luksOpen" will fail with the same error message I get on my main system.

It seems using the latest Debian Sid is important. At first, I was trying with
Debian Bullseye, but everything was working, even unlocking my main drive.

Could it be a difference due to the cryptsetup version? Sid is using 2.4.3,
while Bullseye is based on 2.3.7. I will try to compile cryptsetup 2.4.3 and
use it in a Bullseye system with kernel 5.19-rc4, to see if the issue occurs
in that setup.

Thanks,
Alex
 
> 
> Thx.
> 


* Re: [REGRESSION] Unable to unlock encrypted disk starting with kernel 5.19-rc1+
  2022-06-28 21:31   ` Alexandre Messier
@ 2022-06-28 22:59     ` Thomas Gleixner
  2022-06-28 23:24       ` Alexandre Messier
  0 siblings, 1 reply; 7+ messages in thread
From: Thomas Gleixner @ 2022-06-28 22:59 UTC
  To: Alexandre Messier, Borislav Petkov
  Cc: linux-kernel, Andrew.Cooper3, mingo, dave.hansen, x86, regressions

Alexandre,

On Tue, Jun 28 2022 at 17:31, Alexandre Messier wrote:
> flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
>                   pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext
>                   fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl
>                   nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq
>                   monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave
>                   avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm
>                   sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce
>                   topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb
>                   cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall
>                   fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed
>                   adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1
>                   xsaves cqm_llc cqm_occup_llc cqm_mbm_total
>                   cqm_mbm_local

So this CPU supports both XSAVEC and XSAVES, which means the kernel uses
XSAVES, just as it did before that commit.

> And here is the dmesg output of 5.19-rc4 without the revert (taken from the
> initramfs). I put it on a paste service since it is too big for email:
>
>   https://paste.debian.net/1245491/

[    0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x200: 'Protection Keys User registers'
[    0.000000] x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
[    0.000000] x86/fpu: xstate_offset[9]:  832, xstate_sizes[9]:    8
[    0.000000] x86/fpu: Enabled xstate features 0x207, context size is 840 bytes, using 'compacted' format.

This is correct. Is there any difference on a 5.18 kernel or on 5.19-rc
with the commit reverted? I doubt that.

I'm completely puzzled and stared at the commit in question on and off,
but I can't spot the fail.

> I set up an unencrypted Debian installation on another drive to be able to run
> cryptsetup commands in userspace while running 5.19-rc4, and was able to see the
> issue. In an up-to-date Debian Sid installation (important, more on this below),
> running these commands makes it possible to reproduce the issue:
>
>   dd if=/dev/zero bs=1M count=20 of=./test.img
>   sudo cryptsetup luksFormat ./test.img
>   sudo cryptsetup luksOpen ./test.img test_crypt
>
> The "luksOpen" will fail with the same error message I get on my main system.
>
> It seems using the latest Debian Sid is important. At first, I was trying with
> Debian Bullseye, but everything was working, even unlocking my main drive.
>
> Could it be a difference due to the cryptsetup version? Sid is using 2.4.3,
> while Bullseye is based on 2.3.7. I will try to compile cryptsetup 2.4.3 and
> use it in a Bullseye system with kernel 5.19-rc4, to see if the issue occurs
> in that setup.

It might use a different crypto algorithm.

Still confused....

I'll have another look tomorrow morning with brain awake.

Thanks,

        tglx

* Re: [REGRESSION] Unable to unlock encrypted disk starting with kernel 5.19-rc1+
  2022-06-28 22:59     ` Thomas Gleixner
@ 2022-06-28 23:24       ` Alexandre Messier
  2022-06-29 15:24         ` Dave Hansen
  0 siblings, 1 reply; 7+ messages in thread
From: Alexandre Messier @ 2022-06-28 23:24 UTC
  To: Thomas Gleixner, Borislav Petkov
  Cc: linux-kernel, Andrew.Cooper3, mingo, dave.hansen, x86, regressions

On 2022-06-28 18:59, Thomas Gleixner wrote:
> Alexandre,
> 
> On Tue, Jun 28 2022 at 17:31, Alexandre Messier wrote:
>> flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
>>                   pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext
>>                   fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl
>>                   nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq
>>                   monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave
>>                   avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm
>>                   sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce
>>                   topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb
>>                   cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall
>>                   fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed
>>                   adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1
>>                   xsaves cqm_llc cqm_occup_llc cqm_mbm_total
>>                   cqm_mbm_local
> 
> So this CPU supports both XSAVEC and XSAVES, which means the kernel uses
> XSAVES, just as it did before that commit.
> 
>> And here is the dmesg output of 5.19-rc4 without the revert (taken from the
>> initramfs). I put it on a paste service since it is too big for email:
>>
>>   https://paste.debian.net/1245491/
> 
> [    0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
> [    0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
> [    0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
> [    0.000000] x86/fpu: Supporting XSAVE feature 0x200: 'Protection Keys User registers'
> [    0.000000] x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
> [    0.000000] x86/fpu: xstate_offset[9]:  832, xstate_sizes[9]:    8
> [    0.000000] x86/fpu: Enabled xstate features 0x207, context size is 840 bytes, using 'compacted' format.
> 
> This is correct. Is there any difference on a 5.18 kernel or on 5.19-rc
> with the commit reverted? I doubt that.
> 
> I'm completely puzzled and stared at the commit in question on and off,
> but I can't spot the fail.
> 
>> I set up an unencrypted Debian installation on another drive to be able to run
>> cryptsetup commands in userspace while running 5.19-rc4, and was able to see the
>> issue. In an up-to-date Debian Sid installation (important, more on this below),
>> running these commands makes it possible to reproduce the issue:
>>
>>   dd if=/dev/zero bs=1M count=20 of=./test.img
>>   sudo cryptsetup luksFormat ./test.img
>>   sudo cryptsetup luksOpen ./test.img test_crypt
>>
>> The "luksOpen" will fail with the same error message I get on my main system.
>>
>> It seems using the latest Debian Sid is important. At first, I was trying with
>> Debian Bullseye, but everything was working, even unlocking my main drive.
>>
>> Could it be a difference due to the cryptsetup version? Sid is using 2.4.3,
>> while Bullseye is based on 2.3.7. I will try to compile cryptsetup 2.4.3 and
>> use it in a Bullseye system with kernel 5.19-rc4, to see if the issue occurs
>> in that setup.
> 
> It might use a different crypto algorithm.
> 
> Still confused....
> 
> I'll have another look tomorrow morning with brain awake.

Thomas, Borislav,

Well this is embarrassing... I ran the test Dave sent in his email, and when
running it on that unencrypted Debian Sid installation with kernel 5.19-rc4, it
failed too, but indicated that "aes-xts" was not available... It was right. 

I forgot to mention I am using a custom kernel config, and indeed CRYPTO_XTS
was not enabled. When I enabled it, the cryptsetup benchmark worked, and so did
the luksOpen test with the test image file that had previously failed.

So I enabled that option too on my main installation and I am now able to
unlock the drive like before. I don't know why it is needed now, but that fixed
the issue.
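
For anyone hitting the same -ENOENT from dm-crypt with a custom config, a
minimal sketch of checking an option such as CRYPTO_XTS is below. The name
config_state is a hypothetical helper, and the config file locations in the
usage note are assumptions that depend on how the kernel was built and
installed.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a kernel config option
# ("y", "m", ...) or "not set" if it is absent or commented out.
# $1 is the option name without the CONFIG_ prefix; $2 is a plain-text
# config file, or omitted to read the config from stdin.
config_state() {
	opt=$1
	file=${2:--}
	awk -v opt="CONFIG_$opt" -F= '
		$1 == opt { print $2; found = 1; exit }
		END { if (!found) print "not set" }
	' "$file"
}
```

Possible usage, assuming the config was installed next to the kernel image:
config_state CRYPTO_XTS "/boot/config-$(uname -r)". If the kernel was built
with CONFIG_IKCONFIG_PROC, the running kernel's config can be piped in
instead: zcat /proc/config.gz | config_state CRYPTO_XTS.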

Sorry again for the trouble, this was not a kernel regression, but my error.

Thanks,
Alex

#regzbot invalid: Missing kernel config, not kernel regression

> 
> Thanks,
> 
>         tglx


* Re: [REGRESSION] Unable to unlock encrypted disk starting with kernel 5.19-rc1+
  2022-06-28 23:24       ` Alexandre Messier
@ 2022-06-29 15:24         ` Dave Hansen
  0 siblings, 0 replies; 7+ messages in thread
From: Dave Hansen @ 2022-06-29 15:24 UTC
  To: Alexandre Messier, Thomas Gleixner, Borislav Petkov
  Cc: linux-kernel, Andrew.Cooper3, mingo, dave.hansen, x86, regressions

On 6/28/22 16:24, Alexandre Messier wrote:
> Sorry again for the trouble, this was not a kernel regression, but my error.

Been there, done that!  I'm just glad we don't have anything to fix. :)

Thread overview: 7+ messages
2022-06-28  5:13 [REGRESSION] Unable to unlock encrypted disk starting with kernel 5.19-rc1+ Alexandre Messier
2022-06-28  9:20 ` Borislav Petkov
2022-06-28 16:52   ` Dave Hansen
2022-06-28 21:31   ` Alexandre Messier
2022-06-28 22:59     ` Thomas Gleixner
2022-06-28 23:24       ` Alexandre Messier
2022-06-29 15:24         ` Dave Hansen