linux-kernel.vger.kernel.org archive mirror
* Sanitize CPU-state when switching from virtual-8086 mode to other task
@ 2013-12-28 22:02 halfdog
  2013-12-29  2:37 ` H. Peter Anvin
  0 siblings, 1 reply; 24+ messages in thread
From: halfdog @ 2013-12-28 22:02 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin; +Cc: x86, linux-kernel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

It seems that missing CPU-state sanitization during task switching
triggers a kernel panic. This might be related to unhandled FPU errors.
See [1] for a POC and the serial console log of the oops. Because I have
no real 32-bit x86 hardware at hand, it is not clear whether this issue
is related to subtle differences in virtual-8086 mode handling inside a
VirtualBox guest.

hd

[1] http://www.halfdog.net/Security/2013/Vm86SyscallTaskSwitchKernelPanic/


[  348.270712] fpu exception: 0000 [#1]
[  348.270763] Modules linked in: nfnetlink_log nfnetlink xt_multiport
xt_hashlimit xt_tcpudp ipt_ULOG xt_LOG xt_conntrack iptable_raw
iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat
nf_conntrack iptable_mangle iptable_filter ip_tables x_tables snd_pcm
snd_page_alloc snd_timer snd parport_pc soundcore microcode psmouse
serio_raw pcspkr evdev parport ac battery button i2c_piix4 i2c_core
ext4 crc16 mbcache jbd2 sg sr_mod sd_mod cdrom crc_t10dif ata_generic
ata_piix mptspi scsi_transport_spi mptscsih libata mptbase pcnet32 mii
scsi_mod
[  348.270763] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 3.11-2-486
#1 Debian 3.11.10-1
[  348.270763] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
VirtualBox 12/01/2006
[  348.270763] task: cf835400 ti: cf930000 task.ti: cf84a000
[  348.270763] EIP: 0060:[<c10013e0>] EFLAGS: 00010002 CPU: 0
[  348.270763] EIP is at __switch_to+0x190/0x300
[  348.270763] EAX: cd2eec00 EBX: cd2eec00 ECX: 00000000 EDX: 00000000
[  348.270763] ESI: cf835400 EDI: 00000001 EBP: cd2eedf8 ESP: cf931a40
[  348.270763]  DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
[  348.270763] CR0: 80050033 CR2: b76997e0 CR3: 0d11a000 CR4: 00000690
[  348.270763] Stack:
[  348.270763]  4a6ef7ab ccee9c80 ccee9900 cf835400 c13978cf cd2eec00
00200082 c15de480
[  348.270763]  00000018 67bf6d70 cf930000 cd2eec00 1625d3df 00000051
cd2eec2c c1056e15
[  348.270763]  00200086 0000000a cf931a90 c1006cc8 00393f1e 00000000
5d3e5d0f 00000040
[  348.270763] Call Trace:
[  348.270763]  [<c13978cf>] ? __schedule+0x1ef/0x510
[  348.270763]  [<c1056e15>] ? update_curr+0x95/0x140
[  348.270763]  [<c1006cc8>] ? sched_clock+0x8/0x10
[  348.270763]  [<c13973d5>] ? schedule_hrtimeout_range_clock+0x165/0x180
[  348.270763]  [<c1044e9f>] ? __flush_work+0xbf/0x100
[  348.270763]  [<d0a4fa59>] ? nf_nat_get_offset+0x39/0x60 [nf_nat]
[  348.270763]  [<d0a68df7>] ? tcp_packet+0x637/0xf40 [nf_conntrack]
[  348.270763]  [<c124932c>] ? tty_write_room+0xc/0x20
[  348.270763]  [<c1246fb9>] ? n_tty_poll+0x189/0x1a0
[  348.270763]  [<c13973ff>] ? schedule_hrtimeout_range+0xf/0x20
[  348.270763]  [<c11093a0>] ? poll_schedule_timeout+0x20/0x40
[  348.270763]  [<c1109c77>] ? do_select+0x537/0x5f0
[  348.270763]  [<c11094d0>] ? poll_select_copy_remaining+0x110/0x110
[  348.270763]  [<c11094d0>] ? poll_select_copy_remaining+0x110/0x110
[  348.270763]  [<c11094d0>] ? poll_select_copy_remaining+0x110/0x110
[  348.270763]  [<c11094d0>] ? poll_select_copy_remaining+0x110/0x110
[  348.270763]  [<c12f688d>] ? nf_iterate+0x7d/0x90
[  348.270763]  [<c1067e6c>] ? __getnstimeofday+0x2c/0x110
[  348.270763]  [<c133f7f2>] ? bictcp_cong_avoid+0x12/0x4a0
[  348.270763]  [<c1067f55>] ? getnstimeofday+0x5/0x20
[  348.270763]  [<c131116b>] ? tcp_ack+0x82b/0xdc0
[  348.270763]  [<c10353a0>] ? local_bh_enable+0x70/0x80
[  348.270763]  [<c1300301>] ? ip_finish_output+0x151/0x350
[  348.270763]  [<c10c612a>] ? put_compound_page+0xa/0xe0
[  348.270763]  [<c1311b07>] ? tcp_rcv_established+0xf7/0x7a0
[  348.270763]  [<c12c1edc>] ? sk_reset_timer+0xc/0x20
[  348.270763]  [<c131a94e>] ? tcp_v4_do_rcv+0x15e/0x3b0
[  348.270763]  [<c12c3558>] ? release_sock+0x88/0xf0
[  348.270763]  [<c13088d7>] ? tcp_sendmsg+0x177/0xc60
[  348.270763]  [<c1056e15>] ? update_curr+0x95/0x140
[  348.270763]  [<c1109e5c>] ? core_sys_select+0x12c/0x220
[  348.270763]  [<c12beee1>] ? sock_aio_write+0xe1/0x110
[  348.270763]  [<c10f9cda>] ? do_sync_write+0x6a/0xa0
[  348.270763]  [<c112b673>] ? fsnotify+0x203/0x2f0
[  348.270763]  [<c1109fdf>] ? SyS_select+0x8f/0xc0
[  348.270763]  [<c100aca2>] ? syscall_trace_leave+0xa2/0xb0
[  348.270763]  [<c1398fef>] ? syscall_call+0x7/0xb
[  348.270763] Code: e9 1d ff ff ff 8d b6 00 00 00 00 b8 7d 00 00 00
e8 36 b8 00 00 84 c0 0f 85 e1 fe ff ff 0f 06 8d 74 26 00 e9 d6 fe ff
ff 8d 76 00 <0f> 77 db 83 4c 02 00 00 89 f6 8d b6 00 00 00 00 eb 66 b8
ff ff
[  348.270763] EIP: [<c10013e0>] __switch_to+0x190/0x300 SS:ESP
0068:cf931a40
[  348.270763] ---[ end trace c3836805b501f815 ]---
[  348.274764] ------------[ cut here ]------------
[  348.278424] kernel BUG at
/build/linux-tAcKXn/linux-3.11.10/kernel/exit.c:870!
[  348.278764] invalid opcode: 0000 [#2]
[  348.278764] Modules linked in: nfnetlink_log nfnetlink xt_multiport
xt_hashlimit xt_tcpudp ipt_ULOG xt_LOG xt_conntrack iptable_raw
iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat
nf_conntrack iptable_mangle iptable_filter ip_tables x_tables snd_pcm
snd_page_alloc snd_timer snd parport_pc soundcore microcode psmouse
serio_raw pcspkr evdev parport ac battery button i2c_piix4 i2c_core
ext4 crc16 mbcache jbd2 sg sr_mod sd_mod cdrom crc_t10dif ata_generic
ata_piix mptspi scsi_transport_spi mptscsih libata mptbase pcnet32 mii
scsi_mod
[  348.278764] CPU: 0 PID: 2220 Comm: sshd Tainted: G      D
3.11-2-486 #1 Debian 3.11.10-1
[  348.278764] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
VirtualBox 12/01/2006
[  348.278764] task: cd2eec00 ti: cf930000 task.ti: cf930000
[  348.278764] EIP: 0060:[<c103348a>] EFLAGS: 00010282 CPU: 0
[  348.278764] EIP is at do_exit+0x44a/0x830
[  348.278764] EAX: 00000080 EBX: cf835400 ECX: 00000000 EDX: cd2eec00
[  348.278764] ESI: 00000001 EDI: 00000001 EBP: cf835c00 ESP: cf93190c
[  348.278764]  DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
[  348.278764] CR0: 80050033 CR2: b74faf38 CR3: 0d11a000 CR4: 00000690
[  348.278764] Stack:
[  348.278764]  0000000b cf931a04 00000010 c1393e1c cf835510 cf8353f8
cf835510 00000001
[  348.278764]  cf835558 cf931930 cf931930 00000046 0000000b cf931a04
00000010 c1399cf1
[  348.278764]  cf931a04 cf931a04 cf835400 c1446e22 c10029be 00000000
00000010 00000008
[  348.278764] Call Trace:
[  348.278764]  [<c1393e1c>] ? printk+0x37/0x3b
[  348.278764]  [<c1399cf1>] ? oops_end+0x81/0xc0
[  348.278764]  [<c10029be>] ? math_error+0x14e/0x2d0
[  348.278764]  [<c1056e15>] ? update_curr+0x95/0x140
[  348.278764]  [<c1056921>] ? sched_slice.isra.35+0x41/0x80
[  348.278764]  [<c1055a8a>] ? update_cpu_load_active+0x1a/0x80
[  348.278764]  [<c1056e15>] ? update_curr+0x95/0x140
[  348.278764]  [<c1002b40>] ? math_error+0x2d0/0x2d0
[  348.278764]  [<c1399585>] ? error_code+0x65/0x70
[  348.278764]  [<c10013e0>] ? __switch_to+0x190/0x300
[  348.278764]  [<c13978cf>] ? __schedule+0x1ef/0x510
[  348.278764]  [<c1056e15>] ? update_curr+0x95/0x140
[  348.278764]  [<c1006cc8>] ? sched_clock+0x8/0x10
[  348.278764]  [<c13973d5>] ? schedule_hrtimeout_range_clock+0x165/0x180
[  348.278764]  [<c1044e9f>] ? __flush_work+0xbf/0x100
[  348.278764]  [<d0a4fa59>] ? nf_nat_get_offset+0x39/0x60 [nf_nat]
[  348.278764]  [<d0a68df7>] ? tcp_packet+0x637/0xf40 [nf_conntrack]
[  348.278764]  [<c124932c>] ? tty_write_room+0xc/0x20
[  348.278764]  [<c1246fb9>] ? n_tty_poll+0x189/0x1a0
[  348.278764]  [<c13973ff>] ? schedule_hrtimeout_range+0xf/0x20
[  348.278764]  [<c11093a0>] ? poll_schedule_timeout+0x20/0x40
[  348.278764]  [<c1109c77>] ? do_select+0x537/0x5f0
[  348.278764]  [<c11094d0>] ? poll_select_copy_remaining+0x110/0x110
[  348.278764]  [<c11094d0>] ? poll_select_copy_remaining+0x110/0x110
[  348.278764]  [<c11094d0>] ? poll_select_copy_remaining+0x110/0x110
[  348.278764]  [<c11094d0>] ? poll_select_copy_remaining+0x110/0x110
[  348.278764]  [<c12f688d>] ? nf_iterate+0x7d/0x90
[  348.278764]  [<c1067e6c>] ? __getnstimeofday+0x2c/0x110
[  348.278764]  [<c133f7f2>] ? bictcp_cong_avoid+0x12/0x4a0
[  348.278764]  [<c1067f55>] ? getnstimeofday+0x5/0x20
[  348.278764]  [<c131116b>] ? tcp_ack+0x82b/0xdc0
[  348.278764]  [<c10353a0>] ? local_bh_enable+0x70/0x80
[  348.278764]  [<c1300301>] ? ip_finish_output+0x151/0x350
[  348.278764]  [<c10c612a>] ? put_compound_page+0xa/0xe0
[  348.278764]  [<c1311b07>] ? tcp_rcv_established+0xf7/0x7a0
[  348.278764]  [<c12c1edc>] ? sk_reset_timer+0xc/0x20
[  348.278764]  [<c131a94e>] ? tcp_v4_do_rcv+0x15e/0x3b0
[  348.278764]  [<c12c3558>] ? release_sock+0x88/0xf0
[  348.278764]  [<c13088d7>] ? tcp_sendmsg+0x177/0xc60
[  348.278764]  [<c1056e15>] ? update_curr+0x95/0x140
[  348.278764]  [<c1109e5c>] ? core_sys_select+0x12c/0x220
[  348.278764]  [<c12beee1>] ? sock_aio_write+0xe1/0x110
[  348.278764]  [<c10f9cda>] ? do_sync_write+0x6a/0xa0
[  348.278764]  [<c112b673>] ? fsnotify+0x203/0x2f0
[  348.278764]  [<c1109fdf>] ? SyS_select+0x8f/0xc0
[  348.278764]  [<c100aca2>] ? syscall_trace_leave+0xa2/0xb0
[  348.278764]  [<c1398fef>] ? syscall_call+0x7/0xb
[  348.278764] Code: 74 05 e8 9a 2d 09 00 8b 83 c4 03 00 00 85 c0 74
06 01 05 60 d8 4e c1 f3 90 81 4b 0c 00 80 00 00 c7 03 40 00 00 00 e8
66 47 36 00 <0f> 0b 8d 74 26 00 8b 46 10 85 c0 0f 85 67 02 00 00 89 ae
0c 01
[  348.278764] EIP: [<c103348a>] do_exit+0x44a/0x830 SS:ESP 0068:cf93190c
[  348.278776] ---[ end trace c3836805b501f816 ]---
[  348.285890] type=1106 audit(1388235169.398:64338): pid=2218 uid=0
auid=1000 ses=2
[  348.285890]  msg='op=PAM:session_close acct="test"
exe="/usr/sbin/sshd" hostname=10.255.255.1 addr=10.255.255.1
terminal=ssh res=success'
[  348.287096] type=1104 audit(1388235169.402:64339): pid=2218 uid=0
auid=1000 ses=2
[  348.287096]  msg='op=PAM:setcred acct="test" exe="/usr/sbin/sshd"
hostname=10.255.255.1 addr=10.255.255.1 terminal=ssh res=success'
[  348.766895] fpu exception: 0000 [#3]
[  348.770794] Modules linked in: nfnetlink_log nfnetlink xt_multiport
xt_hashlimit xt_tcpudp ipt_ULOG xt_LOG xt_conntrack iptable_raw
iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat
nf_conntrack iptable_mangle iptable_filter ip_tables x_tables snd_pcm
snd_page_alloc snd_timer snd parport_pc soundcore microcode psmouse
serio_raw pcspkr evdev parport ac battery button i2c_piix4 i2c_core
ext4 crc16 mbcache jbd2 sg sr_mod sd_mod cdrom crc_t10dif ata_generic
ata_piix mptspi scsi_transport_spi mptscsih libata mptbase pcnet32 mii
scsi_mod
[  348.770794] CPU: 0 PID: 0 Comm: swapper Tainted: G      D
3.11-2-486 #1 Debian 3.11.10-1
[  348.770794] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
VirtualBox 12/01/2006
[  348.770794] task: c14d84e0 ti: cdd84000 task.ti: c14cc000
[  348.770794] EIP: 0060:[<c10013e0>] EFLAGS: 00210002 CPU: 0
[  348.770794] EIP is at __switch_to+0x190/0x300
[  348.770794] EAX: cf5ec000 EBX: cf5ec000 ECX: 00000000 EDX: 00000000
[  348.770794] ESI: c14d84e0 EDI: 00000001 EBP: cf5ec1f8 ESP: cdd85ad8
[  348.770794]  DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
[  348.770794] CR0: 80050033 CR2: b7662000 CR3: 0cdb3000 CR4: 00000690
[  348.770794] Stack:
[  348.770794]  37df9a44 ccf3d040 ccf3dac0 c14d84e0 c13978cf cf5ec000
00200082 00000000
[  348.770794]  00000000 00000000 cdd84000 cf5ec000 00000000 ccf11ef0
c14e6e98 c11c4d70
[  348.770794]  65747300 cdd85b7c c14e6e8c c104d0ca 65747300 cdd85b7c
c14e6e8c 00200292
[  348.770794] Call Trace:
[  348.770794]  [<c13978cf>] ? __schedule+0x1ef/0x510
[  348.770794]  [<c11c4d70>] ? timerqueue_add+0x50/0xb0
[  348.770794]  [<c104d0ca>] ? enqueue_hrtimer+0x1a/0x60
[  348.770794]  [<c1397332>] ? schedule_hrtimeout_range_clock+0xc2/0x180
[  348.770794]  [<c104cdc0>] ? hrtimer_get_res+0x30/0x30
[  348.770794]  [<c139731d>] ? schedule_hrtimeout_range_clock+0xad/0x180
[  348.770794]  [<c13973ff>] ? schedule_hrtimeout_range+0xf/0x20
[  348.770794]  [<c11093a0>] ? poll_schedule_timeout+0x20/0x40
[  348.770794]  [<c110a671>] ? do_sys_poll+0x3f1/0x490
[  348.770794]  [<c12d33c8>] ? dev_queue_xmit+0x1f8/0x3b0
[  348.770794]  [<c10353a0>] ? local_bh_enable+0x70/0x80
[  348.770794]  [<c1300301>] ? ip_finish_output+0x151/0x350
[  348.770794]  [<c13005c8>] ? ip_local_out+0x18/0x20
[  348.770794]  [<c13017cb>] ? ip_send_skb+0xb/0x50
[  348.770794]  [<c132376b>] ? udp_send_skb+0x27b/0x340
[  348.770794]  [<c1323af8>] ? udp_sendmsg+0x268/0x820
[  348.770794]  [<c12ff070>] ? ip_copy_metadata+0x140/0x140
[  348.770794]  [<c11094d0>] ? poll_select_copy_remaining+0x110/0x110
[  348.770794]  [<c11094d0>] ? poll_select_copy_remaining+0x110/0x110
[  348.770794]  [<c11c59f8>] ? put_dec.part.1+0xb8/0x100
[  348.770794]  [<c11c5dcf>] ? number.isra.2+0x38f/0x3a0
[  348.770794]  [<c11c76d9>] ? vsnprintf+0x179/0x420
[  348.770794]  [<c10bbc60>] ? find_get_page+0x10/0x50
[  348.770794]  [<c10bc5af>] ? find_lock_page+0x1f/0x60
[  348.770794]  [<c10ce33d>] ? shmem_getpage_gfp+0x7d/0x680
[  348.770794]  [<c11c5448>] ? format_decode+0x308/0x370
[  348.770794]  [<c11c770b>] ? vsnprintf+0x1ab/0x420
[  348.770794]  [<c10cf09f>] ? shmem_fault+0x3f/0x90
[  348.770794]  [<c10d8059>] ? __do_fault+0x329/0x450
[  348.770794]  [<c1396c18>] ? mutex_lock+0x8/0x15
[  348.770794]  [<c1100f35>] ? pipe_read+0x205/0x470
[  348.770794]  [<c10f9c3a>] ? do_sync_read+0x6a/0xa0
[  348.770794]  [<c1068117>] ? ktime_get_ts+0x37/0xf0
[  348.770794]  [<c1109718>] ? poll_select_set_timeout+0x58/0x80
[  348.770794]  [<c110a7ad>] ? SyS_poll+0x4d/0xb0
[  348.770794]  [<c1398fef>] ? syscall_call+0x7/0xb
[  348.770794] Code: e9 1d ff ff ff 8d b6 00 00 00 00 b8 7d 00 00 00
e8 36 b8 00 00 84 c0 0f 85 e1 fe ff ff 0f 06 8d 74 26 00 e9 d6 fe ff
ff 8d 76 00 <0f> 77 db 83 4c 02 00 00 89 f6 8d b6 00 00 00 00 eb 66 b8
ff ff
[  348.770794] EIP: [<c10013e0>] __switch_to+0x190/0x300 SS:ESP
0068:cdd85ad8
[  348.770794] ---[ end trace c3836805b501f817 ]---
[  348.770794] Kernel panic - not syncing: Attempted to kill the idle
task!
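
A note on reading the "Code:" lines in the log above: the byte at EIP is
the one wrapped in angle brackets. By my reading of the x86 opcode map,
"<0f> 77" in the __switch_to dumps is EMMS, followed by "db 83 4c 02 00 00"
(fildl 0x24c(%ebx)), and "<0f> 0b" in the do_exit dump is UD2, which is what
BUG() compiles to. The sketch below only extracts the bytes starting at the
marker so they can be handed to a disassembler. The function name
faulting_bytes is a hypothetical helper, not part of any kernel tool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy up to `max` opcode bytes, starting at the byte marked "<xx>" in an
 * oops "Code:" line, into `out`. Returns the number of bytes copied, or 0
 * if the line carries no marker. Hypothetical helper for illustration.
 */
size_t faulting_bytes(const char *code_line, unsigned char *out, size_t max)
{
    const char *p;
    size_t n = 0;

    assert(code_line != NULL && out != NULL);

    p = strchr(code_line, '<');
    if (p == NULL)
        return 0;
    p++; /* step over '<' */

    while (n < max && *p != '\0') {
        char *end;
        unsigned long v = strtoul(p, &end, 16);

        if (end == p || v > 0xff)
            break; /* not a two-digit hex byte: stop parsing */
        out[n++] = (unsigned char)v;
        p = end;
        /* skip the closing '>' and whitespace between bytes */
        while (*p == '>' || *p == ' ' || *p == '\n')
            p++;
    }
    return n;
}
```

Feeding it the tail of the first dump, "ff 8d 76 00 <0f> 77 db 83 4c 02 00 00",
yields the bytes 0f 77 db 83 4c 02 00 00.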



- -- 
http://www.halfdog.net/
PGP: 156A AE98 B91F 0114 FE88  2BD8 C459 9386 feed a bee
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlK/Sl0ACgkQxFmThv7tq+6hcwCfSwoLsuqvl62oKVsbwUun2fi4
67sAn3UXxmyW8oEbMSuOu2KX7r/D4CMe
=YIVj
-----END PGP SIGNATURE-----


* Re: Sanitize CPU-state when switching from virtual-8086 mode to other task
  2013-12-28 22:02 Sanitize CPU-state when switching from virtual-8086 mode to other task halfdog
@ 2013-12-29  2:37 ` H. Peter Anvin
  2013-12-29 20:44   ` halfdog
  0 siblings, 1 reply; 24+ messages in thread
From: H. Peter Anvin @ 2013-12-29  2:37 UTC (permalink / raw)
  To: halfdog, Thomas Gleixner, Ingo Molnar; +Cc: x86, linux-kernel

On 12/28/2013 02:02 PM, halfdog wrote:
> It seems that missing CPU-state sanitation during task switching 
> triggers kernel-panic. This might be related to unhandled
> FPU-errors. See [1] for POC and serial console log of OOPs. Due to
> missing real 32-bit x86-hardware it is not clear, if this issue
> might be related to subtle differences in virtual-8086 mode
> handling when inside a virtualbox guest.
> 

This oops happens inside the guest?  Either way, I would be *very*
skeptical of Virtualbox in this case.

You can run a 32-bit kernel on 64-bit hardware, you know...

	-hpa


* Re: Sanitize CPU-state when switching from virtual-8086 mode to other task
  2013-12-29  2:37 ` H. Peter Anvin
@ 2013-12-29 20:44   ` halfdog
  2013-12-30  1:18     ` H. Peter Anvin
  0 siblings, 1 reply; 24+ messages in thread
From: halfdog @ 2013-12-29 20:44 UTC (permalink / raw)
  To: H. Peter Anvin, Thomas Gleixner, Ingo Molnar; +Cc: x86, linux-kernel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

H. Peter Anvin wrote:
> On 12/28/2013 02:02 PM, halfdog wrote:
>> It seems that missing CPU-state sanitation during task switching
>>  triggers kernel-panic. This might be related to unhandled 
>> FPU-errors. See [1] for POC and serial console log of OOPs. Due
>> to missing real 32-bit x86-hardware it is not clear, if this
>> issue might be related to subtle differences in virtual-8086
>> mode handling when inside a virtualbox guest.
>> 
> 
> This oops happens inside the guest?  Either way, I would be *very* 
> skeptical of Virtualbox in this case.
> 
> You can run a 32-bit kernel on 64-bit hardware, you know...

I know, but the hardware was occupied with a long-running simulation.

With the initial POC there might be a timing issue involved: with a
different process layout, the exception does not occur in __switch_to
but sometimes at other locations.

I created a new random-code test case [1], which works around that
problem. When I booted a Debian initrd and tried it, oopses fire like
wild, but at least the system does not lock up immediately.

hd

[1]
http://www.halfdog.net/Security/2013/Vm86SyscallTaskSwitchKernelPanic/Virtual86RandomCode.c

- -- 
http://www.halfdog.net/
PGP: 156A AE98 B91F 0114 FE88  2BD8 C459 9386 feed a bee
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlLAiZEACgkQxFmThv7tq+5dsgCeIqOicLB17PuV7C6AzfZIY9J9
I0UAnA7YftR+4Jz2d5jP6YbpmBBtNOAz
=9MJY
-----END PGP SIGNATURE-----


* Re: Sanitize CPU-state when switching from virtual-8086 mode to other task
  2013-12-29 20:44   ` halfdog
@ 2013-12-30  1:18     ` H. Peter Anvin
  2013-12-30 15:52       ` halfdog
  0 siblings, 1 reply; 24+ messages in thread
From: H. Peter Anvin @ 2013-12-30  1:18 UTC (permalink / raw)
  To: halfdog, Thomas Gleixner, Ingo Molnar; +Cc: x86, linux-kernel

On 12/29/2013 12:44 PM, halfdog wrote:
> H. Peter Anvin wrote:
>> On 12/28/2013 02:02 PM, halfdog wrote:
>>> It seems that missing CPU-state sanitation during task 
>>> switching triggers kernel-panic. This might be related to 
>>> unhandled FPU-errors. See [1] for POC and serial console log
>>> of OOPs. Due to missing real 32-bit x86-hardware it is not
>>> clear, if this issue might be related to subtle differences in 
>>> virtual-8086 mode handling when inside a virtualbox guest.
>>> 
> 
>> This oops happens inside the guest?  Either way, I would be 
>> *very* skeptical of Virtualbox in this case.
> 
>> You can run a 32-bit kernel on 64-bit hardware, you know...
> 
> I know, but hardware was occupied with long-running simulation.
> 
> With the initial POC, there might be a timing issue involved, with
>  different process layout, exception does not occur in __switch_to but
>  sometimes on other locations.
> 
> I created a new random-code testcase [1] , which works around that
>  problem. When booted a Debian initrd and tried id, OOPSes are 
> fired like wild but at least system does not lock up immediately.
> 

Still in VirtualBox?

	-hpa




* Re: Sanitize CPU-state when switching from virtual-8086 mode to other task
  2013-12-30  1:18     ` H. Peter Anvin
@ 2013-12-30 15:52       ` halfdog
  2013-12-31 18:42         ` H. Peter Anvin
  0 siblings, 1 reply; 24+ messages in thread
From: halfdog @ 2013-12-30 15:52 UTC (permalink / raw)
  To: H. Peter Anvin, Thomas Gleixner, Ingo Molnar; +Cc: x86, linux-kernel

H. Peter Anvin wrote:
> On 12/29/2013 12:44 PM, halfdog wrote:
>> H. Peter Anvin wrote:
>>> On 12/28/2013 02:02 PM, halfdog wrote:
>>>> It seems that missing CPU-state sanitation during task 
>>>> switching triggers kernel-panic. This might be related to 
>>>> unhandled FPU-errors. See [1] for POC and serial console log
>>>> of OOPs. Due to missing real 32-bit x86-hardware it is not
>>>> clear, if this issue might be related to subtle differences in 
>>>> virtual-8086 mode handling when inside a virtualbox guest.
>>>>
>>
>>> This oops happens inside the guest?  Either way, I would be 
>>> *very* skeptical of Virtualbox in this case.
>>
>>> You can run a 32-bit kernel on 64-bit hardware, you know...
>>
>> I know, but hardware was occupied with long-running simulation.
>>
>> With the initial POC, there might be a timing issue involved, with
>>  different process layout, exception does not occur in __switch_to but
>>  sometimes on other locations.
>>
>> I created a new random-code testcase [1] , which works around that
>>  problem. When booted a Debian initrd and tried id, OOPSes are 
>> fired like wild but at least system does not lock up immediately.
>>
> 
> Still in VirtualBox?

Yes, again. After comparing the results from the initrd on real hardware
with VBox, I am beginning to understand the timing problem involved and
why timing in VBox is different: the test program usually oopses when it
touches the FPU multiple times. Otherwise, when it terminates before the
second FPU interaction, it oopses on the next invocation, stumbling over
the invalid CPU state left by the prior invocation. With improved code I
can rather reliably bring the CPU into that state, so that the next
process that is invoked and touches FPU/MMX state oopses. I am currently
searching for SUID binaries and daemons running with UID=0 that might
react in interesting ways to that event, but so far only at DoS level:
e.g. after running the V2 test program once and then connecting via SSH,
this currently kills the ssh daemon nicely.

It seems that the machine lockup occurs when e.g. a switch to the idle
task happens at exactly the right moment, which I currently cannot
trigger on real hardware, but I am still working on that.

-- 
http://www.halfdog.net/
PGP: 156A AE98 B91F 0114 FE88  2BD8 C459 9386 feed a bee


* Re: Sanitize CPU-state when switching from virtual-8086 mode to other task
  2013-12-30 15:52       ` halfdog
@ 2013-12-31 18:42         ` H. Peter Anvin
  2013-12-31 19:21           ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 24+ messages in thread
From: H. Peter Anvin @ 2013-12-31 18:42 UTC (permalink / raw)
  To: halfdog, Thomas Gleixner, Ingo Molnar; +Cc: x86, linux-kernel

On 12/30/2013 07:52 AM, halfdog wrote:
>>
>> Still in VirtualBox?
> 
> Yes, again: after comparing the results from initrd on real hardware
> with Vbox, I'm getting to understand the timing problem involved and why
> timing in VBox is different: The test program usually OOPSes when
> touching FPU multiple times, otherwise, when terminated before second
> FPU-interaction, it OOPSes on next invocation, stumbling over invalid
> CPU state from prior invocation. With improved code, I can rather
> reliably bring CPU into that state, so that next process invoked and
> touching FPU/MMX-state is OOPSed. Currently searching SUID-binaries and
> running UID=0 daemons, that might show interesting reaction on that
> event, but only on DOS level yet, e.g. after running V2 test program
> once and then connecting via SSH, this currently kills the ssh daemon
> nicely.
> 
> It seems that machine lockup occurs when e.g. switch to idle task
> happens at exactly the right moment, which I currently cannot trigger on
> real hardware, but still working on that.
> 

I'm still wondering if this is a VirtualBox-specific problem or if it is
something that *could* occur on hardware, or in other virtualization
environments (KVM, Xen HVM, Hyper-V, VMware etc.).

	-hpa



* Re: Sanitize CPU-state when switching from virtual-8086 mode to other task
  2013-12-31 18:42         ` H. Peter Anvin
@ 2013-12-31 19:21           ` Konrad Rzeszutek Wilk
  2013-12-31 22:40             ` H. Peter Anvin
  0 siblings, 1 reply; 24+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-12-31 19:21 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: halfdog, Thomas Gleixner, Ingo Molnar, x86, linux-kernel

On Tue, Dec 31, 2013 at 10:42:47AM -0800, H. Peter Anvin wrote:
> On 12/30/2013 07:52 AM, halfdog wrote:
> >>
> >> Still in VirtualBox?
> > 
> > Yes, again: after comparing the results from initrd on real hardware
> > with Vbox, I'm getting to understand the timing problem involved and why
> > timing in VBox is different: The test program usually OOPSes when
> > touching FPU multiple times, otherwise, when terminated before second
> > FPU-interaction, it OOPSes on next invocation, stumbling over invalid
> > CPU state from prior invocation. With improved code, I can rather
> > reliably bring CPU into that state, so that next process invoked and
> > touching FPU/MMX-state is OOPSed. Currently searching SUID-binaries and
> > running UID=0 daemons, that might show interesting reaction on that
> > event, but only on DOS level yet, e.g. after running V2 test program
> > once and then connecting via SSH, this currently kills the ssh daemon
> > nicely.
> > 
> > It seems that machine lockup occurs when e.g. switch to idle task
> > happens at exactly the right moment, which I currently cannot trigger on
> > real hardware, but still working on that.
> > 
> 
> I'm still wondering if this is a VirtualBox-specific problem or if it is
> something that *could* occur on hardware, or in other virtualization
> environments (KVM, Xen HVM, Hyper-V, VMware etc.)

So, I am wondering if this is related to "x86/fpu: CR0.TS should be set
before trap into PV guest's #NM exception handle", which has a similar
pattern: you do enough task switches and the FPU state is screwed up.

See http://mid.gmane.org/1383720072-6242-1-git-send-email-gaoyang.zyh@taobao.com

(I thought there was a thread about this on LKML too but I can't
find it).
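
For readers following the CR0.TS reference: with lazy FPU switching, an
x87 instruction raises #NM (device not available) when CR0.TS or CR0.EM
is set. The sketch below decodes a CR0 value such as the "CR0: 80050033"
printed in the oops dumps earlier in this thread. The bit positions are
architectural. The macro and function names are my own, chosen for
illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural CR0 bit positions; the macro names are illustrative. */
#define CR0_PE (UINT32_C(1) << 0)   /* protected mode enable */
#define CR0_MP (UINT32_C(1) << 1)   /* monitor coprocessor */
#define CR0_EM (UINT32_C(1) << 2)   /* x87 emulation */
#define CR0_TS (UINT32_C(1) << 3)   /* task switched */
#define CR0_ET (UINT32_C(1) << 4)   /* extension type */
#define CR0_NE (UINT32_C(1) << 5)   /* native x87 error reporting */
#define CR0_WP (UINT32_C(1) << 16)  /* write protect */
#define CR0_AM (UINT32_C(1) << 18)  /* alignment mask */
#define CR0_PG (UINT32_C(1) << 31)  /* paging */

/* Nonzero if an x87 instruction would raise #NM with this CR0 value. */
int x87_would_trap_nm(uint32_t cr0)
{
    return (cr0 & (CR0_TS | CR0_EM)) != 0;
}

/* Nonzero if x87 errors are reported natively as #MF (CR0.NE set). */
int x87_native_errors(uint32_t cr0)
{
    assert((cr0 & CR0_PE) != 0); /* sketch assumes a protected-mode value */
    return (cr0 & CR0_NE) != 0;
}
```

For 0x80050033 this gives PG, AM, WP, NE, ET, MP and PE set, with TS and
EM clear, so x87_would_trap_nm() returns 0 for the values in the dumps.
That only describes CR0 at the time the registers were printed, not what
a hypervisor may have done before delivering the trap.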
> 
> 	-hpa
> 


* Re: Sanitize CPU-state when switching from virtual-8086 mode to other task
  2013-12-31 19:21           ` Konrad Rzeszutek Wilk
@ 2013-12-31 22:40             ` H. Peter Anvin
  2014-01-03 23:07               ` Sanitize FPU-state when switching tasks (was sanitize CPU-state when switching from virtual-8086 mode to other task) halfdog
  2014-01-08  7:45               ` Sanitize CPU-state " halfdog
  0 siblings, 2 replies; 24+ messages in thread
From: H. Peter Anvin @ 2013-12-31 22:40 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: halfdog, Thomas Gleixner, Ingo Molnar, x86, linux-kernel

On 12/31/2013 11:21 AM, Konrad Rzeszutek Wilk wrote:
> 
> So, I am wondering if this is related to " x86/fpu: CR0.TS should be set before trap
> into PV guest's #NM exception handle" which does have a similar pattern - you
> do enough of the task switches and the FPU is screwed.
> 
> See http://mid.gmane.org/1383720072-6242-1-git-send-email-gaoyang.zyh@taobao.com
> 
> (I thought there was a thread about this on LKML too but I can't
> find it).

That would be a bug in Xen, so I guess you're surmising a similar bug in
VirtualBox?

	-hpa




* Re: Sanitize FPU-state when switching tasks (was sanitize CPU-state when switching from virtual-8086 mode to other task)
  2013-12-31 22:40             ` H. Peter Anvin
@ 2014-01-03 23:07               ` halfdog
  2014-01-08  7:45               ` Sanitize CPU-state " halfdog
  1 sibling, 0 replies; 24+ messages in thread
From: halfdog @ 2014-01-03 23:07 UTC (permalink / raw)
  To: H. Peter Anvin, Konrad Rzeszutek Wilk
  Cc: Thomas Gleixner, Ingo Molnar, x86, linux-kernel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

H. Peter Anvin wrote:
> On 12/31/2013 11:21 AM, Konrad Rzeszutek Wilk wrote:
>> 
>> So, I am wondering if this is related to " x86/fpu: CR0.TS should
>> be set before trap into PV guest's #NM exception handle" which
>> does have a similar pattern - you do enough of the task switches
>> and the FPU is screwed.
>> 
>> See
>> http://mid.gmane.org/1383720072-6242-1-git-send-email-gaoyang.zyh@taobao.com
>>
>> (I thought there was a thread about this on LKML too but I can't
>> find it).
> 
> That would be a bug in Xen, so I guess you're surmising a similar
> bug in VirtualBox?

Not sure about that yet, but the whole thing gets funnier the longer I
play with it. Here is some more information from my latest tests:

* Although first observed with virtual-8086 mode, the bug is not
specific to virtual-8086 mode. It can also be triggered with normal x86
userspace code, with even better reproducibility.

* It seems that when a process changes the FPU control word with "fstcw"
just before it exits, another process can suffer during __switch_to.

* By having two rogue processes write data to each other via a
socket, the time and code position of the oops can be influenced.

* With mmap_min_addr deactivated, the NULL dereferences during
task switch are exploitable, but I did not get full ring-0 code
execution yet. Putting EIP to the NULL-seg seems to have failed,
perhaps because of a wrong RPL? I hope to fix that in the next days.
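
As an illustration of the socket point above, here is a minimal sketch
of two processes bouncing a byte back and forth. It is not the test code
from [1] and contains no FPU manipulation. It only shows why such a pair
gives some control over when task switches happen: each blocking read()
sleeps until the peer writes, so on a uniprocessor kernel (like the
3.11-2-486 one in the log) every round trip makes the scheduler switch
between the two tasks. The function name ping_pong is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Bounce one byte between parent and child `rounds` times over a socket
 * pair. Returns the number of completed round trips, or -1 if the socket
 * pair or the child process could not be created.
 */
int ping_pong(int rounds)
{
    int sv[2];
    char byte = 'x';
    int done = 0;
    pid_t pid;

    assert(rounds >= 0);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {
        /* child: echo every byte until the parent closes its end */
        close(sv[0]);
        while (read(sv[1], &byte, 1) == 1)
            if (write(sv[1], &byte, 1) != 1)
                break;
        _exit(0);
    }

    /* parent: send one byte, then block until the echo arrives */
    close(sv[1]);
    while (done < rounds &&
           write(sv[0], &byte, 1) == 1 &&
           read(sv[0], &byte, 1) == 1)
        done++;

    close(sv[0]);           /* child sees EOF and exits */
    waitpid(pid, NULL, 0);
    return done;
}
```

Calling ping_pong(1000) performs 1000 round trips. On an SMP kernel the
two tasks may run on different CPUs, so the coupling to task switches is
weaker there.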

You can find the new improved test code at [1].

hd

[1]
http://www.halfdog.net/Security/2013/Vm86SyscallTaskSwitchKernelPanic/FpuStateTaskSwitchOops.c


- -- 
http://www.halfdog.net/
PGP: 156A AE98 B91F 0114 FE88  2BD8 C459 9386 feed a bee
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlLHQrwACgkQxFmThv7tq+4C+wCfZ0a0LhaJqI7DW78ZFGbnzIyu
6H8AnROrUklhvdbAGV5+7/ELEzPikU7T
=jKjH
-----END PGP SIGNATURE-----


* Re: Sanitize CPU-state when switching tasks (was sanitize CPU-state when switching from virtual-8086 mode to other task)
  2013-12-31 22:40             ` H. Peter Anvin
  2014-01-03 23:07               ` Sanitize FPU-state when switching tasks (was sanitize CPU-state when switching from virtual-8086 mode to other task) halfdog
@ 2014-01-08  7:45               ` halfdog
  2014-01-08 17:42                 ` H. Peter Anvin
  1 sibling, 1 reply; 24+ messages in thread
From: halfdog @ 2014-01-08  7:45 UTC (permalink / raw)
  To: H. Peter Anvin, Konrad Rzeszutek Wilk
  Cc: Thomas Gleixner, Ingo Molnar, x86, linux-kernel, Ben Hutchings

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Update to the issue:

* Although first observed with virtual-8086 mode, the bug is not
specific to virtual-8086 mode. It can also be triggered with normal x86
userspace code, with even better reproducibility.

* Ben Hutchings looked at the Debian bug report [1]. He failed to
reproduce it on his hardware, so it might be specific to some CPU models
(currently my AMD E-350 is the only machine known to be affected).

* With mmap_min_addr deactivated, the NULL dereference during
task switch is exploitable. This works both on native hardware and
within VirtualBox. See [2] for a POC to gain root privileges.

* It seems that when a process changes the FPU control word with "fstcw"
just before it exits, another process can suffer during __switch_to.
This is probably related to the xsave instruction and an x86 processor
bug workaround, see the "noxsave" switch [3]: [BUGS=X86]
Disables x86 extended register state save and restore using xsave. The
kernel will fallback to enabling legacy floating-point and sse state.
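
For reference, a minimal sketch of how userspace reads and writes the
x87 control word. On x87, fstcw/fnstcw store the control word to memory,
while fldcw loads a new one, so the sketch shows both. It only sets the
six exception-mask bits, which is harmless, and is not the POC from [2].
The function names are my own, the inline assembly assumes GCC or Clang,
and the non-x86 branch is a stand-in so the file still builds elsewhere.

```c
#include <assert.h>
#include <stdint.h>

#define X87_CW_EXC_MASKS 0x003Fu /* IM, DM, ZM, OM, UM, PM mask bits */

#if defined(__i386__) || defined(__x86_64__)
/* fnstcw stores the current control word to memory (reads FPU state). */
uint16_t x87_read_cw(void)
{
    uint16_t cw;
    __asm__ volatile ("fnstcw %0" : "=m" (cw));
    return cw;
}

/* fldcw loads a control word from memory (changes FPU state). */
void x87_write_cw(uint16_t cw)
{
    __asm__ volatile ("fldcw %0" : : "m" (cw));
}
#else
/* Assumption: no x87 here, so emulate the register with a variable. */
static uint16_t emulated_cw = 0x037F; /* value after FNINIT on x87 */
uint16_t x87_read_cw(void) { return emulated_cw; }
void x87_write_cw(uint16_t cw) { emulated_cw = cw; }
#endif

/* Mask all six x87 exceptions; returns the previous control word. */
uint16_t x87_mask_all_exceptions(void)
{
    uint16_t old = x87_read_cw();

    x87_write_cw((uint16_t)(old | X87_CW_EXC_MASKS));
    assert((x87_read_cw() & X87_CW_EXC_MASKS) == X87_CW_EXC_MASKS);
    return old;
}
```

A caller would save the value returned by x87_mask_all_exceptions() and
pass it to x87_write_cw() afterwards to restore the original state. The
control word is per-task state that the kernel is expected to save and
restore across task switches.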

hd

[1] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=733551
[2]
http://www.halfdog.net/Security/2013/Vm86SyscallTaskSwitchKernelPanic/
[3] https://www.kernel.org/doc/Documentation/kernel-parameters.txt

- -- 
http://www.halfdog.net/
PGP: 156A AE98 B91F 0114 FE88  2BD8 C459 9386 feed a bee
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlLNAjEACgkQxFmThv7tq+44FACfeDHQHK71+7tZawm9Ftjw7Hvp
j04AmwY04UwG9clERS3e1HisM2swbo1i
=KoQL
-----END PGP SIGNATURE-----


* Re: Sanitize CPU-state when switching tasks (was sanitize CPU-state when switching from virtual-8086 mode to other task)
  2014-01-08  7:45               ` Sanitize CPU-state " halfdog
@ 2014-01-08 17:42                 ` H. Peter Anvin
  2014-01-08 19:36                   ` Borislav Petkov
  0 siblings, 1 reply; 24+ messages in thread
From: H. Peter Anvin @ 2014-01-08 17:42 UTC (permalink / raw)
  To: halfdog, Konrad Rzeszutek Wilk
  Cc: Thomas Gleixner, Ingo Molnar, x86, linux-kernel, Ben Hutchings,
	Borislav Petkov

Adding Borislav.

Boris, do you happen to know of any erratum on AMD E-350 which may be
in play here?

	-hpa


On 01/07/2014 11:45 PM, halfdog wrote:
> Update to the issue:
> 
> * Although first observed with virtual-8086 mode, the bug is not
> specific to virtual-8086 mode. It can also be triggered with normal
> x86 userspace code, and with better reproducibility.
>
> * Ben Hutchings looked at the Debian bug report [1] but failed to
> reproduce the problem on his hardware, so it might be specific to
> some CPU models (currently my AMD E-350 is the only machine known to
> be affected).
>
> * When mmap_min_addr is deactivated, the NULL dereference during
> task switch is exploitable. This works both on native hardware and
> within VirtualBox. See [2] for a POC to gain root privileges.
>
> * It seems that when the FPU control word is changed with "fstcw"
> just before the process exits, another process can suffer when doing
> __switch_to. This is probably related to the xsave instruction and
> an x86 processor bug workaround, see the "noxsave" switch [3]:
> [BUGS=X86] Disables x86 extended register state save and restore
> using xsave. The kernel will fallback to enabling legacy
> floating-point and sse state.
> 
> hd
> 
> [1] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=733551
> [2] http://www.halfdog.net/Security/2013/Vm86SyscallTaskSwitchKernelPanic/
> [3] https://www.kernel.org/doc/Documentation/kernel-parameters.txt



* Re: Sanitize CPU-state when switching tasks (was sanitize CPU-state when switching from virtual-8086 mode to other task)
  2014-01-08 17:42                 ` H. Peter Anvin
@ 2014-01-08 19:36                   ` Borislav Petkov
  2014-01-08 21:28                     ` halfdog
  0 siblings, 1 reply; 24+ messages in thread
From: Borislav Petkov @ 2014-01-08 19:36 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: halfdog, Konrad Rzeszutek Wilk, Thomas Gleixner, Ingo Molnar,
	x86, linux-kernel, Ben Hutchings

On Wed, Jan 08, 2014 at 09:42:40AM -0800, H. Peter Anvin wrote:
> Adding Borislav.
> 
> Boris, do you happen to know of any erratum on AMD E-350 which may be
> in play here?

Interesting. Well, nothing looks even remotely related from looking at the F14h
rev guide here:

http://developer.amd.com/wordpress/media/2012/10/47534_14h_Mod_00h-0Fh_Rev_Guide.pdf

Btw, hd (if that is your real name :-)), can you post /proc/cpuinfo? I
think I might have an E-350 here too and I could try to reproduce. Btw,
how exactly do you trigger it?

You run FpuStateTaskSwitchShmemXattrHandlersOverwriteWithNullPage.c
first to modify shmem_xattr_handlers and then
ManipulatedXattrHandlerForPrivEscalation.c? You need a 32-bit kernel and
userspace, right? Anything else?

Thanks.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


* Re: Sanitize CPU-state when switching tasks (was sanitize CPU-state when switching from virtual-8086 mode to other task)
  2014-01-08 19:36                   ` Borislav Petkov
@ 2014-01-08 21:28                     ` halfdog
  2014-01-08 22:39                       ` H. Peter Anvin
  2014-01-09 22:50                       ` Sanitize CPU-state when switching tasks (was sanitize CPU-state when switching from virtual-8086 mode to other task) halfdog
  0 siblings, 2 replies; 24+ messages in thread
From: halfdog @ 2014-01-08 21:28 UTC (permalink / raw)
  To: Borislav Petkov, H. Peter Anvin
  Cc: Konrad Rzeszutek Wilk, Thomas Gleixner, Ingo Molnar, x86,
	linux-kernel, Ben Hutchings

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Borislav Petkov wrote:
> On Wed, Jan 08, 2014 at 09:42:40AM -0800, H. Peter Anvin wrote:
>> Adding Borislav.
>> 
>> Boris, do you happen to know of any erratum on AMD E-350 which
>> may be in play here?
> 
> Interesting. Well, nothing looks even remotely related from looking
> at the F14h rev guide here:
> 
> http://developer.amd.com/wordpress/media/2012/10/47534_14h_Mod_00h-0Fh_Rev_Guide.pdf
>
> Btw, hd (if that is your real name :-)), can you post
> /proc/cpuinfo?

Of course (you can also find it in the Debian bug report [1]):

processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 20
model		: 1
model name	: AMD E-350 Processor
stepping	: 0
microcode	: 0x5000028
cpu MHz		: 1596.563
cache size	: 512 KB
fdiv_bug	: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 6
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt
pdpe1gb rdtscp lm constant_tsc nonstop_tsc extd_apicid aperfmperf pni
monitor ssse3 cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy
abm sse4a misalignsse 3dnowprefetch ibs skinit wdt arat hw_pstate npt
lbrv svm_lock nrip_save pausefilter
bogomips	: 3193.12
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate

> I think I might have an E-350 here too and I could try to reproduce.
> Btw, how exactly do you trigger it?
> 
> You run
> FpuStateTaskSwitchShmemXattrHandlersOverwriteWithNullPage.c first
> to modify shmem_xattr_handlers and then 
> ManipulatedXattrHandlerForPrivEscalation.c? You need a 32-bit
> kernel and userspace, right? Anything else?

Yes: I used the standard Debian Sid 486 kernel (32-bit). The first tool
might just trigger the OOPS too early, which seems to be harmless to
the kernel, so one can invoke it repeatedly until the handler pointer
has been modified. Since I hardcoded the Debian kernel addresses
(copied from System.map), this is very unlikely to give you root on
another kernel, but the math OOPS should be reproducible.


Does this sound fishy (from [2])?

"There is no need to save any active fpu state to the task structure
memory if the task is dead. Just drop the state instead."

My rogue process might interfere with that: change the control
registers, cause an exception and then exit quickly.


Or could it be invalid CPU-features detection, perhaps related to [3]?

The math-restore/__switch_to combination already occurred in older bug
reports, e.g. [4] (very close), [5] (similar, poor info). OOPS "EIP
is at math_state_restore" seems to be a suitable search expression.


[1] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=733551
[2] http://lkml.indiana.edu/hypermail/linux/kernel/1205.1/02182.html
[3] http://lkml.indiana.edu/hypermail/linux/kernel/0905.2/02599.html
[4] https://lkml.org/lkml/2008/6/16/146
[5] http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1536

- -- 
http://www.halfdog.net/
PGP: 156A AE98 B91F 0114 FE88  2BD8 C459 9386 feed a bee
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlLNww0ACgkQxFmThv7tq+4LngCeI/ZVFtzEy9RDpVP9Jk46tzGs
9h8Ani/YO9FsUOpcKxiXovJkTPiKuI4e
=InkM
-----END PGP SIGNATURE-----


* Re: Sanitize CPU-state when switching tasks (was sanitize CPU-state when switching from virtual-8086 mode to other task)
  2014-01-08 21:28                     ` halfdog
@ 2014-01-08 22:39                       ` H. Peter Anvin
  2014-01-09 22:58                         ` Borislav Petkov
  2014-01-09 22:50                       ` Sanitize CPU-state when switching tasks (was sanitize CPU-state when switching from virtual-8086 mode to other task) halfdog
  1 sibling, 1 reply; 24+ messages in thread
From: H. Peter Anvin @ 2014-01-08 22:39 UTC (permalink / raw)
  To: halfdog, Borislav Petkov
  Cc: Konrad Rzeszutek Wilk, Thomas Gleixner, Ingo Molnar, x86,
	linux-kernel, Ben Hutchings

It is obviously critical here that we get a handle on if this is a
CPU-specific problem that we might have to work around or a general
problem with the Linux code.

	-hpa


* Re: Sanitize CPU-state when switching tasks (was sanitize CPU-state when switching from virtual-8086 mode to other task)
  2014-01-08 21:28                     ` halfdog
  2014-01-08 22:39                       ` H. Peter Anvin
@ 2014-01-09 22:50                       ` halfdog
  2014-01-09 23:02                         ` Borislav Petkov
  1 sibling, 1 reply; 24+ messages in thread
From: halfdog @ 2014-01-09 22:50 UTC (permalink / raw)
  To: Borislav Petkov, H. Peter Anvin
  Cc: Konrad Rzeszutek Wilk, Thomas Gleixner, Ingo Molnar, x86,
	linux-kernel, Ben Hutchings

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

It took me some time to build a Debian Sid testing environment for
amd64 of the same quality as the one I have for i386, but now it is
ready. It seems that amd64 is also affected, but the lockup is
immediate (which makes exploitation harder).

Here is the OOPS from the serial console, again in __switch_to:

[  498.783577] fpu exception: 0000 [#1] SMP
[  498.787054] Modules linked in: xt_multiport xt_hashlimit xt_tcpudp
ipt_ULOG xt_LOG xt_conntrack iptable_raw iptable_nat nf_conntrack_ipv4
nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle
iptable_filter ip_tables x_tables fuse snd_pcm snd_page_alloc
snd_timer snd soundcore i2c_piix4 psmouse pcspkr evdev serio_raw
i2c_core parport_pc parport battery button ac ext4 crc16 mbcache jbd2
sd_mod crc_t10dif crct10dif_common sg sr_mod cdrom ata_generic
virtio_net mptspi scsi_transport_spi ata_piix virtio_pci virtio_ring
virtio mptscsih mptbase libata scsi_mod
[  498.787205] CPU: 0 PID: 1783 Comm: Test Not tainted 3.12-1-amd64 #1
Debian 3.12.6-2
[  498.787205] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
VirtualBox 12/01/2006
[  498.787205] task: ffff88000cb18840 ti: ffff88000b454000 task.ti:
ffff88000b454000
[  498.787205] RIP: 0010:[<ffffffff81011730>]  [<ffffffff81011730>]
__switch_to+0x2d0/0x490
[  498.787205] RSP: 0018:ffff88000e0c78b8  EFLAGS: 00010002
[  498.787205] RAX: 0000000000000001 RBX: ffff88000e0b77c0 RCX:
00000000c0000100
[  498.787205] RDX: 0000000000000000 RSI: 0000000051e3f800 RDI:
00000000c0000100
[  498.787205] RBP: ffff88000cb18840 R08: 0000000000000000 R09:
0000000000003314
[  498.787205] R10: 0000000000001746 R11: 000000000000000f R12:
0000000000000000
[  498.787205] R13: 0000000000000000 R14: ffff88000fc11780 R15:
0000000000000000
[  498.787205] FS:  00007fb651e3f800(0000) GS:ffff88000fc00000(0000)
knlGS:0000000000000000
[  498.787205] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  498.787205] CR2: 00007f72ddfcc990 CR3: 000000000e22d000 CR4:
00000000000006f0
[  498.787205] Stack:
[  498.787205]  ffff88000e0b7bc0 000000010fc14330 ffff88000b4efac0
ffff88000e0b77c0
[  498.787205]  ffff88000fc142c0 ffff88000b5d3b40 0000000000000000
ffff88000e0b77c0
[  498.787205]  ffffffff8148febe ffff88000e0b77c0 0000000000000086
00000000000142c0
[  498.787205] Call Trace:
[  498.787205] Code: ff 66 2e 0f 1f 84 00 00 00 00 00 bf 7d 00 00 00
e8 e6 00 01 00 84 c0 0f 85 d7 fd ff ff 0f 06 66 66 90 66 90 e9 cb fd
ff ff 66 90 <0f> 77 db 83 94 04 00 00 66 90 eb 74 b8 ff ff ff ff 48 8b
bb 98
[  498.787205] RIP  [<ffffffff81011730>] __switch_to+0x2d0/0x490
[  498.787205]  RSP <ffff88000e0c78b8>
[  498.787205] ---[ end trace 3f873c38e16c8005 ]---
[  498.787205] Fixing recursive fault but reboot is needed!

I'll try to follow the same line as before: understand it, write a
local root exploit for it (I have a feeling that this might be really
hard on that kernel) and test it on the bare hardware afterwards.

- -- 
http://www.halfdog.net/
PGP: 156A AE98 B91F 0114 FE88  2BD8 C459 9386 feed a bee
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlLPJ7MACgkQxFmThv7tq+6CSACeK7/SBzJOVvLlVBas9NANZYFp
pEUAn21LoX0ewsnOag7fomqtvKqUzGyL
=pst0
-----END PGP SIGNATURE-----


* Re: Sanitize CPU-state when switching tasks (was sanitize CPU-state when switching from virtual-8086 mode to other task)
  2014-01-08 22:39                       ` H. Peter Anvin
@ 2014-01-09 22:58                         ` Borislav Petkov
  2014-01-10  0:42                           ` Linus Torvalds
  0 siblings, 1 reply; 24+ messages in thread
From: Borislav Petkov @ 2014-01-09 22:58 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: halfdog, Konrad Rzeszutek Wilk, Thomas Gleixner, Ingo Molnar,
	x86, linux-kernel, Ben Hutchings

On Wed, Jan 08, 2014 at 02:39:42PM -0800, H. Peter Anvin wrote:
> It is obviously critical here that we get a handle on if this is a
> CPU-specific problem that we might have to work around or a general
> problem with the Linux code.

Ok, I was able to reproduce with

http://www.halfdog.net/Security/2013/Vm86SyscallTaskSwitchKernelPanic/FpuStateTaskSwitchShmemXattrHandlersOverwriteWithNullPage.c

here on the latest linus+tip, see OOPS below:

$ AFLAGS=--32 decodecode < ~/fpu.oops

...

Code is:

All code
========
   0:   89 d8                   mov    %ebx,%eax
   2:   e8 8c 96 00 00          call   0x9693
   7:   85 c0                   test   %eax,%eax
   9:   0f 85 9c 00 00 00       jne    0xab
   f:   fa                      cli
  10:   e8 7e bd 08 00          call   0x8bd93
  15:   e9 6f 00 00 00          jmp    0x89
  1a:   c7 83 a0 02 00 00 01    movl   $0x1,0x2a0(%ebx)
  21:   00 00 00
  24:   64 89 1d ac a7 8a c1    mov    %ebx,%fs:0xc18aa7ac
  2b:*  0f 77                   emms            <-- trapping instruction
  2d:   db 83 a0 02 00 00       fildl  0x2a0(%ebx)
  33:   89 f6                   mov    %esi,%esi
  35:   89 f6                   mov    %esi,%esi
  37:   eb 27                   jmp    0x60
  39:   b8 ff ff ff ff          mov    $0xffffffff,%eax
  3e:   8b                      .byte 0x8b
  3f:   bb                      .byte 0xbb

Code starting with the faulting instruction
===========================================
   0:   0f 77                   emms
   2:   db 83 a0 02 00 00       fildl  0x2a0(%ebx)
   8:   89 f6                   mov    %esi,%esi
   a:   89 f6                   mov    %esi,%esi
   c:   eb 27                   jmp    0x35
   e:   b8 ff ff ff ff          mov    $0xffffffff,%eax
  13:   8b                      .byte 0x8b
  14:   bb                      .byte 0xbb

which points at EMMS, which gets runtime-replaced in:

static inline int restore_fpu_checking(struct task_struct *tsk)
{
	/* AMD K7/K8 CPUs don't save/restore FDP/FIP/FOP unless an exception
	   is pending.  Clear the x87 state here by setting it to fixed
	   values. "m" is a random variable that should be in L1 */
	alternative_input(
		ASM_NOP8 ASM_NOP2,
		"emms\n\t"		/* clear stack tags */
		"fildl %P[addr]",	/* set F?P to defined value */
		X86_FEATURE_FXSAVE_LEAK,
		[addr] "m" (tsk->thread.fpu.has_fpu));

	return fpu_restore_checking(&tsk->thread.fpu);
}

Now, judging by the exception type and if I'm not mistaken, we get an
#MF which, according to the EMMS documentation, is raised when an
unmasked x87 floating-point exception is pending.

Now, we most likely have done an FXSAVE before that, and FXSAVE doesn't
check for pending unmasked x87 floating-point exceptions. We also have
CR0.NE=1b, which means that in that case a numeric exception (#MF)
gets generated.

Yadda, yadda, all is fine but why do we have any pending x87 FPU
exceptions then where we shouldn't? I dunno - I've never warmed up to
the FPU diddling in the kernel. I'll try to wrap my head around this
later...
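The pending-exception reasoning above can be sketched as a small
userspace model. This is only an illustration under simplifying
assumptions, not kernel code: struct x87_model and all x87_* helpers
are hypothetical names, and the model tracks nothing but the six
exception flags, their mask bits and the ES/B summary bits. It shows
that a "waiting" instruction such as EMMS would trap while an unmasked
exception is pending, and that clearing the flags first (as FNCLEX
does) avoids the trap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define X87_EXC_FLAGS	0x003fu	/* IE, DE, ZE, OE, UE, PE: bits 0-5 */
#define X87_SW_SF	0x0040u	/* stack fault */
#define X87_SW_ES	0x0080u	/* error summary: unmasked exception pending */
#define X87_SW_B	0x8000u	/* busy, mirrors ES */
#define X87_CW_DEFAULT	0x037fu	/* FNINIT value: all six exceptions masked */

/* Hypothetical, heavily simplified model of the x87 control/status words. */
struct x87_model {
	uint16_t cw;	/* bits 0-5 set = exception masked */
	uint16_t sw;	/* bits 0-5 set = exception recorded */
};

/* ES/B are set exactly when a recorded exception is not masked. */
static void x87_update_summary(struct x87_model *f)
{
	bool unmasked = (f->sw & X87_EXC_FLAGS & ~(unsigned int)f->cw) != 0;

	if (unmasked)
		f->sw |= X87_SW_ES | X87_SW_B;
	else
		f->sw &= (uint16_t)~(X87_SW_ES | X87_SW_B);
}

void x87_fninit(struct x87_model *f)
{
	f->cw = X87_CW_DEFAULT;
	f->sw = 0;
}

/* Models FLDCW: unmasking an already recorded flag makes it pending. */
void x87_fldcw(struct x87_model *f, uint16_t cw)
{
	f->cw = cw;
	x87_update_summary(f);
}

/* Models an x87 operation that records one or more exception flags. */
void x87_raise(struct x87_model *f, uint16_t flags)
{
	assert((flags & ~X87_EXC_FLAGS) == 0);
	f->sw |= flags;
	x87_update_summary(f);
}

bool x87_exception_pending(const struct x87_model *f)
{
	return (f->sw & X87_SW_ES) != 0;
}

/* Models FNCLEX: clears the exception flags, SF, ES and B without trapping. */
void x87_fnclex(struct x87_model *f)
{
	f->sw &= (uint16_t)~(X87_EXC_FLAGS | X87_SW_SF | X87_SW_ES | X87_SW_B);
}

/*
 * Returns true if the EMMS in the workaround would raise #MF.
 * clear_first selects whether the flags are cleared beforehand.
 */
bool x87_workaround_traps(struct x87_model *f, bool clear_first)
{
	if (clear_first)
		x87_fnclex(f);
	return x87_exception_pending(f);
}
```

In this model, recording a divide-by-zero flag (0x0004) and then
loading a control word that unmasks it (0x037b) leaves an exception
pending, so x87_workaround_traps(&f, false) reports a trap while
x87_workaround_traps(&f, true) does not.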

---
localhost vmunix: [  364.177380] fpu exception: 0000 [#1] PREEMPT SMP
localhost vmunix: [  364.185005] Modules linked in: ipv6 radeon snd_hda_codec_conexant rtl8192ce rtl_pci snd_hda_codec_hdmi rtlwifi snd_hda_intel mac80211 snd_hda_codec snd_hwdep usbhid snd_pcm cfg80211 rtsx_pci_sdmmc snd_page_alloc drm_kms_helper mmc_core snd_timer rtsx_pci rtl8192c_common ttm thinkpad_acpi nvram snd ohci_pci ehci_pci ohci_hcd mfd_core button ac video battery ehci_hcd pcspkr thermal k10temp soundcore
localhost vmunix: [  364.209743] CPU: 1 PID: 1200 Comm: find Tainted: G        W    3.13.0-rc7+ #4
localhost vmunix: [  364.217947] Hardware name: LENOVO 30515QG/30515QG, BIOS 8RET30WW (1.12 ) 09/15/2011
localhost vmunix: [  364.226231] task: f4104980 ti: f3800000 task.ti: f3800000
localhost vmunix: [  364.234511] EIP: 0060:[<c10026b8>] EFLAGS: 00010002 CPU: 1
localhost vmunix: [  364.242734] EIP is at math_state_restore+0x48/0x1a0
localhost vmunix: [  364.250939] EAX: f3801fb4 EBX: f4104980 ECX: 0000007b EDX: ffffffff
localhost vmunix: [  364.259188] ESI: 089c58f8 EDI: c1003510 EBP: f3801fa0 ESP: f3801f98
localhost vmunix: [  364.267364]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
localhost vmunix: [  364.275542] CR0: 80050033 CR2: b75d6f32 CR3: 331f9000 CR4: 000007d0
localhost vmunix: [  364.283758] Stack:
localhost vmunix: [  364.291834]  f3801fb4 c1003510 f3801fac c1003535 089c39f8 bf8b8e88 c155c96b 089c39f8
localhost vmunix: [  364.300213]  00000000 00000029 089c58f8 00000000 bf8b8e88 080681fc 0000007b 0000007b
localhost vmunix: [  364.308631]  00000000 c1003510 ffffffff 0805e0f8 00000073 00010206 bf8b8e30 0000007b
localhost vmunix: [  364.316985] Call Trace:
localhost vmunix: [  364.325104]  [<c1003510>] ? smp_thermal_interrupt+0x20/0x20
localhost vmunix: [  364.333300]  [<c1003535>] do_device_not_available+0x25/0x40
localhost vmunix: [  364.341458]  [<c155c96b>] error_code+0x5f/0x64
localhost vmunix: [  364.349541]  [<c1003510>] ? smp_thermal_interrupt+0x20/0x20
localhost vmunix: [  364.357649] Code: 89 d8 e8 8c 96 00 00 85 c0 0f 85 9c 00 00 00 fa e8 7e bd 08 00 e9 6f 00 00 00 c7 83 a0 02 00 00 01 00 00 00 64 89 1d ac a7 8a c1 <0f> 77 db 83 a0 02 00 00 89 f6 89 f6 eb 27 b8 ff ff ff ff 8b bb
localhost vmunix: [  364.375367] EIP: [<c10026b8>] math_state_restore+0x48/0x1a0 SS:ESP 0068:f3801f98
localhost vmunix: [  364.383830] ---[ end trace c8241b0abe86a792 ]---

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


* Re: Sanitize CPU-state when switching tasks (was sanitize CPU-state when switching from virtual-8086 mode to other task)
  2014-01-09 22:50                       ` Sanitize CPU-state when switching tasks (was sanitize CPU-state when switching from virtual-8086 mode to other task) halfdog
@ 2014-01-09 23:02                         ` Borislav Petkov
  0 siblings, 0 replies; 24+ messages in thread
From: Borislav Petkov @ 2014-01-09 23:02 UTC (permalink / raw)
  To: halfdog
  Cc: H. Peter Anvin, Konrad Rzeszutek Wilk, Thomas Gleixner,
	Ingo Molnar, x86, linux-kernel, Ben Hutchings

On Thu, Jan 09, 2014 at 10:50:28PM +0000, halfdog wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> It took me some time to build a Debian Sid testing environment for
> amd64 of the same quality as the one I have for i386, but now it is
> ready. It seems that amd64 is also affected, but the lockup is
> immediate (which makes exploitation harder).
> 
> Here is the OOPS from the serial console, again in __switch_to:
> 
> [  498.783577] fpu exception: 0000 [#1] SMP
> [  498.787054] Modules linked in: xt_multiport xt_hashlimit xt_tcpudp
> ipt_ULOG xt_LOG xt_conntrack iptable_raw iptable_nat nf_conntrack_ipv4
> nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle
> iptable_filter ip_tables x_tables fuse snd_pcm snd_page_alloc
> snd_timer snd soundcore i2c_piix4 psmouse pcspkr evdev serio_raw
> i2c_core parport_pc parport battery button ac ext4 crc16 mbcache jbd2
> sd_mod crc_t10dif crct10dif_common sg sr_mod cdrom ata_generic
> virtio_net mptspi scsi_transport_spi ata_piix virtio_pci virtio_ring
> virtio mptscsih mptbase libata scsi_mod
> [  498.787205] CPU: 0 PID: 1783 Comm: Test Not tainted 3.12-1-amd64 #1
> Debian 3.12.6-2
> [  498.787205] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
> VirtualBox 12/01/2006
> [  498.787205] task: ffff88000cb18840 ti: ffff88000b454000 task.ti:
> ffff88000b454000
> [  498.787205] RIP: 0010:[<ffffffff81011730>]  [<ffffffff81011730>]
> __switch_to+0x2d0/0x490
> [  498.787205] RSP: 0018:ffff88000e0c78b8  EFLAGS: 00010002
> [  498.787205] RAX: 0000000000000001 RBX: ffff88000e0b77c0 RCX:
> 00000000c0000100
> [  498.787205] RDX: 0000000000000000 RSI: 0000000051e3f800 RDI:
> 00000000c0000100
> [  498.787205] RBP: ffff88000cb18840 R08: 0000000000000000 R09:
> 0000000000003314
> [  498.787205] R10: 0000000000001746 R11: 000000000000000f R12:
> 0000000000000000
> [  498.787205] R13: 0000000000000000 R14: ffff88000fc11780 R15:
> 0000000000000000
> [  498.787205] FS:  00007fb651e3f800(0000) GS:ffff88000fc00000(0000)
> knlGS:0000000000000000
> [  498.787205] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  498.787205] CR2: 00007f72ddfcc990 CR3: 000000000e22d000 CR4:
> 00000000000006f0
> [  498.787205] Stack:
> [  498.787205]  ffff88000e0b7bc0 000000010fc14330 ffff88000b4efac0
> ffff88000e0b77c0
> [  498.787205]  ffff88000fc142c0 ffff88000b5d3b40 0000000000000000
> ffff88000e0b77c0
> [  498.787205]  ffffffff8148febe ffff88000e0b77c0 0000000000000086
> 00000000000142c0
> [  498.787205] Call Trace:
> [  498.787205] Code: ff 66 2e 0f 1f 84 00 00 00 00 00 bf 7d 00 00 00
> e8 e6 00 01 00 84 c0 0f 85 d7 fd ff ff 0f 06 66 66 90 66 90 e9 cb fd
> ff ff 66 90 <0f> 77 db 83 94 04 00 00 66 90 eb 74 b8 ff ff ff ff 48 8b

Yep, EMMS again: 0f 77 - unhandled x87 FPU exception, see my other mail
I just sent.

I'll try this on another AMD machine tomorrow to see whether it is
affected too.
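For reference, spotting the trapping opcode in a "Code:" line can be
sketched in a few lines of C. This is a hypothetical helper, not what
scripts/decodecode does (that goes through a real disassembler): it
only looks for the byte marked as <xx> and compares the first two
bytes against the EMMS encoding 0f 77. The function names are made up
for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy up to max opcode bytes from the hex dump of an oops "Code:" line
 * into out, starting at the byte the kernel marked as <xx>.
 * Returns the number of bytes copied, or 0 if there is no marker.
 */
size_t oops_trapping_bytes(const char *code, unsigned char *out, size_t max)
{
	const char *p;
	size_t n = 0;

	assert(code != NULL && out != NULL);

	p = strchr(code, '<');
	if (p == NULL)
		return 0;
	p++;	/* step over '<' */

	while (n < max) {
		char *end;
		unsigned long byte = strtoul(p, &end, 16);

		if (end == p || byte > 0xff)
			break;	/* no further hex byte */
		out[n++] = (unsigned char)byte;
		p = end;
		if (*p == '>')
			p++;	/* step over the closing '>' */
	}
	return n;
}

/* EMMS is encoded as the two bytes 0f 77. */
bool oops_is_emms(const unsigned char *bytes, size_t n)
{
	return n >= 2 && bytes[0] == 0x0f && bytes[1] == 0x77;
}
```

Fed the tail of the dump quoted above ("... 66 90 <0f> 77 db 83 ..."),
the helper yields the bytes 0f 77 db 83, and oops_is_emms() returns
true for them.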

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


* Re: Sanitize CPU-state when switching tasks (was sanitize CPU-state when switching from virtual-8086 mode to other task)
  2014-01-09 22:58                         ` Borislav Petkov
@ 2014-01-10  0:42                           ` Linus Torvalds
  2014-01-10  2:13                             ` H. Peter Anvin
  2014-01-12  3:22                             ` [tip:x86/urgent] x86, fpu, amd: Clear exceptions in AMD FXSAVE workaround tip-bot for Linus Torvalds
  0 siblings, 2 replies; 24+ messages in thread
From: Linus Torvalds @ 2014-01-10  0:42 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: H. Peter Anvin, halfdog, Konrad Rzeszutek Wilk, Thomas Gleixner,
	Ingo Molnar, the arch/x86 maintainers, Linux Kernel Mailing List,
	Ben Hutchings

[-- Attachment #1: Type: text/plain, Size: 451 bytes --]

On Fri, Jan 10, 2014 at 6:58 AM, Borislav Petkov <bp@alien8.de> wrote:
>
> Ok, I was able to reproduce

Looking at this, I think this is just a bug in our
restore_fpu_checking() hackery for X86_FEATURE_FXSAVE_LEAK..

Which also explains why it only triggers on E-350 - it's only relevant
for those K7/K8 CPUs that use this.

Maybe just add an fnclex before the emms? Something like this
(TOTALLY UNTESTED!!) attached patch.

                 Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/plain, Size: 1070 bytes --]

 arch/x86/include/asm/fpu-internal.h | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index c49a613c6452..cea1c76d49bf 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -293,12 +293,13 @@ static inline int restore_fpu_checking(struct task_struct *tsk)
 	/* AMD K7/K8 CPUs don't save/restore FDP/FIP/FOP unless an exception
 	   is pending.  Clear the x87 state here by setting it to fixed
 	   values. "m" is a random variable that should be in L1 */
-	alternative_input(
-		ASM_NOP8 ASM_NOP2,
-		"emms\n\t"		/* clear stack tags */
-		"fildl %P[addr]",	/* set F?P to defined value */
-		X86_FEATURE_FXSAVE_LEAK,
-		[addr] "m" (tsk->thread.fpu.has_fpu));
+	if (unlikely(static_cpu_has(X86_FEATURE_FXSAVE_LEAK))) {
+		asm volatile(
+			"fnclex\n\t"
+			"emms\n\t"
+			"fildl %P[addr]"	/* set F?P to defined value */
+			: : [addr] "m" (tsk->thread.fpu.has_fpu));
+	}
 
 	return fpu_restore_checking(&tsk->thread.fpu);
 }


* Re: Sanitize CPU-state when switching tasks (was sanitize CPU-state when switching from virtual-8086 mode to other task)
  2014-01-10  0:42                           ` Linus Torvalds
@ 2014-01-10  2:13                             ` H. Peter Anvin
  2014-01-10 10:06                               ` Borislav Petkov
  2014-01-12  3:22                             ` [tip:x86/urgent] x86, fpu, amd: Clear exceptions in AMD FXSAVE workaround tip-bot for Linus Torvalds
  1 sibling, 1 reply; 24+ messages in thread
From: H. Peter Anvin @ 2014-01-10  2:13 UTC (permalink / raw)
  To: Linus Torvalds, Borislav Petkov
  Cc: halfdog, Konrad Rzeszutek Wilk, Thomas Gleixner, Ingo Molnar,
	the arch/x86 maintainers, Linux Kernel Mailing List,
	Ben Hutchings

On 01/09/2014 04:42 PM, Linus Torvalds wrote:
> On Fri, Jan 10, 2014 at 6:58 AM, Borislav Petkov <bp@alien8.de> wrote:
>>
>> Ok, I was able to reproduce
> 
> Looking at this, I think this is just a bug in our
> restore_fpu_checking() hackery for X86_FEATURE_FXSAVE_LEAK..
> 
> Which also explains why it only triggers on E-350 - it's only relevant
> for those K7/K8 CPUs that use this.
> 
> Maybe just add an fnclex before the emms? Something like this
> (TOTALLY UNTESTED!!) attached patch.
> 

OK, that sounds very reasonable.  Boris, halfdog, does something like
this resolve your problem?

	-hpa




* Re: Sanitize CPU-state when switching tasks (was sanitize CPU-state when switching from virtual-8086 mode to other task)
  2014-01-10  2:13                             ` H. Peter Anvin
@ 2014-01-10 10:06                               ` Borislav Petkov
  2014-01-10 11:16                                 ` Linus Torvalds
  0 siblings, 1 reply; 24+ messages in thread
From: Borislav Petkov @ 2014-01-10 10:06 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Linus Torvalds, halfdog, Konrad Rzeszutek Wilk, Thomas Gleixner,
	Ingo Molnar, the arch/x86 maintainers, Linux Kernel Mailing List,
	Ben Hutchings

On Thu, Jan 09, 2014 at 06:13:19PM -0800, H. Peter Anvin wrote:
> OK, that sounds very reasonable. Boris, halfdog, does something like
> this resolve your problem?

Yeah, if in doubt, Linus to the rescue! :)

Tested-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


* Re: Sanitize CPU-state when switching tasks (was sanitize CPU-state when switching from virtual-8086 mode to other task)
  2014-01-10 10:06                               ` Borislav Petkov
@ 2014-01-10 11:16                                 ` Linus Torvalds
  2014-01-10 11:34                                   ` Borislav Petkov
  2014-01-10 16:11                                   ` H. Peter Anvin
  0 siblings, 2 replies; 24+ messages in thread
From: Linus Torvalds @ 2014-01-10 11:16 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: H. Peter Anvin, halfdog, Konrad Rzeszutek Wilk, Thomas Gleixner,
	Ingo Molnar, the arch/x86 maintainers, Linux Kernel Mailing List,
	Ben Hutchings

On Fri, Jan 10, 2014 at 6:06 PM, Borislav Petkov <bp@alien8.de> wrote:
>
> Tested-by: Borislav Petkov <bp@suse.de>

Ok, good.

Peter, do you want to take it (feel free to add my sign-off), or
should I just commit it?

Also, is there a way to have a "likely not true" version of that
"static_cpu_has()"? There seems to be no way to make the non-K7/K8
case the fallthrough code.. Not that this is likely that
performance-critical, but..

                Linus


* Re: Sanitize CPU-state when switching tasks (was sanitize CPU-state when switching from virtual-8086 mode to other task)
  2014-01-10 11:16                                 ` Linus Torvalds
@ 2014-01-10 11:34                                   ` Borislav Petkov
  2014-01-10 16:11                                   ` H. Peter Anvin
  1 sibling, 0 replies; 24+ messages in thread
From: Borislav Petkov @ 2014-01-10 11:34 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: H. Peter Anvin, halfdog, Konrad Rzeszutek Wilk, Thomas Gleixner,
	Ingo Molnar, the arch/x86 maintainers, Linux Kernel Mailing List,
	Ben Hutchings

On Fri, Jan 10, 2014 at 07:16:24PM +0800, Linus Torvalds wrote:
> Also, is there a way to have a "likely not true" version of that
> "static_cpu_has()"? There seems to be no way to make the non-K7/K8
> case

FWIW, this is not only K7/K8 but actually all AMD from family 6 onwards,
which is - practically speaking - all AMD.
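A rough sketch of that family check, for illustration only. It assumes
the vendor has already been identified as AMD and that the feature is
keyed on "family >= 6" as described above; the function names are
hypothetical and this is not the kernel's actual detection code. The
E-350 signature used in the example is reconstructed from the
family/model/stepping values in the /proc/cpuinfo output posted
earlier in the thread, since the raw CPUID value was not posted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Effective CPU family from the EAX value returned by CPUID leaf 1. */
unsigned int cpuid_family(uint32_t eax)
{
	unsigned int base = (eax >> 8) & 0xf;
	unsigned int ext = (eax >> 20) & 0xff;

	/* The extended family field only counts when the base field is 0xf. */
	return base == 0xf ? base + ext : base;
}

/* Assumption: vendor is AMD; the workaround applies from family 6 onwards. */
bool amd_has_fxsave_leak(uint32_t eax)
{
	return cpuid_family(eax) >= 6;
}

/* Worked example for the E-350: family 20 (0x14), model 1, stepping 0. */
void e350_example(void)
{
	/* ext family 0x05, base family 0xf, base model 1, stepping 0 */
	uint32_t e350 = 0x00500f10u;

	assert(cpuid_family(e350) == 20);
	assert(amd_has_fxsave_leak(e350));
}
```

So a family 14h part like the E-350 takes the workaround path even
though the code comment only mentions K7/K8.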

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


* Re: Sanitize CPU-state when switching tasks (was sanitize CPU-state when switching from virtual-8086 mode to other task)
  2014-01-10 11:16                                 ` Linus Torvalds
  2014-01-10 11:34                                   ` Borislav Petkov
@ 2014-01-10 16:11                                   ` H. Peter Anvin
  1 sibling, 0 replies; 24+ messages in thread
From: H. Peter Anvin @ 2014-01-10 16:11 UTC (permalink / raw)
  To: Linus Torvalds, Borislav Petkov
  Cc: halfdog, Konrad Rzeszutek Wilk, Thomas Gleixner, Ingo Molnar,
	the arch/x86 maintainers, Linux Kernel Mailing List,
	Ben Hutchings

On 01/10/2014 03:16 AM, Linus Torvalds wrote:
> On Fri, Jan 10, 2014 at 6:06 PM, Borislav Petkov <bp@alien8.de> wrote:
>>
>> Tested-by: Borislav Petkov <bp@suse.de>
> 
> Ok, good.
> 
> Peter, do you want to take it (feel free to add my sign-off), or
> should I just commit it?
> 
> Also, is there a way to have a "likely not true" version of that
> "static_cpu_has()"? There seems to be no way to make the non-K7/K8
> case the fallthrough code.. Not that this is likely that
> performance-critical, but..
> 

I'll take it.

We don't have a "likely not true" version of static_cpu_has() at this
point... it would mean we couldn't do short jumps unfortunately (and
they would still take the false path until alternatives run.)

	-hpa




* [tip:x86/urgent] x86, fpu, amd: Clear exceptions in AMD FXSAVE workaround
  2014-01-10  0:42                           ` Linus Torvalds
  2014-01-10  2:13                             ` H. Peter Anvin
@ 2014-01-12  3:22                             ` tip-bot for Linus Torvalds
  1 sibling, 0 replies; 24+ messages in thread
From: tip-bot for Linus Torvalds @ 2014-01-12  3:22 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, torvalds, tglx, bp, me

Commit-ID:  26bef1318adc1b3a530ecc807ef99346db2aa8b0
Gitweb:     http://git.kernel.org/tip/26bef1318adc1b3a530ecc807ef99346db2aa8b0
Author:     Linus Torvalds <torvalds@linux-foundation.org>
AuthorDate: Sat, 11 Jan 2014 19:15:52 -0800
Committer:  H. Peter Anvin <hpa@zytor.com>
CommitDate: Sat, 11 Jan 2014 19:15:52 -0800

x86, fpu, amd: Clear exceptions in AMD FXSAVE workaround

Before we do an EMMS in the AMD FXSAVE information leak workaround we
need to clear any pending exceptions, otherwise we trap with a
floating-point exception inside this code.

Reported-by: halfdog <me@halfdog.net>
Tested-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/CA%2B55aFxQnY_PCG_n4=0w-VG=YLXL-yr7oMxyy0WU2gCBAf3ydg@mail.gmail.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
---
 arch/x86/include/asm/fpu-internal.h | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index c49a613..cea1c76 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -293,12 +293,13 @@ static inline int restore_fpu_checking(struct task_struct *tsk)
 	/* AMD K7/K8 CPUs don't save/restore FDP/FIP/FOP unless an exception
 	   is pending.  Clear the x87 state here by setting it to fixed
 	   values. "m" is a random variable that should be in L1 */
-	alternative_input(
-		ASM_NOP8 ASM_NOP2,
-		"emms\n\t"		/* clear stack tags */
-		"fildl %P[addr]",	/* set F?P to defined value */
-		X86_FEATURE_FXSAVE_LEAK,
-		[addr] "m" (tsk->thread.fpu.has_fpu));
+	if (unlikely(static_cpu_has(X86_FEATURE_FXSAVE_LEAK))) {
+		asm volatile(
+			"fnclex\n\t"
+			"emms\n\t"
+			"fildl %P[addr]"	/* set F?P to defined value */
+			: : [addr] "m" (tsk->thread.fpu.has_fpu));
+	}
 
 	return fpu_restore_checking(&tsk->thread.fpu);
 }
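
The effect of the added fnclex can be observed from user space. The
sketch below is only an illustration, not kernel code: it assumes GCC or
Clang on x86 with the default x87 control word (all exceptions masked),
and the helper names are hypothetical. It sets a pending zero-divide flag
in the x87 status word and then shows that fnclex clears it. It does not
reproduce the trap from the report, which needs an unmasked pending
exception.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: read the x87 status word without checking for
   pending exceptions (fnstsw is the non-waiting form). */
static uint16_t x87_status(void)
{
	uint16_t sw;

	__asm__ volatile("fnstsw %0" : "=a" (sw));
	return sw;
}

/* Hypothetical helper: compute 1.0 / 0.0 on the x87 unit.  With the
   zero-divide exception masked this does not trap; it only sets the
   ZE flag (bit 2) in the status word.  The value is pushed and popped
   again, so the register stack depth is unchanged. */
static void raise_zero_divide_flag(void)
{
	volatile double one = 1.0, zero = 0.0, result;

	__asm__ volatile(
		"fldl %1\n\t"
		"fdivl %2\n\t"
		"fstpl %0"
		: "=m" (result)
		: "m" (one), "m" (zero));
}

/* Clear all x87 exception flags, as the patch does before emms. */
static void clear_x87_exceptions(void)
{
	__asm__ volatile("fnclex");
}

/* Returns 1 if the flag was pending before fnclex and clear afterwards. */
static int fnclex_clears_pending_flag(void)
{
	int was_pending;

	raise_zero_divide_flag();
	was_pending = (x87_status() & 0x04) != 0;
	clear_x87_exceptions();
	assert((x87_status() & 0x3f) == 0);
	return was_pending;
}
```

Calling fnclex_clears_pending_flag() should return 1 under the stated
assumptions. In the kernel path the same ordering matters because the
restored task state is arbitrary: without fnclex, an unmasked pending
exception is still live when the workaround's x87 instructions run.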


Thread overview: 24+ messages
2013-12-28 22:02 Sanitize CPU-state when switching from virtual-8086 mode to other task halfdog
2013-12-29  2:37 ` H. Peter Anvin
2013-12-29 20:44   ` halfdog
2013-12-30  1:18     ` H. Peter Anvin
2013-12-30 15:52       ` halfdog
2013-12-31 18:42         ` H. Peter Anvin
2013-12-31 19:21           ` Konrad Rzeszutek Wilk
2013-12-31 22:40             ` H. Peter Anvin
2014-01-03 23:07               ` Sanitize FPU-state when switching tasks (was sanitize CPU-state when switching from virtual-8086 mode to other task) halfdog
2014-01-08  7:45               ` Sanitize CPU-state " halfdog
2014-01-08 17:42                 ` H. Peter Anvin
2014-01-08 19:36                   ` Borislav Petkov
2014-01-08 21:28                     ` halfdog
2014-01-08 22:39                       ` H. Peter Anvin
2014-01-09 22:58                         ` Borislav Petkov
2014-01-10  0:42                           ` Linus Torvalds
2014-01-10  2:13                             ` H. Peter Anvin
2014-01-10 10:06                               ` Borislav Petkov
2014-01-10 11:16                                 ` Linus Torvalds
2014-01-10 11:34                                   ` Borislav Petkov
2014-01-10 16:11                                   ` H. Peter Anvin
2014-01-12  3:22                             ` [tip:x86/urgent] x86, fpu, amd: Clear exceptions in AMD FXSAVE workaround tip-bot for Linus Torvalds
2014-01-09 22:50                       ` Sanitize CPU-state when switching tasks (was sanitize CPU-state when switching from virtual-8086 mode to other task) halfdog
2014-01-09 23:02                         ` Borislav Petkov
