* PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
@ 2015-03-15  8:17 Stefan Seyfried
  2015-03-18 14:16 ` Takashi Iwai
  0 siblings, 1 reply; 77+ messages in thread
From: Stefan Seyfried @ 2015-03-15  8:17 UTC (permalink / raw)
  To: LKML

Hi all,

In 4.0-rc I have recently seen a few crashes, always when running
KVM guests (IIRC). Today I was able to capture a crash dump; this
is the backtrace from dmesg.txt:

[242060.604870] PANIC: double fault, error_code: 0x0
[242060.604878] CPU: 1 PID: 2132 Comm: qemu-system-x86 Tainted: G        W       4.0.0-rc3-2.gd5c547f-desktop #1
[242060.604880] Hardware name: LENOVO 74665EG/74665EG, BIOS 6DET71WW (3.21 ) 12/13/2011
[242060.604883] task: ffff880103f46150 ti: ffff8801013d4000 task.ti: ffff8801013d4000
[242060.604885] RIP: 0010:[<ffffffff816834ad>]  [<ffffffff816834ad>] page_fault+0xd/0x30
[242060.604893] RSP: 0018:00007fffa55eafb8  EFLAGS: 00010016
[242060.604895] RAX: 000000000000aa40 RBX: 0000000000000001 RCX: ffffffff81682237
[242060.604896] RDX: 000000000000aa40 RSI: 0000000000000000 RDI: 00007fffa55eb078
[242060.604898] RBP: 00007fffa55f1c1c R08: 0000000000000008 R09: 0000000000000000
[242060.604900] R10: 0000000000000000 R11: 0000000000000293 R12: 000000000000004a
[242060.604902] R13: 00007ffa356b5d60 R14: 000000000000000f R15: 00007ffa3556cf20
[242060.604904] FS:  00007ffa33dbfa80(0000) GS:ffff88023bc80000(0000) knlGS:0000000000000000
[242060.604906] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[242060.604908] CR2: 00007fffa55eafa8 CR3: 0000000002d7e000 CR4: 00000000000427e0
[242060.604909] Stack:
[242060.604942] BUG: unable to handle kernel paging request at 00007fffa55eafb8
[242060.604995] IP: [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190
[242060.605036] PGD 4779a067 PUD 40e3e067 PMD 4769e067 PTE 0
[242060.605078] Oops: 0000 [#1] PREEMPT SMP 
[242060.605106] Modules linked in: vhost_net vhost macvtap macvlan nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache nls_iso8859_1 nls_cp437 vfat fat ppp_deflate bsd_comp ppp_async crc_ccitt ppp_generic slhc ses enclosure uas usb_storage cmac algif_hash ctr ccm rfcomm fuse xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT xt_tcpudp tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet bnep dm_crypt ecb cbc algif_skcipher af_alg xfs libcrc32c snd_hda_codec_conexant snd_hda_codec_generic iTCO_wdt iTCO_vendor_support snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm_oss snd_pcm
[242060.605396]  dm_mod snd_seq snd_seq_device snd_timer coretemp kvm_intel kvm snd_mixer_oss cdc_ether cdc_wdm cdc_acm usbnet mii arc4 uvcvideo videobuf2_vmalloc videobuf2_memops thinkpad_acpi videobuf2_core btusb v4l2_common videodev i2c_i801 iwldvm bluetooth serio_raw mac80211 pcspkr e1000e iwlwifi snd lpc_ich mei_me ptp mfd_core pps_core mei cfg80211 shpchp wmi soundcore rfkill battery ac tpm_tis tpm acpi_cpufreq i915 xhci_pci xhci_hcd i2c_algo_bit drm_kms_helper drm thermal video button processor sg loop
[242060.605396] CPU: 1 PID: 2132 Comm: qemu-system-x86 Tainted: G        W       4.0.0-rc3-2.gd5c547f-desktop #1
[242060.605396] Hardware name: LENOVO 74665EG/74665EG, BIOS 6DET71WW (3.21 ) 12/13/2011
[242060.605396] task: ffff880103f46150 ti: ffff8801013d4000 task.ti: ffff8801013d4000
[242060.605396] RIP: 0010:[<ffffffff81005b44>]  [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190
[242060.605396] RSP: 0018:ffff88023bc84e88  EFLAGS: 00010046
[242060.605396] RAX: 00007fffa55eafc0 RBX: 00007fffa55eafb8 RCX: ffff88023bc7ffc0
[242060.605396] RDX: 0000000000000000 RSI: ffff88023bc84f58 RDI: 0000000000000000
[242060.605396] RBP: ffff88023bc83fc0 R08: ffffffff81a2fe15 R09: 0000000000000020
[242060.605396] R10: 0000000000000afb R11: ffff88023bc84bee R12: ffff88023bc84f58
[242060.605396] R13: 0000000000000000 R14: ffffffff81a2fe15 R15: 0000000000000000
[242060.605396] FS:  00007ffa33dbfa80(0000) GS:ffff88023bc80000(0000) knlGS:0000000000000000
[242060.605396] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[242060.605396] CR2: 00007fffa55eafb8 CR3: 0000000002d7e000 CR4: 00000000000427e0
[242060.605396] Stack:
[242060.605396]  0000000002d7e000 0000000000000008 ffff88023bc84ee8 00007fffa55eafb8
[242060.605396]  0000000000000000 ffff88023bc84f58 00007fffa55eafb8 0000000000000040
[242060.605396]  00007ffa356b5d60 000000000000000f 00007ffa3556cf20 ffffffff81005c36
[242060.605396] Call Trace:
[242060.605396]  [<ffffffff81005c36>] show_regs+0x86/0x210
[242060.605396]  [<ffffffff8104636f>] df_debug+0x1f/0x30
[242060.605396]  [<ffffffff810041a4>] do_double_fault+0x84/0x100
[242060.605396]  [<ffffffff81683088>] double_fault+0x28/0x30
[242060.605396]  [<ffffffff816834ad>] page_fault+0xd/0x30
[242060.605396] Code: fe a2 81 31 c0 89 54 24 08 48 89 0c 24 48 8b 5b f8 e8 cc 06 67 00 48 8b 0c 24 8b 54 24 08 85 d2 74 05 f6 c2 03 74 48 48 8d 43 08 <48> 8b 33 48 c7 c7 0d fe a2 81 89 54 24 14 48 89 4c 24 08 48 89 
[242060.605396] RIP  [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190
[242060.605396]  RSP <ffff88023bc84e88>
[242060.605396] CR2: 00007fffa55eafb8

I would not totally rule out a hardware problem, since this machine had
another weird crash where the BIOS beeper stayed on constantly until I
held the power button for 5 seconds.

Unfortunately, I cannot load the crashdump with the crash version in
openSUSE Tumbleweed, so the backtrace is all I have for now.

Any hints?

Best regards,

	Stefan
-- 
Stefan Seyfried
Linux Consultant & Developer -- GPG Key: 0x731B665B

B1 Systems GmbH
Osterfeldstraße 7 / 85088 Vohburg / http://www.b1-systems.de
GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt,HRB 3537

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-15  8:17 PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related? Stefan Seyfried
@ 2015-03-18 14:16 ` Takashi Iwai
  2015-03-18 15:05   ` Takashi Iwai
  2015-03-18 17:43   ` Takashi Iwai
  0 siblings, 2 replies; 77+ messages in thread
From: Takashi Iwai @ 2015-03-18 14:16 UTC (permalink / raw)
  To: Stefan Seyfried; +Cc: LKML

At Sun, 15 Mar 2015 09:17:15 +0100,
Stefan Seyfried wrote:
> 
> Hi all,
> 
> in 4.0-rc I have recently seen a few crashes, always when running
> KVM guests (IIRC). Today I was able to capture a crash dump, this
> is the backtrace from dmesg.txt:
> 
> [242060.604870] PANIC: double fault, error_code: 0x0
> [242060.604878] CPU: 1 PID: 2132 Comm: qemu-system-x86 Tainted: G        W       4.0.0-rc3-2.gd5c547f-desktop #1
> [242060.604880] Hardware name: LENOVO 74665EG/74665EG, BIOS 6DET71WW (3.21 ) 12/13/2011
> [242060.604883] task: ffff880103f46150 ti: ffff8801013d4000 task.ti: ffff8801013d4000
> [242060.604885] RIP: 0010:[<ffffffff816834ad>]  [<ffffffff816834ad>] page_fault+0xd/0x30
> [242060.604893] RSP: 0018:00007fffa55eafb8  EFLAGS: 00010016
> [242060.604895] RAX: 000000000000aa40 RBX: 0000000000000001 RCX: ffffffff81682237
> [242060.604896] RDX: 000000000000aa40 RSI: 0000000000000000 RDI: 00007fffa55eb078
> [242060.604898] RBP: 00007fffa55f1c1c R08: 0000000000000008 R09: 0000000000000000
> [242060.604900] R10: 0000000000000000 R11: 0000000000000293 R12: 000000000000004a
> [242060.604902] R13: 00007ffa356b5d60 R14: 000000000000000f R15: 00007ffa3556cf20
> [242060.604904] FS:  00007ffa33dbfa80(0000) GS:ffff88023bc80000(0000) knlGS:0000000000000000
> [242060.604906] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [242060.604908] CR2: 00007fffa55eafa8 CR3: 0000000002d7e000 CR4: 00000000000427e0
> [242060.604909] Stack:
> [242060.604942] BUG: unable to handle kernel paging request at 00007fffa55eafb8
> [242060.604995] IP: [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190
> [242060.605036] PGD 4779a067 PUD 40e3e067 PMD 4769e067 PTE 0
> [242060.605078] Oops: 0000 [#1] PREEMPT SMP 
> [242060.605106] Modules linked in: vhost_net vhost macvtap macvlan nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache nls_iso8859_1 nls_cp437 vfat fat ppp_deflate bsd_comp ppp_async crc_ccitt ppp_generic slhc ses enclosure uas usb_storage cmac algif_hash ctr ccm rfcomm fuse xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT xt_tcpudp tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet bnep dm_crypt ecb cbc algif_skcipher af_alg xfs libcrc32c snd_hda_codec_conexant snd_hda_codec_generic iTCO_wdt iTCO_vendor_support snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm_oss snd_pcm
> [242060.605396]  dm_mod snd_seq snd_seq_device snd_timer coretemp kvm_intel kvm snd_mixer_oss cdc_ether cdc_wdm cdc_acm usbnet mii arc4 uvcvideo videobuf2_vmalloc videobuf2_memops thinkpad_acpi videobuf2_core btusb v4l2_common videodev i2c_i801 iwldvm bluetooth serio_raw mac80211 pcspkr e1000e iwlwifi snd lpc_ich mei_me ptp mfd_core pps_core mei cfg80211 shpchp wmi soundcore rfkill battery ac tpm_tis tpm acpi_cpufreq i915 xhci_pci xhci_hcd i2c_algo_bit drm_kms_helper drm thermal video button processor sg loop
> [242060.605396] CPU: 1 PID: 2132 Comm: qemu-system-x86 Tainted: G        W       4.0.0-rc3-2.gd5c547f-desktop #1
> [242060.605396] Hardware name: LENOVO 74665EG/74665EG, BIOS 6DET71WW (3.21 ) 12/13/2011
> [242060.605396] task: ffff880103f46150 ti: ffff8801013d4000 task.ti: ffff8801013d4000
> [242060.605396] RIP: 0010:[<ffffffff81005b44>]  [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190
> [242060.605396] RSP: 0018:ffff88023bc84e88  EFLAGS: 00010046
> [242060.605396] RAX: 00007fffa55eafc0 RBX: 00007fffa55eafb8 RCX: ffff88023bc7ffc0
> [242060.605396] RDX: 0000000000000000 RSI: ffff88023bc84f58 RDI: 0000000000000000
> [242060.605396] RBP: ffff88023bc83fc0 R08: ffffffff81a2fe15 R09: 0000000000000020
> [242060.605396] R10: 0000000000000afb R11: ffff88023bc84bee R12: ffff88023bc84f58
> [242060.605396] R13: 0000000000000000 R14: ffffffff81a2fe15 R15: 0000000000000000
> [242060.605396] FS:  00007ffa33dbfa80(0000) GS:ffff88023bc80000(0000) knlGS:0000000000000000
> [242060.605396] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [242060.605396] CR2: 00007fffa55eafb8 CR3: 0000000002d7e000 CR4: 00000000000427e0
> [242060.605396] Stack:
> [242060.605396]  0000000002d7e000 0000000000000008 ffff88023bc84ee8 00007fffa55eafb8
> [242060.605396]  0000000000000000 ffff88023bc84f58 00007fffa55eafb8 0000000000000040
> [242060.605396]  00007ffa356b5d60 000000000000000f 00007ffa3556cf20 ffffffff81005c36
> [242060.605396] Call Trace:
> [242060.605396]  [<ffffffff81005c36>] show_regs+0x86/0x210
> [242060.605396]  [<ffffffff8104636f>] df_debug+0x1f/0x30
> [242060.605396]  [<ffffffff810041a4>] do_double_fault+0x84/0x100
> [242060.605396]  [<ffffffff81683088>] double_fault+0x28/0x30
> [242060.605396]  [<ffffffff816834ad>] page_fault+0xd/0x30
> [242060.605396] Code: fe a2 81 31 c0 89 54 24 08 48 89 0c 24 48 8b 5b f8 e8 cc 06 67 00 48 8b 0c 24 8b 54 24 08 85 d2 74 05 f6 c2 03 74 48 48 8d 43 08 <48> 8b 33 48 c7 c7 0d fe a2 81 89 54 24 14 48 89 4c 24 08 48 89 
> [242060.605396] RIP  [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190
> [242060.605396]  RSP <ffff88023bc84e88>
> [242060.605396] CR2: 00007fffa55eafb8
> 
> I would not totally rule out a hardware problem, since this machine had
> another weird crash where it crashed and the bios beeper was constant
> on until I hit the power button for 5 seconds.
> 
> Unfortunately, I cannot load the crashdump with the crash version in
> openSUSE Tumbleweed, so the backtrace is all I have for now.

Just a "me too": I'm suddenly getting the very same crash with the
recent 4.0-rc.  Judging from the identical pattern (the crash usually
happened while running KVM (-smp 4) and kernel builds with -j8), I don't
think it's a hardware problem.

IIRC, this didn't happen with the early 4.0-rc kernels, but I can't say
so with 100% certainty.

This happened with today's Linus tree (c58616580ea5).

I'm going to run stress tests to see whether I can trigger this
reliably...


Takashi

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 14:16 ` Takashi Iwai
@ 2015-03-18 15:05   ` Takashi Iwai
  2015-03-18 17:43   ` Takashi Iwai
  1 sibling, 0 replies; 77+ messages in thread
From: Takashi Iwai @ 2015-03-18 15:05 UTC (permalink / raw)
  To: Stefan Seyfried; +Cc: LKML

At Wed, 18 Mar 2015 15:16:42 +0100,
Takashi Iwai wrote:
> 
> IIRC, this didn't happen with the early 4.0-rc, but can't say 100%
> sure.

I could reproduce the panic on 4.0-rc1, so scratch this comment.


Takashi

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 14:16 ` Takashi Iwai
  2015-03-18 15:05   ` Takashi Iwai
@ 2015-03-18 17:43   ` Takashi Iwai
  2015-03-18 17:46     ` Takashi Iwai
  1 sibling, 1 reply; 77+ messages in thread
From: Takashi Iwai @ 2015-03-18 17:43 UTC (permalink / raw)
  To: x86; +Cc: LKML, Stefan Seyfried, Andy Lutomirski

At Wed, 18 Mar 2015 15:16:42 +0100,
Takashi Iwai wrote:
> 
> At Sun, 15 Mar 2015 09:17:15 +0100,
> Stefan Seyfried wrote:
> > 
> > Hi all,
> > 
> > in 4.0-rc I have recently seen a few crashes, always when running
> > KVM guests (IIRC). Today I was able to capture a crash dump, this
> > is the backtrace from dmesg.txt:
> > 
> > [242060.604870] PANIC: double fault, error_code: 0x0
> > [242060.604878] CPU: 1 PID: 2132 Comm: qemu-system-x86 Tainted: G        W       4.0.0-rc3-2.gd5c547f-desktop #1
> > [242060.604880] Hardware name: LENOVO 74665EG/74665EG, BIOS 6DET71WW (3.21 ) 12/13/2011
> > [242060.604883] task: ffff880103f46150 ti: ffff8801013d4000 task.ti: ffff8801013d4000
> > [242060.604885] RIP: 0010:[<ffffffff816834ad>]  [<ffffffff816834ad>] page_fault+0xd/0x30
> > [242060.604893] RSP: 0018:00007fffa55eafb8  EFLAGS: 00010016
> > [242060.604895] RAX: 000000000000aa40 RBX: 0000000000000001 RCX: ffffffff81682237
> > [242060.604896] RDX: 000000000000aa40 RSI: 0000000000000000 RDI: 00007fffa55eb078
> > [242060.604898] RBP: 00007fffa55f1c1c R08: 0000000000000008 R09: 0000000000000000
> > [242060.604900] R10: 0000000000000000 R11: 0000000000000293 R12: 000000000000004a
> > [242060.604902] R13: 00007ffa356b5d60 R14: 000000000000000f R15: 00007ffa3556cf20
> > [242060.604904] FS:  00007ffa33dbfa80(0000) GS:ffff88023bc80000(0000) knlGS:0000000000000000
> > [242060.604906] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [242060.604908] CR2: 00007fffa55eafa8 CR3: 0000000002d7e000 CR4: 00000000000427e0
> > [242060.604909] Stack:
> > [242060.604942] BUG: unable to handle kernel paging request at 00007fffa55eafb8
> > [242060.604995] IP: [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190
> > [242060.605036] PGD 4779a067 PUD 40e3e067 PMD 4769e067 PTE 0
> > [242060.605078] Oops: 0000 [#1] PREEMPT SMP 
> > [242060.605106] Modules linked in: vhost_net vhost macvtap macvlan nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache nls_iso8859_1 nls_cp437 vfat fat ppp_deflate bsd_comp ppp_async crc_ccitt ppp_generic slhc ses enclosure uas usb_storage cmac algif_hash ctr ccm rfcomm fuse xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT xt_tcpudp tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet bnep dm_crypt ecb cbc algif_skcipher af_alg xfs libcrc32c snd_hda_codec_conexant snd_hda_codec_generic iTCO_wdt iTCO_vendor_support snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm_oss snd_pcm
> > [242060.605396]  dm_mod snd_seq snd_seq_device snd_timer coretemp kvm_intel kvm snd_mixer_oss cdc_ether cdc_wdm cdc_acm usbnet mii arc4 uvcvideo videobuf2_vmalloc videobuf2_memops thinkpad_acpi videobuf2_core btusb v4l2_common videodev i2c_i801 iwldvm bluetooth serio_raw mac80211 pcspkr e1000e iwlwifi snd lpc_ich mei_me ptp mfd_core pps_core mei cfg80211 shpchp wmi soundcore rfkill battery ac tpm_tis tpm acpi_cpufreq i915 xhci_pci xhci_hcd i2c_algo_bit drm_kms_helper drm thermal video button processor sg loop
> > [242060.605396] CPU: 1 PID: 2132 Comm: qemu-system-x86 Tainted: G        W       4.0.0-rc3-2.gd5c547f-desktop #1
> > [242060.605396] Hardware name: LENOVO 74665EG/74665EG, BIOS 6DET71WW (3.21 ) 12/13/2011
> > [242060.605396] task: ffff880103f46150 ti: ffff8801013d4000 task.ti: ffff8801013d4000
> > [242060.605396] RIP: 0010:[<ffffffff81005b44>]  [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190
> > [242060.605396] RSP: 0018:ffff88023bc84e88  EFLAGS: 00010046
> > [242060.605396] RAX: 00007fffa55eafc0 RBX: 00007fffa55eafb8 RCX: ffff88023bc7ffc0
> > [242060.605396] RDX: 0000000000000000 RSI: ffff88023bc84f58 RDI: 0000000000000000
> > [242060.605396] RBP: ffff88023bc83fc0 R08: ffffffff81a2fe15 R09: 0000000000000020
> > [242060.605396] R10: 0000000000000afb R11: ffff88023bc84bee R12: ffff88023bc84f58
> > [242060.605396] R13: 0000000000000000 R14: ffffffff81a2fe15 R15: 0000000000000000
> > [242060.605396] FS:  00007ffa33dbfa80(0000) GS:ffff88023bc80000(0000) knlGS:0000000000000000
> > [242060.605396] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [242060.605396] CR2: 00007fffa55eafb8 CR3: 0000000002d7e000 CR4: 00000000000427e0
> > [242060.605396] Stack:
> > [242060.605396]  0000000002d7e000 0000000000000008 ffff88023bc84ee8 00007fffa55eafb8
> > [242060.605396]  0000000000000000 ffff88023bc84f58 00007fffa55eafb8 0000000000000040
> > [242060.605396]  00007ffa356b5d60 000000000000000f 00007ffa3556cf20 ffffffff81005c36
> > [242060.605396] Call Trace:
> > [242060.605396]  [<ffffffff81005c36>] show_regs+0x86/0x210
> > [242060.605396]  [<ffffffff8104636f>] df_debug+0x1f/0x30
> > [242060.605396]  [<ffffffff810041a4>] do_double_fault+0x84/0x100
> > [242060.605396]  [<ffffffff81683088>] double_fault+0x28/0x30
> > [242060.605396]  [<ffffffff816834ad>] page_fault+0xd/0x30
> > [242060.605396] Code: fe a2 81 31 c0 89 54 24 08 48 89 0c 24 48 8b 5b f8 e8 cc 06 67 00 48 8b 0c 24 8b 54 24 08 85 d2 74 05 f6 c2 03 74 48 48 8d 43 08 <48> 8b 33 48 c7 c7 0d fe a2 81 89 54 24 14 48 89 4c 24 08 48 89 
> > [242060.605396] RIP  [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190
> > [242060.605396]  RSP <ffff88023bc84e88>
> > [242060.605396] CR2: 00007fffa55eafb8
> > 
> > I would not totally rule out a hardware problem, since this machine had
> > another weird crash where it crashed and the bios beeper was constant
> > on until I hit the power button for 5 seconds.
> > 
> > Unfortunately, I cannot load the crashdump with the crash version in
> > openSUSE Tumbleweed, so the backtrace is all I have for now.
> 
> Just "me too", I'm getting the very same crash out of sudden with the
> recent 4.0-rc.  Judging from the very same pattern (usually crash
> happened while using KVM (-smp 4) and kernel builds with -j8), I don't
> think it's a hardware problem.

The git bisection pointed to the commit:
commit b926e6f61a26036ee9eabe6761483954d481ad25
    x86, traps: Fix ist_enter from userspace

And reverting this on top of the latest Linus tree seems to work.
Seife, could you verify on your machine, too?


Takashi

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 17:43   ` Takashi Iwai
@ 2015-03-18 17:46     ` Takashi Iwai
  2015-03-18 18:03       ` Andy Lutomirski
  0 siblings, 1 reply; 77+ messages in thread
From: Takashi Iwai @ 2015-03-18 17:46 UTC (permalink / raw)
  To: x86; +Cc: LKML, Stefan Seyfried, Andy Lutomirski

At Wed, 18 Mar 2015 18:43:52 +0100,
Takashi Iwai wrote:
> 
> At Wed, 18 Mar 2015 15:16:42 +0100,
> Takashi Iwai wrote:
> > 
> > At Sun, 15 Mar 2015 09:17:15 +0100,
> > Stefan Seyfried wrote:
> > > 
> > > Hi all,
> > > 
> > > in 4.0-rc I have recently seen a few crashes, always when running
> > > KVM guests (IIRC). Today I was able to capture a crash dump, this
> > > is the backtrace from dmesg.txt:
> > > 
> > > [242060.604870] PANIC: double fault, error_code: 0x0
> > > [242060.604878] CPU: 1 PID: 2132 Comm: qemu-system-x86 Tainted: G        W       4.0.0-rc3-2.gd5c547f-desktop #1
> > > [242060.604880] Hardware name: LENOVO 74665EG/74665EG, BIOS 6DET71WW (3.21 ) 12/13/2011
> > > [242060.604883] task: ffff880103f46150 ti: ffff8801013d4000 task.ti: ffff8801013d4000
> > > [242060.604885] RIP: 0010:[<ffffffff816834ad>]  [<ffffffff816834ad>] page_fault+0xd/0x30
> > > [242060.604893] RSP: 0018:00007fffa55eafb8  EFLAGS: 00010016
> > > [242060.604895] RAX: 000000000000aa40 RBX: 0000000000000001 RCX: ffffffff81682237
> > > [242060.604896] RDX: 000000000000aa40 RSI: 0000000000000000 RDI: 00007fffa55eb078
> > > [242060.604898] RBP: 00007fffa55f1c1c R08: 0000000000000008 R09: 0000000000000000
> > > [242060.604900] R10: 0000000000000000 R11: 0000000000000293 R12: 000000000000004a
> > > [242060.604902] R13: 00007ffa356b5d60 R14: 000000000000000f R15: 00007ffa3556cf20
> > > [242060.604904] FS:  00007ffa33dbfa80(0000) GS:ffff88023bc80000(0000) knlGS:0000000000000000
> > > [242060.604906] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [242060.604908] CR2: 00007fffa55eafa8 CR3: 0000000002d7e000 CR4: 00000000000427e0
> > > [242060.604909] Stack:
> > > [242060.604942] BUG: unable to handle kernel paging request at 00007fffa55eafb8
> > > [242060.604995] IP: [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190
> > > [242060.605036] PGD 4779a067 PUD 40e3e067 PMD 4769e067 PTE 0
> > > [242060.605078] Oops: 0000 [#1] PREEMPT SMP 
> > > [242060.605106] Modules linked in: vhost_net vhost macvtap macvlan nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache nls_iso8859_1 nls_cp437 vfat fat ppp_deflate bsd_comp ppp_async crc_ccitt ppp_generic slhc ses enclosure uas usb_storage cmac algif_hash ctr ccm rfcomm fuse xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT xt_tcpudp tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet bnep dm_crypt ecb cbc algif_skcipher af_alg xfs libcrc32c snd_hda_codec_conexant snd_hda_codec_generic iTCO_wdt iTCO_vendor_support snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm_oss snd_pcm
> > > [242060.605396]  dm_mod snd_seq snd_seq_device snd_timer coretemp kvm_intel kvm snd_mixer_oss cdc_ether cdc_wdm cdc_acm usbnet mii arc4 uvcvideo videobuf2_vmalloc videobuf2_memops thinkpad_acpi videobuf2_core btusb v4l2_common videodev i2c_i801 iwldvm bluetooth serio_raw mac80211 pcspkr e1000e iwlwifi snd lpc_ich mei_me ptp mfd_core pps_core mei cfg80211 shpchp wmi soundcore rfkill battery ac tpm_tis tpm acpi_cpufreq i915 xhci_pci xhci_hcd i2c_algo_bit drm_kms_helper drm thermal video button processor sg loop
> > > [242060.605396] CPU: 1 PID: 2132 Comm: qemu-system-x86 Tainted: G        W       4.0.0-rc3-2.gd5c547f-desktop #1
> > > [242060.605396] Hardware name: LENOVO 74665EG/74665EG, BIOS 6DET71WW (3.21 ) 12/13/2011
> > > [242060.605396] task: ffff880103f46150 ti: ffff8801013d4000 task.ti: ffff8801013d4000
> > > [242060.605396] RIP: 0010:[<ffffffff81005b44>]  [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190
> > > [242060.605396] RSP: 0018:ffff88023bc84e88  EFLAGS: 00010046
> > > [242060.605396] RAX: 00007fffa55eafc0 RBX: 00007fffa55eafb8 RCX: ffff88023bc7ffc0
> > > [242060.605396] RDX: 0000000000000000 RSI: ffff88023bc84f58 RDI: 0000000000000000
> > > [242060.605396] RBP: ffff88023bc83fc0 R08: ffffffff81a2fe15 R09: 0000000000000020
> > > [242060.605396] R10: 0000000000000afb R11: ffff88023bc84bee R12: ffff88023bc84f58
> > > [242060.605396] R13: 0000000000000000 R14: ffffffff81a2fe15 R15: 0000000000000000
> > > [242060.605396] FS:  00007ffa33dbfa80(0000) GS:ffff88023bc80000(0000) knlGS:0000000000000000
> > > [242060.605396] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [242060.605396] CR2: 00007fffa55eafb8 CR3: 0000000002d7e000 CR4: 00000000000427e0
> > > [242060.605396] Stack:
> > > [242060.605396]  0000000002d7e000 0000000000000008 ffff88023bc84ee8 00007fffa55eafb8
> > > [242060.605396]  0000000000000000 ffff88023bc84f58 00007fffa55eafb8 0000000000000040
> > > [242060.605396]  00007ffa356b5d60 000000000000000f 00007ffa3556cf20 ffffffff81005c36
> > > [242060.605396] Call Trace:
> > > [242060.605396]  [<ffffffff81005c36>] show_regs+0x86/0x210
> > > [242060.605396]  [<ffffffff8104636f>] df_debug+0x1f/0x30
> > > [242060.605396]  [<ffffffff810041a4>] do_double_fault+0x84/0x100
> > > [242060.605396]  [<ffffffff81683088>] double_fault+0x28/0x30
> > > [242060.605396]  [<ffffffff816834ad>] page_fault+0xd/0x30
> > > [242060.605396] Code: fe a2 81 31 c0 89 54 24 08 48 89 0c 24 48 8b 5b f8 e8 cc 06 67 00 48 8b 0c 24 8b 54 24 08 85 d2 74 05 f6 c2 03 74 48 48 8d 43 08 <48> 8b 33 48 c7 c7 0d fe a2 81 89 54 24 14 48 89 4c 24 08 48 89 
> > > [242060.605396] RIP  [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190
> > > [242060.605396]  RSP <ffff88023bc84e88>
> > > [242060.605396] CR2: 00007fffa55eafb8
> > > 
> > > I would not totally rule out a hardware problem, since this machine had
> > > another weird crash where it crashed and the bios beeper was constant
> > > on until I hit the power button for 5 seconds.
> > > 
> > > Unfortunately, I cannot load the crashdump with the crash version in
> > > openSUSE Tumbleweed, so the backtrace is all I have for now.
> > 
> > Just "me too", I'm getting the very same crash out of sudden with the
> > recent 4.0-rc.  Judging from the very same pattern (usually crash
> > happened while using KVM (-smp 4) and kernel builds with -j8), I don't
> > think it's a hardware problem.
> 
> The git bisection pointed to the commit:
> commit b926e6f61a26036ee9eabe6761483954d481ad25
>     x86, traps: Fix ist_enter from userspace
> 
> And reverting this on top of the latest Linus tree seems working.
> Seife, could you verify on your machine, too?

Argh, false positive.  Right after I wrote this mail, I got the very
same crash.  It seems I need to run the test much longer than I
thought.
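
As a rough illustration of why a short clean run proves little when
bisecting an intermittent crash, here is a minimal sketch.  It assumes
that test runs are independent and that each run crashes with a fixed
probability; the probabilities used are hypothetical and were not
measured on either machine in this thread.

```python
import math


def clean_runs_needed(p_crash, confidence):
    """Number of consecutive crash-free runs needed before a commit can
    be called "good" with the given confidence, assuming each run of a
    bad kernel crashes independently with probability p_crash."""
    if not (0 < p_crash < 1 and 0 < confidence < 1):
        raise ValueError("p_crash and confidence must be in (0, 1)")
    # A bad kernel survives n runs with probability (1 - p_crash)**n;
    # solve (1 - p_crash)**n <= 1 - confidence for n.
    return math.ceil(math.log(1 - confidence) / math.log(1 - p_crash))


# Hypothetical per-run crash probabilities, for illustration only.
for p in (0.5, 0.1):
    print(p, clean_runs_needed(p, 0.95))
```

The rarer the crash, the more clean runs each bisection step needs
before it can be trusted, which is one way a bisect can end up on an
innocent commit.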

But somehow the commits around the above smell suspicious...


Takashi

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 17:46     ` Takashi Iwai
@ 2015-03-18 18:03       ` Andy Lutomirski
  2015-03-18 19:03         ` Stefan Seyfried
  0 siblings, 1 reply; 77+ messages in thread
From: Andy Lutomirski @ 2015-03-18 18:03 UTC (permalink / raw)
  To: Takashi Iwai, Denys Vlasenko; +Cc: X86 ML, LKML, Stefan Seyfried, Tejun Heo

On Wed, Mar 18, 2015 at 10:46 AM, Takashi Iwai <tiwai@suse.de> wrote:
> At Wed, 18 Mar 2015 18:43:52 +0100,
> Takashi Iwai wrote:
>>
>> At Wed, 18 Mar 2015 15:16:42 +0100,
>> Takashi Iwai wrote:
>> >
>> > At Sun, 15 Mar 2015 09:17:15 +0100,
>> > Stefan Seyfried wrote:
>> > >
>> > > Hi all,
>> > >
>> > > in 4.0-rc I have recently seen a few crashes, always when running
>> > > KVM guests (IIRC). Today I was able to capture a crash dump, this
>> > > is the backtrace from dmesg.txt:
>> > >
>> > > [242060.604870] PANIC: double fault, error_code: 0x0

OK, we double faulted.  Too bad that x86 CPUs don't tell us why.

>> > > [242060.604878] CPU: 1 PID: 2132 Comm: qemu-system-x86 Tainted: G        W       4.0.0-rc3-2.gd5c547f-desktop #1
>> > > [242060.604880] Hardware name: LENOVO 74665EG/74665EG, BIOS 6DET71WW (3.21 ) 12/13/2011
>> > > [242060.604883] task: ffff880103f46150 ti: ffff8801013d4000 task.ti: ffff8801013d4000
>> > > [242060.604885] RIP: 0010:[<ffffffff816834ad>]  [<ffffffff816834ad>] page_fault+0xd/0x30

The double fault happened during page fault processing.  Could you
disassemble your page_fault function to find the offending
instruction?
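
To narrow down which bytes to disassemble, the RIP line can be split
into symbol, offset and size.  The sketch below does this for the
values quoted above; the helper name parse_symbol is made up for
illustration and is not part of any kernel tooling.

```python
import re


def parse_symbol(text):
    """Split 'name+0xOFF/0xSIZE' (as printed in an oops) into
    (name, offset, size)."""
    m = re.fullmatch(r"(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)", text)
    if m is None:
        raise ValueError(f"unrecognised symbol string: {text!r}")
    return m.group(1), int(m.group(2), 16), int(m.group(3), 16)


# Values copied from the RIP line of the double-fault report.
rip = 0xffffffff816834ad
name, offset, size = parse_symbol("page_fault+0xd/0x30")

start = rip - offset   # first byte of the function
end = start + size     # one past the last byte of the function
print(f"{name}: {start:#x}..{end:#x}, reported RIP at +{offset:#x}")
```

The resulting range could then be passed to something like
objdump -d --start-address=... --stop-address=... on the matching
vmlinux, assuming an image with the same build is available.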

>> > > [242060.604893] RSP: 0018:00007fffa55eafb8  EFLAGS: 00010016

Uh, what?  That RSP is a user address.
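
A quick way to see this, as a sketch that assumes ordinary 4-level
paging with 48-bit virtual addresses: canonical addresses have bits
63..47 all equal, the lower half belongs to user space and the upper
half to the kernel.  The function name classify is illustrative only.

```python
def classify(addr):
    """Classify a 64-bit virtual address, assuming 48-bit (4-level)
    x86-64 paging."""
    top = addr >> 47            # bits 63..47, 17 bits in total
    if top == 0:
        return "user"           # lower canonical half
    if top == 0x1ffff:
        return "kernel"         # upper canonical half
    return "non-canonical"


# RSP from the double-fault report and RSP from the later oops.
print(classify(0x00007fffa55eafb8))
print(classify(0xffff88023bc84e88))
```

The first value is the RSP reported with CS 0010, i.e. kernel code
running on what looks like a user-space stack pointer.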

>> > > [242060.604895] RAX: 000000000000aa40 RBX: 0000000000000001 RCX: ffffffff81682237
>> > > [242060.604896] RDX: 000000000000aa40 RSI: 0000000000000000 RDI: 00007fffa55eb078
>> > > [242060.604898] RBP: 00007fffa55f1c1c R08: 0000000000000008 R09: 0000000000000000
>> > > [242060.604900] R10: 0000000000000000 R11: 0000000000000293 R12: 000000000000004a
>> > > [242060.604902] R13: 00007ffa356b5d60 R14: 000000000000000f R15: 00007ffa3556cf20
>> > > [242060.604904] FS:  00007ffa33dbfa80(0000) GS:ffff88023bc80000(0000) knlGS:0000000000000000
>> > > [242060.604906] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> > > [242060.604908] CR2: 00007fffa55eafa8 CR3: 0000000002d7e000 CR4: 00000000000427e0
>> > > [242060.604909] Stack:
>> > > [242060.604942] BUG: unable to handle kernel paging request at 00007fffa55eafb8
>> > > [242060.604995] IP: [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190

This is suspicious.  We must have died, again, of a fatal page
fault while dumping the stack.
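
The numbers in the report are consistent with that reading.  The
sketch below decodes the "Oops: 0000" error code using the
architectural x86 page-fault error-code bits and checks that the
faulting address of the second fault equals the RSP of the first; the
table and function names are illustrative, not kernel code.

```python
# (mask, meaning if set, meaning if clear) for the x86 #PF error code.
PF_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write", "read"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set", None),
    (0x10, "instruction fetch", None),
]


def decode_pf(code):
    """Return a list of human-readable properties for a #PF error code."""
    out = []
    for mask, if_set, if_clear in PF_BITS:
        label = if_set if code & mask else if_clear
        if label is not None:
            out.append(label)
    return out


first_rsp = 0x00007fffa55eafb8   # RSP at the double fault
second_cr2 = 0x00007fffa55eafb8  # CR2 at the fault in show_stack_log_lvl

print(decode_pf(0x0000))
print(first_rsp == second_cr2)
```

A kernel-mode read of a not-present page at exactly the bogus RSP
matches "PTE 0" in the report and suggests that the stack dumper
simply tried to read the stack the first report pointed at.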

>> > > [242060.605036] PGD 4779a067 PUD 40e3e067 PMD 4769e067 PTE 0
>> > > [242060.605078] Oops: 0000 [#1] PREEMPT SMP
>> > > [242060.605106] Modules linked in: vhost_net vhost macvtap macvlan nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache nls_iso8859_1 nls_cp437 vfat fat ppp_deflate bsd_comp ppp_async crc_ccitt ppp_generic slhc ses enclosure uas usb_storage cmac algif_hash ctr ccm rfcomm fuse xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT xt_tcpudp tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet bnep dm_crypt ecb cbc algif_skcipher af_alg xfs libcrc32c snd_hda_codec_conexant snd_hda_codec_generic iTCO_wdt iTCO_vendor_support snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm_oss snd_pcm
>> > > [242060.605396]  dm_mod snd_seq snd_seq_device snd_timer coretemp kvm_intel kvm snd_mixer_oss cdc_ether cdc_wdm cdc_acm usbnet mii arc4 uvcvideo videobuf2_vmalloc videobuf2_memops thinkpad_acpi videobuf2_core btusb v4l2_common videodev i2c_i801 iwldvm bluetooth serio_raw mac80211 pcspkr e1000e iwlwifi snd lpc_ich mei_me ptp mfd_core pps_core mei cfg80211 shpchp wmi soundcore rfkill battery ac tpm_tis tpm acpi_cpufreq i915 xhci_pci xhci_hcd i2c_algo_bit drm_kms_helper drm thermal video button processor sg loop
>> > > [242060.605396] CPU: 1 PID: 2132 Comm: qemu-system-x86 Tainted: G        W       4.0.0-rc3-2.gd5c547f-desktop #1
>> > > [242060.605396] Hardware name: LENOVO 74665EG/74665EG, BIOS 6DET71WW (3.21 ) 12/13/2011
>> > > [242060.605396] task: ffff880103f46150 ti: ffff8801013d4000 task.ti: ffff8801013d4000
>> > > [242060.605396] RIP: 0010:[<ffffffff81005b44>]  [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190
>> > > [242060.605396] RSP: 0018:ffff88023bc84e88  EFLAGS: 00010046
>> > > [242060.605396] RAX: 00007fffa55eafc0 RBX: 00007fffa55eafb8 RCX: ffff88023bc7ffc0
>> > > [242060.605396] RDX: 0000000000000000 RSI: ffff88023bc84f58 RDI: 0000000000000000
>> > > [242060.605396] RBP: ffff88023bc83fc0 R08: ffffffff81a2fe15 R09: 0000000000000020
>> > > [242060.605396] R10: 0000000000000afb R11: ffff88023bc84bee R12: ffff88023bc84f58
>> > > [242060.605396] R13: 0000000000000000 R14: ffffffff81a2fe15 R15: 0000000000000000
>> > > [242060.605396] FS:  00007ffa33dbfa80(0000) GS:ffff88023bc80000(0000) knlGS:0000000000000000
>> > > [242060.605396] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> > > [242060.605396] CR2: 00007fffa55eafb8 CR3: 0000000002d7e000 CR4: 00000000000427e0
>> > > [242060.605396] Stack:
>> > > [242060.605396]  0000000002d7e000 0000000000000008 ffff88023bc84ee8 00007fffa55eafb8
>> > > [242060.605396]  0000000000000000 ffff88023bc84f58 00007fffa55eafb8 0000000000000040
>> > > [242060.605396]  00007ffa356b5d60 000000000000000f 00007ffa3556cf20 ffffffff81005c36
>> > > [242060.605396] Call Trace:
>> > > [242060.605396]  [<ffffffff81005c36>] show_regs+0x86/0x210
>> > > [242060.605396]  [<ffffffff8104636f>] df_debug+0x1f/0x30
>> > > [242060.605396]  [<ffffffff810041a4>] do_double_fault+0x84/0x100
>> > > [242060.605396]  [<ffffffff81683088>] double_fault+0x28/0x30
>> > > [242060.605396]  [<ffffffff816834ad>] page_fault+0xd/0x30
>> > > [242060.605396] Code: fe a2 81 31 c0 89 54 24 08 48 89 0c 24 48 8b 5b f8 e8 cc 06 67 00 48 8b 0c 24 8b 54 24 08 85 d2 74 05 f6 c2 03 74 48 48 8d 43 08 <48> 8b 33 48 c7 c7 0d fe a2 81 89 54 24 14 48 89 4c 24 08 48 89
>> > > [242060.605396] RIP  [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190
>> > > [242060.605396]  RSP <ffff88023bc84e88>
>> > > [242060.605396] CR2: 00007fffa55eafb8
>> > >
>> > > I would not totally rule out a hardware problem, since this machine had
>> > > another weird crash where it crashed and the bios beeper was constant
>> > > on until I hit the power button for 5 seconds.
>> > >
>> > > Unfortunately, I cannot load the crashdump with the crash version in
>> > > openSUSE Tumbleweed, so the backtrace is all I have for now.
>> >
>> > Just "me too", I'm getting the very same crash out of sudden with the
>> > recent 4.0-rc.  Judging from the very same pattern (usually crash
>> > happened while using KVM (-smp 4) and kernel builds with -j8), I don't
>> > think it's a hardware problem.
>>
>> The git bisection pointed to the commit:
>> commit b926e6f61a26036ee9eabe6761483954d481ad25
>>     x86, traps: Fix ist_enter from userspace
>>
>> And reverting this on top of the latest Linus tree seems working.
>> Seife, could you verify on your machine, too?
>
> Argh, false positive.  Right after I wrote this mail, I got the very
> same crash.  I seem to need running the test much longer than I
> thought.
>
> But somehow the commits around the above smell suspicious...
>

Those commits shouldn't really have affected page fault or double
fault behavior.  They made big changes to MCE, breakpoints, and debug
exceptions.

Something's very wrong here.  I'm guessing that we somehow ended up in
page_fault in a completely invalid context.

One hairy code path that could plausibly do this is:

1. syscall

2. vmalloc fault accessing old_rsp aka rsp_scratch (or kernel_stack --
same issue)

3. page fault.  Now we're on the user stack and we do awful things.
If we run off the end of the presently writable portion of the stack,
we double fault.
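
As a rough illustration of why the reported RSP fits this scenario, the sketch below classifies the RSP from the double-fault dump against the task's kernel stack. The addresses are copied from the report; the 16 KiB THREAD_SIZE, the 48-bit user/kernel split, and the helper names are assumptions made for the example.

```python
# Values copied from the double-fault register dump.
TI = 0xffff8801013d4000    # "ti:" field, base of the task's kernel stack
RSP = 0x00007fffa55eafb8   # RSP reported at page_fault+0xd

USER_TOP = 0x0000800000000000  # assumed end of the user half (48-bit addresses)
THREAD_SIZE = 16 * 1024        # assumed kernel stack size for this kernel


def in_user_half(addr):
    """True if addr lies in the lower (user) half of the address space."""
    return addr < USER_TOP


def on_thread_stack(addr, ti=TI, size=THREAD_SIZE):
    """True if addr lies inside the task's kernel stack [ti, ti + size)."""
    return ti <= addr < ti + size


print("RSP in user half:    ", in_user_half(RSP))
print("RSP on kernel stack: ", on_thread_stack(RSP))
```

Under those assumptions the RSP is a user-space address and nowhere near the thread's kernel stack, which is what step 3 would produce.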

I'd really like to see lazy vmalloc faults go away.

--Andy


* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 18:03       ` Andy Lutomirski
@ 2015-03-18 19:03         ` Stefan Seyfried
  2015-03-18 19:26           ` Andy Lutomirski
  0 siblings, 1 reply; 77+ messages in thread
From: Stefan Seyfried @ 2015-03-18 19:03 UTC (permalink / raw)
  To: Andy Lutomirski, Takashi Iwai, Denys Vlasenko; +Cc: X86 ML, LKML, Tejun Heo

Hi all,

first, I'm kind of happy that I'm not the only one seeing this, and
thus my beloved Thinkpad can stay for a bit longer... :-)

Then, I'm mostly an amateur when it comes to kernel debugging, so bear
with me when I'm stumbling through the code...

On 18.03.2015 at 19:03, Andy Lutomirski wrote:
> On Wed, Mar 18, 2015 at 10:46 AM, Takashi Iwai <tiwai@suse.de> wrote:
>> At Wed, 18 Mar 2015 18:43:52 +0100,
>> Takashi Iwai wrote:
>>>
>>> At Wed, 18 Mar 2015 15:16:42 +0100,
>>> Takashi Iwai wrote:
>>>>
>>>> At Sun, 15 Mar 2015 09:17:15 +0100,
>>>> Stefan Seyfried wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> in 4.0-rc I have recently seen a few crashes, always when running
>>>>> KVM guests (IIRC). Today I was able to capture a crash dump, this
>>>>> is the backtrace from dmesg.txt:
>>>>>
>>>>> [242060.604870] PANIC: double fault, error_code: 0x0
> 
> OK, we double faulted.  Too bad that x86 CPUs don't tell us why.
> 
>>>>> [242060.604878] CPU: 1 PID: 2132 Comm: qemu-system-x86 Tainted: G        W       4.0.0-rc3-2.gd5c547f-desktop #1
>>>>> [242060.604880] Hardware name: LENOVO 74665EG/74665EG, BIOS 6DET71WW (3.21 ) 12/13/2011
>>>>> [242060.604883] task: ffff880103f46150 ti: ffff8801013d4000 task.ti: ffff8801013d4000
>>>>> [242060.604885] RIP: 0010:[<ffffffff816834ad>]  [<ffffffff816834ad>] page_fault+0xd/0x30
> 
> The double fault happened during page fault processing.  Could you
> disassemble your page_fault function to find the offending
> instruction?

This one is easy:

crash> disassemble page_fault
Dump of assembler code for function page_fault:
   0xffffffff816834a0 <+0>:     data32 xchg %ax,%ax
   0xffffffff816834a3 <+3>:     data32 xchg %ax,%ax
   0xffffffff816834a6 <+6>:     data32 xchg %ax,%ax
   0xffffffff816834a9 <+9>:     sub    $0x78,%rsp
   0xffffffff816834ad <+13>:    callq  0xffffffff81683620 <error_entry>
   0xffffffff816834b2 <+18>:    mov    %rsp,%rdi
   0xffffffff816834b5 <+21>:    mov    0x78(%rsp),%rsi
   0xffffffff816834ba <+26>:    movq   $0xffffffffffffffff,0x78(%rsp)
   0xffffffff816834c3 <+35>:    callq  0xffffffff810504e0 <do_page_fault>
   0xffffffff816834c8 <+40>:    jmpq   0xffffffff816836d0 <error_exit>
End of assembler dump.


>>>>> [242060.604893] RSP: 0018:00007fffa55eafb8  EFLAGS: 00010016
> 
> Uh, what?  That RSP is a user address.
> 
>>>>> [242060.604895] RAX: 000000000000aa40 RBX: 0000000000000001 RCX: ffffffff81682237
>>>>> [242060.604896] RDX: 000000000000aa40 RSI: 0000000000000000 RDI: 00007fffa55eb078
>>>>> [242060.604898] RBP: 00007fffa55f1c1c R08: 0000000000000008 R09: 0000000000000000
>>>>> [242060.604900] R10: 0000000000000000 R11: 0000000000000293 R12: 000000000000004a
>>>>> [242060.604902] R13: 00007ffa356b5d60 R14: 000000000000000f R15: 00007ffa3556cf20
>>>>> [242060.604904] FS:  00007ffa33dbfa80(0000) GS:ffff88023bc80000(0000) knlGS:0000000000000000
>>>>> [242060.604906] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> [242060.604908] CR2: 00007fffa55eafa8 CR3: 0000000002d7e000 CR4: 00000000000427e0
>>>>> [242060.604909] Stack:
>>>>> [242060.604942] BUG: unable to handle kernel paging request at 00007fffa55eafb8
>>>>> [242060.604995] IP: [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190
> 
> This is suspicious.  We need to have died, again, of a fatal page
> fault while dumping the stack.

I posted the same problem to the opensuse kernel list shortly before turning
to LKML. There, Michal Kubecek noted:

"I encountered a similar problem recently. The thing is, x86
specification says that on a double fault, RIP and RSP registers are
undefined, i.e. you not only can't expect them to contain values
corresponding to the first or second fault but you can't even expect
them to have any usable values at all. Unfortunately the kernel double
fault handler doesn't take this into account and does try to display
usual crash related information so that it itself does usually crash
when trying to show stack content (that's the show_stack_log_lvl()
crash).

The result is a double fault (which itself would be very hard to debug)
followed by a crash in its handler so that analysing the outcome is
extremely difficult."

I cannot judge if this is true, but it sounded related to solving the
problem to me.

>>>>> [242060.605036] PGD 4779a067 PUD 40e3e067 PMD 4769e067 PTE 0
>>>>> [242060.605078] Oops: 0000 [#1] PREEMPT SMP
>>>>> [242060.605106] Modules linked in: vhost_net vhost macvtap macvlan nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache nls_iso8859_1 nls_cp437 vfat fat ppp_deflate bsd_comp ppp_async crc_ccitt ppp_generic slhc ses enclosure uas usb_storage cmac algif_hash ctr ccm rfcomm fuse xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT xt_tcpudp tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet bnep dm_crypt ecb cbc algif_skcipher af_alg xfs libcrc32c snd_hda_codec_conexant snd_hda_codec_generic iTCO_wdt iTCO_vendor_support snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm_oss snd_pcm
>>>>> [242060.605396]  dm_mod snd_seq snd_seq_device snd_timer coretemp kvm_intel kvm snd_mixer_oss cdc_ether cdc_wdm cdc_acm usbnet mii arc4 uvcvideo videobuf2_vmalloc videobuf2_memops thinkpad_acpi videobuf2_core btusb v4l2_common videodev i2c_i801 iwldvm bluetooth serio_raw mac80211 pcspkr e1000e iwlwifi snd lpc_ich mei_me ptp mfd_core pps_core mei cfg80211 shpchp wmi soundcore rfkill battery ac tpm_tis tpm acpi_cpufreq i915 xhci_pci xhci_hcd i2c_algo_bit drm_kms_helper drm thermal video button processor sg loop
>>>>> [242060.605396] CPU: 1 PID: 2132 Comm: qemu-system-x86 Tainted: G        W       4.0.0-rc3-2.gd5c547f-desktop #1
>>>>> [242060.605396] Hardware name: LENOVO 74665EG/74665EG, BIOS 6DET71WW (3.21 ) 12/13/2011
>>>>> [242060.605396] task: ffff880103f46150 ti: ffff8801013d4000 task.ti: ffff8801013d4000
>>>>> [242060.605396] RIP: 0010:[<ffffffff81005b44>]  [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190
>>>>> [242060.605396] RSP: 0018:ffff88023bc84e88  EFLAGS: 00010046
>>>>> [242060.605396] RAX: 00007fffa55eafc0 RBX: 00007fffa55eafb8 RCX: ffff88023bc7ffc0
>>>>> [242060.605396] RDX: 0000000000000000 RSI: ffff88023bc84f58 RDI: 0000000000000000
>>>>> [242060.605396] RBP: ffff88023bc83fc0 R08: ffffffff81a2fe15 R09: 0000000000000020
>>>>> [242060.605396] R10: 0000000000000afb R11: ffff88023bc84bee R12: ffff88023bc84f58
>>>>> [242060.605396] R13: 0000000000000000 R14: ffffffff81a2fe15 R15: 0000000000000000
>>>>> [242060.605396] FS:  00007ffa33dbfa80(0000) GS:ffff88023bc80000(0000) knlGS:0000000000000000
>>>>> [242060.605396] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> [242060.605396] CR2: 00007fffa55eafb8 CR3: 0000000002d7e000 CR4: 00000000000427e0
>>>>> [242060.605396] Stack:
>>>>> [242060.605396]  0000000002d7e000 0000000000000008 ffff88023bc84ee8 00007fffa55eafb8
>>>>> [242060.605396]  0000000000000000 ffff88023bc84f58 00007fffa55eafb8 0000000000000040
>>>>> [242060.605396]  00007ffa356b5d60 000000000000000f 00007ffa3556cf20 ffffffff81005c36
>>>>> [242060.605396] Call Trace:
>>>>> [242060.605396]  [<ffffffff81005c36>] show_regs+0x86/0x210
>>>>> [242060.605396]  [<ffffffff8104636f>] df_debug+0x1f/0x30
>>>>> [242060.605396]  [<ffffffff810041a4>] do_double_fault+0x84/0x100
>>>>> [242060.605396]  [<ffffffff81683088>] double_fault+0x28/0x30
>>>>> [242060.605396]  [<ffffffff816834ad>] page_fault+0xd/0x30
>>>>> [242060.605396] Code: fe a2 81 31 c0 89 54 24 08 48 89 0c 24 48 8b 5b f8 e8 cc 06 67 00 48 8b 0c 24 8b 54 24 08 85 d2 74 05 f6 c2 03 74 48 48 8d 43 08 <48> 8b 33 48 c7 c7 0d fe a2 81 89 54 24 14 48 89 4c 24 08 48 89
>>>>> [242060.605396] RIP  [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190
>>>>> [242060.605396]  RSP <ffff88023bc84e88>
>>>>> [242060.605396] CR2: 00007fffa55eafb8
>>>>>
>>>>> I would not totally rule out a hardware problem, since this machine had
>>>>> another weird crash where it crashed and the bios beeper was constant
>>>>> on until I hit the power button for 5 seconds.
>>>>>
>>>>> Unfortunately, I cannot load the crashdump with the crash version in
>>>>> openSUSE Tumbleweed, so the backtrace is all I have for now.
>>>>
>>>> Just "me too", I'm getting the very same crash out of sudden with the
>>>> recent 4.0-rc.  Judging from the very same pattern (usually crash
>>>> happened while using KVM (-smp 4) and kernel builds with -j8), I don't
>>>> think it's a hardware problem.
>>>
>>> The git bisection pointed to the commit:
>>> commit b926e6f61a26036ee9eabe6761483954d481ad25
>>>     x86, traps: Fix ist_enter from userspace
>>>
>>> And reverting this on top of the latest Linus tree seems working.
>>> Seife, could you verify on your machine, too?
>>
>> Argh, false positive.  Right after I wrote this mail, I got the very
>> same crash.  I seem to need running the test much longer than I
>> thought.
>>
>> But somehow the commits around the above smell suspicious...
>>
> 
> Those commits shouldn't really have affected page fault or double
> fault behavior.  They made big changes to MCE, breakpoints, and debug
> exceptions.
> 
> Something's very wrong here.  I'm guessing that we somehow ended up in
> page_fault in a completely invalid context.
> 
> One hairy code path that could plausibly do this is:
> 
> 1. syscall
> 
> 2. vmalloc fault accessing old_rsp aka rsp_scratch (or kernel_stack --
> same issue)
> 
> 3. page fault.  Now we're on the user stack and we do awful things.
> If we run off the end of the presently writable portion of the stack,
> we double fault.

Maybe Michal's idea from above points in the right direction?

Now since I have a crash dump and the corresponding debuginfo at hand,
might this help somehow to find out where the problem originated from?

I mean -- it's only 508 processes to look at :-) but if I knew what to
look for in their backtraces, I would try to do it.

Best regards,

	Stefan
-- 
Stefan Seyfried
Linux Consultant & Developer -- GPG Key: 0x731B665B

B1 Systems GmbH
Osterfeldstraße 7 / 85088 Vohburg / http://www.b1-systems.de
GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt,HRB 3537


* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 19:03         ` Stefan Seyfried
@ 2015-03-18 19:26           ` Andy Lutomirski
  2015-03-18 20:05             ` Stefan Seyfried
                               ` (3 more replies)
  0 siblings, 4 replies; 77+ messages in thread
From: Andy Lutomirski @ 2015-03-18 19:26 UTC (permalink / raw)
  To: Stefan Seyfried, Linus Torvalds
  Cc: Takashi Iwai, Denys Vlasenko, X86 ML, LKML, Tejun Heo

Hi Linus-

You seem to enjoy debugging these things.  Want to give this a shot?
My guess is a vmalloc fault accessing either old_rsp or kernel_stack
right after swapgs in syscall entry.

On Wed, Mar 18, 2015 at 12:03 PM, Stefan Seyfried
<stefan.seyfried@googlemail.com> wrote:
> Hi all,
>
> first, I'm kind of happy that I'm not the only one seeing this, and
> thus my beloved Thinkpad can stay for a bit longer... :-)
>
> Then, I'm mostly an amateur when it comes to kernel debugging, so bear
> with me when I'm stumbling through the code...
>
> On 18.03.2015 at 19:03, Andy Lutomirski wrote:
>> On Wed, Mar 18, 2015 at 10:46 AM, Takashi Iwai <tiwai@suse.de> wrote:
>>> At Wed, 18 Mar 2015 18:43:52 +0100,
>>> Takashi Iwai wrote:
>>>>
>>>> At Wed, 18 Mar 2015 15:16:42 +0100,
>>>> Takashi Iwai wrote:
>>>>>
>>>>> At Sun, 15 Mar 2015 09:17:15 +0100,
>>>>> Stefan Seyfried wrote:
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> in 4.0-rc I have recently seen a few crashes, always when running
>>>>>> KVM guests (IIRC). Today I was able to capture a crash dump, this
>>>>>> is the backtrace from dmesg.txt:
>>>>>>
>>>>>> [242060.604870] PANIC: double fault, error_code: 0x0
>>
>> OK, we double faulted.  Too bad that x86 CPUs don't tell us why.
>>
>>>>>> [242060.604878] CPU: 1 PID: 2132 Comm: qemu-system-x86 Tainted: G        W       4.0.0-rc3-2.gd5c547f-desktop #1
>>>>>> [242060.604880] Hardware name: LENOVO 74665EG/74665EG, BIOS 6DET71WW (3.21 ) 12/13/2011
>>>>>> [242060.604883] task: ffff880103f46150 ti: ffff8801013d4000 task.ti: ffff8801013d4000
>>>>>> [242060.604885] RIP: 0010:[<ffffffff816834ad>]  [<ffffffff816834ad>] page_fault+0xd/0x30
>>
>> The double fault happened during page fault processing.  Could you
>> disassemble your page_fault function to find the offending
>> instruction?
>
> This one is easy:
>
> crash> disassemble page_fault
> Dump of assembler code for function page_fault:
>    0xffffffff816834a0 <+0>:     data32 xchg %ax,%ax
>    0xffffffff816834a3 <+3>:     data32 xchg %ax,%ax
>    0xffffffff816834a6 <+6>:     data32 xchg %ax,%ax
>    0xffffffff816834a9 <+9>:     sub    $0x78,%rsp
>    0xffffffff816834ad <+13>:    callq  0xffffffff81683620 <error_entry>

The callq was the double-faulting instruction, and it is indeed the
first instruction in here that would have accessed the stack.  (The sub
*changes* rsp but isn't a memory access.)  So, since RSP is bogus, we
page fault, and the page fault is promoted to a double fault.  The
surprising thing is that the page fault itself seems to have been
delivered okay, and RSP wasn't on a page boundary.

You wouldn't happen to be using a Broadwell machine?

The only way to get here with bogus RSP is if we interrupted something
that was previously running at CPL0 with similarly bogus RSP.

I don't know if I trust CR2.  It's 16 bytes lower than I'd expect.
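
For reference, the arithmetic behind these observations can be checked with a few lines (Python used only as a calculator; the constants are copied from the double-fault register dump, and the 4 KiB page size is an assumption about this configuration):

```python
# Values copied from the first register dump in the report.
RSP = 0x00007fffa55eafb8   # RSP at the faulting callq
CR2 = 0x00007fffa55eafa8   # faulting address recorded by the CPU
PAGE_SIZE = 4096           # assumed 4 KiB pages

# callq pushes an 8-byte return address, so its write would land here.
push_addr = RSP - 8

rsp_minus_cr2 = RSP - CR2             # gap between RSP and CR2
push_minus_cr2 = push_addr - CR2      # gap between the callq push slot and CR2
rsp_page_offset = RSP % PAGE_SIZE     # where RSP sits inside its page
same_page = RSP // PAGE_SIZE == CR2 // PAGE_SIZE

print(hex(rsp_minus_cr2), hex(push_minus_cr2), hex(rsp_page_offset), same_page)
```

This only restates the numbers: CR2 is 0x10 below RSP and 8 below the slot the callq would write, RSP is at offset 0xfb8 in its page, and both addresses are on the same page. It does not say which value CR2 ought to have had.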

>    0xffffffff816834b2 <+18>:    mov    %rsp,%rdi
>    0xffffffff816834b5 <+21>:    mov    0x78(%rsp),%rsi
>    0xffffffff816834ba <+26>:    movq   $0xffffffffffffffff,0x78(%rsp)
>    0xffffffff816834c3 <+35>:    callq  0xffffffff810504e0 <do_page_fault>
>    0xffffffff816834c8 <+40>:    jmpq   0xffffffff816836d0 <error_exit>
> End of assembler dump.
>
>
>>>>>> [242060.604893] RSP: 0018:00007fffa55eafb8  EFLAGS: 00010016
>>
>> Uh, what?  That RSP is a user address.
>>
>>>>>> [242060.604895] RAX: 000000000000aa40 RBX: 0000000000000001 RCX: ffffffff81682237
>>>>>> [242060.604896] RDX: 000000000000aa40 RSI: 0000000000000000 RDI: 00007fffa55eb078
>>>>>> [242060.604898] RBP: 00007fffa55f1c1c R08: 0000000000000008 R09: 0000000000000000
>>>>>> [242060.604900] R10: 0000000000000000 R11: 0000000000000293 R12: 000000000000004a
>>>>>> [242060.604902] R13: 00007ffa356b5d60 R14: 000000000000000f R15: 00007ffa3556cf20
>>>>>> [242060.604904] FS:  00007ffa33dbfa80(0000) GS:ffff88023bc80000(0000) knlGS:0000000000000000
>>>>>> [242060.604906] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>> [242060.604908] CR2: 00007fffa55eafa8 CR3: 0000000002d7e000 CR4: 00000000000427e0
>>>>>> [242060.604909] Stack:
>>>>>> [242060.604942] BUG: unable to handle kernel paging request at 00007fffa55eafb8
>>>>>> [242060.604995] IP: [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190
>>
>> This is suspicious.  We need to have died, again, of a fatal page
>> fault while dumping the stack.
>
> I posted the same problem to the opensuse kernel list shortly before turning
> to LKML. There, Michal Kubecek noted:
>
> "I encountered a similar problem recently. The thing is, x86
> specification says that on a double fault, RIP and RSP registers are
> undefined, i.e. you not only can't expect them to contain values
> corresponding to the first or second fault but you can't even expect
> them to have any usable values at all. Unfortunately the kernel double
> fault handler doesn't take this into account and does try to display
> usual crash related information so that it itself does usually crash
> when trying to show stack content (that's the show_stack_log_lvl()
> crash).

I think that's not entirely true.  RIP is reliable for many classes of
double faults, and we rely on that for espfix64.  The fact that hpa
was willing to write that code strongly suggests that Intel chips at
least really do work that way.

>
> The result is a double fault (which itself would be very hard to debug)
> followed by a crash in its handler so that analysing the outcome is
> extremely difficult."
>
> I cannot judge if this is true, but it sounded related to solving the
> problem to me.


The crash in the handler is a separate bug.



>
>>>>>> [242060.605036] PGD 4779a067 PUD 40e3e067 PMD 4769e067 PTE 0
>>>>>> [242060.605078] Oops: 0000 [#1] PREEMPT SMP
>>>>>> [242060.605106] Modules linked in: vhost_net vhost macvtap macvlan nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache nls_iso8859_1 nls_cp437 vfat fat ppp_deflate bsd_comp ppp_async crc_ccitt ppp_generic slhc ses enclosure uas usb_storage cmac algif_hash ctr ccm rfcomm fuse xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT xt_tcpudp tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet bnep dm_crypt ecb cbc algif_skcipher af_alg xfs libcrc32c snd_hda_codec_conexant snd_hda_codec_generic iTCO_wdt iTCO_vendor_support snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm_oss snd_pcm
>>>>>> [242060.605396]  dm_mod snd_seq snd_seq_device snd_timer coretemp kvm_intel kvm snd_mixer_oss cdc_ether cdc_wdm cdc_acm usbnet mii arc4 uvcvideo videobuf2_vmalloc videobuf2_memops thinkpad_acpi videobuf2_core btusb v4l2_common videodev i2c_i801 iwldvm bluetooth serio_raw mac80211 pcspkr e1000e iwlwifi snd lpc_ich mei_me ptp mfd_core pps_core mei cfg80211 shpchp wmi soundcore rfkill battery ac tpm_tis tpm acpi_cpufreq i915 xhci_pci xhci_hcd i2c_algo_bit drm_kms_helper drm thermal video button processor sg loop
>>>>>> [242060.605396] CPU: 1 PID: 2132 Comm: qemu-system-x86 Tainted: G        W       4.0.0-rc3-2.gd5c547f-desktop #1
>>>>>> [242060.605396] Hardware name: LENOVO 74665EG/74665EG, BIOS 6DET71WW (3.21 ) 12/13/2011
>>>>>> [242060.605396] task: ffff880103f46150 ti: ffff8801013d4000 task.ti: ffff8801013d4000
>>>>>> [242060.605396] RIP: 0010:[<ffffffff81005b44>]  [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190
>>>>>> [242060.605396] RSP: 0018:ffff88023bc84e88  EFLAGS: 00010046
>>>>>> [242060.605396] RAX: 00007fffa55eafc0 RBX: 00007fffa55eafb8 RCX: ffff88023bc7ffc0
>>>>>> [242060.605396] RDX: 0000000000000000 RSI: ffff88023bc84f58 RDI: 0000000000000000
>>>>>> [242060.605396] RBP: ffff88023bc83fc0 R08: ffffffff81a2fe15 R09: 0000000000000020
>>>>>> [242060.605396] R10: 0000000000000afb R11: ffff88023bc84bee R12: ffff88023bc84f58
>>>>>> [242060.605396] R13: 0000000000000000 R14: ffffffff81a2fe15 R15: 0000000000000000
>>>>>> [242060.605396] FS:  00007ffa33dbfa80(0000) GS:ffff88023bc80000(0000) knlGS:0000000000000000
>>>>>> [242060.605396] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>> [242060.605396] CR2: 00007fffa55eafb8 CR3: 0000000002d7e000 CR4: 00000000000427e0
>>>>>> [242060.605396] Stack:
>>>>>> [242060.605396]  0000000002d7e000 0000000000000008 ffff88023bc84ee8 00007fffa55eafb8
>>>>>> [242060.605396]  0000000000000000 ffff88023bc84f58 00007fffa55eafb8 0000000000000040
>>>>>> [242060.605396]  00007ffa356b5d60 000000000000000f 00007ffa3556cf20 ffffffff81005c36
>>>>>> [242060.605396] Call Trace:
>>>>>> [242060.605396]  [<ffffffff81005c36>] show_regs+0x86/0x210
>>>>>> [242060.605396]  [<ffffffff8104636f>] df_debug+0x1f/0x30
>>>>>> [242060.605396]  [<ffffffff810041a4>] do_double_fault+0x84/0x100
>>>>>> [242060.605396]  [<ffffffff81683088>] double_fault+0x28/0x30
>>>>>> [242060.605396]  [<ffffffff816834ad>] page_fault+0xd/0x30
>>>>>> [242060.605396] Code: fe a2 81 31 c0 89 54 24 08 48 89 0c 24 48 8b 5b f8 e8 cc 06 67 00 48 8b 0c 24 8b 54 24 08 85 d2 74 05 f6 c2 03 74 48 48 8d 43 08 <48> 8b 33 48 c7 c7 0d fe a2 81 89 54 24 14 48 89 4c 24 08 48 89
>>>>>> [242060.605396] RIP  [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190
>>>>>> [242060.605396]  RSP <ffff88023bc84e88>
>>>>>> [242060.605396] CR2: 00007fffa55eafb8
>>>>>>
>>>>>> I would not totally rule out a hardware problem, since this machine had
>>>>>> another weird crash where it crashed and the bios beeper was constant
>>>>>> on until I hit the power button for 5 seconds.
>>>>>>
>>>>>> Unfortunately, I cannot load the crashdump with the crash version in
>>>>>> openSUSE Tumbleweed, so the backtrace is all I have for now.
>>>>>
>>>>> Just "me too", I'm getting the very same crash out of sudden with the
>>>>> recent 4.0-rc.  Judging from the very same pattern (usually crash
>>>>> happened while using KVM (-smp 4) and kernel builds with -j8), I don't
>>>>> think it's a hardware problem.
>>>>
>>>> The git bisection pointed to the commit:
>>>> commit b926e6f61a26036ee9eabe6761483954d481ad25
>>>>     x86, traps: Fix ist_enter from userspace
>>>>
>>>> And reverting this on top of the latest Linus tree seems working.
>>>> Seife, could you verify on your machine, too?
>>>
>>> Argh, false positive.  Right after I wrote this mail, I got the very
>>> same crash.  I seem to need running the test much longer than I
>>> thought.
>>>
>>> But somehow the commits around the above smell suspicious...
>>>
>>
>> Those commits shouldn't really have affected page fault or double
>> fault behavior.  They made big changes to MCE, breakpoints, and debug
>> exceptions.
>>
>> Something's very wrong here.  I'm guessing that we somehow ended up in
>> page_fault in a completely invalid context.
>>
>> One hairy code path that could plausibly do this is:
>>
>> 1. syscall
>>
>> 2. vmalloc fault accessing old_rsp aka rsp_scratch (or kernel_stack --
>> same issue)
>>
>> 3. page fault.  Now we're on the user stack and we do awful things.
>> If we run off the end of the presently writable portion of the stack,
>> we double fault.
>
> Maybe Michal's idea from above points in the right direction?
>
> Now since I have a crash dump and the corresponding debuginfo at hand,
> might this help somehow to find out where the problem originated from?
>
> I mean -- it's only 508 processes to look at :-) but if I knew what to
> look for in their backtraces, I would try to do it.

The relevant thread's stack is here (see ti in the trace):

ffff8801013d4000

It could be interesting to see what's there.

I don't suppose you want to try to walk the paging structures to see
if ffff88023bc80000 (i.e. gsbase) and, more specifically,
ffff88023bc80000 + old_rsp and ffff88023bc80000 + kernel_stack are
present?  You'd only have to walk one level -- presumably, if the PGD
entry is there, the rest of the entries are okay, too.
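
A minimal sketch of where to look, assuming standard x86-64 4-level paging (9 index bits per level, 4 KiB pages, 8-byte entries). The gsbase and CR3 values are taken from the dump; the function name is made up for the example, and the per-cpu offsets of old_rsp and kernel_stack are not known here, so only gsbase itself is decoded.

```python
def pt_indices(vaddr):
    """Split an x86-64 virtual address into 4-level page-table indices."""
    return {
        "pgd": (vaddr >> 39) & 0x1FF,
        "pud": (vaddr >> 30) & 0x1FF,
        "pmd": (vaddr >> 21) & 0x1FF,
        "pte": (vaddr >> 12) & 0x1FF,
    }


GSBASE = 0xffff88023bc80000   # GS base from the register dump
CR3 = 0x0000000002d7e000      # CR3 from the register dump

idx = pt_indices(GSBASE)

# Physical address of the PGD entry covering gsbase:
# page-aligned CR3 plus 8 bytes per entry.
pgd_entry_paddr = (CR3 & ~0xFFF) + 8 * idx["pgd"]

print(idx)
print(hex(pgd_entry_paddr))
```

Reading the 8-byte value at that physical address in the dump and checking its present bit (bit 0) would answer the one-level question above.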


--Andy


* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 19:26           ` Andy Lutomirski
@ 2015-03-18 20:05             ` Stefan Seyfried
  2015-03-18 20:51               ` Andy Lutomirski
  2015-03-18 20:06             ` Denys Vlasenko
                               ` (2 subsequent siblings)
  3 siblings, 1 reply; 77+ messages in thread
From: Stefan Seyfried @ 2015-03-18 20:05 UTC (permalink / raw)
  To: Andy Lutomirski, Linus Torvalds
  Cc: Takashi Iwai, Denys Vlasenko, X86 ML, LKML, Tejun Heo

Hi Andy,

On 18.03.2015 at 20:26, Andy Lutomirski wrote:
> Hi Linus-
> 
> You seem to enjoy debugging these things.  Want to give this a shot?
> My guess is a vmalloc fault accessing either old_rsp or kernel_stack
> right after swapgs in syscall entry.
> 
> On Wed, Mar 18, 2015 at 12:03 PM, Stefan Seyfried
> <stefan.seyfried@googlemail.com> wrote:
>> Hi all,
>>
>> first, I'm kind of happy that I'm not the only one seeing this, and
>> thus my beloved Thinkpad can stay for a bit longer... :-)
>>
>> Then, I'm mostly an amateur when it comes to kernel debugging, so bear
>> with me when I'm stumbling through the code...
>>
>> On 18.03.2015 at 19:03, Andy Lutomirski wrote:
>>> On Wed, Mar 18, 2015 at 10:46 AM, Takashi Iwai <tiwai@suse.de> wrote:
>>>> At Wed, 18 Mar 2015 18:43:52 +0100,
>>>> Takashi Iwai wrote:
>>>>>
>>>>> At Wed, 18 Mar 2015 15:16:42 +0100,
>>>>> Takashi Iwai wrote:
>>>>>>
>>>>>> At Sun, 15 Mar 2015 09:17:15 +0100,
>>>>>> Stefan Seyfried wrote:
>>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> in 4.0-rc I have recently seen a few crashes, always when running
>>>>>>> KVM guests (IIRC). Today I was able to capture a crash dump, this
>>>>>>> is the backtrace from dmesg.txt:
>>>>>>>
>>>>>>> [242060.604870] PANIC: double fault, error_code: 0x0
>>>
>>> OK, we double faulted.  Too bad that x86 CPUs don't tell us why.
>>>
>>>>>>> [242060.604878] CPU: 1 PID: 2132 Comm: qemu-system-x86 Tainted: G        W       4.0.0-rc3-2.gd5c547f-desktop #1
>>>>>>> [242060.604880] Hardware name: LENOVO 74665EG/74665EG, BIOS 6DET71WW (3.21 ) 12/13/2011
>>>>>>> [242060.604883] task: ffff880103f46150 ti: ffff8801013d4000 task.ti: ffff8801013d4000
>>>>>>> [242060.604885] RIP: 0010:[<ffffffff816834ad>]  [<ffffffff816834ad>] page_fault+0xd/0x30
>>>
>>> The double fault happened during page fault processing.  Could you
>>> disassemble your page_fault function to find the offending
>>> instruction?
>>
>> This one is easy:
>>
>> crash> disassemble page_fault
>> Dump of assembler code for function page_fault:
>>    0xffffffff816834a0 <+0>:     data32 xchg %ax,%ax
>>    0xffffffff816834a3 <+3>:     data32 xchg %ax,%ax
>>    0xffffffff816834a6 <+6>:     data32 xchg %ax,%ax
>>    0xffffffff816834a9 <+9>:     sub    $0x78,%rsp
>>    0xffffffff816834ad <+13>:    callq  0xffffffff81683620 <error_entry>
> 
> The callq was the double-faulting instruction, and it is indeed the
> first instruction in here that would have accessed the stack.  (The sub
> *changes* rsp but isn't a memory access.)  So, since RSP is bogus, we
> page fault, and the page fault is promoted to a double fault.  The
> surprising thing is that the page fault itself seems to have been
> delivered okay, and RSP wasn't on a page boundary.
> 
> You wouldn't happen to be using a Broadwell machine?

No, this is quite an old Thinkpad X200s with a Core 2 Duo:
processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 23
model name      : Intel(R) Core(TM)2 Duo CPU     L9400  @ 1.86GHz
stepping        : 10
microcode       : 0xa0c

> The only way to get here with bogus RSP is if we interrupted something
> that was previously running at CPL0 with similarly bogus RSP.
> 
> I don't know if I trust CR2.  It's 16 bytes lower than I'd expect.
> 
>>    0xffffffff816834b2 <+18>:    mov    %rsp,%rdi
>>    0xffffffff816834b5 <+21>:    mov    0x78(%rsp),%rsi
>>    0xffffffff816834ba <+26>:    movq   $0xffffffffffffffff,0x78(%rsp)
>>    0xffffffff816834c3 <+35>:    callq  0xffffffff810504e0 <do_page_fault>
>>    0xffffffff816834c8 <+40>:    jmpq   0xffffffff816836d0 <error_exit>
>> End of assembler dump.
>>
>>
>>>>>>> [242060.604893] RSP: 0018:00007fffa55eafb8  EFLAGS: 00010016
>>>
>>> Uh, what?  That RSP is a user address.
>>>
>>>>>>> [242060.604895] RAX: 000000000000aa40 RBX: 0000000000000001 RCX: ffffffff81682237
>>>>>>> [242060.604896] RDX: 000000000000aa40 RSI: 0000000000000000 RDI: 00007fffa55eb078
>>>>>>> [242060.604898] RBP: 00007fffa55f1c1c R08: 0000000000000008 R09: 0000000000000000
>>>>>>> [242060.604900] R10: 0000000000000000 R11: 0000000000000293 R12: 000000000000004a
>>>>>>> [242060.604902] R13: 00007ffa356b5d60 R14: 000000000000000f R15: 00007ffa3556cf20
>>>>>>> [242060.604904] FS:  00007ffa33dbfa80(0000) GS:ffff88023bc80000(0000) knlGS:0000000000000000
>>>>>>> [242060.604906] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>>> [242060.604908] CR2: 00007fffa55eafa8 CR3: 0000000002d7e000 CR4: 00000000000427e0
>>>>>>> [242060.604909] Stack:
>>>>>>> [242060.604942] BUG: unable to handle kernel paging request at 00007fffa55eafb8
>>>>>>> [242060.604995] IP: [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190
>>>
>>> This is suspicious.  We need to have died, again, of a fatal page
>>> fault while dumping the stack.
>>
>> I posted the same problem to the opensuse kernel list shortly before turning
>> to LKML. There, Michal Kubecek noted:
>>
>> "I encountered a similar problem recently. The thing is, x86
>> specification says that on a double fault, RIP and RSP registers are
>> undefined, i.e. you not only can't expect them to contain values
>> corresponding to the first or second fault but you can't even expect
>> them to have any usable values at all. Unfortunately the kernel double
>> fault handler doesn't take this into account and does try to display
>> usual crash related information so that it itself does usually crash
>> when trying to show stack content (that's the show_stack_log_lvl()
>> crash).
> 
> I think that's not entirely true.  RIP is reliable for many classes of
> double faults, and we rely on that for espfix64.  The fact that hpa
> was willing to write that code strongly suggests that Intel chips at
> least really do work that way.
> 
>>
>> The result is a double fault (which itself would be very hard to debug)
>> followed by a crash in its handler so that analysing the outcome is
>> extremely difficult."
>>
>> I cannot judge if this is true, but it sounded related to solving the
>> problem to me.
> 
> 
> The crash in the handler is a separate bug.
> 
> 
> 
>>
>>>>>>> [242060.605036] PGD 4779a067 PUD 40e3e067 PMD 4769e067 PTE 0
>>>>>>> [242060.605078] Oops: 0000 [#1] PREEMPT SMP
>>>>>>> [242060.605106] Modules linked in: vhost_net vhost macvtap macvlan nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache nls_iso8859_1 nls_cp437 vfat fat ppp_deflate bsd_comp ppp_async crc_ccitt ppp_generic slhc ses enclosure uas usb_storage cmac algif_hash ctr ccm rfcomm fuse xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT xt_tcpudp tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet bnep dm_crypt ecb cbc algif_skcipher af_alg xfs libcrc32c snd_hda_codec_conexant snd_hda_codec_generic iTCO_wdt iTCO_vendor_support snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm_oss snd_pcm
>>>>>>> [242060.605396]  dm_mod snd_seq snd_seq_device snd_timer coretemp kvm_intel kvm snd_mixer_oss cdc_ether cdc_wdm cdc_acm usbnet mii arc4 uvcvideo videobuf2_vmalloc videobuf2_memops thinkpad_acpi videobuf2_core btusb v4l2_common videodev i2c_i801 iwldvm bluetooth serio_raw mac80211 pcspkr e1000e iwlwifi snd lpc_ich mei_me ptp mfd_core pps_core mei cfg80211 shpchp wmi soundcore rfkill battery ac tpm_tis tpm acpi_cpufreq i915 xhci_pci xhci_hcd i2c_algo_bit drm_kms_helper drm thermal video button processor sg loop
>>>>>>> [242060.605396] CPU: 1 PID: 2132 Comm: qemu-system-x86 Tainted: G        W       4.0.0-rc3-2.gd5c547f-desktop #1
>>>>>>> [242060.605396] Hardware name: LENOVO 74665EG/74665EG, BIOS 6DET71WW (3.21 ) 12/13/2011
>>>>>>> [242060.605396] task: ffff880103f46150 ti: ffff8801013d4000 task.ti: ffff8801013d4000
>>>>>>> [242060.605396] RIP: 0010:[<ffffffff81005b44>]  [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190
>>>>>>> [242060.605396] RSP: 0018:ffff88023bc84e88  EFLAGS: 00010046
>>>>>>> [242060.605396] RAX: 00007fffa55eafc0 RBX: 00007fffa55eafb8 RCX: ffff88023bc7ffc0
>>>>>>> [242060.605396] RDX: 0000000000000000 RSI: ffff88023bc84f58 RDI: 0000000000000000
>>>>>>> [242060.605396] RBP: ffff88023bc83fc0 R08: ffffffff81a2fe15 R09: 0000000000000020
>>>>>>> [242060.605396] R10: 0000000000000afb R11: ffff88023bc84bee R12: ffff88023bc84f58
>>>>>>> [242060.605396] R13: 0000000000000000 R14: ffffffff81a2fe15 R15: 0000000000000000
>>>>>>> [242060.605396] FS:  00007ffa33dbfa80(0000) GS:ffff88023bc80000(0000) knlGS:0000000000000000
>>>>>>> [242060.605396] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>>> [242060.605396] CR2: 00007fffa55eafb8 CR3: 0000000002d7e000 CR4: 00000000000427e0
>>>>>>> [242060.605396] Stack:
>>>>>>> [242060.605396]  0000000002d7e000 0000000000000008 ffff88023bc84ee8 00007fffa55eafb8
>>>>>>> [242060.605396]  0000000000000000 ffff88023bc84f58 00007fffa55eafb8 0000000000000040
>>>>>>> [242060.605396]  00007ffa356b5d60 000000000000000f 00007ffa3556cf20 ffffffff81005c36
>>>>>>> [242060.605396] Call Trace:
>>>>>>> [242060.605396]  [<ffffffff81005c36>] show_regs+0x86/0x210
>>>>>>> [242060.605396]  [<ffffffff8104636f>] df_debug+0x1f/0x30
>>>>>>> [242060.605396]  [<ffffffff810041a4>] do_double_fault+0x84/0x100
>>>>>>> [242060.605396]  [<ffffffff81683088>] double_fault+0x28/0x30
>>>>>>> [242060.605396]  [<ffffffff816834ad>] page_fault+0xd/0x30
>>>>>>> [242060.605396] Code: fe a2 81 31 c0 89 54 24 08 48 89 0c 24 48 8b 5b f8 e8 cc 06 67 00 48 8b 0c 24 8b 54 24 08 85 d2 74 05 f6 c2 03 74 48 48 8d 43 08 <48> 8b 33 48 c7 c7 0d fe a2 81 89 54 24 14 48 89 4c 24 08 48 89
>>>>>>> [242060.605396] RIP  [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190
>>>>>>> [242060.605396]  RSP <ffff88023bc84e88>
>>>>>>> [242060.605396] CR2: 00007fffa55eafb8
>>>>>>>
>>>>>>> I would not totally rule out a hardware problem, since this machine had
>>>>>>> another weird crash where it crashed and the bios beeper was constant
>>>>>>> on until I hit the power button for 5 seconds.
>>>>>>>
>>>>>>> Unfortunately, I cannot load the crashdump with the crash version in
>>>>>>> openSUSE Tumbleweed, so the backtrace is all I have for now.
>>>>>>
>>>>>> Just "me too", I'm getting the very same crash out of sudden with the
>>>>>> recent 4.0-rc.  Judging from the very same pattern (usually crash
>>>>>> happened while using KVM (-smp 4) and kernel builds with -j8), I don't
>>>>>> think it's a hardware problem.
>>>>>
>>>>> The git bisection pointed to the commit:
>>>>> commit b926e6f61a26036ee9eabe6761483954d481ad25
>>>>>     x86, traps: Fix ist_enter from userspace
>>>>>
>>>>> And reverting this on top of the latest Linus tree seems working.
>>>>> Seife, could you verify on your machine, too?
>>>>
>>>> Argh, false positive.  Right after I wrote this mail, I got the very
>>>> same crash.  I seem to need running the test much longer than I
>>>> thought.
>>>>
>>>> But somehow the commits around the above smell suspicious...
>>>>
>>>
>>> Those commits shouldn't really have affected page fault or double
>>> fault behavior.  They made big changes to MCE, breakpoints, and debug
>>> exceptions.
>>>
>>> Something's very wrong here.  I'm guessing that we somehow ended up in
>>> page_fault in a completely invalid context.
>>>
>>> One hairy code path that could plausibly do this is:
>>>
>>> 1. syscall
>>>
>>> 2. vmalloc fault accessing old_rsp aka rsp_scratch (or kernel_stack --
>>> same issue)
>>>
>>> 3. page fault.  Now we're on the user stack and we do awful things.
>>> If we run off the end of the presently writable portion of the stack,
>>> we double fault.
>>
>> Maybe Michal's idea from above points in the right direction?
>>
>> Now since I have a crash dump and the corresponding debuginfo at hand,
>> might this help somehow to find out where the problem originated from?
>>
>> I mean -- it's only 508 processes to look at :-) but if I knew what to
>> look for in their backtraces, I would try to do it.
> 
> The relevant thread's stack is here (see ti in the trace):
> 
> ffff8801013d4000
> 
> It could be interesting to see what's there.
> 
> I don't suppose you want to try to walk the paging structures to see
> if ffff88023bc80000 (i.e. gsbase) and, more specifically,
> ffff88023bc80000 + old_rsp and ffff88023bc80000 + kernel_stack are
> present?  You'd only have to walk one level -- presumably, if the PGD
> entry is there, the rest of the entries are okay, too.

That's all Greek to me :-)

I see that there is something at ffff88023bc80000:

crash> x /64xg 0xffff88023bc80000
0xffff88023bc80000:     0x0000000000000000      0x0000000000000000
0xffff88023bc80010:     0x0000000000000000      0x0000000000000000
0xffff88023bc80020:     0x0000000000000000      0x000000006686ada9
0xffff88023bc80030:     0x0000000000000000      0x0000000000000000
0xffff88023bc80040:     0x0000000000000000      0x0000000000000000
0xffff88023bc80050:     0x0000000000000000      0x0000000000000000
0xffff88023bc80060:     0x0000000000000000      0x0000000000000000
0xffff88023bc80070:     0x0000000000000000      0x0000000000000000
0xffff88023bc80080:     0x0000000000000000      0x0000000000000000
0xffff88023bc80090:     0x0000000000000000      0x0000000000000000
0xffff88023bc800a0:     0x0000000000000000      0x0000000000000000
0xffff88023bc800b0:     0x0000000000000000      0x0000000000000000
0xffff88023bc800c0:     0x0000000000000000      0x0000000000000000
0xffff88023bc800d0:     0x0000000000000000      0x0000000000000000
0xffff88023bc800e0:     0x0000000000000000      0x0000000000000000
0xffff88023bc800f0:     0x0000000000000000      0x0000000000000000
0xffff88023bc80100:     0x0000000000000000      0x0000000000000000
0xffff88023bc80110:     0x0000000000000000      0x0000000000000000
0xffff88023bc80120:     0x0000000000000000      0x0000000000000000
0xffff88023bc80130:     0x0000000000000000      0x0000000000000000
0xffff88023bc80140:     0x0000000000000000      0x0000000000000000
0xffff88023bc80150:     0x0000000000000000      0x0000000000000000
0xffff88023bc80160:     0x0000000000000000      0x0000000000000000
0xffff88023bc80170:     0x0000000000000000      0x0000000000000000
0xffff88023bc80180:     0x0000000000000000      0x0000000000000000
0xffff88023bc80190:     0x0000000000000000      0x0000000000000000
0xffff88023bc801a0:     0x0000000000000000      0x0000000000000000
0xffff88023bc801b0:     0x0000000000000000      0x0000000000000000
0xffff88023bc801c0:     0x0000000000000000      0x0000000000000000
0xffff88023bc801d0:     0x0000000000000000      0x0000000000000000
0xffff88023bc801e0:     0x0000000000000000      0x0000000000000000
0xffff88023bc801f0:     0x0000000000000000      0x0000000000000000

old_rsp and kernel_stack seem bogus:
crash> print old_rsp
Cannot access memory at address 0xa200
gdb: gdb request failed: print old_rsp
crash> print kernel_stack
Cannot access memory at address 0xaa48
gdb: gdb request failed: print kernel_stack

kernel_stack is not a pointer? So 0xffff88023bc80000 + 0xaa48 it is:

crash> x /64xg 0xffff88023bc8aa00
0xffff88023bc8aa00:     0x0000000000000000      0x0000000000000000
0xffff88023bc8aa10:     0x0000000000000000      0xffff880103f46150
0xffff88023bc8aa20:     0xffffffff80010001      0xffff88023bc83fc0
0xffff88023bc8aa30:     0x0000000000000000      0x0000000000000000
0xffff88023bc8aa40:     0xffff880103f46150      0xffff8801013d7fd8
0xffff88023bc8aa50:     0x0000000000000000      0x0000000000000000
0xffff88023bc8aa60:     0x0000000000000000      0x0000000000000000
0xffff88023bc8aa70:     0x0000000000000000      0x0000000000000000
0xffff88023bc8aa80:     0x0000000000000000      0x0000000000000000
0xffff88023bc8aa90:     0x0000000000000000      0x0000000000000000
0xffff88023bc8aaa0:     0x0000000000000000      0x0000000000000000
0xffff88023bc8aab0:     0x0000000000000000      0x0000000000000000
0xffff88023bc8aac0:     0x0000000000000000      0x0000000000000000
0xffff88023bc8aad0:     0x0000000000000000      0x0000000000000000
0xffff88023bc8aae0:     0x0000000000000000      0x0000000000000000
0xffff88023bc8aaf0:     0x0000000000000000      0x0000000000000000
0xffff88023bc8ab00:     0x0000000000000000      0x0000000000000000
0xffff88023bc8ab10:     0x0000000000000000      0x0000000000000000
0xffff88023bc8ab20:     0x0000000000000000      0x0000000000000000
0xffff88023bc8ab30:     0x0000000000000000      0x0000000000000000
0xffff88023bc8ab40:     0x0000000000000000      0x0000000000000000
0xffff88023bc8ab50:     0x0000000000000000      0x0000000000000000
0xffff88023bc8ab60:     0x0000000000000000      0x000000007fffffff
0xffff88023bc8ab70:     0x0000000000000000      0x0000000000000000
0xffff88023bc8ab80:     0x0000000000000000      0x0000000000000000
0xffff88023bc8ab90:     0x0000000000000000      0x0000000000000000
0xffff88023bc8aba0:     0x0000000000000000      0x0000000000000000
0xffff88023bc8abb0:     0x0000000000000000      0x0000000000000000
0xffff88023bc8abc0:     0x0000000000000000      0x0000000000000000
0xffff88023bc8abd0:     0x0000000000000000      0x0000000000000000
0xffff88023bc8abe0:     0x0000000000000000      0x0000000000000000
0xffff88023bc8abf0:     0x0000000000000000      0x0000000000000000

and for old_rsp:
crash> x /64xg 0xffff88023bc8a200
0xffff88023bc8a200:     0x00007fffa55f1b98      0x0000000000000000
0xffff88023bc8a210:     0x0000000000000000      0x0000000000000000
0xffff88023bc8a220:     0x0000000000000000      0x0000000000000000
0xffff88023bc8a230:     0x0000000000003ce6      0x0000000000000000
0xffff88023bc8a240:     0xffffffffffffffff      0xffffffffffffffff
0xffff88023bc8a250:     0xffffffffffffffff      0xffffffffffffffff
0xffff88023bc8a260:     0xffffffffffffffff      0xffffffffffffffff
0xffff88023bc8a270:     0xffffffffffffffff      0xffffffffffffffff
0xffff88023bc8a280:     0xffffffffffffffff      0xffffffffffffffff
0xffff88023bc8a290:     0xffffffffffffffff      0xffffffffffffffff
0xffff88023bc8a2a0:     0xffffffffffffffff      0xffffffffffffffff
0xffff88023bc8a2b0:     0xffffffffffffffff      0xffffffffffffffff
0xffff88023bc8a2c0:     0xffffffffffffffff      0xffffffff0000001b
0xffff88023bc8a2d0:     0xffffffff00000023      0xffffffffffffffff
0xffff88023bc8a2e0:     0xffffffffffffffff      0xffffffffffffffff
0xffff88023bc8a2f0:     0xffffffffffffffff      0xffffffffffffffff
0xffff88023bc8a300:     0x0000000100000000      0x0000000300000002
0xffff88023bc8a310:     0x0000000500000004      0x0000000700000006
0xffff88023bc8a320:     0x0000000900000008      0x0000000b0000000a
0xffff88023bc8a330:     0x0000000d0000000c      0x0000000f0000000e
0xffff88023bc8a340:     0x00000014ffffffff      0xffffffffffffffff
0xffff88023bc8a350:     0xffffffffffffffff      0xffffffffffffffff
0xffff88023bc8a360:     0xffffffffffffffff      0xffffffffffffffff
0xffff88023bc8a370:     0xffffffffffffffff      0xffffffffffffffff
0xffff88023bc8a380:     0x00000015ffffffff      0xffffffffffffffff
0xffff88023bc8a390:     0xffffffff0000001e      0xffffffffffffffff
0xffff88023bc8a3a0:     0xffffffffffffffff      0xffffffffffffffff
0xffff88023bc8a3b0:     0xffffffffffffffff      0xffffffffffffffff
0xffff88023bc8a3c0:     0x00000016ffffffff      0xffffffffffffffff
0xffff88023bc8a3d0:     0xffffffffffffffff      0xffffffffffffffff
0xffff88023bc8a3e0:     0xffffffffffffffff      0xffffffffffffffff
0xffff88023bc8a3f0:     0xffffffffffffffff      0xffffffffffffffff

But it's entirely possible I'm using the tool wrong.

What type of data is supposed to be at those places? I tried
crash> print *(pgtable_t) 0xffff88023bc8a200
but I cannot judge if the output is useful :-(
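
For reference, a minimal sketch of the address arithmetic used above. It assumes (as the gdb errors suggest) that 0xa200 and 0xaa48 are the per-CPU offsets of old_rsp and kernel_stack, and that ffff88023bc80000 is this CPU's gsbase. The helper names are made up for illustration; the values are copied from the oops and from the x /64xg output.

```python
# Per-CPU variables are addressed as gsbase + offset.
GSBASE = 0xffff88023bc80000        # GS base from the oops
OLD_RSP_PERCPU_OFF = 0xa200        # address gdb could not read for old_rsp
KERNEL_STACK_PERCPU_OFF = 0xaa48   # address gdb could not read for kernel_stack
THREAD_INFO = 0xffff8801013d4000   # "ti" from the oops


def percpu_addr(gsbase: int, offset: int) -> int:
    """Hypothetical helper: virtual address of a per-CPU variable."""
    return (gsbase + offset) & 0xffffffffffffffff


def is_user_address(addr: int) -> bool:
    """With 48-bit virtual addresses, the lower canonical half is user space."""
    return addr < 0x0000800000000000


old_rsp_addr = percpu_addr(GSBASE, OLD_RSP_PERCPU_OFF)
kernel_stack_addr = percpu_addr(GSBASE, KERNEL_STACK_PERCPU_OFF)

# Quadwords read from the dumps at those two addresses.
old_rsp_value = 0x00007fffa55f1b98       # first quadword of the ...8a200 row
kernel_stack_value = 0xffff8801013d7fd8  # second quadword of the ...8aa40 row

print(hex(old_rsp_addr), hex(kernel_stack_addr))
print("old_rsp holds a user address:", is_user_address(old_rsp_value))
print("kernel_stack - ti =", hex(kernel_stack_value - THREAD_INFO))
```

Read this way, the quadword at the old_rsp slot is a user-space address, and the one at the kernel_stack slot points 0x3fd8 bytes past the ti address from the trace. Whether those values are what they should be is a separate question.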

Best regards,

	Stefan
-- 
Stefan Seyfried
Linux Consultant & Developer -- GPG Key: 0x731B665B

B1 Systems GmbH
Osterfeldstraße 7 / 85088 Vohburg / http://www.b1-systems.de
GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt,HRB 3537

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 19:26           ` Andy Lutomirski
  2015-03-18 20:05             ` Stefan Seyfried
@ 2015-03-18 20:06             ` Denys Vlasenko
  2015-03-18 20:49               ` Andy Lutomirski
  2015-03-18 21:32             ` Linus Torvalds
  2015-03-28 23:57             ` Maciej W. Rozycki
  3 siblings, 1 reply; 77+ messages in thread
From: Denys Vlasenko @ 2015-03-18 20:06 UTC (permalink / raw)
  To: Andy Lutomirski, Stefan Seyfried, Linus Torvalds
  Cc: Takashi Iwai, X86 ML, LKML, Tejun Heo

On 03/18/2015 08:26 PM, Andy Lutomirski wrote:
> Hi Linus-
> 
> You seem to enjoy debugging these things.  Want to give this a shot?
> My guess is a vmalloc fault accessing either old_rsp or kernel_stack
> right after swapgs in syscall entry.

The code is:

ENTRY(system_call)
        SWAPGS_UNSAFE_STACK
GLOBAL(system_call_after_swapgs)
        movq    %rsp,PER_CPU_VAR(rsp_scratch)
        movq    PER_CPU_VAR(kernel_stack),%rsp

If a PER_CPU_VAR(var) memory access can page fault
(I thought this was guaranteed never to fault),
then a page fault on either of these two instructions
will be fatal: we will still have the userspace %rsp.

I thought we could only get an NMI or debug interrupt here,
and both are set up to use IST stacks
to prevent this scenario (among other reasons).

-- 
vda

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 20:06             ` Denys Vlasenko
@ 2015-03-18 20:49               ` Andy Lutomirski
  2015-03-18 21:06                 ` Denys Vlasenko
  0 siblings, 1 reply; 77+ messages in thread
From: Andy Lutomirski @ 2015-03-18 20:49 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Stefan Seyfried, Linus Torvalds, Takashi Iwai, X86 ML, LKML, Tejun Heo

On Wed, Mar 18, 2015 at 1:06 PM, Denys Vlasenko <dvlasenk@redhat.com> wrote:
> On 03/18/2015 08:26 PM, Andy Lutomirski wrote:
>> Hi Linus-
>>
>> You seem to enjoy debugging these things.  Want to give this a shot?
>> My guess is a vmalloc fault accessing either old_rsp or kernel_stack
>> right after swapgs in syscall entry.
>
> The code is:
>
> ENTRY(system_call)
>         SWAPGS_UNSAFE_STACK
> GLOBAL(system_call_after_swapgs)
>         movq    %rsp,PER_CPU_VAR(rsp_scratch)
>         movq    PER_CPU_VAR(kernel_stack),%rsp
>
> If PER_CPU_VAR(var) memory access can page fault
> (I was thinking this is ensured to never fault),
> then on these two instructions such page fault
> will be fatal: we will still have userspace %rsp.
>
> I thought we can only get a NMI or debug interrupt here,
> and they are both set up to use IST stacks
> to prevent this scenario (among other reasons).

I don't think that #DB is possible -- we should never have a
watchpoint on percpu memory like that (unless we're using kgdb, in
which case I think that kgdb should be fixed).

On the other hand, we can and do take page faults on percpu memory,
because percpu lives in vmap space and we lazily populate PGD entries
in per-mm PGDs.  (That is, when we allocate a kernel PGD entry, we
populate it in init_mm's pgd, but we don't proactively copy it during
context switches.)

But the affected system is a laptop, so there shouldn't be CPU hotplug
or enough memory for this to happen.  Confused.
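
To make the "one PGD entry per mm" point concrete, here is a rough sketch of which top-level slot covers the gsbase address from the oops. It assumes 4-level paging with 48-bit virtual addresses and 512 eight-byte entries per table. pgd_index and pgd_entry_phys are illustrative Python helpers mirroring the usual shift-and-mask, not kernel code, and CR3 is taken from the trace.

```python
PGDIR_SHIFT = 39     # each PGD entry covers 2**39 bytes (512 GiB)
PTRS_PER_PGD = 512   # entries in one top-level table
ENTRY_SIZE = 8       # bytes per entry


def pgd_index(vaddr: int) -> int:
    """Index of the top-level entry that covers vaddr."""
    return (vaddr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1)


def pgd_entry_phys(cr3: int, vaddr: int) -> int:
    """Physical address of that entry, given the table base in CR3."""
    pgd_base = cr3 & ~0xfff          # drop the low control bits of CR3
    return pgd_base + pgd_index(vaddr) * ENTRY_SIZE


GSBASE = 0xffff88023bc80000
USER_RSP = 0x00007fffa55eafb8
CR3 = 0x0000000002d7e000

print("gsbase PGD slot:", pgd_index(GSBASE))
print("user RSP PGD slot:", pgd_index(USER_RSP))
print("entry to inspect:", hex(pgd_entry_phys(CR3, GSBASE)))
```

If the slot for gsbase were empty in the mm that was active at syscall entry, the first percpu access after swapgs would fault. Whether it actually is empty can only be answered from the dump.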

--Andy

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 20:05             ` Stefan Seyfried
@ 2015-03-18 20:51               ` Andy Lutomirski
  2015-03-18 21:12                 ` Stefan Seyfried
  0 siblings, 1 reply; 77+ messages in thread
From: Andy Lutomirski @ 2015-03-18 20:51 UTC (permalink / raw)
  To: Stefan Seyfried
  Cc: Linus Torvalds, Takashi Iwai, Denys Vlasenko, X86 ML, LKML, Tejun Heo

On Wed, Mar 18, 2015 at 1:05 PM, Stefan Seyfried
<stefan.seyfried@googlemail.com> wrote:
> Hi Andy,
>
> Am 18.03.2015 um 20:26 schrieb Andy Lutomirski:
>> Hi Linus-
>>
>> You seem to enjoy debugging these things.  Want to give this a shot?
>> My guess is a vmalloc fault accessing either old_rsp or kernel_stack
>> right after swapgs in syscall entry.
>>
>> On Wed, Mar 18, 2015 at 12:03 PM, Stefan Seyfried
>> <stefan.seyfried@googlemail.com> wrote:
>>> Hi all,
>>>
>>> first, I'm kind of happy that I'm not the only one seeing this, and
>>> thus my beloved Thinkpad can stay for a bit longer... :-)
>>>
>>> Then, I'm mostly an amateur when it comes to kernel debugging, so bear
>>> with me when I'm stumbling through the code...
>>>
>>> Am 18.03.2015 um 19:03 schrieb Andy Lutomirski:
>>>> On Wed, Mar 18, 2015 at 10:46 AM, Takashi Iwai <tiwai@suse.de> wrote:
>>>>> At Wed, 18 Mar 2015 18:43:52 +0100,
>>>>> Takashi Iwai wrote:
>>>>>>
>>>>>> At Wed, 18 Mar 2015 15:16:42 +0100,
>>>>>> Takashi Iwai wrote:
>>>>>>>
>>>>>>> At Sun, 15 Mar 2015 09:17:15 +0100,
>>>>>>> Stefan Seyfried wrote:
>>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> in 4.0-rc I have recently seen a few crashes, always when running
>>>>>>>> KVM guests (IIRC). Today I was able to capture a crash dump, this
>>>>>>>> is the backtrace from dmesg.txt:
>>>>>>>>
>>>>>>>> [242060.604870] PANIC: double fault, error_code: 0x0
>>>>
>>>> OK, we double faulted.  Too bad that x86 CPUs don't tell us why.
>>>>
>>>>>>>> [242060.604878] CPU: 1 PID: 2132 Comm: qemu-system-x86 Tainted: G        W       4.0.0-rc3-2.gd5c547f-desktop #1
>>>>>>>> [242060.604880] Hardware name: LENOVO 74665EG/74665EG, BIOS 6DET71WW (3.21 ) 12/13/2011
>>>>>>>> [242060.604883] task: ffff880103f46150 ti: ffff8801013d4000 task.ti: ffff8801013d4000
>>>>>>>> [242060.604885] RIP: 0010:[<ffffffff816834ad>]  [<ffffffff816834ad>] page_fault+0xd/0x30
>>>>
>>>> The double fault happened during page fault processing.  Could you
>>>> disassemble your page_fault function to find the offending
>>>> instruction?
>>>
>>> This one is easy:
>>>
>>> crash> disassemble page_fault
>>> Dump of assembler code for function page_fault:
>>>    0xffffffff816834a0 <+0>:     data32 xchg %ax,%ax
>>>    0xffffffff816834a3 <+3>:     data32 xchg %ax,%ax
>>>    0xffffffff816834a6 <+6>:     data32 xchg %ax,%ax
>>>    0xffffffff816834a9 <+9>:     sub    $0x78,%rsp
>>>    0xffffffff816834ad <+13>:    callq  0xffffffff81683620 <error_entry>
>>
>> The callq was the double-faulting instruction, and it is indeed the
>> first function in here that would have accessed the stack.  (The sub
>> *changes* rsp but isn't a memory access.)  So, since RSP is bogus, we
>> page fault, and the page fault is promoted to a double fault.  The
>> surprising thing is that the page fault itself seems to have been
>> delivered okay, and RSP wasn't on a page boundary.
>>
>> You wouldn't happen to be using a Broadwell machine?
>
> No, this is a quite old Thinkpad X200s, Core2duo
> processor       : 1
> vendor_id       : GenuineIntel
> cpu family      : 6
> model           : 23
> model name      : Intel(R) Core(TM)2 Duo CPU     L9400  @ 1.86GHz
> stepping        : 10
> microcode       : 0xa0c
>
>> The only way to get here with bogus RSP is if we interrupted something
>> that was previously running at CPL0 with similarly bogus RSP.
>>
>> I don't know if I trust CR2.  It's 16 bytes lower than I'd expect.
>>
>>>    0xffffffff816834b2 <+18>:    mov    %rsp,%rdi
>>>    0xffffffff816834b5 <+21>:    mov    0x78(%rsp),%rsi
>>>    0xffffffff816834ba <+26>:    movq   $0xffffffffffffffff,0x78(%rsp)
>>>    0xffffffff816834c3 <+35>:    callq  0xffffffff810504e0 <do_page_fault>
>>>    0xffffffff816834c8 <+40>:    jmpq   0xffffffff816836d0 <error_exit>
>>> End of assembler dump.
>>>
>>>
>>>>>>>> [242060.604893] RSP: 0018:00007fffa55eafb8  EFLAGS: 00010016
>>>>
>>>> Uh, what?  That RSP is a user address.
>>>>
>>>>>>>> [242060.604895] RAX: 000000000000aa40 RBX: 0000000000000001 RCX: ffffffff81682237
>>>>>>>> [242060.604896] RDX: 000000000000aa40 RSI: 0000000000000000 RDI: 00007fffa55eb078
>>>>>>>> [242060.604898] RBP: 00007fffa55f1c1c R08: 0000000000000008 R09: 0000000000000000
>>>>>>>> [242060.604900] R10: 0000000000000000 R11: 0000000000000293 R12: 000000000000004a
>>>>>>>> [242060.604902] R13: 00007ffa356b5d60 R14: 000000000000000f R15: 00007ffa3556cf20
>>>>>>>> [242060.604904] FS:  00007ffa33dbfa80(0000) GS:ffff88023bc80000(0000) knlGS:0000000000000000
>>>>>>>> [242060.604906] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>>>> [242060.604908] CR2: 00007fffa55eafa8 CR3: 0000000002d7e000 CR4: 00000000000427e0
>>>>>>>> [242060.604909] Stack:
>>>>>>>> [242060.604942] BUG: unable to handle kernel paging request at 00007fffa55eafb8
>>>>>>>> [242060.604995] IP: [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190
>>>>
>>>> This is suspicious.  We need to have died, again, of a fatal page
>>>> fault while dumping the stack.
>>>
>>> I posted the same problem to the opensuse kernel list shortly before turning
>>> to LKML. There, Michal Kubecek noted:
>>>
>>> "I encountered a similar problem recently. The thing is, x86
>>> specification says that on a double fault, RIP and RSP registers are
>>> undefined, i.e. you not only can't expect them to contain values
>>> corresponding to the first or second fault but you can't even expect
>>> them to have any usable values at all. Unfortunately the kernel double
>>> fault handler doesn't take this into account and does try to display
>>> usual crash related information so that it itself does usually crash
>>> when trying to show stack content (that's the show_stack_log_lvl()
>>> crash).
>>
>> I think that's not entirely true.  RIP is reliable for many classes of
>> double faults, and we rely on that for espfix64.  The fact that hpa
>> was willing to write that code strongly suggests that Intel chips at
>> least really do work that way.
>>
>>>
>>> The result is a double fault (which itself would be very hard to debug)
>>> followed by a crash in its handler so that analysing the outcome is
>>> extremely difficult."
>>>
>>> I cannot judge if this is true, but it sounded related to solving the
>>> problem to me.
>>
>>
>> The crash in the handler is a separate bug.
>>
>>
>>
>>>
>>>>>>>> [242060.605036] PGD 4779a067 PUD 40e3e067 PMD 4769e067 PTE 0
>>>>>>>> [242060.605078] Oops: 0000 [#1] PREEMPT SMP
>>>>>>>> [242060.605106] Modules linked in: vhost_net vhost macvtap macvlan nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache nls_iso8859_1 nls_cp437 vfat fat ppp_deflate bsd_comp ppp_async crc_ccitt ppp_generic slhc ses enclosure uas usb_storage cmac algif_hash ctr ccm rfcomm fuse xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT xt_tcpudp tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet bnep dm_crypt ecb cbc algif_skcipher af_alg xfs libcrc32c snd_hda_codec_conexant snd_hda_codec_generic iTCO_wdt iTCO_vendor_support snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm_oss snd_pcm
>>>>>>>> [242060.605396]  dm_mod snd_seq snd_seq_device snd_timer coretemp kvm_intel kvm snd_mixer_oss cdc_ether cdc_wdm cdc_acm usbnet mii arc4 uvcvideo videobuf2_vmalloc videobuf2_memops thinkpad_acpi videobuf2_core btusb v4l2_common videodev i2c_i801 iwldvm bluetooth serio_raw mac80211 pcspkr e1000e iwlwifi snd lpc_ich mei_me ptp mfd_core pps_core mei cfg80211 shpchp wmi soundcore rfkill battery ac tpm_tis tpm acpi_cpufreq i915 xhci_pci xhci_hcd i2c_algo_bit drm_kms_helper drm thermal video button processor sg loop
>>>>>>>> [242060.605396] CPU: 1 PID: 2132 Comm: qemu-system-x86 Tainted: G        W       4.0.0-rc3-2.gd5c547f-desktop #1
>>>>>>>> [242060.605396] Hardware name: LENOVO 74665EG/74665EG, BIOS 6DET71WW (3.21 ) 12/13/2011
>>>>>>>> [242060.605396] task: ffff880103f46150 ti: ffff8801013d4000 task.ti: ffff8801013d4000
>>>>>>>> [242060.605396] RIP: 0010:[<ffffffff81005b44>]  [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190
>>>>>>>> [242060.605396] RSP: 0018:ffff88023bc84e88  EFLAGS: 00010046
>>>>>>>> [242060.605396] RAX: 00007fffa55eafc0 RBX: 00007fffa55eafb8 RCX: ffff88023bc7ffc0
>>>>>>>> [242060.605396] RDX: 0000000000000000 RSI: ffff88023bc84f58 RDI: 0000000000000000
>>>>>>>> [242060.605396] RBP: ffff88023bc83fc0 R08: ffffffff81a2fe15 R09: 0000000000000020
>>>>>>>> [242060.605396] R10: 0000000000000afb R11: ffff88023bc84bee R12: ffff88023bc84f58
>>>>>>>> [242060.605396] R13: 0000000000000000 R14: ffffffff81a2fe15 R15: 0000000000000000
>>>>>>>> [242060.605396] FS:  00007ffa33dbfa80(0000) GS:ffff88023bc80000(0000) knlGS:0000000000000000
>>>>>>>> [242060.605396] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>>>> [242060.605396] CR2: 00007fffa55eafb8 CR3: 0000000002d7e000 CR4: 00000000000427e0
>>>>>>>> [242060.605396] Stack:
>>>>>>>> [242060.605396]  0000000002d7e000 0000000000000008 ffff88023bc84ee8 00007fffa55eafb8
>>>>>>>> [242060.605396]  0000000000000000 ffff88023bc84f58 00007fffa55eafb8 0000000000000040
>>>>>>>> [242060.605396]  00007ffa356b5d60 000000000000000f 00007ffa3556cf20 ffffffff81005c36
>>>>>>>> [242060.605396] Call Trace:
>>>>>>>> [242060.605396]  [<ffffffff81005c36>] show_regs+0x86/0x210
>>>>>>>> [242060.605396]  [<ffffffff8104636f>] df_debug+0x1f/0x30
>>>>>>>> [242060.605396]  [<ffffffff810041a4>] do_double_fault+0x84/0x100
>>>>>>>> [242060.605396]  [<ffffffff81683088>] double_fault+0x28/0x30
>>>>>>>> [242060.605396]  [<ffffffff816834ad>] page_fault+0xd/0x30
>>>>>>>> [242060.605396] Code: fe a2 81 31 c0 89 54 24 08 48 89 0c 24 48 8b 5b f8 e8 cc 06 67 00 48 8b 0c 24 8b 54 24 08 85 d2 74 05 f6 c2 03 74 48 48 8d 43 08 <48> 8b 33 48 c7 c7 0d fe a2 81 89 54 24 14 48 89 4c 24 08 48 89
>>>>>>>> [242060.605396] RIP  [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190
>>>>>>>> [242060.605396]  RSP <ffff88023bc84e88>
>>>>>>>> [242060.605396] CR2: 00007fffa55eafb8
>>>>>>>>
>>>>>>>> I would not totally rule out a hardware problem, since this machine had
>>>>>>>> another weird crash where it crashed and the bios beeper was constant
>>>>>>>> on until I hit the power button for 5 seconds.
>>>>>>>>
>>>>>>>> Unfortunately, I cannot load the crashdump with the crash version in
>>>>>>>> openSUSE Tumbleweed, so the backtrace is all I have for now.
>>>>>>>
>>>>>>> Just "me too", I'm getting the very same crash out of sudden with the
>>>>>>> recent 4.0-rc.  Judging from the very same pattern (usually crash
>>>>>>> happened while using KVM (-smp 4) and kernel builds with -j8), I don't
>>>>>>> think it's a hardware problem.
>>>>>>
>>>>>> The git bisection pointed to the commit:
>>>>>> commit b926e6f61a26036ee9eabe6761483954d481ad25
>>>>>>     x86, traps: Fix ist_enter from userspace
>>>>>>
>>>>>> And reverting this on top of the latest Linus tree seems working.
>>>>>> Seife, could you verify on your machine, too?
>>>>>
>>>>> Argh, false positive.  Right after I wrote this mail, I got the very
>>>>> same crash.  I seem to need running the test much longer than I
>>>>> thought.
>>>>>
>>>>> But somehow the commits around the above smell suspicious...
>>>>>
>>>>
>>>> Those commits shouldn't really have affected page fault or double
>>>> fault behavior.  They made big changes to MCE, breakpoints, and debug
>>>> exceptions.
>>>>
>>>> Something's very wrong here.  I'm guessing that we somehow ended up in
>>>> page_fault in a completely invalid context.
>>>>
>>>> One hairy code path that could plausibly do this is:
>>>>
>>>> 1. syscall
>>>>
>>>> 2. vmalloc fault accessing old_rsp aka rsp_scratch (or kernel_stack --
>>>> same issue)
>>>>
>>>> 3. page fault.  Now we're on the user stack and we do awful things.
>>>> If we run off the end of the presently writable portion of the stack,
>>>> we double fault.
>>>
>>> Maybe Michal's idea from above points in the right direction?
>>>
>>> Now since I have a crash dump and the corresponding debuginfo at hand,
>>> might this help somehow to find out where the problem originated from?
>>>
>>> I mean -- it's only 508 processes to look at :-) but if I knew what to
>>> look for in their backtraces, I would try to do it.
>>
>> The relevant thread's stack is here (see ti in the trace):
>>
>> ffff8801013d4000
>>
>> It could be interesting to see what's there.
>>
>> I don't suppose you want to try to walk the paging structures to see
>> if ffff88023bc80000 (i.e. gsbase) and, more specifically,
>> ffff88023bc80000 + old_rsp and ffff88023bc80000 + kernel_stack are
>> present?  You'd only have to walk one level -- presumably, if the PGD
>> entry is there, the rest of the entries are okay, too.
>
> That's all greek to me :-)
>
> I see that there is something at ffff88023bc80000:
>
> crash> x /64xg 0xffff88023bc80000
> 0xffff88023bc80000:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc80010:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc80020:     0x0000000000000000      0x000000006686ada9
> 0xffff88023bc80030:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc80040:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc80050:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc80060:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc80070:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc80080:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc80090:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc800a0:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc800b0:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc800c0:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc800d0:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc800e0:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc800f0:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc80100:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc80110:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc80120:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc80130:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc80140:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc80150:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc80160:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc80170:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc80180:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc80190:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc801a0:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc801b0:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc801c0:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc801d0:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc801e0:     0x0000000000000000      0x0000000000000000
> 0xffff88023bc801f0:     0x0000000000000000      0x0000000000000000
>
> old_rsp and kernel_stack seem bogus:
> crash> print old_rsp
> Cannot access memory at address 0xa200
> gdb: gdb request failed: print old_rsp
> crash> print kernel_stack
> Cannot access memory at address 0xaa48
> gdb: gdb request failed: print kernel_stack
>
> kernel_stack is not a pointer? So 0xffff88023bc80000 + 0xaa48 it is:

Yup.  old_rsp and kernel_stack are offsets relative to gsbase.
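
A minimal sketch of that arithmetic, for anyone following along. The
helper name is made up for illustration; the gsbase value and the two
offsets are the ones from the crash session quoted above.

```python
GSBASE = 0xffff88023bc80000  # GS base of the crashing CPU, from the trace

# Per-CPU symbol offsets, as reported by the failed gdb "print" requests.
PERCPU_OFFSETS = {"old_rsp": 0xa200, "kernel_stack": 0xaa48}


def percpu_address(gsbase, offset):
    """Hypothetical helper: virtual address of a per-CPU variable."""
    return gsbase + offset


for name, off in PERCPU_OFFSETS.items():
    print(f"{name}: {percpu_address(GSBASE, off):#018x}")
```

With these inputs kernel_stack lands at 0xffff88023bc8aa48, which is
inside the 512 bytes that "x /64xg 0xffff88023bc8aa00" reads.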

>
> crash> x /64xg 0xffff88023bc8aa00
> 0xffff88023bc8aa00:     0x0000000000000000      0x0000000000000000

[...]

I don't know enough about crashkernel to know whether the fact that
this worked means anything.

Can you dump the page of physical memory at 0x4779a067?  That's the PGD.

--Andy

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 20:49               ` Andy Lutomirski
@ 2015-03-18 21:06                 ` Denys Vlasenko
  2015-03-18 21:17                   ` Andy Lutomirski
  0 siblings, 1 reply; 77+ messages in thread
From: Denys Vlasenko @ 2015-03-18 21:06 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Stefan Seyfried, Linus Torvalds, Takashi Iwai, X86 ML, LKML, Tejun Heo

On 03/18/2015 09:49 PM, Andy Lutomirski wrote:
> On Wed, Mar 18, 2015 at 1:06 PM, Denys Vlasenko <dvlasenk@redhat.com> wrote:
>> On 03/18/2015 08:26 PM, Andy Lutomirski wrote:
>>> Hi Linus-
>>>
>>> You seem to enjoy debugging these things.  Want to give this a shot?
>>> My guess is a vmalloc fault accessing either old_rsp or kernel_stack
>>> right after swapgs in syscall entry.
>>
>> The code is:
>>
>> ENTRY(system_call)
>>         SWAPGS_UNSAFE_STACK
>> GLOBAL(system_call_after_swapgs)
>>         movq    %rsp,PER_CPU_VAR(rsp_scratch)
>>         movq    PER_CPU_VAR(kernel_stack),%rsp
>>
>> If PER_CPU_VAR(var) memory access can page fault
>> (I was thinking this is ensured to never fault),
>> then on these two instructions such page fault
>> will be fatal: we will still have userspace %rsp.
>>
>> I thought we can only get a NMI or debug interrupt here,
>> and they are both set up to use IST stacks
>> to prevent this scenario (among other reasons).
> 
> I don't think that #DB is possible -- we should never have a
> watchpoint on percpu memory like that (unless we're using kgdb, in
> which case I think that kgdb should be fixed).

And #DB shouldn't cause a problem even if it happens (it's on
an IST stack).

I was thinking about it more and the thing is, CPU did manage
to enter page fault handler.

It means that it managed to store iret frame.

This means that stores to (%rsp) worked, whatever %rsp is
(even if it points to user's page).

The double fault happened only when CALL insn inside the handler
attempted to push yet another word. _This_ is what did not work.

Why?

I'm almost ready to declare that it's SMAP triggering:
that attempts to access (write to) userspace were caught.
However, disassembly shows

crash> disassemble page_fault
Dump of assembler code for function page_fault:
   0xffffffff816834a0 <+0>:     data32 xchg %ax,%ax
   0xffffffff816834a3 <+3>:     data32 xchg %ax,%ax
   0xffffffff816834a6 <+6>:     data32 xchg %ax,%ax
   0xffffffff816834a9 <+9>:     sub    $0x78,%rsp
   0xffffffff816834ad <+13>:    callq  0xffffffff81683620 <error_entry>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^KABOOM HERE^^^^^^^^^^^^^^^^^^^^^^^
   0xffffffff816834b2 <+18>:    mov    %rsp,%rdi
   0xffffffff816834b5 <+21>:    mov    0x78(%rsp),%rsi
   0xffffffff816834ba <+26>:    movq   $0xffffffffffffffff,0x78(%rsp)
   0xffffffff816834c3 <+35>:    callq  0xffffffff810504e0 <do_page_fault>
   0xffffffff816834c8 <+40>:    jmpq   0xffffffff816836d0 <error_exit>
End of assembler dump.

Those NOPs at the beginning are ASM_CLAC and PARAVIRT_ADJUST_EXCEPTION_FRAME
from this source:


.macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1
ENTRY(\sym)
        /* Sanity check */
        .if \shift_ist != -1 && \paranoid == 0
        .error "using shift_ist requires paranoid=1"
        .endif

        .if \has_error_code
        XCPT_FRAME
        .else
        INTR_FRAME
        .endif

        ASM_CLAC
        PARAVIRT_ADJUST_EXCEPTION_FRAME

        subq $ORIG_RAX-R15, %rsp
        call error_entry
        ...

If ASM_CLAC is replaced by NOPs, this CPU must not be SMAP capable.
If so, then another store to (%rsp) should have worked too...


Stefan, Takashi - are you seeing this on SMAP-capable CPUs?

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 20:51               ` Andy Lutomirski
@ 2015-03-18 21:12                 ` Stefan Seyfried
  2015-03-18 21:21                   ` Andy Lutomirski
  0 siblings, 1 reply; 77+ messages in thread
From: Stefan Seyfried @ 2015-03-18 21:12 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Linus Torvalds, Takashi Iwai, Denys Vlasenko, X86 ML, LKML, Tejun Heo

Am 18.03.2015 um 21:51 schrieb Andy Lutomirski:
> On Wed, Mar 18, 2015 at 1:05 PM, Stefan Seyfried
> <stefan.seyfried@googlemail.com> wrote:

>>> The relevant thread's stack is here (see ti in the trace):
>>>
>>> ffff8801013d4000
>>>
>>> It could be interesting to see what's there.
>>>
>>> I don't suppose you want to try to walk the paging structures to see
>>> if ffff88023bc80000 (i.e. gsbase) and, more specifically,
>>> ffff88023bc80000 + old_rsp and ffff88023bc80000 + kernel_stack are
>>> present?  You'd only have to walk one level -- presumably, if the PGD
>>> entry is there, the rest of the entries are okay, too.
>>
>> That's all greek to me :-)
>>
>> I see that there is something at ffff88023bc80000:
>>
>> crash> x /64xg 0xffff88023bc80000
>> 0xffff88023bc80000:     0x0000000000000000      0x0000000000000000
>> 0xffff88023bc80010:     0x0000000000000000      0x0000000000000000
>> 0xffff88023bc80020:     0x0000000000000000      0x000000006686ada9
>> 0xffff88023bc80030:     0x0000000000000000      0x0000000000000000
>> 0xffff88023bc80040:     0x0000000000000000      0x0000000000000000
>> [all zeroes]
>> 0xffff88023bc801f0:     0x0000000000000000      0x0000000000000000
>>
>> old_rsp and kernel_stack seem bogus:
>> crash> print old_rsp
>> Cannot access memory at address 0xa200
>> gdb: gdb request failed: print old_rsp
>> crash> print kernel_stack
>> Cannot access memory at address 0xaa48
>> gdb: gdb request failed: print kernel_stack
>>
>> kernel_stack is not a pointer? So 0xffff88023bc80000 + 0xaa48 it is:
> 
> Yup.  old_rsp and kernel_stack are offsets relative to gsbase.
> 
>>
>> crash> x /64xg 0xffff88023bc8aa00
>> 0xffff88023bc8aa00:     0x0000000000000000      0x0000000000000000
> 
> [...]
> 
> I don't know enough about crashkernel to know whether the fact that
> this worked means anything.

AFAIK this just means that the memory at this location is included in
the dump :-)

> Can you dump the page of physical memory at 0x4779a067?  That's the PGD.

Unfortunately not, this is a partial dump (I think the default config in
openSUSE, but I might have changed it some time ago) and the dump_level
is 31 which means that the following are excluded:

                     |      |cache  |cache  |      |
                dump | zero |without|with   | user | free
               level | page |private|private| data | page
              -------+------+-------+-------+------+------
                  31 |  X   |   X   |   X   |  X   |  X

so this:
crash> x /64xg 0x4779a067
0x4779a067:     Cannot access memory at address 0x4779a067
gdb: gdb request failed: x /64xg

probably just means that the PGD falls in one of the above excluded
categories.
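
As a rough sketch of how the dump level maps onto that table: the
makedumpfile dump level is a bitmask, and each set bit excludes one
page type. The function name below is only illustrative.

```python
# makedumpfile dump_level bits: each set bit excludes one page type.
EXCLUDE_BITS = {
    1: "zero page",
    2: "cache without private",
    4: "cache with private",
    8: "user data",
    16: "free page",
}


def excluded_page_types(dump_level):
    """Illustrative helper: list the page types a dump level leaves out."""
    return [name for bit, name in EXCLUDE_BITS.items() if dump_level & bit]


print(excluded_page_types(31))
```

Level 31 sets all five bits, so every category in the table is dropped
from the dump.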

Best regards,

	Stefan
-- 
Stefan Seyfried
Linux Consultant & Developer -- GPG Key: 0x731B665B

B1 Systems GmbH
Osterfeldstraße 7 / 85088 Vohburg / http://www.b1-systems.de
GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt,HRB 3537

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 21:06                 ` Denys Vlasenko
@ 2015-03-18 21:17                   ` Andy Lutomirski
  0 siblings, 0 replies; 77+ messages in thread
From: Andy Lutomirski @ 2015-03-18 21:17 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Stefan Seyfried, Linus Torvalds, Takashi Iwai, X86 ML, LKML, Tejun Heo

On Wed, Mar 18, 2015 at 2:06 PM, Denys Vlasenko <dvlasenk@redhat.com> wrote:
> On 03/18/2015 09:49 PM, Andy Lutomirski wrote:
>> On Wed, Mar 18, 2015 at 1:06 PM, Denys Vlasenko <dvlasenk@redhat.com> wrote:
>>> On 03/18/2015 08:26 PM, Andy Lutomirski wrote:
>>>> Hi Linus-
>>>>
>>>> You seem to enjoy debugging these things.  Want to give this a shot?
>>>> My guess is a vmalloc fault accessing either old_rsp or kernel_stack
>>>> right after swapgs in syscall entry.
>>>
>>> The code is:
>>>
>>> ENTRY(system_call)
>>>         SWAPGS_UNSAFE_STACK
>>> GLOBAL(system_call_after_swapgs)
>>>         movq    %rsp,PER_CPU_VAR(rsp_scratch)
>>>         movq    PER_CPU_VAR(kernel_stack),%rsp
>>>
>>> If PER_CPU_VAR(var) memory access can page fault
>>> (I was thinking this is ensured to never fault),
>>> then on these two instructions such page fault
>>> will be fatal: we will still have userspace %rsp.
>>>
>>> I thought we can only get a NMI or debug interrupt here,
>>> and they are both set up to use IST stacks
>>> to prevent this scenario (among other reasons).
>>
>> I don't think that #DB is possible -- we should never have a
>> watchpoint on percpu memory like that (unless we're using kgdb, in
>> which case I think that kgdb should be fixed).
>
> And #DB shouldn't cause a problem even if it happens (it's on
> an IST stack).
>
> I was thinking about it more and the thing is, CPU did manage
> to enter page fault handler.
>
> It means that it managed to store iret frame.
>
> This means that stores to (%rsp) worked, whatever %rsp is
> (even if it points to user's page).
>
> The double fault happened only when CALL insn inside the handler
> attempted to push yet another word. _This_ is what did not work.
>
> Why?
>
> I'm almost ready to declare that it's SMAP triggering:
> that attempts to access (write to) userspace were caught.
> However, disassembly shows
>
> crash> disassemble page_fault
> Dump of assembler code for function page_fault:
>    0xffffffff816834a0 <+0>:     data32 xchg %ax,%ax
>    0xffffffff816834a3 <+3>:     data32 xchg %ax,%ax
>    0xffffffff816834a6 <+6>:     data32 xchg %ax,%ax
>    0xffffffff816834a9 <+9>:     sub    $0x78,%rsp
>    0xffffffff816834ad <+13>:    callq  0xffffffff81683620 <error_entry>
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^KABOOM HERE^^^^^^^^^^^^^^^^^^^^^^^
>    0xffffffff816834b2 <+18>:    mov    %rsp,%rdi
>    0xffffffff816834b5 <+21>:    mov    0x78(%rsp),%rsi
>    0xffffffff816834ba <+26>:    movq   $0xffffffffffffffff,0x78(%rsp)
>    0xffffffff816834c3 <+35>:    callq  0xffffffff810504e0 <do_page_fault>
>    0xffffffff816834c8 <+40>:    jmpq   0xffffffff816836d0 <error_exit>
> End of assembler dump.
>
> Those NOPs at the beginning are ASM_CLAC and PARAVIRT_ADJUST_EXCEPTION_FRAME
> from this source:
>
>
> .macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1
> ENTRY(\sym)
>         /* Sanity check */
>         .if \shift_ist != -1 && \paranoid == 0
>         .error "using shift_ist requires paranoid=1"
>         .endif
>
>         .if \has_error_code
>         XCPT_FRAME
>         .else
>         INTR_FRAME
>         .endif
>
>         ASM_CLAC
>         PARAVIRT_ADJUST_EXCEPTION_FRAME
>
>         subq $ORIG_RAX-R15, %rsp
>         call error_entry
>         ...
>
> If ASM_CLAC is replaced by NOPs, this CPU must not be SMAP capable.
> If so, then another store to (%rsp) should have worked too...
>
>
> Stefan, Takashi - are you seeing this on SMAP-capable CPUs?

That's why I asked if this was Broadwell.  It's not :(

--Andy

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 21:12                 ` Stefan Seyfried
@ 2015-03-18 21:21                   ` Andy Lutomirski
  2015-03-18 21:41                     ` Stefan Seyfried
  0 siblings, 1 reply; 77+ messages in thread
From: Andy Lutomirski @ 2015-03-18 21:21 UTC (permalink / raw)
  To: Stefan Seyfried
  Cc: Linus Torvalds, Takashi Iwai, Denys Vlasenko, X86 ML, LKML, Tejun Heo

On Wed, Mar 18, 2015 at 2:12 PM, Stefan Seyfried
<stefan.seyfried@googlemail.com> wrote:
> Am 18.03.2015 um 21:51 schrieb Andy Lutomirski:
>> On Wed, Mar 18, 2015 at 1:05 PM, Stefan Seyfried
>> <stefan.seyfried@googlemail.com> wrote:
>
>>>> The relevant thread's stack is here (see ti in the trace):
>>>>
>>>> ffff8801013d4000
>>>>
>>>> It could be interesting to see what's there.
>>>>
>>>> I don't suppose you want to try to walk the paging structures to see
>>>> if ffff88023bc80000 (i.e. gsbase) and, more specifically,
>>>> ffff88023bc80000 + old_rsp and ffff88023bc80000 + kernel_stack are
>>>> present?  You'd only have to walk one level -- presumably, if the PGD
>>>> entry is there, the rest of the entries are okay, too.
>>>
>>> That's all greek to me :-)
>>>
>>> I see that there is something at ffff88023bc80000:
>>>
>>> crash> x /64xg 0xffff88023bc80000
>>> 0xffff88023bc80000:     0x0000000000000000      0x0000000000000000
>>> 0xffff88023bc80010:     0x0000000000000000      0x0000000000000000
>>> 0xffff88023bc80020:     0x0000000000000000      0x000000006686ada9
>>> 0xffff88023bc80030:     0x0000000000000000      0x0000000000000000
>>> 0xffff88023bc80040:     0x0000000000000000      0x0000000000000000
>>> [all zeroes]
>>> 0xffff88023bc801f0:     0x0000000000000000      0x0000000000000000
>>>
>>> old_rsp and kernel_stack seem bogus:
>>> crash> print old_rsp
>>> Cannot access memory at address 0xa200
>>> gdb: gdb request failed: print old_rsp
>>> crash> print kernel_stack
>>> Cannot access memory at address 0xaa48
>>> gdb: gdb request failed: print kernel_stack
>>>
>>> kernel_stack is not a pointer? So 0xffff88023bc80000 + 0xaa48 it is:
>>
>> Yup.  old_rsp and kernel_stack are offsets relative to gsbase.
>>
>>>
>>> crash> x /64xg 0xffff88023bc8aa00
>>> 0xffff88023bc8aa00:     0x0000000000000000      0x0000000000000000
>>
>> [...]
>>
>> I don't know enough about crashkernel to know whether the fact that
>> this worked means anything.
>
> AFAIK this just means that the memory at this location is included in
> the dump :-)
>
>> Can you dump the page of physical memory at 0x4779a067?  That's the PGD.
>
> Unfortunately not, this is a partial dump (I think the default config in
> openSUSE, but I might have changed it some time ago) and the dump_level
> is 31 which means that the following are excluded:
>
>                      |      |cache  |cache  |      |
>                 dump | zero |without|with   | user | free
>                level | page |private|private| data | page
>               -------+------+-------+-------+------+------
>                   31 |  X   |   X   |   X   |  X   |  X
>
> so this:
> crash> x /64xg 0x4779a067
> 0x4779a067:     Cannot access memory at address 0x4779a067
> gdb: gdb request failed: x /64xg
>
> probably just means, that the PGD falls in one of the above excluded
> categories.

I suspect that it actually means that gdb sees virtual addresses, not
physical addresses.  But I screwed up completely -- "PGD" in the dump
is the PGD *entry*, not the PGD pointer.
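
To make the entry-versus-pointer distinction concrete, here is a small
sketch that splits a page-table entry into its physical frame address
and its low flag bits, using the x86-64 entry layout. The function and
table names are invented for this example; the input is the "PGD"
value printed in the oops.

```python
# Low flag bits of an x86-64 paging-structure entry.
FLAG_BITS = {
    0: "present",
    1: "writable",
    2: "user",
    3: "pwt",
    4: "pcd",
    5: "accessed",
    6: "dirty",
}
ADDR_MASK = 0x000ffffffffff000  # bits 12..51 hold the physical frame address


def decode_entry(entry):
    """Illustrative helper: return (physical address, list of set flags)."""
    phys = entry & ADDR_MASK
    flags = [name for bit, name in FLAG_BITS.items() if entry & (1 << bit)]
    return phys, flags


phys, flags = decode_entry(0x4779a067)
print(hex(phys), flags)
```

Read this way, 0x4779a067 is an entry pointing at the next-level table
at physical 0x4779a000 with the low bits 0x067 as flags, so the table
page starts at 0x4779a000 rather than at 0x4779a067.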

We could plausibly fish it out from current->mm, but that's a mess.  I
don't suppose that "info registers" or "p/x $cr3" will show the cr3
value?

In any case, Denys is right -- my theory doesn't really hold water on
non-SMAP systems.

--Andy

>
> Best regards,
>
>         Stefan
> --
> Stefan Seyfried
> Linux Consultant & Developer -- GPG Key: 0x731B665B
>
> B1 Systems GmbH
> Osterfeldstraße 7 / 85088 Vohburg / http://www.b1-systems.de
> GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt,HRB 3537



-- 
Andy Lutomirski
AMA Capital Management, LLC

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 19:26           ` Andy Lutomirski
  2015-03-18 20:05             ` Stefan Seyfried
  2015-03-18 20:06             ` Denys Vlasenko
@ 2015-03-18 21:32             ` Linus Torvalds
  2015-03-18 21:42               ` Denys Vlasenko
  2015-03-18 21:49               ` Stefan Seyfried
  2015-03-28 23:57             ` Maciej W. Rozycki
  3 siblings, 2 replies; 77+ messages in thread
From: Linus Torvalds @ 2015-03-18 21:32 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Stefan Seyfried, Takashi Iwai, Denys Vlasenko, X86 ML, LKML, Tejun Heo

On Wed, Mar 18, 2015 at 12:26 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>
>> crash> disassemble page_fault
>> Dump of assembler code for function page_fault:
>>    0xffffffff816834a0 <+0>:     data32 xchg %ax,%ax
>>    0xffffffff816834a3 <+3>:     data32 xchg %ax,%ax
>>    0xffffffff816834a6 <+6>:     data32 xchg %ax,%ax
>>    0xffffffff816834a9 <+9>:     sub    $0x78,%rsp
>>    0xffffffff816834ad <+13>:    callq  0xffffffff81683620 <error_entry>
>
> The callq was the double-faulting instruction, and it is indeed the
> first function in here that would have accessed the stack.  (The sub
> *changes* rsp but isn't a memory access.)  So, since RSP is bogus, we
> page fault, and the page fault is promoted to a double fault.  The
> surprising thing is that the page fault itself seems to have been
> delivered okay, and RSP wasn't on a page boundary.

Not at all surprising, and sure it was on a page boundary.

Look closer.

%rsp is 00007fffa55eafb8.

But that's *after* page_fault has done that

    sub    $0x78,%rsp

so %rsp when the page fault happened was 0x7fffa55eb030. Which is a
different page.

And that page happened to be mapped.
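
The arithmetic above, as a quick sketch. The variable names are just
for illustration; the RSP value is the one from the panic trace and
0x78 is the frame size from the disassembly, with 4 KiB pages assumed.

```python
PAGE_SIZE = 4096  # assuming ordinary 4 KiB pages


def page_base(addr):
    """Round an address down to the start of its page."""
    return addr & ~(PAGE_SIZE - 1)


rsp_at_double_fault = 0x00007fffa55eafb8  # RSP from the panic trace
frame_size = 0x78                         # "sub $0x78,%rsp" in page_fault

# Undo the sub to get RSP as it was when page_fault was entered.
rsp_on_entry = rsp_at_double_fault + frame_size

print(hex(rsp_on_entry))
print(hex(page_base(rsp_on_entry)), hex(page_base(rsp_at_double_fault)))
```

The two page bases differ (0x7fffa55eb000 versus 0x7fffa55ea000), so
the sub moved the stack pointer across a page boundary.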

So what happened is:

 - we somehow entered kernel mode without switching stacks

   (ie presumably syscall)

 - the user stack was still fine

 - we took a page fault, which once again didn't switch stacks,
because we were already in kernel mode. And this page fault worked,
because it just pushed the error code onto the user stack which was
mapped.

 - we now took a second page fault within the page fault handler,
because now the stack pointer has been decremented and points one user
page down that is *not* mapped, so now that page fault cannot push the
error code and return information.

Now, how we took that original page fault is sadly not very clear at
all.  I agree that it's something about system-call (how could we not
change stacks otherwise), but why it should have started now, I don't
know. I don't think "system_call" has changed at all.

Maybe there is something wrong with the new "ret_from_sys_call" logic,
and that "use sysret to return to user mode" thing. Because this code
sequence:

+       movq (RSP-RIP)(%rsp),%rsp
+       USERGS_SYSRET64

in 'irq_return_via_sysret' is new to 4.0, and instead of entering the
kernel with a user stack pointer, maybe we're *exiting* the kernel,
and have just reloaded the user stack pointer when "USERGS_SYSRET64"
takes some fault.

Is PARAVIRT enabled? The three nops at the beginning of 'page_fault'
make me suspect it is, and that they are some paravirt rewriting
area. What does paravirt do for that USERGS_SYSRET64 (or for
SWAPGS_UNSAFE_STACK, for that matter)?

                        Linus

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 21:21                   ` Andy Lutomirski
@ 2015-03-18 21:41                     ` Stefan Seyfried
  2015-03-18 21:49                       ` Denys Vlasenko
  0 siblings, 1 reply; 77+ messages in thread
From: Stefan Seyfried @ 2015-03-18 21:41 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Linus Torvalds, Takashi Iwai, Denys Vlasenko, X86 ML, LKML, Tejun Heo

Am 18.03.2015 um 22:21 schrieb Andy Lutomirski:
> On Wed, Mar 18, 2015 at 2:12 PM, Stefan Seyfried
> <stefan.seyfried@googlemail.com> wrote:
>> Am 18.03.2015 um 21:51 schrieb Andy Lutomirski:
>>> On Wed, Mar 18, 2015 at 1:05 PM, Stefan Seyfried
>>> <stefan.seyfried@googlemail.com> wrote:
>>
>>>>> The relevant thread's stack is here (see ti in the trace):
>>>>>
>>>>> ffff8801013d4000
>>>>>
>>>>> It could be interesting to see what's there.
>>>>>
>>>>> I don't suppose you want to try to walk the paging structures to see
>>>>> if ffff88023bc80000 (i.e. gsbase) and, more specifically,
>>>>> ffff88023bc80000 + old_rsp and ffff88023bc80000 + kernel_stack are
>>>>> present?  You'd only have to walk one level -- presumably, if the PGD
>>>>> entry is there, the rest of the entries are okay, too.
>>>>
>>>> That's all greek to me :-)
>>>>
>>>> I see that there is something at ffff88023bc80000:
>>>>
>>>> crash> x /64xg 0xffff88023bc80000
>>>> 0xffff88023bc80000:     0x0000000000000000      0x0000000000000000
>>>> 0xffff88023bc80010:     0x0000000000000000      0x0000000000000000
>>>> 0xffff88023bc80020:     0x0000000000000000      0x000000006686ada9
>>>> 0xffff88023bc80030:     0x0000000000000000      0x0000000000000000
>>>> 0xffff88023bc80040:     0x0000000000000000      0x0000000000000000
>>>> [all zeroes]
>>>> 0xffff88023bc801f0:     0x0000000000000000      0x0000000000000000
>>>>
>>>> old_rsp and kernel_stack seem bogus:
>>>> crash> print old_rsp
>>>> Cannot access memory at address 0xa200
>>>> gdb: gdb request failed: print old_rsp
>>>> crash> print kernel_stack
>>>> Cannot access memory at address 0xaa48
>>>> gdb: gdb request failed: print kernel_stack
>>>>
>>>> kernel_stack is not a pointer? So 0xffff88023bc80000 + 0xaa48 it is:
>>>
>>> Yup.  old_rsp and kernel_stack are offsets relative to gsbase.
>>>
>>>>
>>>> crash> x /64xg 0xffff88023bc8aa00
>>>> 0xffff88023bc8aa00:     0x0000000000000000      0x0000000000000000
>>>
>>> [...]
>>>
>>> I don't know enough about crashkernel to know whether the fact that
>>> this worked means anything.
>>
>> AFAIK this just means that the memory at this location is included in
>> the dump :-)
>>
>>> Can you dump the page of physical memory at 0x4779a067?  That's the PGD.
>>
>> Unfortunately not, this is a partial dump (I think the default config in
>> openSUSE, but I might have changed it some time ago) and the dump_level
>> is 31 which means that the following are excluded:
>>
>>                      |      |cache  |cache  |      |
>>                 dump | zero |without|with   | user | free
>>                level | page |private|private| data | page
>>               -------+------+-------+-------+------+------
>>                   31 |  X   |   X   |   X   |  X   |  X
>>
>> so this:
>> crash> x /64xg 0x4779a067
>> 0x4779a067:     Cannot access memory at address 0x4779a067
>> gdb: gdb request failed: x /64xg
>>
>> probably just means, that the PGD falls in one of the above excluded
>> categories.
> 
> I suspect that it actually means that gdb sees virtual addresses, not
> physical addresses.  But I screwed up completely -- "PGD" in the dump
> is the PGD *entry*, not the PGD pointer.

in crash, usually physical addresses work (it's a sophisticated wrapper
around gdb AFAICT)

> We could plausibly fish it out from current->mm, but that's a mess.

I'll come to that later

> I don't suppose that "info registers" or "p/x $cr3" will show the cr3
> value?

No, that does not work from crash.

But current->mm is easy:
crash> task|grep mm
      start_comm =
"\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"
  mm = 0xffff8800b8a9c040,
  active_mm = 0xffff8800b8a9c040,
  comm = "qemu-system-x86",

and (guessing the type :-)
crash> print *(struct mm_struct *)0xffff8800b8a9c040|grep pgd
  pgd = 0xffff880002d7e000,

But if that's correct, pgd contains all zeroes:
crash> print *(pgd_t *)0xffff880002d7e000
$15 = {
  pgd = 0
}
crash> x /16xg 0xffff880002d7e000
0xffff880002d7e000:     0x0000000000000000      0x0000000000000000
0xffff880002d7e010:     0x0000000000000000      0x0000000000000000
0xffff880002d7e020:     0x0000000000000000      0x0000000000000000
0xffff880002d7e030:     0x0000000000000000      0x0000000000000000
0xffff880002d7e040:     0x0000000000000000      0x0000000000000000
0xffff880002d7e050:     0x0000000000000000      0x0000000000000000
0xffff880002d7e060:     0x0000000000000000      0x0000000000000000
0xffff880002d7e070:     0x0000000000000000      0x0000000000000000
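
A small sketch to sanity-check that pgd pointer and to see which slots
of the table matter here. It assumes 4-level paging and a direct-map
base of 0xffff880000000000 for this kernel; both the constant and the
helper names are assumptions made for the example.

```python
PAGE_OFFSET = 0xffff880000000000  # assumed direct-map base for this kernel


def direct_map_to_phys(vaddr):
    """Physical address behind a direct-map virtual address."""
    return vaddr - PAGE_OFFSET


def pgd_index(vaddr):
    """Top-level table slot: bits 39..47 of the virtual address."""
    return (vaddr >> 39) & 0x1ff


pgd_virt = 0xffff880002d7e000  # mm->pgd from the crash session

print(hex(direct_map_to_phys(pgd_virt)))  # compare with CR3 in the trace
for addr in (0x00007fffa55eafb8, 0xffff88023bc80000):
    idx = pgd_index(addr)
    print(f"{addr:#018x}: pgd[{idx}] at {pgd_virt + 8 * idx:#018x}")
```

Under these assumptions the pgd sits at physical 0x2d7e000, which
agrees with the CR3 value in the panic trace. The faulting user stack
address uses slot 255 and gsbase uses slot 272, while "x /16xg" only
shows slots 0 to 15, so the zeroes above say nothing about those two
entries.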

> In any case, Denys is right -- my theory doesn't really hold water on
> non-SMAP systems.

Mine is definitely not new enough for this feature :)

Maybe it would be more helpful if Takashi, who can reproduce this
more reliably than I can, captured a crash dump, preferably with a
lower dump level, to investigate.
I have seen the bug two or three times in a week or two, which makes
waiting for it to happen a boring experience.

Best regards,

	Stefan

-- 
Stefan Seyfried
Linux Consultant & Developer -- GPG Key: 0x731B665B

B1 Systems GmbH
Osterfeldstraße 7 / 85088 Vohburg / http://www.b1-systems.de
GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt,HRB 3537

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 21:32             ` Linus Torvalds
@ 2015-03-18 21:42               ` Denys Vlasenko
  2015-03-18 21:55                 ` Andy Lutomirski
  2015-03-18 21:49               ` Stefan Seyfried
  1 sibling, 1 reply; 77+ messages in thread
From: Denys Vlasenko @ 2015-03-18 21:42 UTC (permalink / raw)
  To: Linus Torvalds, Andy Lutomirski
  Cc: Stefan Seyfried, Takashi Iwai, X86 ML, LKML, Tejun Heo

On 03/18/2015 10:32 PM, Linus Torvalds wrote:
> On Wed, Mar 18, 2015 at 12:26 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>>
>>> crash> disassemble page_fault
>>> Dump of assembler code for function page_fault:
>>>    0xffffffff816834a0 <+0>:     data32 xchg %ax,%ax
>>>    0xffffffff816834a3 <+3>:     data32 xchg %ax,%ax
>>>    0xffffffff816834a6 <+6>:     data32 xchg %ax,%ax
>>>    0xffffffff816834a9 <+9>:     sub    $0x78,%rsp
>>>    0xffffffff816834ad <+13>:    callq  0xffffffff81683620 <error_entry>
>>
>> The callq was the double-faulting instruction, and it is indeed the
>> first function in here that would have accessed the stack.  (The sub
>> *changes* rsp but isn't a memory access.)  So, since RSP is bogus, we
>> page fault, and the page fault is promoted to a double fault.  The
>> surprising thing is that the page fault itself seems to have been
>> delivered okay, and RSP wasn't on a page boundary.
> 
> Not at all surprising, and sure it was on a page boundary.
> 
> Look closer.
> 
> %rsp is 00007fffa55eafb8.
> 
> But that's *after* page_fault has done that
> 
>     sub    $0x78,%rsp
> 
> so %rsp when the page fault happened was 0x7fffa55eb030. Which is a
> different page.
> 
> And that page happened to be mapped.
> 
> So what happened is:
> 
>  - we somehow entered kernel mode without switching stacks
> 
>    (ie presumably syscall)
> 
>  - the user stack was still fine
> 
>  - we took a page fault, which once again didn't switch stacks,
> because we were already in kernel mode. And this page fault worked,
> because it just pushed the error code onto the user stack which was
> mapped.
> 
>  - we now took a second page fault within the page fault handler,
> because now the stack pointer has been decremented and points one user
> page down that is *not* mapped, so now that page fault cannot push the
> error code and return information.
> 
> Now, how we took that original page fault is sadly not very clear at
> all.  I agree that it's something about system-call (how could we not
> change stacks otherwise), but why it should have started now, I don't
> know. I don't think "system_call" has changed at all.
> 
> Maybe there is something wrong with the new "ret_from_sys_call" logic,
> and that "use sysret to return to user mode" thing. Because this code
> sequence:
> 
> +       movq (RSP-RIP)(%rsp),%rsp
> +       USERGS_SYSRET64
> 
> in 'irq_return_via_sysret' is new to 4.0, and instead of entering the
> kernel with a user stack pointer, maybe we're *exiting* the kernel,
> and have just reloaded the user stack pointer when "USERGS_SYSRET64"
> takes some fault.

Yes, so far we happily thought that SYSRET never fails...

This merits adding some code which would at least BUG_ON
if the faulting address is seen to match SYSRET64.

Now we only check for faulting IRETQ:

error_kernelspace:
        CFI_REL_OFFSET rcx, RCX+8
        incl %ebx
        leaq native_irq_return_iret(%rip),%rcx
        cmpq %rcx,RIP+8(%rsp)
        je error_bad_iret

> 
> Is PARAVIRT enabled? The three nops at the beginning of 'page_fault'
> make me suspect it is, and that they are some paravirt rewriting
> area. What does paravirt do for that USERGS_SYSRET64 (or for
> SWAPGS_UNSAFE_STACK, for that matter)?
> 
>                         Linus
> 


* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 21:32             ` Linus Torvalds
  2015-03-18 21:42               ` Denys Vlasenko
@ 2015-03-18 21:49               ` Stefan Seyfried
  1 sibling, 0 replies; 77+ messages in thread
From: Stefan Seyfried @ 2015-03-18 21:49 UTC (permalink / raw)
  To: Linus Torvalds, Andy Lutomirski
  Cc: Takashi Iwai, Denys Vlasenko, X86 ML, LKML, Tejun Heo

Am 18.03.2015 um 22:32 schrieb Linus Torvalds:
> Is PARAVIRT enabled? The three nops at the beginning of 'page_fault'
> make me suspect it is, and that they are some paravirt rewriting
> area. What does paravirt do for that USERGS_SYSRET64 (or for
> SWAPGS_UNSAFE_STACK, for that matter)?

This is from the newer kernel package, but I doubt this configuration
has been changed in the openSUSE kernel:

susi:~ # grep PARAVIRT /boot/config-4.0.0-rc4-1.g126fc64-desktop
CONFIG_PARAVIRT=y
# CONFIG_PARAVIRT_DEBUG is not set
# CONFIG_PARAVIRT_SPINLOCKS is not set
# CONFIG_PARAVIRT_TIME_ACCOUNTING is not set
CONFIG_PARAVIRT_CLOCK=y

So yes, PARAVIRT is enabled.

Best regards,

	Stefan
-- 
Stefan Seyfried
Linux Consultant & Developer -- GPG Key: 0x731B665B

B1 Systems GmbH
Osterfeldstraße 7 / 85088 Vohburg / http://www.b1-systems.de
GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt,HRB 3537

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 21:41                     ` Stefan Seyfried
@ 2015-03-18 21:49                       ` Denys Vlasenko
  2015-03-18 21:53                         ` Stefan Seyfried
  0 siblings, 1 reply; 77+ messages in thread
From: Denys Vlasenko @ 2015-03-18 21:49 UTC (permalink / raw)
  To: Stefan Seyfried, Andy Lutomirski
  Cc: Linus Torvalds, Takashi Iwai, X86 ML, LKML, Tejun Heo

Stefan, Takashi, can you post your /proc/cpuinfo
and dmesg after boot?


* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 21:49                       ` Denys Vlasenko
@ 2015-03-18 21:53                         ` Stefan Seyfried
  0 siblings, 0 replies; 77+ messages in thread
From: Stefan Seyfried @ 2015-03-18 21:53 UTC (permalink / raw)
  To: Denys Vlasenko, Andy Lutomirski
  Cc: Linus Torvalds, Takashi Iwai, X86 ML, LKML, Tejun Heo

On 18.03.2015 at 22:49, Denys Vlasenko wrote:
> Stefan, Takashi, can you post your /proc/cpuinfo
> and dmesg after boot?

susi:~ # cat /proc/cpuinfo 
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 23
model name      : Intel(R) Core(TM)2 Duo CPU     L9400  @ 1.86GHz
stepping        : 10
microcode       : 0xa0c
cpu MHz         : 1867.000
cache size      : 6144 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts nopl aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm ida dtherm tpr_shadow vnmi flexpriority
bugs            :
bogomips        : 3723.96
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

(repeats for second core :)

I'm running 3.19 now, but the dmesg extracted from the crash
dump of 4.0-rc3 is at http://paste.opensuse.org/48196621
-- 
Stefan Seyfried
Linux Consultant & Developer -- GPG Key: 0x731B665B

B1 Systems GmbH
Osterfeldstraße 7 / 85088 Vohburg / http://www.b1-systems.de
GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt,HRB 3537

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 21:42               ` Denys Vlasenko
@ 2015-03-18 21:55                 ` Andy Lutomirski
  2015-03-18 22:17                   ` Denys Vlasenko
                                     ` (3 more replies)
  0 siblings, 4 replies; 77+ messages in thread
From: Andy Lutomirski @ 2015-03-18 21:55 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Linus Torvalds, Stefan Seyfried, Takashi Iwai, X86 ML, LKML, Tejun Heo

On Wed, Mar 18, 2015 at 2:42 PM, Denys Vlasenko <dvlasenk@redhat.com> wrote:
> On 03/18/2015 10:32 PM, Linus Torvalds wrote:
>> On Wed, Mar 18, 2015 at 12:26 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>>>
>>>> crash> disassemble page_fault
>>>> Dump of assembler code for function page_fault:
>>>>    0xffffffff816834a0 <+0>:     data32 xchg %ax,%ax
>>>>    0xffffffff816834a3 <+3>:     data32 xchg %ax,%ax
>>>>    0xffffffff816834a6 <+6>:     data32 xchg %ax,%ax
>>>>    0xffffffff816834a9 <+9>:     sub    $0x78,%rsp
>>>>    0xffffffff816834ad <+13>:    callq  0xffffffff81683620 <error_entry>
>>>
>>> The callq was the double-faulting instruction, and it is indeed the
>>> first function in here that would have accessed the stack.  (The sub
>>> *changes* rsp but isn't a memory access.)  So, since RSP is bogus, we
>>> page fault, and the page fault is promoted to a double fault.  The
>>> surprising thing is that the page fault itself seems to have been
>>> delivered okay, and RSP wasn't on a page boundary.
>>
>> Not at all surprising, and sure it was on a page boundary..
>>
>> Look closer.
>>
>> %rsp is 00007fffa55eafb8.
>>
>> But that's *after* page_fault has done that
>>
>>     sub    $0x78,%rsp
>>
>> so %rsp when the page fault happened was 0x7fffa55eb030. Which is a
>> different page.

Ah, I forgot to add 0x78.  You're right, of course.
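
For the record, the arithmetic is easy to check with a few lines of
Python. This is only a sketch: the 4 KiB page size is an assumption, and
the helper name page_base is made up for illustration. The two constants
come from the register dump and the page_fault disassembly quoted above.

```python
PAGE_SIZE = 4096  # assumption: ordinary 4 KiB pages


def page_base(addr):
    """Round an address down to the start of its page (hypothetical helper)."""
    return addr & ~(PAGE_SIZE - 1)


rsp_after_sub = 0x00007fffa55eafb8  # RSP as reported in the double-fault dump
frame_adjust = 0x78                 # "sub $0x78,%rsp" at page_fault+9

# RSP as it was when the first page fault was delivered, before the sub.
rsp_before_sub = rsp_after_sub + frame_adjust

print(hex(rsp_before_sub))             # 0x7fffa55eb030
print(hex(page_base(rsp_before_sub)))  # 0x7fffa55eb000
print(hex(page_base(rsp_after_sub)))   # 0x7fffa55ea000
```

So the hardware pushed the first exception frame into the page at
...eb000, and the callq then tried to push onto the page at ...ea000.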

>>
>> And that page happened to be mapped.
>>
>> So what happened is:
>>
>>  - we somehow entered kernel mode without switching stacks
>>
>>    (ie presumably syscall)
>>
>>  - the user stack was still fine
>>
>>  - we took a page fault, which once again didn't switch stacks,
>> because we were already in kernel mode. And this page fault worked,
>> because it just pushed the error code onto the user stack which was
>> mapped.
>>
>>  - we now took a second page fault within the page fault handler,
>> because now the stack pointer has been decremented and points one user
>> page down that is *not* mapped, so now that page fault cannot push the
>> error code and return information.
>>
>> Now, how we took that original page fault is sadly not very clear at
>> all.  I agree that it's something about system-call (how could we not
>> change stacks otherwise), but why it should have started now, I don't
>> know. I don't think "system_call" has changed at all.
>>
>> Maybe there is something wrong with the new "ret_from_sys_call" logic,
>> and that "use sysret to return to user mode" thing. Because this code
>> sequence:
>>
>> +       movq (RSP-RIP)(%rsp),%rsp
>> +       USERGS_SYSRET64
>>
>> in 'irq_return_via_sysret' is new to 4.0, and instead of entering the
>> kernel with a user stack pointer, maybe we're *exiting* the kernel,
>> and have just reloaded the user stack pointer when "USERGS_SYSRET64"
>> takes some fault.
>
> Yes, so far we happily thought that SYSRET never fails...
>
> This merits adding some code which would at least BUG_ON
> if the faulting address is seen to match SYSRET64.

sysret64 can only fail with #GP, and we're totally screwed if that
happens, although I agree about the BUG_ON in principle.  Where would
we add it that would help in this case, though?  We never even made it
to C code.

In any event, this was a page fault.  sysret64 doesn't access memory.

>
> Now we only check for faulting IRETQ:
>
> error_kernelspace:
>         CFI_REL_OFFSET rcx, RCX+8
>         incl %ebx
>         leaq native_irq_return_iret(%rip),%rcx
>         cmpq %rcx,RIP+8(%rsp)
>         je error_bad_iret
>
>>
>> Is PARAVIRT enabled? The three nops at the beginning of 'page_fault'
>> make me suspect it is, and that that is some paravirt rewriting
>> area. What does paravirt do for that USERGS_SYSRET64 (or for
>> SWAPGS_UNSAFE_STACK, for that matter)?

On Xen, it goes to xen_sysret64, which touches the same percpu
variables that we touch on entry.  So I still like my percpu vmap
fault hypothesis, even though I don't understand what would trigger
it.

At the risk of asking awful questions, what happens if we deliver an
IST interrupt in vmx_handle_external_intr?  Can that happen?  It can't
be a good thing if it happens.

--Andy

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 21:55                 ` Andy Lutomirski
@ 2015-03-18 22:17                   ` Denys Vlasenko
  2015-03-18 22:20                     ` Andy Lutomirski
  2015-03-18 22:18                   ` Linus Torvalds
                                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 77+ messages in thread
From: Denys Vlasenko @ 2015-03-18 22:17 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Linus Torvalds, Stefan Seyfried, Takashi Iwai, X86 ML, LKML, Tejun Heo

On 03/18/2015 10:55 PM, Andy Lutomirski wrote:
> On Wed, Mar 18, 2015 at 2:42 PM, Denys Vlasenko <dvlasenk@redhat.com> wrote:
>> On 03/18/2015 10:32 PM, Linus Torvalds wrote:
>>> On Wed, Mar 18, 2015 at 12:26 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>>>>
>>>>> crash> disassemble page_fault
>>>>> Dump of assembler code for function page_fault:
>>>>>    0xffffffff816834a0 <+0>:     data32 xchg %ax,%ax
>>>>>    0xffffffff816834a3 <+3>:     data32 xchg %ax,%ax
>>>>>    0xffffffff816834a6 <+6>:     data32 xchg %ax,%ax
>>>>>    0xffffffff816834a9 <+9>:     sub    $0x78,%rsp
>>>>>    0xffffffff816834ad <+13>:    callq  0xffffffff81683620 <error_entry>
>>>>
>>>> The callq was the double-faulting instruction, and it is indeed the
>>>> first function in here that would have accessed the stack.  (The sub
>>>> *changes* rsp but isn't a memory access.)  So, since RSP is bogus, we
>>>> page fault, and the page fault is promoted to a double fault.  The
>>>> surprising thing is that the page fault itself seems to have been
>>>> delivered okay, and RSP wasn't on a page boundary.
>>>
>>> Not at all surprising, and sure it was on a page boundary..
>>>
>>> Look closer.
>>>
>>> %rsp is 00007fffa55eafb8.
>>>
>>> But that's *after* page_fault has done that
>>>
>>>     sub    $0x78,%rsp
>>>
>>> so %rsp when the page fault happened was 0x7fffa55eb030. Which is a
>>> different page.
> 
> Ah, I forgot to add 0x78.  You're right, of course.
> 
>>>
>>> And that page happened to be mapped.
>>>
>>> So what happened is:
>>>
>>>  - we somehow entered kernel mode without switching stacks
>>>
>>>    (ie presumably syscall)
>>>
>>>  - the user stack was still fine
>>>
>>>  - we took a page fault, which once again didn't switch stacks,
>>> because we were already in kernel mode. And this page fault worked,
>>> because it just pushed the error code onto the user stack which was
>>> mapped.
>>>
>>>  - we now took a second page fault within the page fault handler,
>>> because now the stack pointer has been decremented and points one user
>>> page down that is *not* mapped, so now that page fault cannot push the
>>> error code and return information.
>>>
>>> Now, how we took that original page fault is sadly not very clear at
>>> all.  I agree that it's something about system-call (how could we not
>>> change stacks otherwise), but why it should have started now, I don't
>>> know. I don't think "system_call" has changed at all.
>>>
>>> Maybe there is something wrong with the new "ret_from_sys_call" logic,
>>> and that "use sysret to return to user mode" thing. Because this code
>>> sequence:
>>>
>>> +       movq (RSP-RIP)(%rsp),%rsp
>>> +       USERGS_SYSRET64
>>>
>>> in 'irq_return_via_sysret' is new to 4.0, and instead of entering the
>>> kernel with a user stack pointer, maybe we're *exiting* the kernel,
>>> and have just reloaded the user stack pointer when "USERGS_SYSRET64"
>>> takes some fault.
>>
>> Yes, so far we happily thought that SYSRET never fails...
>>
>> This merits adding some code which would at least BUG_ON
>> if the faulting address is seen to match SYSRET64.
> 
> sysret64 can only fail with #GP, and we're totally screwed if that
> happens, although I agree about the BUG_ON in principle.  Where would
> we add it that would help in this case, though?  We never even made it
> to C code.
> 
> In any event, this was a page fault.  sysret64 doesn't access memory.

Let's see.

Faulting SYSRET will still be in CPL0.
It would drop CPU into the #GP handler
but %rsp is already loaded with _user_ %rsp (!).

#GP handler will start pushing stuff onto stack,
happily thinking that it is a kernel stack.

This can cause a page fault.

Most likely, this page fault won't succeed,
and we'd get a double fault with %rip somewhere in the #GP handler.

Yes, this doesn't entirely match what we see...

There is an easy way to test the theory that SYSRET is to blame.

Just replace

        movq RCX(%rsp),%rcx
        cmpq %rcx,RIP(%rsp)             /* RCX == RIP */
        jne opportunistic_sysret_failed

this "jne" with "jmp", and try to reproduce.


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 21:55                 ` Andy Lutomirski
  2015-03-18 22:17                   ` Denys Vlasenko
@ 2015-03-18 22:18                   ` Linus Torvalds
  2015-03-18 22:24                     ` Andy Lutomirski
  2015-03-18 22:22                   ` Jiri Kosina
  2015-03-19 13:21                   ` Denys Vlasenko
  3 siblings, 1 reply; 77+ messages in thread
From: Linus Torvalds @ 2015-03-18 22:18 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Denys Vlasenko, Stefan Seyfried, Takashi Iwai, X86 ML, LKML, Tejun Heo

On Wed, Mar 18, 2015 at 2:55 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>
> On Xen, it goes to xen_sysret64, which touches the same percpu
> variables that we touch on entry.  So I still like my percpu vmap
> fault hypothesis, even though I don't understand what would trigger
> it.

I don't dislike the theory per se, but not only don't I see how it
could happen on regular execution on a laptop, but I also don't see
why this fault behavior would be new to 4.0.

(And I do believe that we should make sure that CPU bringup ends up
faulting in the percpu area, even if I don't really see why that would
be the issue here)

Afaik, the system call entry code hasn't changed at all.

What *has* changed is the "paranoid" handling (double-fault has that
magical "paranoid=2" thing, for example) and the return to user-space
code.

Which is really why I don't believe in that syscall thing. Not because
it isn't the obvious culprit, but simply because it hasn't *changed*.

Or is there something subtle I've missed?

                        Linus

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 22:17                   ` Denys Vlasenko
@ 2015-03-18 22:20                     ` Andy Lutomirski
  2015-03-18 22:27                       ` Denys Vlasenko
  0 siblings, 1 reply; 77+ messages in thread
From: Andy Lutomirski @ 2015-03-18 22:20 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Linus Torvalds, Stefan Seyfried, Takashi Iwai, X86 ML, LKML, Tejun Heo

On Wed, Mar 18, 2015 at 3:17 PM, Denys Vlasenko <dvlasenk@redhat.com> wrote:
> On 03/18/2015 10:55 PM, Andy Lutomirski wrote:
>> On Wed, Mar 18, 2015 at 2:42 PM, Denys Vlasenko <dvlasenk@redhat.com> wrote:
>>> On 03/18/2015 10:32 PM, Linus Torvalds wrote:
>>>> On Wed, Mar 18, 2015 at 12:26 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>>>>>
>>>>>> crash> disassemble page_fault
>>>>>> Dump of assembler code for function page_fault:
>>>>>>    0xffffffff816834a0 <+0>:     data32 xchg %ax,%ax
>>>>>>    0xffffffff816834a3 <+3>:     data32 xchg %ax,%ax
>>>>>>    0xffffffff816834a6 <+6>:     data32 xchg %ax,%ax
>>>>>>    0xffffffff816834a9 <+9>:     sub    $0x78,%rsp
>>>>>>    0xffffffff816834ad <+13>:    callq  0xffffffff81683620 <error_entry>
>>>>>
>>>>> The callq was the double-faulting instruction, and it is indeed the
>>>>> first function in here that would have accessed the stack.  (The sub
>>>>> *changes* rsp but isn't a memory access.)  So, since RSP is bogus, we
>>>>> page fault, and the page fault is promoted to a double fault.  The
>>>>> surprising thing is that the page fault itself seems to have been
>>>>> delivered okay, and RSP wasn't on a page boundary.
>>>>
>>>> Not at all surprising, and sure it was on a page boundary..
>>>>
>>>> Look closer.
>>>>
>>>> %rsp is 00007fffa55eafb8.
>>>>
>>>> But that's *after* page_fault has done that
>>>>
>>>>     sub    $0x78,%rsp
>>>>
>>>> so %rsp when the page fault happened was 0x7fffa55eb030. Which is a
>>>> different page.
>>
>> Ah, I forgot to add 0x78.  You're right, of course.
>>
>>>>
>>>> And that page happened to be mapped.
>>>>
>>>> So what happened is:
>>>>
>>>>  - we somehow entered kernel mode without switching stacks
>>>>
>>>>    (ie presumably syscall)
>>>>
>>>>  - the user stack was still fine
>>>>
>>>>  - we took a page fault, which once again didn't switch stacks,
>>>> because we were already in kernel mode. And this page fault worked,
>>>> because it just pushed the error code onto the user stack which was
>>>> mapped.
>>>>
>>>>  - we now took a second page fault within the page fault handler,
>>>> because now the stack pointer has been decremented and points one user
>>>> page down that is *not* mapped, so now that page fault cannot push the
>>>> error code and return information.
>>>>
>>>> Now, how we took that original page fault is sadly not very clear at
>>>> all.  I agree that it's something about system-call (how could we not
>>>> change stacks otherwise), but why it should have started now, I don't
>>>> know. I don't think "system_call" has changed at all.
>>>>
>>>> Maybe there is something wrong with the new "ret_from_sys_call" logic,
>>>> and that "use sysret to return to user mode" thing. Because this code
>>>> sequence:
>>>>
>>>> +       movq (RSP-RIP)(%rsp),%rsp
>>>> +       USERGS_SYSRET64
>>>>
>>>> in 'irq_return_via_sysret' is new to 4.0, and instead of entering the
>>>> kernel with a user stack pointer, maybe we're *exiting* the kernel,
>>>> and have just reloaded the user stack pointer when "USERGS_SYSRET64"
>>>> takes some fault.
>>>
>>> Yes, so far we happily thought that SYSRET never fails...
>>>
>>> This merits adding some code which would at least BUG_ON
>>> if the faulting address is seen to match SYSRET64.
>>
>> sysret64 can only fail with #GP, and we're totally screwed if that
>> happens, although I agree about the BUG_ON in principle.  Where would
>> we add it that would help in this case, though?  We never even made it
>> to C code.
>>
>> In any event, this was a page fault.  sysret64 doesn't access memory.
>
> Let's see.
>
> Faulting SYSRET will still be in CPL0.
> It would drop CPU into the #GP handler
> but %rsp is already loaded with _user_ %rsp (!).
>
> #GP handler will start pushing stuff onto stack,
> happily thinking that it is a kernel stack.
>
> This can cause a page fault.
>
> Most likely, this page fault won't succeed,
> and we'd get a double fault with %rip somewhere in the #GP handler.
>
> Yes, this doesn't entirely match what we see...
>
> There is an easy way to test the theory that SYSRET is to blame.
>
> Just replace
>
>         movq RCX(%rsp),%rcx
>         cmpq %rcx,RIP(%rsp)             /* RCX == RIP */
>         jne opportunistic_sysret_failed
>
> this "jne" with "jmp", and try to reproduce.
>

This is a classic root exploit, and it's why we check for
non-canonical RIP.  In theory, that's the only way this can happen.
Intel screwed up -- AMD never fails SYSRET.

--Andy

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 21:55                 ` Andy Lutomirski
  2015-03-18 22:17                   ` Denys Vlasenko
  2015-03-18 22:18                   ` Linus Torvalds
@ 2015-03-18 22:22                   ` Jiri Kosina
  2015-03-18 22:28                     ` Linus Torvalds
  2015-03-18 22:29                     ` Andy Lutomirski
  2015-03-19 13:21                   ` Denys Vlasenko
  3 siblings, 2 replies; 77+ messages in thread
From: Jiri Kosina @ 2015-03-18 22:22 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Denys Vlasenko, Linus Torvalds, Stefan Seyfried, Takashi Iwai,
	X86 ML, LKML, Tejun Heo

On Wed, 18 Mar 2015, Andy Lutomirski wrote:

> sysret64 can only fail with #GP, and we're totally screwed if that
> happens, 

But what if the GPF handler pagefaults afterwards? It'd be operating on 
user stack already.

-- 
Jiri Kosina
SUSE Labs


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 22:18                   ` Linus Torvalds
@ 2015-03-18 22:24                     ` Andy Lutomirski
  0 siblings, 0 replies; 77+ messages in thread
From: Andy Lutomirski @ 2015-03-18 22:24 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Denys Vlasenko, Stefan Seyfried, Takashi Iwai, X86 ML, LKML, Tejun Heo

On Wed, Mar 18, 2015 at 3:18 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Wed, Mar 18, 2015 at 2:55 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>
>> On Xen, it goes to xen_sysret64, which touches the same percpu
>> variables that we touch on entry.  So I still like my percpu vmap
>> fault hypothesis, even though I don't understand what would trigger
>> it.
>
> I don't dislike the theory per se, but not only don't I see how it
> could happen on regular execution on a laptop, but I also don't see
> why this fault behavior would be new to 4.0.
>
> (And I do believe that we should make sure that CPU bringup ends up
> faulting in the percpu area, even if I don't really see why that would
> be the issue here)
>
> Afaik, the system call entry code hasn't changed at all.
>
> What *has* changed is the "paranoid" handling (double-fault has that
> magical "paranoid=2" thing, for example) and the return to user-space
> code.

Indeed.  If this were #DB, #BP, or #MC, I'd believe that, but the page
fault code didn't change.  And double-fault didn't materially change
-- the paranoid=2 thing means to opt *out* of the recent changes.  So
I'm not convinced by that theory.

>
> Which is really why I don't believe in that syscall thing. Not because
> it isn't the obvious culprit, but simply because it hasn't *changed*.
>
> Or is there something subtle I've missed?

We did change one thing here: for the first time it's possible to
exit using sysret when we didn't enter using syscall.  But this really
shouldn't matter on native, since we don't touch any memory at all
between the stack switch and sysret.

--Andy

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 22:20                     ` Andy Lutomirski
@ 2015-03-18 22:27                       ` Denys Vlasenko
  0 siblings, 0 replies; 77+ messages in thread
From: Denys Vlasenko @ 2015-03-18 22:27 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Denys Vlasenko, Linus Torvalds, Stefan Seyfried, Takashi Iwai,
	X86 ML, LKML, Tejun Heo

On Wed, Mar 18, 2015 at 11:20 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>> There is an easy way to test the theory that SYSRET is to blame.
>>
>> Just replace
>>
>>         movq RCX(%rsp),%rcx
>>         cmpq %rcx,RIP(%rsp)             /* RCX == RIP */
>>         jne opportunistic_sysret_failed
>>
>> this "jne" with "jmp", and try to reproduce.
>>
>
> This is a classic root exploit, and it's why we check for
> non-canonical RIP.  In theory, that's the only way this can happen.
> Intel screwed up -- AMD never fails SYSRET.

I'm not saying the code needs to be changed.

I'm saying that *people who see the crash* can make this change,
run the modified kernel, and if crash disappears -
then it is caused by "opportunistic SYSRET".

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 22:22                   ` Jiri Kosina
@ 2015-03-18 22:28                     ` Linus Torvalds
  2015-03-18 22:29                       ` Andy Lutomirski
  2015-03-18 22:29                     ` Andy Lutomirski
  1 sibling, 1 reply; 77+ messages in thread
From: Linus Torvalds @ 2015-03-18 22:28 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Andy Lutomirski, Denys Vlasenko, Stefan Seyfried, Takashi Iwai,
	X86 ML, LKML, Tejun Heo

On Wed, Mar 18, 2015 at 3:22 PM, Jiri Kosina <jkosina@suse.cz> wrote:
>
> But what if the GPF handler pagefaults afterwards? It'd be operating on
> user stack already.

So I think this might be the answer. We don't see the GP fault,
because we don't have a backtrace, because that backtrace is on the
user stack (which is why the stack trace dumping fails - we should
probably fix that, btw - the second oops is just confusing and not
helpful).

Is the Intel check for canonical address (that __VIRTUAL_MASK_SHIFT
thing) perhaps wrong or not as strict as what Intel CPUs do? We'd never
notice in normal situations..

                    Linus

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 22:28                     ` Linus Torvalds
@ 2015-03-18 22:29                       ` Andy Lutomirski
  0 siblings, 0 replies; 77+ messages in thread
From: Andy Lutomirski @ 2015-03-18 22:29 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jiri Kosina, Denys Vlasenko, Stefan Seyfried, Takashi Iwai,
	X86 ML, LKML, Tejun Heo

On Wed, Mar 18, 2015 at 3:28 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Wed, Mar 18, 2015 at 3:22 PM, Jiri Kosina <jkosina@suse.cz> wrote:
>>
>> But what if the GPF handler pagefaults afterwards? It'd be operating on
>> user stack already.
>
> So I think this might be the answer. We don't see the GP fault,
> because we don't have a backtrace, because that backtrace is on the
> user stack (which is why the stack trace dumping fails - we should
> probably fix that, btw - the second oops is just confusing and not
> helpful).
>
> Is the Intel check for canonical address (that __VIRTUAL_MASK_SHIFT
> thing) perhaps wrong or not as strict as what Intel CPUs do? We'd never
> notice in normal situations..

I explicitly tested that I could blow up the kernel if I intentionally
broke that test, and I couldn't blow it up with the test as written.
That doesn't prove it's correct, though.
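
To make the property under discussion concrete, here is a rough Python
model of the two checks. It is a sketch, not the entry_64.S code: it
assumes 48-bit virtual addresses (so a shift of 47), and the function
names are invented for illustration. The first function is the
architectural notion of "canonical"; the second is a stricter test that
only accepts addresses whose upper bits are all zero, which is what I
understand a shift-and-test-for-zero check to amount to.

```python
VIRTUAL_MASK_SHIFT = 47  # assumption: 48-bit virtual addresses


def is_canonical(addr, va_bits=48):
    """Architectural rule: bits 63..(va_bits - 1) must all equal bit va_bits - 1."""
    top = addr >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones


def passes_user_rip_check(rip):
    """Stricter model: every bit at or above VIRTUAL_MASK_SHIFT must be zero."""
    return (rip >> VIRTUAL_MASK_SHIFT) == 0


for addr in (0x00007ffa356b5d60,   # ordinary user address
             0x00007fffffffffff,   # last canonical user address
             0x0000800000000000,   # first non-canonical address
             0xffffffff816834ad):  # kernel text address (canonical, but not user)
    print(hex(addr), is_canonical(addr), passes_user_rip_check(addr))
```

Under these assumptions the stricter test rejects every non-canonical
address, plus all kernel addresses. Whether the real CPU applies exactly
the same rule is the open question.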

--Andy

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 22:22                   ` Jiri Kosina
  2015-03-18 22:28                     ` Linus Torvalds
@ 2015-03-18 22:29                     ` Andy Lutomirski
  2015-03-18 22:38                       ` Stefan Seyfried
  2015-03-19 10:16                       ` Takashi Iwai
  1 sibling, 2 replies; 77+ messages in thread
From: Andy Lutomirski @ 2015-03-18 22:29 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Denys Vlasenko, Linus Torvalds, Stefan Seyfried, Takashi Iwai,
	X86 ML, LKML, Tejun Heo

On Wed, Mar 18, 2015 at 3:22 PM, Jiri Kosina <jkosina@suse.cz> wrote:
> On Wed, 18 Mar 2015, Andy Lutomirski wrote:
>
>> sysret64 can only fail with #GP, and we're totally screwed if that
>> happens,
>
> But what if the GPF handler pagefaults afterwards? It'd be operating on
> user stack already.

Good point.

Stefan, can you try changing the first "jne
opportunistic_sysret_failed" to "jmp opportunistic_sysret_failed" in
entry_64.S and seeing if you can reproduce this?  (Is it easy enough
to reproduce that this would tell us anything?)

It's a shame that double_fault doesn't record what gs was on entry.
If we did sysret -> general_protection -> page_fault -> double_fault,
then we'd enter double_fault with usergs, whereas syscall ->
page_fault -> double_fault would enter double_fault with kernelgs.

Hmm.  We may be able to answer this more directly.  Stefan, can you
dump a couple hundred bytes starting at 0x00007fffa55eafb8 (i.e. your
page_fault stack at the time of the failure)?  That will tell us the
faulting address.  If that fails, try starting at 00007fffa55eb000
instead.

--Andy

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 22:29                     ` Andy Lutomirski
@ 2015-03-18 22:38                       ` Stefan Seyfried
  2015-03-18 22:40                         ` Andy Lutomirski
  2015-03-19 10:16                       ` Takashi Iwai
  1 sibling, 1 reply; 77+ messages in thread
From: Stefan Seyfried @ 2015-03-18 22:38 UTC (permalink / raw)
  To: Andy Lutomirski, Jiri Kosina
  Cc: Denys Vlasenko, Linus Torvalds, Takashi Iwai, X86 ML, LKML, Tejun Heo

On 18.03.2015 at 23:29, Andy Lutomirski wrote:
> On Wed, Mar 18, 2015 at 3:22 PM, Jiri Kosina <jkosina@suse.cz> wrote:
>> On Wed, 18 Mar 2015, Andy Lutomirski wrote:
>>
>>> sysret64 can only fail with #GP, and we're totally screwed if that
>>> happens,
>>
>> But what if the GPF handler pagefaults afterwards? It'd be operating on
>> user stack already.
> 
> Good point.
> 
> Stefan, can you try changing the first "jne
> opportunistic_sysret_failed" to "jmp opportunistic_sysret_failed" in
> entry_64.S and seeing if you can reproduce this?  (Is it easy enough
> to reproduce that this would tell us anything?)

I have no good way of reproducing the issue (happens once per week...)
but apparently Takashi has, so I'd like to hand this task over to him.

> It's a shame that double_fault doesn't record what gs was on entry.
> If we did sysret -> general_protection -> page_fault -> double_fault,
> then we'd enter double_fault with usergs, whereas syscall ->
> page_fault -> double_fault would enter double_fault with kernelgs.
> 
> Hmm.  We may be able to answer this more directly.  Stefan, can you
> dump a couple hundred bytes starting at 0x00007fffa55eafb8 (i.e. your
> page_fault stack at the time of the failure)?  That will tell us the
> faulting address.  If that fails, try starting at 00007fffa55eb000
> instead.

Unfortunately not, is this userspace memory? It's not in the dump I have.
This issue is the first I have seen where having a full dump would be
really helpful apart from cosmetic reasons...
-- 
Stefan Seyfried
Linux Consultant & Developer -- GPG Key: 0x731B665B

B1 Systems GmbH
Osterfeldstraße 7 / 85088 Vohburg / http://www.b1-systems.de
GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt,HRB 3537

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 22:38                       ` Stefan Seyfried
@ 2015-03-18 22:40                         ` Andy Lutomirski
  2015-03-18 23:22                           ` Andy Lutomirski
  0 siblings, 1 reply; 77+ messages in thread
From: Andy Lutomirski @ 2015-03-18 22:40 UTC (permalink / raw)
  To: Stefan Seyfried
  Cc: Jiri Kosina, Denys Vlasenko, Linus Torvalds, Takashi Iwai,
	X86 ML, LKML, Tejun Heo

On Wed, Mar 18, 2015 at 3:38 PM, Stefan Seyfried
<stefan.seyfried@googlemail.com> wrote:
> On 18.03.2015 at 23:29, Andy Lutomirski wrote:
>> On Wed, Mar 18, 2015 at 3:22 PM, Jiri Kosina <jkosina@suse.cz> wrote:
>>> On Wed, 18 Mar 2015, Andy Lutomirski wrote:
>>>
>>>> sysret64 can only fail with #GP, and we're totally screwed if that
>>>> happens,
>>>
>>> But what if the GPF handler pagefaults afterwards? It'd be operating on
>>> user stack already.
>>
>> Good point.
>>
>> Stefan, can you try changing the first "jne
>> opportunistic_sysret_failed" to "jmp opportunistic_sysret_failed" in
>> entry_64.S and seeing if you can reproduce this?  (Is it easy enough
>> to reproduce that this would tell us anything?)
>
> I have no good way of reproducing the issue (happens once per week...)
> but apparently Takashi has, so I'd like to hand this task over to him.
>
>> It's a shame that double_fault doesn't record what gs was on entry.
>> If we did sysret -> general_protection -> page_fault -> double_fault,
>> then we'd enter double_fault with usergs, whereas syscall ->
>> page_fault -> double_fault would enter double_fault with kernelgs.
>>
>> Hmm.  We may be able to answer this more directly.  Stefan, can you
>> dump a couple hundred bytes starting at 0x00007fffa55eafb8 (i.e. your
>> page_fault stack at the time of the failure)?  That will tell us the
>> faulting address.  If that fails, try starting at 00007fffa55eb000
>> instead.
>
> Unfortunately not, is this userspace memory? It's not in the dump I have.
> This issue is the first I have seen where having a full dump would be
> really helpful apart from cosmetic reasons...

Yes, it's userspace.  Thanks for checking, though.

--Andy

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 22:40                         ` Andy Lutomirski
@ 2015-03-18 23:22                           ` Andy Lutomirski
  2015-03-19  0:23                             ` Stefan Seyfried
  0 siblings, 1 reply; 77+ messages in thread
From: Andy Lutomirski @ 2015-03-18 23:22 UTC (permalink / raw)
  To: Stefan Seyfried
  Cc: Jiri Kosina, Denys Vlasenko, Linus Torvalds, Takashi Iwai,
	X86 ML, LKML, Tejun Heo

On Wed, Mar 18, 2015 at 3:40 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Wed, Mar 18, 2015 at 3:38 PM, Stefan Seyfried
> <stefan.seyfried@googlemail.com> wrote:
>> On 18.03.2015 at 23:29, Andy Lutomirski wrote:
>>> On Wed, Mar 18, 2015 at 3:22 PM, Jiri Kosina <jkosina@suse.cz> wrote:
>>>> On Wed, 18 Mar 2015, Andy Lutomirski wrote:
>>>>
>>>>> sysret64 can only fail with #GP, and we're totally screwed if that
>>>>> happens,
>>>>
>>>> But what if the GPF handler pagefaults afterwards? It'd be operating on
>>>> user stack already.
>>>
>>> Good point.
>>>
>>> Stefan, can you try changing the first "jne
>>> opportunistic_sysret_failed" to "jmp opportunistic_sysret_failed" in
>>> entry_64.S and seeing if you can reproduce this?  (Is it easy enough
>>> to reproduce that this would tell us anything?)
>>
>> I have no good way of reproducing the issue (happens once per week...)
>> but apparently Takashi has, so I'd like to hand this task over to him.
>>
>>> It's a shame that double_fault doesn't record what gs was on entry.
>>> If we did sysret -> general_protection -> page_fault -> double_fault,
>>> then we'd enter double_fault with usergs, whereas syscall ->
>>> page_fault -> double_fault would enter double_fault with kernelgs.
>>>
>>> Hmm.  We may be able to answer this more directly.  Stefan, can you
>>> dump a couple hundred bytes starting at 0x00007fffa55eafb8 (i.e. your
>>> page_fault stack at the time of the failure)?  That will tell us the
>>> faulting address.  If that fails, try starting at 00007fffa55eb000
>>> instead.
>>
>> Unfortunately not, is this userspace memory? It's not in the dump I have.
>> This issue is the first I have seen where having a full dump would be
>> really helpful apart from cosmetic reasons...
>
> Yes, it's userspace.  Thanks for checking, though.

One more stupid hunch:

Can you do:
x/21xg ffff8801013d4f58

If I counted right, that'll dump task_pt_regs(current).

--Andy

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 23:22                           ` Andy Lutomirski
@ 2015-03-19  0:23                             ` Stefan Seyfried
  2015-03-19  0:57                               ` Andy Lutomirski
  0 siblings, 1 reply; 77+ messages in thread
From: Stefan Seyfried @ 2015-03-19  0:23 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Jiri Kosina, Denys Vlasenko, Linus Torvalds, Takashi Iwai,
	X86 ML, LKML, Tejun Heo

Am 19.03.2015 um 00:22 schrieb Andy Lutomirski:
> On Wed, Mar 18, 2015 at 3:40 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>> Yes, it's userspace.  Thanks for checking, though.
> 
> One more stupid hunch:
> 
> Can you do:
> x/21xg ffff8801013d4f58
> 
> If I counted right, that'll dump task_pt_regs(current).

That's all zeroes:
crash> x /21xg 0xffff8801013d4f58
0xffff8801013d4f58:     0x0000000000000000      0x0000000000000000
0xffff8801013d4f68:     0x0000000000000000      0x0000000000000000
0xffff8801013d4f78:     0x0000000000000000      0x0000000000000000
0xffff8801013d4f88:     0x0000000000000000      0x0000000000000000
0xffff8801013d4f98:     0x0000000000000000      0x0000000000000000
0xffff8801013d4fa8:     0x0000000000000000      0x0000000000000000
0xffff8801013d4fb8:     0x0000000000000000      0x0000000000000000
0xffff8801013d4fc8:     0x0000000000000000      0x0000000000000000
0xffff8801013d4fd8:     0x0000000000000000      0x0000000000000000
0xffff8801013d4fe8:     0x0000000000000000      0x0000000000000000
0xffff8801013d4ff8:     0x0000000000000000

But maybe you counted wrong (or I'm reading arch/x86/include/asm/processor.h wrong, which is at least as likely...).

#define task_pt_regs(tsk)  ((struct pt_regs *)(tsk)->thread.sp0 - 1)

=> I have the task_struct readily available decoded in the crash utility.

crash> task, search for thread, in thread:
     sp0 = 18446612136629993472
crash> eval 18446612136629993472
hexadecimal: ffff8801013d8000  (18014269664677728KB)
....
crash> print *(struct pt_regs *)(18446612136629993472 - sizeof(struct pt_regs))
$20 = {
  r15 = 18446744071585666077, 
  r14 = 16, 
  r13 = 582, 
  r12 = 18446612136629993352, 
  bp = 24, 
  bx = 18446744071585666061, 
  r11 = 582, 
  r10 = 10760856, 
  r9 = 140712613762160, 
  r8 = 140735967861216, 
  ax = 1, 
  cx = 140712476030103, 
  dx = 140712613782304, 
  si = 1, 
  di = 140712589295616, 
  orig_ax = 209, 
  ip = 140712571864823, 
  cs = 51, 
  flags = 582, 
  sp = 140735967860552, 
  ss = 43
}

=>
r15 = ffffffff8168141d
r12 = ffff8801013d7f88
bx  = ffffffff8168140d
r9  = 7ffa355bd470
ip  = 7ffa32dc86f7
sp  = 7fffa55f1748

looks somehow legit, to my totally untrained eye (ip and sp actually).
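
The conversions above can be redone with a few lines of Python. This is only a sketch: the field values are copied from the crash output in this mail, and the dictionary names are made up for illustration.

```python
# Decimal pt_regs fields as printed by crash, copied from the dump above.
regs = {
    "r15": 18446744071585666077,
    "r12": 18446612136629993352,
    "bx": 18446744071585666061,
    "r9": 140712613762160,
    "ip": 140712571864823,
    "sp": 140735967860552,
}

# Values that look like pointers are easier to judge in hexadecimal.
hex_regs = {name: format(value, "x") for name, value in regs.items()}

for name, value in hex_regs.items():
    print(f"{name:<3} = {value}")
```

Values starting with ffff... are kernel addresses, while the 7ff... ones lie in the user half of the address space, which is what one would expect for a saved user ip and sp.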

I'm off to bed now (01:20 around here ;), will be back in about 7 hours.

Best regards,

	Stefan
-- 
Stefan Seyfried
Linux Consultant & Developer -- GPG Key: 0x731B665B

B1 Systems GmbH
Osterfeldstraße 7 / 85088 Vohburg / http://www.b1-systems.de
GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt,HRB 3537

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-19  0:23                             ` Stefan Seyfried
@ 2015-03-19  0:57                               ` Andy Lutomirski
  2015-03-19  2:15                                 ` Linus Torvalds
  2015-03-19  6:24                                 ` Stefan Seyfried
  0 siblings, 2 replies; 77+ messages in thread
From: Andy Lutomirski @ 2015-03-19  0:57 UTC (permalink / raw)
  To: Stefan Seyfried
  Cc: Jiri Kosina, Denys Vlasenko, Linus Torvalds, Takashi Iwai,
	X86 ML, LKML, Tejun Heo

On Wed, Mar 18, 2015 at 5:23 PM, Stefan Seyfried
<stefan.seyfried@googlemail.com> wrote:
> Am 19.03.2015 um 00:22 schrieb Andy Lutomirski:
>> On Wed, Mar 18, 2015 at 3:40 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>> Yes, it's userspace.  Thanks for checking, though.
>>
>> One more stupid hunch:
>>
>> Can you do:
>> x/21xg ffff8801013d4f58
>>
>> If I counted right, that'll dump task_pt_regs(current).
>
> That's all zeroes:
> crash> x /21xg 0xffff8801013d4f58
> 0xffff8801013d4f58:     0x0000000000000000      0x0000000000000000
> 0xffff8801013d4f68:     0x0000000000000000      0x0000000000000000
> 0xffff8801013d4f78:     0x0000000000000000      0x0000000000000000
> 0xffff8801013d4f88:     0x0000000000000000      0x0000000000000000
> 0xffff8801013d4f98:     0x0000000000000000      0x0000000000000000
> 0xffff8801013d4fa8:     0x0000000000000000      0x0000000000000000
> 0xffff8801013d4fb8:     0x0000000000000000      0x0000000000000000
> 0xffff8801013d4fc8:     0x0000000000000000      0x0000000000000000
> 0xffff8801013d4fd8:     0x0000000000000000      0x0000000000000000
> 0xffff8801013d4fe8:     0x0000000000000000      0x0000000000000000
> 0xffff8801013d4ff8:     0x0000000000000000
>
> But maybe you counted wrong (or I'm reading arch/x86/include/asm/processor.h wrong, which is at least as likely...).
>
> #define task_pt_regs(tsk)  ((struct pt_regs *)(tsk)->thread.sp0 - 1)
>
> => I have the task_struct readily available decoded in the crash utility.
>
> crash> task, search for thread, in thread:
>      sp0 = 18446612136629993472
> crash> eval 18446612136629993472
> hexadecimal: ffff8801013d8000  (18014269664677728KB)

I did indeed count wrong -- THREAD_SIZE != 0x1000.  Whoops.
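
The miscount can be sketched as follows. This is an illustration only: it assumes struct pt_regs is 21 eight-byte fields (168 bytes, matching the 21 fields in the dump below) and that the stack starts at the "ti:" address from the oops; the helper name is hypothetical.

```python
# Sketch of task_pt_regs(): the saved user registers sit at the very top of
# the kernel stack, i.e. at sp0 - sizeof(struct pt_regs).
PT_REGS_SIZE = 21 * 8  # assumption: 21 u64 fields, as in the dump in this thread


def pt_regs_addr(stack_base: int, thread_size: int) -> int:
    """Hypothetical helper: address of pt_regs for a stack of the given size."""
    sp0 = stack_base + thread_size
    return sp0 - PT_REGS_SIZE


TI = 0xFFFF8801013D4000  # "ti:" value from the oops

wrong = pt_regs_addr(TI, 0x1000)  # assumes a one-page stack
right = pt_regs_addr(TI, 0x4000)  # matches sp0 = ffff8801013d8000 from crash

print(f"{wrong:x}")
print(f"{right:x}")
```

The first result is the ffff8801013d4f58 address from the original request; the second, ffff8801013d7f58, is where the pt_regs printed below actually live.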

> ....
> crash> print *(struct pt_regs *)(18446612136629993472 - sizeof(struct pt_regs))

Looks like we last entered via an io_submit syscall.

> $20 = {
>   r15 = 18446744071585666077,
>   r14 = 16,
>   r13 = 582,
>   r12 = 18446612136629993352,
>   bp = 24,
>   bx = 18446744071585666061,
>   r11 = 582,

==flags, which is consistent with a syscall.  However, Denys' big
cleanup isn't in play here, so we probably did FIXUP_TOP_OF_STACK,
maybe even in the syscall in question.

>   r10 = 10760856,
>   r9 = 140712613762160,
>   r8 = 140735967861216,
>   ax = 1,

Entirely reasonable if we're trying to exit from io_submit.

>   cx = 140712476030103,

0x7ffa2d263497

>   dx = 140712613782304,
>   si = 1,
>   di = 140712589295616,
>   orig_ax = 209,

__NR_io_submit

>   ip = 140712571864823,

0x7ffa32dc86f7, which is not equal to cx (oddly, given that this seems
to have been a syscall) and is canonical.  To me, this suggests that
FIXUP_TOP_OF_STACK last executed on a different syscall, in which case
all this opportunistic sysret stuff is a red herring - we never
executed FIXUP_TOP_OF_STACK for this syscall.
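
The two checks made here (canonical, and equal to cx) can be sketched in Python. This assumes 48-bit virtual addresses, i.e. bits 63..47 must all be equal; the function name is made up for illustration.

```python
def is_canonical(addr: int) -> bool:
    """True if bits 63..47 of a 64-bit address are all equal (48-bit VA)."""
    top = (addr & 0xFFFFFFFFFFFFFFFF) >> 47
    return top == 0 or top == 0x1FFFF


saved_ip = 0x7FFA32DC86F7  # pt_regs ip from the dump
saved_cx = 0x7FFA2D263497  # pt_regs cx from the dump

print(is_canonical(saved_ip), saved_ip == saved_cx)
```

A non-canonical return address is the case of interest for a #GP from sysret, so a canonical saved ip argues against that path, at least as far as these saved registers can be trusted.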

>   cs = 51,

__USER_CS

>   flags = 582,

0x246 (i.e. totally normal for userspace, I think)

>   sp = 140735967860552,

0x7fffa55f1748

Note that the double fault happened with rsp == 0x00007fffa55eafb8,
which is the saved rsp here - 0x6790.  That difference is kind of large
to make sense if this is a sysret problem.  Not that I have a better
explanation...

OTOH, if it's a syscall problem, then these regs are from the previous
syscall, so 0x6790 bytes of additional user stack usage is entirely
sensible.  Alternatively, we could have taken a whole pile of nested
page faults until we crossed into the land of unwritable user stack
pages.
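
The stack arithmetic above, as a small sketch. The two addresses come from the dump and the double-fault report; the 4 KiB page size is an assumption about the user stack mapping.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages

saved_rsp = 0x7FFFA55F1748  # pt_regs sp from the dump
fault_rsp = 0x00007FFFA55EAFB8  # RSP in the double-fault report

# How far below the saved user stack pointer the double fault happened.
delta = saved_rsp - fault_rsp
pages_apart = (saved_rsp >> PAGE_SHIFT) - (fault_rsp >> PAGE_SHIFT)

print(hex(delta), delta, pages_apart)
```

So the faulting stack pointer sits 26512 bytes, or seven page boundaries, below the saved one.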

>   ss = 43

__USER_DS

> }
>
> =>
> r15 = ffffffff8168141d
> r12 = ffff8801013d7f88
> bx  = ffffffff8168140d
> r9  = 7ffa355bd470
> ip  = 7ffa32dc86f7
> sp  = 7fffa55f1748
>
> looks somehow legit, to my totally untrained eye (ip and sp actually).

One potentially interesting thing that changed is that we now return
from KVM to userspace (to the next scheduled task, not necessarily to
the run ioctl) via sysret *even if the user return notifier runs*.
This was part of the point of the opportunistic sysret code, and KVM
seems to be involved here.

>
> I'm off to bed now (01:20 around here ;), will be back in about 7 hours.

Thanks for the evening debugging help :)

FWIW, I just noticed that stub_execveat incorrectly calls
RESTORE_TOP_OF_STACK before jumping to int_ret_from_sys_call.
Actually, there seems to be an impressive number of bugs like that
(the syscall slow path totally screws this up, but it seems harmless
to me).  I'm really glad that Denys is removing that code...

Stefan, do you happen to know whether your disassembly of page_fault
came from the instructions in memory or if they came from the vmlinux
file?  Not that I have any relevant ideas there.

--Andy

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-19  0:57                               ` Andy Lutomirski
@ 2015-03-19  2:15                                 ` Linus Torvalds
  2015-03-19  6:24                                 ` Stefan Seyfried
  1 sibling, 0 replies; 77+ messages in thread
From: Linus Torvalds @ 2015-03-19  2:15 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Stefan Seyfried, Jiri Kosina, Denys Vlasenko, Takashi Iwai,
	X86 ML, LKML, Tejun Heo

On Wed, Mar 18, 2015 at 5:57 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>
>>   sp = 140735967860552,
>
> 0x7fffa55f1748
>
> Note that the double fault happened with rsp == 0x00007fffa55eafb8,
> which is the saved rsp here - 0x6790.  That difference is kind of large
> to make sense if this is a sysret problem.  Not that I have a better
> explanation...

Actually, that kind of large difference is what I'd expect if it's a
GP fault on sysret then cascades to more faults because our kernel
stack pointer is crap.

So it starts with getting a GP fault due to the sysret, but now we're
in la-la-land with really odd core register state, so who's to say
that we don't get a recursive fault? We don't use the kernel stack
pointer for getting thread-info any more like we used to, but we still
have code like this in entry_64.S:

        testl $_TIF_WORK_SYSCALL_ENTRY,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)

which seems to know that the thread info is below the kernel stack. So
let's say that the GP fault starts taking recursive GP faults (or
recursive page faults) due to confusion with thread_info accesses or
something. And the stack keeps growing down, because all the faults
just fault themselves. Until finally we hit an unmapped area, and that
stops it - because while we had recursive faulting before, it was our
kernel code that was confused. But now the fault handling ends up
taking a page fault while setting up the error information.

You would *not* expect the stack to be unmapped just under the
original %rsp value. User space has big frames and probably had deep
call chains before it ever hit the problematic case, so there's some
"slop" on the user stack. Only when we run out of slop do we get the
double-fault. Which explains why you should *not* expect the %rsp
values to be similar.

And around 30kB of stack before that happens sounds quite reasonable.

Now, to be honest, I don't see why we'd get the cascading faults, I
just get this feeling that if %rsp is crap, just about anything might
go wrong, and that if it's sysret taking a #GP fault, we're just
screwed.

                         Linus

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-19  0:57                               ` Andy Lutomirski
  2015-03-19  2:15                                 ` Linus Torvalds
@ 2015-03-19  6:24                                 ` Stefan Seyfried
  1 sibling, 0 replies; 77+ messages in thread
From: Stefan Seyfried @ 2015-03-19  6:24 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Jiri Kosina, Denys Vlasenko, Linus Torvalds, Takashi Iwai,
	X86 ML, LKML, Tejun Heo

Good Morning :-)

Am 19.03.2015 um 01:57 schrieb Andy Lutomirski:

> Stefan, do you happen to know whether your disassembly of page_fault
> came from the instructions in memory or if they came from the vmlinux
> file?  Not that I have any relevant ideas there.

I think they came from memory. At least, the disassembly in crash...
crash> disassemble page_fault
Dump of assembler code for function page_fault:
   0xffffffff816834a0 <+0>:     data32 xchg %ax,%ax
   0xffffffff816834a3 <+3>:     data32 xchg %ax,%ax
   0xffffffff816834a6 <+6>:     data32 xchg %ax,%ax
   0xffffffff816834a9 <+9>:     sub    $0x78,%rsp
   0xffffffff816834ad <+13>:    callq  0xffffffff81683620 <error_entry>
   0xffffffff816834b2 <+18>:    mov    %rsp,%rdi
   0xffffffff816834b5 <+21>:    mov    0x78(%rsp),%rsi
   0xffffffff816834ba <+26>:    movq   $0xffffffffffffffff,0x78(%rsp)
   0xffffffff816834c3 <+35>:    callq  0xffffffff810504e0 <do_page_fault>
   0xffffffff816834c8 <+40>:    jmpq   0xffffffff816836d0 <error_exit>
End of assembler dump.

...is different than the one from loading vmlinux in gdb:

Reading symbols from vmlinux-4.0.0-rc3-2.gd5c547f-desktop...done.
Reading symbols from /usr/lib/debug/boot/vmlinux-4.0.0-rc3-2.gd5c547f-desktop.debug...done.
(gdb) disassemble page_fault
Dump of assembler code for function page_fault:
   0xffffffff816834a0 <+0>:     data16 xchg %ax,%ax
   0xffffffff816834a3 <+3>:     callq  *0x7a5b07(%rip)        # 0xffffffff81e28fb0 <pv_irq_ops+48>
   0xffffffff816834a9 <+9>:     sub    $0x78,%rsp
   0xffffffff816834ad <+13>:    callq  0xffffffff81683620 <error_entry>
   0xffffffff816834b2 <+18>:    mov    %rsp,%rdi
   0xffffffff816834b5 <+21>:    mov    0x78(%rsp),%rsi
   0xffffffff816834ba <+26>:    movq   $0xffffffffffffffff,0x78(%rsp)
   0xffffffff816834c3 <+35>:    callq  0xffffffff810504e0 <do_page_fault>
   0xffffffff816834c8 <+40>:    jmpq   0xffffffff816836d0 <error_exit>
End of assembler dump.
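
To pin down exactly where the two listings diverge, they can be compared offset by offset. The following is a rough Python sketch using shortened copies of the listings above; the variable and function names are made up for illustration.

```python
import re

# Shortened copies of the two listings above: crash (memory) vs gdb (vmlinux).
IN_MEMORY = """
   0xffffffff816834a0 <+0>:     data32 xchg %ax,%ax
   0xffffffff816834a3 <+3>:     data32 xchg %ax,%ax
   0xffffffff816834a6 <+6>:     data32 xchg %ax,%ax
   0xffffffff816834a9 <+9>:     sub    $0x78,%rsp
   0xffffffff816834ad <+13>:    callq  0xffffffff81683620 <error_entry>
"""
ON_DISK = """
   0xffffffff816834a0 <+0>:     data16 xchg %ax,%ax
   0xffffffff816834a3 <+3>:     callq  *0x7a5b07(%rip)        # 0xffffffff81e28fb0 <pv_irq_ops+48>
   0xffffffff816834a9 <+9>:     sub    $0x78,%rsp
   0xffffffff816834ad <+13>:    callq  0xffffffff81683620 <error_entry>
"""

LINE = re.compile(r"<\+(\d+)>:\s+(.*)")


def parse(listing: str) -> dict:
    """Map function offset -> instruction text with whitespace collapsed."""
    out = {}
    for line in listing.splitlines():
        m = LINE.search(line)
        if m:
            out[int(m.group(1))] = " ".join(m.group(2).split())
    return out


mem, disk = parse(IN_MEMORY), parse(ON_DISK)
# Offsets present in only one listing, or decoded differently, count as differing.
differing = sorted(off for off in mem.keys() | disk.keys() if mem.get(off) != disk.get(off))

for off in differing:
    print(f"+{off}: memory={mem.get(off)!r} vmlinux={disk.get(off)!r}")
```

Only the first nine bytes differ: the indirect call through pv_irq_ops in vmlinux shows up as nops in memory. That would be consistent with the call site having been patched at boot, but that reading is an assumption, not something established in this thread.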

Best regards,

	Stefan
-- 
Stefan Seyfried
Linux Consultant & Developer -- GPG Key: 0x731B665B

B1 Systems GmbH
Osterfeldstraße 7 / 85088 Vohburg / http://www.b1-systems.de
GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt,HRB 3537

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 22:29                     ` Andy Lutomirski
  2015-03-18 22:38                       ` Stefan Seyfried
@ 2015-03-19 10:16                       ` Takashi Iwai
  2015-03-19 10:58                         ` Denys Vlasenko
  1 sibling, 1 reply; 77+ messages in thread
From: Takashi Iwai @ 2015-03-19 10:16 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Jiri Kosina, Denys Vlasenko, Linus Torvalds, Stefan Seyfried,
	X86 ML, LKML, Tejun Heo

[-- Attachment #1: Type: text/plain, Size: 1217 bytes --]

Hi,

sorry for taking so long to get back to this topic.

At Wed, 18 Mar 2015 15:29:14 -0700,
Andy Lutomirski wrote:
> 
> On Wed, Mar 18, 2015 at 3:22 PM, Jiri Kosina <jkosina@suse.cz> wrote:
> > On Wed, 18 Mar 2015, Andy Lutomirski wrote:
> >
> >> sysret64 can only fail with #GP, and we're totally screwed if that
> >> happens,
> >
> > But what if the GPF handler pagefaults afterwards? It'd be operating on
> > user stack already.
> 
> Good point.
> 
> Stefan, can you try changing the first "jne
> opportunistic_sysret_failed" to "jmp opportunistic_sysret_failed" in
> entry_64.S and seeing if you can reproduce this?  (Is it easy enough
> to reproduce that this would tell us anything?)

I tried this, and the same crash still happens.

On my machine (a Dell desktop with a 4-core IvyBridge and 8GB RAM), I
can reproduce it relatively easily.  Start a desktop session as usual,
start a KVM guest with 1GB of memory and 4 CPUs, and start compiling a
kernel in the VM with make -j4.  Meanwhile, start compiling a kernel
with make -j8 on the host, too.  So nothing too special there.  The
kconfig is attached below.

I haven't set up kdump on this machine yet because of limited disk
space.  I'll try to sort that out now.


Takashi


[-- Attachment #2: .config --]
[-- Type: application/octet-stream, Size: 109086 bytes --]

#
# Automatically generated file; DO NOT EDIT.
# Linux/x86 4.0.0-rc4 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_PERF_EVENTS_INTEL_UNCORE=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_MMU=y
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ZONE_DMA32=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_HAVE_INTEL_TXT=y
CONFIG_X86_64_SMP=y
CONFIG_X86_HT=y
CONFIG_ARCH_HWEIGHT_CFLAGS="-fcall-saved-rdi -fcall-saved-rsi -fcall-saved-rdx -fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9 -fcall-saved-r10 -fcall-saved-r11"
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y

#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION="-testx"
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
CONFIG_CROSS_MEMORY_ATTACH=y
CONFIG_FHANDLE=y
CONFIG_USELIB=y
CONFIG_AUDIT=y
CONFIG_HAVE_ARCH_AUDITSYSCALL=y
CONFIG_AUDITSYSCALL=y
CONFIG_AUDIT_WATCH=y
CONFIG_AUDIT_TREE=y

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_IRQ_LEGACY_ALLOC_HWIRQ=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_IRQ_DOMAIN=y
CONFIG_GENERIC_MSI_IRQ=y
# CONFIG_IRQ_DOMAIN_DEBUG is not set
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_ARCH_CLOCKSOURCE_DATA=y
CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
CONFIG_GENERIC_CMOS_UPDATE=y

#
# Timers subsystem
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
CONFIG_NO_HZ_IDLE=y
# CONFIG_NO_HZ_FULL is not set
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y

#
# CPU/Task time and stats accounting
#
CONFIG_TICK_CPU_ACCOUNTING=y
# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
# CONFIG_IRQ_TIME_ACCOUNTING is not set
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
CONFIG_TASK_XACCT=y
CONFIG_TASK_IO_ACCOUNTING=y

#
# RCU Subsystem
#
CONFIG_PREEMPT_RCU=y
CONFIG_SRCU=y
CONFIG_TASKS_RCU=y
CONFIG_RCU_STALL_COMMON=y
# CONFIG_RCU_USER_QS is not set
CONFIG_RCU_FANOUT=64
CONFIG_RCU_FANOUT_LEAF=16
# CONFIG_RCU_FANOUT_EXACT is not set
CONFIG_RCU_FAST_NO_HZ=y
# CONFIG_TREE_RCU_TRACE is not set
CONFIG_RCU_BOOST=y
CONFIG_RCU_KTHREAD_PRIO=1
CONFIG_RCU_BOOST_DELAY=500
CONFIG_RCU_NOCB_CPU=y
# CONFIG_RCU_NOCB_CPU_NONE is not set
# CONFIG_RCU_NOCB_CPU_ZERO is not set
CONFIG_RCU_NOCB_CPU_ALL=y
CONFIG_BUILD_BIN2C=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=18
CONFIG_LOG_CPU_MAX_BUF_SHIFT=12
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
CONFIG_ARCH_SUPPORTS_INT128=y
CONFIG_NUMA_BALANCING=y
CONFIG_NUMA_BALANCING_DEFAULT_ENABLED=y
CONFIG_CGROUPS=y
# CONFIG_CGROUP_DEBUG is not set
CONFIG_CGROUP_FREEZER=y
CONFIG_CGROUP_DEVICE=y
CONFIG_CPUSETS=y
CONFIG_PROC_PID_CPUSET=y
CONFIG_CGROUP_CPUACCT=y
CONFIG_PAGE_COUNTER=y
CONFIG_MEMCG=y
CONFIG_MEMCG_SWAP=y
# CONFIG_MEMCG_SWAP_ENABLED is not set
# CONFIG_MEMCG_KMEM is not set
CONFIG_CGROUP_HUGETLB=y
CONFIG_CGROUP_PERF=y
CONFIG_CGROUP_SCHED=y
CONFIG_FAIR_GROUP_SCHED=y
CONFIG_CFS_BANDWIDTH=y
CONFIG_RT_GROUP_SCHED=y
CONFIG_BLK_CGROUP=y
# CONFIG_DEBUG_BLK_CGROUP is not set
CONFIG_CHECKPOINT_RESTORE=y
CONFIG_NAMESPACES=y
CONFIG_UTS_NS=y
CONFIG_IPC_NS=y
CONFIG_USER_NS=y
CONFIG_PID_NS=y
CONFIG_NET_NS=y
CONFIG_SCHED_AUTOGROUP=y
# CONFIG_SYSFS_DEPRECATED is not set
CONFIG_RELAY=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_RD_GZIP=y
CONFIG_RD_BZIP2=y
CONFIG_RD_LZMA=y
CONFIG_RD_XZ=y
CONFIG_RD_LZO=y
CONFIG_RD_LZ4=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SYSCTL=y
CONFIG_ANON_INODES=y
CONFIG_HAVE_UID16=y
CONFIG_SYSCTL_EXCEPTION_TRACE=y
CONFIG_HAVE_PCSPKR_PLATFORM=y
CONFIG_BPF=y
CONFIG_EXPERT=y
CONFIG_UID16=y
CONFIG_SGETMASK_SYSCALL=y
CONFIG_SYSFS_SYSCALL=y
# CONFIG_SYSCTL_SYSCALL is not set
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_PCSPKR_PLATFORM=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
# CONFIG_BPF_SYSCALL is not set
CONFIG_SHMEM=y
CONFIG_AIO=y
CONFIG_ADVISE_SYSCALLS=y
CONFIG_PCI_QUIRKS=y
# CONFIG_EMBEDDED is not set
CONFIG_HAVE_PERF_EVENTS=y

#
# Kernel Performance Events And Counters
#
CONFIG_PERF_EVENTS=y
# CONFIG_DEBUG_PERF_USE_VMALLOC is not set
CONFIG_VM_EVENT_COUNTERS=y
# CONFIG_COMPAT_BRK is not set
CONFIG_SLAB=y
# CONFIG_SLUB is not set
# CONFIG_SLOB is not set
# CONFIG_SYSTEM_TRUSTED_KEYRING is not set
CONFIG_PROFILING=y
CONFIG_TRACEPOINTS=y
# CONFIG_OPROFILE is not set
CONFIG_HAVE_OPROFILE=y
CONFIG_OPROFILE_NMI_TIMER=y
CONFIG_KPROBES=y
CONFIG_JUMP_LABEL=y
CONFIG_KPROBES_ON_FTRACE=y
CONFIG_UPROBES=y
# CONFIG_HAVE_64BIT_ALIGNED_ACCESS is not set
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
CONFIG_ARCH_USE_BUILTIN_BSWAP=y
CONFIG_KRETPROBES=y
CONFIG_USER_RETURN_NOTIFIER=y
CONFIG_HAVE_IOREMAP_PROT=y
CONFIG_HAVE_KPROBES=y
CONFIG_HAVE_KRETPROBES=y
CONFIG_HAVE_OPTPROBES=y
CONFIG_HAVE_KPROBES_ON_FTRACE=y
CONFIG_HAVE_ARCH_TRACEHOOK=y
CONFIG_HAVE_DMA_ATTRS=y
CONFIG_HAVE_DMA_CONTIGUOUS=y
CONFIG_GENERIC_SMP_IDLE_THREAD=y
CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y
CONFIG_HAVE_CLK=y
CONFIG_HAVE_DMA_API_DEBUG=y
CONFIG_HAVE_HW_BREAKPOINT=y
CONFIG_HAVE_MIXED_BREAKPOINTS_REGS=y
CONFIG_HAVE_USER_RETURN_NOTIFIER=y
CONFIG_HAVE_PERF_EVENTS_NMI=y
CONFIG_HAVE_PERF_REGS=y
CONFIG_HAVE_PERF_USER_STACK_DUMP=y
CONFIG_HAVE_ARCH_JUMP_LABEL=y
CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG=y
CONFIG_HAVE_CMPXCHG_LOCAL=y
CONFIG_HAVE_CMPXCHG_DOUBLE=y
CONFIG_ARCH_WANT_COMPAT_IPC_PARSE_VERSION=y
CONFIG_ARCH_WANT_OLD_COMPAT_IPC=y
CONFIG_HAVE_ARCH_SECCOMP_FILTER=y
CONFIG_SECCOMP_FILTER=y
CONFIG_HAVE_CC_STACKPROTECTOR=y
# CONFIG_CC_STACKPROTECTOR is not set
CONFIG_CC_STACKPROTECTOR_NONE=y
# CONFIG_CC_STACKPROTECTOR_REGULAR is not set
# CONFIG_CC_STACKPROTECTOR_STRONG is not set
CONFIG_HAVE_CONTEXT_TRACKING=y
CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN=y
CONFIG_HAVE_IRQ_TIME_ACCOUNTING=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_HAVE_ARCH_SOFT_DIRTY=y
CONFIG_MODULES_USE_ELF_RELA=y
CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK=y
CONFIG_OLD_SIGSUSPEND3=y
CONFIG_COMPAT_OLD_SIGACTION=y

#
# GCOV-based kernel profiling
#
# CONFIG_GCOV_KERNEL is not set
CONFIG_ARCH_HAS_GCOV_PROFILE_ALL=y
# CONFIG_HAVE_GENERIC_DMA_COHERENT is not set
CONFIG_SLABINFO=y
CONFIG_RT_MUTEXES=y
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
CONFIG_MODULE_FORCE_LOAD=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
CONFIG_MODVERSIONS=y
CONFIG_MODULE_SRCVERSION_ALL=y
# CONFIG_MODULE_SIG is not set
# CONFIG_MODULE_COMPRESS is not set
CONFIG_STOP_MACHINE=y
CONFIG_BLOCK=y
CONFIG_BLK_DEV_BSG=y
CONFIG_BLK_DEV_BSGLIB=y
CONFIG_BLK_DEV_INTEGRITY=y
CONFIG_BLK_DEV_THROTTLING=y
# CONFIG_BLK_CMDLINE_PARSER is not set

#
# Partition Types
#
CONFIG_PARTITION_ADVANCED=y
# CONFIG_ACORN_PARTITION is not set
# CONFIG_AIX_PARTITION is not set
CONFIG_OSF_PARTITION=y
# CONFIG_AMIGA_PARTITION is not set
CONFIG_ATARI_PARTITION=y
CONFIG_MAC_PARTITION=y
CONFIG_MSDOS_PARTITION=y
CONFIG_BSD_DISKLABEL=y
# CONFIG_MINIX_SUBPARTITION is not set
CONFIG_SOLARIS_X86_PARTITION=y
CONFIG_UNIXWARE_DISKLABEL=y
CONFIG_LDM_PARTITION=y
# CONFIG_LDM_DEBUG is not set
CONFIG_SGI_PARTITION=y
CONFIG_ULTRIX_PARTITION=y
CONFIG_SUN_PARTITION=y
CONFIG_KARMA_PARTITION=y
CONFIG_EFI_PARTITION=y
CONFIG_SYSV68_PARTITION=y
# CONFIG_CMDLINE_PARTITION is not set
CONFIG_BLOCK_COMPAT=y

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
CONFIG_CFQ_GROUP_IOSCHED=y
CONFIG_IOSCHED_BFQ=y
# CONFIG_CGROUP_BFQIO is not set
# CONFIG_DEFAULT_DEADLINE is not set
# CONFIG_DEFAULT_CFQ is not set
CONFIG_DEFAULT_BFQ=y
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="bfq"
CONFIG_PREEMPT_NOTIFIERS=y
CONFIG_UNINLINE_SPIN_UNLOCK=y
CONFIG_ARCH_SUPPORTS_ATOMIC_RMW=y
CONFIG_RWSEM_SPIN_ON_OWNER=y
CONFIG_LOCK_SPIN_ON_OWNER=y
CONFIG_ARCH_USE_QUEUE_RWLOCK=y
CONFIG_QUEUE_RWLOCK=y
CONFIG_FREEZER=y

#
# Processor type and features
#
CONFIG_ZONE_DMA=y
CONFIG_SMP=y
CONFIG_X86_FEATURE_NAMES=y
CONFIG_X86_X2APIC=y
CONFIG_X86_MPPARSE=y
CONFIG_X86_EXTENDED_PLATFORM=y
# CONFIG_X86_NUMACHIP is not set
# CONFIG_X86_VSMP is not set
CONFIG_X86_UV=y
# CONFIG_X86_GOLDFISH is not set
CONFIG_X86_INTEL_LPSS=y
# CONFIG_X86_AMD_PLATFORM_DEVICE is not set
CONFIG_IOSF_MBI=m
# CONFIG_IOSF_MBI_DEBUG is not set
CONFIG_X86_SUPPORTS_MEMORY_FAILURE=y
CONFIG_SCHED_OMIT_FRAME_POINTER=y
CONFIG_HYPERVISOR_GUEST=y
CONFIG_PARAVIRT=y
# CONFIG_PARAVIRT_DEBUG is not set
# CONFIG_PARAVIRT_SPINLOCKS is not set
# CONFIG_XEN is not set
CONFIG_KVM_GUEST=y
# CONFIG_KVM_DEBUG_FS is not set
# CONFIG_PARAVIRT_TIME_ACCOUNTING is not set
CONFIG_PARAVIRT_CLOCK=y
CONFIG_NO_BOOTMEM=y
CONFIG_MEMTEST=y
# CONFIG_MK8 is not set
# CONFIG_MPSC is not set
# CONFIG_MCORE2 is not set
# CONFIG_MATOM is not set
CONFIG_GENERIC_CPU=y
CONFIG_X86_INTERNODE_CACHE_SHIFT=6
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_TSC=y
CONFIG_X86_CMPXCHG64=y
CONFIG_X86_CMOV=y
CONFIG_X86_MINIMUM_CPU_FAMILY=64
CONFIG_X86_DEBUGCTLMSR=y
# CONFIG_PROCESSOR_SELECT is not set
CONFIG_CPU_SUP_INTEL=y
CONFIG_CPU_SUP_AMD=y
CONFIG_CPU_SUP_CENTAUR=y
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
CONFIG_DMI=y
CONFIG_GART_IOMMU=y
CONFIG_CALGARY_IOMMU=y
# CONFIG_CALGARY_IOMMU_ENABLED_BY_DEFAULT is not set
CONFIG_SWIOTLB=y
CONFIG_IOMMU_HELPER=y
# CONFIG_MAXSMP is not set
CONFIG_NR_CPUS=512
CONFIG_SCHED_SMT=y
CONFIG_SCHED_MC=y
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
CONFIG_PREEMPT_COUNT=y
CONFIG_X86_UP_APIC_MSI=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_INTEL=y
CONFIG_X86_MCE_AMD=y
CONFIG_X86_MCE_THRESHOLD=y
# CONFIG_X86_MCE_INJECT is not set
CONFIG_X86_THERMAL_VECTOR=y
CONFIG_X86_16BIT=y
CONFIG_X86_ESPFIX64=y
CONFIG_X86_VSYSCALL_EMULATION=y
# CONFIG_I8K is not set
CONFIG_MICROCODE=y
CONFIG_MICROCODE_INTEL=y
CONFIG_MICROCODE_AMD=y
CONFIG_MICROCODE_OLD_INTERFACE=y
CONFIG_MICROCODE_INTEL_EARLY=y
CONFIG_MICROCODE_AMD_EARLY=y
CONFIG_MICROCODE_EARLY=y
# CONFIG_X86_MSR is not set
# CONFIG_X86_CPUID is not set
CONFIG_ARCH_PHYS_ADDR_T_64BIT=y
CONFIG_ARCH_DMA_ADDR_T_64BIT=y
CONFIG_DIRECT_GBPAGES=y
CONFIG_NUMA=y
CONFIG_AMD_NUMA=y
CONFIG_X86_64_ACPI_NUMA=y
CONFIG_NODES_SPAN_OTHER_NODES=y
CONFIG_NUMA_EMU=y
CONFIG_NODES_SHIFT=9
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_DEFAULT=y
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_ARCH_MEMORY_PROBE=y
CONFIG_ARCH_PROC_KCORE_TEXT=y
CONFIG_ILLEGAL_POINTER_VALUE=0xdead000000000000
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_SPARSEMEM_MANUAL=y
CONFIG_SPARSEMEM=y
CONFIG_NEED_MULTIPLE_NODES=y
CONFIG_HAVE_MEMORY_PRESENT=y
CONFIG_SPARSEMEM_EXTREME=y
CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
CONFIG_SPARSEMEM_ALLOC_MEM_MAP_TOGETHER=y
CONFIG_SPARSEMEM_VMEMMAP=y
CONFIG_HAVE_MEMBLOCK=y
CONFIG_HAVE_MEMBLOCK_NODE_MAP=y
CONFIG_ARCH_DISCARD_MEMBLOCK=y
CONFIG_MEMORY_ISOLATION=y
CONFIG_MOVABLE_NODE=y
CONFIG_HAVE_BOOTMEM_INFO_NODE=y
CONFIG_MEMORY_HOTPLUG=y
CONFIG_MEMORY_HOTPLUG_SPARSE=y
CONFIG_MEMORY_HOTREMOVE=y
CONFIG_PAGEFLAGS_EXTENDED=y
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK=y
CONFIG_COMPACTION=y
CONFIG_MIGRATION=y
CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION=y
CONFIG_PHYS_ADDR_T_64BIT=y
CONFIG_ZONE_DMA_FLAG=1
CONFIG_BOUNCE=y
CONFIG_VIRT_TO_BUS=y
CONFIG_MMU_NOTIFIER=y
CONFIG_KSM=y
CONFIG_DEFAULT_MMAP_MIN_ADDR=65536
CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y
CONFIG_MEMORY_FAILURE=y
# CONFIG_HWPOISON_INJECT is not set
CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
# CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
CONFIG_CLEANCACHE=y
CONFIG_FRONTSWAP=y
CONFIG_CMA=y
# CONFIG_CMA_DEBUG is not set
CONFIG_CMA_AREAS=7
CONFIG_MEM_SOFT_DIRTY=y
CONFIG_ZSWAP=y
CONFIG_ZPOOL=y
CONFIG_ZBUD=y
CONFIG_ZSMALLOC=y
# CONFIG_PGTABLE_MAPPING is not set
# CONFIG_ZSMALLOC_STAT is not set
CONFIG_GENERIC_EARLY_IOREMAP=y
CONFIG_X86_CHECK_BIOS_CORRUPTION=y
CONFIG_X86_BOOTPARAM_MEMORY_CORRUPTION_CHECK=y
CONFIG_X86_RESERVE_LOW=64
CONFIG_MTRR=y
CONFIG_MTRR_SANITIZER=y
CONFIG_MTRR_SANITIZER_ENABLE_DEFAULT=0
CONFIG_MTRR_SANITIZER_SPARE_REG_NR_DEFAULT=1
CONFIG_X86_PAT=y
CONFIG_ARCH_USES_PG_UNCACHED=y
CONFIG_ARCH_RANDOM=y
CONFIG_X86_SMAP=y
# CONFIG_X86_INTEL_MPX is not set
CONFIG_EFI=y
CONFIG_EFI_STUB=y
# CONFIG_EFI_MIXED is not set
CONFIG_SECCOMP=y
# CONFIG_HZ_100 is not set
# CONFIG_HZ_250 is not set
# CONFIG_HZ_300 is not set
CONFIG_HZ_1000=y
CONFIG_HZ=1000
CONFIG_SCHED_HRTICK=y
CONFIG_KEXEC=y
# CONFIG_KEXEC_FILE is not set
CONFIG_CRASH_DUMP=y
# CONFIG_KEXEC_JUMP is not set
CONFIG_PHYSICAL_START=0x200000
CONFIG_RELOCATABLE=y
# CONFIG_RANDOMIZE_BASE is not set
CONFIG_PHYSICAL_ALIGN=0x1000000
CONFIG_HOTPLUG_CPU=y
# CONFIG_BOOTPARAM_HOTPLUG_CPU0 is not set
# CONFIG_DEBUG_HOTPLUG_CPU0 is not set
# CONFIG_COMPAT_VDSO is not set
# CONFIG_CMDLINE_BOOL is not set
CONFIG_HAVE_LIVEPATCH=y
# CONFIG_LIVEPATCH is not set
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE=y
CONFIG_USE_PERCPU_NUMA_NODE_ID=y

#
# Power management and ACPI options
#
CONFIG_ARCH_HIBERNATION_HEADER=y
CONFIG_SUSPEND=y
CONFIG_SUSPEND_FREEZER=y
CONFIG_HIBERNATE_CALLBACKS=y
CONFIG_HIBERNATION=y
CONFIG_PM_STD_PARTITION=""
CONFIG_PM_SLEEP=y
CONFIG_PM_SLEEP_SMP=y
CONFIG_PM_AUTOSLEEP=y
CONFIG_PM_WAKELOCKS=y
CONFIG_PM_WAKELOCKS_LIMIT=100
CONFIG_PM_WAKELOCKS_GC=y
CONFIG_PM=y
CONFIG_PM_DEBUG=y
CONFIG_PM_ADVANCED_DEBUG=y
# CONFIG_PM_TEST_SUSPEND is not set
CONFIG_PM_SLEEP_DEBUG=y
CONFIG_DPM_WATCHDOG=y
CONFIG_DPM_WATCHDOG_TIMEOUT=12
CONFIG_PM_TRACE=y
CONFIG_PM_TRACE_RTC=y
CONFIG_PM_CLK=y
# CONFIG_WQ_POWER_EFFICIENT_DEFAULT is not set
CONFIG_ACPI=y
CONFIG_ACPI_LEGACY_TABLES_LOOKUP=y
CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC=y
CONFIG_ACPI_SLEEP=y
CONFIG_ACPI_PROCFS_POWER=y
# CONFIG_ACPI_EC_DEBUGFS is not set
# CONFIG_ACPI_AC is not set
CONFIG_ACPI_BATTERY=m
CONFIG_ACPI_BUTTON=m
CONFIG_ACPI_VIDEO=m
# CONFIG_ACPI_FAN is not set
CONFIG_ACPI_DOCK=y
CONFIG_ACPI_PROCESSOR=m
CONFIG_ACPI_HOTPLUG_CPU=y
# CONFIG_ACPI_PROCESSOR_AGGREGATOR is not set
CONFIG_ACPI_THERMAL=m
CONFIG_ACPI_NUMA=y
CONFIG_ACPI_CUSTOM_DSDT_FILE=""
# CONFIG_ACPI_CUSTOM_DSDT is not set
CONFIG_ACPI_INITRD_TABLE_OVERRIDE=y
CONFIG_ACPI_DEBUG=y
CONFIG_ACPI_PCI_SLOT=y
CONFIG_X86_PM_TIMER=y
CONFIG_ACPI_CONTAINER=y
CONFIG_ACPI_HOTPLUG_MEMORY=y
CONFIG_ACPI_HOTPLUG_IOAPIC=y
# CONFIG_ACPI_SBS is not set
CONFIG_ACPI_HED=y
# CONFIG_ACPI_CUSTOM_METHOD is not set
CONFIG_ACPI_BGRT=y
# CONFIG_ACPI_REDUCED_HARDWARE_ONLY is not set
CONFIG_HAVE_ACPI_APEI=y
CONFIG_HAVE_ACPI_APEI_NMI=y
CONFIG_ACPI_APEI=y
CONFIG_ACPI_APEI_GHES=y
CONFIG_ACPI_APEI_PCIEAER=y
CONFIG_ACPI_APEI_MEMORY_FAILURE=y
# CONFIG_ACPI_APEI_EINJ is not set
# CONFIG_ACPI_APEI_ERST_DEBUG is not set
# CONFIG_ACPI_EXTLOG is not set
# CONFIG_PMIC_OPREGION is not set
# CONFIG_SFI is not set

#
# CPU Frequency scaling
#
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_GOV_COMMON=y
# CONFIG_CPU_FREQ_STAT is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
# CONFIG_CPU_FREQ_GOV_POWERSAVE is not set
# CONFIG_CPU_FREQ_GOV_USERSPACE is not set
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
# CONFIG_CPU_FREQ_GOV_CONSERVATIVE is not set

#
# CPU frequency scaling drivers
#
CONFIG_X86_INTEL_PSTATE=y
# CONFIG_X86_PCC_CPUFREQ is not set
# CONFIG_X86_ACPI_CPUFREQ is not set
# CONFIG_X86_SPEEDSTEP_CENTRINO is not set
# CONFIG_X86_P4_CLOCKMOD is not set

#
# shared options
#
# CONFIG_X86_SPEEDSTEP_LIB is not set

#
# CPU Idle
#
CONFIG_CPU_IDLE=y
CONFIG_CPU_IDLE_GOV_LADDER=y
CONFIG_CPU_IDLE_GOV_MENU=y
# CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED is not set
CONFIG_INTEL_IDLE=y

#
# Memory power savings
#
# CONFIG_I7300_IDLE is not set

#
# Bus options (PCI etc.)
#
CONFIG_PCI=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
CONFIG_PCI_DOMAINS=y
# CONFIG_PCI_CNB20LE_QUIRK is not set
CONFIG_PCIEPORTBUS=y
CONFIG_HOTPLUG_PCI_PCIE=y
CONFIG_PCIEAER=y
# CONFIG_PCIE_ECRC is not set
# CONFIG_PCIEAER_INJECT is not set
CONFIG_PCIEASPM=y
# CONFIG_PCIEASPM_DEBUG is not set
CONFIG_PCIEASPM_DEFAULT=y
# CONFIG_PCIEASPM_POWERSAVE is not set
# CONFIG_PCIEASPM_PERFORMANCE is not set
CONFIG_PCIE_PME=y
CONFIG_PCI_MSI=y
# CONFIG_PCI_DEBUG is not set
# CONFIG_PCI_REALLOC_ENABLE_AUTO is not set
CONFIG_PCI_STUB=y
CONFIG_HT_IRQ=y
CONFIG_PCI_ATS=y
CONFIG_PCI_IOV=y
CONFIG_PCI_PRI=y
CONFIG_PCI_PASID=y
CONFIG_PCI_LABEL=y

#
# PCI host controller drivers
#
CONFIG_ISA_DMA_API=y
CONFIG_AMD_NB=y
# CONFIG_PCCARD is not set
CONFIG_HOTPLUG_PCI=y
CONFIG_HOTPLUG_PCI_ACPI=y
# CONFIG_HOTPLUG_PCI_ACPI_IBM is not set
CONFIG_HOTPLUG_PCI_CPCI=y
# CONFIG_HOTPLUG_PCI_CPCI_ZT5550 is not set
# CONFIG_HOTPLUG_PCI_CPCI_GENERIC is not set
# CONFIG_HOTPLUG_PCI_SHPC is not set
# CONFIG_RAPIDIO is not set
# CONFIG_X86_SYSFB is not set

#
# Executable file formats / Emulations
#
CONFIG_BINFMT_ELF=y
CONFIG_COMPAT_BINFMT_ELF=y
CONFIG_ARCH_BINFMT_ELF_RANDOMIZE_PIE=y
CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS=y
CONFIG_BINFMT_SCRIPT=y
# CONFIG_HAVE_AOUT is not set
CONFIG_BINFMT_MISC=m
CONFIG_COREDUMP=y
CONFIG_IA32_EMULATION=y
# CONFIG_IA32_AOUT is not set
CONFIG_X86_X32=y
CONFIG_COMPAT=y
CONFIG_COMPAT_FOR_U64_ALIGNMENT=y
CONFIG_SYSVIPC_COMPAT=y
CONFIG_KEYS_COMPAT=y
CONFIG_X86_DEV_DMA_OPS=y
CONFIG_PMC_ATOM=y
CONFIG_NET=y

#
# Networking options
#
CONFIG_PACKET=m
CONFIG_PACKET_DIAG=m
CONFIG_UNIX=y
CONFIG_UNIX_DIAG=m
# CONFIG_XFRM_USER is not set
# CONFIG_NET_KEY is not set
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
# CONFIG_IP_FIB_TRIE_STATS is not set
CONFIG_IP_MULTIPLE_TABLES=y
CONFIG_IP_ROUTE_MULTIPATH=y
CONFIG_IP_ROUTE_VERBOSE=y
CONFIG_IP_PNP=y
CONFIG_IP_PNP_DHCP=y
CONFIG_IP_PNP_BOOTP=y
CONFIG_IP_PNP_RARP=y
# CONFIG_NET_IPIP is not set
# CONFIG_NET_IPGRE_DEMUX is not set
# CONFIG_NET_IP_TUNNEL is not set
CONFIG_IP_MROUTE=y
CONFIG_IP_MROUTE_MULTIPLE_TABLES=y
CONFIG_IP_PIMSM_V1=y
CONFIG_IP_PIMSM_V2=y
CONFIG_SYN_COOKIES=y
# CONFIG_NET_UDP_TUNNEL is not set
# CONFIG_NET_FOU is not set
# CONFIG_GENEVE is not set
# CONFIG_INET_AH is not set
# CONFIG_INET_ESP is not set
# CONFIG_INET_IPCOMP is not set
# CONFIG_INET_XFRM_TUNNEL is not set
# CONFIG_INET_TUNNEL is not set
# CONFIG_INET_XFRM_MODE_TRANSPORT is not set
# CONFIG_INET_XFRM_MODE_TUNNEL is not set
# CONFIG_INET_XFRM_MODE_BEET is not set
# CONFIG_INET_LRO is not set
CONFIG_INET_DIAG=m
CONFIG_INET_TCP_DIAG=m
CONFIG_INET_UDP_DIAG=m
CONFIG_TCP_CONG_ADVANCED=y
# CONFIG_TCP_CONG_BIC is not set
CONFIG_TCP_CONG_CUBIC=y
# CONFIG_TCP_CONG_WESTWOOD is not set
# CONFIG_TCP_CONG_HTCP is not set
# CONFIG_TCP_CONG_HSTCP is not set
# CONFIG_TCP_CONG_HYBLA is not set
# CONFIG_TCP_CONG_VEGAS is not set
# CONFIG_TCP_CONG_SCALABLE is not set
# CONFIG_TCP_CONG_LP is not set
# CONFIG_TCP_CONG_VENO is not set
# CONFIG_TCP_CONG_YEAH is not set
# CONFIG_TCP_CONG_ILLINOIS is not set
# CONFIG_TCP_CONG_DCTCP is not set
CONFIG_DEFAULT_CUBIC=y
# CONFIG_DEFAULT_RENO is not set
CONFIG_DEFAULT_TCP_CONG="cubic"
# CONFIG_TCP_MD5SIG is not set
CONFIG_IPV6=y
CONFIG_IPV6_ROUTER_PREF=y
CONFIG_IPV6_ROUTE_INFO=y
# CONFIG_IPV6_OPTIMISTIC_DAD is not set
# CONFIG_INET6_AH is not set
# CONFIG_INET6_ESP is not set
# CONFIG_INET6_IPCOMP is not set
# CONFIG_IPV6_MIP6 is not set
# CONFIG_INET6_XFRM_TUNNEL is not set
# CONFIG_INET6_TUNNEL is not set
# CONFIG_INET6_XFRM_MODE_TRANSPORT is not set
# CONFIG_INET6_XFRM_MODE_TUNNEL is not set
# CONFIG_INET6_XFRM_MODE_BEET is not set
# CONFIG_INET6_XFRM_MODE_ROUTEOPTIMIZATION is not set
# CONFIG_IPV6_SIT is not set
# CONFIG_IPV6_TUNNEL is not set
# CONFIG_IPV6_GRE is not set
CONFIG_IPV6_MULTIPLE_TABLES=y
CONFIG_IPV6_SUBTREES=y
# CONFIG_IPV6_MROUTE is not set
CONFIG_NETLABEL=y
CONFIG_NETWORK_SECMARK=y
CONFIG_NET_PTP_CLASSIFY=y
# CONFIG_NETWORK_PHY_TIMESTAMPING is not set
CONFIG_NETFILTER=y
# CONFIG_NETFILTER_DEBUG is not set
CONFIG_NETFILTER_ADVANCED=y
CONFIG_BRIDGE_NETFILTER=m

#
# Core Netfilter Configuration
#
# CONFIG_NETFILTER_NETLINK_ACCT is not set
# CONFIG_NETFILTER_NETLINK_QUEUE is not set
# CONFIG_NETFILTER_NETLINK_LOG is not set
CONFIG_NF_CONNTRACK=m
CONFIG_NF_CONNTRACK_MARK=y
CONFIG_NF_CONNTRACK_SECMARK=y
CONFIG_NF_CONNTRACK_PROCFS=y
CONFIG_NF_CONNTRACK_EVENTS=y
CONFIG_NF_CONNTRACK_TIMEOUT=y
CONFIG_NF_CONNTRACK_TIMESTAMP=y
# CONFIG_NF_CT_PROTO_DCCP is not set
# CONFIG_NF_CT_PROTO_SCTP is not set
# CONFIG_NF_CT_PROTO_UDPLITE is not set
# CONFIG_NF_CONNTRACK_AMANDA is not set
# CONFIG_NF_CONNTRACK_FTP is not set
# CONFIG_NF_CONNTRACK_H323 is not set
# CONFIG_NF_CONNTRACK_IRC is not set
# CONFIG_NF_CONNTRACK_NETBIOS_NS is not set
# CONFIG_NF_CONNTRACK_SNMP is not set
# CONFIG_NF_CONNTRACK_PPTP is not set
# CONFIG_NF_CONNTRACK_SANE is not set
# CONFIG_NF_CONNTRACK_SIP is not set
# CONFIG_NF_CONNTRACK_TFTP is not set
# CONFIG_NF_CT_NETLINK is not set
# CONFIG_NF_CT_NETLINK_TIMEOUT is not set
CONFIG_NF_NAT=m
CONFIG_NF_NAT_NEEDED=y
# CONFIG_NF_NAT_AMANDA is not set
# CONFIG_NF_NAT_FTP is not set
# CONFIG_NF_NAT_IRC is not set
# CONFIG_NF_NAT_SIP is not set
# CONFIG_NF_NAT_TFTP is not set
CONFIG_NF_NAT_REDIRECT=m
# CONFIG_NF_TABLES is not set
CONFIG_NETFILTER_XTABLES=m

#
# Xtables combined modules
#
CONFIG_NETFILTER_XT_MARK=m
CONFIG_NETFILTER_XT_CONNMARK=m

#
# Xtables targets
#
# CONFIG_NETFILTER_XT_TARGET_AUDIT is not set
# CONFIG_NETFILTER_XT_TARGET_CLASSIFY is not set
CONFIG_NETFILTER_XT_TARGET_CONNMARK=m
# CONFIG_NETFILTER_XT_TARGET_CONNSECMARK is not set
# CONFIG_NETFILTER_XT_TARGET_HMARK is not set
# CONFIG_NETFILTER_XT_TARGET_IDLETIMER is not set
# CONFIG_NETFILTER_XT_TARGET_LED is not set
# CONFIG_NETFILTER_XT_TARGET_LOG is not set
CONFIG_NETFILTER_XT_TARGET_MARK=m
# CONFIG_NETFILTER_XT_NAT is not set
CONFIG_NETFILTER_XT_TARGET_NETMAP=m
# CONFIG_NETFILTER_XT_TARGET_NFLOG is not set
# CONFIG_NETFILTER_XT_TARGET_NFQUEUE is not set
# CONFIG_NETFILTER_XT_TARGET_RATEEST is not set
CONFIG_NETFILTER_XT_TARGET_REDIRECT=m
# CONFIG_NETFILTER_XT_TARGET_TEE is not set
# CONFIG_NETFILTER_XT_TARGET_SECMARK is not set
# CONFIG_NETFILTER_XT_TARGET_TCPMSS is not set

#
# Xtables matches
#
CONFIG_NETFILTER_XT_MATCH_ADDRTYPE=m
# CONFIG_NETFILTER_XT_MATCH_BPF is not set
# CONFIG_NETFILTER_XT_MATCH_CGROUP is not set
# CONFIG_NETFILTER_XT_MATCH_CLUSTER is not set
# CONFIG_NETFILTER_XT_MATCH_COMMENT is not set
# CONFIG_NETFILTER_XT_MATCH_CONNBYTES is not set
# CONFIG_NETFILTER_XT_MATCH_CONNLABEL is not set
# CONFIG_NETFILTER_XT_MATCH_CONNLIMIT is not set
CONFIG_NETFILTER_XT_MATCH_CONNMARK=m
CONFIG_NETFILTER_XT_MATCH_CONNTRACK=m
# CONFIG_NETFILTER_XT_MATCH_CPU is not set
# CONFIG_NETFILTER_XT_MATCH_DCCP is not set
# CONFIG_NETFILTER_XT_MATCH_DEVGROUP is not set
# CONFIG_NETFILTER_XT_MATCH_DSCP is not set
CONFIG_NETFILTER_XT_MATCH_ECN=m
# CONFIG_NETFILTER_XT_MATCH_ESP is not set
# CONFIG_NETFILTER_XT_MATCH_HASHLIMIT is not set
# CONFIG_NETFILTER_XT_MATCH_HELPER is not set
CONFIG_NETFILTER_XT_MATCH_HL=m
# CONFIG_NETFILTER_XT_MATCH_IPCOMP is not set
# CONFIG_NETFILTER_XT_MATCH_IPRANGE is not set
# CONFIG_NETFILTER_XT_MATCH_L2TP is not set
# CONFIG_NETFILTER_XT_MATCH_LENGTH is not set
# CONFIG_NETFILTER_XT_MATCH_LIMIT is not set
# CONFIG_NETFILTER_XT_MATCH_MAC is not set
CONFIG_NETFILTER_XT_MATCH_MARK=m
# CONFIG_NETFILTER_XT_MATCH_MULTIPORT is not set
# CONFIG_NETFILTER_XT_MATCH_NFACCT is not set
# CONFIG_NETFILTER_XT_MATCH_OWNER is not set
# CONFIG_NETFILTER_XT_MATCH_PHYSDEV is not set
# CONFIG_NETFILTER_XT_MATCH_PKTTYPE is not set
# CONFIG_NETFILTER_XT_MATCH_QUOTA is not set
# CONFIG_NETFILTER_XT_MATCH_RATEEST is not set
# CONFIG_NETFILTER_XT_MATCH_REALM is not set
# CONFIG_NETFILTER_XT_MATCH_RECENT is not set
# CONFIG_NETFILTER_XT_MATCH_SCTP is not set
# CONFIG_NETFILTER_XT_MATCH_SOCKET is not set
# CONFIG_NETFILTER_XT_MATCH_STATE is not set
# CONFIG_NETFILTER_XT_MATCH_STATISTIC is not set
# CONFIG_NETFILTER_XT_MATCH_STRING is not set
# CONFIG_NETFILTER_XT_MATCH_TCPMSS is not set
# CONFIG_NETFILTER_XT_MATCH_TIME is not set
# CONFIG_NETFILTER_XT_MATCH_U32 is not set
# CONFIG_IP_SET is not set
# CONFIG_IP_VS is not set

#
# IP: Netfilter Configuration
#
CONFIG_NF_DEFRAG_IPV4=m
CONFIG_NF_CONNTRACK_IPV4=m
# CONFIG_NF_CONNTRACK_PROC_COMPAT is not set
# CONFIG_NF_LOG_ARP is not set
# CONFIG_NF_LOG_IPV4 is not set
# CONFIG_NF_REJECT_IPV4 is not set
CONFIG_NF_NAT_IPV4=m
# CONFIG_NF_NAT_MASQUERADE_IPV4 is not set
# CONFIG_NF_NAT_PPTP is not set
# CONFIG_NF_NAT_H323 is not set
CONFIG_IP_NF_IPTABLES=m
# CONFIG_IP_NF_MATCH_AH is not set
CONFIG_IP_NF_MATCH_ECN=m
CONFIG_IP_NF_MATCH_TTL=m
CONFIG_IP_NF_FILTER=m
# CONFIG_IP_NF_TARGET_REJECT is not set
# CONFIG_IP_NF_TARGET_SYNPROXY is not set
# CONFIG_IP_NF_NAT is not set
# CONFIG_IP_NF_MANGLE is not set
# CONFIG_IP_NF_RAW is not set
# CONFIG_IP_NF_SECURITY is not set
# CONFIG_IP_NF_ARPTABLES is not set

#
# IPv6: Netfilter Configuration
#
# CONFIG_NF_DEFRAG_IPV6 is not set
# CONFIG_NF_CONNTRACK_IPV6 is not set
# CONFIG_NF_REJECT_IPV6 is not set
# CONFIG_NF_LOG_IPV6 is not set
# CONFIG_IP6_NF_IPTABLES is not set
# CONFIG_BRIDGE_NF_EBTABLES is not set
# CONFIG_IP_DCCP is not set
# CONFIG_IP_SCTP is not set
# CONFIG_RDS is not set
# CONFIG_TIPC is not set
# CONFIG_ATM is not set
# CONFIG_L2TP is not set
CONFIG_STP=m
CONFIG_BRIDGE=m
CONFIG_BRIDGE_IGMP_SNOOPING=y
CONFIG_HAVE_NET_DSA=y
CONFIG_NET_DSA=m
CONFIG_NET_DSA_HWMON=y
CONFIG_NET_DSA_TAG_DSA=y
CONFIG_NET_DSA_TAG_EDSA=y
# CONFIG_VLAN_8021Q is not set
# CONFIG_DECNET is not set
CONFIG_LLC=m
# CONFIG_LLC2 is not set
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_PHONET is not set
# CONFIG_6LOWPAN is not set
# CONFIG_IEEE802154 is not set
CONFIG_NET_SCHED=y

#
# Queueing/Scheduling
#
# CONFIG_NET_SCH_CBQ is not set
# CONFIG_NET_SCH_HTB is not set
# CONFIG_NET_SCH_HFSC is not set
# CONFIG_NET_SCH_PRIO is not set
# CONFIG_NET_SCH_MULTIQ is not set
# CONFIG_NET_SCH_RED is not set
# CONFIG_NET_SCH_SFB is not set
# CONFIG_NET_SCH_SFQ is not set
# CONFIG_NET_SCH_TEQL is not set
# CONFIG_NET_SCH_TBF is not set
# CONFIG_NET_SCH_GRED is not set
# CONFIG_NET_SCH_DSMARK is not set
# CONFIG_NET_SCH_NETEM is not set
# CONFIG_NET_SCH_DRR is not set
# CONFIG_NET_SCH_MQPRIO is not set
# CONFIG_NET_SCH_CHOKE is not set
# CONFIG_NET_SCH_QFQ is not set
# CONFIG_NET_SCH_CODEL is not set
# CONFIG_NET_SCH_FQ_CODEL is not set
# CONFIG_NET_SCH_FQ is not set
# CONFIG_NET_SCH_HHF is not set
# CONFIG_NET_SCH_PIE is not set
# CONFIG_NET_SCH_INGRESS is not set
# CONFIG_NET_SCH_PLUG is not set

#
# Classification
#
CONFIG_NET_CLS=y
# CONFIG_NET_CLS_BASIC is not set
# CONFIG_NET_CLS_TCINDEX is not set
# CONFIG_NET_CLS_ROUTE4 is not set
# CONFIG_NET_CLS_FW is not set
# CONFIG_NET_CLS_U32 is not set
# CONFIG_NET_CLS_RSVP is not set
# CONFIG_NET_CLS_RSVP6 is not set
# CONFIG_NET_CLS_FLOW is not set
# CONFIG_NET_CLS_CGROUP is not set
# CONFIG_NET_CLS_BPF is not set
CONFIG_NET_EMATCH=y
CONFIG_NET_EMATCH_STACK=32
# CONFIG_NET_EMATCH_CMP is not set
# CONFIG_NET_EMATCH_NBYTE is not set
# CONFIG_NET_EMATCH_U32 is not set
# CONFIG_NET_EMATCH_META is not set
# CONFIG_NET_EMATCH_TEXT is not set
CONFIG_NET_CLS_ACT=y
# CONFIG_NET_ACT_POLICE is not set
# CONFIG_NET_ACT_GACT is not set
# CONFIG_NET_ACT_MIRRED is not set
# CONFIG_NET_ACT_IPT is not set
# CONFIG_NET_ACT_NAT is not set
# CONFIG_NET_ACT_PEDIT is not set
# CONFIG_NET_ACT_SIMP is not set
# CONFIG_NET_ACT_SKBEDIT is not set
# CONFIG_NET_ACT_CSUM is not set
# CONFIG_NET_ACT_VLAN is not set
# CONFIG_NET_ACT_BPF is not set
# CONFIG_NET_ACT_CONNMARK is not set
CONFIG_NET_SCH_FIFO=y
CONFIG_DCB=y
CONFIG_DNS_RESOLVER=m
# CONFIG_BATMAN_ADV is not set
# CONFIG_OPENVSWITCH is not set
CONFIG_VSOCKETS=m
CONFIG_VMWARE_VMCI_VSOCKETS=m
CONFIG_NETLINK_MMAP=y
CONFIG_NETLINK_DIAG=m
CONFIG_NET_MPLS_GSO=m
# CONFIG_HSR is not set
# CONFIG_NET_SWITCHDEV is not set
CONFIG_RPS=y
CONFIG_RFS_ACCEL=y
CONFIG_XPS=y
CONFIG_CGROUP_NET_PRIO=y
CONFIG_CGROUP_NET_CLASSID=y
CONFIG_NET_RX_BUSY_POLL=y
CONFIG_BQL=y
CONFIG_BPF_JIT=y
CONFIG_NET_FLOW_LIMIT=y

#
# Network testing
#
# CONFIG_NET_PKTGEN is not set
# CONFIG_NET_TCPPROBE is not set
# CONFIG_NET_DROP_MONITOR is not set
CONFIG_HAMRADIO=y

#
# Packet Radio protocols
#
# CONFIG_AX25 is not set
# CONFIG_CAN is not set
# CONFIG_IRDA is not set
# CONFIG_BT is not set
# CONFIG_AF_RXRPC is not set
CONFIG_FIB_RULES=y
CONFIG_WIRELESS=y
# CONFIG_CFG80211 is not set
# CONFIG_LIB80211 is not set

#
# CFG80211 needs to be enabled for MAC80211
#
# CONFIG_WIMAX is not set
# CONFIG_RFKILL is not set
# CONFIG_NET_9P is not set
# CONFIG_CAIF is not set
# CONFIG_CEPH_LIB is not set
# CONFIG_NFC is not set
CONFIG_HAVE_BPF_JIT=y

#
# Device Drivers
#

#
# Generic Driver Options
#
CONFIG_UEVENT_HELPER=y
CONFIG_UEVENT_HELPER_PATH=""
CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y
# CONFIG_STANDALONE is not set
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FW_LOADER=y
# CONFIG_FIRMWARE_IN_KERNEL is not set
CONFIG_EXTRA_FIRMWARE=""
# CONFIG_FW_LOADER_USER_HELPER_FALLBACK is not set
CONFIG_ALLOW_DEV_COREDUMP=y
# CONFIG_DEBUG_DRIVER is not set
# CONFIG_DEBUG_DEVRES is not set
# CONFIG_SYS_HYPERVISOR is not set
# CONFIG_GENERIC_CPU_DEVICES is not set
CONFIG_GENERIC_CPU_AUTOPROBE=y
CONFIG_REGMAP=y
CONFIG_REGMAP_MMIO=y
CONFIG_DMA_SHARED_BUFFER=y
# CONFIG_FENCE_TRACE is not set
CONFIG_DMA_CMA=y

#
# Default contiguous memory area size:
#
CONFIG_CMA_SIZE_MBYTES=16
CONFIG_CMA_SIZE_SEL_MBYTES=y
# CONFIG_CMA_SIZE_SEL_PERCENTAGE is not set
# CONFIG_CMA_SIZE_SEL_MIN is not set
# CONFIG_CMA_SIZE_SEL_MAX is not set
CONFIG_CMA_ALIGNMENT=8

#
# Bus devices
#
CONFIG_CONNECTOR=y
CONFIG_PROC_EVENTS=y
# CONFIG_MTD is not set
CONFIG_ARCH_MIGHT_HAVE_PC_PARPORT=y
CONFIG_PARPORT=m
CONFIG_PARPORT_PC=m
# CONFIG_PARPORT_SERIAL is not set
CONFIG_PARPORT_PC_FIFO=y
CONFIG_PARPORT_PC_SUPERIO=y
# CONFIG_PARPORT_GSC is not set
# CONFIG_PARPORT_AX88796 is not set
CONFIG_PARPORT_1284=y
CONFIG_PNP=y
# CONFIG_PNP_DEBUG_MESSAGES is not set

#
# Protocols
#
CONFIG_PNPACPI=y
CONFIG_BLK_DEV=y
# CONFIG_BLK_DEV_NULL_BLK is not set
# CONFIG_BLK_DEV_FD is not set
# CONFIG_PARIDE is not set
# CONFIG_BLK_DEV_PCIESSD_MTIP32XX is not set
# CONFIG_ZRAM is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_BLK_DEV_DAC960 is not set
# CONFIG_BLK_DEV_UMEM is not set
# CONFIG_BLK_DEV_COW_COMMON is not set
CONFIG_BLK_DEV_LOOP=m
CONFIG_BLK_DEV_LOOP_MIN_COUNT=8
CONFIG_BLK_DEV_CRYPTOLOOP=m
# CONFIG_BLK_DEV_DRBD is not set
# CONFIG_BLK_DEV_NBD is not set
# CONFIG_BLK_DEV_NVME is not set
# CONFIG_BLK_DEV_SKD is not set
# CONFIG_BLK_DEV_OSD is not set
# CONFIG_BLK_DEV_SX8 is not set
# CONFIG_BLK_DEV_RAM is not set
# CONFIG_CDROM_PKTCDVD is not set
# CONFIG_ATA_OVER_ETH is not set
# CONFIG_BLK_DEV_HD is not set
# CONFIG_BLK_DEV_RBD is not set
# CONFIG_BLK_DEV_RSXX is not set

#
# Misc devices
#
# CONFIG_SENSORS_LIS3LV02D is not set
# CONFIG_AD525X_DPOT is not set
# CONFIG_DUMMY_IRQ is not set
# CONFIG_IBM_ASM is not set
# CONFIG_PHANTOM is not set
# CONFIG_SGI_IOC4 is not set
# CONFIG_TIFM_CORE is not set
# CONFIG_ICS932S401 is not set
# CONFIG_ENCLOSURE_SERVICES is not set
# CONFIG_SGI_XP is not set
# CONFIG_HP_ILO is not set
# CONFIG_SGI_GRU is not set
# CONFIG_APDS9802ALS is not set
# CONFIG_ISL29003 is not set
# CONFIG_ISL29020 is not set
# CONFIG_SENSORS_TSL2550 is not set
# CONFIG_SENSORS_BH1780 is not set
# CONFIG_SENSORS_BH1770 is not set
# CONFIG_SENSORS_APDS990X is not set
# CONFIG_HMC6352 is not set
# CONFIG_DS1682 is not set
CONFIG_VMWARE_BALLOON=m
# CONFIG_BMP085_I2C is not set
# CONFIG_USB_SWITCH_FSA9480 is not set
# CONFIG_SRAM is not set
# CONFIG_C2PORT is not set

#
# EEPROM support
#
# CONFIG_EEPROM_AT24 is not set
# CONFIG_EEPROM_LEGACY is not set
# CONFIG_EEPROM_MAX6875 is not set
# CONFIG_EEPROM_93CX6 is not set
# CONFIG_CB710_CORE is not set

#
# Texas Instruments shared transport line discipline
#
# CONFIG_TI_ST is not set
# CONFIG_SENSORS_LIS3_I2C is not set

#
# Altera FPGA firmware download module
#
# CONFIG_ALTERA_STAPL is not set
CONFIG_INTEL_MEI=m
CONFIG_INTEL_MEI_ME=m
# CONFIG_INTEL_MEI_TXE is not set
CONFIG_VMWARE_VMCI=m

#
# Intel MIC Bus Driver
#
# CONFIG_INTEL_MIC_BUS is not set

#
# Intel MIC Host Driver
#

#
# Intel MIC Card Driver
#
# CONFIG_GENWQE is not set
# CONFIG_ECHO is not set
# CONFIG_CXL_BASE is not set
CONFIG_HAVE_IDE=y
# CONFIG_IDE is not set

#
# SCSI device support
#
CONFIG_SCSI_MOD=y
# CONFIG_RAID_ATTRS is not set
CONFIG_SCSI=y
CONFIG_SCSI_DMA=y
# CONFIG_SCSI_NETLINK is not set
# CONFIG_SCSI_MQ_DEFAULT is not set
CONFIG_SCSI_PROC_FS=y

#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=y
# CONFIG_CHR_DEV_ST is not set
# CONFIG_CHR_DEV_OSST is not set
CONFIG_BLK_DEV_SR=m
# CONFIG_BLK_DEV_SR_VENDOR is not set
CONFIG_CHR_DEV_SG=m
# CONFIG_CHR_DEV_SCH is not set
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_LOGGING=y
# CONFIG_SCSI_SCAN_ASYNC is not set

#
# SCSI Transports
#
# CONFIG_SCSI_SPI_ATTRS is not set
# CONFIG_SCSI_FC_ATTRS is not set
# CONFIG_SCSI_ISCSI_ATTRS is not set
# CONFIG_SCSI_SAS_ATTRS is not set
# CONFIG_SCSI_SAS_LIBSAS is not set
# CONFIG_SCSI_SRP_ATTRS is not set
CONFIG_SCSI_LOWLEVEL=y
# CONFIG_ISCSI_TCP is not set
CONFIG_ISCSI_BOOT_SYSFS=m
# CONFIG_SCSI_CXGB3_ISCSI is not set
# CONFIG_SCSI_CXGB4_ISCSI is not set
# CONFIG_SCSI_BNX2_ISCSI is not set
# CONFIG_BE2ISCSI is not set
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
# CONFIG_SCSI_HPSA is not set
# CONFIG_SCSI_3W_9XXX is not set
# CONFIG_SCSI_3W_SAS is not set
# CONFIG_SCSI_ACARD is not set
# CONFIG_SCSI_AACRAID is not set
# CONFIG_SCSI_AIC7XXX is not set
# CONFIG_SCSI_AIC79XX is not set
# CONFIG_SCSI_AIC94XX is not set
# CONFIG_SCSI_MVSAS is not set
# CONFIG_SCSI_MVUMI is not set
# CONFIG_SCSI_DPT_I2O is not set
# CONFIG_SCSI_ADVANSYS is not set
# CONFIG_SCSI_ARCMSR is not set
# CONFIG_SCSI_ESAS2R is not set
CONFIG_MEGARAID_NEWGEN=y
# CONFIG_MEGARAID_MM is not set
# CONFIG_MEGARAID_LEGACY is not set
# CONFIG_MEGARAID_SAS is not set
# CONFIG_SCSI_MPT2SAS is not set
# CONFIG_SCSI_MPT3SAS is not set
# CONFIG_SCSI_UFSHCD is not set
# CONFIG_SCSI_HPTIOP is not set
# CONFIG_SCSI_BUSLOGIC is not set
CONFIG_VMWARE_PVSCSI=m
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
# CONFIG_SCSI_GDTH is not set
# CONFIG_SCSI_ISCI is not set
# CONFIG_SCSI_IPS is not set
# CONFIG_SCSI_INITIO is not set
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_PPA is not set
# CONFIG_SCSI_IMM is not set
# CONFIG_SCSI_STEX is not set
# CONFIG_SCSI_SYM53C8XX_2 is not set
# CONFIG_SCSI_IPR is not set
# CONFIG_SCSI_QLOGIC_1280 is not set
# CONFIG_SCSI_QLA_ISCSI is not set
# CONFIG_SCSI_DC395x is not set
# CONFIG_SCSI_AM53C974 is not set
# CONFIG_SCSI_WD719X is not set
# CONFIG_SCSI_DEBUG is not set
# CONFIG_SCSI_PMCRAID is not set
# CONFIG_SCSI_PM8001 is not set
# CONFIG_SCSI_DH is not set
CONFIG_SCSI_OSD_INITIATOR=m
CONFIG_SCSI_OSD_ULD=m
CONFIG_SCSI_OSD_DPRINT_SENSE=1
# CONFIG_SCSI_OSD_DEBUG is not set
CONFIG_ATA=y
# CONFIG_ATA_NONSTANDARD is not set
CONFIG_ATA_VERBOSE_ERROR=y
CONFIG_ATA_ACPI=y
CONFIG_SATA_ZPODD=y
CONFIG_SATA_PMP=y

#
# Controllers with non-SFF native interface
#
CONFIG_SATA_AHCI=y
# CONFIG_SATA_AHCI_PLATFORM is not set
# CONFIG_SATA_INIC162X is not set
# CONFIG_SATA_ACARD_AHCI is not set
# CONFIG_SATA_SIL24 is not set
CONFIG_ATA_SFF=y

#
# SFF controllers with custom DMA interface
#
# CONFIG_PDC_ADMA is not set
# CONFIG_SATA_QSTOR is not set
# CONFIG_SATA_SX4 is not set
CONFIG_ATA_BMDMA=y

#
# SATA SFF controllers with BMDMA
#
# CONFIG_ATA_PIIX is not set
# CONFIG_SATA_MV is not set
# CONFIG_SATA_NV is not set
# CONFIG_SATA_PROMISE is not set
# CONFIG_SATA_SIL is not set
# CONFIG_SATA_SIS is not set
# CONFIG_SATA_SVW is not set
# CONFIG_SATA_ULI is not set
# CONFIG_SATA_VIA is not set
# CONFIG_SATA_VITESSE is not set

#
# PATA SFF controllers with BMDMA
#
# CONFIG_PATA_ALI is not set
# CONFIG_PATA_AMD is not set
# CONFIG_PATA_ARTOP is not set
# CONFIG_PATA_ATIIXP is not set
# CONFIG_PATA_ATP867X is not set
# CONFIG_PATA_CMD64X is not set
# CONFIG_PATA_CYPRESS is not set
# CONFIG_PATA_EFAR is not set
# CONFIG_PATA_HPT366 is not set
# CONFIG_PATA_HPT37X is not set
# CONFIG_PATA_HPT3X2N is not set
# CONFIG_PATA_HPT3X3 is not set
# CONFIG_PATA_IT8213 is not set
# CONFIG_PATA_IT821X is not set
# CONFIG_PATA_JMICRON is not set
# CONFIG_PATA_MARVELL is not set
# CONFIG_PATA_NETCELL is not set
# CONFIG_PATA_NINJA32 is not set
# CONFIG_PATA_NS87415 is not set
# CONFIG_PATA_OLDPIIX is not set
# CONFIG_PATA_OPTIDMA is not set
# CONFIG_PATA_PDC2027X is not set
# CONFIG_PATA_PDC_OLD is not set
# CONFIG_PATA_RADISYS is not set
# CONFIG_PATA_RDC is not set
# CONFIG_PATA_SCH is not set
# CONFIG_PATA_SERVERWORKS is not set
# CONFIG_PATA_SIL680 is not set
# CONFIG_PATA_SIS is not set
# CONFIG_PATA_TOSHIBA is not set
# CONFIG_PATA_TRIFLEX is not set
# CONFIG_PATA_VIA is not set
# CONFIG_PATA_WINBOND is not set

#
# PIO-only SFF controllers
#
# CONFIG_PATA_CMD640_PCI is not set
# CONFIG_PATA_MPIIX is not set
# CONFIG_PATA_NS87410 is not set
# CONFIG_PATA_OPTI is not set
# CONFIG_PATA_PLATFORM is not set
# CONFIG_PATA_RZ1000 is not set

#
# Generic fallback / legacy drivers
#
# CONFIG_PATA_ACPI is not set
# CONFIG_ATA_GENERIC is not set
# CONFIG_PATA_LEGACY is not set
CONFIG_MD=y
CONFIG_BLK_DEV_MD=m
CONFIG_MD_LINEAR=m
CONFIG_MD_RAID0=m
CONFIG_MD_RAID1=m
CONFIG_MD_RAID10=m
CONFIG_MD_RAID456=m
CONFIG_MD_MULTIPATH=m
CONFIG_MD_FAULTY=m
# CONFIG_BCACHE is not set
CONFIG_BLK_DEV_DM_BUILTIN=y
CONFIG_BLK_DEV_DM=m
# CONFIG_DM_DEBUG is not set
CONFIG_DM_BUFIO=m
CONFIG_DM_BIO_PRISON=m
CONFIG_DM_PERSISTENT_DATA=m
# CONFIG_DM_DEBUG_BLOCK_STACK_TRACING is not set
# CONFIG_DM_CRYPT is not set
# CONFIG_DM_SNAPSHOT is not set
CONFIG_DM_THIN_PROVISIONING=m
# CONFIG_DM_CACHE is not set
# CONFIG_DM_ERA is not set
# CONFIG_DM_MIRROR is not set
# CONFIG_DM_RAID is not set
# CONFIG_DM_ZERO is not set
# CONFIG_DM_MULTIPATH is not set
# CONFIG_DM_DELAY is not set
CONFIG_DM_UEVENT=y
# CONFIG_DM_FLAKEY is not set
# CONFIG_DM_VERITY is not set
# CONFIG_DM_SWITCH is not set
# CONFIG_TARGET_CORE is not set
CONFIG_FUSION=y
# CONFIG_FUSION_SPI is not set
# CONFIG_FUSION_SAS is not set
CONFIG_FUSION_MAX_SGE=128
# CONFIG_FUSION_LOGGING is not set

#
# IEEE 1394 (FireWire) support
#
CONFIG_FIREWIRE=m
# CONFIG_FIREWIRE_OHCI is not set
# CONFIG_FIREWIRE_SBP2 is not set
# CONFIG_FIREWIRE_NET is not set
# CONFIG_FIREWIRE_NOSY is not set
# CONFIG_MACINTOSH_DRIVERS is not set
CONFIG_NETDEVICES=y
CONFIG_NET_CORE=y
# CONFIG_BONDING is not set
# CONFIG_DUMMY is not set
# CONFIG_EQUALIZER is not set
CONFIG_NET_FC=y
# CONFIG_IFB is not set
# CONFIG_NET_TEAM is not set
# CONFIG_MACVLAN is not set
# CONFIG_IPVLAN is not set
# CONFIG_VXLAN is not set
CONFIG_NETCONSOLE=m
CONFIG_NETCONSOLE_DYNAMIC=y
CONFIG_NETPOLL=y
CONFIG_NET_POLL_CONTROLLER=y
# CONFIG_TUN is not set
CONFIG_VETH=m
# CONFIG_NLMON is not set
# CONFIG_ARCNET is not set

#
# CAIF transport drivers
#
# CONFIG_VHOST_NET is not set

#
# Distributed Switch Architecture drivers
#
CONFIG_NET_DSA_MV88E6XXX=m
# CONFIG_NET_DSA_MV88E6060 is not set
CONFIG_NET_DSA_MV88E6XXX_NEED_PPU=y
CONFIG_NET_DSA_MV88E6131=m
CONFIG_NET_DSA_MV88E6123_61_65=m
# CONFIG_NET_DSA_MV88E6171 is not set
# CONFIG_NET_DSA_MV88E6352 is not set
# CONFIG_NET_DSA_BCM_SF2 is not set
CONFIG_ETHERNET=y
CONFIG_NET_VENDOR_3COM=y
# CONFIG_VORTEX is not set
# CONFIG_TYPHOON is not set
CONFIG_NET_VENDOR_ADAPTEC=y
# CONFIG_ADAPTEC_STARFIRE is not set
CONFIG_NET_VENDOR_AGERE=y
# CONFIG_ET131X is not set
CONFIG_NET_VENDOR_ALTEON=y
# CONFIG_ACENIC is not set
# CONFIG_ALTERA_TSE is not set
CONFIG_NET_VENDOR_AMD=y
# CONFIG_AMD8111_ETH is not set
# CONFIG_PCNET32 is not set
# CONFIG_AMD_XGBE is not set
# CONFIG_NET_XGENE is not set
CONFIG_NET_VENDOR_ARC=y
CONFIG_NET_VENDOR_ATHEROS=y
# CONFIG_ATL2 is not set
# CONFIG_ATL1 is not set
# CONFIG_ATL1E is not set
# CONFIG_ATL1C is not set
# CONFIG_ALX is not set
CONFIG_NET_VENDOR_BROADCOM=y
# CONFIG_B44 is not set
# CONFIG_BCMGENET is not set
# CONFIG_BNX2 is not set
# CONFIG_CNIC is not set
# CONFIG_TIGON3 is not set
# CONFIG_BNX2X is not set
CONFIG_NET_VENDOR_BROCADE=y
# CONFIG_BNA is not set
CONFIG_NET_VENDOR_CHELSIO=y
# CONFIG_CHELSIO_T1 is not set
# CONFIG_CHELSIO_T3 is not set
# CONFIG_CHELSIO_T4 is not set
# CONFIG_CHELSIO_T4VF is not set
CONFIG_NET_VENDOR_CISCO=y
# CONFIG_ENIC is not set
# CONFIG_CX_ECAT is not set
# CONFIG_DNET is not set
CONFIG_NET_VENDOR_DEC=y
CONFIG_NET_TULIP=y
# CONFIG_DE2104X is not set
# CONFIG_TULIP is not set
# CONFIG_DE4X5 is not set
# CONFIG_WINBOND_840 is not set
# CONFIG_DM9102 is not set
# CONFIG_ULI526X is not set
CONFIG_NET_VENDOR_DLINK=y
# CONFIG_DL2K is not set
# CONFIG_SUNDANCE is not set
CONFIG_NET_VENDOR_EMULEX=y
# CONFIG_BE2NET is not set
CONFIG_NET_VENDOR_EXAR=y
# CONFIG_S2IO is not set
# CONFIG_VXGE is not set
CONFIG_NET_VENDOR_HP=y
# CONFIG_HP100 is not set
CONFIG_NET_VENDOR_INTEL=y
# CONFIG_E100 is not set
# CONFIG_E1000 is not set
CONFIG_E1000E=m
# CONFIG_IGB is not set
# CONFIG_IGBVF is not set
# CONFIG_IXGB is not set
# CONFIG_IXGBE is not set
# CONFIG_IXGBEVF is not set
# CONFIG_I40E is not set
# CONFIG_I40EVF is not set
# CONFIG_FM10K is not set
CONFIG_NET_VENDOR_I825XX=y
# CONFIG_IP1000 is not set
# CONFIG_JME is not set
CONFIG_NET_VENDOR_MARVELL=y
# CONFIG_MVMDIO is not set
# CONFIG_SKGE is not set
# CONFIG_SKY2 is not set
CONFIG_NET_VENDOR_MELLANOX=y
# CONFIG_MLX4_EN is not set
# CONFIG_MLX4_CORE is not set
# CONFIG_MLX5_CORE is not set
CONFIG_NET_VENDOR_MICREL=y
# CONFIG_KS8851_MLL is not set
# CONFIG_KSZ884X_PCI is not set
CONFIG_NET_VENDOR_MYRI=y
# CONFIG_MYRI10GE is not set
# CONFIG_FEALNX is not set
CONFIG_NET_VENDOR_NATSEMI=y
# CONFIG_NATSEMI is not set
# CONFIG_NS83820 is not set
CONFIG_NET_VENDOR_8390=y
# CONFIG_NE2K_PCI is not set
CONFIG_NET_VENDOR_NVIDIA=y
# CONFIG_FORCEDETH is not set
CONFIG_NET_VENDOR_OKI=y
# CONFIG_ETHOC is not set
CONFIG_NET_PACKET_ENGINE=y
# CONFIG_HAMACHI is not set
# CONFIG_YELLOWFIN is not set
CONFIG_NET_VENDOR_QLOGIC=y
# CONFIG_QLA3XXX is not set
# CONFIG_QLCNIC is not set
# CONFIG_QLGE is not set
# CONFIG_NETXEN_NIC is not set
CONFIG_NET_VENDOR_QUALCOMM=y
CONFIG_NET_VENDOR_REALTEK=y
# CONFIG_ATP is not set
# CONFIG_8139CP is not set
# CONFIG_8139TOO is not set
# CONFIG_R8169 is not set
CONFIG_NET_VENDOR_RDC=y
# CONFIG_R6040 is not set
CONFIG_NET_VENDOR_ROCKER=y
CONFIG_NET_VENDOR_SAMSUNG=y
# CONFIG_SXGBE_ETH is not set
CONFIG_NET_VENDOR_SEEQ=y
CONFIG_NET_VENDOR_SILAN=y
# CONFIG_SC92031 is not set
CONFIG_NET_VENDOR_SIS=y
# CONFIG_SIS900 is not set
# CONFIG_SIS190 is not set
# CONFIG_SFC is not set
CONFIG_NET_VENDOR_SMSC=y
# CONFIG_EPIC100 is not set
# CONFIG_SMSC911X is not set
# CONFIG_SMSC9420 is not set
CONFIG_NET_VENDOR_STMICRO=y
# CONFIG_STMMAC_ETH is not set
CONFIG_NET_VENDOR_SUN=y
# CONFIG_HAPPYMEAL is not set
# CONFIG_SUNGEM is not set
# CONFIG_CASSINI is not set
# CONFIG_NIU is not set
CONFIG_NET_VENDOR_TEHUTI=y
# CONFIG_TEHUTI is not set
CONFIG_NET_VENDOR_TI=y
# CONFIG_TI_CPSW_ALE is not set
# CONFIG_TLAN is not set
CONFIG_NET_VENDOR_VIA=y
# CONFIG_VIA_RHINE is not set
# CONFIG_VIA_VELOCITY is not set
CONFIG_NET_VENDOR_WIZNET=y
# CONFIG_WIZNET_W5100 is not set
# CONFIG_WIZNET_W5300 is not set
# CONFIG_FDDI is not set
CONFIG_HIPPI=y
# CONFIG_ROADRUNNER is not set
# CONFIG_NET_SB1000 is not set
CONFIG_PHYLIB=m

#
# MII PHY device drivers
#
# CONFIG_AT803X_PHY is not set
# CONFIG_AMD_PHY is not set
# CONFIG_AMD_XGBE_PHY is not set
# CONFIG_MARVELL_PHY is not set
# CONFIG_DAVICOM_PHY is not set
# CONFIG_QSEMI_PHY is not set
# CONFIG_LXT_PHY is not set
# CONFIG_CICADA_PHY is not set
# CONFIG_VITESSE_PHY is not set
# CONFIG_SMSC_PHY is not set
# CONFIG_BROADCOM_PHY is not set
# CONFIG_BCM7XXX_PHY is not set
# CONFIG_BCM87XX_PHY is not set
# CONFIG_ICPLUS_PHY is not set
# CONFIG_REALTEK_PHY is not set
# CONFIG_NATIONAL_PHY is not set
# CONFIG_STE10XP is not set
# CONFIG_LSI_ET1011C_PHY is not set
# CONFIG_MICREL_PHY is not set
# CONFIG_FIXED_PHY is not set
# CONFIG_MDIO_BITBANG is not set
# CONFIG_MDIO_BCM_UNIMAC is not set
# CONFIG_PLIP is not set
# CONFIG_PPP is not set
# CONFIG_SLIP is not set
CONFIG_USB_NET_DRIVERS=y
# CONFIG_USB_CATC is not set
# CONFIG_USB_KAWETH is not set
# CONFIG_USB_PEGASUS is not set
# CONFIG_USB_RTL8150 is not set
# CONFIG_USB_RTL8152 is not set
# CONFIG_USB_USBNET is not set
# CONFIG_USB_IPHETH is not set
CONFIG_WLAN=y
# CONFIG_PRISM54 is not set
# CONFIG_HOSTAP is not set
CONFIG_WL_TI=y

#
# Enable WiMAX (Networking options) to see the WiMAX drivers
#
CONFIG_WAN=y
# CONFIG_HDLC is not set
# CONFIG_DLCI is not set
# CONFIG_SBNI is not set
CONFIG_VMXNET3=m
CONFIG_ISDN=y
# CONFIG_ISDN_I4L is not set
# CONFIG_ISDN_CAPI is not set
# CONFIG_ISDN_DRV_GIGASET is not set
# CONFIG_HYSDN is not set
# CONFIG_MISDN is not set

#
# Input device support
#
CONFIG_INPUT=y
CONFIG_INPUT_FF_MEMLESS=y
# CONFIG_INPUT_POLLDEV is not set
# CONFIG_INPUT_SPARSEKMAP is not set
# CONFIG_INPUT_MATRIXKMAP is not set

#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
# CONFIG_INPUT_JOYDEV is not set
CONFIG_INPUT_EVDEV=y
# CONFIG_INPUT_EVBUG is not set

#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
# CONFIG_KEYBOARD_ADP5588 is not set
# CONFIG_KEYBOARD_ADP5589 is not set
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_QT1070 is not set
# CONFIG_KEYBOARD_QT2160 is not set
# CONFIG_KEYBOARD_LKKBD is not set
# CONFIG_KEYBOARD_GPIO is not set
# CONFIG_KEYBOARD_GPIO_POLLED is not set
# CONFIG_KEYBOARD_TCA6416 is not set
# CONFIG_KEYBOARD_TCA8418 is not set
# CONFIG_KEYBOARD_MATRIX is not set
# CONFIG_KEYBOARD_LM8323 is not set
# CONFIG_KEYBOARD_LM8333 is not set
# CONFIG_KEYBOARD_MAX7359 is not set
# CONFIG_KEYBOARD_MCS is not set
# CONFIG_KEYBOARD_MPR121 is not set
# CONFIG_KEYBOARD_NEWTON is not set
# CONFIG_KEYBOARD_OPENCORES is not set
# CONFIG_KEYBOARD_SAMSUNG is not set
# CONFIG_KEYBOARD_STOWAWAY is not set
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_XTKBD is not set
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
CONFIG_MOUSE_PS2_ALPS=y
CONFIG_MOUSE_PS2_LOGIPS2PP=y
CONFIG_MOUSE_PS2_SYNAPTICS=y
CONFIG_MOUSE_PS2_CYPRESS=y
CONFIG_MOUSE_PS2_LIFEBOOK=y
CONFIG_MOUSE_PS2_TRACKPOINT=y
CONFIG_MOUSE_PS2_ELANTECH=y
CONFIG_MOUSE_PS2_SENTELIC=y
CONFIG_MOUSE_PS2_TOUCHKIT=y
CONFIG_MOUSE_PS2_FOCALTECH=y
# CONFIG_MOUSE_SERIAL is not set
# CONFIG_MOUSE_APPLETOUCH is not set
# CONFIG_MOUSE_BCM5974 is not set
# CONFIG_MOUSE_CYAPA is not set
# CONFIG_MOUSE_ELAN_I2C is not set
# CONFIG_MOUSE_VSXXXAA is not set
# CONFIG_MOUSE_GPIO is not set
# CONFIG_MOUSE_SYNAPTICS_I2C is not set
# CONFIG_MOUSE_SYNAPTICS_USB is not set
CONFIG_INPUT_JOYSTICK=y
CONFIG_JOYSTICK_ANALOG=m
# CONFIG_JOYSTICK_A3D is not set
# CONFIG_JOYSTICK_ADI is not set
# CONFIG_JOYSTICK_COBRA is not set
# CONFIG_JOYSTICK_GF2K is not set
# CONFIG_JOYSTICK_GRIP is not set
# CONFIG_JOYSTICK_GRIP_MP is not set
# CONFIG_JOYSTICK_GUILLEMOT is not set
# CONFIG_JOYSTICK_INTERACT is not set
# CONFIG_JOYSTICK_SIDEWINDER is not set
# CONFIG_JOYSTICK_TMDC is not set
# CONFIG_JOYSTICK_IFORCE is not set
# CONFIG_JOYSTICK_WARRIOR is not set
# CONFIG_JOYSTICK_MAGELLAN is not set
# CONFIG_JOYSTICK_SPACEORB is not set
# CONFIG_JOYSTICK_SPACEBALL is not set
# CONFIG_JOYSTICK_STINGER is not set
# CONFIG_JOYSTICK_TWIDJOY is not set
# CONFIG_JOYSTICK_ZHENHUA is not set
# CONFIG_JOYSTICK_DB9 is not set
# CONFIG_JOYSTICK_GAMECON is not set
# CONFIG_JOYSTICK_TURBOGRAFX is not set
# CONFIG_JOYSTICK_AS5011 is not set
# CONFIG_JOYSTICK_JOYDUMP is not set
# CONFIG_JOYSTICK_XPAD is not set
# CONFIG_JOYSTICK_WALKERA0701 is not set
CONFIG_INPUT_TABLET=y
# CONFIG_TABLET_USB_ACECAD is not set
# CONFIG_TABLET_USB_AIPTEK is not set
# CONFIG_TABLET_USB_GTCO is not set
# CONFIG_TABLET_USB_HANWANG is not set
# CONFIG_TABLET_USB_KBTAB is not set
# CONFIG_TABLET_SERIAL_WACOM4 is not set
CONFIG_INPUT_TOUCHSCREEN=y
# CONFIG_TOUCHSCREEN_AD7879 is not set
# CONFIG_TOUCHSCREEN_ATMEL_MXT is not set
# CONFIG_TOUCHSCREEN_AUO_PIXCIR is not set
# CONFIG_TOUCHSCREEN_BU21013 is not set
# CONFIG_TOUCHSCREEN_CY8CTMG110 is not set
# CONFIG_TOUCHSCREEN_CYTTSP_CORE is not set
# CONFIG_TOUCHSCREEN_CYTTSP4_CORE is not set
# CONFIG_TOUCHSCREEN_DYNAPRO is not set
# CONFIG_TOUCHSCREEN_HAMPSHIRE is not set
# CONFIG_TOUCHSCREEN_EETI is not set
# CONFIG_TOUCHSCREEN_FUJITSU is not set
# CONFIG_TOUCHSCREEN_GOODIX is not set
# CONFIG_TOUCHSCREEN_ILI210X is not set
# CONFIG_TOUCHSCREEN_GUNZE is not set
# CONFIG_TOUCHSCREEN_ELAN is not set
# CONFIG_TOUCHSCREEN_ELO is not set
# CONFIG_TOUCHSCREEN_WACOM_W8001 is not set
# CONFIG_TOUCHSCREEN_WACOM_I2C is not set
# CONFIG_TOUCHSCREEN_MAX11801 is not set
# CONFIG_TOUCHSCREEN_MCS5000 is not set
# CONFIG_TOUCHSCREEN_MMS114 is not set
# CONFIG_TOUCHSCREEN_MTOUCH is not set
# CONFIG_TOUCHSCREEN_INEXIO is not set
# CONFIG_TOUCHSCREEN_MK712 is not set
# CONFIG_TOUCHSCREEN_PENMOUNT is not set
# CONFIG_TOUCHSCREEN_EDT_FT5X06 is not set
# CONFIG_TOUCHSCREEN_TOUCHRIGHT is not set
# CONFIG_TOUCHSCREEN_TOUCHWIN is not set
# CONFIG_TOUCHSCREEN_PIXCIR is not set
# CONFIG_TOUCHSCREEN_WM97XX is not set
# CONFIG_TOUCHSCREEN_USB_COMPOSITE is not set
# CONFIG_TOUCHSCREEN_TOUCHIT213 is not set
# CONFIG_TOUCHSCREEN_TSC_SERIO is not set
# CONFIG_TOUCHSCREEN_TSC2007 is not set
# CONFIG_TOUCHSCREEN_ST1232 is not set
# CONFIG_TOUCHSCREEN_SUR40 is not set
# CONFIG_TOUCHSCREEN_TPS6507X is not set
# CONFIG_TOUCHSCREEN_ZFORCE is not set
CONFIG_INPUT_MISC=y
# CONFIG_INPUT_AD714X is not set
# CONFIG_INPUT_BMA150 is not set
# CONFIG_INPUT_E3X0_BUTTON is not set
# CONFIG_INPUT_PCSPKR is not set
# CONFIG_INPUT_MMA8450 is not set
# CONFIG_INPUT_MPU3050 is not set
# CONFIG_INPUT_APANEL is not set
# CONFIG_INPUT_GP2A is not set
# CONFIG_INPUT_GPIO_BEEPER is not set
# CONFIG_INPUT_GPIO_TILT_POLLED is not set
# CONFIG_INPUT_ATLAS_BTNS is not set
# CONFIG_INPUT_ATI_REMOTE2 is not set
# CONFIG_INPUT_KEYSPAN_REMOTE is not set
# CONFIG_INPUT_KXTJ9 is not set
# CONFIG_INPUT_POWERMATE is not set
# CONFIG_INPUT_YEALINK is not set
# CONFIG_INPUT_CM109 is not set
# CONFIG_INPUT_UINPUT is not set
# CONFIG_INPUT_PCF8574 is not set
# CONFIG_INPUT_PWM_BEEPER is not set
# CONFIG_INPUT_GPIO_ROTARY_ENCODER is not set
# CONFIG_INPUT_ADXL34X is not set
# CONFIG_INPUT_IMS_PCU is not set
# CONFIG_INPUT_CMA3000 is not set
# CONFIG_INPUT_IDEAPAD_SLIDEBAR is not set
# CONFIG_INPUT_DRV260X_HAPTICS is not set
# CONFIG_INPUT_DRV2667_HAPTICS is not set

#
# Hardware I/O ports
#
CONFIG_SERIO=y
CONFIG_ARCH_MIGHT_HAVE_PC_SERIO=y
CONFIG_SERIO_I8042=y
# CONFIG_SERIO_SERPORT is not set
# CONFIG_SERIO_CT82C710 is not set
# CONFIG_SERIO_PARKBD is not set
# CONFIG_SERIO_PCIPS2 is not set
CONFIG_SERIO_LIBPS2=y
CONFIG_SERIO_RAW=m
# CONFIG_SERIO_ALTERA_PS2 is not set
# CONFIG_SERIO_PS2MULT is not set
# CONFIG_SERIO_ARC_PS2 is not set
CONFIG_GAMEPORT=m
# CONFIG_GAMEPORT_NS558 is not set
# CONFIG_GAMEPORT_L4 is not set
CONFIG_GAMEPORT_EMU10K1=m
# CONFIG_GAMEPORT_FM801 is not set

#
# Character devices
#
CONFIG_TTY=y
CONFIG_VT=y
CONFIG_CONSOLE_TRANSLATIONS=y
CONFIG_VT_CONSOLE=y
CONFIG_VT_CONSOLE_SLEEP=y
CONFIG_HW_CONSOLE=y
CONFIG_VT_HW_CONSOLE_BINDING=y
CONFIG_UNIX98_PTYS=y
CONFIG_DEVPTS_MULTIPLE_INSTANCES=y
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=0
CONFIG_SERIAL_NONSTANDARD=y
# CONFIG_ROCKETPORT is not set
# CONFIG_CYCLADES is not set
# CONFIG_MOXA_INTELLIO is not set
# CONFIG_MOXA_SMARTIO is not set
# CONFIG_SYNCLINK is not set
# CONFIG_SYNCLINKMP is not set
# CONFIG_SYNCLINK_GT is not set
# CONFIG_NOZOMI is not set
# CONFIG_ISI is not set
# CONFIG_N_HDLC is not set
# CONFIG_N_GSM is not set
# CONFIG_TRACE_SINK is not set
CONFIG_DEVMEM=y
CONFIG_DEVKMEM=y

#
# Serial drivers
#
CONFIG_SERIAL_EARLYCON=y
CONFIG_SERIAL_8250=y
# CONFIG_SERIAL_8250_DEPRECATED_OPTIONS is not set
CONFIG_SERIAL_8250_PNP=y
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_SERIAL_8250_DMA=y
CONFIG_SERIAL_8250_PCI=y
CONFIG_SERIAL_8250_NR_UARTS=32
CONFIG_SERIAL_8250_RUNTIME_UARTS=32
# CONFIG_SERIAL_8250_EXTENDED is not set
# CONFIG_SERIAL_8250_DW is not set
# CONFIG_SERIAL_8250_FINTEK is not set

#
# Non-8250 serial port support
#
# CONFIG_SERIAL_MFD_HSU is not set
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
# CONFIG_SERIAL_JSM is not set
# CONFIG_SERIAL_SCCNXP is not set
# CONFIG_SERIAL_SC16IS7XX is not set
# CONFIG_SERIAL_ALTERA_JTAGUART is not set
# CONFIG_SERIAL_ALTERA_UART is not set
# CONFIG_SERIAL_ARC is not set
# CONFIG_SERIAL_RP2 is not set
# CONFIG_SERIAL_FSL_LPUART is not set
# CONFIG_TTY_PRINTK is not set
# CONFIG_PRINTER is not set
CONFIG_PPDEV=m
# CONFIG_IPMI_HANDLER is not set
CONFIG_HW_RANDOM=y
# CONFIG_HW_RANDOM_TIMERIOMEM is not set
# CONFIG_HW_RANDOM_INTEL is not set
# CONFIG_HW_RANDOM_AMD is not set
# CONFIG_HW_RANDOM_VIA is not set
CONFIG_NVRAM=y
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set
# CONFIG_MWAVE is not set
# CONFIG_RAW_DRIVER is not set
CONFIG_HPET=y
CONFIG_HPET_MMAP=y
CONFIG_HPET_MMAP_DEFAULT=y
# CONFIG_HANGCHECK_TIMER is not set
# CONFIG_UV_MMTIMER is not set
# CONFIG_TCG_TPM is not set
# CONFIG_TELCLOCK is not set
CONFIG_DEVPORT=y
# CONFIG_XILLYBUS is not set

#
# I2C support
#
CONFIG_I2C=y
CONFIG_ACPI_I2C_OPREGION=y
CONFIG_I2C_BOARDINFO=y
# CONFIG_I2C_COMPAT is not set
# CONFIG_I2C_CHARDEV is not set
# CONFIG_I2C_MUX is not set
CONFIG_I2C_HELPER_AUTO=y
CONFIG_I2C_ALGOBIT=m

#
# I2C Hardware Bus support
#

#
# PC SMBus host controller drivers
#
# CONFIG_I2C_ALI1535 is not set
# CONFIG_I2C_ALI1563 is not set
# CONFIG_I2C_ALI15X3 is not set
# CONFIG_I2C_AMD756 is not set
# CONFIG_I2C_AMD8111 is not set
CONFIG_I2C_I801=m
# CONFIG_I2C_ISCH is not set
# CONFIG_I2C_ISMT is not set
# CONFIG_I2C_PIIX4 is not set
# CONFIG_I2C_NFORCE2 is not set
# CONFIG_I2C_SIS5595 is not set
# CONFIG_I2C_SIS630 is not set
# CONFIG_I2C_SIS96X is not set
# CONFIG_I2C_VIA is not set
# CONFIG_I2C_VIAPRO is not set

#
# ACPI drivers
#
# CONFIG_I2C_SCMI is not set

#
# I2C system bus drivers (mostly embedded / system-on-chip)
#
# CONFIG_I2C_CBUS_GPIO is not set
# CONFIG_I2C_DESIGNWARE_PLATFORM is not set
# CONFIG_I2C_DESIGNWARE_PCI is not set
# CONFIG_I2C_GPIO is not set
# CONFIG_I2C_OCORES is not set
# CONFIG_I2C_PCA_PLATFORM is not set
# CONFIG_I2C_PXA_PCI is not set
# CONFIG_I2C_SIMTEC is not set
# CONFIG_I2C_XILINX is not set

#
# External I2C/SMBus adapter drivers
#
# CONFIG_I2C_DIOLAN_U2C is not set
# CONFIG_I2C_PARPORT is not set
# CONFIG_I2C_PARPORT_LIGHT is not set
# CONFIG_I2C_ROBOTFUZZ_OSIF is not set
# CONFIG_I2C_TAOS_EVM is not set
# CONFIG_I2C_TINY_USB is not set

#
# Other I2C/SMBus bus drivers
#
# CONFIG_I2C_STUB is not set
# CONFIG_I2C_SLAVE is not set
# CONFIG_I2C_DEBUG_CORE is not set
# CONFIG_I2C_DEBUG_ALGO is not set
# CONFIG_I2C_DEBUG_BUS is not set
# CONFIG_SPI is not set
# CONFIG_SPMI is not set
# CONFIG_HSI is not set

#
# PPS support
#
CONFIG_PPS=m
# CONFIG_PPS_DEBUG is not set

#
# PPS clients support
#
# CONFIG_PPS_CLIENT_KTIMER is not set
# CONFIG_PPS_CLIENT_LDISC is not set
# CONFIG_PPS_CLIENT_PARPORT is not set
# CONFIG_PPS_CLIENT_GPIO is not set

#
# PPS generators support
#

#
# PTP clock support
#
CONFIG_PTP_1588_CLOCK=m

#
# Enable PHYLIB and NETWORK_PHY_TIMESTAMPING to see the additional clocks.
#
CONFIG_PINCTRL=y

#
# Pin controllers
#
# CONFIG_DEBUG_PINCTRL is not set
# CONFIG_PINCTRL_BAYTRAIL is not set
# CONFIG_PINCTRL_CHERRYVIEW is not set
CONFIG_ARCH_WANT_OPTIONAL_GPIOLIB=y
CONFIG_GPIOLIB=y
CONFIG_GPIO_DEVRES=y
CONFIG_GPIO_ACPI=y
# CONFIG_DEBUG_GPIO is not set
CONFIG_GPIO_SYSFS=y
CONFIG_GPIO_GENERIC=m

#
# Memory mapped GPIO drivers:
#
CONFIG_GPIO_GENERIC_PLATFORM=m
# CONFIG_GPIO_IT8761E is not set
# CONFIG_GPIO_F7188X is not set
# CONFIG_GPIO_SCH311X is not set
# CONFIG_GPIO_SCH is not set
# CONFIG_GPIO_ICH is not set
# CONFIG_GPIO_VX855 is not set
# CONFIG_GPIO_LYNXPOINT is not set

#
# I2C GPIO expanders:
#
# CONFIG_GPIO_MAX7300 is not set
# CONFIG_GPIO_MAX732X is not set
# CONFIG_GPIO_PCA953X is not set
# CONFIG_GPIO_PCF857X is not set
# CONFIG_GPIO_SX150X is not set
# CONFIG_GPIO_ADP5588 is not set

#
# PCI GPIO expanders:
#
# CONFIG_GPIO_BT8XX is not set
# CONFIG_GPIO_AMD8111 is not set
# CONFIG_GPIO_INTEL_MID is not set
# CONFIG_GPIO_ML_IOH is not set
# CONFIG_GPIO_RDC321X is not set

#
# SPI GPIO expanders:
#
# CONFIG_GPIO_MCP23S08 is not set

#
# AC97 GPIO expanders:
#

#
# LPC GPIO expanders:
#

#
# MODULbus GPIO expanders:
#

#
# USB GPIO expanders:
#
# CONFIG_W1 is not set
CONFIG_POWER_SUPPLY=y
# CONFIG_POWER_SUPPLY_DEBUG is not set
# CONFIG_PDA_POWER is not set
# CONFIG_TEST_POWER is not set
# CONFIG_BATTERY_DS2780 is not set
# CONFIG_BATTERY_DS2781 is not set
# CONFIG_BATTERY_DS2782 is not set
# CONFIG_BATTERY_SBS is not set
# CONFIG_BATTERY_BQ27x00 is not set
# CONFIG_BATTERY_MAX17040 is not set
# CONFIG_BATTERY_MAX17042 is not set
# CONFIG_CHARGER_MAX8903 is not set
# CONFIG_CHARGER_LP8727 is not set
# CONFIG_CHARGER_GPIO is not set
# CONFIG_CHARGER_BQ2415X is not set
# CONFIG_CHARGER_BQ24190 is not set
# CONFIG_CHARGER_BQ24735 is not set
# CONFIG_CHARGER_SMB347 is not set
# CONFIG_BATTERY_GAUGE_LTC2941 is not set
CONFIG_POWER_RESET=y
# CONFIG_POWER_RESET_RESTART is not set
CONFIG_POWER_AVS=y
CONFIG_HWMON=y
# CONFIG_HWMON_VID is not set
# CONFIG_HWMON_DEBUG_CHIP is not set

#
# Native drivers
#
# CONFIG_SENSORS_ABITUGURU is not set
# CONFIG_SENSORS_ABITUGURU3 is not set
# CONFIG_SENSORS_AD7414 is not set
# CONFIG_SENSORS_AD7418 is not set
# CONFIG_SENSORS_ADM1021 is not set
# CONFIG_SENSORS_ADM1025 is not set
# CONFIG_SENSORS_ADM1026 is not set
# CONFIG_SENSORS_ADM1029 is not set
# CONFIG_SENSORS_ADM1031 is not set
# CONFIG_SENSORS_ADM9240 is not set
# CONFIG_SENSORS_ADT7410 is not set
# CONFIG_SENSORS_ADT7411 is not set
# CONFIG_SENSORS_ADT7462 is not set
# CONFIG_SENSORS_ADT7470 is not set
# CONFIG_SENSORS_ADT7475 is not set
# CONFIG_SENSORS_ASC7621 is not set
# CONFIG_SENSORS_K8TEMP is not set
# CONFIG_SENSORS_K10TEMP is not set
# CONFIG_SENSORS_FAM15H_POWER is not set
# CONFIG_SENSORS_APPLESMC is not set
# CONFIG_SENSORS_ASB100 is not set
# CONFIG_SENSORS_ATXP1 is not set
# CONFIG_SENSORS_DS620 is not set
# CONFIG_SENSORS_DS1621 is not set
# CONFIG_SENSORS_I5K_AMB is not set
# CONFIG_SENSORS_F71805F is not set
# CONFIG_SENSORS_F71882FG is not set
# CONFIG_SENSORS_F75375S is not set
# CONFIG_SENSORS_FSCHMD is not set
# CONFIG_SENSORS_GL518SM is not set
# CONFIG_SENSORS_GL520SM is not set
# CONFIG_SENSORS_G760A is not set
# CONFIG_SENSORS_G762 is not set
# CONFIG_SENSORS_GPIO_FAN is not set
# CONFIG_SENSORS_HIH6130 is not set
# CONFIG_SENSORS_I5500 is not set
CONFIG_SENSORS_CORETEMP=m
# CONFIG_SENSORS_IT87 is not set
# CONFIG_SENSORS_JC42 is not set
# CONFIG_SENSORS_POWR1220 is not set
# CONFIG_SENSORS_LINEAGE is not set
# CONFIG_SENSORS_LTC2945 is not set
# CONFIG_SENSORS_LTC4151 is not set
# CONFIG_SENSORS_LTC4215 is not set
# CONFIG_SENSORS_LTC4222 is not set
# CONFIG_SENSORS_LTC4245 is not set
# CONFIG_SENSORS_LTC4260 is not set
# CONFIG_SENSORS_LTC4261 is not set
# CONFIG_SENSORS_MAX16065 is not set
# CONFIG_SENSORS_MAX1619 is not set
# CONFIG_SENSORS_MAX1668 is not set
# CONFIG_SENSORS_MAX197 is not set
# CONFIG_SENSORS_MAX6639 is not set
# CONFIG_SENSORS_MAX6642 is not set
# CONFIG_SENSORS_MAX6650 is not set
# CONFIG_SENSORS_MAX6697 is not set
# CONFIG_SENSORS_HTU21 is not set
# CONFIG_SENSORS_MCP3021 is not set
# CONFIG_SENSORS_LM63 is not set
# CONFIG_SENSORS_LM73 is not set
# CONFIG_SENSORS_LM75 is not set
# CONFIG_SENSORS_LM77 is not set
# CONFIG_SENSORS_LM78 is not set
# CONFIG_SENSORS_LM80 is not set
# CONFIG_SENSORS_LM83 is not set
# CONFIG_SENSORS_LM85 is not set
# CONFIG_SENSORS_LM87 is not set
# CONFIG_SENSORS_LM90 is not set
# CONFIG_SENSORS_LM92 is not set
# CONFIG_SENSORS_LM93 is not set
# CONFIG_SENSORS_LM95234 is not set
# CONFIG_SENSORS_LM95241 is not set
# CONFIG_SENSORS_LM95245 is not set
# CONFIG_SENSORS_PC87360 is not set
# CONFIG_SENSORS_PC87427 is not set
# CONFIG_SENSORS_NTC_THERMISTOR is not set
# CONFIG_SENSORS_NCT6683 is not set
# CONFIG_SENSORS_NCT6775 is not set
# CONFIG_SENSORS_NCT7802 is not set
# CONFIG_SENSORS_PCF8591 is not set
# CONFIG_PMBUS is not set
# CONFIG_SENSORS_SHT15 is not set
# CONFIG_SENSORS_SHT21 is not set
# CONFIG_SENSORS_SHTC1 is not set
# CONFIG_SENSORS_SIS5595 is not set
# CONFIG_SENSORS_DME1737 is not set
# CONFIG_SENSORS_EMC1403 is not set
# CONFIG_SENSORS_EMC2103 is not set
# CONFIG_SENSORS_EMC6W201 is not set
# CONFIG_SENSORS_SMSC47M1 is not set
# CONFIG_SENSORS_SMSC47M192 is not set
# CONFIG_SENSORS_SMSC47B397 is not set
# CONFIG_SENSORS_SCH56XX_COMMON is not set
# CONFIG_SENSORS_SCH5627 is not set
# CONFIG_SENSORS_SCH5636 is not set
# CONFIG_SENSORS_SMM665 is not set
# CONFIG_SENSORS_ADC128D818 is not set
# CONFIG_SENSORS_ADS1015 is not set
# CONFIG_SENSORS_ADS7828 is not set
# CONFIG_SENSORS_AMC6821 is not set
# CONFIG_SENSORS_INA209 is not set
# CONFIG_SENSORS_INA2XX is not set
# CONFIG_SENSORS_THMC50 is not set
# CONFIG_SENSORS_TMP102 is not set
# CONFIG_SENSORS_TMP103 is not set
# CONFIG_SENSORS_TMP401 is not set
# CONFIG_SENSORS_TMP421 is not set
# CONFIG_SENSORS_VIA_CPUTEMP is not set
# CONFIG_SENSORS_VIA686A is not set
# CONFIG_SENSORS_VT1211 is not set
# CONFIG_SENSORS_VT8231 is not set
# CONFIG_SENSORS_W83781D is not set
# CONFIG_SENSORS_W83791D is not set
# CONFIG_SENSORS_W83792D is not set
# CONFIG_SENSORS_W83793 is not set
# CONFIG_SENSORS_W83795 is not set
# CONFIG_SENSORS_W83L785TS is not set
# CONFIG_SENSORS_W83L786NG is not set
# CONFIG_SENSORS_W83627HF is not set
# CONFIG_SENSORS_W83627EHF is not set

#
# ACPI drivers
#
# CONFIG_SENSORS_ACPI_POWER is not set
# CONFIG_SENSORS_ATK0110 is not set
CONFIG_THERMAL=y
CONFIG_THERMAL_HWMON=y
CONFIG_THERMAL_DEFAULT_GOV_STEP_WISE=y
# CONFIG_THERMAL_DEFAULT_GOV_FAIR_SHARE is not set
# CONFIG_THERMAL_DEFAULT_GOV_USER_SPACE is not set
CONFIG_THERMAL_GOV_FAIR_SHARE=y
CONFIG_THERMAL_GOV_STEP_WISE=y
# CONFIG_THERMAL_GOV_BANG_BANG is not set
CONFIG_THERMAL_GOV_USER_SPACE=y
# CONFIG_THERMAL_EMULATION is not set
CONFIG_INTEL_POWERCLAMP=m
CONFIG_X86_PKG_TEMP_THERMAL=m
# CONFIG_INTEL_SOC_DTS_THERMAL is not set
# CONFIG_INT340X_THERMAL is not set

#
# Texas Instruments thermal drivers
#
CONFIG_WATCHDOG=y
CONFIG_WATCHDOG_CORE=y
# CONFIG_WATCHDOG_NOWAYOUT is not set

#
# Watchdog Device Drivers
#
# CONFIG_SOFT_WATCHDOG is not set
# CONFIG_XILINX_WATCHDOG is not set
# CONFIG_DW_WATCHDOG is not set
# CONFIG_ACQUIRE_WDT is not set
# CONFIG_ADVANTECH_WDT is not set
# CONFIG_ALIM1535_WDT is not set
# CONFIG_ALIM7101_WDT is not set
# CONFIG_F71808E_WDT is not set
# CONFIG_SP5100_TCO is not set
# CONFIG_SBC_FITPC2_WATCHDOG is not set
# CONFIG_EUROTECH_WDT is not set
# CONFIG_IB700_WDT is not set
# CONFIG_IBMASR is not set
# CONFIG_WAFER_WDT is not set
# CONFIG_I6300ESB_WDT is not set
# CONFIG_IE6XX_WDT is not set
CONFIG_ITCO_WDT=m
CONFIG_ITCO_VENDOR_SUPPORT=y
# CONFIG_IT8712F_WDT is not set
# CONFIG_IT87_WDT is not set
# CONFIG_HP_WATCHDOG is not set
# CONFIG_SC1200_WDT is not set
# CONFIG_PC87413_WDT is not set
# CONFIG_NV_TCO is not set
# CONFIG_60XX_WDT is not set
# CONFIG_CPU5_WDT is not set
# CONFIG_SMSC_SCH311X_WDT is not set
# CONFIG_SMSC37B787_WDT is not set
# CONFIG_VIA_WDT is not set
# CONFIG_W83627HF_WDT is not set
# CONFIG_W83877F_WDT is not set
# CONFIG_W83977F_WDT is not set
# CONFIG_MACHZ_WDT is not set
# CONFIG_SBC_EPX_C3_WATCHDOG is not set
# CONFIG_MEN_A21_WDT is not set

#
# PCI-based Watchdog Cards
#
# CONFIG_PCIPCWATCHDOG is not set
# CONFIG_WDTPCI is not set

#
# USB-based Watchdog Cards
#
# CONFIG_USBPCWATCHDOG is not set
CONFIG_SSB_POSSIBLE=y

#
# Sonics Silicon Backplane
#
# CONFIG_SSB is not set
CONFIG_BCMA_POSSIBLE=y

#
# Broadcom specific AMBA
#
# CONFIG_BCMA is not set

#
# Multifunction device drivers
#
CONFIG_MFD_CORE=m
# CONFIG_MFD_AS3711 is not set
# CONFIG_PMIC_ADP5520 is not set
# CONFIG_MFD_AAT2870_CORE is not set
# CONFIG_MFD_BCM590XX is not set
# CONFIG_MFD_AXP20X is not set
# CONFIG_MFD_CROS_EC is not set
# CONFIG_PMIC_DA903X is not set
# CONFIG_MFD_DA9052_I2C is not set
# CONFIG_MFD_DA9055 is not set
# CONFIG_MFD_DA9063 is not set
# CONFIG_MFD_DA9150 is not set
# CONFIG_MFD_DLN2 is not set
# CONFIG_MFD_MC13XXX_I2C is not set
# CONFIG_HTC_PASIC3 is not set
# CONFIG_HTC_I2CPLD is not set
CONFIG_LPC_ICH=m
# CONFIG_LPC_SCH is not set
# CONFIG_INTEL_SOC_PMIC is not set
# CONFIG_MFD_JANZ_CMODIO is not set
# CONFIG_MFD_KEMPLD is not set
# CONFIG_MFD_88PM800 is not set
# CONFIG_MFD_88PM805 is not set
# CONFIG_MFD_88PM860X is not set
# CONFIG_MFD_MAX14577 is not set
# CONFIG_MFD_MAX77693 is not set
# CONFIG_MFD_MAX8907 is not set
# CONFIG_MFD_MAX8925 is not set
# CONFIG_MFD_MAX8997 is not set
# CONFIG_MFD_MAX8998 is not set
# CONFIG_MFD_MENF21BMC is not set
# CONFIG_MFD_VIPERBOARD is not set
# CONFIG_MFD_RETU is not set
# CONFIG_MFD_PCF50633 is not set
# CONFIG_UCB1400_CORE is not set
# CONFIG_MFD_RDC321X is not set
# CONFIG_MFD_RTSX_PCI is not set
# CONFIG_MFD_RT5033 is not set
# CONFIG_MFD_RTSX_USB is not set
# CONFIG_MFD_RC5T583 is not set
# CONFIG_MFD_RN5T618 is not set
# CONFIG_MFD_SEC_CORE is not set
# CONFIG_MFD_SI476X_CORE is not set
# CONFIG_MFD_SM501 is not set
# CONFIG_MFD_SMSC is not set
# CONFIG_ABX500_CORE is not set
CONFIG_MFD_SYSCON=y
# CONFIG_MFD_TI_AM335X_TSCADC is not set
# CONFIG_MFD_LP3943 is not set
# CONFIG_MFD_LP8788 is not set
# CONFIG_MFD_PALMAS is not set
# CONFIG_TPS6105X is not set
# CONFIG_TPS65010 is not set
# CONFIG_TPS6507X is not set
# CONFIG_MFD_TPS65090 is not set
# CONFIG_MFD_TPS65217 is not set
# CONFIG_MFD_TPS65218 is not set
# CONFIG_MFD_TPS6586X is not set
# CONFIG_MFD_TPS65910 is not set
# CONFIG_MFD_TPS65912 is not set
# CONFIG_MFD_TPS65912_I2C is not set
# CONFIG_MFD_TPS80031 is not set
# CONFIG_TWL4030_CORE is not set
# CONFIG_TWL6040_CORE is not set
# CONFIG_MFD_WL1273_CORE is not set
# CONFIG_MFD_LM3533 is not set
# CONFIG_MFD_TC3589X is not set
# CONFIG_MFD_TMIO is not set
# CONFIG_MFD_VX855 is not set
# CONFIG_MFD_ARIZONA_I2C is not set
# CONFIG_MFD_WM8400 is not set
# CONFIG_MFD_WM831X_I2C is not set
# CONFIG_MFD_WM8350_I2C is not set
# CONFIG_MFD_WM8994 is not set
# CONFIG_REGULATOR is not set
# CONFIG_MEDIA_SUPPORT is not set

#
# Graphics support
#
CONFIG_AGP=y
CONFIG_AGP_AMD64=y
CONFIG_AGP_INTEL=y
CONFIG_AGP_SIS=y
CONFIG_AGP_VIA=y
CONFIG_INTEL_GTT=y
CONFIG_VGA_ARB=y
CONFIG_VGA_ARB_MAX_GPUS=16
CONFIG_VGA_SWITCHEROO=y

#
# Direct Rendering Manager
#
CONFIG_DRM=m
CONFIG_DRM_MIPI_DSI=y
CONFIG_DRM_KMS_HELPER=m
CONFIG_DRM_KMS_FB_HELPER=y
CONFIG_DRM_LOAD_EDID_FIRMWARE=y
CONFIG_DRM_TTM=m

#
# I2C encoder or helper chips
#
# CONFIG_DRM_I2C_ADV7511 is not set
# CONFIG_DRM_I2C_CH7006 is not set
# CONFIG_DRM_I2C_SIL164 is not set
# CONFIG_DRM_I2C_NXP_TDA998X is not set
# CONFIG_DRM_TDFX is not set
# CONFIG_DRM_R128 is not set
# CONFIG_DRM_RADEON is not set
# CONFIG_DRM_NOUVEAU is not set
CONFIG_DRM_I915=m
CONFIG_DRM_I915_KMS=y
CONFIG_DRM_I915_FBDEV=y
CONFIG_DRM_I915_PRELIMINARY_HW_SUPPORT=y
# CONFIG_DRM_MGA is not set
# CONFIG_DRM_SIS is not set
# CONFIG_DRM_VIA is not set
# CONFIG_DRM_SAVAGE is not set
CONFIG_DRM_VMWGFX=m
CONFIG_DRM_VMWGFX_FBCON=y
# CONFIG_DRM_GMA500 is not set
# CONFIG_DRM_UDL is not set
# CONFIG_DRM_AST is not set
# CONFIG_DRM_MGAG200 is not set
# CONFIG_DRM_CIRRUS_QEMU is not set
# CONFIG_DRM_QXL is not set
# CONFIG_DRM_BOCHS is not set
CONFIG_DRM_PANEL=y

#
# Display Panels
#

#
# Frame buffer Devices
#
CONFIG_FB=y
CONFIG_FIRMWARE_EDID=y
CONFIG_FB_CMDLINE=y
# CONFIG_FB_DDC is not set
CONFIG_FB_BOOT_VESA_SUPPORT=y
CONFIG_FB_CFB_FILLRECT=y
CONFIG_FB_CFB_COPYAREA=y
CONFIG_FB_CFB_IMAGEBLIT=y
# CONFIG_FB_CFB_REV_PIXELS_IN_BYTE is not set
# CONFIG_FB_SYS_FILLRECT is not set
# CONFIG_FB_SYS_COPYAREA is not set
# CONFIG_FB_SYS_IMAGEBLIT is not set
# CONFIG_FB_FOREIGN_ENDIAN is not set
# CONFIG_FB_SYS_FOPS is not set
CONFIG_FB_DEFERRED_IO=y
# CONFIG_FB_SVGALIB is not set
# CONFIG_FB_MACMODES is not set
# CONFIG_FB_BACKLIGHT is not set
CONFIG_FB_MODE_HELPERS=y
CONFIG_FB_TILEBLITTING=y

#
# Frame buffer hardware drivers
#
# CONFIG_FB_CIRRUS is not set
# CONFIG_FB_PM2 is not set
# CONFIG_FB_CYBER2000 is not set
# CONFIG_FB_ARC is not set
# CONFIG_FB_ASILIANT is not set
# CONFIG_FB_IMSTT is not set
# CONFIG_FB_VGA16 is not set
# CONFIG_FB_UVESA is not set
CONFIG_FB_VESA=y
CONFIG_FB_EFI=y
# CONFIG_FB_N411 is not set
# CONFIG_FB_HGA is not set
# CONFIG_FB_OPENCORES is not set
# CONFIG_FB_S1D13XXX is not set
# CONFIG_FB_NVIDIA is not set
# CONFIG_FB_RIVA is not set
# CONFIG_FB_I740 is not set
# CONFIG_FB_LE80578 is not set
# CONFIG_FB_INTEL is not set
# CONFIG_FB_MATROX is not set
# CONFIG_FB_RADEON is not set
# CONFIG_FB_ATY128 is not set
# CONFIG_FB_ATY is not set
# CONFIG_FB_S3 is not set
# CONFIG_FB_SAVAGE is not set
# CONFIG_FB_SIS is not set
# CONFIG_FB_VIA is not set
# CONFIG_FB_NEOMAGIC is not set
# CONFIG_FB_KYRO is not set
# CONFIG_FB_3DFX is not set
# CONFIG_FB_VOODOO1 is not set
# CONFIG_FB_VT8623 is not set
# CONFIG_FB_TRIDENT is not set
# CONFIG_FB_ARK is not set
# CONFIG_FB_PM3 is not set
# CONFIG_FB_CARMINE is not set
# CONFIG_FB_SMSCUFX is not set
# CONFIG_FB_UDL is not set
# CONFIG_FB_VIRTUAL is not set
# CONFIG_FB_METRONOME is not set
# CONFIG_FB_MB862XX is not set
# CONFIG_FB_BROADSHEET is not set
# CONFIG_FB_AUO_K190X is not set
# CONFIG_FB_SIMPLE is not set
CONFIG_BACKLIGHT_LCD_SUPPORT=y
# CONFIG_LCD_CLASS_DEVICE is not set
CONFIG_BACKLIGHT_CLASS_DEVICE=y
# CONFIG_BACKLIGHT_GENERIC is not set
# CONFIG_BACKLIGHT_PWM is not set
# CONFIG_BACKLIGHT_APPLE is not set
# CONFIG_BACKLIGHT_SAHARA is not set
# CONFIG_BACKLIGHT_ADP8860 is not set
# CONFIG_BACKLIGHT_ADP8870 is not set
# CONFIG_BACKLIGHT_LM3630A is not set
# CONFIG_BACKLIGHT_LM3639 is not set
# CONFIG_BACKLIGHT_LP855X is not set
# CONFIG_BACKLIGHT_GPIO is not set
# CONFIG_BACKLIGHT_LV5207LP is not set
# CONFIG_BACKLIGHT_BD6107 is not set
# CONFIG_VGASTATE is not set
CONFIG_HDMI=y

#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
CONFIG_VGACON_SOFT_SCROLLBACK=y
CONFIG_VGACON_SOFT_SCROLLBACK_SIZE=64
CONFIG_DUMMY_CONSOLE=y
CONFIG_DUMMY_CONSOLE_COLUMNS=80
CONFIG_DUMMY_CONSOLE_ROWS=25
CONFIG_FRAMEBUFFER_CONSOLE=y
CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y
CONFIG_FRAMEBUFFER_CONSOLE_ROTATION=y
# CONFIG_LOGO is not set
CONFIG_SOUND=m
CONFIG_SOUND_OSS_CORE=y
CONFIG_SOUND_OSS_CORE_PRECLAIM=y
CONFIG_SND=m
CONFIG_SND_TIMER=m
CONFIG_SND_PCM=m
CONFIG_SND_HWDEP=m
CONFIG_SND_RAWMIDI=m
CONFIG_SND_JACK=y
CONFIG_SND_SEQUENCER=m
CONFIG_SND_SEQ_DUMMY=m
CONFIG_SND_OSSEMUL=y
CONFIG_SND_MIXER_OSS=m
CONFIG_SND_PCM_OSS=m
CONFIG_SND_PCM_OSS_PLUGINS=y
CONFIG_SND_SEQUENCER_OSS=y
CONFIG_SND_HRTIMER=m
CONFIG_SND_SEQ_HRTIMER_DEFAULT=y
CONFIG_SND_DYNAMIC_MINORS=y
CONFIG_SND_MAX_CARDS=32
CONFIG_SND_SUPPORT_OLD_API=y
CONFIG_SND_VERBOSE_PROCFS=y
CONFIG_SND_VERBOSE_PRINTK=y
CONFIG_SND_DEBUG=y
# CONFIG_SND_DEBUG_VERBOSE is not set
CONFIG_SND_PCM_XRUN_DEBUG=y
CONFIG_SND_VMASTER=y
CONFIG_SND_KCTL_JACK=y
CONFIG_SND_DMA_SGBUF=y
CONFIG_SND_RAWMIDI_SEQ=m
# CONFIG_SND_OPL3_LIB_SEQ is not set
# CONFIG_SND_OPL4_LIB_SEQ is not set
# CONFIG_SND_SBAWE_SEQ is not set
CONFIG_SND_EMU10K1_SEQ=m
CONFIG_SND_VX_LIB=m
CONFIG_SND_AC97_CODEC=m
CONFIG_SND_DRIVERS=y
# CONFIG_SND_PCSP is not set
CONFIG_SND_DUMMY=m
CONFIG_SND_ALOOP=m
CONFIG_SND_VIRMIDI=m
# CONFIG_SND_MTPAV is not set
# CONFIG_SND_MTS64 is not set
# CONFIG_SND_SERIAL_U16550 is not set
# CONFIG_SND_MPU401 is not set
# CONFIG_SND_PORTMAN2X4 is not set
CONFIG_SND_AC97_POWER_SAVE=y
CONFIG_SND_AC97_POWER_SAVE_DEFAULT=0
CONFIG_SND_PCI=y
# CONFIG_SND_AD1889 is not set
# CONFIG_SND_ALS300 is not set
# CONFIG_SND_ALS4000 is not set
# CONFIG_SND_ALI5451 is not set
# CONFIG_SND_ASIHPI is not set
# CONFIG_SND_ATIIXP is not set
# CONFIG_SND_ATIIXP_MODEM is not set
# CONFIG_SND_AU8810 is not set
# CONFIG_SND_AU8820 is not set
# CONFIG_SND_AU8830 is not set
# CONFIG_SND_AW2 is not set
# CONFIG_SND_AZT3328 is not set
# CONFIG_SND_BT87X is not set
# CONFIG_SND_CA0106 is not set
# CONFIG_SND_CMIPCI is not set
# CONFIG_SND_OXYGEN is not set
# CONFIG_SND_CS4281 is not set
# CONFIG_SND_CS46XX is not set
CONFIG_SND_CTXFI=m
# CONFIG_SND_DARLA20 is not set
# CONFIG_SND_GINA20 is not set
# CONFIG_SND_LAYLA20 is not set
# CONFIG_SND_DARLA24 is not set
# CONFIG_SND_GINA24 is not set
# CONFIG_SND_LAYLA24 is not set
# CONFIG_SND_MONA is not set
# CONFIG_SND_MIA is not set
# CONFIG_SND_ECHO3G is not set
# CONFIG_SND_INDIGO is not set
# CONFIG_SND_INDIGOIO is not set
# CONFIG_SND_INDIGODJ is not set
# CONFIG_SND_INDIGOIOX is not set
# CONFIG_SND_INDIGODJX is not set
CONFIG_SND_EMU10K1=m
# CONFIG_SND_EMU10K1X is not set
# CONFIG_SND_ENS1370 is not set
# CONFIG_SND_ENS1371 is not set
# CONFIG_SND_ES1938 is not set
# CONFIG_SND_ES1968 is not set
# CONFIG_SND_FM801 is not set
# CONFIG_SND_HDSP is not set
# CONFIG_SND_HDSPM is not set
# CONFIG_SND_ICE1712 is not set
# CONFIG_SND_ICE1724 is not set
# CONFIG_SND_INTEL8X0 is not set
# CONFIG_SND_INTEL8X0M is not set
# CONFIG_SND_KORG1212 is not set
# CONFIG_SND_LOLA is not set
# CONFIG_SND_LX6464ES is not set
# CONFIG_SND_MAESTRO3 is not set
# CONFIG_SND_MIXART is not set
# CONFIG_SND_NM256 is not set
# CONFIG_SND_PCXHR is not set
# CONFIG_SND_RIPTIDE is not set
# CONFIG_SND_RME32 is not set
# CONFIG_SND_RME96 is not set
# CONFIG_SND_RME9652 is not set
# CONFIG_SND_SE6X is not set
# CONFIG_SND_SONICVIBES is not set
# CONFIG_SND_TRIDENT is not set
# CONFIG_SND_VIA82XX is not set
# CONFIG_SND_VIA82XX_MODEM is not set
# CONFIG_SND_VIRTUOSO is not set
CONFIG_SND_VX222=m
# CONFIG_SND_YMFPCI is not set

#
# HD-Audio
#
CONFIG_SND_HDA=m
CONFIG_SND_HDA_INTEL=m
CONFIG_SND_HDA_DSP_LOADER=y
CONFIG_SND_HDA_PREALLOC_SIZE=1024
CONFIG_SND_HDA_HWDEP=y
CONFIG_SND_HDA_RECONFIG=y
CONFIG_SND_HDA_INPUT_BEEP=y
CONFIG_SND_HDA_INPUT_BEEP_MODE=1
CONFIG_SND_HDA_INPUT_JACK=y
CONFIG_SND_HDA_PATCH_LOADER=y
CONFIG_SND_HDA_CODEC_REALTEK=m
CONFIG_SND_HDA_CODEC_ANALOG=m
CONFIG_SND_HDA_CODEC_SIGMATEL=m
CONFIG_SND_HDA_CODEC_VIA=m
CONFIG_SND_HDA_CODEC_HDMI=m
CONFIG_SND_HDA_I915=y
CONFIG_SND_HDA_CODEC_CIRRUS=m
CONFIG_SND_HDA_CODEC_CONEXANT=m
CONFIG_SND_HDA_CODEC_CA0110=m
CONFIG_SND_HDA_CODEC_CA0132=m
CONFIG_SND_HDA_CODEC_CA0132_DSP=y
CONFIG_SND_HDA_CODEC_CMEDIA=m
CONFIG_SND_HDA_CODEC_SI3054=m
CONFIG_SND_HDA_GENERIC=m
CONFIG_SND_HDA_POWER_SAVE_DEFAULT=0
CONFIG_SND_HDA_CORE=m
CONFIG_SND_USB=y
CONFIG_SND_USB_AUDIO=m
# CONFIG_SND_USB_UA101 is not set
# CONFIG_SND_USB_USX2Y is not set
# CONFIG_SND_USB_CAIAQ is not set
# CONFIG_SND_USB_US122L is not set
# CONFIG_SND_USB_6FIRE is not set
# CONFIG_SND_USB_HIFACE is not set
# CONFIG_SND_BCD2000 is not set
CONFIG_SND_USB_LINE6=m
CONFIG_SND_USB_POD=m
CONFIG_SND_USB_PODHD=m
CONFIG_SND_USB_TONEPORT=m
CONFIG_SND_USB_VARIAX=m
CONFIG_SND_FIREWIRE=y
CONFIG_SND_FIREWIRE_LIB=m
# CONFIG_SND_DICE is not set
CONFIG_SND_OXFW=m
# CONFIG_SND_ISIGHT is not set
# CONFIG_SND_SCS1X is not set
# CONFIG_SND_FIREWORKS is not set
# CONFIG_SND_BEBOB is not set
# CONFIG_SND_SOC is not set
# CONFIG_SOUND_PRIME is not set
CONFIG_AC97_BUS=m

#
# HID support
#
CONFIG_HID=y
CONFIG_HID_BATTERY_STRENGTH=y
CONFIG_HIDRAW=y
# CONFIG_UHID is not set
CONFIG_HID_GENERIC=y

#
# Special HID drivers
#
CONFIG_HID_A4TECH=y
# CONFIG_HID_ACRUX is not set
CONFIG_HID_APPLE=y
# CONFIG_HID_APPLEIR is not set
# CONFIG_HID_AUREAL is not set
CONFIG_HID_BELKIN=y
# CONFIG_HID_BETOP_FF is not set
CONFIG_HID_CHERRY=y
CONFIG_HID_CHICONY=y
# CONFIG_HID_PRODIKEYS is not set
# CONFIG_HID_CP2112 is not set
CONFIG_HID_CYPRESS=y
# CONFIG_HID_DRAGONRISE is not set
# CONFIG_HID_EMS_FF is not set
# CONFIG_HID_ELECOM is not set
# CONFIG_HID_ELO is not set
CONFIG_HID_EZKEY=y
# CONFIG_HID_HOLTEK is not set
# CONFIG_HID_GT683R is not set
# CONFIG_HID_HUION is not set
# CONFIG_HID_KEYTOUCH is not set
# CONFIG_HID_KYE is not set
# CONFIG_HID_UCLOGIC is not set
# CONFIG_HID_WALTOP is not set
# CONFIG_HID_GYRATION is not set
# CONFIG_HID_ICADE is not set
# CONFIG_HID_TWINHAN is not set
CONFIG_HID_KENSINGTON=y
# CONFIG_HID_LCPOWER is not set
# CONFIG_HID_LENOVO is not set
CONFIG_HID_LOGITECH=y
# CONFIG_HID_LOGITECH_DJ is not set
# CONFIG_HID_LOGITECH_HIDPP is not set
CONFIG_LOGITECH_FF=y
CONFIG_LOGIRUMBLEPAD2_FF=y
CONFIG_LOGIG940_FF=y
CONFIG_LOGIWHEELS_FF=y
# CONFIG_HID_MAGICMOUSE is not set
CONFIG_HID_MICROSOFT=y
CONFIG_HID_MONTEREY=y
# CONFIG_HID_MULTITOUCH is not set
# CONFIG_HID_NTRIG is not set
# CONFIG_HID_ORTEK is not set
# CONFIG_HID_PANTHERLORD is not set
# CONFIG_HID_PENMOUNT is not set
# CONFIG_HID_PETALYNX is not set
# CONFIG_HID_PICOLCD is not set
# CONFIG_HID_PLANTRONICS is not set
# CONFIG_HID_PRIMAX is not set
# CONFIG_HID_ROCCAT is not set
# CONFIG_HID_SAITEK is not set
# CONFIG_HID_SAMSUNG is not set
# CONFIG_HID_SONY is not set
# CONFIG_HID_SPEEDLINK is not set
# CONFIG_HID_STEELSERIES is not set
# CONFIG_HID_SUNPLUS is not set
# CONFIG_HID_RMI is not set
# CONFIG_HID_GREENASIA is not set
# CONFIG_HID_SMARTJOYPLUS is not set
# CONFIG_HID_TIVO is not set
# CONFIG_HID_TOPSEED is not set
# CONFIG_HID_THINGM is not set
# CONFIG_HID_THRUSTMASTER is not set
# CONFIG_HID_WACOM is not set
# CONFIG_HID_WIIMOTE is not set
# CONFIG_HID_XINMO is not set
# CONFIG_HID_ZEROPLUS is not set
# CONFIG_HID_ZYDACRON is not set
# CONFIG_HID_SENSOR_HUB is not set

#
# USB HID support
#
CONFIG_USB_HID=y
CONFIG_HID_PID=y
CONFIG_USB_HIDDEV=y

#
# I2C HID support
#
# CONFIG_I2C_HID is not set
CONFIG_USB_OHCI_LITTLE_ENDIAN=y
CONFIG_USB_SUPPORT=y
CONFIG_USB_COMMON=y
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB=y
CONFIG_USB_ANNOUNCE_NEW_DEVICES=y

#
# Miscellaneous USB options
#
CONFIG_USB_DEFAULT_PERSIST=y
# CONFIG_USB_DYNAMIC_MINORS is not set
CONFIG_USB_OTG=y
# CONFIG_USB_OTG_WHITELIST is not set
# CONFIG_USB_OTG_BLACKLIST_HUB is not set
# CONFIG_USB_OTG_FSM is not set
# CONFIG_USB_MON is not set
# CONFIG_USB_WUSB_CBAF is not set

#
# USB Host Controller Drivers
#
# CONFIG_USB_C67X00_HCD is not set
CONFIG_USB_XHCI_HCD=m
CONFIG_USB_XHCI_PCI=m
CONFIG_USB_EHCI_HCD=y
CONFIG_USB_EHCI_ROOT_HUB_TT=y
CONFIG_USB_EHCI_TT_NEWSCHED=y
CONFIG_USB_EHCI_PCI=y
# CONFIG_USB_EHCI_HCD_PLATFORM is not set
# CONFIG_USB_OXU210HP_HCD is not set
# CONFIG_USB_ISP116X_HCD is not set
# CONFIG_USB_ISP1362_HCD is not set
# CONFIG_USB_FUSBH200_HCD is not set
# CONFIG_USB_FOTG210_HCD is not set
CONFIG_USB_OHCI_HCD=y
# CONFIG_USB_OHCI_HCD_PCI is not set
# CONFIG_USB_OHCI_HCD_PLATFORM is not set
CONFIG_USB_UHCI_HCD=y
# CONFIG_USB_SL811_HCD is not set
# CONFIG_USB_R8A66597_HCD is not set
# CONFIG_USB_HCD_TEST_MODE is not set

#
# USB Device Class drivers
#
CONFIG_USB_ACM=m
# CONFIG_USB_PRINTER is not set
# CONFIG_USB_WDM is not set
# CONFIG_USB_TMC is not set

#
# NOTE: USB_STORAGE depends on SCSI but BLK_DEV_SD may
#

#
# also be needed; see USB_STORAGE Help for more info
#
CONFIG_USB_STORAGE=m
# CONFIG_USB_STORAGE_DEBUG is not set
# CONFIG_USB_STORAGE_REALTEK is not set
# CONFIG_USB_STORAGE_DATAFAB is not set
# CONFIG_USB_STORAGE_FREECOM is not set
# CONFIG_USB_STORAGE_ISD200 is not set
# CONFIG_USB_STORAGE_USBAT is not set
# CONFIG_USB_STORAGE_SDDR09 is not set
# CONFIG_USB_STORAGE_SDDR55 is not set
# CONFIG_USB_STORAGE_JUMPSHOT is not set
# CONFIG_USB_STORAGE_ALAUDA is not set
# CONFIG_USB_STORAGE_ONETOUCH is not set
# CONFIG_USB_STORAGE_KARMA is not set
# CONFIG_USB_STORAGE_CYPRESS_ATACB is not set
# CONFIG_USB_STORAGE_ENE_UB6250 is not set
CONFIG_USB_UAS=m

#
# USB Imaging devices
#
# CONFIG_USB_MDC800 is not set
# CONFIG_USB_MICROTEK is not set
# CONFIG_USBIP_CORE is not set
# CONFIG_USB_MUSB_HDRC is not set
# CONFIG_USB_DWC3 is not set
# CONFIG_USB_DWC2 is not set
# CONFIG_USB_CHIPIDEA is not set
# CONFIG_USB_ISP1760 is not set

#
# USB port drivers
#
# CONFIG_USB_USS720 is not set
# CONFIG_USB_SERIAL is not set

#
# USB Miscellaneous drivers
#
# CONFIG_USB_EMI62 is not set
# CONFIG_USB_EMI26 is not set
# CONFIG_USB_ADUTUX is not set
# CONFIG_USB_SEVSEG is not set
# CONFIG_USB_RIO500 is not set
# CONFIG_USB_LEGOTOWER is not set
# CONFIG_USB_LCD is not set
# CONFIG_USB_LED is not set
# CONFIG_USB_CYPRESS_CY7C63 is not set
# CONFIG_USB_CYTHERM is not set
# CONFIG_USB_IDMOUSE is not set
# CONFIG_USB_FTDI_ELAN is not set
# CONFIG_USB_APPLEDISPLAY is not set
# CONFIG_USB_SISUSBVGA is not set
# CONFIG_USB_LD is not set
# CONFIG_USB_TRANCEVIBRATOR is not set
# CONFIG_USB_IOWARRIOR is not set
# CONFIG_USB_TEST is not set
# CONFIG_USB_EHSET_TEST_FIXTURE is not set
# CONFIG_USB_ISIGHTFW is not set
# CONFIG_USB_YUREX is not set
# CONFIG_USB_EZUSB_FX2 is not set
# CONFIG_USB_HSIC_USB3503 is not set
# CONFIG_USB_LINK_LAYER_TEST is not set

#
# USB Physical Layer drivers
#
# CONFIG_USB_PHY is not set
# CONFIG_NOP_USB_XCEIV is not set
# CONFIG_USB_GPIO_VBUS is not set
# CONFIG_USB_ISP1301 is not set
# CONFIG_USB_GADGET is not set
# CONFIG_USB_LED_TRIG is not set
# CONFIG_UWB is not set
# CONFIG_MMC is not set
# CONFIG_MEMSTICK is not set
CONFIG_NEW_LEDS=y
CONFIG_LEDS_CLASS=y
# CONFIG_LEDS_CLASS_FLASH is not set

#
# LED drivers
#
# CONFIG_LEDS_LM3530 is not set
# CONFIG_LEDS_LM3642 is not set
# CONFIG_LEDS_PCA9532 is not set
# CONFIG_LEDS_GPIO is not set
# CONFIG_LEDS_LP3944 is not set
# CONFIG_LEDS_LP5521 is not set
# CONFIG_LEDS_LP5523 is not set
# CONFIG_LEDS_LP5562 is not set
# CONFIG_LEDS_LP8501 is not set
# CONFIG_LEDS_LP8860 is not set
# CONFIG_LEDS_CLEVO_MAIL is not set
# CONFIG_LEDS_PCA955X is not set
# CONFIG_LEDS_PCA963X is not set
# CONFIG_LEDS_PWM is not set
# CONFIG_LEDS_BD2802 is not set
# CONFIG_LEDS_INTEL_SS4200 is not set
# CONFIG_LEDS_LT3593 is not set
# CONFIG_LEDS_TCA6507 is not set
# CONFIG_LEDS_LM355x is not set

#
# LED driver for blink(1) USB RGB LED is under Special HID drivers (HID_THINGM)
#
# CONFIG_LEDS_BLINKM is not set

#
# LED Triggers
#
CONFIG_LEDS_TRIGGERS=y
# CONFIG_LEDS_TRIGGER_TIMER is not set
# CONFIG_LEDS_TRIGGER_ONESHOT is not set
# CONFIG_LEDS_TRIGGER_HEARTBEAT is not set
# CONFIG_LEDS_TRIGGER_BACKLIGHT is not set
CONFIG_LEDS_TRIGGER_CPU=y
# CONFIG_LEDS_TRIGGER_GPIO is not set
# CONFIG_LEDS_TRIGGER_DEFAULT_ON is not set

#
# iptables trigger is under Netfilter config (LED target)
#
# CONFIG_LEDS_TRIGGER_TRANSIENT is not set
# CONFIG_LEDS_TRIGGER_CAMERA is not set
# CONFIG_ACCESSIBILITY is not set
# CONFIG_INFINIBAND is not set
CONFIG_EDAC=y
CONFIG_EDAC_LEGACY_SYSFS=y
# CONFIG_EDAC_DEBUG is not set
# CONFIG_EDAC_DECODE_MCE is not set
# CONFIG_EDAC_MM_EDAC is not set
CONFIG_RTC_LIB=y
CONFIG_RTC_CLASS=y
CONFIG_RTC_HCTOSYS=y
CONFIG_RTC_SYSTOHC=y
CONFIG_RTC_HCTOSYS_DEVICE="rtc0"
# CONFIG_RTC_DEBUG is not set

#
# RTC interfaces
#
CONFIG_RTC_INTF_SYSFS=y
CONFIG_RTC_INTF_PROC=y
CONFIG_RTC_INTF_DEV=y
# CONFIG_RTC_INTF_DEV_UIE_EMUL is not set
# CONFIG_RTC_DRV_TEST is not set

#
# I2C RTC drivers
#
# CONFIG_RTC_DRV_ABB5ZES3 is not set
# CONFIG_RTC_DRV_DS1307 is not set
# CONFIG_RTC_DRV_DS1374 is not set
# CONFIG_RTC_DRV_DS1672 is not set
# CONFIG_RTC_DRV_DS3232 is not set
# CONFIG_RTC_DRV_MAX6900 is not set
# CONFIG_RTC_DRV_RS5C372 is not set
# CONFIG_RTC_DRV_ISL1208 is not set
# CONFIG_RTC_DRV_ISL12022 is not set
# CONFIG_RTC_DRV_ISL12057 is not set
# CONFIG_RTC_DRV_X1205 is not set
# CONFIG_RTC_DRV_PCF2127 is not set
# CONFIG_RTC_DRV_PCF8523 is not set
# CONFIG_RTC_DRV_PCF8563 is not set
# CONFIG_RTC_DRV_PCF85063 is not set
# CONFIG_RTC_DRV_PCF8583 is not set
# CONFIG_RTC_DRV_M41T80 is not set
# CONFIG_RTC_DRV_BQ32K is not set
# CONFIG_RTC_DRV_S35390A is not set
# CONFIG_RTC_DRV_FM3130 is not set
# CONFIG_RTC_DRV_RX8581 is not set
# CONFIG_RTC_DRV_RX8025 is not set
# CONFIG_RTC_DRV_EM3027 is not set
# CONFIG_RTC_DRV_RV3029C2 is not set

#
# SPI RTC drivers
#

#
# Platform RTC drivers
#
CONFIG_RTC_DRV_CMOS=y
# CONFIG_RTC_DRV_DS1286 is not set
# CONFIG_RTC_DRV_DS1511 is not set
# CONFIG_RTC_DRV_DS1553 is not set
# CONFIG_RTC_DRV_DS1685_FAMILY is not set
# CONFIG_RTC_DRV_DS1742 is not set
# CONFIG_RTC_DRV_DS2404 is not set
# CONFIG_RTC_DRV_STK17TA8 is not set
# CONFIG_RTC_DRV_M48T86 is not set
# CONFIG_RTC_DRV_M48T35 is not set
# CONFIG_RTC_DRV_M48T59 is not set
# CONFIG_RTC_DRV_MSM6242 is not set
# CONFIG_RTC_DRV_BQ4802 is not set
# CONFIG_RTC_DRV_RP5C01 is not set
# CONFIG_RTC_DRV_V3020 is not set

#
# on-CPU RTC drivers
#
# CONFIG_RTC_DRV_XGENE is not set

#
# HID Sensor RTC drivers
#
# CONFIG_RTC_DRV_HID_SENSOR_TIME is not set
CONFIG_DMADEVICES=y
# CONFIG_DMADEVICES_DEBUG is not set

#
# DMA Devices
#
# CONFIG_INTEL_MID_DMAC is not set
# CONFIG_INTEL_IOATDMA is not set
# CONFIG_DW_DMAC_CORE is not set
# CONFIG_DW_DMAC is not set
# CONFIG_DW_DMAC_PCI is not set
CONFIG_DMA_ACPI=y
CONFIG_AUXDISPLAY=y
# CONFIG_KS0108 is not set
# CONFIG_UIO is not set
# CONFIG_VFIO is not set
CONFIG_VIRT_DRIVERS=y

#
# Virtio drivers
#
# CONFIG_VIRTIO_PCI is not set
# CONFIG_VIRTIO_MMIO is not set

#
# Microsoft Hyper-V guest support
#
# CONFIG_HYPERV is not set
CONFIG_STAGING=y
# CONFIG_SLICOSS is not set
# CONFIG_COMEDI is not set
# CONFIG_PANEL is not set
# CONFIG_RTL8192U is not set
# CONFIG_RTLLIB is not set
# CONFIG_R8712U is not set
# CONFIG_R8188EU is not set
# CONFIG_RTS5208 is not set
# CONFIG_FB_SM7XX is not set
# CONFIG_FB_XGI is not set
CONFIG_FT1000=m
# CONFIG_FT1000_USB is not set

#
# Speakup console speech
#
# CONFIG_SPEAKUP is not set
# CONFIG_TOUCHSCREEN_SYNAPTICS_I2C_RMI4 is not set
CONFIG_STAGING_MEDIA=y

#
# Android
#
# CONFIG_USB_WPAN_HCD is not set
# CONFIG_WIMAX_GDM72XX is not set
# CONFIG_LTE_GDM724X is not set
# CONFIG_FIREWIRE_SERIAL is not set
# CONFIG_LUSTRE_FS is not set
# CONFIG_DGNC is not set
# CONFIG_DGAP is not set
# CONFIG_GS_FPGABOOT is not set
# CONFIG_CRYPTO_SKEIN is not set
# CONFIG_UNISYSSPAR is not set
# CONFIG_I2O is not set
CONFIG_X86_PLATFORM_DEVICES=y
# CONFIG_ACERHDF is not set
# CONFIG_ASUS_LAPTOP is not set
# CONFIG_DELL_LAPTOP is not set
# CONFIG_DELL_SMO8800 is not set
# CONFIG_FUJITSU_LAPTOP is not set
# CONFIG_FUJITSU_TABLET is not set
# CONFIG_HP_ACCEL is not set
# CONFIG_HP_WIRELESS is not set
# CONFIG_PANASONIC_LAPTOP is not set
# CONFIG_THINKPAD_ACPI is not set
# CONFIG_SENSORS_HDAPS is not set
# CONFIG_INTEL_MENLOW is not set
# CONFIG_EEEPC_LAPTOP is not set
# CONFIG_ACPI_WMI is not set
# CONFIG_TOPSTAR_LAPTOP is not set
# CONFIG_TOSHIBA_BT_RFKILL is not set
# CONFIG_TOSHIBA_HAPS is not set
# CONFIG_ACPI_CMPC is not set
# CONFIG_INTEL_IPS is not set
# CONFIG_IBM_RTL is not set
# CONFIG_SAMSUNG_LAPTOP is not set
# CONFIG_SAMSUNG_Q10 is not set
# CONFIG_APPLE_GMUX is not set
# CONFIG_INTEL_RST is not set
# CONFIG_INTEL_SMARTCONNECT is not set
# CONFIG_PVPANIC is not set
CONFIG_CHROME_PLATFORMS=y
# CONFIG_CHROMEOS_LAPTOP is not set
# CONFIG_CHROMEOS_PSTORE is not set
CONFIG_CLKDEV_LOOKUP=y
CONFIG_HAVE_CLK_PREPARE=y
CONFIG_COMMON_CLK=y

#
# Common Clock Framework
#
# CONFIG_COMMON_CLK_SI5351 is not set
# CONFIG_COMMON_CLK_PXA is not set
# CONFIG_COMMON_CLK_CDCE706 is not set

#
# Hardware Spinlock drivers
#

#
# Clock Source drivers
#
CONFIG_CLKEVT_I8253=y
CONFIG_I8253_LOCK=y
CONFIG_CLKBLD_I8253=y
# CONFIG_ATMEL_PIT is not set
# CONFIG_SH_TIMER_CMT is not set
# CONFIG_SH_TIMER_MTU2 is not set
# CONFIG_SH_TIMER_TMU is not set
# CONFIG_EM_TIMER_STI is not set
# CONFIG_MAILBOX is not set
CONFIG_IOMMU_API=y
CONFIG_IOMMU_SUPPORT=y

#
# Generic IOMMU Pagetable Support
#
CONFIG_IOMMU_IOVA=y
CONFIG_AMD_IOMMU=y
# CONFIG_AMD_IOMMU_STATS is not set
# CONFIG_AMD_IOMMU_V2 is not set
CONFIG_DMAR_TABLE=y
CONFIG_INTEL_IOMMU=y
# CONFIG_INTEL_IOMMU_DEFAULT_ON is not set
CONFIG_INTEL_IOMMU_FLOPPY_WA=y
CONFIG_IRQ_REMAP=y

#
# Remoteproc drivers
#
# CONFIG_STE_MODEM_RPROC is not set

#
# Rpmsg drivers
#

#
# SOC (System On Chip) specific Drivers
#
# CONFIG_SOC_TI is not set
# CONFIG_PM_DEVFREQ is not set
# CONFIG_EXTCON is not set
CONFIG_MEMORY=y
# CONFIG_IIO is not set
# CONFIG_NTB is not set
# CONFIG_VME_BUS is not set
CONFIG_PWM=y
CONFIG_PWM_SYSFS=y
# CONFIG_PWM_LPSS is not set
# CONFIG_IPACK_BUS is not set
CONFIG_RESET_CONTROLLER=y
# CONFIG_FMC is not set

#
# PHY Subsystem
#
CONFIG_GENERIC_PHY=y
# CONFIG_BCM_KONA_USB2_PHY is not set
# CONFIG_POWERCAP is not set
# CONFIG_MCB is not set
CONFIG_RAS=y
# CONFIG_THUNDERBOLT is not set

#
# Android
#
# CONFIG_ANDROID is not set

#
# Firmware Drivers
#
# CONFIG_EDD is not set
CONFIG_FIRMWARE_MEMMAP=y
# CONFIG_DELL_RBU is not set
CONFIG_DCDBAS=m
CONFIG_DMIID=y
# CONFIG_DMI_SYSFS is not set
CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK=y
CONFIG_ISCSI_IBFT_FIND=y
CONFIG_ISCSI_IBFT=m
# CONFIG_GOOGLE_FIRMWARE is not set

#
# EFI (Extensible Firmware Interface) Support
#
CONFIG_EFI_VARS=y
# CONFIG_EFI_VARS_PSTORE is not set
CONFIG_EFI_RUNTIME_MAP=y
CONFIG_EFI_RUNTIME_WRAPPERS=y
CONFIG_UEFI_CPER=y

#
# File systems
#
CONFIG_DCACHE_WORD_ACCESS=y
# CONFIG_EXT2_FS is not set
# CONFIG_EXT3_FS is not set
CONFIG_EXT4_FS=y
CONFIG_EXT4_USE_FOR_EXT23=y
CONFIG_EXT4_FS_POSIX_ACL=y
CONFIG_EXT4_FS_SECURITY=y
# CONFIG_EXT4_DEBUG is not set
CONFIG_JBD2=y
# CONFIG_JBD2_DEBUG is not set
CONFIG_FS_MBCACHE=y
# CONFIG_REISERFS_FS is not set
# CONFIG_JFS_FS is not set
# CONFIG_XFS_FS is not set
# CONFIG_GFS2_FS is not set
# CONFIG_OCFS2_FS is not set
# CONFIG_BTRFS_FS is not set
# CONFIG_NILFS2_FS is not set
# CONFIG_FS_DAX is not set
CONFIG_FS_POSIX_ACL=y
CONFIG_EXPORTFS=y
CONFIG_FILE_LOCKING=y
CONFIG_FSNOTIFY=y
CONFIG_DNOTIFY=y
CONFIG_INOTIFY_USER=y
CONFIG_FANOTIFY=y
CONFIG_FANOTIFY_ACCESS_PERMISSIONS=y
CONFIG_QUOTA=y
CONFIG_QUOTA_NETLINK_INTERFACE=y
CONFIG_PRINT_QUOTA_WARNING=y
# CONFIG_QUOTA_DEBUG is not set
# CONFIG_QFMT_V1 is not set
# CONFIG_QFMT_V2 is not set
CONFIG_QUOTACTL=y
CONFIG_QUOTACTL_COMPAT=y
CONFIG_AUTOFS4_FS=y
CONFIG_FUSE_FS=m
# CONFIG_CUSE is not set
# CONFIG_OVERLAY_FS is not set

#
# Caches
#
CONFIG_FSCACHE=m
CONFIG_FSCACHE_STATS=y
# CONFIG_FSCACHE_HISTOGRAM is not set
# CONFIG_FSCACHE_DEBUG is not set
CONFIG_FSCACHE_OBJECT_LIST=y
CONFIG_CACHEFILES=m
# CONFIG_CACHEFILES_DEBUG is not set
# CONFIG_CACHEFILES_HISTOGRAM is not set

#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=m
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
CONFIG_UDF_FS=m
CONFIG_UDF_NLS=y

#
# DOS/FAT/NT Filesystems
#
CONFIG_FAT_FS=m
CONFIG_MSDOS_FS=m
CONFIG_VFAT_FS=m
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
# CONFIG_NTFS_FS is not set

#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_PROC_VMCORE=y
CONFIG_PROC_SYSCTL=y
CONFIG_PROC_PAGE_MONITOR=y
CONFIG_KERNFS=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
CONFIG_TMPFS_POSIX_ACL=y
CONFIG_TMPFS_XATTR=y
CONFIG_HUGETLBFS=y
CONFIG_HUGETLB_PAGE=y
CONFIG_CONFIGFS_FS=m
# CONFIG_EFIVAR_FS is not set
CONFIG_MISC_FILESYSTEMS=y
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_ECRYPT_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_HFSPLUS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
# CONFIG_LOGFS is not set
# CONFIG_CRAMFS is not set
CONFIG_SQUASHFS=m
CONFIG_SQUASHFS_FILE_CACHE=y
# CONFIG_SQUASHFS_FILE_DIRECT is not set
CONFIG_SQUASHFS_DECOMP_SINGLE=y
# CONFIG_SQUASHFS_DECOMP_MULTI is not set
# CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU is not set
CONFIG_SQUASHFS_XATTR=y
CONFIG_SQUASHFS_ZLIB=y
# CONFIG_SQUASHFS_LZ4 is not set
CONFIG_SQUASHFS_LZO=y
CONFIG_SQUASHFS_XZ=y
CONFIG_SQUASHFS_4K_DEVBLK_SIZE=y
# CONFIG_SQUASHFS_EMBEDDED is not set
CONFIG_SQUASHFS_FRAGMENT_CACHE_SIZE=3
# CONFIG_VXFS_FS is not set
# CONFIG_MINIX_FS is not set
# CONFIG_OMFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_QNX6FS_FS is not set
# CONFIG_ROMFS_FS is not set
CONFIG_PSTORE=y
# CONFIG_PSTORE_CONSOLE is not set
# CONFIG_PSTORE_PMSG is not set
# CONFIG_PSTORE_FTRACE is not set
# CONFIG_PSTORE_RAM is not set
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set
CONFIG_EXOFS_FS=m
# CONFIG_EXOFS_DEBUG is not set
# CONFIG_F2FS_FS is not set
CONFIG_ORE=m
CONFIG_NETWORK_FILESYSTEMS=y
CONFIG_NFS_FS=m
CONFIG_NFS_V2=m
CONFIG_NFS_V3=m
CONFIG_NFS_V3_ACL=y
CONFIG_NFS_V4=m
CONFIG_NFS_SWAP=y
CONFIG_NFS_V4_1=y
CONFIG_NFS_V4_2=y
CONFIG_PNFS_FILE_LAYOUT=m
CONFIG_PNFS_BLOCK=m
CONFIG_PNFS_OBJLAYOUT=m
CONFIG_PNFS_FLEXFILE_LAYOUT=m
CONFIG_NFS_V4_1_IMPLEMENTATION_ID_DOMAIN="kernel.org"
# CONFIG_NFS_V4_1_MIGRATION is not set
CONFIG_NFS_V4_SECURITY_LABEL=y
CONFIG_NFS_FSCACHE=y
# CONFIG_NFS_USE_LEGACY_DNS is not set
CONFIG_NFS_USE_KERNEL_DNS=y
CONFIG_NFS_DEBUG=y
CONFIG_NFSD=m
CONFIG_NFSD_V2_ACL=y
CONFIG_NFSD_V3=y
CONFIG_NFSD_V3_ACL=y
CONFIG_NFSD_V4=y
CONFIG_NFSD_PNFS=y
CONFIG_NFSD_V4_SECURITY_LABEL=y
# CONFIG_NFSD_FAULT_INJECTION is not set
CONFIG_GRACE_PERIOD=m
CONFIG_LOCKD=m
CONFIG_LOCKD_V4=y
CONFIG_NFS_ACL_SUPPORT=m
CONFIG_NFS_COMMON=y
CONFIG_SUNRPC=m
CONFIG_SUNRPC_GSS=m
CONFIG_SUNRPC_BACKCHANNEL=y
CONFIG_SUNRPC_SWAP=y
CONFIG_SUNRPC_DEBUG=y
# CONFIG_CEPH_FS is not set
# CONFIG_CIFS is not set
# CONFIG_NCP_FS is not set
# CONFIG_CODA_FS is not set
# CONFIG_AFS_FS is not set
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="utf8"
CONFIG_NLS_CODEPAGE_437=m
CONFIG_NLS_CODEPAGE_737=m
CONFIG_NLS_CODEPAGE_775=m
CONFIG_NLS_CODEPAGE_850=m
CONFIG_NLS_CODEPAGE_852=m
CONFIG_NLS_CODEPAGE_855=m
CONFIG_NLS_CODEPAGE_857=m
CONFIG_NLS_CODEPAGE_860=m
CONFIG_NLS_CODEPAGE_861=m
CONFIG_NLS_CODEPAGE_862=m
CONFIG_NLS_CODEPAGE_863=m
CONFIG_NLS_CODEPAGE_864=m
CONFIG_NLS_CODEPAGE_865=m
CONFIG_NLS_CODEPAGE_866=m
CONFIG_NLS_CODEPAGE_869=m
CONFIG_NLS_CODEPAGE_936=m
CONFIG_NLS_CODEPAGE_950=m
CONFIG_NLS_CODEPAGE_932=m
CONFIG_NLS_CODEPAGE_949=m
CONFIG_NLS_CODEPAGE_874=m
CONFIG_NLS_ISO8859_8=m
CONFIG_NLS_CODEPAGE_1250=m
CONFIG_NLS_CODEPAGE_1251=m
CONFIG_NLS_ASCII=m
CONFIG_NLS_ISO8859_1=m
CONFIG_NLS_ISO8859_2=m
CONFIG_NLS_ISO8859_3=m
CONFIG_NLS_ISO8859_4=m
CONFIG_NLS_ISO8859_5=m
CONFIG_NLS_ISO8859_6=m
CONFIG_NLS_ISO8859_7=m
CONFIG_NLS_ISO8859_9=m
CONFIG_NLS_ISO8859_13=m
CONFIG_NLS_ISO8859_14=m
CONFIG_NLS_ISO8859_15=m
CONFIG_NLS_KOI8_R=m
CONFIG_NLS_KOI8_U=m
CONFIG_NLS_MAC_ROMAN=m
CONFIG_NLS_MAC_CELTIC=m
CONFIG_NLS_MAC_CENTEURO=m
CONFIG_NLS_MAC_CROATIAN=m
CONFIG_NLS_MAC_CYRILLIC=m
CONFIG_NLS_MAC_GAELIC=m
CONFIG_NLS_MAC_GREEK=m
CONFIG_NLS_MAC_ICELAND=m
CONFIG_NLS_MAC_INUIT=m
CONFIG_NLS_MAC_ROMANIAN=m
CONFIG_NLS_MAC_TURKISH=m
CONFIG_NLS_UTF8=m
# CONFIG_DLM is not set

#
# Kernel hacking
#
CONFIG_TRACE_IRQFLAGS_SUPPORT=y

#
# printk and dmesg options
#
CONFIG_PRINTK_TIME=y
CONFIG_MESSAGE_LOGLEVEL_DEFAULT=4
# CONFIG_BOOT_PRINTK_DELAY is not set
CONFIG_DYNAMIC_DEBUG=y

#
# Compile-time checks and compiler options
#
CONFIG_DEBUG_INFO=y
# CONFIG_DEBUG_INFO_REDUCED is not set
# CONFIG_DEBUG_INFO_SPLIT is not set
# CONFIG_DEBUG_INFO_DWARF4 is not set
# CONFIG_GDB_SCRIPTS is not set
CONFIG_ENABLE_WARN_DEPRECATED=y
CONFIG_ENABLE_MUST_CHECK=y
CONFIG_FRAME_WARN=2048
CONFIG_STRIP_ASM_SYMS=y
# CONFIG_READABLE_ASM is not set
CONFIG_UNUSED_SYMBOLS=y
# CONFIG_PAGE_OWNER is not set
CONFIG_DEBUG_FS=y
CONFIG_HEADERS_CHECK=y
CONFIG_DEBUG_SECTION_MISMATCH=y
CONFIG_ARCH_WANT_FRAME_POINTERS=y
CONFIG_FRAME_POINTER=y
CONFIG_DEBUG_FORCE_WEAK_PER_CPU=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x1
CONFIG_DEBUG_KERNEL=y

#
# Memory Debugging
#
# CONFIG_PAGE_EXTENSION is not set
# CONFIG_DEBUG_PAGEALLOC is not set
# CONFIG_DEBUG_OBJECTS is not set
# CONFIG_DEBUG_SLAB is not set
CONFIG_HAVE_DEBUG_KMEMLEAK=y
# CONFIG_DEBUG_KMEMLEAK is not set
# CONFIG_DEBUG_STACK_USAGE is not set
# CONFIG_DEBUG_VM is not set
# CONFIG_DEBUG_VIRTUAL is not set
CONFIG_DEBUG_MEMORY_INIT=y
# CONFIG_DEBUG_PER_CPU_MAPS is not set
CONFIG_HAVE_DEBUG_STACKOVERFLOW=y
# CONFIG_DEBUG_STACKOVERFLOW is not set
CONFIG_HAVE_ARCH_KMEMCHECK=y
CONFIG_HAVE_ARCH_KASAN=y
CONFIG_KASAN_SHADOW_OFFSET=0xdffffc0000000000
# CONFIG_DEBUG_SHIRQ is not set

#
# Debug Lockups and Hangs
#
CONFIG_LOCKUP_DETECTOR=y
CONFIG_HARDLOCKUP_DETECTOR=y
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC_VALUE=1
# CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=0
CONFIG_DETECT_HUNG_TASK=y
CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=480
# CONFIG_BOOTPARAM_HUNG_TASK_PANIC is not set
CONFIG_BOOTPARAM_HUNG_TASK_PANIC_VALUE=0
# CONFIG_PANIC_ON_OOPS is not set
CONFIG_PANIC_ON_OOPS_VALUE=0
CONFIG_PANIC_TIMEOUT=90
CONFIG_SCHED_DEBUG=y
CONFIG_SCHEDSTATS=y
# CONFIG_SCHED_STACK_END_CHECK is not set
CONFIG_TIMER_STATS=y
# CONFIG_DEBUG_PREEMPT is not set

#
# Lock Debugging (spinlocks, mutexes, etc...)
#
CONFIG_DEBUG_RT_MUTEXES=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_PROVE_LOCKING=y
CONFIG_LOCKDEP=y
# CONFIG_LOCK_STAT is not set
# CONFIG_DEBUG_LOCKDEP is not set
CONFIG_DEBUG_ATOMIC_SLEEP=y
# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
# CONFIG_LOCK_TORTURE_TEST is not set
CONFIG_TRACE_IRQFLAGS=y
CONFIG_STACKTRACE=y
# CONFIG_DEBUG_KOBJECT is not set
CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_DEBUG_LIST=y
CONFIG_DEBUG_PI_LIST=y
CONFIG_DEBUG_SG=y
CONFIG_DEBUG_NOTIFIERS=y
# CONFIG_DEBUG_CREDENTIALS is not set

#
# RCU Debugging
#
CONFIG_PROVE_RCU=y
# CONFIG_PROVE_RCU_REPEATEDLY is not set
# CONFIG_SPARSE_RCU_POINTER is not set
# CONFIG_TORTURE_TEST is not set
# CONFIG_RCU_TORTURE_TEST is not set
CONFIG_RCU_CPU_STALL_TIMEOUT=60
# CONFIG_RCU_CPU_STALL_INFO is not set
# CONFIG_RCU_TRACE is not set
# CONFIG_DEBUG_BLOCK_EXT_DEVT is not set
# CONFIG_NOTIFIER_ERROR_INJECTION is not set
# CONFIG_FAULT_INJECTION is not set
CONFIG_LATENCYTOP=y
CONFIG_ARCH_HAS_DEBUG_STRICT_USER_COPY_CHECKS=y
# CONFIG_DEBUG_STRICT_USER_COPY_CHECKS is not set
CONFIG_USER_STACKTRACE_SUPPORT=y
CONFIG_NOP_TRACER=y
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_FP_TEST=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
CONFIG_HAVE_FENTRY=y
CONFIG_HAVE_C_RECORDMCOUNT=y
CONFIG_TRACER_MAX_TRACE=y
CONFIG_TRACE_CLOCK=y
CONFIG_RING_BUFFER=y
CONFIG_EVENT_TRACING=y
CONFIG_CONTEXT_SWITCH_TRACER=y
CONFIG_RING_BUFFER_ALLOW_SWAP=y
CONFIG_TRACING=y
CONFIG_GENERIC_TRACER=y
CONFIG_TRACING_SUPPORT=y
CONFIG_FTRACE=y
CONFIG_FUNCTION_TRACER=y
CONFIG_FUNCTION_GRAPH_TRACER=y
# CONFIG_IRQSOFF_TRACER is not set
# CONFIG_PREEMPT_TRACER is not set
CONFIG_SCHED_TRACER=y
CONFIG_FTRACE_SYSCALLS=y
CONFIG_TRACER_SNAPSHOT=y
CONFIG_TRACER_SNAPSHOT_PER_CPU_SWAP=y
CONFIG_BRANCH_PROFILE_NONE=y
# CONFIG_PROFILE_ANNOTATED_BRANCHES is not set
# CONFIG_PROFILE_ALL_BRANCHES is not set
CONFIG_STACK_TRACER=y
CONFIG_BLK_DEV_IO_TRACE=y
CONFIG_KPROBE_EVENT=y
CONFIG_UPROBE_EVENT=y
CONFIG_PROBE_EVENTS=y
CONFIG_DYNAMIC_FTRACE=y
CONFIG_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_FUNCTION_PROFILER=y
CONFIG_FTRACE_MCOUNT_RECORD=y
# CONFIG_FTRACE_STARTUP_TEST is not set
# CONFIG_MMIOTRACE is not set
# CONFIG_TRACEPOINT_BENCHMARK is not set
# CONFIG_RING_BUFFER_BENCHMARK is not set
# CONFIG_RING_BUFFER_STARTUP_TEST is not set

#
# Runtime Testing
#
# CONFIG_LKDTM is not set
# CONFIG_TEST_LIST_SORT is not set
# CONFIG_KPROBES_SANITY_TEST is not set
# CONFIG_BACKTRACE_SELF_TEST is not set
# CONFIG_RBTREE_TEST is not set
# CONFIG_INTERVAL_TREE_TEST is not set
# CONFIG_PERCPU_TEST is not set
# CONFIG_ATOMIC64_SELFTEST is not set
# CONFIG_ASYNC_RAID6_TEST is not set
# CONFIG_TEST_HEXDUMP is not set
# CONFIG_TEST_STRING_HELPERS is not set
# CONFIG_TEST_KSTRTOX is not set
# CONFIG_TEST_RHASHTABLE is not set
CONFIG_PROVIDE_OHCI1394_DMA_INIT=y
CONFIG_BUILD_DOCSRC=y
# CONFIG_DMA_API_DEBUG is not set
# CONFIG_TEST_LKM is not set
# CONFIG_TEST_USER_COPY is not set
# CONFIG_TEST_BPF is not set
# CONFIG_TEST_FIRMWARE is not set
# CONFIG_TEST_UDELAY is not set
# CONFIG_SAMPLES is not set
CONFIG_HAVE_ARCH_KGDB=y
CONFIG_KGDB=y
# CONFIG_KGDB_SERIAL_CONSOLE is not set
# CONFIG_KGDB_TESTS is not set
CONFIG_KGDB_LOW_LEVEL_TRAP=y
CONFIG_KGDB_KDB=y
CONFIG_KDB_DEFAULT_ENABLE=0x1
CONFIG_KDB_KEYBOARD=y
CONFIG_KDB_CONTINUE_CATASTROPHIC=0
CONFIG_STRICT_DEVMEM=y
# CONFIG_X86_VERBOSE_BOOTUP is not set
CONFIG_EARLY_PRINTK=y
CONFIG_EARLY_PRINTK_DBGP=y
CONFIG_EARLY_PRINTK_EFI=y
# CONFIG_X86_PTDUMP is not set
CONFIG_DEBUG_RODATA=y
# CONFIG_DEBUG_RODATA_TEST is not set
CONFIG_DEBUG_SET_MODULE_RONX=y
# CONFIG_DEBUG_NX_TEST is not set
CONFIG_DOUBLEFAULT=y
# CONFIG_DEBUG_TLBFLUSH is not set
# CONFIG_IOMMU_DEBUG is not set
# CONFIG_IOMMU_STRESS is not set
CONFIG_HAVE_MMIOTRACE_SUPPORT=y
# CONFIG_X86_DECODER_SELFTEST is not set
CONFIG_IO_DELAY_TYPE_0X80=0
CONFIG_IO_DELAY_TYPE_0XED=1
CONFIG_IO_DELAY_TYPE_UDELAY=2
CONFIG_IO_DELAY_TYPE_NONE=3
CONFIG_IO_DELAY_0X80=y
# CONFIG_IO_DELAY_0XED is not set
# CONFIG_IO_DELAY_UDELAY is not set
# CONFIG_IO_DELAY_NONE is not set
CONFIG_DEFAULT_IO_DELAY_TYPE=0
# CONFIG_DEBUG_BOOT_PARAMS is not set
# CONFIG_CPA_DEBUG is not set
CONFIG_OPTIMIZE_INLINING=y
# CONFIG_DEBUG_NMI_SELFTEST is not set
# CONFIG_X86_DEBUG_STATIC_CPU_HAS is not set

#
# Security options
#
CONFIG_KEYS=y
CONFIG_PERSISTENT_KEYRINGS=y
CONFIG_BIG_KEYS=y
# CONFIG_ENCRYPTED_KEYS is not set
# CONFIG_SECURITY_DMESG_RESTRICT is not set
CONFIG_SECURITY=y
CONFIG_SECURITYFS=y
CONFIG_SECURITY_NETWORK=y
CONFIG_SECURITY_PATH=y
CONFIG_INTEL_TXT=y
CONFIG_LSM_MMAP_MIN_ADDR=0
CONFIG_SECURITY_SELINUX=y
CONFIG_SECURITY_SELINUX_BOOTPARAM=y
CONFIG_SECURITY_SELINUX_BOOTPARAM_VALUE=0
CONFIG_SECURITY_SELINUX_DISABLE=y
CONFIG_SECURITY_SELINUX_DEVELOP=y
CONFIG_SECURITY_SELINUX_AVC_STATS=y
CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE=1
# CONFIG_SECURITY_SELINUX_POLICYDB_VERSION_MAX is not set
# CONFIG_SECURITY_SMACK is not set
CONFIG_SECURITY_TOMOYO=y
CONFIG_SECURITY_TOMOYO_MAX_ACCEPT_ENTRY=2048
CONFIG_SECURITY_TOMOYO_MAX_AUDIT_LOG=1024
# CONFIG_SECURITY_TOMOYO_OMIT_USERSPACE_LOADER is not set
CONFIG_SECURITY_TOMOYO_POLICY_LOADER="/sbin/tomoyo-init"
CONFIG_SECURITY_TOMOYO_ACTIVATION_TRIGGER="/sbin/init"
# CONFIG_SECURITY_APPARMOR is not set
# CONFIG_SECURITY_YAMA is not set
CONFIG_INTEGRITY=y
# CONFIG_INTEGRITY_SIGNATURE is not set
CONFIG_INTEGRITY_AUDIT=y
# CONFIG_IMA is not set
# CONFIG_EVM is not set
CONFIG_DEFAULT_SECURITY_SELINUX=y
# CONFIG_DEFAULT_SECURITY_TOMOYO is not set
# CONFIG_DEFAULT_SECURITY_DAC is not set
CONFIG_DEFAULT_SECURITY="selinux"
CONFIG_XOR_BLOCKS=m
CONFIG_ASYNC_CORE=m
CONFIG_ASYNC_MEMCPY=m
CONFIG_ASYNC_XOR=m
CONFIG_ASYNC_PQ=m
CONFIG_ASYNC_RAID6_RECOV=m
CONFIG_CRYPTO=y

#
# Crypto core or helper
#
CONFIG_CRYPTO_ALGAPI=y
CONFIG_CRYPTO_ALGAPI2=y
CONFIG_CRYPTO_AEAD2=y
CONFIG_CRYPTO_BLKCIPHER=m
CONFIG_CRYPTO_BLKCIPHER2=y
CONFIG_CRYPTO_HASH=y
CONFIG_CRYPTO_HASH2=y
CONFIG_CRYPTO_RNG2=y
CONFIG_CRYPTO_PCOMP2=y
CONFIG_CRYPTO_MANAGER=y
CONFIG_CRYPTO_MANAGER2=y
# CONFIG_CRYPTO_USER is not set
CONFIG_CRYPTO_MANAGER_DISABLE_TESTS=y
CONFIG_CRYPTO_GF128MUL=m
# CONFIG_CRYPTO_NULL is not set
# CONFIG_CRYPTO_PCRYPT is not set
CONFIG_CRYPTO_WORKQUEUE=y
CONFIG_CRYPTO_CRYPTD=m
# CONFIG_CRYPTO_MCRYPTD is not set
# CONFIG_CRYPTO_AUTHENC is not set
# CONFIG_CRYPTO_TEST is not set
CONFIG_CRYPTO_ABLK_HELPER=m
CONFIG_CRYPTO_GLUE_HELPER_X86=m

#
# Authenticated Encryption with Associated Data
#
# CONFIG_CRYPTO_CCM is not set
# CONFIG_CRYPTO_GCM is not set
# CONFIG_CRYPTO_SEQIV is not set

#
# Block modes
#
CONFIG_CRYPTO_CBC=m
# CONFIG_CRYPTO_CTR is not set
# CONFIG_CRYPTO_CTS is not set
# CONFIG_CRYPTO_ECB is not set
CONFIG_CRYPTO_LRW=m
# CONFIG_CRYPTO_PCBC is not set
CONFIG_CRYPTO_XTS=m

#
# Hash modes
#
# CONFIG_CRYPTO_CMAC is not set
CONFIG_CRYPTO_HMAC=y
# CONFIG_CRYPTO_XCBC is not set
# CONFIG_CRYPTO_VMAC is not set

#
# Digest
#
CONFIG_CRYPTO_CRC32C=y
CONFIG_CRYPTO_CRC32C_INTEL=m
# CONFIG_CRYPTO_CRC32 is not set
CONFIG_CRYPTO_CRC32_PCLMUL=m
CONFIG_CRYPTO_CRCT10DIF=y
CONFIG_CRYPTO_CRCT10DIF_PCLMUL=m
# CONFIG_CRYPTO_GHASH is not set
# CONFIG_CRYPTO_MD4 is not set
# CONFIG_CRYPTO_MD5 is not set
# CONFIG_CRYPTO_MICHAEL_MIC is not set
# CONFIG_CRYPTO_RMD128 is not set
# CONFIG_CRYPTO_RMD160 is not set
# CONFIG_CRYPTO_RMD256 is not set
# CONFIG_CRYPTO_RMD320 is not set
CONFIG_CRYPTO_SHA1=y
# CONFIG_CRYPTO_SHA1_SSSE3 is not set
# CONFIG_CRYPTO_SHA256_SSSE3 is not set
# CONFIG_CRYPTO_SHA512_SSSE3 is not set
# CONFIG_CRYPTO_SHA1_MB is not set
CONFIG_CRYPTO_SHA256=y
# CONFIG_CRYPTO_SHA512 is not set
# CONFIG_CRYPTO_TGR192 is not set
# CONFIG_CRYPTO_WP512 is not set
CONFIG_CRYPTO_GHASH_CLMUL_NI_INTEL=m

#
# Ciphers
#
CONFIG_CRYPTO_AES=y
CONFIG_CRYPTO_AES_X86_64=m
CONFIG_CRYPTO_AES_NI_INTEL=m
# CONFIG_CRYPTO_ANUBIS is not set
# CONFIG_CRYPTO_ARC4 is not set
# CONFIG_CRYPTO_BLOWFISH is not set
# CONFIG_CRYPTO_BLOWFISH_X86_64 is not set
# CONFIG_CRYPTO_CAMELLIA is not set
CONFIG_CRYPTO_CAMELLIA_X86_64=m
CONFIG_CRYPTO_CAMELLIA_AESNI_AVX_X86_64=m
# CONFIG_CRYPTO_CAMELLIA_AESNI_AVX2_X86_64 is not set
# CONFIG_CRYPTO_CAST5 is not set
# CONFIG_CRYPTO_CAST5_AVX_X86_64 is not set
# CONFIG_CRYPTO_CAST6 is not set
# CONFIG_CRYPTO_CAST6_AVX_X86_64 is not set
# CONFIG_CRYPTO_DES is not set
# CONFIG_CRYPTO_DES3_EDE_X86_64 is not set
# CONFIG_CRYPTO_FCRYPT is not set
# CONFIG_CRYPTO_KHAZAD is not set
# CONFIG_CRYPTO_SALSA20 is not set
# CONFIG_CRYPTO_SALSA20_X86_64 is not set
# CONFIG_CRYPTO_SEED is not set
# CONFIG_CRYPTO_SERPENT is not set
# CONFIG_CRYPTO_SERPENT_SSE2_X86_64 is not set
# CONFIG_CRYPTO_SERPENT_AVX_X86_64 is not set
# CONFIG_CRYPTO_SERPENT_AVX2_X86_64 is not set
# CONFIG_CRYPTO_TEA is not set
# CONFIG_CRYPTO_TWOFISH is not set
# CONFIG_CRYPTO_TWOFISH_X86_64 is not set
# CONFIG_CRYPTO_TWOFISH_X86_64_3WAY is not set
# CONFIG_CRYPTO_TWOFISH_AVX_X86_64 is not set

#
# Compression
#
# CONFIG_CRYPTO_DEFLATE is not set
# CONFIG_CRYPTO_ZLIB is not set
CONFIG_CRYPTO_LZO=y
# CONFIG_CRYPTO_LZ4 is not set
# CONFIG_CRYPTO_LZ4HC is not set

#
# Random Number Generation
#
# CONFIG_CRYPTO_ANSI_CPRNG is not set
# CONFIG_CRYPTO_DRBG_MENU is not set
# CONFIG_CRYPTO_USER_API_HASH is not set
# CONFIG_CRYPTO_USER_API_SKCIPHER is not set
# CONFIG_CRYPTO_USER_API_RNG is not set
CONFIG_CRYPTO_HW=y
CONFIG_CRYPTO_DEV_PADLOCK=m
# CONFIG_CRYPTO_DEV_PADLOCK_AES is not set
# CONFIG_CRYPTO_DEV_PADLOCK_SHA is not set
CONFIG_CRYPTO_DEV_CCP=y
# CONFIG_CRYPTO_DEV_CCP_DD is not set
# CONFIG_CRYPTO_DEV_QAT_DH895xCC is not set
# CONFIG_ASYMMETRIC_KEY_TYPE is not set
CONFIG_HAVE_KVM=y
CONFIG_HAVE_KVM_IRQCHIP=y
CONFIG_HAVE_KVM_IRQFD=y
CONFIG_HAVE_KVM_IRQ_ROUTING=y
CONFIG_HAVE_KVM_EVENTFD=y
CONFIG_KVM_APIC_ARCHITECTURE=y
CONFIG_KVM_MMIO=y
CONFIG_KVM_ASYNC_PF=y
CONFIG_HAVE_KVM_MSI=y
CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT=y
CONFIG_KVM_VFIO=y
CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT=y
CONFIG_KVM_COMPAT=y
CONFIG_VIRTUALIZATION=y
CONFIG_KVM=m
CONFIG_KVM_INTEL=m
# CONFIG_KVM_AMD is not set
CONFIG_KVM_MMU_AUDIT=y
CONFIG_KVM_DEVICE_ASSIGNMENT=y
CONFIG_BINARY_PRINTF=y

#
# Library routines
#
CONFIG_RAID6_PQ=m
CONFIG_BITREVERSE=y
# CONFIG_HAVE_ARCH_BITREVERSE is not set
CONFIG_GENERIC_STRNCPY_FROM_USER=y
CONFIG_GENERIC_STRNLEN_USER=y
CONFIG_GENERIC_NET_UTILS=y
CONFIG_GENERIC_FIND_FIRST_BIT=y
CONFIG_GENERIC_PCI_IOMAP=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_IO=y
CONFIG_PERCPU_RWSEM=y
CONFIG_ARCH_USE_CMPXCHG_LOCKREF=y
CONFIG_ARCH_HAS_FAST_MULTIPLIER=y
# CONFIG_CRC_CCITT is not set
CONFIG_CRC16=y
CONFIG_CRC_T10DIF=y
CONFIG_CRC_ITU_T=m
CONFIG_CRC32=y
# CONFIG_CRC32_SELFTEST is not set
CONFIG_CRC32_SLICEBY8=y
# CONFIG_CRC32_SLICEBY4 is not set
# CONFIG_CRC32_SARWATE is not set
# CONFIG_CRC32_BIT is not set
# CONFIG_CRC7 is not set
CONFIG_LIBCRC32C=m
# CONFIG_CRC8 is not set
# CONFIG_AUDIT_ARCH_COMPAT_GENERIC is not set
# CONFIG_RANDOM32_SELFTEST is not set
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=y
CONFIG_LZO_COMPRESS=y
CONFIG_LZO_DECOMPRESS=y
CONFIG_LZ4_DECOMPRESS=y
CONFIG_XZ_DEC=y
CONFIG_XZ_DEC_X86=y
CONFIG_XZ_DEC_POWERPC=y
CONFIG_XZ_DEC_IA64=y
CONFIG_XZ_DEC_ARM=y
CONFIG_XZ_DEC_ARMTHUMB=y
CONFIG_XZ_DEC_SPARC=y
CONFIG_XZ_DEC_BCJ=y
# CONFIG_XZ_DEC_TEST is not set
CONFIG_DECOMPRESS_GZIP=y
CONFIG_DECOMPRESS_BZIP2=y
CONFIG_DECOMPRESS_LZMA=y
CONFIG_DECOMPRESS_XZ=y
CONFIG_DECOMPRESS_LZO=y
CONFIG_DECOMPRESS_LZ4=y
CONFIG_GENERIC_ALLOCATOR=y
CONFIG_INTERVAL_TREE=y
CONFIG_ASSOCIATIVE_ARRAY=y
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT_MAP=y
CONFIG_HAS_DMA=y
CONFIG_CHECK_SIGNATURE=y
CONFIG_CPU_RMAP=y
CONFIG_DQL=y
CONFIG_GLOB=y
# CONFIG_GLOB_SELFTEST is not set
CONFIG_NLATTR=y
CONFIG_ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE=y
CONFIG_AVERAGE=y
# CONFIG_CORDIC is not set
CONFIG_DDR=y
CONFIG_OID_REGISTRY=m
CONFIG_UCS2_STRING=y
CONFIG_FONT_SUPPORT=y
# CONFIG_FONTS is not set
CONFIG_FONT_8x8=y
CONFIG_FONT_8x16=y
CONFIG_ARCH_HAS_SG_CHAIN=y


* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-19 10:16                       ` Takashi Iwai
@ 2015-03-19 10:58                         ` Denys Vlasenko
  2015-03-19 11:21                           ` Takashi Iwai
  0 siblings, 1 reply; 77+ messages in thread
From: Denys Vlasenko @ 2015-03-19 10:58 UTC
  To: Takashi Iwai, Andy Lutomirski
  Cc: Jiri Kosina, Linus Torvalds, Stefan Seyfried, X86 ML, LKML, Tejun Heo

On 03/19/2015 11:16 AM, Takashi Iwai wrote:
> The kconfig is attached

You also have PARAVIRT enabled, like Stefan.

Just to obtain an additional data point, can you guys
try reproducing it with PARAVIRT off?

It won't help us that much if it won't trigger with PARAVIRT off
(the bug may just become much harder to trigger), but if it would
still happen, that'd reduce the number of things we can suspect.
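
Before rebuilding for such a test, it may help to confirm what a given
.config actually says about an option. The following is a minimal sketch,
not part of the original message: the helper name kconfig_state and the
sample text are illustrative assumptions, and the sample lines are copied
from the config attached earlier in this thread. It relies only on the two
line forms visible in that attachment ("CONFIG_FOO=value" and
"# CONFIG_FOO is not set").

```python
import re


def kconfig_state(config_text, option):
    """Report how `option` appears in a kernel .config.

    Returns the value after '=' (for example 'y' or 'm'), the string
    'not set' for a '# CONFIG_FOO is not set' line, or None when the
    option is not mentioned at all. (Hypothetical helper, for illustration.)
    """
    name = option if option.startswith("CONFIG_") else "CONFIG_" + option
    set_re = re.compile(r"^%s=(.*)$" % re.escape(name))
    unset_re = re.compile(r"^# %s is not set$" % re.escape(name))
    for line in config_text.splitlines():
        line = line.strip()
        match = set_re.match(line)
        if match:
            return match.group(1)
        if unset_re.match(line):
            return "not set"
    return None


# Sample lines taken from the attached config; a real check would read
# the full .config file instead.
sample = """\
CONFIG_KVM=m
CONFIG_KVM_INTEL=m
# CONFIG_KVM_AMD is not set
CONFIG_DOUBLEFAULT=y
"""

for opt in ("KVM", "KVM_AMD", "DOUBLEFAULT", "PARAVIRT"):
    print(opt, "->", kconfig_state(sample, opt))
```

In a real tree the text would come from the build's .config, and an option
that comes back as None is simply absent from the text that was checked.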


* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-19 10:58                         ` Denys Vlasenko
@ 2015-03-19 11:21                           ` Takashi Iwai
  2015-03-19 12:48                             ` Denys Vlasenko
  0 siblings, 1 reply; 77+ messages in thread
From: Takashi Iwai @ 2015-03-19 11:21 UTC
  To: Denys Vlasenko
  Cc: Andy Lutomirski, Jiri Kosina, Linus Torvalds, Stefan Seyfried,
	X86 ML, LKML, Tejun Heo

[-- Attachment #1: Type: text/plain, Size: 579 bytes --]

At Thu, 19 Mar 2015 11:58:19 +0100,
Denys Vlasenko wrote:
> 
> On 03/19/2015 11:16 AM, Takashi Iwai wrote:
> > The kconfig is attached
> 
> You also have PARAVIRT enabled, like Stefan.
> 
> Just to obtain an additional data point, can you guys
> try reproducing it with PARAVIRT off?
> 
> It won't help us much if the bug doesn't trigger with PARAVIRT off
> (it may just have become much harder to trigger), but if it still
> happens, that reduces the number of things we can suspect.

I tried without PARAVIRT and the bug still occurs.  The dmesg is
attached below.


Takashi


[-- Attachment #2: dmesg.txt --]
[-- Type: text/plain, Size: 67391 bytes --]

[    0.000000] CPU0 microcode updated early to revision 0x1b, date = 2014-05-29
[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Initializing cgroup subsys cpuacct
[    0.000000] Linux version 4.0.0-rc4-testz+ (tiwai@alsa1) (gcc version 4.8.3 20141208 [gcc-4_8-branch revision 218481] (SUSE Linux) ) #126 SMP PREEMPT Thu Mar 19 12:07:56 CET 2015
[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.0.0-rc4-testz+ root=UUID=1190c997-9457-4dde-8a57-0cce0aae93c6 resume=/dev/disk/by-id/ata-INTEL_SSDSA2M080G2GN_CVPO9412011S080BGN-part1 splash=silent quiet showopts crashkernel=512M-:256M
[    0.000000] e820: BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009d7ff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009d800-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000001fffffff] usable
[    0.000000] BIOS-e820: [mem 0x0000000020000000-0x00000000201fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000020200000-0x0000000040003fff] usable
[    0.000000] BIOS-e820: [mem 0x0000000040004000-0x0000000040004fff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000040005000-0x00000000d6709fff] usable
[    0.000000] BIOS-e820: [mem 0x00000000d670a000-0x00000000d67fffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000d6800000-0x00000000d6f55fff] usable
[    0.000000] BIOS-e820: [mem 0x00000000d6f56000-0x00000000d6ffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000d7000000-0x00000000d77b3fff] usable
[    0.000000] BIOS-e820: [mem 0x00000000d77b4000-0x00000000d77fffff] ACPI data
[    0.000000] BIOS-e820: [mem 0x00000000d7800000-0x00000000d8f1dfff] usable
[    0.000000] BIOS-e820: [mem 0x00000000d8f1e000-0x00000000d8ffffff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x00000000d9000000-0x00000000da6e2fff] usable
[    0.000000] BIOS-e820: [mem 0x00000000da6e3000-0x00000000da8e1fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000da8e2000-0x00000000da924fff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x00000000da925000-0x00000000daffffff] usable
[    0.000000] BIOS-e820: [mem 0x00000000db800000-0x00000000df9fffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000f8000000-0x00000000fbffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fed00000-0x00000000fed03fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000021e5fffff] usable
[    0.000000] NX (Execute Disable) protection: active
[    0.000000] SMBIOS 2.7 present.
[    0.000000] DMI: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A12 01/10/2013
[    0.000000] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
[    0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable
[    0.000000] AGP: No AGP bridge found
[    0.000000] e820: last_pfn = 0x21e600 max_arch_pfn = 0x400000000
[    0.000000] MTRR default type: uncachable
[    0.000000] MTRR fixed ranges enabled:
[    0.000000]   00000-9FFFF write-back
[    0.000000]   A0000-BFFFF uncachable
[    0.000000]   C0000-D3FFF write-protect
[    0.000000]   D4000-E7FFF uncachable
[    0.000000]   E8000-FFFFF write-protect
[    0.000000] MTRR variable ranges enabled:
[    0.000000]   0 base 000000000 mask E00000000 write-back
[    0.000000]   1 base 200000000 mask FE0000000 write-back
[    0.000000]   2 base 0E0000000 mask FE0000000 uncachable
[    0.000000]   3 base 0DC000000 mask FFC000000 uncachable
[    0.000000]   4 base 0DB800000 mask FFF800000 uncachable
[    0.000000]   5 base 21F000000 mask FFF000000 uncachable
[    0.000000]   6 base 21E800000 mask FFF800000 uncachable
[    0.000000]   7 base 21E600000 mask FFFE00000 uncachable
[    0.000000]   8 disabled
[    0.000000]   9 disabled
[    0.000000] PAT configuration [0-7]: WB  WC  UC- UC  WB  WC  UC- UC  
[    0.000000] e820: update [mem 0xdb800000-0xffffffff] usable ==> reserved
[    0.000000] e820: last_pfn = 0xdb000 max_arch_pfn = 0x400000000
[    0.000000] found SMP MP-table at [mem 0x000fda40-0x000fda4f] mapped at [ffff8800000fda40]
[    0.000000] Scanning 1 areas for low memory corruption
[    0.000000] Base memory trampoline at [ffff880000097000] 97000 size 24576
[    0.000000] init_memory_mapping: [mem 0x00000000-0x000fffff]
[    0.000000]  [mem 0x00000000-0x000fffff] page 4k
[    0.000000] BRK [0x02cde000, 0x02cdefff] PGTABLE
[    0.000000] BRK [0x02cdf000, 0x02cdffff] PGTABLE
[    0.000000] BRK [0x02ce0000, 0x02ce0fff] PGTABLE
[    0.000000] init_memory_mapping: [mem 0x21e400000-0x21e5fffff]
[    0.000000]  [mem 0x21e400000-0x21e5fffff] page 2M
[    0.000000] BRK [0x02ce1000, 0x02ce1fff] PGTABLE
[    0.000000] init_memory_mapping: [mem 0x200000000-0x21e3fffff]
[    0.000000]  [mem 0x200000000-0x21e3fffff] page 2M
[    0.000000] init_memory_mapping: [mem 0x1e0000000-0x1ffffffff]
[    0.000000]  [mem 0x1e0000000-0x1ffffffff] page 2M
[    0.000000] BRK [0x02ce2000, 0x02ce2fff] PGTABLE
[    0.000000] init_memory_mapping: [mem 0x00100000-0x1fffffff]
[    0.000000]  [mem 0x00100000-0x001fffff] page 4k
[    0.000000]  [mem 0x00200000-0x1fffffff] page 2M
[    0.000000] init_memory_mapping: [mem 0x20200000-0x40003fff]
[    0.000000]  [mem 0x20200000-0x3fffffff] page 2M
[    0.000000]  [mem 0x40000000-0x40003fff] page 4k
[    0.000000] BRK [0x02ce3000, 0x02ce3fff] PGTABLE
[    0.000000] init_memory_mapping: [mem 0x40005000-0xd6709fff]
[    0.000000]  [mem 0x40005000-0x401fffff] page 4k
[    0.000000]  [mem 0x40200000-0xd65fffff] page 2M
[    0.000000]  [mem 0xd6600000-0xd6709fff] page 4k
[    0.000000] init_memory_mapping: [mem 0xd6800000-0xd6f55fff]
[    0.000000]  [mem 0xd6800000-0xd6dfffff] page 2M
[    0.000000]  [mem 0xd6e00000-0xd6f55fff] page 4k
[    0.000000] init_memory_mapping: [mem 0xd7000000-0xd77b3fff]
[    0.000000]  [mem 0xd7000000-0xd75fffff] page 2M
[    0.000000]  [mem 0xd7600000-0xd77b3fff] page 4k
[    0.000000] init_memory_mapping: [mem 0xd7800000-0xd8f1dfff]
[    0.000000]  [mem 0xd7800000-0xd8dfffff] page 2M
[    0.000000]  [mem 0xd8e00000-0xd8f1dfff] page 4k
[    0.000000] init_memory_mapping: [mem 0xd9000000-0xda6e2fff]
[    0.000000]  [mem 0xd9000000-0xda5fffff] page 2M
[    0.000000]  [mem 0xda600000-0xda6e2fff] page 4k
[    0.000000] init_memory_mapping: [mem 0xda925000-0xdaffffff]
[    0.000000]  [mem 0xda925000-0xda9fffff] page 4k
[    0.000000]  [mem 0xdaa00000-0xdaffffff] page 2M
[    0.000000] init_memory_mapping: [mem 0x100000000-0x1dfffffff]
[    0.000000]  [mem 0x100000000-0x1dfffffff] page 2M
[    0.000000] RAMDISK: [mem 0x37530000-0x37a8ffff]
[    0.000000] ACPI: Early table checksum verification disabled
[    0.000000] ACPI: RSDP 0x00000000000F0490 000024 (v02 DELL  )
[    0.000000] ACPI: XSDT 0x00000000D77F4080 00007C (v01 DELL   CBX3     01072009 AMI  00010013)
[    0.000000] ACPI: FACP 0x00000000D77FD7B0 00010C (v05 DELL   CBX3     01072009 AMI  00010013)
[    0.000000] ACPI: DSDT 0x00000000D77F4188 009625 (v02 DELL   CBX3     00000022 INTL 20091112)
[    0.000000] ACPI: FACS 0x00000000D8FFE080 000040
[    0.000000] ACPI: APIC 0x00000000D77FD8C0 000092 (v03 DELL   CBX3     01072009 AMI  00010013)
[    0.000000] ACPI: FPDT 0x00000000D77FD958 000044 (v01 DELL   CBX3     01072009 AMI  00010013)
[    0.000000] ACPI: MCFG 0x00000000D77FD9A0 00003C (v01 DELL   CBX3     01072009 MSFT 00000097)
[    0.000000] ACPI: HPET 0x00000000D77FD9E0 000038 (v01 DELL   CBX3     01072009 AMI. 00000005)
[    0.000000] ACPI: SSDT 0x00000000D77FDA18 000415 (v01 SataRe SataTabl 00001000 INTL 20091112)
[    0.000000] ACPI: SSDT 0x00000000D77FDE30 0009B9 (v01 PmRef  Cpu0Ist  00003000 INTL 20051117)
[    0.000000] ACPI: SSDT 0x00000000D77FE7F0 000A92 (v01 PmRef  CpuPm    00003000 INTL 20051117)
[    0.000000] ACPI: DMAR 0x00000000D77FF288 0000B8 (v01 INTEL  SNB      00000001 INTL 00000001)
[    0.000000] ACPI: ASF! 0x00000000D77FF340 0000A5 (v32 INTEL   HCG     00000001 TFSM 000F4240)
[    0.000000] ACPI: SLIC 0x00000000D77FF3E8 000176 (v03 DELL   CBX3     01072009 MSFT 00010013)
[    0.000000] ACPI: Local APIC address 0xfee00000
[    0.000000] No NUMA configuration found
[    0.000000] Faking a node at [mem 0x0000000000000000-0x000000021e5fffff]
[    0.000000] NODE_DATA(0) allocated [mem 0x21e5df000-0x21e5f3fff]
[    0.000000] cma: Reserved 16 MiB at 0x000000021d400000
[    0.000000] Reserving 256MB of memory at 624MB for crashkernel (System RAM: 8078MB)
[    0.000000]  [ffffea0000000000-ffffea00087fffff] PMD -> [ffff880214c00000-ffff88021cbfffff] on node 0
[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000000000001000-0x0000000000ffffff]
[    0.000000]   DMA32    [mem 0x0000000001000000-0x00000000ffffffff]
[    0.000000]   Normal   [mem 0x0000000100000000-0x000000021e5fffff]
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000001000-0x000000000009cfff]
[    0.000000]   node   0: [mem 0x0000000000100000-0x000000001fffffff]
[    0.000000]   node   0: [mem 0x0000000020200000-0x0000000040003fff]
[    0.000000]   node   0: [mem 0x0000000040005000-0x00000000d6709fff]
[    0.000000]   node   0: [mem 0x00000000d6800000-0x00000000d6f55fff]
[    0.000000]   node   0: [mem 0x00000000d7000000-0x00000000d77b3fff]
[    0.000000]   node   0: [mem 0x00000000d7800000-0x00000000d8f1dfff]
[    0.000000]   node   0: [mem 0x00000000d9000000-0x00000000da6e2fff]
[    0.000000]   node   0: [mem 0x00000000da925000-0x00000000daffffff]
[    0.000000]   node   0: [mem 0x0000000100000000-0x000000021e5fffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000000001000-0x000000021e5fffff]
[    0.000000] On node 0 totalpages: 2068107
[    0.000000]   DMA zone: 64 pages used for memmap
[    0.000000]   DMA zone: 21 pages reserved
[    0.000000]   DMA zone: 3996 pages, LIFO batch:0
[    0.000000]   DMA32 zone: 13924 pages used for memmap
[    0.000000]   DMA32 zone: 891119 pages, LIFO batch:31
[    0.000000]   Normal zone: 18328 pages used for memmap
[    0.000000]   Normal zone: 1172992 pages, LIFO batch:31
[    0.000000] Reserving Intel graphics stolen memory at 0xdba00000-0xdf9fffff
[    0.000000] ACPI: PM-Timer IO Port: 0x408
[    0.000000] ACPI: Local APIC address 0xfee00000
[    0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x03] lapic_id[0x04] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x04] lapic_id[0x06] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x05] lapic_id[0x01] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x06] lapic_id[0x03] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x07] lapic_id[0x05] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x08] lapic_id[0x07] enabled)
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0xff] high edge lint[0x1])
[    0.000000] ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
[    0.000000] IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[    0.000000] ACPI: IRQ0 used by override.
[    0.000000] ACPI: IRQ9 used by override.
[    0.000000] Using ACPI (MADT) for SMP configuration information
[    0.000000] ACPI: HPET id: 0x8086a701 base: 0xfed00000
[    0.000000] smpboot: Allowing 8 CPUs, 0 hotplug CPUs
[    0.000000] PM: Registered nosave memory: [mem 0x00000000-0x00000fff]
[    0.000000] PM: Registered nosave memory: [mem 0x0009d000-0x0009dfff]
[    0.000000] PM: Registered nosave memory: [mem 0x0009e000-0x0009ffff]
[    0.000000] PM: Registered nosave memory: [mem 0x000a0000-0x000dffff]
[    0.000000] PM: Registered nosave memory: [mem 0x000e0000-0x000fffff]
[    0.000000] PM: Registered nosave memory: [mem 0x20000000-0x201fffff]
[    0.000000] PM: Registered nosave memory: [mem 0x40004000-0x40004fff]
[    0.000000] PM: Registered nosave memory: [mem 0xd670a000-0xd67fffff]
[    0.000000] PM: Registered nosave memory: [mem 0xd6f56000-0xd6ffffff]
[    0.000000] PM: Registered nosave memory: [mem 0xd77b4000-0xd77fffff]
[    0.000000] PM: Registered nosave memory: [mem 0xd8f1e000-0xd8ffffff]
[    0.000000] PM: Registered nosave memory: [mem 0xda6e3000-0xda8e1fff]
[    0.000000] PM: Registered nosave memory: [mem 0xda8e2000-0xda924fff]
[    0.000000] PM: Registered nosave memory: [mem 0xdb000000-0xdb7fffff]
[    0.000000] PM: Registered nosave memory: [mem 0xdb800000-0xdf9fffff]
[    0.000000] PM: Registered nosave memory: [mem 0xdfa00000-0xf7ffffff]
[    0.000000] PM: Registered nosave memory: [mem 0xf8000000-0xfbffffff]
[    0.000000] PM: Registered nosave memory: [mem 0xfc000000-0xfebfffff]
[    0.000000] PM: Registered nosave memory: [mem 0xfec00000-0xfec00fff]
[    0.000000] PM: Registered nosave memory: [mem 0xfec01000-0xfecfffff]
[    0.000000] PM: Registered nosave memory: [mem 0xfed00000-0xfed03fff]
[    0.000000] PM: Registered nosave memory: [mem 0xfed04000-0xfed1bfff]
[    0.000000] PM: Registered nosave memory: [mem 0xfed1c000-0xfed1ffff]
[    0.000000] PM: Registered nosave memory: [mem 0xfed20000-0xfedfffff]
[    0.000000] PM: Registered nosave memory: [mem 0xfee00000-0xfee00fff]
[    0.000000] PM: Registered nosave memory: [mem 0xfee01000-0xfeffffff]
[    0.000000] PM: Registered nosave memory: [mem 0xff000000-0xffffffff]
[    0.000000] e820: [mem 0xdfa00000-0xf7ffffff] available for PCI devices
[    0.000000] setup_percpu: NR_CPUS:512 nr_cpumask_bits:512 nr_cpu_ids:8 nr_node_ids:1
[    0.000000] PERCPU: Embedded 32 pages/cpu @ffff88021d200000 s90824 r8192 d32056 u262144
[    0.000000] pcpu-alloc: s90824 r8192 d32056 u262144 alloc=1*2097152
[    0.000000] pcpu-alloc: [0] 0 1 2 3 4 5 6 7 
[    0.000000] Built 1 zonelists in Node order, mobility grouping on.  Total pages: 2035770
[    0.000000] Policy zone: Normal
[    0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-4.0.0-rc4-testz+ root=UUID=1190c997-9457-4dde-8a57-0cce0aae93c6 resume=/dev/disk/by-id/ata-INTEL_SSDSA2M080G2GN_CVPO9412011S080BGN-part1 splash=silent quiet showopts crashkernel=512M-:256M
[    0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
[    0.000000] xsave: enabled xstate_bv 0x7, cntxt size 0x340 using standard form
[    0.000000] AGP: Checking aperture...
[    0.000000] AGP: No AGP bridge found
[    0.000000] Memory: 7759620K/8272428K available (7193K kernel code, 1116K rwdata, 3180K rodata, 1564K init, 14552K bss, 496424K reserved, 16384K cma-reserved)
[    0.000000] Preemptible hierarchical RCU implementation.
[    0.000000] 	RCU dyntick-idle grace-period acceleration is enabled.
[    0.000000] 	RCU lockdep checking is enabled.
[    0.000000] 	RCU restricting CPUs from NR_CPUS=512 to nr_cpu_ids=8.
[    0.000000] 	RCU kthread priority: 1.
[    0.000000] RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=8
[    0.000000] Running RCU self tests
[    0.000000] NR_IRQS:33024 nr_irqs:488 16
[    0.000000] 	Offload RCU callbacks from all CPUs
[    0.000000] 	Offload RCU callbacks from CPUs: 0-7.
[    0.000000] Console: colour dummy device 80x25
[    0.000000] console [tty0] enabled
[    0.000000] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
[    0.000000] ... MAX_LOCKDEP_SUBCLASSES:  8
[    0.000000] ... MAX_LOCK_DEPTH:          48
[    0.000000] ... MAX_LOCKDEP_KEYS:        8191
[    0.000000] ... CLASSHASH_SIZE:          4096
[    0.000000] ... MAX_LOCKDEP_ENTRIES:     32768
[    0.000000] ... MAX_LOCKDEP_CHAINS:      65536
[    0.000000] ... CHAINHASH_SIZE:          32768
[    0.000000]  memory used by lock dependency info: 8159 kB
[    0.000000]  per task-struct memory footprint: 1920 bytes
[    0.000000] hpet clockevent registered
[    0.000000] tsc: Fast TSC calibration using PIT
[    0.000000] tsc: Detected 3392.281 MHz processor
[    0.000047] Calibrating delay loop (skipped), value calculated using timer frequency.. 6784.56 BogoMIPS (lpj=3392281)
[    0.000049] pid_max: default: 32768 minimum: 301
[    0.000060] ACPI: Core revision 20150204
[    0.012734] ACPI: All ACPI Tables successfully acquired
[    0.012957] Security Framework initialized
[    0.012963] SELinux:  Disabled at boot.
[    0.013847] Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes)
[    0.015600] Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes)
[    0.016259] Mount-cache hash table entries: 16384 (order: 5, 131072 bytes)
[    0.016274] Mountpoint-cache hash table entries: 16384 (order: 5, 131072 bytes)
[    0.016926] Initializing cgroup subsys blkio
[    0.016931] Initializing cgroup subsys memory
[    0.016942] Initializing cgroup subsys devices
[    0.016958] Initializing cgroup subsys freezer
[    0.016973] Initializing cgroup subsys net_cls
[    0.016977] Initializing cgroup subsys perf_event
[    0.016980] Initializing cgroup subsys net_prio
[    0.016983] Initializing cgroup subsys hugetlb
[    0.017019] CPU: Physical Processor ID: 0
[    0.017020] CPU: Processor Core ID: 0
[    0.017023] ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
[    0.017023] ENERGY_PERF_BIAS: View and update with x86_energy_perf_policy(8)
[    0.017285] mce: CPU supports 9 MCE banks
[    0.017294] CPU0: Thermal monitoring enabled (TM1)
[    0.017305] Last level iTLB entries: 4KB 512, 2MB 8, 4MB 8
[    0.017306] Last level dTLB entries: 4KB 512, 2MB 32, 4MB 32, 1GB 0
[    0.017445] Freeing SMP alternatives memory: 24K (ffffffff81ea0000 - ffffffff81ea6000)
[    0.017449] ftrace: allocating 24339 entries in 96 pages
[    0.026029] dmar: Host address width 36
[    0.026031] dmar: DRHD base: 0x000000fed90000 flags: 0x0
[    0.026059] dmar: IOMMU 0: reg_base_addr fed90000 ver 1:0 cap c0000020e60262 ecap f0101a
[    0.026060] dmar: DRHD base: 0x000000fed91000 flags: 0x1
[    0.026065] dmar: IOMMU 1: reg_base_addr fed91000 ver 1:0 cap c9008020660262 ecap f0105a
[    0.026066] dmar: RMRR base: 0x000000da85a000 end: 0x000000da880fff
[    0.026068] dmar: RMRR base: 0x000000db800000 end: 0x000000df9fffff
[    0.026070] IOAPIC id 2 under DRHD base  0xfed91000 IOMMU 1
[    0.026071] HPET id 0 under DRHD base 0xfed91000
[    0.026253] Queued invalidation will be enabled to support x2apic and Intr-remapping.
[    0.026265] Enabled IRQ remapping in x2apic mode
[    0.026266] x2apic enabled
[    0.026271] Switched APIC routing to cluster x2apic.
[    0.026857] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[    0.036868] TSC deadline timer enabled
[    0.036872] smpboot: CPU0: Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz (fam: 06, model: 3a, stepping: 09)
[    0.036896] Performance Events: PEBS fmt1+, 16-deep LBR, IvyBridge events, full-width counters, Intel PMU driver.
[    0.036916] ... version:                3
[    0.036917] ... bit width:              48
[    0.036917] ... generic registers:      4
[    0.036918] ... value mask:             0000ffffffffffff
[    0.036919] ... max period:             0000ffffffffffff
[    0.036919] ... fixed-purpose events:   3
[    0.036920] ... event mask:             000000070000000f
[    0.047355] x86: Booting SMP configuration:
[    0.047357] .... node  #0, CPUs:      #1
[    0.058653] CPU1 microcode updated early to revision 0x1b, date = 2014-05-29
[    0.061206] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
[    0.063280]  #2
[    0.074578] CPU2 microcode updated early to revision 0x1b, date = 2014-05-29
[    0.079201]  #3
[    0.090494] CPU3 microcode updated early to revision 0x1b, date = 2014-05-29
[    0.095218]  #4 #5 #6 #7
[    0.155962] x86: Booted up 1 node, 8 CPUs
[    0.155965] smpboot: Total of 8 processors activated (54276.49 BogoMIPS)
[    0.162277] devtmpfs: initialized
[    0.166726] PM: Registering ACPI NVS region [mem 0xd8f1e000-0xd8ffffff] (925696 bytes)
[    0.166769] PM: Registering ACPI NVS region [mem 0xda8e2000-0xda924fff] (274432 bytes)
[    0.167172] pinctrl core: initialized pinctrl subsystem
[    0.167317] RTC time: 11:12:48, date: 03/19/15
[    0.167629] NET: Registered protocol family 16
[    0.170647] cpuidle: using governor ladder
[    0.173647] cpuidle: using governor menu
[    0.173702] ACPI FADT declares the system doesn't support PCIe ASPM, so disable it
[    0.173713] ACPI: bus type PCI registered
[    0.173714] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
[    0.173838] PCI: MMCONFIG for domain 0000 [bus 00-3f] at [mem 0xf8000000-0xfbffffff] (base 0xf8000000)
[    0.173840] PCI: MMCONFIG at [mem 0xf8000000-0xfbffffff] reserved in E820
[    0.173894] PCI: Using configuration type 1 for base access
[    0.173900] dmi type 0xB1 record - unknown flag
[    0.180477] ACPI: Added _OSI(Module Device)
[    0.180479] ACPI: Added _OSI(Processor Device)
[    0.180480] ACPI: Added _OSI(3.0 _SCP Extensions)
[    0.180481] ACPI: Added _OSI(Processor Aggregator Device)
[    0.188139] ACPI: Executed 1 blocks of module-level executable AML code
[    0.195015] ACPI: Dynamic OEM Table Load:
[    0.195027] ACPI: SSDT 0xFFFF880213B80000 00083B (v01 PmRef  Cpu0Cst  00003001 INTL 20051117)
[    0.196420] ACPI: Dynamic OEM Table Load:
[    0.196431] ACPI: SSDT 0xFFFF880213B84C00 000303 (v01 PmRef  ApIst    00003000 INTL 20051117)
[    0.197691] ACPI: Dynamic OEM Table Load:
[    0.197705] ACPI: SSDT 0xFFFF880213B81C00 000119 (v01 PmRef  ApCst    00003000 INTL 20051117)
[    0.201219] ACPI: Interpreter enabled
[    0.201225] ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S1_] (20150204/hwxface-580)
[    0.201229] ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S2_] (20150204/hwxface-580)
[    0.201260] ACPI: (supports S0 S3 S4 S5)
[    0.201261] ACPI: Using IOAPIC for interrupt routing
[    0.201307] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[    0.219608] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-3e])
[    0.219614] acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI]
[    0.219952] \_SB_.PCI0:_OSC invalid UUID
[    0.219954] _OSC request data:1 1f 0 
[    0.219957] acpi PNP0A08:00: _OSC failed (AE_ERROR); disabling ASPM
[    0.221167] PCI host bridge to bus 0000:00
[    0.221170] pci_bus 0000:00: root bus resource [bus 00-3e]
[    0.221172] pci_bus 0000:00: root bus resource [io  0x0000-0x0cf7 window]
[    0.221174] pci_bus 0000:00: root bus resource [io  0x0d00-0xffff window]
[    0.221175] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff window]
[    0.221177] pci_bus 0000:00: root bus resource [mem 0x000d4000-0x000d7fff window]
[    0.221178] pci_bus 0000:00: root bus resource [mem 0x000d8000-0x000dbfff window]
[    0.221179] pci_bus 0000:00: root bus resource [mem 0x000dc000-0x000dffff window]
[    0.221181] pci_bus 0000:00: root bus resource [mem 0x000e0000-0x000e3fff window]
[    0.221182] pci_bus 0000:00: root bus resource [mem 0x000e4000-0x000e7fff window]
[    0.221183] pci_bus 0000:00: root bus resource [mem 0xdfa00000-0xfeafffff window]
[    0.221200] pci 0000:00:00.0: [8086:0150] type 00 class 0x060000
[    0.221451] pci 0000:00:02.0: [8086:0162] type 00 class 0x030000
[    0.221463] pci 0000:00:02.0: reg 0x10: [mem 0xf4400000-0xf47fffff 64bit]
[    0.221470] pci 0000:00:02.0: reg 0x18: [mem 0xe0000000-0xefffffff 64bit pref]
[    0.221474] pci 0000:00:02.0: reg 0x20: [io  0xf000-0xf03f]
[    0.221759] pci 0000:00:14.0: [8086:1e31] type 00 class 0x0c0330
[    0.221784] pci 0000:00:14.0: reg 0x10: [mem 0xf4920000-0xf492ffff 64bit]
[    0.221874] pci 0000:00:14.0: PME# supported from D3hot D3cold
[    0.222019] pci 0000:00:14.0: System wakeup disabled by ACPI
[    0.222115] pci 0000:00:16.0: [8086:1e3a] type 00 class 0x078000
[    0.222141] pci 0000:00:16.0: reg 0x10: [mem 0xf493c000-0xf493c00f 64bit]
[    0.222231] pci 0000:00:16.0: PME# supported from D0 D3hot D3cold
[    0.222421] pci 0000:00:16.3: [8086:1e3d] type 00 class 0x070002
[    0.222441] pci 0000:00:16.3: reg 0x10: [io  0xf0e0-0xf0e7]
[    0.222451] pci 0000:00:16.3: reg 0x14: [mem 0xf493a000-0xf493afff]
[    0.222711] pci 0000:00:19.0: [8086:1502] type 00 class 0x020000
[    0.222731] pci 0000:00:19.0: reg 0x10: [mem 0xf4900000-0xf491ffff]
[    0.222739] pci 0000:00:19.0: reg 0x14: [mem 0xf4939000-0xf4939fff]
[    0.222753] pci 0000:00:19.0: reg 0x18: [io  0xf080-0xf09f]
[    0.222833] pci 0000:00:19.0: PME# supported from D0 D3hot D3cold
[    0.222943] pci 0000:00:19.0: System wakeup disabled by ACPI
[    0.223039] pci 0000:00:1a.0: [8086:1e2d] type 00 class 0x0c0320
[    0.223062] pci 0000:00:1a.0: reg 0x10: [mem 0xf4938000-0xf49383ff]
[    0.223166] pci 0000:00:1a.0: PME# supported from D0 D3hot D3cold
[    0.223276] pci 0000:00:1a.0: System wakeup disabled by ACPI
[    0.223371] pci 0000:00:1b.0: [8086:1e20] type 00 class 0x040300
[    0.223390] pci 0000:00:1b.0: reg 0x10: [mem 0xf4930000-0xf4933fff 64bit]
[    0.223481] pci 0000:00:1b.0: PME# supported from D0 D3hot D3cold
[    0.223590] pci 0000:00:1b.0: System wakeup disabled by ACPI
[    0.223683] pci 0000:00:1c.0: [8086:1e10] type 01 class 0x060400
[    0.223787] pci 0000:00:1c.0: PME# supported from D0 D3hot D3cold
[    0.223912] pci 0000:00:1c.0: System wakeup disabled by ACPI
[    0.224006] pci 0000:00:1c.2: [8086:1e14] type 01 class 0x060400
[    0.224104] pci 0000:00:1c.2: PME# supported from D0 D3hot D3cold
[    0.224223] pci 0000:00:1c.2: System wakeup disabled by ACPI
[    0.224327] pci 0000:00:1d.0: [8086:1e26] type 00 class 0x0c0320
[    0.224350] pci 0000:00:1d.0: reg 0x10: [mem 0xf4937000-0xf49373ff]
[    0.224455] pci 0000:00:1d.0: PME# supported from D0 D3hot D3cold
[    0.224565] pci 0000:00:1d.0: System wakeup disabled by ACPI
[    0.224654] pci 0000:00:1e.0: [8086:244e] type 01 class 0x060401
[    0.224817] pci 0000:00:1e.0: System wakeup disabled by ACPI
[    0.224915] pci 0000:00:1f.0: [8086:1e47] type 00 class 0x060100
[    0.225207] pci 0000:00:1f.2: [8086:1e02] type 00 class 0x010601
[    0.225228] pci 0000:00:1f.2: reg 0x10: [io  0xf0d0-0xf0d7]
[    0.225236] pci 0000:00:1f.2: reg 0x14: [io  0xf0c0-0xf0c3]
[    0.225245] pci 0000:00:1f.2: reg 0x18: [io  0xf0b0-0xf0b7]
[    0.225253] pci 0000:00:1f.2: reg 0x1c: [io  0xf0a0-0xf0a3]
[    0.225262] pci 0000:00:1f.2: reg 0x20: [io  0xf060-0xf07f]
[    0.225271] pci 0000:00:1f.2: reg 0x24: [mem 0xf4936000-0xf49367ff]
[    0.225327] pci 0000:00:1f.2: PME# supported from D3hot
[    0.225516] pci 0000:00:1f.3: [8086:1e22] type 00 class 0x0c0500
[    0.225532] pci 0000:00:1f.3: reg 0x10: [mem 0xf4935000-0xf49350ff 64bit]
[    0.225555] pci 0000:00:1f.3: reg 0x20: [io  0xf040-0xf05f]
[    0.225864] pci 0000:00:1c.0: PCI bridge to [bus 01]
[    0.226014] pci 0000:02:00.0: [1102:000b] type 00 class 0x040300
[    0.226058] pci 0000:02:00.0: reg 0x10: [mem 0xf4200000-0xf420ffff 64bit]
[    0.226088] pci 0000:02:00.0: reg 0x18: [mem 0xf4000000-0xf41fffff 64bit]
[    0.226119] pci 0000:02:00.0: reg 0x20: [mem 0xf0000000-0xf3ffffff 64bit]
[    0.226320] pci 0000:02:00.0: System wakeup disabled by ACPI
[    0.228827] pci 0000:00:1c.2: PCI bridge to [bus 02]
[    0.228834] pci 0000:00:1c.2:   bridge window [mem 0xf0000000-0xf42fffff]
[    0.228934] pci 0000:03:02.0: [1102:0004] type 00 class 0x040100
[    0.228955] pci 0000:03:02.0: reg 0x10: [io  0xe000-0xe03f]
[    0.229053] pci 0000:03:02.0: supports D1 D2
[    0.229138] pci 0000:03:02.1: [1102:7003] type 00 class 0x098000
[    0.229157] pci 0000:03:02.1: reg 0x10: [io  0xe040-0xe047]
[    0.229255] pci 0000:03:02.1: supports D1 D2
[    0.229337] pci 0000:03:02.2: [1102:4001] type 00 class 0x0c0010
[    0.229358] pci 0000:03:02.2: reg 0x10: [mem 0xf4804000-0xf48047ff]
[    0.229370] pci 0000:03:02.2: reg 0x14: [mem 0xf4800000-0xf4803fff]
[    0.229461] pci 0000:03:02.2: supports D1 D2
[    0.229462] pci 0000:03:02.2: PME# supported from D0 D1 D2 D3hot
[    0.229622] pci 0000:00:1e.0: PCI bridge to [bus 03] (subtractive decode)
[    0.229626] pci 0000:00:1e.0:   bridge window [io  0xe000-0xefff]
[    0.229631] pci 0000:00:1e.0:   bridge window [mem 0xf4800000-0xf48fffff]
[    0.229637] pci 0000:00:1e.0:   bridge window [io  0x0000-0x0cf7 window] (subtractive decode)
[    0.229639] pci 0000:00:1e.0:   bridge window [io  0x0d00-0xffff window] (subtractive decode)
[    0.229640] pci 0000:00:1e.0:   bridge window [mem 0x000a0000-0x000bffff window] (subtractive decode)
[    0.229641] pci 0000:00:1e.0:   bridge window [mem 0x000d4000-0x000d7fff window] (subtractive decode)
[    0.229643] pci 0000:00:1e.0:   bridge window [mem 0x000d8000-0x000dbfff window] (subtractive decode)
[    0.229644] pci 0000:00:1e.0:   bridge window [mem 0x000dc000-0x000dffff window] (subtractive decode)
[    0.229646] pci 0000:00:1e.0:   bridge window [mem 0x000e0000-0x000e3fff window] (subtractive decode)
[    0.229647] pci 0000:00:1e.0:   bridge window [mem 0x000e4000-0x000e7fff window] (subtractive decode)
[    0.229648] pci 0000:00:1e.0:   bridge window [mem 0xdfa00000-0xfeafffff window] (subtractive decode)
[    0.230783] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 10 *11 12 14 15)
[    0.230902] ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 10 11 12 14 15) *0, disabled.
[    0.231013] ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 10 *11 12 14 15)
[    0.231123] ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 *10 11 12 14 15)
[    0.231233] ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 *5 6 10 11 12 14 15)
[    0.231341] ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 10 11 12 14 15) *0, disabled.
[    0.231450] ACPI: PCI Interrupt Link [LNKG] (IRQs *3 4 5 6 10 11 12 14 15)
[    0.231559] ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 *10 11 12 14 15)
[    0.231775] ACPI: Enabled 4 GPEs in block 00 to 3F
[    0.232127] vgaarb: setting as boot device: PCI:0000:00:02.0
[    0.232129] vgaarb: device added: PCI:0000:00:02.0,decodes=io+mem,owns=io+mem,locks=none
[    0.232135] vgaarb: loaded
[    0.232136] vgaarb: bridge control possible 0000:00:02.0
[    0.232282] SCSI subsystem initialized
[    0.232363] libata version 3.00 loaded.
[    0.232376] ACPI: bus type USB registered
[    0.232415] usbcore: registered new interface driver usbfs
[    0.232433] usbcore: registered new interface driver hub
[    0.232483] usbcore: registered new device driver usb
[    0.232553] PCI: Using ACPI for IRQ routing
[    0.234183] PCI: pci_cache_line_size set to 64 bytes
[    0.234248] e820: reserve RAM buffer [mem 0x0009d800-0x0009ffff]
[    0.234254] e820: reserve RAM buffer [mem 0x40004000-0x43ffffff]
[    0.234255] e820: reserve RAM buffer [mem 0xd670a000-0xd7ffffff]
[    0.234257] e820: reserve RAM buffer [mem 0xd6f56000-0xd7ffffff]
[    0.234258] e820: reserve RAM buffer [mem 0xd77b4000-0xd7ffffff]
[    0.234259] e820: reserve RAM buffer [mem 0xd8f1e000-0xdbffffff]
[    0.234261] e820: reserve RAM buffer [mem 0xda6e3000-0xdbffffff]
[    0.234263] e820: reserve RAM buffer [mem 0xdb000000-0xdbffffff]
[    0.234264] e820: reserve RAM buffer [mem 0x21e600000-0x21fffffff]
[    0.234693] NetLabel: Initializing
[    0.234694] NetLabel:  domain hash size = 128
[    0.234695] NetLabel:  protocols = UNLABELED CIPSOv4
[    0.234729] NetLabel:  unlabeled traffic allowed by default
[    0.234759] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0, 0, 0, 0, 0, 0
[    0.234763] hpet0: 8 comparators, 64-bit 14.318180 MHz counter
[    0.236825] Switched to clocksource hpet
[    0.258865] pnp: PnP ACPI init
[    0.259066] system 00:00: [mem 0xfed40000-0xfed44fff] has been reserved
[    0.259109] system 00:00: Plug and Play ACPI device, IDs PNP0c01 (active)
[    0.259247] system 00:01: [io  0x0680-0x069f] has been reserved
[    0.259249] system 00:01: [io  0x1000-0x100f] has been reserved
[    0.259251] system 00:01: [io  0xffff] has been reserved
[    0.259253] system 00:01: [io  0xffff] has been reserved
[    0.259255] system 00:01: [io  0x0400-0x0453] could not be reserved
[    0.259257] system 00:01: [io  0x0458-0x047f] has been reserved
[    0.259260] system 00:01: [io  0x0500-0x057f] has been reserved
[    0.259261] system 00:01: [io  0x164e-0x164f] has been reserved
[    0.259265] system 00:01: Plug and Play ACPI device, IDs PNP0c02 (active)
[    0.259331] pnp 00:02: Plug and Play ACPI device, IDs PNP0b00 (active)
[    0.259434] system 00:03: [io  0x0454-0x0457] has been reserved
[    0.259437] system 00:03: Plug and Play ACPI device, IDs INT3f0d PNP0c02 (active)
[    0.259638] system 00:04: [io  0x0a40-0x0a4f] has been reserved
[    0.259640] system 00:04: [io  0x0a00-0x0a3f] has been reserved
[    0.259643] system 00:04: Plug and Play ACPI device, IDs PNP0c02 (active)
[    0.259743] system 00:05: [io  0x04d0-0x04d1] has been reserved
[    0.259746] system 00:05: Plug and Play ACPI device, IDs PNP0c02 (active)
[    0.259829] pnp 00:06: Plug and Play ACPI device, IDs PNP0303 PNP030b (active)
[    0.261322] pnp 00:07: [dma 0 disabled]
[    0.261378] pnp 00:07: Plug and Play ACPI device, IDs PNP0501 (active)
[    0.261826] system 00:08: [mem 0xfed1c000-0xfed1ffff] has been reserved
[    0.261829] system 00:08: [mem 0xfed10000-0xfed17fff] has been reserved
[    0.261843] system 00:08: [mem 0xfed18000-0xfed18fff] has been reserved
[    0.261845] system 00:08: [mem 0xfed19000-0xfed19fff] has been reserved
[    0.261847] system 00:08: [mem 0xf8000000-0xfbffffff] has been reserved
[    0.261849] system 00:08: [mem 0xfed20000-0xfed3ffff] has been reserved
[    0.261851] system 00:08: [mem 0xfed90000-0xfed93fff] could not be reserved
[    0.261853] system 00:08: [mem 0xfed45000-0xfed8ffff] has been reserved
[    0.261855] system 00:08: [mem 0xff000000-0xffffffff] has been reserved
[    0.261857] system 00:08: [mem 0xfee00000-0xfeefffff] could not be reserved
[    0.261859] system 00:08: [mem 0xdfa00000-0xdfa00fff] has been reserved
[    0.261862] system 00:08: Plug and Play ACPI device, IDs PNP0c02 (active)
[    0.262159] system 00:09: [mem 0x20000000-0x201fffff] has been reserved
[    0.262161] system 00:09: [mem 0x40004000-0x40004fff] has been reserved
[    0.262164] system 00:09: Plug and Play ACPI device, IDs PNP0c01 (active)
[    0.262189] pnp: PnP ACPI: found 10 devices
[    0.271001] pci 0000:00:1c.0: PCI bridge to [bus 01]
[    0.271016] pci 0000:00:1c.2: PCI bridge to [bus 02]
[    0.271021] pci 0000:00:1c.2:   bridge window [mem 0xf0000000-0xf42fffff]
[    0.271030] pci 0000:00:1e.0: PCI bridge to [bus 03]
[    0.271033] pci 0000:00:1e.0:   bridge window [io  0xe000-0xefff]
[    0.271038] pci 0000:00:1e.0:   bridge window [mem 0xf4800000-0xf48fffff]
[    0.271047] pci_bus 0000:00: resource 4 [io  0x0000-0x0cf7 window]
[    0.271049] pci_bus 0000:00: resource 5 [io  0x0d00-0xffff window]
[    0.271050] pci_bus 0000:00: resource 6 [mem 0x000a0000-0x000bffff window]
[    0.271051] pci_bus 0000:00: resource 7 [mem 0x000d4000-0x000d7fff window]
[    0.271053] pci_bus 0000:00: resource 8 [mem 0x000d8000-0x000dbfff window]
[    0.271054] pci_bus 0000:00: resource 9 [mem 0x000dc000-0x000dffff window]
[    0.271055] pci_bus 0000:00: resource 10 [mem 0x000e0000-0x000e3fff window]
[    0.271056] pci_bus 0000:00: resource 11 [mem 0x000e4000-0x000e7fff window]
[    0.271058] pci_bus 0000:00: resource 12 [mem 0xdfa00000-0xfeafffff window]
[    0.271059] pci_bus 0000:02: resource 1 [mem 0xf0000000-0xf42fffff]
[    0.271061] pci_bus 0000:03: resource 0 [io  0xe000-0xefff]
[    0.271062] pci_bus 0000:03: resource 1 [mem 0xf4800000-0xf48fffff]
[    0.271063] pci_bus 0000:03: resource 4 [io  0x0000-0x0cf7 window]
[    0.271064] pci_bus 0000:03: resource 5 [io  0x0d00-0xffff window]
[    0.271065] pci_bus 0000:03: resource 6 [mem 0x000a0000-0x000bffff window]
[    0.271067] pci_bus 0000:03: resource 7 [mem 0x000d4000-0x000d7fff window]
[    0.271068] pci_bus 0000:03: resource 8 [mem 0x000d8000-0x000dbfff window]
[    0.271069] pci_bus 0000:03: resource 9 [mem 0x000dc000-0x000dffff window]
[    0.271070] pci_bus 0000:03: resource 10 [mem 0x000e0000-0x000e3fff window]
[    0.271071] pci_bus 0000:03: resource 11 [mem 0x000e4000-0x000e7fff window]
[    0.271073] pci_bus 0000:03: resource 12 [mem 0xdfa00000-0xfeafffff window]
[    0.271175] NET: Registered protocol family 2
[    0.271577] TCP established hash table entries: 65536 (order: 7, 524288 bytes)
[    0.272131] TCP bind hash table entries: 65536 (order: 10, 4194304 bytes)
[    0.274636] TCP: Hash tables configured (established 65536 bind 65536)
[    0.274667] TCP: reno registered
[    0.274741] UDP hash table entries: 4096 (order: 7, 655360 bytes)
[    0.275144] UDP-Lite hash table entries: 4096 (order: 7, 655360 bytes)
[    0.275676] NET: Registered protocol family 1
[    0.275694] pci 0000:00:02.0: Video device with shadowed ROM
[    0.306918] PCI: CLS mismatch (64 != 32), using 64 bytes
[    0.307098] Unpacking initramfs...
[    0.734788] Freeing initrd memory: 5504K (ffff880037530000 - ffff880037a90000)
[    0.734840] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[    0.734843] software IO TLB [mem 0xd270a000-0xd670a000] (64MB) mapped at [ffff8800d270a000-ffff8800d6709fff]
[    0.735591] RAPL PMU detected, hw unit 2^-16 Joules, API unit is 2^-32 Joules, 3 fixed counters 163840 ms ovfl timer
[    0.735865] microcode: CPU0 sig=0x306a9, pf=0x2, revision=0x1b
[    0.735878] microcode: CPU1 sig=0x306a9, pf=0x2, revision=0x1b
[    0.735889] microcode: CPU2 sig=0x306a9, pf=0x2, revision=0x1b
[    0.735906] microcode: CPU3 sig=0x306a9, pf=0x2, revision=0x1b
[    0.735914] microcode: CPU4 sig=0x306a9, pf=0x2, revision=0x1b
[    0.735925] microcode: CPU5 sig=0x306a9, pf=0x2, revision=0x1b
[    0.735936] microcode: CPU6 sig=0x306a9, pf=0x2, revision=0x1b
[    0.735946] microcode: CPU7 sig=0x306a9, pf=0x2, revision=0x1b
[    0.736068] microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
[    0.736110] Scanning for low memory corruption every 60 seconds
[    0.736807] futex hash table entries: 2048 (order: 6, 262144 bytes)
[    0.736924] audit: initializing netlink subsys (disabled)
[    0.736977] audit: type=2000 audit(1426763568.714:1): initialized
[    0.737785] HugeTLB registered 2 MB page size, pre-allocated 0 pages
[    0.737849] zpool: loaded
[    0.737852] zbud: loaded
[    0.738115] VFS: Disk quotas dquot_6.5.2
[    0.738147] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[    0.738797] Key type big_key registered
[    0.739351] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 252)
[    0.739463] io scheduler noop registered
[    0.739466] io scheduler deadline registered
[    0.739489] io scheduler cfq registered (default)
[    0.739491] start plist test
[    0.740485] end plist test
[    0.741147] pci_hotplug: PCI Hot Plug PCI Core version: 0.5
[    0.741235] pciehp: PCI Express Hot Plug Controller Driver version: 0.4
[    0.741276] vesafb: mode is 1600x1200x32, linelength=6400, pages=0
[    0.741277] vesafb: scrolling: redraw
[    0.741279] vesafb: Truecolor: size=8:8:8:8, shift=24:16:8:0
[    0.741295] vesafb: framebuffer at 0xe0000000, mapped to 0xffffc90005300000, using 7552k, total 7552k
[    0.841960] Console: switching to colour frame buffer device 200x75
[    0.942257] fb0: VESA VGA frame buffer device
[    0.942286] intel_idle: MWAIT substates: 0x1120
[    0.942287] intel_idle: v0.4 model 0x3A
[    0.942288] intel_idle: lapic_timer_reliable_states 0xffffffff
[    0.943135] GHES: HEST is not enabled!
[    0.943222] Serial: 8250/16550 driver, 32 ports, IRQ sharing disabled
[    0.964225] 00:07: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
[    0.989672] 0000:00:16.3: ttyS4 at I/O 0xf0e0 (irq = 19, base_baud = 115200) is a 16550A
[    0.990097] Non-volatile memory driver v1.3
[    0.990163] Linux agpgart interface v0.103
[    0.990491] ahci 0000:00:1f.2: version 3.0
[    1.001213] ahci 0000:00:1f.2: AHCI 0001.0300 32 slots 6 ports 6 Gbps 0x7 impl SATA mode
[    1.001216] ahci 0000:00:1f.2: flags: 64bit ncq pm led clo pio slum part ems apst 
[    1.006897] scsi host0: ahci
[    1.007259] scsi host1: ahci
[    1.007513] scsi host2: ahci
[    1.007720] scsi host3: ahci
[    1.007912] scsi host4: ahci
[    1.008103] scsi host5: ahci
[    1.008226] ata1: SATA max UDMA/133 abar m2048@0xf4936000 port 0xf4936100 irq 26
[    1.008229] ata2: SATA max UDMA/133 abar m2048@0xf4936000 port 0xf4936180 irq 26
[    1.008231] ata3: SATA max UDMA/133 abar m2048@0xf4936000 port 0xf4936200 irq 26
[    1.008232] ata4: DUMMY
[    1.008233] ata5: DUMMY
[    1.008233] ata6: DUMMY
[    1.008290] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[    1.008297] ehci-pci: EHCI PCI platform driver
[    1.008437] ehci-pci 0000:00:1a.0: EHCI Host Controller
[    1.008617] ehci-pci 0000:00:1a.0: new USB bus registered, assigned bus number 1
[    1.008704] ehci-pci 0000:00:1a.0: debug port 2
[    1.012684] ehci-pci 0000:00:1a.0: cache line size of 64 is not supported
[    1.012731] ehci-pci 0000:00:1a.0: irq 16, io mem 0xf4938000
[    1.018212] ehci-pci 0000:00:1a.0: USB 2.0 started, EHCI 1.00
[    1.018443] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002
[    1.018444] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    1.018446] usb usb1: Product: EHCI Host Controller
[    1.018447] usb usb1: Manufacturer: Linux 4.0.0-rc4-testz+ ehci_hcd
[    1.018448] usb usb1: SerialNumber: 0000:00:1a.0
[    1.018995] hub 1-0:1.0: USB hub found
[    1.019014] hub 1-0:1.0: 3 ports detected
[    1.019783] ehci-pci 0000:00:1d.0: EHCI Host Controller
[    1.019793] ehci-pci 0000:00:1d.0: new USB bus registered, assigned bus number 2
[    1.019806] ehci-pci 0000:00:1d.0: debug port 2
[    1.023768] ehci-pci 0000:00:1d.0: cache line size of 64 is not supported
[    1.023796] ehci-pci 0000:00:1d.0: irq 23, io mem 0xf4937000
[    1.029209] ehci-pci 0000:00:1d.0: USB 2.0 started, EHCI 1.00
[    1.029291] usb usb2: New USB device found, idVendor=1d6b, idProduct=0002
[    1.029293] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    1.029294] usb usb2: Product: EHCI Host Controller
[    1.029295] usb usb2: Manufacturer: Linux 4.0.0-rc4-testz+ ehci_hcd
[    1.029296] usb usb2: SerialNumber: 0000:00:1d.0
[    1.029572] hub 2-0:1.0: USB hub found
[    1.029585] hub 2-0:1.0: 3 ports detected
[    1.029982] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[    1.029992] uhci_hcd: USB Universal Host Controller Interface driver
[    1.030070] i8042: PNP: PS/2 Controller [PNP0303:PS2K] at 0x60,0x64 irq 1
[    1.030071] i8042: PNP: PS/2 appears to have AUX port disabled, if this is incorrect please boot with i8042.nopnp
[    1.030849] serio: i8042 KBD port at 0x60,0x64 irq 1
[    1.031250] mousedev: PS/2 mouse device common for all mice
[    1.031550] rtc_cmos 00:02: RTC can wake from S4
[    1.031799] rtc_cmos 00:02: rtc core: registered rtc_cmos as rtc0
[    1.031835] rtc_cmos 00:02: alarms up to one month, y3k, 242 bytes nvram, hpet irqs
[    1.031855] Intel P-state driver initializing.
[    1.032692] ledtrig-cpu: registered to indicate activity on CPUs
[    1.032746] hidraw: raw HID events driver (C) Jiri Kosina
[    1.033117] usbcore: registered new interface driver usbhid
[    1.033119] usbhid: USB HID core driver
[    1.033267] TCP: cubic registered
[    1.033514] NET: Registered protocol family 10
[    1.036977] registered taskstats version 1
[    1.038915]   Magic number: 7:741:226
[    1.039073] rtc_cmos 00:02: setting system clock to 2015-03-19 11:12:49 UTC (1426763569)
[    1.039614] PM: Checking hibernation image partition /dev/disk/by-id/ata-INTEL_SSDSA2M080G2GN_CVPO9412011S080BGN-part1
[    1.086565] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input0
[    1.313407] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[    1.313453] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[    1.314522] ata3.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
[    1.314527] ata3.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
[    1.314531] ata3.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
[    1.315761] ata3.00: ATA-7: INTEL SSDSA2M080G2GN, 2CV102G9, max UDMA/133
[    1.315768] ata3.00: 156301488 sectors, multi 1: LBA48 NCQ (depth 31/32)
[    1.315848] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    1.315926] ata1.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
[    1.315931] ata1.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
[    1.315934] ata1.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
[    1.316767] ata1.00: ACPI cmd 00/00:00:00:00:00:a0 (NOP) rejected by device (Stat=0x51 Err=0x04)
[    1.316804] ata3.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
[    1.316811] ata3.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
[    1.316814] ata3.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
[    1.316868] ata3.00: ACPI cmd 00/00:00:00:00:00:a0 (NOP) rejected by device (Stat=0x51 Err=0x04)
[    1.317197] ata3.00: configured for UDMA/133
[    1.318504] ata1.00: ATA-8: WDC WD5000AAKX-75U6AA0, 19.01H19, max UDMA/133
[    1.318509] ata1.00: 976773168 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
[    1.321000] ata2.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
[    1.321017] ata2.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
[    1.321023] ata2.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
[    1.321063] ata1.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
[    1.321070] ata1.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
[    1.321073] ata1.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
[    1.321237] ata1.00: ACPI cmd 00/00:00:00:00:00:a0 (NOP) rejected by device (Stat=0x51 Err=0x04)
[    1.321352] ata2.00: ACPI cmd 00/00:00:00:00:00:a0 (NOP) rejected by device (Stat=0x51 Err=0x04)
[    1.321425] usb 1-1: new high-speed USB device number 2 using ehci-pci
[    1.322906] ata1.00: configured for UDMA/133
[    1.324136] scsi 0:0:0:0: Direct-Access     ATA      WDC WD5000AAKX-7 1H19 PQ: 0 ANSI: 5
[    1.324633] ata2.00: ATAPI: HL-DT-ST DVD+/-RW GHA2N, A103, max UDMA/133
[    1.325558] sd 0:0:0:0: [sda] 976773168 512-byte logical blocks: (500 GB/465 GiB)
[    1.325700] sd 0:0:0:0: [sda] Write Protect is off
[    1.325704] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[    1.325746] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    1.330012] ata2.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
[    1.330018] ata2.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
[    1.330021] ata2.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
[    1.330457] ata2.00: ACPI cmd 00/00:00:00:00:00:a0 (NOP) rejected by device (Stat=0x51 Err=0x04)
[    1.331460] usb 2-1: new high-speed USB device number 2 using ehci-pci
[    1.333760] ata2.00: configured for UDMA/133
[    1.345643]  sda: sda1 sda2 sda3
[    1.347718] sd 0:0:0:0: [sda] Attached SCSI disk
[    1.351496] scsi 1:0:0:0: CD-ROM            HL-DT-ST DVD+-RW GHA2N    A103 PQ: 0 ANSI: 5
[    1.363868] scsi 2:0:0:0: Direct-Access     ATA      INTEL SSDSA2M080 02G9 PQ: 0 ANSI: 5
[    1.364965] sd 2:0:0:0: [sdb] 156301488 512-byte logical blocks: (80.0 GB/74.5 GiB)
[    1.365496] sd 2:0:0:0: [sdb] Write Protect is off
[    1.365502] sd 2:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[    1.365726] sd 2:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    1.367705]  sdb: sdb1 sdb2
[    1.369573] sd 2:0:0:0: [sdb] Attached SCSI disk
[    1.369610] PM: Hibernation image not present or could not be loaded.
[    1.370928] Freeing unused kernel memory: 1564K (ffffffff81d19000 - ffffffff81ea0000)
[    1.370931] Write protecting the kernel read-only data: 12288k
[    1.372021] Freeing unused kernel memory: 988K (ffff880001709000 - ffff880001800000)
[    1.372623] Freeing unused kernel memory: 916K (ffff880001b1b000 - ffff880001c00000)
[    1.379527] random: systemd urandom read with 17 bits of entropy available
[    1.435997] usb 1-1: New USB device found, idVendor=8087, idProduct=0024
[    1.436004] usb 1-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[    1.437017] hub 1-1:1.0: USB hub found
[    1.437119] hub 1-1:1.0: 6 ports detected
[    1.445998] usb 2-1: New USB device found, idVendor=8087, idProduct=0024
[    1.446004] usb 2-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[    1.446823] hub 2-1:1.0: USB hub found
[    1.446996] hub 2-1:1.0: 8 ports detected
[    1.521873] sd 0:0:0:0: Attached scsi generic sg0 type 0
[    1.522055] scsi 1:0:0:0: Attached scsi generic sg1 type 5
[    1.522211] sd 2:0:0:0: Attached scsi generic sg2 type 0
[    1.665202] xhci_hcd 0000:00:14.0: xHCI Host Controller
[    1.665299] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 3
[    1.665965] xhci_hcd 0000:00:14.0: hcc params 0x20007181 hci version 0x100 quirks 0x0000b930
[    1.665974] xhci_hcd 0000:00:14.0: cache line size of 64 is not supported
[    1.666151] usb usb3: New USB device found, idVendor=1d6b, idProduct=0002
[    1.666153] usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    1.666154] usb usb3: Product: xHCI Host Controller
[    1.666155] usb usb3: Manufacturer: Linux 4.0.0-rc4-testz+ xhci-hcd
[    1.666156] usb usb3: SerialNumber: 0000:00:14.0
[    1.668003] hub 3-0:1.0: USB hub found
[    1.668055] hub 3-0:1.0: 4 ports detected
[    1.672194] xhci_hcd 0000:00:14.0: xHCI Host Controller
[    1.672205] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 4
[    1.672413] usb usb4: New USB device found, idVendor=1d6b, idProduct=0003
[    1.672416] usb usb4: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    1.672418] usb usb4: Product: xHCI Host Controller
[    1.672420] usb usb4: Manufacturer: Linux 4.0.0-rc4-testz+ xhci-hcd
[    1.672423] usb usb4: SerialNumber: 0000:00:14.0
[    1.675954] hub 4-0:1.0: USB hub found
[    1.676000] hub 4-0:1.0: 4 ports detected
[    1.703171] sr 1:0:0:0: [sr0] scsi3-mmc drive: 48x/48x writer dvd-ram cd/rw xa/form2 cdda tray
[    1.703177] cdrom: Uniform CD-ROM driver Revision: 3.20
[    1.707611] sr 1:0:0:0: Attached scsi CD-ROM sr0
[    1.712607] usb 1-1.3: new high-speed USB device number 3 using ehci-pci
[    1.734331] PM: Marking nosave pages: [mem 0x00000000-0x00000fff]
[    1.734334] PM: Marking nosave pages: [mem 0x0009d000-0x000fffff]
[    1.734336] PM: Marking nosave pages: [mem 0x20000000-0x201fffff]
[    1.734341] PM: Marking nosave pages: [mem 0x40004000-0x40004fff]
[    1.734342] PM: Marking nosave pages: [mem 0xd670a000-0xd67fffff]
[    1.734345] PM: Marking nosave pages: [mem 0xd6f56000-0xd6ffffff]
[    1.734347] PM: Marking nosave pages: [mem 0xd77b4000-0xd77fffff]
[    1.734349] PM: Marking nosave pages: [mem 0xd8f1e000-0xd8ffffff]
[    1.734352] PM: Marking nosave pages: [mem 0xda6e3000-0xda924fff]
[    1.734358] PM: Marking nosave pages: [mem 0xdb000000-0xffffffff]
[    1.734729] PM: Basic memory bitmaps created
[    1.735299] PM: Basic memory bitmaps freed
[    1.737601] tsc: Refined TSC clocksource calibration: 3392.295 MHz
[    1.739247] PM: Starting manual resume from disk
[    1.739254] PM: Hibernation image partition 8:17 present
[    1.739255] PM: Looking for hibernation image.
[    1.739541] PM: Image not found (code -22)
[    1.739544] PM: Hibernation image not present or could not be loaded.
[    1.768135] EXT4-fs (sdb2): mounting with "discard" option, but the device does not support discard
[    1.768138] EXT4-fs (sdb2): mounted filesystem with writeback data mode. Opts: (null)
[    1.799516] usb 1-1.3: New USB device found, idVendor=04e8, idProduct=685e
[    1.799519] usb 1-1.3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[    1.799521] usb 1-1.3: Product: BCM21553-Thunderbird
[    1.799522] usb 1-1.3: Manufacturer: Broadcom
[    1.799523] usb 1-1.3: SerialNumber: 0123456789ABCDEF
[    1.874840] usb 1-1.4: new low-speed USB device number 4 using ehci-pci
[    1.966500] usb 1-1.4: New USB device found, idVendor=046d, idProduct=c03f
[    1.966504] usb 1-1.4: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[    1.966507] usb 1-1.4: Product: USB-PS/2 Optical Mouse
[    1.966509] usb 1-1.4: Manufacturer: Logitech
[    1.971968] input: Logitech USB-PS/2 Optical Mouse as /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.4/1-1.4:1.0/0003:046D:C03F.0001/input/input1
[    1.973387] hid-generic 0003:046D:C03F.0001: input,hidraw0: USB HID v1.10 Mouse [Logitech USB-PS/2 Optical Mouse] on usb-0000:00:1a.0-1.4/input0
[    1.978360] usb-storage 1-1.3:1.0: USB Mass Storage device detected
[    1.978623] scsi host6: usb-storage 1-1.3:1.0
[    1.979384] usbcore: registered new interface driver usb-storage
[    1.979993] usbcore: registered new interface driver uas
[    2.219795] random: nonblocking pool is initialized
[    2.738355] Switched to clocksource tsc
[    2.747945] EXT4-fs (sdb2): re-mounted. Opts: (null)
[    2.901840] input: Power Button as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0C:00/input/input2
[    2.903857] ACPI: Power Button [PWRB]
[    2.907095] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input3
[    2.907175] ACPI: Power Button [PWRF]
[    2.928789] ACPI: Invalid active0 threshold
[    2.930635] thermal LNXTHERM:00: registered as thermal_zone0
[    2.930637] ACPI: Thermal Zone [TZ00] (28 C)
[    2.932304] thermal LNXTHERM:01: registered as thermal_zone1
[    2.932307] ACPI: Thermal Zone [TZ01] (30 C)
[    2.934619] pps_core: LinuxPPS API ver. 1 registered
[    2.934621] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
[    2.943412] PTP clock support registered
[    2.978517] ACPI Warning: SystemIO range 0x0000000000000428-0x000000000000042f conflicts with OpRegion 0x0000000000000400-0x000000000000047f (\PMIO) (20150204/utaddress-258)
[    2.978525] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[    2.978530] ACPI Warning: SystemIO range 0x0000000000000540-0x000000000000054f conflicts with OpRegion 0x0000000000000500-0x0000000000000563 (\GPIO) (20150204/utaddress-258)
[    2.978535] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[    2.978537] ACPI Warning: SystemIO range 0x0000000000000530-0x000000000000053f conflicts with OpRegion 0x0000000000000500-0x0000000000000563 (\GPIO) (20150204/utaddress-258)
[    2.978541] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[    2.978543] ACPI Warning: SystemIO range 0x0000000000000500-0x000000000000052f conflicts with OpRegion 0x0000000000000500-0x0000000000000563 (\GPIO) (20150204/utaddress-258)
[    2.978547] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[    2.978549] lpc_ich: Resource conflict(s) found affecting gpio_ich
[    2.983076] ACPI Warning: SystemIO range 0x000000000000f040-0x000000000000f05f conflicts with OpRegion 0x000000000000f040-0x000000000000f04f (\_SB_.PCI0.SBUS.SMBI) (20150204/utaddress-258)
[    2.983083] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[    2.983509] scsi 6:0:0:0: Direct-Access     SAMSUNG  S5830i Card      0000 PQ: 0 ANSI: 2
[    2.985735] sd 6:0:0:0: Attached scsi generic sg3 type 0
[    2.999778] sd 6:0:0:0: [sdc] Attached SCSI removable disk
[    3.008272] [drm] Initialized drm 1.1.0 20060810
[    3.027766] gameport gameport0: EMU10K1 is pci0000:03:02.1/gameport0, io 0xe040, speed 857kHz
[    3.051103] cdc_acm 1-1.3:1.1: ttyACM0: USB ACM device
[    3.051641] usbcore: registered new interface driver cdc_acm
[    3.051643] cdc_acm: USB Abstract Control Model driver for USB modems and ISDN adapters
[    3.071066] e1000e: Intel(R) PRO/1000 Network Driver - 2.3.2-k
[    3.071070] e1000e: Copyright(c) 1999 - 2014 Intel Corporation.
[    3.071516] e1000e 0000:00:19.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
[    3.082377] device-mapper: uevent: version 1.0.3
[    3.087323] device-mapper: ioctl: 4.30.0-ioctl (2014-12-22) initialised: dm-devel@redhat.com
[    3.095239] dcdbas dcdbas: Dell Systems Management Base Driver (version 5.6.0-3.2)
[    3.129424] AVX version of gcm_enc/dec engaged.
[    3.129424] AES CTR mode by8 optimization enabled
[    3.137395] iTCO_vendor_support: vendor-support=0
[    3.140512] iTCO_wdt: Intel TCO WatchDog Timer Driver v1.11
[    3.140567] iTCO_wdt: Found a Panther Point TCO device (Version=2, TCOBASE=0x0460)
[    3.141586] iTCO_wdt: initialized. heartbeat=30 sec (nowayout=0)
[    3.170092] snd_ctxfi 0000:02:00.0: chip 20K2 model SB0880 (1102:0043) is found
[    3.272689] e1000e 0000:00:19.0 eth0: registered PHC clock
[    3.272692] e1000e 0000:00:19.0 eth0: (PCI Express:2.5GT/s:Width x1) 90:b1:1c:98:07:df
[    3.272693] e1000e 0000:00:19.0 eth0: Intel(R) PRO/1000 Network Connection
[    3.272723] e1000e 0000:00:19.0 eth0: MAC: 10, PHY: 11, PBA No: 1011FF-0FF
[    3.273869] EXT4-fs (sda1): mounted filesystem with writeback data mode. Opts: data=writeback
[    3.294862] [drm] Memory usable by graphics device = 2048M
[    3.294877] checking generic (e0000000 760000) vs hw (e0000000 10000000)
[    3.294878] fb: switching to inteldrmfb from VESA VGA
[    3.294936] Console: switching to colour dummy device 80x25
[    3.296683] [drm] Replacing VGA console driver
[    3.299485] e1000e 0000:00:19.0 em1: renamed from eth0
[    3.303418] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[    3.303420] [drm] Driver supports precise vblank timestamp query.
[    3.303949] vgaarb: device changed decodes: PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
[    3.305870] EXT4-fs (sda2): mounted filesystem with writeback data mode. Opts: data=writeback
[    3.311604] ACPI: Video Device [GFX0] (multi-head: yes  rom: no  post: no)
[    3.312780] acpi device:49: registered as cooling_device8
[    3.313285] input: Video Bus as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input4
[    3.313788] [drm] Initialized i915 1.6.0 20150130 for 0000:00:02.0 on minor 0
[    3.315932] sound hdaudioC0D0: ALC269VB: SKU not ready 0x411111f0
[    3.316512] sound hdaudioC0D0: autoconfig for ALC269VB: line_outs=1 (0x1b/0x0/0x0/0x0/0x0) type:line
[    3.316516] sound hdaudioC0D0:    speaker_outs=1 (0x14/0x0/0x0/0x0/0x0)
[    3.316518] sound hdaudioC0D0:    hp_outs=1 (0x21/0x0/0x0/0x0/0x0)
[    3.316520] sound hdaudioC0D0:    mono: mono_out=0x0
[    3.316522] sound hdaudioC0D0:    inputs:
[    3.316526] sound hdaudioC0D0:      Rear Mic=0x19
[    3.316530] sound hdaudioC0D0:      Front Mic=0x18
[    3.323236] snd_emu10k1 0000:03:02.0: Installing spdif_bug patch: SB Audigy 2 Platinum EX [SB0280]
[    3.354293] input: HDA Digital PCBeep as /devices/pci0000:00/0000:00:1b.0/sound/card0/hdaudioC0D0/input5
[    3.356881] input: HDA Intel PCH Rear Mic as /devices/pci0000:00/0000:00:1b.0/sound/card0/input6
[    3.357361] input: HDA Intel PCH Front Mic as /devices/pci0000:00/0000:00:1b.0/sound/card0/input7
[    3.357542] input: HDA Intel PCH Line Out as /devices/pci0000:00/0000:00:1b.0/sound/card0/input8
[    3.357694] input: HDA Intel PCH Front Headphone as /devices/pci0000:00/0000:00:1b.0/sound/card0/input9
[    3.357907] input: HDA Intel PCH HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:1b.0/sound/card0/input10
[    3.358115] input: HDA Intel PCH HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:1b.0/sound/card0/input11
[    3.379135] fbcon: inteldrmfb (fb0) is primary device
[    3.412685] [drm:intel_set_pch_fifo_underrun_reporting [i915]] *ERROR* uncleared pch fifo underrun on pch transcoder A
[    3.412699] [drm:intel_pch_fifo_underrun_irq_handler [i915]] *ERROR* PCH transcoder A FIFO underrun
[    3.455232] EXT4-fs (sda3): mounted filesystem with writeback data mode. Opts: data=writeback
[    3.471546] Console: switching to colour frame buffer device 200x75
[    3.474804] i915 0000:00:02.0: fb0: inteldrmfb frame buffer device
[    3.474806] i915 0000:00:02.0: registered panic notifier
[    3.490729] snd_ctxfi 0000:02:00.0: Use xfi-native timer
[    3.556894] Adding 2103292k swap on /dev/sdb1.  Priority:-1 extents:1 across:2103292k SSFS
[    4.763821] NET: Registered protocol family 17
[    4.811455] No iBFT detected.
[    5.066539] IPv6: ADDRCONF(NETDEV_UP): em1: link is not ready
[    7.826030] e1000e: em1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[    7.826103] IPv6: ADDRCONF(NETDEV_CHANGE): em1: link becomes ready
[   20.365735] Guest personality initialized and is inactive
[   20.366184] VMCI host device registered (name=vmci, major=10, minor=58)
[   20.366187] Initialized host personality
[   20.417330] NET: Registered protocol family 40
[   20.459652] fuse init (API version 7.23)
[   20.491924] ppdev: user-space parallel port driver
[   24.696812] FS-Cache: Loaded
[   24.744728] RPC: Registered named UNIX socket transport module.
[   24.744732] RPC: Registered udp transport module.
[   24.744733] RPC: Registered tcp transport module.
[   24.744735] RPC: Registered tcp NFSv4.1 backchannel transport module.
[   24.804702] FS-Cache: Netfs 'nfs' registered for caching
[   24.809940] Key type dns_resolver registered
[   24.859297] NFS: Registering the id_resolver key type
[   24.859315] Key type id_resolver registered
[   24.859317] Key type id_legacy registered
[   40.163739] console [netcon0] enabled
[   40.163744] netconsole: network logging started
[   40.172771] netconsole: network logging has already stopped
[   40.173425] netpoll: netconsole: local port 6665
[   40.173429] netpoll: netconsole: local IPv4 address 0.0.0.0
[   40.173431] netpoll: netconsole: interface 'em1'
[   40.173433] netpoll: netconsole: remote port 6666
[   40.173434] netpoll: netconsole: remote IPv4 address 10.160.67.74
[   40.173436] netpoll: netconsole: remote ethernet address 3c:4a:92:03:60:75
[   40.173474] netpoll: netconsole: local IP 10.160.4.42
[   40.173550] netconsole: netconsole: network logging started
[   43.251576] sysrq: SysRq : Emergency Sync
[   43.394002] Emergency Sync complete
[   53.410036] kvm: zapping shadow pages for mmio generation wraparound
[   55.473435] kvm [5155]: vcpu0 disabled perfctr wrmsr: 0xc1 data 0xffff
[  141.698527] PANIC: double fault, error_code: 0x0
[  141.698545] CPU: 1 PID: 15341 Comm: cc1 Not tainted 4.0.0-rc4-testz+ #126
[  141.698550] Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A12 01/10/2013
[  141.698554] task: ffff8802048ed790 ti: ffff8800d03c8000 task.ti: ffff8800d03c8000
[  141.698559] RIP: 0010:[<ffffffff81703157>]  [<ffffffff81703157>] page_fault+0x7/0x30
[  141.698572] RSP: 0018:00007ffff75eefe8  EFLAGS: 00010016
[  141.698576] RAX: 0000000081702057 RBX: 0000000000000001 RCX: ffffffff81702057
[  141.698579] RDX: 00000000000001b6 RSI: 0000000000000100 RDI: ffffffff81703323
[  141.698583] RBP: 00000000710f5b1b R08: 0000000000000001 R09: 0000000002846760
[  141.698587] R10: 0000000000000198 R11: 0000000000000246 R12: 0000000002742bf0
[  141.698592] R13: 000000000275ee40 R14: 000000000282d3a8 R15: 0000000002807090
[  141.698596] FS:  00007f4a64b7b800(0000) GS:ffff88021d240000(0000) knlGS:0000000000000000
[  141.698599] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  141.698603] CR2: 00007ffff75eefd8 CR3: 00000000d1bee000 CR4: 00000000001427e0
[  141.698605] Stack:
[  141.698626] BUG: unable to handle kernel paging request at 00007ffff75eefe8
[  141.698675] IP: [<ffffffff81005d94>] show_stack_log_lvl+0x124/0x1a0
[  141.698711] PGD ceb06067 PUD ceb07067 PMD ceb64067 PTE 0
[  141.698755] Oops: 0000 [#1] PREEMPT SMP 
[  141.698787] Modules linked in: netconsole configfs nfsv3 nfs_acl auth_rpcgss oid_registry nfsv4 dns_resolver nfs lockd grace sunrpc fscache parport_pc ppdev parport fuse vmw_vsock_vmci_transport vsock vmw_vmci iscsi_ibft iscsi_boot_sysfs af_packet x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm snd_hda_codec_hdmi snd_emu10k1 snd_hda_codec_realtek snd_hda_codec_generic snd_util_mem i915 snd_hda_intel snd_ac97_codec snd_hda_controller ac97_bus snd_hda_codec crct10dif_pclmul snd_rawmidi crc32_pclmul snd_hwdep snd_ctxfi crc32c_intel snd_seq_device iTCO_wdt ghash_clmulni_intel snd_pcm iTCO_vendor_support aesni_intel aes_x86_64 glue_helper dcdbas lrw i2c_algo_bit dm_mod gf128mul drm_kms_helper ablk_helper snd_timer e1000e cryptd cdc_acm snd emu10k1_gp drm serio_raw i2c_i801 mei_me lpc_ich
[  141.699391]  gameport mfd_core mei soundcore ptp pps_core thermal battery processor video button uas usb_storage sr_mod cdrom xhci_pci xhci_hcd sg
[  141.699523] CPU: 1 PID: 15341 Comm: cc1 Not tainted 4.0.0-rc4-testz+ #126
[  141.699557] Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A12 01/10/2013
[  141.699591] task: ffff8802048ed790 ti: ffff8800d03c8000 task.ti: ffff8800d03c8000
[  141.699627] RIP: 0010:[<ffffffff81005d94>]  [<ffffffff81005d94>] show_stack_log_lvl+0x124/0x1a0
[  141.699676] RSP: 0018:ffff88021d244e48  EFLAGS: 00010046
[  141.699703] RAX: 00007ffff75eeff0 RBX: 00007ffff75eefe8 RCX: 0000000000000000
[  141.699738] RDX: ffff88021d243fc0 RSI: ffff88021d244f58 RDI: 0000000000000000
[  141.699771] RBP: ffff88021d244ea8 R08: ffff88021d23ffc0 R09: 0000000000000000
[  141.699806] R10: 0000000000000001 R11: 0000000000000000 R12: ffff88021d244f58
[  141.699840] R13: 0000000000000000 R14: ffffffff81a22f73 R15: 0000000000000000
[  141.699874] FS:  00007f4a64b7b800(0000) GS:ffff88021d240000(0000) knlGS:0000000000000000
[  141.699911] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  141.699942] CR2: 00007ffff75eefe8 CR3: 00000000d1bee000 CR4: 00000000001427e0
[  141.699975] Stack:
[  141.699987]  ffff88021d244ea8 ffffffff816f2f01 ffff88021d244ea8 ffffffff00000008
[  141.700037]  ffff88021d244eb8 00007ffff75eefe8 ffff8800d03c8000 ffff88021d244f58
[  141.700086]  00007ffff75eefe8 0000000000000040 000000000282d3a8 0000000002807090
[  141.700135] Call Trace:
[  141.700151]  <#DF> 
[  141.700164] 
[  141.700180]  [<ffffffff816f2f01>] ? printk+0x46/0x48
[  141.700201]  [<ffffffff81005e9a>] show_regs+0x8a/0x220
[  141.700229]  [<ffffffff81046e37>] df_debug+0x27/0x40
[  141.700257]  [<ffffffff810041e7>] do_double_fault+0x87/0x100
[  141.700287]  [<ffffffff81702e67>] double_fault+0x27/0x30
[  141.700314]  [<ffffffff81702057>] ? native_iret+0x7/0x7
[  141.700342]  [<ffffffff81703323>] ? error_sti+0x5/0x6
[  141.700369]  [<ffffffff81703157>] ? page_fault+0x7/0x30
[  141.700395]  <<EOE>> 
[  141.700409]  <UNK> 
[  141.700425] Code: 4d b0 4c 89 45 b8 48 89 55 c0 48 8b 5b f8 e8 3f d1 6e 00 48 8b 55 c0 4c 8b 45 b8 8b 4d b0 85 c9 74 05 f6 c1 03 74 4c 48 8d 43 08 <48> 8b 33 48 c7 c7 6b 2f a2 81 89 4d ac 4c 89 45 b0 48 89 45 c0 
[  141.700735] RIP  [<ffffffff81005d94>] show_stack_log_lvl+0x124/0x1a0
[  141.700771]  RSP <ffff88021d244e48>
[  141.700789] CR2: 00007ffff75eefe8

[-- Attachment #3: Type: text/plain, Size: 1 bytes --]



^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-19 11:21                           ` Takashi Iwai
@ 2015-03-19 12:48                             ` Denys Vlasenko
  2015-03-19 13:47                               ` Takashi Iwai
  0 siblings, 1 reply; 77+ messages in thread
From: Denys Vlasenko @ 2015-03-19 12:48 UTC (permalink / raw)
  To: Takashi Iwai
  Cc: Denys Vlasenko, Andy Lutomirski, Jiri Kosina, Linus Torvalds,
	Stefan Seyfried, X86 ML, LKML, Tejun Heo

[-- Attachment #1: Type: text/plain, Size: 578 bytes --]

Having no more ideas at the moment, here is a tarball of 13 patches
of commits touching entry_64.S up to 4.0.0-rc1.

x0001.patch is the latest, x0015.patch is the oldest.

Patches 0003 and 0008 are not there since 0003 is an empty merge patch
and 0008 does some PCI fixup.

If this breakage is recent, it ought to be one of these.
Most of them do some non-trivial surgery.

Even though I did not spot anything suspicious in them,
entry.S is notorious for subtle breakage.

Try reverting them in sequence starting from x0001.patch
and see which revert makes the crash disappear.

[-- Attachment #2: revert_me_13.tar.gz --]
[-- Type: application/x-gzip, Size: 20340 bytes --]

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 21:55                 ` Andy Lutomirski
                                     ` (2 preceding siblings ...)
  2015-03-18 22:22                   ` Jiri Kosina
@ 2015-03-19 13:21                   ` Denys Vlasenko
  3 siblings, 0 replies; 77+ messages in thread
From: Denys Vlasenko @ 2015-03-19 13:21 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Linus Torvalds, Stefan Seyfried, Takashi Iwai, X86 ML, LKML, Tejun Heo

On 03/18/2015 10:55 PM, Andy Lutomirski wrote:
> On Wed, Mar 18, 2015 at 2:42 PM, Denys Vlasenko <dvlasenk@redhat.com> wrote:
>>> in 'irq_return_via_sysret' is new to 4.0, and instead of entering the
>>> kernel with a user stack pointer, maybe we're *exiting* the kernel,
>>> and have just reloaded the user stack pointer when "USERGS_SYSRET64"
>>> takes some fault.
>>
>> Yes, so far we happily thought that SYSRET never fails...
>>
>> This merits adding some code which would at least BUG_ON
>> if the faulting address is seen to match SYSRET64.
> 
> sysret64 can only fail with #GP, and we're totally screwed if that
> happens, although I agree about the BUG_ON in principle.  Where would
> we add it that would help in this case, though?  We never even made it
> to C code.

I propose to widen such check to catch any cases where
we enter an exception from CPL0 and find that our RSP
is bad. This will cover the case of faulting SYSRET and possible
future obscure bugs.

This patch stops the CPU dead if it finds itself
with a userspace RSP (not the saved RSP, but the _actual_ %RSP register)
in an exception handler prologue:

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index a0a3a6e..53a34ba 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -930,6 +930,12 @@ ENTRY(\sym)
 	INTR_FRAME
 	.endif

+	testq %rsp,%rsp
+	/* If RSP is positive, we are in kernel but have userspace RSP. */
+	/* We corrupted user stack already by storing iret frame there. */
+	/* This is supposed to be impossible. */
+0:	jns 0b
+
 	ASM_CLAC
 	PARAVIRT_ADJUST_EXCEPTION_FRAME


Hopefully the NMI watchdog will then kill it, and we'll get better data.
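
For readers who do not speak entry_64.S: the "testq %rsp,%rsp; jns" pair
above is just a sign test on the stack pointer. A minimal C sketch of the
same predicate follows, assuming the usual x86-64 layout where kernel
addresses have bit 63 set and userspace addresses do not. The function name
is made up for illustration; it is not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring "testq %rsp,%rsp; jns":
 * jns is taken when the sign bit (bit 63) is clear, i.e. when the
 * value is non-negative as a signed 64-bit integer. Under the usual
 * x86-64 address split that means "not a kernel address".
 */
static bool rsp_looks_like_userspace(uint64_t rsp)
{
	return (rsp >> 63) == 0;
}
```

Fed with the values from the oops earlier in this thread, the faulting
RSP 00007ffff75eefe8 is classified as userspace, while the RSP
ffff88021d244e48 seen in show_stack_log_lvl is classified as kernel.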

^ permalink raw reply related	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-19 12:48                             ` Denys Vlasenko
@ 2015-03-19 13:47                               ` Takashi Iwai
  2015-03-19 14:55                                 ` Takashi Iwai
  0 siblings, 1 reply; 77+ messages in thread
From: Takashi Iwai @ 2015-03-19 13:47 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Denys Vlasenko, Andy Lutomirski, Jiri Kosina, Linus Torvalds,
	Stefan Seyfried, X86 ML, LKML, Tejun Heo

At Thu, 19 Mar 2015 13:48:56 +0100,
Denys Vlasenko wrote:
> 
> Having no more ideas at the moment, here is a tarball of 13 patches
> of commits touching entry_64.S up to 4.0.0-rc1.
> 
> x0001.patch is the latest, x0015.patch is the oldest.
> 
> Patches 0003 and 0008 are not there since 0003 is empty merge patch
> and 0008 does some PCI fixup.
> 
> If this breakage is recent, it ought to be one of these.
> Most of them do some non-trivial surgery.
> 
> Even though I did not spot anything suspicious in them,
> entry.S is notorious for subtle breakage.
> 
> Try reverting them in sequence starting from x0001.patch
> and see reverting which one makes crash disappear.

OK, I'm going to check these git series.


Takashi

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-19 13:47                               ` Takashi Iwai
@ 2015-03-19 14:55                                 ` Takashi Iwai
  2015-03-19 15:22                                   ` Takashi Iwai
  0 siblings, 1 reply; 77+ messages in thread
From: Takashi Iwai @ 2015-03-19 14:55 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Denys Vlasenko, Andy Lutomirski, Jiri Kosina, Linus Torvalds,
	Stefan Seyfried, X86 ML, LKML, Tejun Heo

At Thu, 19 Mar 2015 14:47:12 +0100,
Takashi Iwai wrote:
> 
> At Thu, 19 Mar 2015 13:48:56 +0100,
> Denys Vlasenko wrote:
> > 
> > Having no more ideas at the moment, here is a tarball of 13 patches
> > of commits touching entry_64.S up to 4.0.0-rc1.
> > 
> > x0001.patch is the latest, x0015.patch is the oldest.
> > 
> > Patches 0003 and 0008 are not there since 0003 is empty merge patch
> > and 0008 does some PCI fixup.
> > 
> > If this breakage is recent, it ought to be one of these.
> > Most of them do some non-trivial surgery.
> > 
> > Even though I did not spot anything suspicious in them,
> > entry.S is notorious for subtle breakage.
> > 
> > Try reverting them in sequence starting from x0001.patch
> > and see reverting which one makes crash disappear.
> 
> OK, I'm going to check these git series.

Reverting the commit
96b6352c12711d5c0bb7157f49c92580248e8146
    x86_64, entry: Remove the syscall exit audit and schedule optimizations

seems to be enough.  After reverting this one, the machine runs stably
with the kvm stress test.

(I'll keep the test running for a while; at the previous bisection, I hit
 the bug right after posting the mail ;)

BTW, I also tried to reproduce this on another machine (a Haswell
laptop), but I failed, even with the very same kernel.  So the bug
really seems to depend on the CPU.


Takashi

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-19 14:55                                 ` Takashi Iwai
@ 2015-03-19 15:22                                   ` Takashi Iwai
  2015-03-19 15:41                                     ` Andy Lutomirski
  0 siblings, 1 reply; 77+ messages in thread
From: Takashi Iwai @ 2015-03-19 15:22 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Denys Vlasenko, Andy Lutomirski, Jiri Kosina, Linus Torvalds,
	Stefan Seyfried, X86 ML, LKML, Tejun Heo

At Thu, 19 Mar 2015 15:55:26 +0100,
Takashi Iwai wrote:
> 
> At Thu, 19 Mar 2015 14:47:12 +0100,
> Takashi Iwai wrote:
> > 
> > At Thu, 19 Mar 2015 13:48:56 +0100,
> > Denys Vlasenko wrote:
> > > 
> > > Having no more ideas at the moment, here is a tarball of 13 patches
> > > of commits touching entry_64.S up to 4.0.0-rc1.
> > > 
> > > x0001.patch is the latest, x0015.patch is the oldest.
> > > 
> > > Patches 0003 and 0008 are not there since 0003 is empty merge patch
> > > and 0008 does some PCI fixup.
> > > 
> > > If this breakage is recent, it ought to be one of these.
> > > Most of them do some non-trivial surgery.
> > > 
> > > Even though I did not spot anything suspicious in them,
> > > entry.S is notorious for subtle breakage.
> > > 
> > > Try reverting them in sequence starting from x0001.patch
> > > and see reverting which one makes crash disappear.
> > 
> > OK, I'm going to check these git series.
> 
> Reverting the commit
> 96b6352c12711d5c0bb7157f49c92580248e8146
>     x86_64, entry: Remove the syscall exit audit and schedule optimizations
> 
> seems enough.  After reverting this one, the machine runs stable with
> the kvm stress test.
> 
> (I'll keep test running for a while; at the previous bisection, I hit
>  the bug right after posting the mail ;)

It survived long enough, so this looks like the spot.

Also, I tested the patch below instead of reverting the commit, and
this seems to work, too.


Takashi

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 1d74d161687c..5340ac7f88a9 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -364,12 +364,12 @@ system_call_fastpath:
  * Has incomplete stack frame and undefined top of stack.
  */
 ret_from_sys_call:
-	testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
-	jnz int_ret_from_sys_call_fixup	/* Go the the slow path */
-
 	LOCKDEP_SYS_EXIT
 	DISABLE_INTERRUPTS(CLBR_NONE)
 	TRACE_IRQS_OFF
+	testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
+	jnz int_ret_from_sys_call_fixup	/* Go the the slow path */
+
 	CFI_REMEMBER_STATE
 	/*
 	 * sysretq will re-enable interrupts:

^ permalink raw reply related	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-19 15:22                                   ` Takashi Iwai
@ 2015-03-19 15:41                                     ` Andy Lutomirski
  2015-03-19 15:51                                       ` Takashi Iwai
  0 siblings, 1 reply; 77+ messages in thread
From: Andy Lutomirski @ 2015-03-19 15:41 UTC (permalink / raw)
  To: Takashi Iwai
  Cc: Denys Vlasenko, Denys Vlasenko, Jiri Kosina, Linus Torvalds,
	Stefan Seyfried, X86 ML, LKML, Tejun Heo

On Thu, Mar 19, 2015 at 8:22 AM, Takashi Iwai <tiwai@suse.de> wrote:
> At Thu, 19 Mar 2015 15:55:26 +0100,
> Takashi Iwai wrote:
>>
>> At Thu, 19 Mar 2015 14:47:12 +0100,
>> Takashi Iwai wrote:
>> >
>> > At Thu, 19 Mar 2015 13:48:56 +0100,
>> > Denys Vlasenko wrote:
>> > >
>> > > Having no more ideas at the moment, here is a tarball of 13 patches
>> > > of commits touching entry_64.S up to 4.0.0-rc1.
>> > >
>> > > x0001.patch is the latest, x0015.patch is the oldest.
>> > >
>> > > Patches 0003 and 0008 are not there since 0003 is empty merge patch
>> > > and 0008 does some PCI fixup.
>> > >
>> > > If this breakage is recent, it ought to be one of these.
>> > > Most of them do some non-trivial surgery.
>> > >
>> > > Even though I did not spot anything suspicious in them,
>> > > entry.S is notorious for subtle breakage.
>> > >
>> > > Try reverting them in sequence starting from x0001.patch
>> > > and see reverting which one makes crash disappear.
>> >
>> > OK, I'm going to check these git series.
>>
>> Reverting the commit
>> 96b6352c12711d5c0bb7157f49c92580248e8146
>>     x86_64, entry: Remove the syscall exit audit and schedule optimizations
>>
>> seems enough.  After reverting this one, the machine runs stable with
>> the kvm stress test.
>>
>> (I'll keep test running for a while; at the previous bisection, I hit
>>  the bug right after posting the mail ;)
>
> It survived long enough, so this looks like the spot.
>
> Also, I checked the patch below instead of reverting the commit, and
> this seems working, too.
>
>
> Takashi
>
> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
> index 1d74d161687c..5340ac7f88a9 100644
> --- a/arch/x86/kernel/entry_64.S
> +++ b/arch/x86/kernel/entry_64.S
> @@ -364,12 +364,12 @@ system_call_fastpath:
>   * Has incomplete stack frame and undefined top of stack.
>   */
>  ret_from_sys_call:
> -       testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
> -       jnz int_ret_from_sys_call_fixup /* Go the the slow path */
> -
>         LOCKDEP_SYS_EXIT
>         DISABLE_INTERRUPTS(CLBR_NONE)
>         TRACE_IRQS_OFF
> +       testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
> +       jnz int_ret_from_sys_call_fixup /* Go the the slow path */
> +
>         CFI_REMEMBER_STATE
>         /*
>          * sysretq will re-enable interrupts:

The crash you're seeing could certainly be caused by an IRQ at the
wrong time.  However:

int_ret_from_sys_call_fixup:
        FIXUP_TOP_OF_STACK %r11, -ARGOFFSET
        jmp int_ret_from_sys_call

and

GLOBAL(int_ret_from_sys_call)
        DISABLE_INTERRUPTS(CLBR_NONE)
        TRACE_IRQS_OFF

so with or without your little patch, we're turning off IRQs very
quickly.  retint_swapgs also turns off interrupts before doing
anything.  So I don't see how your patch would have any effect.

I'm starting to wonder if the problem has something to do with running
fire_user_return_notifiers with IRQs on.  We appear to do that, and it
seems rather questionable to me that it's safe, given the sneaky
things that KVM does in there.

If we end up in user mode with a bad MSR_SYSCALL_MASK, we could see
your crash, although I don't see how that would happen either.

I'll try to write a diagnostic patch later this morning.

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-19 15:41                                     ` Andy Lutomirski
@ 2015-03-19 15:51                                       ` Takashi Iwai
  2015-03-19 16:01                                         ` Andy Lutomirski
  2015-03-20 18:16                                         ` Denys Vlasenko
  0 siblings, 2 replies; 77+ messages in thread
From: Takashi Iwai @ 2015-03-19 15:51 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Denys Vlasenko, Denys Vlasenko, Jiri Kosina, Linus Torvalds,
	Stefan Seyfried, X86 ML, LKML, Tejun Heo

At Thu, 19 Mar 2015 08:41:57 -0700,
Andy Lutomirski wrote:
> 
> On Thu, Mar 19, 2015 at 8:22 AM, Takashi Iwai <tiwai@suse.de> wrote:
> > At Thu, 19 Mar 2015 15:55:26 +0100,
> > Takashi Iwai wrote:
> >>
> >> At Thu, 19 Mar 2015 14:47:12 +0100,
> >> Takashi Iwai wrote:
> >> >
> >> > At Thu, 19 Mar 2015 13:48:56 +0100,
> >> > Denys Vlasenko wrote:
> >> > >
> >> > > Having no more ideas at the moment, here is a tarball of 13 patches
> >> > > of commits touching entry_64.S up to 4.0.0-rc1.
> >> > >
> >> > > x0001.patch is the latest, x0015.patch is the oldest.
> >> > >
> >> > > Patches 0003 and 0008 are not there since 0003 is empty merge patch
> >> > > and 0008 does some PCI fixup.
> >> > >
> >> > > If this breakage is recent, it ought to be one of these.
> >> > > Most of them do some non-trivial surgery.
> >> > >
> >> > > Even though I did not spot anything suspicious in them,
> >> > > entry.S is notorious for subtle breakage.
> >> > >
> >> > > Try reverting them in sequence starting from x0001.patch
> >> > > and see reverting which one makes crash disappear.
> >> >
> >> > OK, I'm going to check these git series.
> >>
> >> Reverting the commit
> >> 96b6352c12711d5c0bb7157f49c92580248e8146
> >>     x86_64, entry: Remove the syscall exit audit and schedule optimizations
> >>
> >> seems enough.  After reverting this one, the machine runs stable with
> >> the kvm stress test.
> >>
> >> (I'll keep test running for a while; at the previous bisection, I hit
> >>  the bug right after posting the mail ;)
> >
> > It survived long enough, so this looks like the spot.
> >
> > Also, I checked the patch below instead of reverting the commit, and
> > this seems working, too.
> >
> >
> > Takashi
> >
> > diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
> > index 1d74d161687c..5340ac7f88a9 100644
> > --- a/arch/x86/kernel/entry_64.S
> > +++ b/arch/x86/kernel/entry_64.S
> > @@ -364,12 +364,12 @@ system_call_fastpath:
> >   * Has incomplete stack frame and undefined top of stack.
> >   */
> >  ret_from_sys_call:
> > -       testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
> > -       jnz int_ret_from_sys_call_fixup /* Go the the slow path */
> > -
> >         LOCKDEP_SYS_EXIT
> >         DISABLE_INTERRUPTS(CLBR_NONE)
> >         TRACE_IRQS_OFF
> > +       testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
> > +       jnz int_ret_from_sys_call_fixup /* Go the the slow path */
> > +
> >         CFI_REMEMBER_STATE
> >         /*
> >          * sysretq will re-enable interrupts:
> 
> The crash you're seeing could certainly be caused by an IRQ at the
> wrong time.  However:
> 
> int_ret_from_sys_call_fixup:
>         FIXUP_TOP_OF_STACK %r11, -ARGOFFSET
>         jmp int_ret_from_sys_call
> 
> and
> 
> GLOBAL(int_ret_from_sys_call)
>         DISABLE_INTERRUPTS(CLBR_NONE)
>         TRACE_IRQS_OFF
> 
> so with or without your little patch, we're turning off IRQs very
> quickly.  retint_swapgs also turnes off interrupts before doing
> anything.  So I don't see how your patch would have any effect.

What about LOCKDEP_SYS_EXIT?


Takashi

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-19 15:51                                       ` Takashi Iwai
@ 2015-03-19 16:01                                         ` Andy Lutomirski
  2015-03-20 18:16                                         ` Denys Vlasenko
  1 sibling, 0 replies; 77+ messages in thread
From: Andy Lutomirski @ 2015-03-19 16:01 UTC (permalink / raw)
  To: Takashi Iwai
  Cc: Denys Vlasenko, Denys Vlasenko, Jiri Kosina, Linus Torvalds,
	Stefan Seyfried, X86 ML, LKML, Tejun Heo

On Thu, Mar 19, 2015 at 8:51 AM, Takashi Iwai <tiwai@suse.de> wrote:
> At Thu, 19 Mar 2015 08:41:57 -0700,
> Andy Lutomirski wrote:
>>
>> On Thu, Mar 19, 2015 at 8:22 AM, Takashi Iwai <tiwai@suse.de> wrote:
>> > At Thu, 19 Mar 2015 15:55:26 +0100,
>> > Takashi Iwai wrote:
>> >>
>> >> At Thu, 19 Mar 2015 14:47:12 +0100,
>> >> Takashi Iwai wrote:
>> >> >
>> >> > At Thu, 19 Mar 2015 13:48:56 +0100,
>> >> > Denys Vlasenko wrote:
>> >> > >
>> >> > > Having no more ideas at the moment, here is a tarball of 13 patches
>> >> > > of commits touching entry_64.S up to 4.0.0-rc1.
>> >> > >
>> >> > > x0001.patch is the latest, x0015.patch is the oldest.
>> >> > >
>> >> > > Patches 0003 and 0008 are not there since 0003 is empty merge patch
>> >> > > and 0008 does some PCI fixup.
>> >> > >
>> >> > > If this breakage is recent, it ought to be one of these.
>> >> > > Most of them do some non-trivial surgery.
>> >> > >
>> >> > > Even though I did not spot anything suspicious in them,
>> >> > > entry.S is notorious for subtle breakage.
>> >> > >
>> >> > > Try reverting them in sequence starting from x0001.patch
>> >> > > and see reverting which one makes crash disappear.
>> >> >
>> >> > OK, I'm going to check these git series.
>> >>
>> >> Reverting the commit
>> >> 96b6352c12711d5c0bb7157f49c92580248e8146
>> >>     x86_64, entry: Remove the syscall exit audit and schedule optimizations
>> >>
>> >> seems enough.  After reverting this one, the machine runs stable with
>> >> the kvm stress test.
>> >>
>> >> (I'll keep test running for a while; at the previous bisection, I hit
>> >>  the bug right after posting the mail ;)
>> >
>> > It survived long enough, so this looks like the spot.
>> >
>> > Also, I checked the patch below instead of reverting the commit, and
>> > this seems working, too.
>> >
>> >
>> > Takashi
>> >
>> > diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
>> > index 1d74d161687c..5340ac7f88a9 100644
>> > --- a/arch/x86/kernel/entry_64.S
>> > +++ b/arch/x86/kernel/entry_64.S
>> > @@ -364,12 +364,12 @@ system_call_fastpath:
>> >   * Has incomplete stack frame and undefined top of stack.
>> >   */
>> >  ret_from_sys_call:
>> > -       testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
>> > -       jnz int_ret_from_sys_call_fixup /* Go the the slow path */
>> > -
>> >         LOCKDEP_SYS_EXIT
>> >         DISABLE_INTERRUPTS(CLBR_NONE)
>> >         TRACE_IRQS_OFF
>> > +       testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
>> > +       jnz int_ret_from_sys_call_fixup /* Go the the slow path */
>> > +
>> >         CFI_REMEMBER_STATE
>> >         /*
>> >          * sysretq will re-enable interrupts:
>>
>> The crash you're seeing could certainly be caused by an IRQ at the
>> wrong time.  However:
>>
>> int_ret_from_sys_call_fixup:
>>         FIXUP_TOP_OF_STACK %r11, -ARGOFFSET
>>         jmp int_ret_from_sys_call
>>
>> and
>>
>> GLOBAL(int_ret_from_sys_call)
>>         DISABLE_INTERRUPTS(CLBR_NONE)
>>         TRACE_IRQS_OFF
>>
>> so with or without your little patch, we're turning off IRQs very
>> quickly.  retint_swapgs also turnes off interrupts before doing
>> anything.  So I don't see how your patch would have any effect.
>
> What about LOCKDEP_SYS_EXIT?
>

There's a LOCKDEP_SYS_EXIT_IRQ a few lines down in
int_ret_from_sys_call, and the syscall slow path falls through
directly to int_ret_from_sys_call.

I'm going to try to write a diagnostic patch now.  I have four
separate contractors coming starting half an hour ago*, so it might
take a while.

* Yeah, right.

--Andy

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-19 15:51                                       ` Takashi Iwai
  2015-03-19 16:01                                         ` Andy Lutomirski
@ 2015-03-20 18:16                                         ` Denys Vlasenko
  2015-03-20 18:50                                           ` Takashi Iwai
  2015-03-23  9:02                                           ` Takashi Iwai
  1 sibling, 2 replies; 77+ messages in thread
From: Denys Vlasenko @ 2015-03-20 18:16 UTC (permalink / raw)
  To: Takashi Iwai, Andy Lutomirski
  Cc: Denys Vlasenko, Jiri Kosina, Linus Torvalds, Stefan Seyfried,
	X86 ML, LKML, Tejun Heo

Hi,

This particular crash was hard to diagnose for two reasons:

* The CPU would happily use a userspace RSP in kernel mode.
  The crash comes only later, when we run off the stack.
  By then we have lost the information about where it started.

* The kernel's error handling code is ill-prepared for RSP pointing
  to the user stack, so we take another page fault while trying
  to dump the stack.

I prepared a patch which helps with both problems.

For testing, I inserted an invalid instruction right before SYSRET
to induce a similar bug, and booted resulting kernel in qemu.

Before my patch, double fault output starts like this:

[    0.715216] PANIC: double fault, error_code: 0x0
[    0.716033] CPU: 0 PID: 1 Comm: init Not tainted 4.0.0-rc2+ #7
[    0.716033] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[    0.716033] task: ffff880007588000 ti: ffff880007590000 task.ti: ffff880007590000
[    0.716033] RIP: 0010:[<ffffffff81017057>]  [<ffffffff81017057>] do_error_trap+0x47/0x120
[    0.716033] RSP: 0018:00007ffd89e7ffb8  EFLAGS: 00010006

The key here is that it doesn't show at which RIP we took the first
"bad" exception. The only useful detail visible here is bad RSP.
"do_error_trap+0x47" is useless.

After the patch, the very moment of "bad" exception is caught:

[    0.666758] Exception on user stack 00007ffc1fd0c388: RSP: 0018:00007ffc1fd0c3b0  EFLAGS: 00010006
[    0.667285] RIP: 0010:[<ffffffff81793688>]  [<ffffffff81793688>] ret_from_sys_call+0x5f/0x67
[    0.667285] PANIC: double fault, error_code: 0xffffffffffffffff
[    0.667285] CPU: 0 PID: 1 Comm: init Not tainted 4.0.0-rc2+ #13
[    0.667285] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[    0.667285] task: ffff880007588000 ti: ffff880007590000 task.ti: ffff880007590000
[    0.667285] RIP: 0010:[<ffffffff81793688>]  [<ffffffff81793688>] ret_from_sys_call+0x5f/0x67
[    0.667285] RSP: 0018:00007ffc1fd0c3b0  EFLAGS: 00010006

The exception happened at "ret_from_sys_call+0x5f".
We also no longer take another page fault;
the output proceeds like this:

...
[    0.667285] RAX: 0000000007a00000 RBX: 00007ffc1fd0c4e0 RCX: 00000000c0000101
[    0.667285] RDX: 00000000ffff8800 RSI: 0000000000005401 RDI: 00007ffc1fd0c388
[    0.667285] RBP: 00007ffc1fd0c570 R08: 0000000000000010 R09: 0000000000000000
[    0.667285] R10: 00007ffc1fd0c650 R11: 0000000000000202 R12: 0000000000000120
[    0.667285] R13: 00000000005f7b78 R14: 0000000000000000 R15: 00000000004c9d44
[    0.667285] FS:  0000000000000000(0000) GS:ffff880007a00000(0000) knlGS:0000000000000000
[    0.667285] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[    0.667285] CR2: 00000000004ad1e4 CR3: 0000000000101000 CR4: 00000000000007f0
[    0.667285] Stack:
[    0.667285]  0000000000000018 00007ffc1fd0c490 00007ffc1fd0c3d0 0000000000000000
[    0.667285]  0000000000000000 0000000000000000 00007ffc1fd0c490 0000000000000000
[    0.667285]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
[    0.667285] Call Trace:
[    0.667285]  <UNK>
[    0.667285] Code: 8b 44 24 50 48 8b 54 24 60 48 8b 74 24 68 48 8b 7c 24 70 48 8b 8c 24 80 00 00 00 4c 8b 9c 24 90 00 00 00 48 8b a4 24 98 00 00 00 <0f> 0b 0f 01 f8 48 0f 07 48 c7 84 24 a0 00 00 00 2b 00 00 00 48
[    0.667285] Kernel panic - not syncing: Machine halted.
[    0.667285] CPU: 0 PID: 1 Comm: init Not tainted 4.0.0-rc2+ #13
[    0.667285] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[    0.667285]  ffffffffffffffff ffff880007593e28 ffffffff81789625 ffff880007588000
[    0.667285]  ffffffff81a3b181 ffff880007593ea8 ffffffff817840aa ffff880007590000
[    0.667285]  0000000000000008 ffff880007593eb8 ffff880007593e58 0000000000000001
[    0.667285] Call Trace:
[    0.667285]  [<ffffffff81789625>] dump_stack+0x4c/0x65
[    0.667285]  [<ffffffff817840aa>] panic+0xc6/0x1ff
[    0.667285]  [<ffffffff81059ee5>] df_debug+0x35/0x40
[    0.667285]  [<ffffffff81017e37>] do_double_fault+0x87/0x100
[    0.667285]  [<ffffffff81017fb7>] do_userpsace_rsp_in_kernel+0x107/0x140
[    0.667285]  [<ffffffff81793688>] ? ret_from_sys_call+0x5f/0x67
[    0.667285]  [<ffffffff81795b49>] userpsace_rsp_in_kernel+0x39/0x40
[    0.667285]  [<ffffffff81793688>] ? ret_from_sys_call+0x5f/0x67
[    0.667285] Kernel Offset: disabled
[    0.667285] Rebooting in 1 seconds..
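
A side note on the register dump above: RCX, RDX and RAX appear to hold the
state left by the rdmsr in the new handler (RCX is 0xc0000101, the
MSR_GS_BASE index, and RDX:RAX matches the GS base printed in the same
dump). Below is a small C sketch of how the handler's "testl %edx,%edx; js"
decision reads that result. The helper names are invented for
illustration, and it assumes, as the patch does, that a kernel GS base
always has its top bit set.

```c
#include <stdbool.h>
#include <stdint.h>

/* rdmsr returns the 64-bit MSR value split as EDX (high) : EAX (low). */
static uint64_t msr_value(uint32_t edx, uint32_t eax)
{
	return ((uint64_t)edx << 32) | eax;
}

/*
 * Hypothetical helper mirroring "testl %edx,%edx; js 1f":
 * if the high half is negative as a 32-bit value, the GS base is
 * assumed to be a kernel address already and SWAPGS is skipped.
 */
static bool gs_base_is_kernel(uint32_t edx)
{
	return (edx & 0x80000000u) != 0;
}
```

With EDX = ffff8800 and EAX = 07a00000 from the dump, the combined value is
ffff880007a00000 and the check says "kernel", so SWAPGS would be skipped in
that run.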

Takashi, are you willing to reproduce the panic one more time,
with this patch? I would like to see whether oops messages
are more informative with it.
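
To make the error-code detection in the traps.c part of the patch easier to
follow, here is the alignment test on its own as a C sketch. It assumes the
RSP being examined is exactly what the CPU left after pushing the exception
frame: in long mode the CPU aligns the stack to 16 bytes and then pushes
five 8-byte words (SS, RSP, RFLAGS, CS, RIP), plus a sixth if the exception
carries an error code. The helper names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Five words (40 bytes) below a 16-byte boundary leave sp % 16 == 8;
 * six words (48 bytes) leave sp % 16 == 0. So a 16-byte aligned sp
 * implies that an error code was pushed.
 */
static bool frame_has_error_code(uint64_t sp)
{
	return (sp & 0xf) == 0;
}

/* Number of 8-byte words to read back from the bogus stack. */
static unsigned int frame_words(uint64_t sp)
{
	return frame_has_error_code(sp) ? 6 : 5;
}
```

For the test run above, "Exception on user stack 00007ffc1fd0c388" ends in
8, so only five words are read. That matches the inserted invalid
instruction, since #UD does not push an error code.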



diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 4e49d7d..92a35e6 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -70,6 +70,7 @@ dotraplinkage void do_segment_not_present(struct pt_regs *, long);
 dotraplinkage void do_stack_segment(struct pt_regs *, long);
 #ifdef CONFIG_X86_64
 dotraplinkage void do_double_fault(struct pt_regs *, long);
+dotraplinkage void do_userpsace_rsp_in_kernel(struct pt_regs *regs);
 asmlinkage struct pt_regs *sync_regs(struct pt_regs *);
 #endif
 dotraplinkage void do_general_protection(struct pt_regs *, long);
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 0c91256..fb85c26 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -958,6 +958,12 @@ ENTRY(\sym)
 	INTR_FRAME
 	.endif

+	testq %rsp,%rsp
+	/* If RSP is positive, we are in kernel but have userspace RSP. */
+	/* This should be impossible... modulo bugs. */
+	/* We corrupted user stack already by storing iret frame there. */
+	jns	userpsace_rsp_in_kernel
+
 	ASM_CLAC
 	PARAVIRT_ADJUST_EXCEPTION_FRAME

@@ -1635,3 +1641,46 @@ ENTRY(ignore_sysret)
 	CFI_ENDPROC
 END(ignore_sysret)

+/*
+ * We reach this place only if we detected a severe bug:
+ * on exception prologue, %rsp is not in kernelspace.
+ * This means that exception was taken while kernel was running with
+ * bogus %rsp, which should never happen.
+ *
+ * We don't know what's going on (it *is* a bug, after all).
+ * GS is also in an unknown state.
+ *
+ * Why do we catch this? Because otherwise we would continue
+ * writing to user stack, eventually taking a page fault which
+ * gets promoted to double-fault. By this time, we'll lose
+ * useful information, such as the source RIP.
+ */
+ENTRY(userpsace_rsp_in_kernel)
+	CFI_STARTPROC
+	/* Save bogus RSP value */
+	movq	%rsp,%rdi
+	/* Switch to kernel GS if necessary */
+	movl	$MSR_GS_BASE,%ecx
+	rdmsr
+	testl	%edx,%edx
+	js	1f	/* negative -> already in kernel */
+	SWAPGS
+1:	/* hopefully PER_CPU_VAR() now works */
+
+	/* Load %rsp with something valid */
+	movq	PER_CPU_VAR(cpu_tss + TSS_sp0),%rsp
+
+	/* Create a semi-bogus iret frame */
+	push	$__KERNEL_DS	/* pt_regs->ss */
+	push	%rdi		/* pt_regs->sp */
+	push	$0		/* pt_regs->flags */
+	push	$__KERNEL_CS	/* pt_regs->cs */
+	push	$0		/* pt_regs->ip */
+	push	$-1		/* pt_regs->orig_ax */
+	ALLOC_PT_GPREGS_ON_STACK
+	call	error_entry	/* fill pt_regs->gpregs */
+	movq	%rsp,%rdi
+	call	do_userpsace_rsp_in_kernel
+	/* does not return */
+	CFI_ENDPROC
+END(userpsace_rsp_in_kernel)
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 081252c..59f7ef0 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -368,6 +368,47 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code)
 	for (;;)
 		die(str, regs, error_code);
 }
+
+dotraplinkage void do_userpsace_rsp_in_kernel(struct pt_regs *regs)
+{
+	struct {
+		long error_code;
+		long ip;
+		long cs;
+		long flags;
+		long sp;
+		long ss;
+	} iretq_frame;
+	int err;
+	long __user *bogus_sp;
+
+	memset(&iretq_frame, 0xff, sizeof(iretq_frame));
+
+	bogus_sp = (long __user *)regs->sp;
+	/*
+	 * In long mode, CPU aligns iret frame's top to 16-byte boundary.
+	 * This allows us to determine whether an error code was pushed.
+	 */
+	preempt_disable();
+	if (!(regs->sp & 0xf))
+		err = copy_from_user(&iretq_frame, bogus_sp, 6 * sizeof(long));
+	else
+		err = copy_from_user(&iretq_frame.ip, bogus_sp, 5 * sizeof(long));
+
+	/* What did this exception push onto the user stack? */
+	printk(KERN_EMERG "Exception on user stack %016lx:"
+		" RSP: %04lx:%016lx  EFLAGS: %08lx\n",
+			regs->sp,
+			iretq_frame.ss, iretq_frame.sp, iretq_frame.flags);
+	printk(KERN_EMERG "RIP: %04lx:[<%016lx>] ",
+			iretq_frame.cs, iretq_frame.ip);
+	printk_address(iretq_frame.ip);
+
+	/* (Ab)use do_double_fault to print the rest */
+	if (!err)
+		memcpy(&regs->ip, &iretq_frame.ip, 5 * sizeof(long));
+	do_double_fault(regs, iretq_frame.error_code);
+}
 #endif

 dotraplinkage void do_bounds(struct pt_regs *regs, long error_code)


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-20 18:16                                         ` Denys Vlasenko
@ 2015-03-20 18:50                                           ` Takashi Iwai
  2015-03-23  9:02                                           ` Takashi Iwai
  1 sibling, 0 replies; 77+ messages in thread
From: Takashi Iwai @ 2015-03-20 18:50 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Andy Lutomirski, Denys Vlasenko, Jiri Kosina, Linus Torvalds,
	Stefan Seyfried, X86 ML, LKML, Tejun Heo

At Fri, 20 Mar 2015 19:16:53 +0100,
Denys Vlasenko wrote:
> 
> Takashi, are you willing to reproduce the panic one more time,
> with this patch? I would like to see whether oops messages
> are more informative with it.

Sure, I'll do it, but you'll have to wait until next Monday, as the
bug is triggered only on a machine in my office.  I checked my local
laptop, but it doesn't show the problem.

Maybe someone else can test it beforehand...


thanks,

Takashi

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-20 18:16                                         ` Denys Vlasenko
  2015-03-20 18:50                                           ` Takashi Iwai
@ 2015-03-23  9:02                                           ` Takashi Iwai
  2015-03-23  9:35                                             ` Takashi Iwai
  1 sibling, 1 reply; 77+ messages in thread
From: Takashi Iwai @ 2015-03-23  9:02 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Andy Lutomirski, Denys Vlasenko, Jiri Kosina, Linus Torvalds,
	Stefan Seyfried, X86 ML, LKML, Tejun Heo

At Fri, 20 Mar 2015 19:16:53 +0100,
Denys Vlasenko wrote:
> 
> Hi,
> 
> This particular crash was hard to diagnose because of two reasons:
> 
> * CPU would happily use userspace RSP in kernel mode.
>   Crash comes only later, when we run off the stack.
>   We lose information when it started.
> 
> * Kernel's error handling code is ill prepared for RSP pointing
>   to user stack. So we take another page fault trying
>   to dump stack.
> 
> I prepared a patch which helps with both problems.
> 
> For testing, I inserted an invalid instruction right before SYSRET
> to induce a similar bug, and booted resulting kernel in qemu.
> 
> Before my patch, double fault output starts like this:
> 
> [    0.715216] PANIC: double fault, error_code: 0x0
> [    0.716033] CPU: 0 PID: 1 Comm: init Not tainted 4.0.0-rc2+ #7
> [    0.716033] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [    0.716033] task: ffff880007588000 ti: ffff880007590000 task.ti: ffff880007590000
> [    0.716033] RIP: 0010:[<ffffffff81017057>]  [<ffffffff81017057>] do_error_trap+0x47/0x120
> [    0.716033] RSP: 0018:00007ffd89e7ffb8  EFLAGS: 00010006
> 
> The key here is that it doesn't show at which RIP we took the first
> "bad" exception. The only useful detail visible here is bad RSP.
> "do_error_trap+0x47" is useless.
> 
> After the patch, the very moment of "bad" exception is caught:
> 
> [    0.666758] Exception on user stack 00007ffc1fd0c388: RSP: 0018:00007ffc1fd0c3b0  EFLAGS: 00010006
> [    0.667285] RIP: 0010:[<ffffffff81793688>]  [<ffffffff81793688>] ret_from_sys_call+0x5f/0x67
> [    0.667285] PANIC: double fault, error_code: 0xffffffffffffffff
> [    0.667285] CPU: 0 PID: 1 Comm: init Not tainted 4.0.0-rc2+ #13
> [    0.667285] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [    0.667285] task: ffff880007588000 ti: ffff880007590000 task.ti: ffff880007590000
> [    0.667285] RIP: 0010:[<ffffffff81793688>]  [<ffffffff81793688>] ret_from_sys_call+0x5f/0x67
> [    0.667285] RSP: 0018:00007ffc1fd0c3b0  EFLAGS: 00010006
> 
> The exception happened at "ret_from_sys_call+0x5f".
> We also won't take another page fault any more,
> output proceeds like this:
> 
> ...
> [    0.667285] RAX: 0000000007a00000 RBX: 00007ffc1fd0c4e0 RCX: 00000000c0000101
> [    0.667285] RDX: 00000000ffff8800 RSI: 0000000000005401 RDI: 00007ffc1fd0c388
> [    0.667285] RBP: 00007ffc1fd0c570 R08: 0000000000000010 R09: 0000000000000000
> [    0.667285] R10: 00007ffc1fd0c650 R11: 0000000000000202 R12: 0000000000000120
> [    0.667285] R13: 00000000005f7b78 R14: 0000000000000000 R15: 00000000004c9d44
> [    0.667285] FS:  0000000000000000(0000) GS:ffff880007a00000(0000) knlGS:0000000000000000
> [    0.667285] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [    0.667285] CR2: 00000000004ad1e4 CR3: 0000000000101000 CR4: 00000000000007f0
> [    0.667285] Stack:
> [    0.667285]  0000000000000018 00007ffc1fd0c490 00007ffc1fd0c3d0 0000000000000000
> [    0.667285]  0000000000000000 0000000000000000 00007ffc1fd0c490 0000000000000000
> [    0.667285]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> [    0.667285] Call Trace:
> [    0.667285]  <UNK>
> [    0.667285] Code: 8b 44 24 50 48 8b 54 24 60 48 8b 74 24 68 48 8b 7c 24 70 48 8b 8c 24 80 00 00 00 4c 8b 9c 24 90 00 00 00 48 8b a4 24 98 00 00 00 <0f> 0b 0f 01 f8 48 0f 07 48 c7 84 24 a0 00 00 00 2b 00 00 00 48
> [    0.667285] Kernel panic - not syncing: Machine halted.
> [    0.667285] CPU: 0 PID: 1 Comm: init Not tainted 4.0.0-rc2+ #13
> [    0.667285] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [    0.667285]  ffffffffffffffff ffff880007593e28 ffffffff81789625 ffff880007588000
> [    0.667285]  ffffffff81a3b181 ffff880007593ea8 ffffffff817840aa ffff880007590000
> [    0.667285]  0000000000000008 ffff880007593eb8 ffff880007593e58 0000000000000001
> [    0.667285] Call Trace:
> [    0.667285]  [<ffffffff81789625>] dump_stack+0x4c/0x65
> [    0.667285]  [<ffffffff817840aa>] panic+0xc6/0x1ff
> [    0.667285]  [<ffffffff81059ee5>] df_debug+0x35/0x40
> [    0.667285]  [<ffffffff81017e37>] do_double_fault+0x87/0x100
> [    0.667285]  [<ffffffff81017fb7>] do_userpsace_rsp_in_kernel+0x107/0x140
> [    0.667285]  [<ffffffff81793688>] ? ret_from_sys_call+0x5f/0x67
> [    0.667285]  [<ffffffff81795b49>] userpsace_rsp_in_kernel+0x39/0x40
> [    0.667285]  [<ffffffff81793688>] ? ret_from_sys_call+0x5f/0x67
> [    0.667285] Kernel Offset: disabled
> [    0.667285] Rebooting in 1 seconds..
> 
> Takashi, are you willing to reproduce the panic one more time,
> with this patch? I would like to see whether oops messages
> are more informative with it.

It can't be applied to 4.0-rc5, unfortunately.

arch/x86/kernel/entry_64.S: Assembler messages:
arch/x86/kernel/entry_64.S:1725: Error: no such instruction: `alloc_pt_gpregs_on_stack'
arch/x86/kernel/entry_64.S:1716: Error: invalid operands (*UND* and *UND* sections) for `+'
scripts/Makefile.build:294: recipe for target 'arch/x86/kernel/entry_64.o' failed


Takashi

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-23  9:02                                           ` Takashi Iwai
@ 2015-03-23  9:35                                             ` Takashi Iwai
  2015-03-23 13:22                                               ` Takashi Iwai
  0 siblings, 1 reply; 77+ messages in thread
From: Takashi Iwai @ 2015-03-23  9:35 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Andy Lutomirski, Denys Vlasenko, Jiri Kosina, Linus Torvalds,
	Stefan Seyfried, X86 ML, LKML, Tejun Heo

At Mon, 23 Mar 2015 10:02:52 +0100,
Takashi Iwai wrote:
> 
> At Fri, 20 Mar 2015 19:16:53 +0100,
> Denys Vlasenko wrote:
> > Takashi, are you willing to reproduce the panic one more time,
> > with this patch? I would like to see whether oops messages
> > are more informative with it.
> 
> It can't be applied to 4.0-rc5, unfortunately.
> 
> arch/x86/kernel/entry_64.S: Assembler messages:
> arch/x86/kernel/entry_64.S:1725: Error: no such instruction: `alloc_pt_gpregs_on_stack'
> arch/x86/kernel/entry_64.S:1716: Error: invalid operands (*UND* and *UND* sections) for `+'
> scripts/Makefile.build:294: recipe for target 'arch/x86/kernel/entry_64.o' failed

I pulled the tip tree on top of 4.0-rc5, built it with your patch, and
now managed to get a better message:

 kvm: zapping shadow pages for mmio generation wraparound
 kvm [5126]: vcpu0 disabled perfctr wrmsr: 0xc1 data 0xffff
 Exception on user stack 00007ffd22c23ef0: RSP: 0018:00007ffd22c23f28  EFLAGS: 00010006
 RIP: 0010:[<ffffffff8162681d>]  [<ffffffff8162681d>] netlink_attachskb+0x1d/0x1d0
 PANIC: double fault, error_code: 0x0
 CPU: 1 PID: 10819 Comm: cc1 Tainted: G        W       4.0.0-rc5-debug1+ #2
 Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A12 01/10/2013
 task: ffff8800d1b34b10 ti: ffff8800d1b30000 task.ti: ffff8800d1b30000
 RIP: 0010:[<ffffffff8162681d>]  [<ffffffff8162681d>] netlink_attachskb+0x1d/0x1d0
 RSP: 0018:00007ffd22c23f28  EFLAGS: 00010006
 RAX: 0000000000000000 RBX: 0000000000000005 RCX: 00000000c0000101
 RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00007ffd22c23ef0
 RBP: 0000000000000ea7 R08: 0000000000001ea7 R09: ffffffffffffffff
 R10: 000000000309dbf8 R11: 0000000000000246 R12: 0000000000000001
 R13: 0000000000000000 R14: 0000000003026e40 R15: 000000000309cd50
 FS:  00007f89c83c2800(0000) GS:ffff88021d240000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 000000000000016d CR3: 00000000d90a0000 CR4: 00000000001427e0
 Stack:
  0000000000000ea7 0000000000000000 0000000003099c10 0000000000000ea7
  0000000000000ea7 0000000000000001 0000000003099c10 0000000000000ea7
  0000000000c84696 0000000003099c88 00007f0122c23fb8 000000000302f610
 Call Trace:
  <UNK> 
 Code: 
 10 75 ee f0 ff 42 6c 48 89 d0 5d c3 66 90 0f 1f 44 00 00 55 48 89 e5 41 56 41 55 49 89 d5 41 54 49 89 f4 53 48 89 fb 48 83 ec 30 <8b> 87 68 01 00 00 39 87 9c 01 00 00 7c 25 48 8b 87 88 04 00 00 
 Kernel panic - not syncing: Machine halted.
 CPU: 1 PID: 10819 Comm: cc1 Tainted: G        W       4.0.0-rc5-debug1+ #2
 Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A12 01/10/2013
  0000000000000000 ffff8800d1b33e28 ffffffff816f80d2 0000000000000000
  ffffffff81a22f81 ffff8800d1b33ea8 ffffffff816f2358 00000000000058d7
  0000000000000008 ffff8800d1b33eb8 ffff8800d1b33e58 ffff8800d1b33ea8
 Call Trace:
  [<ffffffff816f80d2>] dump_stack+0x4c/0x6e
  [<ffffffff816f2358>] panic+0xc0/0x1f3
  [<ffffffff81046e65>] df_debug+0x35/0x40
  [<ffffffff81003fe7>] do_double_fault+0x87/0x100
  [<ffffffff81004167>] do_userpsace_rsp_in_kernel+0x107/0x140
  [<ffffffff8162681d>] ? netlink_attachskb+0x1d/0x1d0
  [<ffffffff81703ca6>] userpsace_rsp_in_kernel+0x36/0x40
  [<ffffffff8162681d>] ? netlink_attachskb+0x1d/0x1d0


So it seems to be hitting in netlink_attachskb().
I need to check whether it consistently hits there or just at
random.


Takashi

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-23  9:35                                             ` Takashi Iwai
@ 2015-03-23 13:22                                               ` Takashi Iwai
  2015-03-23 16:07                                                 ` Denys Vlasenko
  0 siblings, 1 reply; 77+ messages in thread
From: Takashi Iwai @ 2015-03-23 13:22 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Andy Lutomirski, Denys Vlasenko, Jiri Kosina, Linus Torvalds,
	Stefan Seyfried, X86 ML, LKML, Tejun Heo

At Mon, 23 Mar 2015 10:35:41 +0100,
Takashi Iwai wrote:
> 
> At Mon, 23 Mar 2015 10:02:52 +0100,
> Takashi Iwai wrote:
> > 
> > At Fri, 20 Mar 2015 19:16:53 +0100,
> > Denys Vlasenko wrote:
> > > Takashi, are you willing to reproduce the panic one more time,
> > > with this patch? I would like to see whether oops messages
> > > are more informative with it.
> > 
> > It can't be applied to 4.0-rc5, unfortunately.
> > 
> > arch/x86/kernel/entry_64.S: Assembler messages:
> > arch/x86/kernel/entry_64.S:1725: Error: no such instruction: `alloc_pt_gpregs_on_stack'
> > arch/x86/kernel/entry_64.S:1716: Error: invalid operands (*UND* and *UND* sections) for `+'
> > scripts/Makefile.build:294: recipe for target 'arch/x86/kernel/entry_64.o' failed
> 
> I pulled the tip tree on top of 4.0-rc5, built it with your patch, and
> now managed to get a better message:
> 
>  kvm: zapping shadow pages for mmio generation wraparound
>  kvm [5126]: vcpu0 disabled perfctr wrmsr: 0xc1 data 0xffff
>  Exception on user stack 00007ffd22c23ef0: RSP: 0018:00007ffd22c23f28  EFLAGS: 00010006
>  RIP: 0010:[<ffffffff8162681d>]  [<ffffffff8162681d>] netlink_attachskb+0x1d/0x1d0
>  PANIC: double fault, error_code: 0x0
>  CPU: 1 PID: 10819 Comm: cc1 Tainted: G        W       4.0.0-rc5-debug1+ #2
>  Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A12 01/10/2013
>  task: ffff8800d1b34b10 ti: ffff8800d1b30000 task.ti: ffff8800d1b30000
>  RIP: 0010:[<ffffffff8162681d>]  [<ffffffff8162681d>] netlink_attachskb+0x1d/0x1d0
>  RSP: 0018:00007ffd22c23f28  EFLAGS: 00010006
>  RAX: 0000000000000000 RBX: 0000000000000005 RCX: 00000000c0000101
>  RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00007ffd22c23ef0
>  RBP: 0000000000000ea7 R08: 0000000000001ea7 R09: ffffffffffffffff
>  R10: 000000000309dbf8 R11: 0000000000000246 R12: 0000000000000001
>  R13: 0000000000000000 R14: 0000000003026e40 R15: 000000000309cd50
>  FS:  00007f89c83c2800(0000) GS:ffff88021d240000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: 000000000000016d CR3: 00000000d90a0000 CR4: 00000000001427e0
>  Stack:
>   0000000000000ea7 0000000000000000 0000000003099c10 0000000000000ea7
>   0000000000000ea7 0000000000000001 0000000003099c10 0000000000000ea7
>   0000000000c84696 0000000003099c88 00007f0122c23fb8 000000000302f610
>  Call Trace:
>   <UNK> 
>  Code: 
>  10 75 ee f0 ff 42 6c 48 89 d0 5d c3 66 90 0f 1f 44 00 00 55 48 89 e5 41 56 41 55 49 89 d5 41 54 49 89 f4 53 48 89 fb 48 83 ec 30 <8b> 87 68 01 00 00 39 87 9c 01 00 00 7c 25 48 8b 87 88 04 00 00 
>  Kernel panic - not syncing: Machine halted.
>  CPU: 1 PID: 10819 Comm: cc1 Tainted: G        W       4.0.0-rc5-debug1+ #2
>  Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A12 01/10/2013
>   0000000000000000 ffff8800d1b33e28 ffffffff816f80d2 0000000000000000
>   ffffffff81a22f81 ffff8800d1b33ea8 ffffffff816f2358 00000000000058d7
>   0000000000000008 ffff8800d1b33eb8 ffff8800d1b33e58 ffff8800d1b33ea8
>  Call Trace:
>   [<ffffffff816f80d2>] dump_stack+0x4c/0x6e
>   [<ffffffff816f2358>] panic+0xc0/0x1f3
>   [<ffffffff81046e65>] df_debug+0x35/0x40
>   [<ffffffff81003fe7>] do_double_fault+0x87/0x100
>   [<ffffffff81004167>] do_userpsace_rsp_in_kernel+0x107/0x140
>   [<ffffffff8162681d>] ? netlink_attachskb+0x1d/0x1d0
>   [<ffffffff81703ca6>] userpsace_rsp_in_kernel+0x36/0x40
>   [<ffffffff8162681d>] ? netlink_attachskb+0x1d/0x1d0
> 
> 
> So it seems to be hitting in netlink_attachskb().
> I need to check whether it consistently hits there or just at
> random.

I managed to reproduce the bug two more times, and all three show the
very same stack trace as above.  So it reproduces reliably.

I'm really puzzled now.  We have a few pieces of information:

- git bisection pointed to commit 96b6352c1271:
    x86_64, entry: Remove the syscall exit audit and schedule optimizations
  and reverting it does indeed "fix" the problem.  Even just moving the
  two lines
    LOCKDEP_SYS_EXIT
    DISABLE_INTERRUPTS(CLBR_NONE)
  to the beginning of ret_from_sys_call already fixes it.  (Of course I
  can't prove the fix, but the system then ran for a day without a
  crash, while usually I hit the bug within 10 minutes of the full
  test run.)

- Another piece is that the bug happens only when KVM is running.
  The kernel ran without problems for days with similar tasks
  (compiling a kernel, etc.) when KVM was not in use.

- And now I get the trace as above, pointing to netlink_attachskb().

I have difficulty imagining how all these pieces fit into a single
picture.  Is something already screwed up before that?


Takashi

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-23 13:22                                               ` Takashi Iwai
@ 2015-03-23 16:07                                                 ` Denys Vlasenko
  2015-03-23 17:18                                                   ` Takashi Iwai
  2015-03-23 18:38                                                   ` Andy Lutomirski
  0 siblings, 2 replies; 77+ messages in thread
From: Denys Vlasenko @ 2015-03-23 16:07 UTC (permalink / raw)
  To: Takashi Iwai
  Cc: Andy Lutomirski, Denys Vlasenko, Jiri Kosina, Linus Torvalds,
	Stefan Seyfried, X86 ML, LKML, Tejun Heo

On 03/23/2015 02:22 PM, Takashi Iwai wrote:
> At Mon, 23 Mar 2015 10:35:41 +0100,
> Takashi Iwai wrote:
>>
>> At Mon, 23 Mar 2015 10:02:52 +0100,
>> Takashi Iwai wrote:
>>>
>>> At Fri, 20 Mar 2015 19:16:53 +0100,
>>> Denys Vlasenko wrote:
>>>> Takashi, are you willing to reproduce the panic one more time,
>>>> with this patch? I would like to see whether oops messages
>>>> are more informative with it.
>>>
>>> It can't be applied to 4.0-rc5, unfortunately.
>>>
>>> arch/x86/kernel/entry_64.S: Assembler messages:
>>> arch/x86/kernel/entry_64.S:1725: Error: no such instruction: `alloc_pt_gpregs_on_stack'
>>> arch/x86/kernel/entry_64.S:1716: Error: invalid operands (*UND* and *UND* sections) for `+'
>>> scripts/Makefile.build:294: recipe for target 'arch/x86/kernel/entry_64.o' failed
>>
>> I pulled the tip tree on top of 4.0-rc5, built it with your patch, and
>> now managed to get a better message:
>>
>>  kvm: zapping shadow pages for mmio generation wraparound
>>  kvm [5126]: vcpu0 disabled perfctr wrmsr: 0xc1 data 0xffff
>>  Exception on user stack 00007ffd22c23ef0: RSP: 0018:00007ffd22c23f28  EFLAGS: 00010006
>>  RIP: 0010:[<ffffffff8162681d>]  [<ffffffff8162681d>] netlink_attachskb+0x1d/0x1d0
>>  PANIC: double fault, error_code: 0x0
>>  CPU: 1 PID: 10819 Comm: cc1 Tainted: G        W       4.0.0-rc5-debug1+ #2
>>  Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A12 01/10/2013
>>  task: ffff8800d1b34b10 ti: ffff8800d1b30000 task.ti: ffff8800d1b30000
>>  RIP: 0010:[<ffffffff8162681d>]  [<ffffffff8162681d>] netlink_attachskb+0x1d/0x1d0
>>  RSP: 0018:00007ffd22c23f28  EFLAGS: 00010006
>>  RAX: 0000000000000000 RBX: 0000000000000005 RCX: 00000000c0000101
>>  RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00007ffd22c23ef0
>>  RBP: 0000000000000ea7 R08: 0000000000001ea7 R09: ffffffffffffffff
>>  R10: 000000000309dbf8 R11: 0000000000000246 R12: 0000000000000001
>>  R13: 0000000000000000 R14: 0000000003026e40 R15: 000000000309cd50
>>  FS:  00007f89c83c2800(0000) GS:ffff88021d240000(0000) knlGS:0000000000000000
>>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>  CR2: 000000000000016d CR3: 00000000d90a0000 CR4: 00000000001427e0
>>  Stack:
>>   0000000000000ea7 0000000000000000 0000000003099c10 0000000000000ea7
>>   0000000000000ea7 0000000000000001 0000000003099c10 0000000000000ea7
>>   0000000000c84696 0000000003099c88 00007f0122c23fb8 000000000302f610
>>  Call Trace:
>>   <UNK> 
>>  Code: 
>>  10 75 ee f0 ff 42 6c 48 89 d0 5d c3 66 90 0f 1f 44 00 00 55 48 89 e5 41 56 41 55 49 89 d5 41 54 49 89 f4 53 48 89 fb 48 83 ec 30 <8b> 87 68 01 00 00 39 87 9c 01 00 00 7c 25 48 8b 87 88 04 00 00 
>>  Kernel panic - not syncing: Machine halted.
>>  CPU: 1 PID: 10819 Comm: cc1 Tainted: G        W       4.0.0-rc5-debug1+ #2
>>  Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A12 01/10/2013
>>   0000000000000000 ffff8800d1b33e28 ffffffff816f80d2 0000000000000000
>>   ffffffff81a22f81 ffff8800d1b33ea8 ffffffff816f2358 00000000000058d7
>>   0000000000000008 ffff8800d1b33eb8 ffff8800d1b33e58 ffff8800d1b33ea8
>>  Call Trace:
>>   [<ffffffff816f80d2>] dump_stack+0x4c/0x6e
>>   [<ffffffff816f2358>] panic+0xc0/0x1f3
>>   [<ffffffff81046e65>] df_debug+0x35/0x40
>>   [<ffffffff81003fe7>] do_double_fault+0x87/0x100
>>   [<ffffffff81004167>] do_userpsace_rsp_in_kernel+0x107/0x140
>>   [<ffffffff8162681d>] ? netlink_attachskb+0x1d/0x1d0
>>   [<ffffffff81703ca6>] userpsace_rsp_in_kernel+0x36/0x40
>>   [<ffffffff8162681d>] ? netlink_attachskb+0x1d/0x1d0
>>
>>
>> So it seems to be hitting in netlink_attachskb().
>> I need to check whether it consistently hits there or just at
>> random.
> 
> I managed to reproduce the bug two more times, and all three show the
> very same stack trace as above.  So it reproduces reliably.

FYI: the disassembly of netlink_attachskb (from "Code:" line) is:

   0:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
   5:   55                      push   %rbp
   6:   48 89 e5                mov    %rsp,%rbp
   9:   41 56                   push   %r14
   b:   41 55                   push   %r13
   d:   49 89 d5                mov    %rdx,%r13
  10:   41 54                   push   %r12
  12:   49 89 f4                mov    %rsi,%r12
  15:   53                      push   %rbx
  16:   48 89 fb                mov    %rdi,%rbx
  19:   48 83 ec 30             sub    $0x30,%rsp
  1d:   8b 87 68 01 00 00       mov    0x168(%rdi),%eax
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  23:   39 87 9c 01 00 00       cmp    %eax,0x19c(%rdi)
  29:   7c 25                   jl     50 <_start+0x50>
  2b:   48 8b 87 88 04 00 00    mov    0x488(%rdi),%rax

The ^^^^^ instruction is the one which faults. Since you said it
consistently happens here, this should be a page fault, not an external
hardware interrupt.
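
A cross-check on the page-fault reading, added here as a sketch rather
than as part of the original analysis: the patch relies on the CPU
aligning the stack to 16 bytes before pushing the exception frame, so
the frame address tells whether an error code was pushed.  A page fault
pushes one; an external interrupt or an invalid-opcode exception does
not.  The helper names below are made up for illustration; the
addresses are the ones printed in the two crash logs in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helpers; only the arithmetic comes from the thread.
 *
 * In 64-bit mode the CPU aligns the stack down to 16 bytes, then pushes
 * SS, RSP, RFLAGS, CS, RIP (5 words) and, for some exceptions, an
 * error code (a 6th word).
 */
uint64_t frame_base(uint64_t interrupted_rsp, bool has_error_code)
{
	uint64_t top = interrupted_rsp & ~(uint64_t)0xf;
	unsigned int words = has_error_code ? 6 : 5;

	return top - words * sizeof(uint64_t);
}

/* Same test as in the patch: a 16-byte-aligned frame means 6 words. */
bool error_code_was_pushed(uint64_t base)
{
	return (base & 0xf) == 0;
}

void check_logged_values(void)
{
	/* Takashi's crash: RSP 00007ffd22c23f28, frame at 00007ffd22c23ef0 */
	assert(frame_base(0x00007ffd22c23f28ULL, true) == 0x00007ffd22c23ef0ULL);
	assert(error_code_was_pushed(0x00007ffd22c23ef0ULL));

	/* qemu test run: RSP 00007ffc1fd0c3b0, frame at 00007ffc1fd0c388 */
	assert(frame_base(0x00007ffc1fd0c3b0ULL, false) == 0x00007ffc1fd0c388ULL);
	assert(!error_code_was_pushed(0x00007ffc1fd0c388ULL));
}
```

With the values from the real crash the frame is 6 words long, i.e. an
error code is present, which is what a page fault would produce.  In the
qemu test, where an invalid instruction was inserted before SYSRET, the
frame is 5 words long.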

The code corresponds to the comparison in if():

int netlink_attachskb(struct sock *sk, struct sk_buff *skb,
                      long *timeo, struct sock *ssk)
{
        struct netlink_sock *nlk;

        nlk = nlk_sk(sk);

        if ((atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf ||

%rdi (the 1st param, "struct sock *sk") is 00007ffd22c23ef0
(a userspace address), but that is only because my patch clobbers %rdi :(
so we don't know which value it had at that moment.

> I'm really puzzled now.  We have a few pieces of information:
> 
> - git bisection pointed to commit 96b6352c1271:
>     x86_64, entry: Remove the syscall exit audit and schedule optimizations
>   and reverting it does indeed "fix" the problem.  Even just moving the
>   two lines
>     LOCKDEP_SYS_EXIT
>     DISABLE_INTERRUPTS(CLBR_NONE)
>   to the beginning of ret_from_sys_call already fixes it.  (Of course I
>   can't prove the fix, but the system then ran for a day without a
>   crash, while usually I hit the bug within 10 minutes of the full
>   test run.)

The commit 96b6352c1271 moved TIF_ALLWORK_MASK check from
interrupt-disabled region to interrupt-enabled:

        cmpq $__NR_syscall_max,%rax
        ja ret_from_sys_call
        movq %r10,%rcx
        call *sys_call_table(,%rax,8)  # XXX:    rip relative
        movq %rax,RAX-ARGOFFSET(%rsp)
ret_from_sys_call:
	testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
	jnz int_ret_from_sys_call_fixup	/* Go to the slow path */
 	LOCKDEP_SYS_EXIT
 	DISABLE_INTERRUPTS(CLBR_NONE)
 	TRACE_IRQS_OFF
...
...
int_ret_from_sys_call_fixup:
        FIXUP_TOP_OF_STACK %r11, -ARGOFFSET
        jmp int_ret_from_sys_call
...
...
GLOBAL(int_ret_from_sys_call)
        DISABLE_INTERRUPTS(CLBR_NONE)
        TRACE_IRQS_OFF

You effectively reverted that: with your change, this insn again runs
after the first DISABLE_INTERRUPTS(CLBR_NONE).

I also don't see how moving that check (even if it is wrong in a more
benign way) can have such a drastic effect.


Shot-in-the-dark idea. At this code revision we did not yet
store the user's %rsp in pt_regs->sp; we used a fixup to populate it:

        .macro FIXUP_TOP_OF_STACK tmp offset=0
        movq PER_CPU_VAR(old_rsp),\tmp
        movq \tmp,RSP+\offset(%rsp)

(There are pending patches to fix this mess).

If an interrupt that interrupted *kernel code* went into a code path
which does FIXUP_TOP_OF_STACK, it would overwrite the correct saved %rsp
with the user's one. The iret from the interrupt would work,
but the resulting CPU state would be inconsistent. But I don't see
such a code path from interrupts to FIXUP_TOP_OF_STACK...
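
The hypothetical failure mode above can be modelled in a few lines of C.
This is a toy, not kernel code: toy_pt_regs, old_rsp as a plain global,
and the function names are stand-ins invented for the sketch, and no such
code path from interrupts has been identified.  The two addresses used in
the usage note are borrowed from the crash log purely as examples of a
kernel and a user address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for the part of pt_regs that matters here. */
struct toy_pt_regs {
	uint64_t ip;
	uint64_t sp;	/* saved RSP of the interrupted context */
};

/* Stand-in for the per-CPU old_rsp slot (user RSP saved at syscall entry). */
static uint64_t old_rsp;

/* Models the RSP part of FIXUP_TOP_OF_STACK: copy old_rsp into regs->sp. */
void toy_fixup_top_of_stack(struct toy_pt_regs *regs)
{
	regs->sp = old_rsp;
}

/* Same idea as "testq %rsp,%rsp; jns" in the patch: sign bit clear = user. */
bool looks_like_user_address(uint64_t addr)
{
	return (addr >> 63) == 0;
}

/*
 * Returns the RSP an iret would restore if the fixup ran on a frame that
 * was saved while the kernel itself was interrupted.
 */
uint64_t toy_rsp_after_bad_fixup(uint64_t kernel_rsp, uint64_t user_rsp)
{
	struct toy_pt_regs regs = { .ip = 0, .sp = kernel_rsp };

	/* Precondition of the scenario: the interrupted context was the kernel. */
	assert(!looks_like_user_address(kernel_rsp));

	old_rsp = user_rsp;
	toy_fixup_top_of_stack(&regs);
	return regs.sp;
}
```

Calling toy_rsp_after_bad_fixup(0xffff8800d1b33e28, 0x00007ffd22c23f28)
returns the user value: the kernel would resume with a userspace %rsp,
which is the state the "jns" check in the debugging patch is meant to
catch.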


> - Another piece is that the bug happens only when KVM is running.
>   The kernel ran without problems for days with similar tasks
>   (compiling a kernel, etc.) when KVM was not in use.

Conceivably, virtualization support in CPUs can have nasty errata.
However, you and the other reporter have different CPUs - yours
is an Ivy Bridge, his is a Penryn.

I don't see how KVM would help to trigger this.

> - And now I get the trace as above, pointing to netlink_attachskb().
> 
> I have difficulty imagining how all these pieces fit into a single
> picture.  Is something already screwed up before that?

Well, a tiny bit more info would be visible if you changed %rdi
to, say, %r15 in these two lines in my patch:

       /* Save bogus RSP value */
       movq    %rsp,%rdi
...
       push    %rdi            /* pt_regs->sp */

Then original %rdi will be visible in the crash message.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-23 16:07                                                 ` Denys Vlasenko
@ 2015-03-23 17:18                                                   ` Takashi Iwai
  2015-03-23 17:46                                                     ` Denys Vlasenko
  2015-03-23 18:38                                                   ` Andy Lutomirski
  1 sibling, 1 reply; 77+ messages in thread
From: Takashi Iwai @ 2015-03-23 17:18 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Andy Lutomirski, Denys Vlasenko, Jiri Kosina, Linus Torvalds,
	Stefan Seyfried, X86 ML, LKML, Tejun Heo

At Mon, 23 Mar 2015 17:07:15 +0100,
Denys Vlasenko wrote:
> 
> On 03/23/2015 02:22 PM, Takashi Iwai wrote:
> > At Mon, 23 Mar 2015 10:35:41 +0100,
> > Takashi Iwai wrote:
> >>
> >> At Mon, 23 Mar 2015 10:02:52 +0100,
> >> Takashi Iwai wrote:
> >>>
> >>> At Fri, 20 Mar 2015 19:16:53 +0100,
> >>> Denys Vlasenko wrote:
> >>>> Takashi, are you willing to reproduce the panic one more time,
> >>>> with this patch? I would like to see whether oops messages
> >>>> are more informative with it.
> >>>
> >>> It can't be applied to 4.0-rc5, unfortunately.
> >>>
> >>> arch/x86/kernel/entry_64.S: Assembler messages:
> >>> arch/x86/kernel/entry_64.S:1725: Error: no such instruction: `alloc_pt_gpregs_on_stack'
> >>> arch/x86/kernel/entry_64.S:1716: Error: invalid operands (*UND* and *UND* sections) for `+'
> >>> scripts/Makefile.build:294: recipe for target 'arch/x86/kernel/entry_64.o' failed
> >>
> >> I pulled the tip tree on top of 4.0-rc5, built it with your patch, and
> >> now managed to get a better message:
> >>
> >>  kvm: zapping shadow pages for mmio generation wraparound
> >>  kvm [5126]: vcpu0 disabled perfctr wrmsr: 0xc1 data 0xffff
> >>  Exception on user stack 00007ffd22c23ef0: RSP: 0018:00007ffd22c23f28  EFLAGS: 00010006
> >>  RIP: 0010:[<ffffffff8162681d>]  [<ffffffff8162681d>] netlink_attachskb+0x1d/0x1d0
> >>  PANIC: double fault, error_code: 0x0
> >>  CPU: 1 PID: 10819 Comm: cc1 Tainted: G        W       4.0.0-rc5-debug1+ #2
> >>  Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A12 01/10/2013
> >>  task: ffff8800d1b34b10 ti: ffff8800d1b30000 task.ti: ffff8800d1b30000
> >>  RIP: 0010:[<ffffffff8162681d>]  [<ffffffff8162681d>] netlink_attachskb+0x1d/0x1d0
> >>  RSP: 0018:00007ffd22c23f28  EFLAGS: 00010006
> >>  RAX: 0000000000000000 RBX: 0000000000000005 RCX: 00000000c0000101
> >>  RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00007ffd22c23ef0
> >>  RBP: 0000000000000ea7 R08: 0000000000001ea7 R09: ffffffffffffffff
> >>  R10: 000000000309dbf8 R11: 0000000000000246 R12: 0000000000000001
> >>  R13: 0000000000000000 R14: 0000000003026e40 R15: 000000000309cd50
> >>  FS:  00007f89c83c2800(0000) GS:ffff88021d240000(0000) knlGS:0000000000000000
> >>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>  CR2: 000000000000016d CR3: 00000000d90a0000 CR4: 00000000001427e0
> >>  Stack:
> >>   0000000000000ea7 0000000000000000 0000000003099c10 0000000000000ea7
> >>   0000000000000ea7 0000000000000001 0000000003099c10 0000000000000ea7
> >>   0000000000c84696 0000000003099c88 00007f0122c23fb8 000000000302f610
> >>  Call Trace:
> >>   <UNK> 
> >>  Code: 
> >>  10 75 ee f0 ff 42 6c 48 89 d0 5d c3 66 90 0f 1f 44 00 00 55 48 89 e5 41 56 41 55 49 89 d5 41 54 49 89 f4 53 48 89 fb 48 83 ec 30 <8b> 87 68 01 00 00 39 87 9c 01 00 00 7c 25 48 8b 87 88 04 00 00 
> >>  Kernel panic - not syncing: Machine halted.
> >>  CPU: 1 PID: 10819 Comm: cc1 Tainted: G        W       4.0.0-rc5-debug1+ #2
> >>  Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A12 01/10/2013
> >>   0000000000000000 ffff8800d1b33e28 ffffffff816f80d2 0000000000000000
> >>   ffffffff81a22f81 ffff8800d1b33ea8 ffffffff816f2358 00000000000058d7
> >>   0000000000000008 ffff8800d1b33eb8 ffff8800d1b33e58 ffff8800d1b33ea8
> >>  Call Trace:
> >>   [<ffffffff816f80d2>] dump_stack+0x4c/0x6e
> >>   [<ffffffff816f2358>] panic+0xc0/0x1f3
> >>   [<ffffffff81046e65>] df_debug+0x35/0x40
> >>   [<ffffffff81003fe7>] do_double_fault+0x87/0x100
> >>   [<ffffffff81004167>] do_userpsace_rsp_in_kernel+0x107/0x140
> >>   [<ffffffff8162681d>] ? netlink_attachskb+0x1d/0x1d0
> >>   [<ffffffff81703ca6>] userpsace_rsp_in_kernel+0x36/0x40
> >>   [<ffffffff8162681d>] ? netlink_attachskb+0x1d/0x1d0
> >>
> >>
> >> So it seems to be hitting in netlink_attachskb().
> >> I need to check whether it consistently hits there or just at
> >> random.
> > 
> > I managed to reproduce the bug two more times, and all three show the
> > very same stack trace as above.  So it reproduces reliably.
> 
> FYI: the disassembly of netlink_attachskb (from "Code:" line) is:
> 
>    0:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
>    5:   55                      push   %rbp
>    6:   48 89 e5                mov    %rsp,%rbp
>    9:   41 56                   push   %r14
>    b:   41 55                   push   %r13
>    d:   49 89 d5                mov    %rdx,%r13
>   10:   41 54                   push   %r12
>   12:   49 89 f4                mov    %rsi,%r12
>   15:   53                      push   %rbx
>   16:   48 89 fb                mov    %rdi,%rbx
>   19:   48 83 ec 30             sub    $0x30,%rsp
>   1d:   8b 87 68 01 00 00       mov    0x168(%rdi),%eax
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   23:   39 87 9c 01 00 00       cmp    %eax,0x19c(%rdi)
>   29:   7c 25                   jl     50 <_start+0x50>
>   2b:   48 8b 87 88 04 00 00    mov    0x488(%rdi),%rax
> 
> The ^^^^^ instruction is the one which faults. Since you said it
> consistently happens here, this should be a page fault, not an external
> hardware interrupt.
> 
> The code corresponds to the comparison in if():
> 
> int netlink_attachskb(struct sock *sk, struct sk_buff *skb,
>                       long *timeo, struct sock *ssk)
> {
>         struct netlink_sock *nlk;
> 
>         nlk = nlk_sk(sk);
> 
>         if ((atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf ||
> 
> %rdi (which is 1st param, "struct sock *sk") is 00007ffd22c23ef0
> (userspace address), but it's just because my patch clobbers %rdi,   :(
> we don't know which value it had at that moment.
> 
> > I'm really puzzled now.  We have a few pieces of information:
> > 
> > - git bisection pointed the commit 96b6352c1271:
> >     x86_64, entry: Remove the syscall exit audit and schedule optimizations
> >   and reverting this "fixes" the problem indeed.  Even just moving two
> >   lines
> >     LOCKDEP_SYS_EXIT
> >     DISABLE_INTERRUPTS(CLBR_NONE) 
> >   at the beginning of ret_from_sys_call already fixes.  (Of course I
> >   can't prove the fix but it stabilizes for a day without crash while
> >   usually I hit the bug in 10 minutes in full test running.)
> 
> The commit 96b6352c1271 moved TIF_ALLWORK_MASK check from
> interrupt-disabled region to interrupt-enabled:
> 
>         cmpq $__NR_syscall_max,%rax
>         ja ret_from_sys_call
>         movq %r10,%rcx
>         call *sys_call_table(,%rax,8)  # XXX:    rip relative
>         movq %rax,RAX-ARGOFFSET(%rsp)
> ret_from_sys_call:
> 	testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 	jnz int_ret_from_sys_call_fixup	/* Go the the slow path */
>  	LOCKDEP_SYS_EXIT
>  	DISABLE_INTERRUPTS(CLBR_NONE)
>  	TRACE_IRQS_OFF
> ...
> ...
> int_ret_from_sys_call_fixup:
>         FIXUP_TOP_OF_STACK %r11, -ARGOFFSET
>         jmp int_ret_from_sys_call
> ...
> ...
> GLOBAL(int_ret_from_sys_call)
>         DISABLE_INTERRUPTS(CLBR_NONE)
>         TRACE_IRQS_OFF
> 
> You reverted that by moving this insn to be after first DISABLE_INTERRUPTS(CLBR_NONE).

Oh yes.  I forgot to mention that I also tested moving only
DISABLE_INTERRUPTS(CLBR_NONE) to the beginning.  But this didn't
help for some reason.

I also tested with all kernel debug options disabled (both
CONFIG_DEBUG_LOCK_ALLOC and CONFIG_TRACE_IRQFLAGS are off), but that
kernel showed the same crash.

> I also don't see how moving that check (even if it is wrong in a more
> benign way) can have such a drastic effect.
> 
> 
> Shot-in-the-dark idea. At this code revision we did not yet
> store user's %rsp in pt_regs->sp, we used a fixup to populate it:
> 
>         .macro FIXUP_TOP_OF_STACK tmp offset=0
>         movq PER_CPU_VAR(old_rsp),\tmp
>         movq \tmp,RSP+\offset(%rsp)
> 
> (There are pending patches to fix this mess).
> 
> If an interrupt interrupting *kernel code* would go into a code path
> which does FIXUP_TOP_OF_STACK, it'd overwrite the correct saved %rsp
> with a user's one. The iret from interrupt would work,
> but the resulting CPU state would be inconsistent. But I don't see
> such a code path from interrupts to FIXUP_TOP_OF_STACK...
> 
> 
> > - Another piece is that the bug happens only when a KVM is running.
> >   The kernel ran without problem over days with similar tasks
> >   (compiling kernel, etc) when no KVM was used.
> 
> Conceivably virtualization support in CPUs can have nasty erratas.
> However, you and other reporter have different CPUs - yours
> is Ivy Bridge, his CPU is a Penryn.
> 
> I don't see the path how KVM helps to trigger this.
> 
> > - And now I get the trace as above, pointing netlink_attachskb().
> > 
> > I have a difficulty to imagine how all these pieces fit into a single
> > picture.  Is something already screwed up before that?
> 
> Well, a tiny bit more info will be seen if you'd change %rdi
> to, say, %r15 in these two lines in my patch:
> 
>        /* Save bogus RSP value */
>        movq    %rsp,%rdi
> ...
>        push    %rdi            /* pt_regs->sp */
> 
> Then original %rdi will be visible in the crash message.

OK, here we go.

 kvm: zapping shadow pages for mmio generation wraparound
 kvm [5490]: vcpu0 disabled perfctr wrmsr: 0xc1 data 0xffff
 Exception on user stack 00007fff1d7e5ec0: RSP: 0018:00007fff1d7e5ef8  EFLAGS: 00010002
 RIP: 0010:[<ffffffff8162681d>]  [<ffffffff8162681d>] netlink_attachskb+0x1d/0x1d0
 PANIC: double fault, error_code: 0x0
 CPU: 5 PID: 14285 Comm: fixdep Tainted: G        W       4.0.0-rc5-debug1+ #3
 Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A12 01/10/2013
 task: ffff88020ba1c690 ti: ffff880206ba4000 task.ti: ffff880206ba4000
 RIP: 0010:[<ffffffff8162681d>]  [<ffffffff8162681d>] netlink_attachskb+0x1d/0x1d0
 RSP: 0018:00007fff1d7e5ef8  EFLAGS: 00010002
 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000c0000101
 RDX: 0000000000000000 RSI: 0000000000001ebb RDI: 0000000000000000
 RBP: 0000000000000022 R08: 0000000000000004 R09: 0000000000000000
 R10: 0000000000000002 R11: 0000000000000246 R12: 0000000000001ebb
 R13: 00007fb642fcc6e4 R14: 00007fb642fcdc18 R15: 00007fff1d7e5ec0
 FS:  00007fb642fa9700(0000) GS:ffff88021d340000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000168 CR3: 00000000ce1b4000 CR4: 00000000001427e0
 Stack:
  0000000000000005 0000000000401582 00007fb642fcb180 0000000000000053
  0000000000400d8a 0000000000000000 0000000000000000 000000005d152a17
  0000000000400f8c 0000000000000000 0000000100000004 00007fb642fcb000
 Call Trace:
  <UNK> 
 Code: 
 10 75 ee f0 ff 42 6c 48 89 d0 5d c3 66 90 0f 1f 44 00 00 55 48 89 e5 41 56 41 55 49 89 d5 41 54 49 89 f4 53 48 89 fb 48 83 ec 30 <8b> 87 68 01 00 00 39 87 9c 01 00 00 7c 25 48 8b 87 88 04 00 00 
 Kernel panic - not syncing: Machine halted.
 CPU: 5 PID: 14285 Comm: fixdep Tainted: G        W       4.0.0-rc5-debug1+ #3
 Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A12 01/10/2013
  0000000000000000 ffff880206ba7e28 ffffffff816f80d2 0000000000000000
  ffffffff81a22f81 ffff880206ba7ea8 ffffffff816f2358 00000000000050da
  0000000000000008 ffff880206ba7eb8 ffff880206ba7e58 ffff880206ba7ea8
 Call Trace:
  [<ffffffff816f80d2>] dump_stack+0x4c/0x6e
  [<ffffffff816f2358>] panic+0xc0/0x1f3
  [<ffffffff81046e65>] df_debug+0x35/0x40
  [<ffffffff81003fe7>] do_double_fault+0x87/0x100
  [<ffffffff81004167>] do_userpsace_rsp_in_kernel+0x107/0x140
  [<ffffffff8162681d>] ? netlink_attachskb+0x1d/0x1d0
  [<ffffffff81703ca7>] userpsace_rsp_in_kernel+0x37/0x40
  [<ffffffff8162681d>] ? netlink_attachskb+0x1d/0x1d0


I have to leave my office now.  If you need any further tests, let me
know; I'll run them tomorrow.  In any case, I'll need to double-check
whether I tested properly.


thanks,

Takashi

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-23 17:18                                                   ` Takashi Iwai
@ 2015-03-23 17:46                                                     ` Denys Vlasenko
  2015-03-23 18:43                                                       ` Takashi Iwai
  0 siblings, 1 reply; 77+ messages in thread
From: Denys Vlasenko @ 2015-03-23 17:46 UTC (permalink / raw)
  To: Takashi Iwai
  Cc: Andy Lutomirski, Denys Vlasenko, Jiri Kosina, Linus Torvalds,
	Stefan Seyfried, X86 ML, LKML, Tejun Heo

On 03/23/2015 06:18 PM, Takashi Iwai wrote:
> At Mon, 23 Mar 2015 17:07:15 +0100, Denys Vlasenko wrote:
>>>> I pulled tip tree on top of 4.0-rc5, built with your patch and now
>>>> succeeded to get a better message:
>>>>
>>>>  kvm: zapping shadow pages for mmio generation wraparound
>>>>  kvm [5126]: vcpu0 disabled perfctr wrmsr: 0xc1 data 0xffff
>>>>  Exception on user stack 00007ffd22c23ef0: RSP: 0018:00007ffd22c23f28  EFLAGS: 00010006
>>>>  RIP: 0010:[<ffffffff8162681d>]  [<ffffffff8162681d>] netlink_attachskb+0x1d/0x1d0
>>>>  PANIC: double fault, error_code: 0x0
>>>>  CPU: 1 PID: 10819 Comm: cc1 Tainted: G        W       4.0.0-rc5-debug1+ #2
>>>>  Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A12 01/10/2013
>>>>  task: ffff8800d1b34b10 ti: ffff8800d1b30000 task.ti: ffff8800d1b30000
>>>>  RIP: 0010:[<ffffffff8162681d>]  [<ffffffff8162681d>] netlink_attachskb+0x1d/0x1d0
>>>>  RSP: 0018:00007ffd22c23f28  EFLAGS: 00010006
>>>>  RAX: 0000000000000000 RBX: 0000000000000005 RCX: 00000000c0000101
>>>>  RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00007ffd22c23ef0

>> FYI: the disassembly of netlink_attachskb (from "Code:" line) is:
>>
>>    0:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
>>    5:   55                      push   %rbp
>>    6:   48 89 e5                mov    %rsp,%rbp
>>    9:   41 56                   push   %r14
>>    b:   41 55                   push   %r13
>>    d:   49 89 d5                mov    %rdx,%r13
>>   10:   41 54                   push   %r12
>>   12:   49 89 f4                mov    %rsi,%r12
>>   15:   53                      push   %rbx
>>   16:   48 89 fb                mov    %rdi,%rbx
>>   19:   48 83 ec 30             sub    $0x30,%rsp
>>   1d:   8b 87 68 01 00 00       mov    0x168(%rdi),%eax
>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>   23:   39 87 9c 01 00 00       cmp    %eax,0x19c(%rdi)
>>   29:   7c 25                   jl     50 <_start+0x50>
>>   2b:   48 8b 87 88 04 00 00    mov    0x488(%rdi),%rax
>>
>> The ^^^^^ instruction is the one which faults. Since you said it
>> consistently happens here, this should be a page fault, not an external
>> hardware interrupt.
>>
>> The code corresponds to the comparison in if():
>>
>> int netlink_attachskb(struct sock *sk, struct sk_buff *skb,
>>                       long *timeo, struct sock *ssk)
>> {
>>         struct netlink_sock *nlk;
>>
>>         nlk = nlk_sk(sk);
>>
>>         if ((atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf ||

>>> - Another piece is that the bug happens only when a KVM is running.
>>>   The kernel ran without problem over days with similar tasks
>>>   (compiling kernel, etc) when no KVM was used.
>>
>> Conceivably virtualization support in CPUs can have nasty erratas.
>> However, you and other reporter have different CPUs - yours
>> is Ivy Bridge, his CPU is a Penryn.
>>
>> I don't see the path how KVM helps to trigger this.
>>
>>> - And now I get the trace as above, pointing netlink_attachskb().
>>>
>>> I have a difficulty to imagine how all these pieces fit into a single
>>> picture.  Is something already screwed up before that?
>>
>> Well, a tiny bit more info will be seen if you'd change %rdi
>> to, say, %r15 in these two lines in my patch:
>>
>>        /* Save bogus RSP value */
>>        movq    %rsp,%rdi
>> ...
>>        push    %rdi            /* pt_regs->sp */
>>
>> Then original %rdi will be visible in the crash message.
> 
> OK, here we go.
> 
>  kvm: zapping shadow pages for mmio generation wraparound
>  kvm [5490]: vcpu0 disabled perfctr wrmsr: 0xc1 data 0xffff
>  Exception on user stack 00007fff1d7e5ec0: RSP: 0018:00007fff1d7e5ef8  EFLAGS: 00010002
>  RIP: 0010:[<ffffffff8162681d>]  [<ffffffff8162681d>] netlink_attachskb+0x1d/0x1d0
>  PANIC: double fault, error_code: 0x0
>  CPU: 5 PID: 14285 Comm: fixdep Tainted: G        W       4.0.0-rc5-debug1+ #3
>  Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A12 01/10/2013
>  task: ffff88020ba1c690 ti: ffff880206ba4000 task.ti: ffff880206ba4000
>  RIP: 0010:[<ffffffff8162681d>]  [<ffffffff8162681d>] netlink_attachskb+0x1d/0x1d0
>  RSP: 0018:00007fff1d7e5ef8  EFLAGS: 00010002
>  RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000c0000101
>  RDX: 0000000000000000 RSI: 0000000000001ebb RDI: 0000000000000000

Thanks for your testing. So the %rdi was NULL... not very informative.

Notice that every one of your crashes is preceded by

    kvm: zapping shadow pages for mmio generation wraparound
    kvm [5490]: vcpu0 disabled perfctr wrmsr: 0xc1 data 0xffff

This hints that kvm _is_ somehow responsible.
I'm no expert on kvm, I need to take a look around that code...

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-23 16:07                                                 ` Denys Vlasenko
  2015-03-23 17:18                                                   ` Takashi Iwai
@ 2015-03-23 18:38                                                   ` Andy Lutomirski
  2015-03-23 18:48                                                     ` Andy Lutomirski
                                                                       ` (3 more replies)
  1 sibling, 4 replies; 77+ messages in thread
From: Andy Lutomirski @ 2015-03-23 18:38 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Takashi Iwai, Denys Vlasenko, Jiri Kosina, Linus Torvalds,
	Stefan Seyfried, X86 ML, LKML, Tejun Heo

On Mon, Mar 23, 2015 at 9:07 AM, Denys Vlasenko <dvlasenk@redhat.com> wrote:
> On 03/23/2015 02:22 PM, Takashi Iwai wrote:
>> At Mon, 23 Mar 2015 10:35:41 +0100,
>> Takashi Iwai wrote:
>>>
>>> At Mon, 23 Mar 2015 10:02:52 +0100,
>>> Takashi Iwai wrote:
>>>>
>>>> At Fri, 20 Mar 2015 19:16:53 +0100,
>>>> Denys Vlasenko wrote:

>> I'm really puzzled now.  We have a few pieces of information:
>>
>> - git bisection pointed the commit 96b6352c1271:
>>     x86_64, entry: Remove the syscall exit audit and schedule optimizations
>>   and reverting this "fixes" the problem indeed.  Even just moving two
>>   lines
>>     LOCKDEP_SYS_EXIT
>>     DISABLE_INTERRUPTS(CLBR_NONE)
>>   at the beginning of ret_from_sys_call already fixes.  (Of course I
>>   can't prove the fix but it stabilizes for a day without crash while
>>   usually I hit the bug in 10 minutes in full test running.)
>
> The commit 96b6352c1271 moved TIF_ALLWORK_MASK check from
> interrupt-disabled region to interrupt-enabled:
>
>         cmpq $__NR_syscall_max,%rax
>         ja ret_from_sys_call
>         movq %r10,%rcx
>         call *sys_call_table(,%rax,8)  # XXX:    rip relative
>         movq %rax,RAX-ARGOFFSET(%rsp)
> ret_from_sys_call:
>         testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>         jnz int_ret_from_sys_call_fixup /* Go the the slow path */
>         LOCKDEP_SYS_EXIT
>         DISABLE_INTERRUPTS(CLBR_NONE)
>         TRACE_IRQS_OFF
> ...
> ...
> int_ret_from_sys_call_fixup:
>         FIXUP_TOP_OF_STACK %r11, -ARGOFFSET
>         jmp int_ret_from_sys_call
> ...
> ...
> GLOBAL(int_ret_from_sys_call)
>         DISABLE_INTERRUPTS(CLBR_NONE)
>         TRACE_IRQS_OFF
>
> You reverted that by moving this insn to be after first DISABLE_INTERRUPTS(CLBR_NONE).
>
> I also don't see how moving that check (even if it is wrong in a more
> benign way) can have such a drastic effect.

I bet I see it.  I have the advantage of having stared at KVM code and
cursed at it more recently than you, I suspect.  KVM does awful, awful
things to CPU state, and, as an optimization, it allows kernel code to
run with CPU state that would be totally invalid in user mode.  This
happens through a bunch of hooks, including this bit in __switch_to:

    /*
     * Now maybe reload the debug registers and handle I/O bitmaps
     */
    if (unlikely(task_thread_info(next_p)->flags & _TIF_WORK_CTXSW_NEXT ||
             task_thread_info(prev_p)->flags & _TIF_WORK_CTXSW_PREV))
        __switch_to_xtra(prev_p, next_p, tss);

IOW, we *change* tif during context switches.


The race looks like this:

    testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP)
    jnz int_ret_from_sys_call_fixup    /* Go the the slow path */

--- preempted here, switch to KVM guest ---

KVM guest enters and screws up, say, MSR_SYSCALL_MASK.  This wouldn't
happen to be a *32-bit* KVM guest, perhaps?

Now KVM schedules, calling __switch_to.  __switch_to sets
_TIF_USER_RETURN_NOTIFY.  We IRET back to the syscall exit code, turn
off interrupts, and do sysret.  We are now screwed.

I don't know why this manifests in this particular failure, but any
number of terrible things could happen now.

FWIW, this will affect things other than KVM.  For example, SIGKILL
sent while a process is sleeping in that two-instruction window won't
work.

Takashi, can you re-send your patch so we can review it for real in
light of this race?

>
>
> Shot-in-the-dark idea. At this code revision we did not yet
> store user's %rsp in pt_regs->sp, we used a fixup to populate it:
>
>         .macro FIXUP_TOP_OF_STACK tmp offset=0
>         movq PER_CPU_VAR(old_rsp),\tmp
>         movq \tmp,RSP+\offset(%rsp)
>
> (There are pending patches to fix this mess).
>
> If an interrupt interrupting *kernel code* would go into a code path
> which does FIXUP_TOP_OF_STACK, it'd overwrite the correct saved %rsp
> with a user's one. The iret from interrupt would work,
> but the resulting CPU state would be inconsistent. But I don't see
> such a code path from interrupts to FIXUP_TOP_OF_STACK...

I don't buy it.  Anything that does that is so completely broken that
I'd hope we'd have found it long ago.

--Andy

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-23 17:46                                                     ` Denys Vlasenko
@ 2015-03-23 18:43                                                       ` Takashi Iwai
  0 siblings, 0 replies; 77+ messages in thread
From: Takashi Iwai @ 2015-03-23 18:43 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Andy Lutomirski, Denys Vlasenko, Jiri Kosina, Linus Torvalds,
	Stefan Seyfried, X86 ML, LKML, Tejun Heo

At Mon, 23 Mar 2015 18:46:45 +0100,
Denys Vlasenko wrote:
> 
> On 03/23/2015 06:18 PM, Takashi Iwai wrote:
> > At Mon, 23 Mar 2015 17:07:15 +0100, Denys Vlasenko wrote:
> >>>> I pulled tip tree on top of 4.0-rc5, built with your patch and now
> >>>> succeeded to get a better message:
> >>>>
> >>>>  kvm: zapping shadow pages for mmio generation wraparound
> >>>>  kvm [5126]: vcpu0 disabled perfctr wrmsr: 0xc1 data 0xffff
> >>>>  Exception on user stack 00007ffd22c23ef0: RSP: 0018:00007ffd22c23f28  EFLAGS: 00010006
> >>>>  RIP: 0010:[<ffffffff8162681d>]  [<ffffffff8162681d>] netlink_attachskb+0x1d/0x1d0
> >>>>  PANIC: double fault, error_code: 0x0
> >>>>  CPU: 1 PID: 10819 Comm: cc1 Tainted: G        W       4.0.0-rc5-debug1+ #2
> >>>>  Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A12 01/10/2013
> >>>>  task: ffff8800d1b34b10 ti: ffff8800d1b30000 task.ti: ffff8800d1b30000
> >>>>  RIP: 0010:[<ffffffff8162681d>]  [<ffffffff8162681d>] netlink_attachskb+0x1d/0x1d0
> >>>>  RSP: 0018:00007ffd22c23f28  EFLAGS: 00010006
> >>>>  RAX: 0000000000000000 RBX: 0000000000000005 RCX: 00000000c0000101
> >>>>  RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00007ffd22c23ef0
> 
> >> FYI: the disassembly of netlink_attachskb (from "Code:" line) is:
> >>
> >>    0:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
> >>    5:   55                      push   %rbp
> >>    6:   48 89 e5                mov    %rsp,%rbp
> >>    9:   41 56                   push   %r14
> >>    b:   41 55                   push   %r13
> >>    d:   49 89 d5                mov    %rdx,%r13
> >>   10:   41 54                   push   %r12
> >>   12:   49 89 f4                mov    %rsi,%r12
> >>   15:   53                      push   %rbx
> >>   16:   48 89 fb                mov    %rdi,%rbx
> >>   19:   48 83 ec 30             sub    $0x30,%rsp
> >>   1d:   8b 87 68 01 00 00       mov    0x168(%rdi),%eax
> >> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >>   23:   39 87 9c 01 00 00       cmp    %eax,0x19c(%rdi)
> >>   29:   7c 25                   jl     50 <_start+0x50>
> >>   2b:   48 8b 87 88 04 00 00    mov    0x488(%rdi),%rax
> >>
> >> The ^^^^^ instruction is the one which faults. Since you said it
> >> consistently happens here, this should be a page fault, not an external
> >> hardware interrupt.
> >>
> >> The code corresponds to the comparison in if():
> >>
> >> int netlink_attachskb(struct sock *sk, struct sk_buff *skb,
> >>                       long *timeo, struct sock *ssk)
> >> {
> >>         struct netlink_sock *nlk;
> >>
> >>         nlk = nlk_sk(sk);
> >>
> >>         if ((atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf ||
> 
> >>> - Another piece is that the bug happens only when a KVM is running.
> >>>   The kernel ran without problem over days with similar tasks
> >>>   (compiling kernel, etc) when no KVM was used.
> >>
> >> Conceivably virtualization support in CPUs can have nasty erratas.
> >> However, you and other reporter have different CPUs - yours
> >> is Ivy Bridge, his CPU is a Penryn.
> >>
> >> I don't see the path how KVM helps to trigger this.
> >>
> >>> - And now I get the trace as above, pointing netlink_attachskb().
> >>>
> >>> I have a difficulty to imagine how all these pieces fit into a single
> >>> picture.  Is something already screwed up before that?
> >>
> >> Well, a tiny bit more info will be seen if you'd change %rdi
> >> to, say, %r15 in these two lines in my patch:
> >>
> >>        /* Save bogus RSP value */
> >>        movq    %rsp,%rdi
> >> ...
> >>        push    %rdi            /* pt_regs->sp */
> >>
> >> Then original %rdi will be visible in the crash message.
> > 
> > OK, here we go.
> > 
> >  kvm: zapping shadow pages for mmio generation wraparound
> >  kvm [5490]: vcpu0 disabled perfctr wrmsr: 0xc1 data 0xffff
> >  Exception on user stack 00007fff1d7e5ec0: RSP: 0018:00007fff1d7e5ef8  EFLAGS: 00010002
> >  RIP: 0010:[<ffffffff8162681d>]  [<ffffffff8162681d>] netlink_attachskb+0x1d/0x1d0
> >  PANIC: double fault, error_code: 0x0
> >  CPU: 5 PID: 14285 Comm: fixdep Tainted: G        W       4.0.0-rc5-debug1+ #3
> >  Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A12 01/10/2013
> >  task: ffff88020ba1c690 ti: ffff880206ba4000 task.ti: ffff880206ba4000
> >  RIP: 0010:[<ffffffff8162681d>]  [<ffffffff8162681d>] netlink_attachskb+0x1d/0x1d0
> >  RSP: 0018:00007fff1d7e5ef8  EFLAGS: 00010002
> >  RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000c0000101
> >  RDX: 0000000000000000 RSI: 0000000000001ebb RDI: 0000000000000000
> 
> Thanks for your testing. So the %rdi was NULL... not very informative.
> 
> Notice that your every crash is preceded by
> 
>     kvm: zapping shadow pages for mmio generation wraparound
>     kvm [5490]: vcpu0 disabled perfctr wrmsr: 0xc1 data 0xffff
> 
> This hints that kvm _is_ somehow responsible.

It's likely irrelevant, as this appears when a VM starts, not at
crash time.  I get this message all the time.  Sorry for the
confusion.


Takashi

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-23 18:38                                                   ` Andy Lutomirski
@ 2015-03-23 18:48                                                     ` Andy Lutomirski
  2015-03-23 18:59                                                       ` Takashi Iwai
  2015-03-23 18:54                                                     ` PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related? Stefan Seyfried
                                                                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 77+ messages in thread
From: Andy Lutomirski @ 2015-03-23 18:48 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Takashi Iwai, Denys Vlasenko, Jiri Kosina, Linus Torvalds,
	Stefan Seyfried, X86 ML, LKML, Tejun Heo

On Mon, Mar 23, 2015 at 11:38 AM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Mon, Mar 23, 2015 at 9:07 AM, Denys Vlasenko <dvlasenk@redhat.com> wrote:
>> On 03/23/2015 02:22 PM, Takashi Iwai wrote:
>>> At Mon, 23 Mar 2015 10:35:41 +0100,
>>> Takashi Iwai wrote:
>>>>
>>>> At Mon, 23 Mar 2015 10:02:52 +0100,
>>>> Takashi Iwai wrote:
>>>>>
>>>>> At Fri, 20 Mar 2015 19:16:53 +0100,
>>>>> Denys Vlasenko wrote:
>
>>> I'm really puzzled now.  We have a few pieces of information:
>>>
>>> - git bisection pointed the commit 96b6352c1271:
>>>     x86_64, entry: Remove the syscall exit audit and schedule optimizations
>>>   and reverting this "fixes" the problem indeed.  Even just moving two
>>>   lines
>>>     LOCKDEP_SYS_EXIT
>>>     DISABLE_INTERRUPTS(CLBR_NONE)
>>>   at the beginning of ret_from_sys_call already fixes.  (Of course I
>>>   can't prove the fix but it stabilizes for a day without crash while
>>>   usually I hit the bug in 10 minutes in full test running.)
>>
>> The commit 96b6352c1271 moved TIF_ALLWORK_MASK check from
>> interrupt-disabled region to interrupt-enabled:
>>
>>         cmpq $__NR_syscall_max,%rax
>>         ja ret_from_sys_call
>>         movq %r10,%rcx
>>         call *sys_call_table(,%rax,8)  # XXX:    rip relative
>>         movq %rax,RAX-ARGOFFSET(%rsp)
>> ret_from_sys_call:
>>         testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>         jnz int_ret_from_sys_call_fixup /* Go the the slow path */
>>         LOCKDEP_SYS_EXIT
>>         DISABLE_INTERRUPTS(CLBR_NONE)
>>         TRACE_IRQS_OFF
>> ...
>> ...
>> int_ret_from_sys_call_fixup:
>>         FIXUP_TOP_OF_STACK %r11, -ARGOFFSET
>>         jmp int_ret_from_sys_call
>> ...
>> ...
>> GLOBAL(int_ret_from_sys_call)
>>         DISABLE_INTERRUPTS(CLBR_NONE)
>>         TRACE_IRQS_OFF
>>
>> You reverted that by moving this insn to be after first DISABLE_INTERRUPTS(CLBR_NONE).
>>
>> I also don't see how moving that check (even if it is wrong in a more
>> benign way) can have such a drastic effect.
>
> I bet I see it.  I have the advantage of having stared at KVM code and
> cursed at it more recently than you, I suspect.  KVM does awful, awful
> things to CPU state, and, as an optimization, it allows kernel code to
> run with CPU state that would be totally invalid in user mode.  This
> happens through a bunch of hooks, including this bit in __switch_to:
>
>     /*
>      * Now maybe reload the debug registers and handle I/O bitmaps
>      */
>     if (unlikely(task_thread_info(next_p)->flags & _TIF_WORK_CTXSW_NEXT ||
>              task_thread_info(prev_p)->flags & _TIF_WORK_CTXSW_PREV))
>         __switch_to_xtra(prev_p, next_p, tss);
>
> IOW, we *change* tif during context switches.
>
>
> The race looks like this:
>
>     testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP)
>     jnz int_ret_from_sys_call_fixup    /* Go the the slow path */
>
> --- preempted here, switch to KVM guest ---
>
> KVM guest enters and screws up, say, MSR_SYSCALL_MASK.  This wouldn't
> happen to be a *32-bit* KVM guest, perhaps?
>
> Now KVM schedules, calling __switch_to.  __switch_to sets
> _TIF_USER_RETURN_NOTIFY.  We IRET back to the syscall exit code, turn
> off interrupts, and do sysret.  We are now screwed.
>
> I don't know why this manifests in this particular failure, but any
> number of terrible things could happen now.
>
> FWIW, this will affect things other than KVM.  For example, SIGKILL
> sent while a process is sleeping in that two-instruction window won't
> work.
>
> Takashi, can you re-send your patch so we can review it for real in
> light of this race?

Never mind, I'm testing a slightly fancier patch.

--Andy

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-23 18:38                                                   ` Andy Lutomirski
  2015-03-23 18:48                                                     ` Andy Lutomirski
@ 2015-03-23 18:54                                                     ` Stefan Seyfried
  2015-03-23 18:56                                                     ` Takashi Iwai
  2015-03-23 19:07                                                     ` Denys Vlasenko
  3 siblings, 0 replies; 77+ messages in thread
From: Stefan Seyfried @ 2015-03-23 18:54 UTC (permalink / raw)
  To: Andy Lutomirski, Denys Vlasenko
  Cc: Takashi Iwai, Denys Vlasenko, Jiri Kosina, Linus Torvalds,
	X86 ML, LKML, Tejun Heo

On 23.03.2015 at 19:38, Andy Lutomirski wrote:
> I bet I see it.  I have the advantage of having stared at KVM code and
> cursed at it more recently than you, I suspect.  KVM does awful, awful
> things to CPU state, and, as an optimization, it allows kernel code to
> run with CPU state that would be totally invalid in user mode.  This
> happens through a bunch of hooks, including this bit in __switch_to:
> 
>     /*
>      * Now maybe reload the debug registers and handle I/O bitmaps
>      */
>     if (unlikely(task_thread_info(next_p)->flags & _TIF_WORK_CTXSW_NEXT ||
>              task_thread_info(prev_p)->flags & _TIF_WORK_CTXSW_PREV))
>         __switch_to_xtra(prev_p, next_p, tss);
> 
> IOW, we *change* tif during context switches.
> 
> 
> The race looks like this:
> 
>     testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP)
>     jnz int_ret_from_sys_call_fixup    /* Go the the slow path */
> 
> --- preempted here, switch to KVM guest ---
> 
> KVM guest enters and screws up, say, MSR_SYSCALL_MASK.  This wouldn't
> happen to be a *32-bit* KVM guest, perhaps?

Not in my case (Penryn CPU); there it was 64-bit guests.

> Now KVM schedules, calling __switch_to.  __switch_to sets
> _TIF_USER_RETURN_NOTIFY.  We IRET back to the syscall exit code, turn
> off interrupts, and do sysret.  We are now screwed.
> 
> I don't know why this manifests in this particular failure, but any
> number of terrible things could happen now.
> 
> FWIW, this will affect things other than KVM.  For example, SIGKILL
> sent while a process is sleeping in that two-instruction window won't
> work.
> 
> Takashi, can you re-send your patch so we can review it for real in
> light of this race?
-- 
Stefan Seyfried
Linux Consultant & Developer -- GPG Key: 0x731B665B

B1 Systems GmbH
Osterfeldstraße 7 / 85088 Vohburg / http://www.b1-systems.de
GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt,HRB 3537

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-23 18:38                                                   ` Andy Lutomirski
  2015-03-23 18:48                                                     ` Andy Lutomirski
  2015-03-23 18:54                                                     ` PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related? Stefan Seyfried
@ 2015-03-23 18:56                                                     ` Takashi Iwai
  2015-03-23 19:07                                                     ` Denys Vlasenko
  3 siblings, 0 replies; 77+ messages in thread
From: Takashi Iwai @ 2015-03-23 18:56 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Denys Vlasenko, Denys Vlasenko, Jiri Kosina, Linus Torvalds,
	Stefan Seyfried, X86 ML, LKML, Tejun Heo

At Mon, 23 Mar 2015 11:38:30 -0700,
Andy Lutomirski wrote:
> 
> On Mon, Mar 23, 2015 at 9:07 AM, Denys Vlasenko <dvlasenk@redhat.com> wrote:
> > On 03/23/2015 02:22 PM, Takashi Iwai wrote:
> >> At Mon, 23 Mar 2015 10:35:41 +0100,
> >> Takashi Iwai wrote:
> >>>
> >>> At Mon, 23 Mar 2015 10:02:52 +0100,
> >>> Takashi Iwai wrote:
> >>>>
> >>>> At Fri, 20 Mar 2015 19:16:53 +0100,
> >>>> Denys Vlasenko wrote:
> 
> >> I'm really puzzled now.  We have a few pieces of information:
> >>
> >> - git bisection pointed the commit 96b6352c1271:
> >>     x86_64, entry: Remove the syscall exit audit and schedule optimizations
> >>   and reverting this "fixes" the problem indeed.  Even just moving two
> >>   lines
> >>     LOCKDEP_SYS_EXIT
> >>     DISABLE_INTERRUPTS(CLBR_NONE)
> >>   at the beginning of ret_from_sys_call already fixes.  (Of course I
> >>   can't prove the fix but it stabilizes for a day without crash while
> >>   usually I hit the bug in 10 minutes in full test running.)
> >
> > The commit 96b6352c1271 moved TIF_ALLWORK_MASK check from
> > interrupt-disabled region to interrupt-enabled:
> >
> >         cmpq $__NR_syscall_max,%rax
> >         ja ret_from_sys_call
> >         movq %r10,%rcx
> >         call *sys_call_table(,%rax,8)  # XXX:    rip relative
> >         movq %rax,RAX-ARGOFFSET(%rsp)
> > ret_from_sys_call:
> >         testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >         jnz int_ret_from_sys_call_fixup /* Go the the slow path */
> >         LOCKDEP_SYS_EXIT
> >         DISABLE_INTERRUPTS(CLBR_NONE)
> >         TRACE_IRQS_OFF
> > ...
> > ...
> > int_ret_from_sys_call_fixup:
> >         FIXUP_TOP_OF_STACK %r11, -ARGOFFSET
> >         jmp int_ret_from_sys_call
> > ...
> > ...
> > GLOBAL(int_ret_from_sys_call)
> >         DISABLE_INTERRUPTS(CLBR_NONE)
> >         TRACE_IRQS_OFF
> >
> > You reverted that by moving this insn to be after first DISABLE_INTERRUPTS(CLBR_NONE).
> >
> > I also don't see how moving that check (even if it is wrong in a more
> > benign way) can have such a drastic effect.
> 
> I bet I see it.  I have the advantage of having stared at KVM code and
> cursed at it more recently than you, I suspect.  KVM does awful, awful
> things to CPU state, and, as an optimization, it allows kernel code to
> run with CPU state that would be totally invalid in user mode.  This
> happens through a bunch of hooks, including this bit in __switch_to:
> 
>     /*
>      * Now maybe reload the debug registers and handle I/O bitmaps
>      */
>     if (unlikely(task_thread_info(next_p)->flags & _TIF_WORK_CTXSW_NEXT ||
>              task_thread_info(prev_p)->flags & _TIF_WORK_CTXSW_PREV))
>         __switch_to_xtra(prev_p, next_p, tss);
> 
> IOW, we *change* tif during context switches.
> 
> 
> The race looks like this:
> 
>     testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP)
>     jnz int_ret_from_sys_call_fixup    /* Go the the slow path */
> 
> --- preempted here, switch to KVM guest ---
> 
> KVM guest enters and screws up, say, MSR_SYSCALL_MASK.  This wouldn't
> happen to be a *32-bit* KVM guest, perhaps?
> 
> Now KVM schedules, calling __switch_to.  __switch_to sets
> _TIF_USER_RETURN_NOTIFY.  We IRET back to the syscall exit code, turn
> off interrupts, and do sysret.  We are now screwed.

Thanks for the explanation!  That looks like a plausible scenario.
(I only tested with a 64-bit KVM guest, BTW.)

> I don't know why this manifests in this particular failure, but any
> number of terrible things could happen now.
> 
> FWIW, this will affect things other than KVM.  For example, SIGKILL
> sent while a process is sleeping in that two-instruction window won't
> work.
> 
> Takashi, can you re-send your patch so we can review it for real in
> light of this race?

The patch below worked.  I'll double-check tomorrow whether it
really cures the problem reliably.


thanks,

Takashi

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 1d74d161687c..5340ac7f88a9 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -364,12 +364,12 @@ system_call_fastpath:
  * Has incomplete stack frame and undefined top of stack.
  */
 ret_from_sys_call:
-	testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
-	jnz int_ret_from_sys_call_fixup	/* Go the the slow path */
-
 	LOCKDEP_SYS_EXIT
 	DISABLE_INTERRUPTS(CLBR_NONE)
 	TRACE_IRQS_OFF
+	testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
+	jnz int_ret_from_sys_call_fixup	/* Go the the slow path */
+
 	CFI_REMEMBER_STATE
 	/*
 	 * sysretq will re-enable interrupts:

^ permalink raw reply related	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-23 18:48                                                     ` Andy Lutomirski
@ 2015-03-23 18:59                                                       ` Takashi Iwai
  2015-03-23 19:10                                                         ` [PATCH] x86, entry: Check for syscall exit work with IRQs disabled Andy Lutomirski
  0 siblings, 1 reply; 77+ messages in thread
From: Takashi Iwai @ 2015-03-23 18:59 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Denys Vlasenko, Takashi Iwai, Denys Vlasenko, Jiri Kosina,
	Linus Torvalds, Stefan Seyfried, X86 ML, LKML, Tejun Heo

At Mon, 23 Mar 2015 11:48:42 -0700,
Andy Lutomirski wrote:
> 
> On Mon, Mar 23, 2015 at 11:38 AM, Andy Lutomirski <luto@amacapital.net> wrote:
> > On Mon, Mar 23, 2015 at 9:07 AM, Denys Vlasenko <dvlasenk@redhat.com> wrote:
> >> On 03/23/2015 02:22 PM, Takashi Iwai wrote:
> >>> At Mon, 23 Mar 2015 10:35:41 +0100,
> >>> Takashi Iwai wrote:
> >>>>
> >>>> At Mon, 23 Mar 2015 10:02:52 +0100,
> >>>> Takashi Iwai wrote:
> >>>>>
> >>>>> At Fri, 20 Mar 2015 19:16:53 +0100,
> >>>>> Denys Vlasenko wrote:
> >
> >>> I'm really puzzled now.  We have a few pieces of information:
> >>>
> >>> - git bisection pointed the commit 96b6352c1271:
> >>>     x86_64, entry: Remove the syscall exit audit and schedule optimizations
> >>>   and reverting this "fixes" the problem indeed.  Even just moving two
> >>>   lines
> >>>     LOCKDEP_SYS_EXIT
> >>>     DISABLE_INTERRUPTS(CLBR_NONE)
> >>>   at the beginning of ret_from_sys_call already fixes.  (Of course I
> >>>   can't prove the fix but it stabilizes for a day without crash while
> >>>   usually I hit the bug in 10 minutes in full test running.)
> >>
> >> The commit 96b6352c1271 moved TIF_ALLWORK_MASK check from
> >> interrupt-disabled region to interrupt-enabled:
> >>
> >>         cmpq $__NR_syscall_max,%rax
> >>         ja ret_from_sys_call
> >>         movq %r10,%rcx
> >>         call *sys_call_table(,%rax,8)  # XXX:    rip relative
> >>         movq %rax,RAX-ARGOFFSET(%rsp)
> >> ret_from_sys_call:
> >>         testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
> >> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >>         jnz int_ret_from_sys_call_fixup /* Go the the slow path */
> >>         LOCKDEP_SYS_EXIT
> >>         DISABLE_INTERRUPTS(CLBR_NONE)
> >>         TRACE_IRQS_OFF
> >> ...
> >> ...
> >> int_ret_from_sys_call_fixup:
> >>         FIXUP_TOP_OF_STACK %r11, -ARGOFFSET
> >>         jmp int_ret_from_sys_call
> >> ...
> >> ...
> >> GLOBAL(int_ret_from_sys_call)
> >>         DISABLE_INTERRUPTS(CLBR_NONE)
> >>         TRACE_IRQS_OFF
> >>
> >> You reverted that by moving this insn to be after first DISABLE_INTERRUPTS(CLBR_NONE).
> >>
> >> I also don't see how moving that check (even if it is wrong in a more
> >> benign way) can have such a drastic effect.
> >
> > I bet I see it.  I have the advantage of having stared at KVM code and
> > cursed at it more recently than you, I suspect.  KVM does awful, awful
> > things to CPU state, and, as an optimization, it allows kernel code to
> > run with CPU state that would be totally invalid in user mode.  This
> > happens through a bunch of hooks, including this bit in __switch_to:
> >
> >     /*
> >      * Now maybe reload the debug registers and handle I/O bitmaps
> >      */
> >     if (unlikely(task_thread_info(next_p)->flags & _TIF_WORK_CTXSW_NEXT ||
> >              task_thread_info(prev_p)->flags & _TIF_WORK_CTXSW_PREV))
> >         __switch_to_xtra(prev_p, next_p, tss);
> >
> > IOW, we *change* tif during context switches.
> >
> >
> > The race looks like this:
> >
> >     testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP)
> >     jnz int_ret_from_sys_call_fixup    /* Go the the slow path */
> >
> > --- preempted here, switch to KVM guest ---
> >
> > KVM guest enters and screws up, say, MSR_SYSCALL_MASK.  This wouldn't
> > happen to be a *32-bit* KVM guest, perhaps?
> >
> > Now KVM schedules, calling __switch_to.  __switch_to sets
> > _TIF_USER_RETURN_NOTIFY.  We IRET back to the syscall exit code, turn
> > off interrupts, and do sysret.  We are now screwed.
> >
> > I don't know why this manifests in this particular failure, but any
> > number of terrible things could happen now.
> >
> > FWIW, this will affect things other than KVM.  For example, SIGKILL
> > sent while a process is sleeping in that two-instruction window won't
> > work.
> >
> > Takashi, can you re-send your patch so we can review it for real in
> > light of this race?
> 
> Never mind, I'm testing a slightly fancier patch.

OK, I'll wait for your test patch.


thanks,

Takashi

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-23 18:38                                                   ` Andy Lutomirski
                                                                       ` (2 preceding siblings ...)
  2015-03-23 18:56                                                     ` Takashi Iwai
@ 2015-03-23 19:07                                                     ` Denys Vlasenko
  2015-03-23 19:10                                                       ` Andy Lutomirski
  3 siblings, 1 reply; 77+ messages in thread
From: Denys Vlasenko @ 2015-03-23 19:07 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Takashi Iwai, Denys Vlasenko, Jiri Kosina, Linus Torvalds,
	Stefan Seyfried, X86 ML, LKML, Tejun Heo

On 03/23/2015 07:38 PM, Andy Lutomirski wrote:
>>         cmpq $__NR_syscall_max,%rax
>>         ja ret_from_sys_call
>>         movq %r10,%rcx
>>         call *sys_call_table(,%rax,8)  # XXX:    rip relative
>>         movq %rax,RAX-ARGOFFSET(%rsp)
>> ret_from_sys_call:
>>         testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>         jnz int_ret_from_sys_call_fixup /* Go the the slow path */
>>         LOCKDEP_SYS_EXIT
>>         DISABLE_INTERRUPTS(CLBR_NONE)
>>         TRACE_IRQS_OFF
>> ...
>> ...
>> int_ret_from_sys_call_fixup:
>>         FIXUP_TOP_OF_STACK %r11, -ARGOFFSET
>>         jmp int_ret_from_sys_call
>> ...
>> ...
>> GLOBAL(int_ret_from_sys_call)
>>         DISABLE_INTERRUPTS(CLBR_NONE)
>>         TRACE_IRQS_OFF
>>
>> You reverted that by moving this insn to be after first DISABLE_INTERRUPTS(CLBR_NONE).
>>
>> I also don't see how moving that check (even if it is wrong in a more
>> benign way) can have such a drastic effect.
> 
> I bet I see it.  I have the advantage of having stared at KVM code and
> cursed at it more recently than you, I suspect.  KVM does awful, awful
> things to CPU state, and, as an optimization, it allows kernel code to
> run with CPU state that would be totally invalid in user mode.  This
> happens through a bunch of hooks, including this bit in __switch_to:
> 
>     /*
>      * Now maybe reload the debug registers and handle I/O bitmaps
>      */
>     if (unlikely(task_thread_info(next_p)->flags & _TIF_WORK_CTXSW_NEXT ||
>              task_thread_info(prev_p)->flags & _TIF_WORK_CTXSW_PREV))
>         __switch_to_xtra(prev_p, next_p, tss);
> 
> IOW, we *change* tif during context switches.
> 
> 
> The race looks like this:
> 
>     testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP)
>     jnz int_ret_from_sys_call_fixup    /* Go the the slow path */
> 
> --- preempted here, switch to KVM guest ---
> 
> KVM guest enters and screws up, say, MSR_SYSCALL_MASK.  This wouldn't
> happen to be a *32-bit* KVM guest, perhaps?
> 
> Now KVM schedules, calling __switch_to.  __switch_to sets
> _TIF_USER_RETURN_NOTIFY.

Clear up to now...

> We IRET back to the syscall exit code,

So we end up being just after the "testl", right?
We go into "int_ret_from_sys_call_fixup".
We FIXUP_TOP_OF_STACK - now iret frame contains correct values.
Then we jump to "int_ret_from_sys_call".

> turn off interrupts, and do sysret.  We are now screwed.

I don't understand. Where exactly would it go wrong?

On sysret, rsp would be restored from PER_CPU(old_rsp), right?
We'd end up in *userspace* with userspace rsp.

What's more, since we FIXUPed the iret frame, it does not even matter
how we exit to userspace: either sysret or iret would work.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-23 19:07                                                     ` Denys Vlasenko
@ 2015-03-23 19:10                                                       ` Andy Lutomirski
  0 siblings, 0 replies; 77+ messages in thread
From: Andy Lutomirski @ 2015-03-23 19:10 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Takashi Iwai, Denys Vlasenko, Jiri Kosina, Linus Torvalds,
	Stefan Seyfried, X86 ML, LKML, Tejun Heo

On Mon, Mar 23, 2015 at 12:07 PM, Denys Vlasenko <dvlasenk@redhat.com> wrote:
> On 03/23/2015 07:38 PM, Andy Lutomirski wrote:
>>>         cmpq $__NR_syscall_max,%rax
>>>         ja ret_from_sys_call
>>>         movq %r10,%rcx
>>>         call *sys_call_table(,%rax,8)  # XXX:    rip relative
>>>         movq %rax,RAX-ARGOFFSET(%rsp)
>>> ret_from_sys_call:
>>>         testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>         jnz int_ret_from_sys_call_fixup /* Go the the slow path */
>>>         LOCKDEP_SYS_EXIT
>>>         DISABLE_INTERRUPTS(CLBR_NONE)
>>>         TRACE_IRQS_OFF
>>> ...
>>> ...
>>> int_ret_from_sys_call_fixup:
>>>         FIXUP_TOP_OF_STACK %r11, -ARGOFFSET
>>>         jmp int_ret_from_sys_call
>>> ...
>>> ...
>>> GLOBAL(int_ret_from_sys_call)
>>>         DISABLE_INTERRUPTS(CLBR_NONE)
>>>         TRACE_IRQS_OFF
>>>
>>> You reverted that by moving this insn to be after first DISABLE_INTERRUPTS(CLBR_NONE).
>>>
>>> I also don't see how moving that check (even if it is wrong in a more
>>> benign way) can have such a drastic effect.
>>
>> I bet I see it.  I have the advantage of having stared at KVM code and
>> cursed at it more recently than you, I suspect.  KVM does awful, awful
>> things to CPU state, and, as an optimization, it allows kernel code to
>> run with CPU state that would be totally invalid in user mode.  This
>> happens through a bunch of hooks, including this bit in __switch_to:
>>
>>     /*
>>      * Now maybe reload the debug registers and handle I/O bitmaps
>>      */
>>     if (unlikely(task_thread_info(next_p)->flags & _TIF_WORK_CTXSW_NEXT ||
>>              task_thread_info(prev_p)->flags & _TIF_WORK_CTXSW_PREV))
>>         __switch_to_xtra(prev_p, next_p, tss);
>>
>> IOW, we *change* tif during context switches.
>>
>>
>> The race looks like this:
>>
>>     testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP)
>>     jnz int_ret_from_sys_call_fixup    /* Go the the slow path */
>>
>> --- preempted here, switch to KVM guest ---
>>
>> KVM guest enters and screws up, say, MSR_SYSCALL_MASK.  This wouldn't
>> happen to be a *32-bit* KVM guest, perhaps?
>>
>> Now KVM schedules, calling __switch_to.  __switch_to sets
>> _TIF_USER_RETURN_NOTIFY.
>
> Clear up to now...
>
>> We IRET back to the syscall exit code,
>
> So we end up being just after the "testl", right?
> We go into "int_ret_from_sys_call_fixup".

Nope, other way around.  We saw no work bits set in testl, but one or
more of those bits got set while we were preempted.  So when we resume,
we *don't* go to int_ret_from_sys_call_fixup.  I don't think that the
resulting sysret itself is harmful, but I think we're now running user
code with some MSRs programmed wrong.  The next syscall could do bad
things, such as failing to clear IF.
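
To make the window concrete, here is a toy C model of the two orderings.
It is only a sketch, not kernel code: every sim_* name and the flag value
are made up for illustration, and "preemption" is reduced to a helper that
sets a work flag whenever interrupts are on.

```c
#include <stdbool.h>

/* Hypothetical flag value, for illustration only. */
#define SIM_TIF_USER_RETURN_NOTIFY (1u << 0)

/* Toy stand-in for thread_info plus the CPU's interrupt-enable state. */
struct sim_thread {
	unsigned int flags;
	bool irqs_on;
};

/*
 * Models "preempted, another task runs, __switch_to queues return work".
 * It can only happen while interrupts are enabled.
 */
static void sim_maybe_preempt(struct sim_thread *t)
{
	if (t->irqs_on)
		t->flags |= SIM_TIF_USER_RETURN_NOTIFY;
}

/* Slow path: process and clear all pending exit work. */
static void sim_slow_path(struct sim_thread *t)
{
	t->flags = 0;
}

/*
 * Racy order: check flags first, disable interrupts second.
 * Returns the flags still pending at the simulated sysret.
 */
static unsigned int sim_exit_racy(struct sim_thread *t)
{
	bool work = (t->flags != 0);	/* testl, IRQs still on */

	sim_maybe_preempt(t);		/* the two-instruction window */
	t->irqs_on = false;		/* DISABLE_INTERRUPTS */
	if (work)
		sim_slow_path(t);
	return t->flags;
}

/* Fixed order: disable interrupts first, then check flags. */
static unsigned int sim_exit_fixed(struct sim_thread *t)
{
	sim_maybe_preempt(t);		/* preemption before the check is harmless */
	t->irqs_on = false;		/* DISABLE_INTERRUPTS */
	sim_maybe_preempt(t);		/* no-op: IRQs are off */
	if (t->flags != 0)
		sim_slow_path(t);
	return t->flags;
}
```

Starting from a thread with no flags set and interrupts on, the racy
version returns with the notify bit still pending, while the fixed
version returns with nothing pending.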

--Andy

^ permalink raw reply	[flat|nested] 77+ messages in thread

* [PATCH] x86, entry: Check for syscall exit work with IRQs disabled
  2015-03-23 18:59                                                       ` Takashi Iwai
@ 2015-03-23 19:10                                                         ` Andy Lutomirski
  2015-03-23 19:21                                                           ` Denys Vlasenko
                                                                             ` (3 more replies)
  0 siblings, 4 replies; 77+ messages in thread
From: Andy Lutomirski @ 2015-03-23 19:10 UTC (permalink / raw)
  To: Takashi Iwai, Denys Vlasenko, Stefan Seyfried, X86 ML
  Cc: Jiri Kosina, LKML, Tejun Heo, Andy Lutomirski

We currently have a race: if we're preempted during syscall exit, we
can fail to process syscall return work that is queued up while
we're preempted in ret_from_sys_call after checking ti.flags.

Fix it by disabling interrupts before checking ti.flags.

Fixes: 96b6352c1271 x86_64, entry: Remove the syscall exit audit and schedule optimizations
Reported-by: Stefan Seyfried <stefan.seyfried@googlemail.com>
Reported-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---

Ingo, I don't understand the LOCKDEP_SYS_EXIT stuff.  Can you take a quick
look to confirm that it's okay to call it more than once?

 arch/x86/kernel/entry_64.S | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 1d74d161687c..2babb393915e 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -364,12 +364,21 @@ system_call_fastpath:
  * Has incomplete stack frame and undefined top of stack.
  */
 ret_from_sys_call:
-	testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
-	jnz int_ret_from_sys_call_fixup	/* Go the the slow path */
-
 	LOCKDEP_SYS_EXIT
 	DISABLE_INTERRUPTS(CLBR_NONE)
 	TRACE_IRQS_OFF
+
+	/*
+	 * We must check ti flags with interrupts (or at least preemption)
+	 * off because we must *never* return to userspace without
+	 * processing exit work that is enqueued if we're preempted here.
+	 * In particular, returning to userspace with any of the one-shot
+	 * flags (TIF_NOTIFY_RESUME, TIF_USER_RETURN_NOTIFY, etc) set is
+	 * very bad.
+	 */
+	testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
+	jnz int_ret_from_sys_call_fixup	/* Go the the slow path */
+
 	CFI_REMEMBER_STATE
 	/*
 	 * sysretq will re-enable interrupts:
@@ -386,7 +395,7 @@ ret_from_sys_call:
 
 int_ret_from_sys_call_fixup:
 	FIXUP_TOP_OF_STACK %r11, -ARGOFFSET
-	jmp int_ret_from_sys_call
+	jmp int_ret_from_sys_call_irqs_off
 
 	/* Do syscall tracing */
 tracesys:
@@ -432,6 +441,7 @@ tracesys_phase2:
 GLOBAL(int_ret_from_sys_call)
 	DISABLE_INTERRUPTS(CLBR_NONE)
 	TRACE_IRQS_OFF
+int_ret_from_sys_call_irqs_off:
 	movl $_TIF_ALLWORK_MASK,%edi
 	/* edi:	mask to check */
 GLOBAL(int_with_check)
-- 
2.3.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* Re: [PATCH] x86, entry: Check for syscall exit work with IRQs disabled
  2015-03-23 19:10                                                         ` [PATCH] x86, entry: Check for syscall exit work with IRQs disabled Andy Lutomirski
@ 2015-03-23 19:21                                                           ` Denys Vlasenko
  2015-03-23 19:27                                                             ` Andy Lutomirski
  2015-03-24 11:17                                                           ` Takashi Iwai
                                                                             ` (2 subsequent siblings)
  3 siblings, 1 reply; 77+ messages in thread
From: Denys Vlasenko @ 2015-03-23 19:21 UTC (permalink / raw)
  To: Andy Lutomirski, Takashi Iwai, Stefan Seyfried, X86 ML
  Cc: Jiri Kosina, LKML, Tejun Heo

On 03/23/2015 08:10 PM, Andy Lutomirski wrote:
> We currently have a race: if we're preempted during syscall exit, we
> can fail to process syscall return work that is queued up while
> we're preempted in ret_from_sys_call after checking ti.flags.
> 
> Fix it by disabling interrupts before checking ti.flags.
> 
> Fixes: 96b6352c1271 x86_64, entry: Remove the syscall exit audit and schedule optimizations
> Reported-by: Stefan Seyfried <stefan.seyfried@googlemail.com>
> Reported-by: Takashi Iwai <tiwai@suse.de>
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> ---
> 
> Ingo, I don't understand the LOCKDEP_SYS_EXIT stuff.  Can you take a quick
> look to confirm that it's okay to call it more than once?
> 
> arch/x86/kernel/entry_64.S | 18 ++++++++++++++----
>  1 file changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
> index 1d74d161687c..2babb393915e 100644
> --- a/arch/x86/kernel/entry_64.S
> +++ b/arch/x86/kernel/entry_64.S
> @@ -364,12 +364,21 @@ system_call_fastpath:
>   * Has incomplete stack frame and undefined top of stack.
>   */
>  ret_from_sys_call:
> -	testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
> -	jnz int_ret_from_sys_call_fixup	/* Go the the slow path */
> -
>  	LOCKDEP_SYS_EXIT
>  	DISABLE_INTERRUPTS(CLBR_NONE)
>  	TRACE_IRQS_OFF
> +
> +	/*
> +	 * We must check ti flags with interrupts (or at least preemption)
> +	 * off because we must *never* return to userspace without
> +	 * processing exit work that is enqueued if we're preempted here.
> +	 * In particular, returning to userspace with any of the one-shot
> +	 * flags (TIF_NOTIFY_RESUME, TIF_USER_RETURN_NOTIFY, etc) set is
> +	 * very bad.
> +	 */
> +	testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
> +	jnz int_ret_from_sys_call_fixup	/* Go the the slow path */
                                           ^^^^^^^^^^^^^^^^^^^^

typo here; s/the the/to the/


> +
>  	CFI_REMEMBER_STATE
>  	/*
>  	 * sysretq will re-enable interrupts:
> @@ -386,7 +395,7 @@ ret_from_sys_call:
>  
>  int_ret_from_sys_call_fixup:
>  	FIXUP_TOP_OF_STACK %r11, -ARGOFFSET
> -	jmp int_ret_from_sys_call
> +	jmp int_ret_from_sys_call_irqs_off
>  
>  	/* Do syscall tracing */
>  tracesys:
> @@ -432,6 +441,7 @@ tracesys_phase2:
>  GLOBAL(int_ret_from_sys_call)
>  	DISABLE_INTERRUPTS(CLBR_NONE)
>  	TRACE_IRQS_OFF
> +int_ret_from_sys_call_irqs_off:
>  	movl $_TIF_ALLWORK_MASK,%edi
>  	/* edi:	mask to check */
>  GLOBAL(int_with_check)


You can avoid having to know about LOCKDEP_SYS_EXIT :)
Just set %edi = $_TIF_ALLWORK_MASK, and jump a bit farther:


        movl $_TIF_ALLWORK_MASK,%edi
        testl %edi,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
        jnz int_ret_from_sys_call_fixup /* Go to the slow path */
...
...
GLOBAL(int_ret_from_sys_call)
        DISABLE_INTERRUPTS(CLBR_NONE)
        TRACE_IRQS_OFF
        movl $_TIF_ALLWORK_MASK,%edi
        /* edi: mask to check */
GLOBAL(int_with_check)
        LOCKDEP_SYS_EXIT_IRQ
int_ret_from_sys_call_irqs_off:  <========== HERE


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH] x86, entry: Check for syscall exit work with IRQs disabled
  2015-03-23 19:21                                                           ` Denys Vlasenko
@ 2015-03-23 19:27                                                             ` Andy Lutomirski
  2015-03-23 19:32                                                               ` Andy Lutomirski
  0 siblings, 1 reply; 77+ messages in thread
From: Andy Lutomirski @ 2015-03-23 19:27 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Andy Lutomirski, Takashi Iwai, Stefan Seyfried, X86 ML,
	Jiri Kosina, LKML, Tejun Heo

On Mon, Mar 23, 2015 at 12:21 PM, Denys Vlasenko <dvlasenk@redhat.com> wrote:
> On 03/23/2015 08:10 PM, Andy Lutomirski wrote:
>> We currently have a race: if we're preempted during syscall exit, we
>> can fail to process syscall return work that is queued up while
>> we're preempted in ret_from_sys_call after checking ti.flags.
>>
>> Fix it by disabling interrupts before checking ti.flags.
>>
>> Fixes: 96b6352c1271 x86_64, entry: Remove the syscall exit audit and schedule optimizations
>> Reported-by: Stefan Seyfried <stefan.seyfried@googlemail.com>
>> Reported-by: Takashi Iwai <tiwai@suse.de>
>> Signed-off-by: Andy Lutomirski <luto@kernel.org>
>> ---
>>
>> Ingo, I don't understand the LOCKDEP_SYS_EXIT stuff.  Can you take a quick
>> look to confirm that it's okay to call it more than once?
>>
>> arch/x86/kernel/entry_64.S | 18 ++++++++++++++----
>>  1 file changed, 14 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
>> index 1d74d161687c..2babb393915e 100644
>> --- a/arch/x86/kernel/entry_64.S
>> +++ b/arch/x86/kernel/entry_64.S
>> @@ -364,12 +364,21 @@ system_call_fastpath:
>>   * Has incomplete stack frame and undefined top of stack.
>>   */
>>  ret_from_sys_call:
>> -     testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
>> -     jnz int_ret_from_sys_call_fixup /* Go the the slow path */
>> -
>>       LOCKDEP_SYS_EXIT
>>       DISABLE_INTERRUPTS(CLBR_NONE)
>>       TRACE_IRQS_OFF
>> +
>> +     /*
>> +      * We must check ti flags with interrupts (or at least preemption)
>> +      * off because we must *never* return to userspace without
>> +      * processing exit work that is enqueued if we're preempted here.
>> +      * In particular, returning to userspace with any of the one-shot
>> +      * flags (TIF_NOTIFY_RESUME, TIF_USER_RETURN_NOTIFY, etc) set is
>> +      * very bad.
>> +      */
>> +     testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
>> +     jnz int_ret_from_sys_call_fixup /* Go the the slow path */
>                                            ^^^^^^^^^^^^^^^^^^^^
>
> typo here; s/the the/to the/

Whoops.

>
>
>> +
>>       CFI_REMEMBER_STATE
>>       /*
>>        * sysretq will re-enable interrupts:
>> @@ -386,7 +395,7 @@ ret_from_sys_call:
>>
>>  int_ret_from_sys_call_fixup:
>>       FIXUP_TOP_OF_STACK %r11, -ARGOFFSET
>> -     jmp int_ret_from_sys_call
>> +     jmp int_ret_from_sys_call_irqs_off
>>
>>       /* Do syscall tracing */
>>  tracesys:
>> @@ -432,6 +441,7 @@ tracesys_phase2:
>>  GLOBAL(int_ret_from_sys_call)
>>       DISABLE_INTERRUPTS(CLBR_NONE)
>>       TRACE_IRQS_OFF
>> +int_ret_from_sys_call_irqs_off:
>>       movl $_TIF_ALLWORK_MASK,%edi
>>       /* edi: mask to check */
>>  GLOBAL(int_with_check)
>
>
> You can avoid having to know LOCKDEP_SYS_EXIT    :)
> Just set %edi = $_TIF_ALLWORK_MASK, and jump a bit farther:
>
>
>         movl $_TIF_ALLWORK_MASK,%edi
>         testl %edi,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
>         jnz int_ret_from_sys_call_fixup /* Go to the slow path */
> ...
> ...
> GLOBAL(int_ret_from_sys_call)
>         DISABLE_INTERRUPTS(CLBR_NONE)
>         TRACE_IRQS_OFF
>         movl $_TIF_ALLWORK_MASK,%edi
>         /* edi: mask to check */
> GLOBAL(int_with_check)
>         LOCKDEP_SYS_EXIT_IRQ
> int_ret_from_sys_call_irqs_off:  <========== HERE
>

I didn't want to do that, because I really want to rewrite
int_ret_from_sys_call in C.

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH] x86, entry: Check for syscall exit work with IRQs disabled
  2015-03-23 19:27                                                             ` Andy Lutomirski
@ 2015-03-23 19:32                                                               ` Andy Lutomirski
  0 siblings, 0 replies; 77+ messages in thread
From: Andy Lutomirski @ 2015-03-23 19:32 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Andy Lutomirski, Takashi Iwai, Stefan Seyfried, X86 ML,
	Jiri Kosina, LKML, Tejun Heo

On Mon, Mar 23, 2015 at 12:27 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Mon, Mar 23, 2015 at 12:21 PM, Denys Vlasenko <dvlasenk@redhat.com> wrote:
>> On 03/23/2015 08:10 PM, Andy Lutomirski wrote:
>>> We currently have a race: if we're preempted during syscall exit, we
>>> can fail to process syscall return work that is queued up while
>>> we're preempted in ret_from_sys_call after checking ti.flags.
>>>
>>> Fix it by disabling interrupts before checking ti.flags.
>>>
>>> Fixes: 96b6352c1271 x86_64, entry: Remove the syscall exit audit and schedule optimizations
>>> Reported-by: Stefan Seyfried <stefan.seyfried@googlemail.com>
>>> Reported-by: Takashi Iwai <tiwai@suse.de>
>>> Signed-off-by: Andy Lutomirski <luto@kernel.org>
>>> ---
>>>
>>> Ingo, I don't understand the LOCKDEP_SYS_EXIT stuff.  Can you take a quick
>>> look to confirm that it's okay to call it more than once?
>>>
>>> arch/x86/kernel/entry_64.S | 18 ++++++++++++++----
>>>  1 file changed, 14 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
>>> index 1d74d161687c..2babb393915e 100644
>>> --- a/arch/x86/kernel/entry_64.S
>>> +++ b/arch/x86/kernel/entry_64.S
>>> @@ -364,12 +364,21 @@ system_call_fastpath:
>>>   * Has incomplete stack frame and undefined top of stack.
>>>   */
>>>  ret_from_sys_call:
>>> -     testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
>>> -     jnz int_ret_from_sys_call_fixup /* Go the the slow path */
>>> -
>>>       LOCKDEP_SYS_EXIT
>>>       DISABLE_INTERRUPTS(CLBR_NONE)
>>>       TRACE_IRQS_OFF
>>> +
>>> +     /*
>>> +      * We must check ti flags with interrupts (or at least preemption)
>>> +      * off because we must *never* return to userspace without
>>> +      * processing exit work that is enqueued if we're preempted here.
>>> +      * In particular, returning to userspace with any of the one-shot
>>> +      * flags (TIF_NOTIFY_RESUME, TIF_USER_RETURN_NOTIFY, etc) set is
>>> +      * very bad.
>>> +      */
>>> +     testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
>>> +     jnz int_ret_from_sys_call_fixup /* Go the the slow path */
>>                                            ^^^^^^^^^^^^^^^^^^^^
>>
>> typo here; s/the the/to the/
>
> Whoops.
>
>>
>>
>>> +
>>>       CFI_REMEMBER_STATE
>>>       /*
>>>        * sysretq will re-enable interrupts:
>>> @@ -386,7 +395,7 @@ ret_from_sys_call:
>>>
>>>  int_ret_from_sys_call_fixup:
>>>       FIXUP_TOP_OF_STACK %r11, -ARGOFFSET
>>> -     jmp int_ret_from_sys_call
>>> +     jmp int_ret_from_sys_call_irqs_off
>>>
>>>       /* Do syscall tracing */
>>>  tracesys:
>>> @@ -432,6 +441,7 @@ tracesys_phase2:
>>>  GLOBAL(int_ret_from_sys_call)
>>>       DISABLE_INTERRUPTS(CLBR_NONE)
>>>       TRACE_IRQS_OFF
>>> +int_ret_from_sys_call_irqs_off:
>>>       movl $_TIF_ALLWORK_MASK,%edi
>>>       /* edi: mask to check */
>>>  GLOBAL(int_with_check)
>>
>>
>> You can avoid having to know LOCKDEP_SYS_EXIT    :)
>> Just set %edi = $_TIF_ALLWORK_MASK, and jump a bit farther:
>>
>>
>>         movl $_TIF_ALLWORK_MASK,%edi
>>         testl %edi,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
>>         jnz int_ret_from_sys_call_fixup /* Go to the slow path */
>> ...
>> ...
>> GLOBAL(int_ret_from_sys_call)
>>         DISABLE_INTERRUPTS(CLBR_NONE)
>>         TRACE_IRQS_OFF
>>         movl $_TIF_ALLWORK_MASK,%edi
>>         /* edi: mask to check */
>> GLOBAL(int_with_check)
>>         LOCKDEP_SYS_EXIT_IRQ
>> int_ret_from_sys_call_irqs_off:  <========== HERE
>>
>
> I didn't want to do that, because I really want to rewrite
> int_ret_from_sys_call in C.
>

To say that better: I don't want to further spread the %edi garbage
around entry_64.S.  Saving a single load on the slow path isn't worth
any of this complexity, and, if we're going to rewrite it in C anyway,
then maybe we could consider microoptimizations like that later on.
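
For readers wondering what an exit-work loop in C might look like, here
is a rough, self-contained sketch.  It is an assumption about the general
shape only, not the actual or planned kernel code: the sim_* helpers, the
SIM_WORK_* bits and the "late_work" counter are all hypothetical.  The
point it illustrates is that the flags are always sampled with interrupts
off, and re-sampled after every pass through the work handlers.

```c
#include <stdbool.h>

/* Hypothetical work bits, for illustration only. */
#define SIM_WORK_RESCHED	(1u << 0)
#define SIM_WORK_NOTIFY		(1u << 1)
#define SIM_ALLWORK_MASK	(SIM_WORK_RESCHED | SIM_WORK_NOTIFY)

struct sim_task {
	unsigned int flags;
	bool irqs_on;
	int handled;	/* number of work items processed */
	int late_work;	/* interrupts that will queue more work */
};

static void sim_irq_disable(struct sim_task *t)
{
	t->irqs_on = false;
}

/* Enabling IRQs lets a pending "interrupt" queue more notify work. */
static void sim_irq_enable(struct sim_task *t)
{
	t->irqs_on = true;
	if (t->late_work > 0) {
		t->late_work--;
		t->flags |= SIM_WORK_NOTIFY;
	}
}

static void sim_handle(struct sim_task *t, unsigned int bit)
{
	t->flags &= ~bit;
	t->handled++;
}

/* Returns with IRQs off and no work pending, i.e. safe to exit. */
static void sim_exit_work_loop(struct sim_task *t)
{
	for (;;) {
		unsigned int work;

		sim_irq_disable(t);
		work = t->flags & SIM_ALLWORK_MASK;	/* sampled with IRQs off */
		if (!work)
			return;

		sim_irq_enable(t);			/* handlers may sleep */
		if (work & SIM_WORK_RESCHED)
			sim_handle(t, SIM_WORK_RESCHED);
		if (work & SIM_WORK_NOTIFY)
			sim_handle(t, SIM_WORK_NOTIFY);
	}
}
```

Because the only exit from the loop is the check made right after
sim_irq_disable(), work queued while the handlers ran is picked up on the
next iteration instead of being carried out to user mode.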

--Andy

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH] x86, entry: Check for syscall exit work with IRQs disabled
  2015-03-23 19:10                                                         ` [PATCH] x86, entry: Check for syscall exit work with IRQs disabled Andy Lutomirski
  2015-03-23 19:21                                                           ` Denys Vlasenko
@ 2015-03-24 11:17                                                           ` Takashi Iwai
  2015-03-24 20:08                                                           ` Ingo Molnar
  2015-03-25  9:13                                                           ` [tip:x86/asm] x86/asm/entry: " tip-bot for Andy Lutomirski
  3 siblings, 0 replies; 77+ messages in thread
From: Takashi Iwai @ 2015-03-24 11:17 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Denys Vlasenko, Stefan Seyfried, X86 ML, Jiri Kosina, LKML, Tejun Heo

At Mon, 23 Mar 2015 12:32:54 -0700,
Andy Lutomirski wrote:
> 
> We currently have a race: if we're preempted during syscall exit, we
> can fail to process syscall return work that is queued up while
> we're preempted in ret_from_sys_call after checking ti.flags.
> 
> Fix it by disabling interrupts before checking ti.flags.
> 
> Fixes: 96b6352c1271 x86_64, entry: Remove the syscall exit audit and schedule optimizations
> Reported-by: Stefan Seyfried <stefan.seyfried@googlemail.com>
> Reported-by: Takashi Iwai <tiwai@suse.de>

I tested the patch again and confirmed that it works with the latest
Linus tree.

Tested-by: Takashi Iwai <tiwai@suse.de>


thanks,

Takashi

> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> ---
> 
> Ingo, I don't understand the LOCKDEP_SYS_EXIT stuff.  Can you take a quick
> look to confirm that it's okay to call it more than once?
> 
> arch/x86/kernel/entry_64.S | 18 ++++++++++++++----
>  1 file changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
> index 1d74d161687c..2babb393915e 100644
> --- a/arch/x86/kernel/entry_64.S
> +++ b/arch/x86/kernel/entry_64.S
> @@ -364,12 +364,21 @@ system_call_fastpath:
>   * Has incomplete stack frame and undefined top of stack.
>   */
>  ret_from_sys_call:
> -	testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
> -	jnz int_ret_from_sys_call_fixup	/* Go the the slow path */
> -
>  	LOCKDEP_SYS_EXIT
>  	DISABLE_INTERRUPTS(CLBR_NONE)
>  	TRACE_IRQS_OFF
> +
> +	/*
> +	 * We must check ti flags with interrupts (or at least preemption)
> +	 * off because we must *never* return to userspace without
> +	 * processing exit work that is enqueued if we're preempted here.
> +	 * In particular, returning to userspace with any of the one-shot
> +	 * flags (TIF_NOTIFY_RESUME, TIF_USER_RETURN_NOTIFY, etc) set is
> +	 * very bad.
> +	 */
> +	testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
> +	jnz int_ret_from_sys_call_fixup	/* Go the the slow path */
> +
>  	CFI_REMEMBER_STATE
>  	/*
>  	 * sysretq will re-enable interrupts:
> @@ -386,7 +395,7 @@ ret_from_sys_call:
>  
>  int_ret_from_sys_call_fixup:
>  	FIXUP_TOP_OF_STACK %r11, -ARGOFFSET
> -	jmp int_ret_from_sys_call
> +	jmp int_ret_from_sys_call_irqs_off
>  
>  	/* Do syscall tracing */
>  tracesys:
> @@ -432,6 +441,7 @@ tracesys_phase2:
>  GLOBAL(int_ret_from_sys_call)
>  	DISABLE_INTERRUPTS(CLBR_NONE)
>  	TRACE_IRQS_OFF
> +int_ret_from_sys_call_irqs_off:
>  	movl $_TIF_ALLWORK_MASK,%edi
>  	/* edi:	mask to check */
>  GLOBAL(int_with_check)
> -- 
> 2.3.0
> 


* Re: [PATCH] x86, entry: Check for syscall exit work with IRQs disabled
  2015-03-23 19:10                                                         ` [PATCH] x86, entry: Check for syscall exit work with IRQs disabled Andy Lutomirski
  2015-03-23 19:21                                                           ` Denys Vlasenko
  2015-03-24 11:17                                                           ` Takashi Iwai
@ 2015-03-24 20:08                                                           ` Ingo Molnar
  2015-03-25  0:35                                                             ` Andy Lutomirski
  2015-03-25  9:13                                                           ` [tip:x86/asm] x86/asm/entry: " tip-bot for Andy Lutomirski
  3 siblings, 1 reply; 77+ messages in thread
From: Ingo Molnar @ 2015-03-24 20:08 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Takashi Iwai, Denys Vlasenko, Stefan Seyfried, X86 ML,
	Jiri Kosina, LKML, Tejun Heo


* Andy Lutomirski <luto@kernel.org> wrote:

> We currently have a race: if we're preempted during syscall exit, we
> can fail to process syscall return work that is queued up while
> we're preempted in ret_from_sys_call after checking ti.flags.
> 
> Fix it by disabling interrupts before checking ti.flags.
> 
> Fixes: 96b6352c1271 x86_64, entry: Remove the syscall exit audit and schedule optimizations
> Reported-by: Stefan Seyfried <stefan.seyfried@googlemail.com>
> Reported-by: Takashi Iwai <tiwai@suse.de>
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> ---
> 
> Ingo, I don't understand the LOCKDEP_SYS_EXIT stuff.  Can you take a quick
> look to confirm that it's okay to call it more than once?

In essence, LOCKDEP_SYS_EXIT wants to print this warning if we are
still holding a lock when returning from a syscall:

                printk("[ BUG: lock held when returning to user space! ]\n");

It manipulates no state and is not sensitive to whether it's called
before or after return-work processing.
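
To illustrate why a repeated call is harmless, here is a minimal user-space
sketch in C.  It assumes, as described above, that the check only reads the
number of locks held by the current task and reports when that number is
nonzero.  The names task_model, held_locks and sys_exit_check are
hypothetical; this is not the kernel's lockdep code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of a task; not a kernel structure. */
struct task_model {
    int held_locks;   /* number of locks the task currently holds */
};

/*
 * Model of the exit-time check.  It only reads the task, so calling it
 * any number of times gives the same answer and leaves the task unchanged.
 * Returns true if a warning was printed.
 */
static bool sys_exit_check(const struct task_model *t)
{
    assert(t != NULL);
    if (t->held_locks > 0) {
        fprintf(stderr, "[ BUG: lock held when returning to user space! ]\n");
        return true;
    }
    return false;
}
```

Because the model takes a const pointer, calling sys_exit_check() once before
the flag test and once more on the slow path cannot change what either call
reports.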

> arch/x86/kernel/entry_64.S | 18 ++++++++++++++----
>  1 file changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
> index 1d74d161687c..2babb393915e 100644
> --- a/arch/x86/kernel/entry_64.S
> +++ b/arch/x86/kernel/entry_64.S
> @@ -364,12 +364,21 @@ system_call_fastpath:
>   * Has incomplete stack frame and undefined top of stack.
>   */
>  ret_from_sys_call:
> -	testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
> -	jnz int_ret_from_sys_call_fixup	/* Go the the slow path */
> -
>  	LOCKDEP_SYS_EXIT
>  	DISABLE_INTERRUPTS(CLBR_NONE)
>  	TRACE_IRQS_OFF
> +
> +	/*
> +	 * We must check ti flags with interrupts (or at least preemption)
> +	 * off because we must *never* return to userspace without
> +	 * processing exit work that is enqueued if we're preempted here.
> +	 * In particular, returning to userspace with any of the one-shot
> +	 * flags (TIF_NOTIFY_RESUME, TIF_USER_RETURN_NOTIFY, etc) set is
> +	 * very bad.
> +	 */
> +	testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
> +	jnz int_ret_from_sys_call_fixup	/* Go the the slow path */

Should be safe to call it once again after user-work processing has 
been finished.

I've picked up your fix for tip:x86/urgent.

Thanks,

	Ingo


* Re: [PATCH] x86, entry: Check for syscall exit work with IRQs disabled
  2015-03-24 20:08                                                           ` Ingo Molnar
@ 2015-03-25  0:35                                                             ` Andy Lutomirski
  2015-03-25 12:21                                                               ` Ingo Molnar
  0 siblings, 1 reply; 77+ messages in thread
From: Andy Lutomirski @ 2015-03-25  0:35 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andy Lutomirski, Takashi Iwai, Denys Vlasenko, Stefan Seyfried,
	X86 ML, Jiri Kosina, LKML, Tejun Heo

On Tue, Mar 24, 2015 at 1:08 PM, Ingo Molnar <mingo@kernel.org> wrote:
>
> * Andy Lutomirski <luto@kernel.org> wrote:
>
>> We currently have a race: if we're preempted during syscall exit, we
>> can fail to process syscall return work that is queued up while
>> we're preempted in ret_from_sys_call after checking ti.flags.
>>
>> Fix it by disabling interrupts before checking ti.flags.
>>
>> Fixes: 96b6352c1271 x86_64, entry: Remove the syscall exit audit and schedule optimizations
>> Reported-by: Stefan Seyfried <stefan.seyfried@googlemail.com>
>> Reported-by: Takashi Iwai <tiwai@suse.de>
>> Signed-off-by: Andy Lutomirski <luto@kernel.org>
>> ---
>>
>> Ingo, I don't understand the LOCKDEP_SYS_EXIT stuff.  Can you take a quick
>> look to confirm that it's okay to call it more than once?
>
> So the essence is that it wants to print this warning if we are
> holding a lock after a syscall:
>
>                 printk("[ BUG: lock held when returning to user space! ]\n");
>
> it manipulates no state and is not sensitive to whether it's called
> before or after return-work processing.
>
>> arch/x86/kernel/entry_64.S | 18 ++++++++++++++----
>>  1 file changed, 14 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
>> index 1d74d161687c..2babb393915e 100644
>> --- a/arch/x86/kernel/entry_64.S
>> +++ b/arch/x86/kernel/entry_64.S
>> @@ -364,12 +364,21 @@ system_call_fastpath:
>>   * Has incomplete stack frame and undefined top of stack.
>>   */
>>  ret_from_sys_call:
>> -     testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
>> -     jnz int_ret_from_sys_call_fixup /* Go the the slow path */
>> -
>>       LOCKDEP_SYS_EXIT
>>       DISABLE_INTERRUPTS(CLBR_NONE)
>>       TRACE_IRQS_OFF
>> +
>> +     /*
>> +      * We must check ti flags with interrupts (or at least preemption)
>> +      * off because we must *never* return to userspace without
>> +      * processing exit work that is enqueued if we're preempted here.
>> +      * In particular, returning to userspace with any of the one-shot
>> +      * flags (TIF_NOTIFY_RESUME, TIF_USER_RETURN_NOTIFY, etc) set is
>> +      * very bad.
>> +      */
>> +     testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
>> +     jnz int_ret_from_sys_call_fixup /* Go the the slow path */
>
> Should be safe to call it once again after user-work processing has
> been finished.
>
> I've picked up your fix for tip:x86/urgent.

FWIW, the tentative merge here:

https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?h=tmp.tmp&id=a77dd1607ad88a601259a74ba4d646fa68b7cd9a

looks funny.  Why aren't you jumping to int_ret_from_sys_call_irqs_off?

--Andy

>
> Thanks,
>
>         Ingo



-- 
Andy Lutomirski
AMA Capital Management, LLC


* [tip:x86/asm] x86/asm/entry: Check for syscall exit work with IRQs disabled
  2015-03-23 19:10                                                         ` [PATCH] x86, entry: Check for syscall exit work with IRQs disabled Andy Lutomirski
                                                                             ` (2 preceding siblings ...)
  2015-03-24 20:08                                                           ` Ingo Molnar
@ 2015-03-25  9:13                                                           ` tip-bot for Andy Lutomirski
  3 siblings, 0 replies; 77+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-03-25  9:13 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: luto, linux-kernel, tj, hpa, jkosina, mingo, dvlasenk, tiwai,
	tglx, stefan.seyfried

Commit-ID:  b3494a4ab20f6bdf74cdf2badf7918bb65ee8a00
Gitweb:     http://git.kernel.org/tip/b3494a4ab20f6bdf74cdf2badf7918bb65ee8a00
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Mon, 23 Mar 2015 12:32:54 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 24 Mar 2015 21:08:28 +0100

x86/asm/entry: Check for syscall exit work with IRQs disabled

We currently have a race: if we're preempted during syscall
exit, we can fail to process syscall return work that is queued
up while we're preempted in ret_from_sys_call after checking
ti.flags.

Fix it by disabling interrupts before checking ti.flags.

Reported-by: Stefan Seyfried <stefan.seyfried@googlemail.com>
Reported-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tejun Heo <tj@kernel.org>
Fixes: 96b6352c1271 ("x86_64, entry: Remove the syscall exit audit")
Link: http://lkml.kernel.org/r/189320d42b4d671df78c10555976bb10af1ffc75.1427137498.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/entry_64.S | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 1d74d16..2babb39 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -364,12 +364,21 @@ system_call_fastpath:
  * Has incomplete stack frame and undefined top of stack.
  */
 ret_from_sys_call:
-	testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
-	jnz int_ret_from_sys_call_fixup	/* Go the the slow path */
-
 	LOCKDEP_SYS_EXIT
 	DISABLE_INTERRUPTS(CLBR_NONE)
 	TRACE_IRQS_OFF
+
+	/*
+	 * We must check ti flags with interrupts (or at least preemption)
+	 * off because we must *never* return to userspace without
+	 * processing exit work that is enqueued if we're preempted here.
+	 * In particular, returning to userspace with any of the one-shot
+	 * flags (TIF_NOTIFY_RESUME, TIF_USER_RETURN_NOTIFY, etc) set is
+	 * very bad.
+	 */
+	testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
+	jnz int_ret_from_sys_call_fixup	/* Go the the slow path */
+
 	CFI_REMEMBER_STATE
 	/*
 	 * sysretq will re-enable interrupts:
@@ -386,7 +395,7 @@ ret_from_sys_call:
 
 int_ret_from_sys_call_fixup:
 	FIXUP_TOP_OF_STACK %r11, -ARGOFFSET
-	jmp int_ret_from_sys_call
+	jmp int_ret_from_sys_call_irqs_off
 
 	/* Do syscall tracing */
 tracesys:
@@ -432,6 +441,7 @@ tracesys_phase2:
 GLOBAL(int_ret_from_sys_call)
 	DISABLE_INTERRUPTS(CLBR_NONE)
 	TRACE_IRQS_OFF
+int_ret_from_sys_call_irqs_off:
 	movl $_TIF_ALLWORK_MASK,%edi
 	/* edi:	mask to check */
 GLOBAL(int_with_check)
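
The ordering problem fixed by this patch can be modelled in a few lines of
user-space C.  This is only a sketch under simplifying assumptions: a
"preemption" is reduced to a function that sets a work bit whenever
interrupts are enabled, and the slow path is reduced to clearing that bit.
The names task_model, WORK_PENDING, maybe_preempt, exit_old_order and
exit_new_order are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

#define WORK_PENDING 0x1u   /* stands in for any bit in _TIF_ALLWORK_MASK */

/* Hypothetical model of the state the exit path looks at. */
struct task_model {
    unsigned int flags;     /* models ti.flags */
    bool irqs_enabled;      /* models the CPU interrupt flag */
};

/* A preemption that queues exit work; only possible with IRQs enabled. */
static void maybe_preempt(struct task_model *t)
{
    if (t->irqs_enabled)
        t->flags |= WORK_PENDING;
}

/* Old order: test the flags, then disable interrupts.
 * Returns true if we would reach user space with work still pending. */
static bool exit_old_order(struct task_model *t)
{
    bool slow = (t->flags & WORK_PENDING) != 0;   /* test with IRQs on */
    maybe_preempt(t);                             /* race window */
    t->irqs_enabled = false;                      /* DISABLE_INTERRUPTS */
    if (slow)
        t->flags &= ~WORK_PENDING;                /* slow path runs the work */
    assert(!t->irqs_enabled);
    return (t->flags & WORK_PENDING) != 0;
}

/* New order: disable interrupts, then test the flags. */
static bool exit_new_order(struct task_model *t)
{
    maybe_preempt(t);                             /* before IRQ-off: still seen below */
    t->irqs_enabled = false;                      /* DISABLE_INTERRUPTS */
    maybe_preempt(t);                             /* no effect, IRQs are off */
    bool slow = (t->flags & WORK_PENDING) != 0;   /* test with IRQs off */
    if (slow)
        t->flags &= ~WORK_PENDING;                /* slow path runs the work */
    assert(!t->irqs_enabled);
    return (t->flags & WORK_PENDING) != 0;
}
```

Starting from a task with no flags set and interrupts enabled, the old order
ends with the work bit still set, while the new order always routes the
pending bit through the slow path before returning.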


* Re: [PATCH] x86, entry: Check for syscall exit work with IRQs disabled
  2015-03-25  0:35                                                             ` Andy Lutomirski
@ 2015-03-25 12:21                                                               ` Ingo Molnar
  2015-03-25 15:07                                                                 ` Andy Lutomirski
  0 siblings, 1 reply; 77+ messages in thread
From: Ingo Molnar @ 2015-03-25 12:21 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Andy Lutomirski, Takashi Iwai, Denys Vlasenko, Stefan Seyfried,
	X86 ML, Jiri Kosina, LKML, Tejun Heo


* Andy Lutomirski <luto@amacapital.net> wrote:

> On Tue, Mar 24, 2015 at 1:08 PM, Ingo Molnar <mingo@kernel.org> wrote:
> >
> > * Andy Lutomirski <luto@kernel.org> wrote:
> >
> >> We currently have a race: if we're preempted during syscall exit, we
> >> can fail to process syscall return work that is queued up while
> >> we're preempted in ret_from_sys_call after checking ti.flags.
> >>
> >> Fix it by disabling interrupts before checking ti.flags.
> >>
> >> Fixes: 96b6352c1271 x86_64, entry: Remove the syscall exit audit and schedule optimizations
> >> Reported-by: Stefan Seyfried <stefan.seyfried@googlemail.com>
> >> Reported-by: Takashi Iwai <tiwai@suse.de>
> >> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> >> ---
> >>
> >> Ingo, I don't understand the LOCKDEP_SYS_EXIT stuff.  Can you take a quick
> >> look to confirm that it's okay to call it more than once?
> >
> > So the essence is that it wants to print this warning if we are
> > holding a lock after a syscall:
> >
> >                 printk("[ BUG: lock held when returning to user space! ]\n");
> >
> > it manipulates no state and is not sensitive to whether it's called
> > before or after return-work processing.
> >
> >> arch/x86/kernel/entry_64.S | 18 ++++++++++++++----
> >>  1 file changed, 14 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
> >> index 1d74d161687c..2babb393915e 100644
> >> --- a/arch/x86/kernel/entry_64.S
> >> +++ b/arch/x86/kernel/entry_64.S
> >> @@ -364,12 +364,21 @@ system_call_fastpath:
> >>   * Has incomplete stack frame and undefined top of stack.
> >>   */
> >>  ret_from_sys_call:
> >> -     testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
> >> -     jnz int_ret_from_sys_call_fixup /* Go the the slow path */
> >> -
> >>       LOCKDEP_SYS_EXIT
> >>       DISABLE_INTERRUPTS(CLBR_NONE)
> >>       TRACE_IRQS_OFF
> >> +
> >> +     /*
> >> +      * We must check ti flags with interrupts (or at least preemption)
> >> +      * off because we must *never* return to userspace without
> >> +      * processing exit work that is enqueued if we're preempted here.
> >> +      * In particular, returning to userspace with any of the one-shot
> >> +      * flags (TIF_NOTIFY_RESUME, TIF_USER_RETURN_NOTIFY, etc) set is
> >> +      * very bad.
> >> +      */
> >> +     testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
> >> +     jnz int_ret_from_sys_call_fixup /* Go the the slow path */
> >
> > Should be safe to call it once again after user-work processing has
> > been finished.
> >
> > I've picked up your fix for tip:x86/urgent.
> 
> FWIW, the tentative merge here:
> 
> https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?h=tmp.tmp&id=a77dd1607ad88a601259a74ba4d646fa68b7cd9a
> 
> looks funny.  Why aren't you jumping to int_ret_from_sys_call_irqs_off?

Indeed - the orphaned label should have told me that. The mismerge is
functionally harmless (it only causes extra overhead in the slowpath),
which is why it passed testing.
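
A small user-space model in C shows why jumping to the wrong label costs
time but not correctness.  It assumes that disabling interrupts twice has
the same end state as disabling them once, and it counts the disable steps
to make the overhead visible.  The names cpu_model, disable_interrupts and
fixup_path are hypothetical, not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the CPU state relevant to the fixup path. */
struct cpu_model {
    bool irqs_enabled;
    int irq_off_ops;    /* how many "disable interrupts" steps were executed */
};

static void disable_interrupts(struct cpu_model *c)
{
    c->irqs_enabled = false;   /* idempotent: already off stays off */
    c->irq_off_ops++;
}

/*
 * Models int_ret_from_sys_call_fixup.  Interrupts were already disabled
 * in ret_from_sys_call before the flag test.  Jumping to the plain
 * int_ret_from_sys_call label repeats the disable step; jumping to the
 * _irqs_off label skips it.
 */
static struct cpu_model fixup_path(bool jump_to_irqs_off_label)
{
    struct cpu_model c = { .irqs_enabled = true, .irq_off_ops = 0 };

    disable_interrupts(&c);            /* done before the flag test */
    assert(!c.irqs_enabled);

    if (!jump_to_irqs_off_label)
        disable_interrupts(&c);        /* mismerge: redundant second disable */

    return c;
}
```

Both variants end with interrupts off; the mismerged variant merely executes
one extra disable step on the slow path.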

Does:

  06ab9c1ba6a1 Merge branch 'x86/urgent' into x86/asm, to resolve conflict

look better to you?

Thanks,

	Ingo


* Re: [PATCH] x86, entry: Check for syscall exit work with IRQs disabled
  2015-03-25 12:21                                                               ` Ingo Molnar
@ 2015-03-25 15:07                                                                 ` Andy Lutomirski
  0 siblings, 0 replies; 77+ messages in thread
From: Andy Lutomirski @ 2015-03-25 15:07 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andy Lutomirski, Takashi Iwai, Denys Vlasenko, Stefan Seyfried,
	X86 ML, Jiri Kosina, LKML, Tejun Heo

On Wed, Mar 25, 2015 at 5:21 AM, Ingo Molnar <mingo@kernel.org> wrote:
>
> * Andy Lutomirski <luto@amacapital.net> wrote:
>
>> On Tue, Mar 24, 2015 at 1:08 PM, Ingo Molnar <mingo@kernel.org> wrote:
>> >
>> > * Andy Lutomirski <luto@kernel.org> wrote:
>> >
>> >> We currently have a race: if we're preempted during syscall exit, we
>> >> can fail to process syscall return work that is queued up while
>> >> we're preempted in ret_from_sys_call after checking ti.flags.
>> >>
>> >> Fix it by disabling interrupts before checking ti.flags.
>> >>
>> >> Fixes: 96b6352c1271 x86_64, entry: Remove the syscall exit audit and schedule optimizations
>> >> Reported-by: Stefan Seyfried <stefan.seyfried@googlemail.com>
>> >> Reported-by: Takashi Iwai <tiwai@suse.de>
>> >> Signed-off-by: Andy Lutomirski <luto@kernel.org>
>> >> ---
>> >>
>> >> Ingo, I don't understand the LOCKDEP_SYS_EXIT stuff.  Can you take a quick
>> >> look to confirm that it's okay to call it more than once?
>> >
>> > So the essence is that it wants to print this warning if we are
>> > holding a lock after a syscall:
>> >
>> >                 printk("[ BUG: lock held when returning to user space! ]\n");
>> >
>> > it manipulates no state and is not sensitive to whether it's called
>> > before or after return-work processing.
>> >
>> >> arch/x86/kernel/entry_64.S | 18 ++++++++++++++----
>> >>  1 file changed, 14 insertions(+), 4 deletions(-)
>> >>
>> >> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
>> >> index 1d74d161687c..2babb393915e 100644
>> >> --- a/arch/x86/kernel/entry_64.S
>> >> +++ b/arch/x86/kernel/entry_64.S
>> >> @@ -364,12 +364,21 @@ system_call_fastpath:
>> >>   * Has incomplete stack frame and undefined top of stack.
>> >>   */
>> >>  ret_from_sys_call:
>> >> -     testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
>> >> -     jnz int_ret_from_sys_call_fixup /* Go the the slow path */
>> >> -
>> >>       LOCKDEP_SYS_EXIT
>> >>       DISABLE_INTERRUPTS(CLBR_NONE)
>> >>       TRACE_IRQS_OFF
>> >> +
>> >> +     /*
>> >> +      * We must check ti flags with interrupts (or at least preemption)
>> >> +      * off because we must *never* return to userspace without
>> >> +      * processing exit work that is enqueued if we're preempted here.
>> >> +      * In particular, returning to userspace with any of the one-shot
>> >> +      * flags (TIF_NOTIFY_RESUME, TIF_USER_RETURN_NOTIFY, etc) set is
>> >> +      * very bad.
>> >> +      */
>> >> +     testl $_TIF_ALLWORK_MASK,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
>> >> +     jnz int_ret_from_sys_call_fixup /* Go the the slow path */
>> >
>> > Should be safe to call it once again after user-work processing has
>> > been finished.
>> >
>> > I've picked up your fix for tip:x86/urgent.
>>
>> FWIW, the tentative merge here:
>>
>> https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?h=tmp.tmp&id=a77dd1607ad88a601259a74ba4d646fa68b7cd9a
>>
>> looks funny.  Why aren't you jumping to int_ret_from_sys_call_irqs_off?
>
> Indeed - the orphaned label should have told me that. The mismerge is
> functionally harmless (causes extra overhead in the slowpath), that's
> why it passed testing.
>
> Does:
>
>   06ab9c1ba6a1 Merge branch 'x86/urgent' into x86/asm, to resolve conflict
>
> look better to you?

Yes, looks good.  Thanks.


* Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
  2015-03-18 19:26           ` Andy Lutomirski
                               ` (2 preceding siblings ...)
  2015-03-18 21:32             ` Linus Torvalds
@ 2015-03-28 23:57             ` Maciej W. Rozycki
  3 siblings, 0 replies; 77+ messages in thread
From: Maciej W. Rozycki @ 2015-03-28 23:57 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Stefan Seyfried, Linus Torvalds, Takashi Iwai, Denys Vlasenko,
	X86 ML, LKML, Tejun Heo

On Wed, 18 Mar 2015, Andy Lutomirski wrote:

> > I posted the same problem to the opensuse kernel list shortly before turning
> > to LKML. There, Michal Kubecek noted:
> >
> > "I encountered a similar problem recently. The thing is, x86
> > specification says that on a double fault, RIP and RSP registers are
> > undefined, i.e. you not only can't expect them to contain values
> > corresponding to the first or second fault but you can't even expect
> > them to have any usable values at all. Unfortunately the kernel double
> > fault handler doesn't take this into account and does try to display
> > usual crash related information so that it itself does usually crash
> > when trying to show stack content (that's the show_stack_log_lvl()
> > crash).
> 
> I think that's not entirely true.  RIP is reliable for many classes of
> double faults, and we rely on that for espfix64.  The fact that hpa
> was willing to write that code strongly suggests that Intel chips at
> least really do work that way.

 A #DF won't deliberately clobber the instruction pointer or the stack
pointer.  It's only that it may happen at a stage where either or both
original pointers have already been lost and replaced with new values,
possibly making them inconsistent with the corresponding segment selectors
too (as they are not written at the same time).

 This will only happen in certain degenerate corner cases, e.g. a problem
with the TSS (#TS) while processing a task gate used for taking the
original exception, where part of the new context has already been loaded
before the #DF resulted.  Another case is a stack segment limit violation
(#SS), where the stack has been switched while processing a trap or
interrupt gate, preventing the return information and error code from
being pushed for the original exception.  These are not conditions we'd
normally observe in Linux.

 In other cases both the original instruction pointer and the original
stack pointer will have been retained.

  Maciej


end of thread, other threads:[~2015-03-28 23:57 UTC | newest]

Thread overview: 77+ messages
2015-03-15  8:17 PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related? Stefan Seyfried
2015-03-18 14:16 ` Takashi Iwai
2015-03-18 15:05   ` Takashi Iwai
2015-03-18 17:43   ` Takashi Iwai
2015-03-18 17:46     ` Takashi Iwai
2015-03-18 18:03       ` Andy Lutomirski
2015-03-18 19:03         ` Stefan Seyfried
2015-03-18 19:26           ` Andy Lutomirski
2015-03-18 20:05             ` Stefan Seyfried
2015-03-18 20:51               ` Andy Lutomirski
2015-03-18 21:12                 ` Stefan Seyfried
2015-03-18 21:21                   ` Andy Lutomirski
2015-03-18 21:41                     ` Stefan Seyfried
2015-03-18 21:49                       ` Denys Vlasenko
2015-03-18 21:53                         ` Stefan Seyfried
2015-03-18 20:06             ` Denys Vlasenko
2015-03-18 20:49               ` Andy Lutomirski
2015-03-18 21:06                 ` Denys Vlasenko
2015-03-18 21:17                   ` Andy Lutomirski
2015-03-18 21:32             ` Linus Torvalds
2015-03-18 21:42               ` Denys Vlasenko
2015-03-18 21:55                 ` Andy Lutomirski
2015-03-18 22:17                   ` Denys Vlasenko
2015-03-18 22:20                     ` Andy Lutomirski
2015-03-18 22:27                       ` Denys Vlasenko
2015-03-18 22:18                   ` Linus Torvalds
2015-03-18 22:24                     ` Andy Lutomirski
2015-03-18 22:22                   ` Jiri Kosina
2015-03-18 22:28                     ` Linus Torvalds
2015-03-18 22:29                       ` Andy Lutomirski
2015-03-18 22:29                     ` Andy Lutomirski
2015-03-18 22:38                       ` Stefan Seyfried
2015-03-18 22:40                         ` Andy Lutomirski
2015-03-18 23:22                           ` Andy Lutomirski
2015-03-19  0:23                             ` Stefan Seyfried
2015-03-19  0:57                               ` Andy Lutomirski
2015-03-19  2:15                                 ` Linus Torvalds
2015-03-19  6:24                                 ` Stefan Seyfried
2015-03-19 10:16                       ` Takashi Iwai
2015-03-19 10:58                         ` Denys Vlasenko
2015-03-19 11:21                           ` Takashi Iwai
2015-03-19 12:48                             ` Denys Vlasenko
2015-03-19 13:47                               ` Takashi Iwai
2015-03-19 14:55                                 ` Takashi Iwai
2015-03-19 15:22                                   ` Takashi Iwai
2015-03-19 15:41                                     ` Andy Lutomirski
2015-03-19 15:51                                       ` Takashi Iwai
2015-03-19 16:01                                         ` Andy Lutomirski
2015-03-20 18:16                                         ` Denys Vlasenko
2015-03-20 18:50                                           ` Takashi Iwai
2015-03-23  9:02                                           ` Takashi Iwai
2015-03-23  9:35                                             ` Takashi Iwai
2015-03-23 13:22                                               ` Takashi Iwai
2015-03-23 16:07                                                 ` Denys Vlasenko
2015-03-23 17:18                                                   ` Takashi Iwai
2015-03-23 17:46                                                     ` Denys Vlasenko
2015-03-23 18:43                                                       ` Takashi Iwai
2015-03-23 18:38                                                   ` Andy Lutomirski
2015-03-23 18:48                                                     ` Andy Lutomirski
2015-03-23 18:59                                                       ` Takashi Iwai
2015-03-23 19:10                                                         ` [PATCH] x86, entry: Check for syscall exit work with IRQs disabled Andy Lutomirski
2015-03-23 19:21                                                           ` Denys Vlasenko
2015-03-23 19:27                                                             ` Andy Lutomirski
2015-03-23 19:32                                                               ` Andy Lutomirski
2015-03-24 11:17                                                           ` Takashi Iwai
2015-03-24 20:08                                                           ` Ingo Molnar
2015-03-25  0:35                                                             ` Andy Lutomirski
2015-03-25 12:21                                                               ` Ingo Molnar
2015-03-25 15:07                                                                 ` Andy Lutomirski
2015-03-25  9:13                                                           ` [tip:x86/asm] x86/asm/entry: " tip-bot for Andy Lutomirski
2015-03-23 18:54                                                     ` PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related? Stefan Seyfried
2015-03-23 18:56                                                     ` Takashi Iwai
2015-03-23 19:07                                                     ` Denys Vlasenko
2015-03-23 19:10                                                       ` Andy Lutomirski
2015-03-19 13:21                   ` Denys Vlasenko
2015-03-18 21:49               ` Stefan Seyfried
2015-03-28 23:57             ` Maciej W. Rozycki
