* Linux 3.18.2 / xen 4.4.1 dom0 - microcode oops
@ 2015-01-22 5:52 James Dingwall
2015-01-22 8:20 ` Borislav Petkov
0 siblings, 1 reply; 11+ messages in thread
From: James Dingwall @ 2015-01-22 5:52 UTC (permalink / raw)
To: linux-kernel
Hi,
Since 3.18.2 I am getting the oops below during boot whilst running as a dom0 under xen 4.4.1 / 4.5.0. Is this a known issue or worth bisecting to identify the exact commit which causes this?
Thanks,
James
[ 173.735541] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[ 173.735789] IP: [<ffffffff8134e7c2>] misc_deregister+0x50/0xa5
[ 173.735958] PGD 71480067 PUD 71bc6067 PMD 0
[ 173.736077] Oops: 0002 [#1] SMP
[ 173.736152] Modules linked in: it87 hwmon_vid autofs4 nfsd xen_pciback xen_gntalloc bridge stp llc ipv6 rbd ceph libceph openvswitch geneve vxlan ip6_udp_tunnel udp_tunnel tun tmem
xen_acpi_processor xen_gntdev xen_blkback xen_netback i915 fbcon bitblit softcursor font tileblit video drm_kms_helper snd_hda_codec_realtek snd_hda_codec_generic coretemp microcode(-) drm lpc_ich
i2c_i801 mfd_core snd_hda_intel firewire_ohci r8169 mii ata_generic evdev e1000e snd_hda_controller rtc_cmos i2c_algo_bit snd_hda_codec i2c_core cfbfillrect cfbimgblt cfbcopyarea backlight fb
snd_pcm processor fbdev snd_hwdep snd_timer button intel_agp intel_gtt snd parport_pc parport thermal_sys dm_zero dm_thin_pool dm_persistent_data dm_bio_prison xts lrw gf128mul glue_helper
ablk_helper cryptd aes_x86_64 iscsi_tcp libiscsi_tcp
[ 173.738197] libiscsi scsi_transport_iscsi tg3 ptp pps_core libphy hwmon e1000 fuse btrfs ext4 jbd2 linear raid0 dm_raid raid1 raid10 dm_snapshot dm_bufio dm_crypt dm_mirror dm_region_hash
dm_log firewire_core hid_sunplus hid_samsung hid_pl hid_petalynx hid_gyration usbhid ohci_hcd uhci_hcd usb_storage ehci_pci ehci_hcd megaraid_sas 3w_xxxx qla1280 aic7xxx scsi_transport_spi sr_mod
cdrom sg ahci libahci sata_nv sata_sil pata_amd libata
[ 173.738197] CPU: 1 PID: 5381 Comm: rmmod Tainted: G W 3.18.2 #118
[ 173.738197] Hardware name: Gigabyte Technology Co., Ltd. G33M-S2/G33M-S2, BIOS F7K 07/31/2009
[ 173.738197] task: ffff880073dca940 ti: ffff880071434000 task.ti: ffff880071434000
[ 173.738197] RIP: e030:[<ffffffff8134e7c2>] [<ffffffff8134e7c2>] misc_deregister+0x50/0xa5
[ 173.738197] RSP: e02b:ffff880071437eb8 EFLAGS: 00010247
[ 173.738197] RAX: 0000000000000000 RBX: ffffffffa0511e20 RCX: 00000028739a828b
[ 173.738197] RDX: 0000000000000000 RSI: ffff880073dc0604 RDI: ffff880078866840
[ 173.738197] RBP: ffff880071437ec8 R08: 00000000bfff1433 R09: ffff88007f68a0b0
[ 173.738197] R10: 0000000000007ff0 R11: 0000000000000000 R12: 00000000ffffff87
[ 173.738197] R13: 0000000000000800 R14: 00000000019c11c0 R15: 00000000019c1010
[ 173.738197] FS: 00007f7ac2926700(0000) GS:ffff88007f680000(0000) knlGS:0000000000000000
[ 173.738197] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 173.738197] CR2: 0000000000000008 CR3: 0000000072886000 CR4: 0000000000002660
[ 173.738197] Stack:
[ 173.738197] ffff88007f6113c0 0000000000000000 ffff880071437ee8 ffffffffa0511253
[ 173.738197] 0000000000000000 ffffffffa0511f30 ffff880071437f78 ffffffff81093fdf
[ 173.738197] 00000000c22d6d00 ffffffffa0511f30 ffff880000000800 ffff880071437efc
[ 173.738197] Call Trace:
[ 173.738197] [<ffffffffa0511253>] microcode_exit+0x20/0xb1 [microcode]
[ 173.738197] [<ffffffff81093fdf>] SyS_delete_module+0x118/0x1a6
[ 173.738197] [<ffffffff8100af73>] ? do_notify_resume+0x6a/0x78
[ 173.738197] [<ffffffff814caae9>] system_call_fastpath+0x12/0x17
[ 173.738197] Code: f1 07 64 81 e8 3a 70 cf ff b8 ea ff ff ff eb 6b 48 c7 c7 20 fa 71 81 e8 1a a4 17 00 48 8b 43 20 48 8b 53 18 48 8b 3d e6 73 57 00 <48> 89 42 08 48 89 10 48 b8 00 01 10 00 00 00
ad de 8b 33 48 89
[ 173.738197] RIP [<ffffffff8134e7c2>] misc_deregister+0x50/0xa5
[ 173.738197] RSP <ffff880071437eb8>
[ 173.738197] CR2: 0000000000000008
[ 173.784081] ---[ end trace 0ab648576ba0af94 ]---
Previously in 3.18.1:
[ 176.855832] microcode: CPU0 sig=0x10676, pf=0x1, revision=0x60c
[ 176.855844] microcode: CPU0 update to revision 0x60f failed
[ 176.856107] microcode: CPU1 sig=0x10676, pf=0x1, revision=0x60c
[ 176.856115] microcode: CPU1 update to revision 0x60f failed
[ 176.861597] microcode: Microcode Update Driver: v2.00 removed.
Same 3.18.2 kernel on bare metal (different system but identical hardware):
[ 46.002857] microcode: Microcode Update Driver: v2.00 removed.
* Re: Linux 3.18.2 / xen 4.4.1 dom0 - microcode oops
2015-01-22 5:52 Linux 3.18.2 / xen 4.4.1 dom0 - microcode oops James Dingwall
@ 2015-01-22 8:20 ` Borislav Petkov
2015-01-22 14:48 ` Boris Ostrovsky
0 siblings, 1 reply; 11+ messages in thread
From: Borislav Petkov @ 2015-01-22 8:20 UTC (permalink / raw)
To: James Dingwall; +Cc: linux-kernel, Boris Ostrovsky
Hmm,
and I thought we fixed all that fun. It seems not :-\
Boris, does this paravirt_enabled() check not work? Otherwise, why are we
even calling microcode_exit()?
Leaving in the rest.
On Thu, Jan 22, 2015 at 05:52:42AM +0000, James Dingwall wrote:
> Hi,
>
> Since 3.18.2 I am getting the oops below during boot whilst running as a dom0 under xen 4.4.1 / 4.5.0. Is this a known issue or worth bisecting to identify the exact commit which causes this?
>
> Thanks,
> James
>
> [ 173.735541] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> [ 173.735789] IP: [<ffffffff8134e7c2>] misc_deregister+0x50/0xa5
> [ 173.735958] PGD 71480067 PUD 71bc6067 PMD 0
> [ 173.736077] Oops: 0002 [#1] SMP
> [ 173.736152] Modules linked in: it87 hwmon_vid autofs4 nfsd xen_pciback xen_gntalloc bridge stp llc ipv6 rbd ceph libceph openvswitch geneve vxlan ip6_udp_tunnel udp_tunnel tun tmem
> xen_acpi_processor xen_gntdev xen_blkback xen_netback i915 fbcon bitblit softcursor font tileblit video drm_kms_helper snd_hda_codec_realtek snd_hda_codec_generic coretemp microcode(-) drm lpc_ich
> i2c_i801 mfd_core snd_hda_intel firewire_ohci r8169 mii ata_generic evdev e1000e snd_hda_controller rtc_cmos i2c_algo_bit snd_hda_codec i2c_core cfbfillrect cfbimgblt cfbcopyarea backlight fb
> snd_pcm processor fbdev snd_hwdep snd_timer button intel_agp intel_gtt snd parport_pc parport thermal_sys dm_zero dm_thin_pool dm_persistent_data dm_bio_prison xts lrw gf128mul glue_helper
> ablk_helper cryptd aes_x86_64 iscsi_tcp libiscsi_tcp
> [ 173.738197] libiscsi scsi_transport_iscsi tg3 ptp pps_core libphy hwmon e1000 fuse btrfs ext4 jbd2 linear raid0 dm_raid raid1 raid10 dm_snapshot dm_bufio dm_crypt dm_mirror dm_region_hash
> dm_log firewire_core hid_sunplus hid_samsung hid_pl hid_petalynx hid_gyration usbhid ohci_hcd uhci_hcd usb_storage ehci_pci ehci_hcd megaraid_sas 3w_xxxx qla1280 aic7xxx scsi_transport_spi sr_mod
> cdrom sg ahci libahci sata_nv sata_sil pata_amd libata
> [ 173.738197] CPU: 1 PID: 5381 Comm: rmmod Tainted: G W 3.18.2 #118
> [ 173.738197] Hardware name: Gigabyte Technology Co., Ltd. G33M-S2/G33M-S2, BIOS F7K 07/31/2009
> [ 173.738197] task: ffff880073dca940 ti: ffff880071434000 task.ti: ffff880071434000
> [ 173.738197] RIP: e030:[<ffffffff8134e7c2>] [<ffffffff8134e7c2>] misc_deregister+0x50/0xa5
> [ 173.738197] RSP: e02b:ffff880071437eb8 EFLAGS: 00010247
> [ 173.738197] RAX: 0000000000000000 RBX: ffffffffa0511e20 RCX: 00000028739a828b
> [ 173.738197] RDX: 0000000000000000 RSI: ffff880073dc0604 RDI: ffff880078866840
> [ 173.738197] RBP: ffff880071437ec8 R08: 00000000bfff1433 R09: ffff88007f68a0b0
> [ 173.738197] R10: 0000000000007ff0 R11: 0000000000000000 R12: 00000000ffffff87
> [ 173.738197] R13: 0000000000000800 R14: 00000000019c11c0 R15: 00000000019c1010
> [ 173.738197] FS: 00007f7ac2926700(0000) GS:ffff88007f680000(0000) knlGS:0000000000000000
> [ 173.738197] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 173.738197] CR2: 0000000000000008 CR3: 0000000072886000 CR4: 0000000000002660
> [ 173.738197] Stack:
> [ 173.738197] ffff88007f6113c0 0000000000000000 ffff880071437ee8 ffffffffa0511253
> [ 173.738197] 0000000000000000 ffffffffa0511f30 ffff880071437f78 ffffffff81093fdf
> [ 173.738197] 00000000c22d6d00 ffffffffa0511f30 ffff880000000800 ffff880071437efc
> [ 173.738197] Call Trace:
> [ 173.738197] [<ffffffffa0511253>] microcode_exit+0x20/0xb1 [microcode]
> [ 173.738197] [<ffffffff81093fdf>] SyS_delete_module+0x118/0x1a6
> [ 173.738197] [<ffffffff8100af73>] ? do_notify_resume+0x6a/0x78
> [ 173.738197] [<ffffffff814caae9>] system_call_fastpath+0x12/0x17
> [ 173.738197] Code: f1 07 64 81 e8 3a 70 cf ff b8 ea ff ff ff eb 6b 48 c7 c7 20 fa 71 81 e8 1a a4 17 00 48 8b 43 20 48 8b 53 18 48 8b 3d e6 73 57 00 <48> 89 42 08 48 89 10 48 b8 00 01 10 00 00 00
> ad de 8b 33 48 89
> [ 173.738197] RIP [<ffffffff8134e7c2>] misc_deregister+0x50/0xa5
> [ 173.738197] RSP <ffff880071437eb8>
> [ 173.738197] CR2: 0000000000000008
> [ 173.784081] ---[ end trace 0ab648576ba0af94 ]---
>
>
> Previously in 3.18.1:
> [ 176.855832] microcode: CPU0 sig=0x10676, pf=0x1, revision=0x60c
> [ 176.855844] microcode: CPU0 update to revision 0x60f failed
> [ 176.856107] microcode: CPU1 sig=0x10676, pf=0x1, revision=0x60c
> [ 176.856115] microcode: CPU1 update to revision 0x60f failed
> [ 176.861597] microcode: Microcode Update Driver: v2.00 removed.
>
>
> Same 3.18.2 kernel on bare metal (different system but identical hardware):
> [ 46.002857] microcode: Microcode Update Driver: v2.00 removed.
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
--
* Re: Linux 3.18.2 / xen 4.4.1 dom0 - microcode oops
2015-01-22 8:20 ` Borislav Petkov
@ 2015-01-22 14:48 ` Boris Ostrovsky
2015-01-22 14:53 ` Boris Ostrovsky
0 siblings, 1 reply; 11+ messages in thread
From: Boris Ostrovsky @ 2015-01-22 14:48 UTC (permalink / raw)
To: Borislav Petkov, James Dingwall; +Cc: linux-kernel
On 01/22/2015 03:20 AM, Borislav Petkov wrote:
> Hmm,
>
> and I thought we fixed all that fun. It seems not :-\
>
> Boris, this paravirt_enabled() thing doesn't seem to work or why are we
> even calling microcode_exit()?
Looks like something is unloading the microcode driver (init scripts
perhaps), and so we are trying to unregister a device that we never
registered (because we took the early return from microcode_init() when
we loaded it).
I actually suspect the same bug would be triggered if dis_ucode_ldr is
true on baremetal.
So we need something like:
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -625,6 +625,9 @@ static void __exit microcode_exit(void)
 {
 	struct cpuinfo_x86 *c = &cpu_data(0);
 
+	if (paravirt_enabled() || dis_ucode_ldr)
+		return 0;
+
 	microcode_dev_exit();
 
 	unregister_hotcpu_notifier(&mc_cpu_notifier);
-boris
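[Editorial sketch, not part of the original mail.] The fault address in the oops (CR2 = 0x8, with RDX = 0) fits the explanation above: unlinking a list node that was never linked follows a NULL ->next pointer and stores to its ->prev field. The user-space snippet below only computes that address rather than writing to it. It assumes a 64-bit build and that the misc device's list node is still zeroed because registration never ran; first_unlink_store() and never_registered_node() are hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same two-pointer layout as the kernel's struct list_head. */
struct list_head {
	struct list_head *next;
	struct list_head *prev;
};

/*
 * Address of the first store performed when a node is unlinked:
 * the node's ->next->prev is overwritten with the node's ->prev.
 * The address is only computed, never written, so this is safe
 * to run in user space.
 */
static uintptr_t first_unlink_store(const struct list_head *entry)
{
	assert(entry != NULL);
	return (uintptr_t)entry->next + offsetof(struct list_head, prev);
}

/* Hypothetical helper: a list node as found in a zeroed static object. */
static struct list_head never_registered_node(void)
{
	struct list_head node;

	memset(&node, 0, sizeof(node));
	return node;
}
```

On a 64-bit build, first_unlink_store() applied to such a zeroed node yields sizeof(void *), i.e. 8, which is the address reported in the BUG line.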
* Re: Linux 3.18.2 / xen 4.4.1 dom0 - microcode oops
2015-01-22 14:48 ` Boris Ostrovsky
@ 2015-01-22 14:53 ` Boris Ostrovsky
2015-01-22 15:30 ` Borislav Petkov
0 siblings, 1 reply; 11+ messages in thread
From: Boris Ostrovsky @ 2015-01-22 14:53 UTC (permalink / raw)
To: Borislav Petkov, James Dingwall; +Cc: linux-kernel
On 01/22/2015 09:48 AM, Boris Ostrovsky wrote:
> On 01/22/2015 03:20 AM, Borislav Petkov wrote:
>> Hmm,
>>
>> and I thought we fixed all that fun. It seems not :-\
>>
>> Boris, this paravirt_enabled() thing doesn't seem to work or why are we
>> even calling microcode_exit()?
>
> Looks like something is unloading microcode driver (init scripts
> perhaps) and so we are trying to unregister device that we never
> registered (because we had early return from microcode_init() when we
> loaded it).
>
> I actually suspect the same bug would be triggered if dis_ucode_ldr is
> true on baremetal.
>
> So we need something like:
>
> --- a/arch/x86/kernel/cpu/microcode/core.c
> +++ b/arch/x86/kernel/cpu/microcode/core.c
> @@ -625,6 +625,9 @@ static void __exit microcode_exit(void)
> {
> struct cpuinfo_x86 *c = &cpu_data(0);
>
> + if (paravirt_enabled() || dis_ucode_ldr)
> + return 0;
Plain 'return', of course.
Alternatively, we could return an error (-EINVAL?) from microcode_init()
when either of these two conditions is true.
-boris
> +
> microcode_dev_exit();
>
> unregister_hotcpu_notifier(&mc_cpu_notifier);
>
>
* Re: Linux 3.18.2 / xen 4.4.1 dom0 - microcode oops
2015-01-22 14:53 ` Boris Ostrovsky
@ 2015-01-22 15:30 ` Borislav Petkov
2015-01-22 17:43 ` James Dingwall
2015-01-27 22:12 ` Borislav Petkov
0 siblings, 2 replies; 11+ messages in thread
From: Borislav Petkov @ 2015-01-22 15:30 UTC (permalink / raw)
To: Boris Ostrovsky; +Cc: James Dingwall, linux-kernel
On Thu, Jan 22, 2015 at 09:53:04AM -0500, Boris Ostrovsky wrote:
> Alternatively, we could return an error (-EINVAL?) from
> microcode_init() when either of these two conditions is true.
Yeah, this should be the right fix.
James, does that fix your issue? (It should.)
---
diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
index 15c29096136b..36a83617eb21 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -552,7 +552,7 @@ static int __init microcode_init(void)
 	int error;
 
 	if (paravirt_enabled() || dis_ucode_ldr)
-		return 0;
+		return -EINVAL;
 
 	if (c->x86_vendor == X86_VENDOR_INTEL)
 		microcode_ops = init_intel_microcode();
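
[Editorial sketch, not part of the original mail.] A rough user-space model of why returning an error from the init function is enough: a module whose init fails is never left loaded, so its exit function cannot be reached by a later rmmod. Everything here (struct model_module, model_insmod(), model_rmmod(), the flags) is hypothetical scaffolding to illustrate the control flow; it is not the kernel's module loader.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a loadable module: init/exit pair plus state. */
struct model_module {
	int (*init)(void);
	void (*exit)(void);
	bool live;	/* true only after init() returned 0 */
};

static bool device_registered;	/* models the misc device registration */
static bool bogus_deregister;	/* set if exit() ran without a registration */
static bool loader_disabled;	/* models paravirt_enabled() || dis_ucode_ldr */

static int init_returns_zero(void)	/* behaviour before the fix */
{
	if (loader_disabled)
		return 0;	/* reports success, yet nothing was registered */
	device_registered = true;
	return 0;
}

static int init_returns_einval(void)	/* behaviour with the fix */
{
	if (loader_disabled)
		return -EINVAL;	/* load fails, so exit() is never reachable */
	device_registered = true;
	return 0;
}

static void model_exit(void)
{
	if (!device_registered)
		bogus_deregister = true;	/* corresponds to the oops */
	device_registered = false;
}

/* A failed init() leaves the module unloaded. */
static int model_insmod(struct model_module *m)
{
	int err;

	assert(m != NULL);
	err = m->init();
	m->live = (err == 0);
	return err;
}

/* exit() only runs for a module that is actually loaded. */
static int model_rmmod(struct model_module *m)
{
	assert(m != NULL);
	if (!m->live)
		return -ENOENT;
	m->exit();
	m->live = false;
	return 0;
}
```

With loader_disabled set, loading and then unloading a module built from init_returns_zero() trips bogus_deregister, whereas one built from init_returns_einval() fails at model_insmod() and never reaches model_exit().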
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
--
* Re: Linux 3.18.2 / xen 4.4.1 dom0 - microcode oops
2015-01-22 15:30 ` Borislav Petkov
@ 2015-01-22 17:43 ` James Dingwall
2015-01-22 17:58 ` Borislav Petkov
2015-01-27 22:12 ` Borislav Petkov
1 sibling, 1 reply; 11+ messages in thread
From: James Dingwall @ 2015-01-22 17:43 UTC (permalink / raw)
To: Borislav Petkov; +Cc: Boris Ostrovsky, linux-kernel
On Thu, Jan 22, 2015 at 04:30:02PM +0100, Borislav Petkov wrote:
> On Thu, Jan 22, 2015 at 09:53:04AM -0500, Boris Ostrovsky wrote:
> > Alternatively, we could return an error (-EINVAL?) from
> > microcode_init() when either of these two conditions is true.
>
> Yeah, this should be the right fix.
>
> James, does that fix your issue? (It should.)
>
> ---
> diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
> index 15c29096136b..36a83617eb21 100644
> --- a/arch/x86/kernel/cpu/microcode/core.c
> +++ b/arch/x86/kernel/cpu/microcode/core.c
> @@ -552,7 +552,7 @@ static int __init microcode_init(void)
> int error;
>
> if (paravirt_enabled() || dis_ucode_ldr)
> - return 0;
> + return -EINVAL;
>
> if (c->x86_vendor == X86_VENDOR_INTEL)
> microcode_ops = init_intel_microcode();
>
> --
This patch solves it for me on my dom0 with 3.18.3. Nothing at all is now printed by the microcode
driver, which is not surprising given where the return is. I'll check it on bare metal at the next
opportunity, but from my understanding of what is happening there I don't expect it to have any impact at
all.
Thanks,
James
* Re: Linux 3.18.2 / xen 4.4.1 dom0 - microcode oops
2015-01-22 17:43 ` James Dingwall
@ 2015-01-22 17:58 ` Borislav Petkov
2015-01-22 18:09 ` Boris Ostrovsky
0 siblings, 1 reply; 11+ messages in thread
From: Borislav Petkov @ 2015-01-22 17:58 UTC (permalink / raw)
To: James Dingwall; +Cc: Boris Ostrovsky, linux-kernel
On Thu, Jan 22, 2015 at 05:43:15PM +0000, James Dingwall wrote:
> This patch solves it for me on my dom0 with 3.18.3, now there is
> nothing printed at all from the microcode driver which doesn't seem
> surprising given where the return is.
Yap, xen does/will update microcode differently...
> I'll check it on bare metal at the next opportunity but from my
> understanding of what is happening there I don't see that it should
> have any impact at all.
Yeah, it shouldn't have any effect on baremetal in the sense that it
should load properly there. And it is a fix for baremetal too, as Boris
pointed out.
I'm still curious as to why it says this on your machine:
[ 176.855832] microcode: CPU0 sig=0x10676, pf=0x1, revision=0x60c
[ 176.855844] microcode: CPU0 update to revision 0x60f failed
?
This basically says that we do try to update with the patch but the
hardware doesn't accept it.
Is this new? Did it ever update microcode properly?
Where do you get the microcode for that machine?
Can you run the scriptlet below as root and send me the results? You'd
need the msr-tools package and the msr.ko kernel module loaded. Ask if
you need help.
Also, please send me a full dmesg with "ignore_loglevel log_buf_len=16M
debug" on the kernel command line, privately is fine too.
Thanks.
--
#!/bin/bash
echo "/proc/cpuinfo: "
cat /proc/cpuinfo
echo; echo "dmesg: "
dmesg | grep -i microcode
modprobe msr 2>/dev/null
echo ; echo "MSRs: (0x8b) "
rdmsr --all 0x8b
--
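[Editorial sketch, not part of the original mail.] For reading the scriptlet's result: on Intel CPUs the microcode revision is held in the upper 32 bits of MSR 0x8b, so the value printed by rdmsr has to be shifted before it can be compared with the "revision=0x60c" line in dmesg. The helpers below are hypothetical names, and they assume rdmsr prints one hexadecimal value per line (believed to be its default, with no "0x" prefix).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* On Intel CPUs the microcode revision is the upper 32 bits of MSR 0x8b. */
static uint32_t intel_ucode_revision(uint64_t msr_0x8b)
{
	return (uint32_t)(msr_0x8b >> 32);
}

/*
 * Parse one line of hexadecimal rdmsr output (with or without "0x").
 * Returns 0 and stores the value on success, -1 on malformed input.
 */
static int parse_rdmsr_hex(const char *text, uint64_t *value)
{
	char *end;
	unsigned long long parsed;

	assert(text != NULL && value != NULL);
	errno = 0;
	parsed = strtoull(text, &end, 16);
	if (errno != 0 || end == text)
		return -1;
	*value = (uint64_t)parsed;
	return 0;
}
```

For example, an rdmsr line of "60c00000000" would decode to revision 0x60c under these assumptions.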
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
--
* Re: Linux 3.18.2 / xen 4.4.1 dom0 - microcode oops
2015-01-22 17:58 ` Borislav Petkov
@ 2015-01-22 18:09 ` Boris Ostrovsky
0 siblings, 0 replies; 11+ messages in thread
From: Boris Ostrovsky @ 2015-01-22 18:09 UTC (permalink / raw)
To: Borislav Petkov, James Dingwall; +Cc: linux-kernel
On 01/22/2015 12:58 PM, Borislav Petkov wrote:
> On Thu, Jan 22, 2015 at 05:43:15PM +0000, James Dingwall wrote:
>> This patch solves it for me on my dom0 with 3.18.3, now there is
>> nothing printed at all from the microcode driver which doesn't seem
>> surprising given where the return is.
> Yap, xen does/will update microcode differently...
>
>> I'll check it on bare metal at the next opportunity but from my
>> understanding of what is happening there I don't see that it should
>> have any impact at all.
> Yeah, it shouldn't have any effect on baremetal in the sense that it
> should load properly there. And it is a fix for baremetal too, as Boris
> pointed out.
>
> I'm still curious as to why does it say this on your machine:
>
> [ 176.855832] microcode: CPU0 sig=0x10676, pf=0x1, revision=0x60c
> [ 176.855844] microcode: CPU0 update to revision 0x60f failed
If this was on dom0 (i.e. a Xen PV guest) then it's understandable, since
in that case MSR writes would have been trapped by the hypervisor and
not processed any further.
James can probably see in the hypervisor log ('xl dmesg') something like:
(XEN) traps.c:2579:d0v0 Domain attempted WRMSR 0000000000000079 from 0x0000000000000000 to 0x0000000000000001.
(MSR reads should proceed fine though).
-boris
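[Editorial sketch, not part of the original mail.] A toy model of the behaviour described above: the driver triggers the update with a WRMSR and then reads the revision back; if the hypervisor silently drops the write while still allowing the read, the revision stays where it was and the driver reports a failed update. struct model_cpu and the model_* functions are hypothetical, and the real update path does considerably more than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_UCODE_TRIGGER 0x79u	/* MSR number from the example Xen log line */

struct model_cpu {
	uint32_t revision;	/* current microcode revision */
	bool writes_trapped;	/* true when a hypervisor drops guest WRMSR */
};

/* WRMSR model: applied on bare metal, silently dropped when trapped. */
static void model_wrmsr(struct model_cpu *cpu, uint32_t msr, uint32_t patch_rev)
{
	assert(cpu != NULL);
	if (cpu->writes_trapped)
		return;
	if (msr == MSR_UCODE_TRIGGER)
		cpu->revision = patch_rev;
}

/* Reads are never blocked in this model. */
static uint32_t model_read_revision(const struct model_cpu *cpu)
{
	assert(cpu != NULL);
	return cpu->revision;
}

/* Returns 0 if the revision read back matches the patch, -1 otherwise. */
static int model_apply_update(struct model_cpu *cpu, uint32_t patch_rev)
{
	model_wrmsr(cpu, MSR_UCODE_TRIGGER, patch_rev);
	return model_read_revision(cpu) == patch_rev ? 0 : -1;
}
```

With revision 0x60c and writes_trapped set, model_apply_update(&cpu, 0x60f) returns -1 and the revision is unchanged, which corresponds to the "update to revision 0x60f failed" lines seen on dom0.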
* Re: Linux 3.18.2 / xen 4.4.1 dom0 - microcode oops
2015-01-22 15:30 ` Borislav Petkov
2015-01-22 17:43 ` James Dingwall
@ 2015-01-27 22:12 ` Borislav Petkov
2015-01-27 22:55 ` Boris Ostrovsky
1 sibling, 1 reply; 11+ messages in thread
From: Borislav Petkov @ 2015-01-27 22:12 UTC (permalink / raw)
To: Boris Ostrovsky; +Cc: James Dingwall, linux-kernel
Hey Boris,
On Thu, Jan 22, 2015 at 04:30:02PM +0100, Borislav Petkov wrote:
> On Thu, Jan 22, 2015 at 09:53:04AM -0500, Boris Ostrovsky wrote:
> > Alternatively, we could return an error (-EINVAL?) from
> > microcode_init() when either of these two conditions is true.
>
> Yeah, this should be the right fix.
>
> James, does that fix your issue? (It should.)
>
> ---
> diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
> index 15c29096136b..36a83617eb21 100644
> --- a/arch/x86/kernel/cpu/microcode/core.c
> +++ b/arch/x86/kernel/cpu/microcode/core.c
> @@ -552,7 +552,7 @@ static int __init microcode_init(void)
> int error;
>
> if (paravirt_enabled() || dis_ucode_ldr)
> - return 0;
> + return -EINVAL;
>
> if (c->x86_vendor == X86_VENDOR_INTEL)
> microcode_ops = init_intel_microcode();
would you do the honor and write a proper patch? You found the bug so...
:-D
Thanks.
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
--
* Re: Linux 3.18.2 / xen 4.4.1 dom0 - microcode oops
2015-01-27 22:12 ` Borislav Petkov
@ 2015-01-27 22:55 ` Boris Ostrovsky
2015-01-27 23:12 ` Borislav Petkov
0 siblings, 1 reply; 11+ messages in thread
From: Boris Ostrovsky @ 2015-01-27 22:55 UTC (permalink / raw)
To: Borislav Petkov; +Cc: James Dingwall, linux-kernel
On 01/27/2015 05:12 PM, Borislav Petkov wrote:
> Hey Boris,
>
> On Thu, Jan 22, 2015 at 04:30:02PM +0100, Borislav Petkov wrote:
>> On Thu, Jan 22, 2015 at 09:53:04AM -0500, Boris Ostrovsky wrote:
>>> Alternatively, we could return an error (-EINVAL?) from
>>> microcode_init() when either of these two conditions is true.
>> Yeah, this should be the right fix.
>>
>> James, does that fix your issue? (It should.)
>>
>> ---
>> diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
>> index 15c29096136b..36a83617eb21 100644
>> --- a/arch/x86/kernel/cpu/microcode/core.c
>> +++ b/arch/x86/kernel/cpu/microcode/core.c
>> @@ -552,7 +552,7 @@ static int __init microcode_init(void)
>> int error;
>>
>> if (paravirt_enabled() || dis_ucode_ldr)
>> - return 0;
>> + return -EINVAL;
>>
>> if (c->x86_vendor == X86_VENDOR_INTEL)
>> microcode_ops = init_intel_microcode();
> would you do the honor and write a proper patch? You found the bug so...
> :-D
Will do. This needs to go to stable as well, right?
-boris
* Re: Linux 3.18.2 / xen 4.4.1 dom0 - microcode oops
2015-01-27 22:55 ` Boris Ostrovsky
@ 2015-01-27 23:12 ` Borislav Petkov
0 siblings, 0 replies; 11+ messages in thread
From: Borislav Petkov @ 2015-01-27 23:12 UTC (permalink / raw)
To: Boris Ostrovsky; +Cc: James Dingwall, linux-kernel
On Tue, Jan 27, 2015 at 05:55:17PM -0500, Boris Ostrovsky wrote:
> Will do. This needs to go to stable as well, right?
Right, add "# 3.18" as a comment after the CC:stable line as I haven't
done the backports for the older stable kernels yet. I'll submit it for
the older ones myself, along with the rest of the commits.
Thanks.
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
--