* Linux 3.18.2 / xen 4.4.1 dom0 - microcode oops
@ 2015-01-22  5:52 James Dingwall
  2015-01-22  8:20 ` Borislav Petkov
  0 siblings, 1 reply; 11+ messages in thread
From: James Dingwall @ 2015-01-22  5:52 UTC (permalink / raw)
  To: linux-kernel

Hi,

Since 3.18.2 I am getting the oops below during boot whilst running as a dom0 under xen 4.4.1 / 4.5.0.  Is this a known issue or worth bisecting to identify the exact commit which causes this?

Thanks,
James

[  173.735541] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[  173.735789] IP: [<ffffffff8134e7c2>] misc_deregister+0x50/0xa5
[  173.735958] PGD 71480067 PUD 71bc6067 PMD 0 
[  173.736077] Oops: 0002 [#1] SMP 
[  173.736152] Modules linked in: it87 hwmon_vid autofs4 nfsd xen_pciback xen_gntalloc bridge stp llc ipv6 rbd ceph libceph openvswitch geneve vxlan ip6_udp_tunnel udp_tunnel tun tmem 
xen_acpi_processor xen_gntdev xen_blkback xen_netback i915 fbcon bitblit softcursor font tileblit video drm_kms_helper snd_hda_codec_realtek snd_hda_codec_generic coretemp microcode(-) drm lpc_ich 
i2c_i801 mfd_core snd_hda_intel firewire_ohci r8169 mii ata_generic evdev e1000e snd_hda_controller rtc_cmos i2c_algo_bit snd_hda_codec i2c_core cfbfillrect cfbimgblt cfbcopyarea backlight fb 
snd_pcm processor fbdev snd_hwdep snd_timer button intel_agp intel_gtt snd parport_pc parport thermal_sys dm_zero dm_thin_pool dm_persistent_data dm_bio_prison xts lrw gf128mul glue_helper 
ablk_helper cryptd aes_x86_64 iscsi_tcp libiscsi_tcp
[  173.738197]  libiscsi scsi_transport_iscsi tg3 ptp pps_core libphy hwmon e1000 fuse btrfs ext4 jbd2 linear raid0 dm_raid raid1 raid10 dm_snapshot dm_bufio dm_crypt dm_mirror dm_region_hash 
dm_log firewire_core hid_sunplus hid_samsung hid_pl hid_petalynx hid_gyration usbhid ohci_hcd uhci_hcd usb_storage ehci_pci ehci_hcd megaraid_sas 3w_xxxx qla1280 aic7xxx scsi_transport_spi sr_mod 
cdrom sg ahci libahci sata_nv sata_sil pata_amd libata
[  173.738197] CPU: 1 PID: 5381 Comm: rmmod Tainted: G        W      3.18.2 #118
[  173.738197] Hardware name: Gigabyte Technology Co., Ltd. G33M-S2/G33M-S2, BIOS F7K 07/31/2009
[  173.738197] task: ffff880073dca940 ti: ffff880071434000 task.ti: ffff880071434000
[  173.738197] RIP: e030:[<ffffffff8134e7c2>]  [<ffffffff8134e7c2>] misc_deregister+0x50/0xa5
[  173.738197] RSP: e02b:ffff880071437eb8  EFLAGS: 00010247
[  173.738197] RAX: 0000000000000000 RBX: ffffffffa0511e20 RCX: 00000028739a828b
[  173.738197] RDX: 0000000000000000 RSI: ffff880073dc0604 RDI: ffff880078866840
[  173.738197] RBP: ffff880071437ec8 R08: 00000000bfff1433 R09: ffff88007f68a0b0
[  173.738197] R10: 0000000000007ff0 R11: 0000000000000000 R12: 00000000ffffff87
[  173.738197] R13: 0000000000000800 R14: 00000000019c11c0 R15: 00000000019c1010
[  173.738197] FS:  00007f7ac2926700(0000) GS:ffff88007f680000(0000) knlGS:0000000000000000
[  173.738197] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[  173.738197] CR2: 0000000000000008 CR3: 0000000072886000 CR4: 0000000000002660
[  173.738197] Stack:
[  173.738197]  ffff88007f6113c0 0000000000000000 ffff880071437ee8 ffffffffa0511253
[  173.738197]  0000000000000000 ffffffffa0511f30 ffff880071437f78 ffffffff81093fdf
[  173.738197]  00000000c22d6d00 ffffffffa0511f30 ffff880000000800 ffff880071437efc
[  173.738197] Call Trace:
[  173.738197]  [<ffffffffa0511253>] microcode_exit+0x20/0xb1 [microcode]
[  173.738197]  [<ffffffff81093fdf>] SyS_delete_module+0x118/0x1a6
[  173.738197]  [<ffffffff8100af73>] ? do_notify_resume+0x6a/0x78
[  173.738197]  [<ffffffff814caae9>] system_call_fastpath+0x12/0x17
[  173.738197] Code: f1 07 64 81 e8 3a 70 cf ff b8 ea ff ff ff eb 6b 48 c7 c7 20 fa 71 81 e8 1a a4 17 00 48 8b 43 20 48 8b 53 18 48 8b 3d e6 73 57 00 <48> 89 42 08 48 89 10 48 b8 00 01 10 00 00 00 
ad de 8b 33 48 89 
[  173.738197] RIP  [<ffffffff8134e7c2>] misc_deregister+0x50/0xa5
[  173.738197]  RSP <ffff880071437eb8>
[  173.738197] CR2: 0000000000000008
[  173.784081] ---[ end trace 0ab648576ba0af94 ]---


Previously in 3.18.1:
[  176.855832] microcode: CPU0 sig=0x10676, pf=0x1, revision=0x60c
[  176.855844] microcode: CPU0 update to revision 0x60f failed
[  176.856107] microcode: CPU1 sig=0x10676, pf=0x1, revision=0x60c
[  176.856115] microcode: CPU1 update to revision 0x60f failed
[  176.861597] microcode: Microcode Update Driver: v2.00 removed.


Same 3.18.2 kernel on bare metal (different system but identical hardware):
[   46.002857] microcode: Microcode Update Driver: v2.00 removed.


* Re: Linux 3.18.2 / xen 4.4.1 dom0 - microcode oops
  2015-01-22  5:52 Linux 3.18.2 / xen 4.4.1 dom0 - microcode oops James Dingwall
@ 2015-01-22  8:20 ` Borislav Petkov
  2015-01-22 14:48   ` Boris Ostrovsky
  0 siblings, 1 reply; 11+ messages in thread
From: Borislav Petkov @ 2015-01-22  8:20 UTC (permalink / raw)
  To: James Dingwall; +Cc: linux-kernel, Boris Ostrovsky

Hmm,

and I thought we fixed all that fun. It seems not :-\

Boris, this paravirt_enabled() thing doesn't seem to work or why are we
even calling microcode_exit()?

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--


* Re: Linux 3.18.2 / xen 4.4.1 dom0 - microcode oops
  2015-01-22  8:20 ` Borislav Petkov
@ 2015-01-22 14:48   ` Boris Ostrovsky
  2015-01-22 14:53     ` Boris Ostrovsky
  0 siblings, 1 reply; 11+ messages in thread
From: Boris Ostrovsky @ 2015-01-22 14:48 UTC (permalink / raw)
  To: Borislav Petkov, James Dingwall; +Cc: linux-kernel

On 01/22/2015 03:20 AM, Borislav Petkov wrote:
> Hmm,
>
> and I thought we fixed all that fun. It seems not :-\
>
> Boris, this paravirt_enabled() thing doesn't seem to work or why are we
> even calling microcode_exit()?

Looks like something is unloading the microcode driver (init scripts
perhaps), so we are trying to unregister a device that we never
registered (because we returned early from microcode_init() when we
loaded it).

I actually suspect the same bug would be triggered if dis_ucode_ldr is 
true on baremetal.

So we need something like:

--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -625,6 +625,9 @@ static void __exit microcode_exit(void)
  {
         struct cpuinfo_x86 *c = &cpu_data(0);

+       if (paravirt_enabled() || dis_ucode_ldr)
+               return 0;
+
         microcode_dev_exit();

         unregister_hotcpu_notifier(&mc_cpu_notifier);
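
For illustration, a minimal user-space sketch of why unregistering a never-registered device faults at address 0x8 on a 64-bit kernel. It assumes, as the register dump suggests (RAX and RDX are both zero, CR2 is 8), that the device's list node was still zero-initialised when list_del() ran. The names `list_del_first_store_addr` and `unregistered_node` are hypothetical and exist only for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Same shape as the kernel's struct list_head. */
struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

/*
 * Address that the first store of list_del() ("next->prev = prev")
 * would write to. For a node that was never added to a list and lives
 * in zero-initialised static storage, next is NULL, so the store lands
 * at offsetof(struct list_head, prev), which is 8 on a 64-bit build.
 */
static uintptr_t list_del_first_store_addr(const struct list_head *entry)
{
    return (uintptr_t)entry->next + offsetof(struct list_head, prev);
}

/* Hypothetical stand-in for the driver's static, never-registered node. */
static struct list_head unregistered_node; /* next == prev == NULL */
```

Calling `list_del_first_store_addr(&unregistered_node)` yields the offset of `prev`, which lines up with the "NULL pointer dereference at 0000000000000008" in the report.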


-boris


* Re: Linux 3.18.2 / xen 4.4.1 dom0 - microcode oops
  2015-01-22 14:48   ` Boris Ostrovsky
@ 2015-01-22 14:53     ` Boris Ostrovsky
  2015-01-22 15:30       ` Borislav Petkov
  0 siblings, 1 reply; 11+ messages in thread
From: Boris Ostrovsky @ 2015-01-22 14:53 UTC (permalink / raw)
  To: Borislav Petkov, James Dingwall; +Cc: linux-kernel

On 01/22/2015 09:48 AM, Boris Ostrovsky wrote:
> On 01/22/2015 03:20 AM, Borislav Petkov wrote:
>> Hmm,
>>
>> and I thought we fixed all that fun. It seems not :-\
>>
>> Boris, this paravirt_enabled() thing doesn't seem to work or why are we
>> even calling microcode_exit()?
>
> Looks like something is unloading microcode driver (init scripts 
> perhaps) and so we are trying to unregister device that we never 
> registered (because we had early return from microcode_init() when we 
> loaded it).
>
> I actually suspect the same bug would be triggered if dis_ucode_ldr is 
> true on baremetal.
>
> So we need something like:
>
> --- a/arch/x86/kernel/cpu/microcode/core.c
> +++ b/arch/x86/kernel/cpu/microcode/core.c
> @@ -625,6 +625,9 @@ static void __exit microcode_exit(void)
>  {
>         struct cpuinfo_x86 *c = &cpu_data(0);
>
> +       if (paravirt_enabled() || dis_ucode_ldr)
> +               return 0;

Plain 'return', of course.

Alternatively, we could return an error (-EINVAL?) from microcode_init() 
when either of these two conditions is true.

-boris

> +
>         microcode_dev_exit();
>
>         unregister_hotcpu_notifier(&mc_cpu_notifier);
>
>



* Re: Linux 3.18.2 / xen 4.4.1 dom0 - microcode oops
  2015-01-22 14:53     ` Boris Ostrovsky
@ 2015-01-22 15:30       ` Borislav Petkov
  2015-01-22 17:43         ` James Dingwall
  2015-01-27 22:12         ` Borislav Petkov
  0 siblings, 2 replies; 11+ messages in thread
From: Borislav Petkov @ 2015-01-22 15:30 UTC (permalink / raw)
  To: Boris Ostrovsky; +Cc: James Dingwall, linux-kernel

On Thu, Jan 22, 2015 at 09:53:04AM -0500, Boris Ostrovsky wrote:
> Alternatively, we could return an error (-EINVAL?) from
> microcode_init() when either of these two conditions is true.

Yeah, this should be the right fix.

James, does that fix your issue? (It should.)

---
diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
index 15c29096136b..36a83617eb21 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -552,7 +552,7 @@ static int __init microcode_init(void)
 	int error;
 
 	if (paravirt_enabled() || dis_ucode_ldr)
-		return 0;
+		return -EINVAL;
 
 	if (c->x86_vendor == X86_VENDOR_INTEL)
 		microcode_ops = init_intel_microcode();
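
For illustration, a small user-space model of why returning an error from the init function avoids the oops: a module whose init fails is not left loaded, so its exit function never runs. Everything here (`model_module`, `model_insmod`, `model_rmmod`, `skip_loader`) is hypothetical and only mimics the load/unload rules. It is a sketch, not the kernel's module loader.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical user-space model of a module with init/exit hooks. */
struct model_module {
    int  (*init)(void);
    void (*exit)(void);
    bool loaded;
};

static bool device_registered;  /* set only by a full init */
static int  bogus_deregisters;  /* exit calls that had nothing to undo */
static bool skip_loader = true; /* stands in for paravirt_enabled() || dis_ucode_ldr */

static int init_returns_zero(void)
{
    if (skip_loader)
        return 0;               /* old behaviour: "success", nothing registered */
    device_registered = true;
    return 0;
}

static int init_returns_einval(void)
{
    if (skip_loader)
        return -EINVAL;         /* patched behaviour: the load fails */
    device_registered = true;
    return 0;
}

static void model_exit(void)
{
    if (!device_registered)
        bogus_deregisters++;    /* the real driver oopses at this point */
    device_registered = false;
}

/* Load: the module only stays resident if init succeeds. */
static int model_insmod(struct model_module *m)
{
    int ret = m->init();

    m->loaded = (ret == 0);
    return ret;
}

/* Unload: exit runs only for a module that is actually loaded. */
static int model_rmmod(struct model_module *m)
{
    if (!m->loaded)
        return -ENOENT;
    m->exit();
    m->loaded = false;
    return 0;
}
```

With `init_returns_zero`, a later `model_rmmod()` reaches `model_exit()` with nothing registered. With `init_returns_einval`, `model_insmod()` fails and `model_rmmod()` has nothing to unload.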

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--


* Re: Linux 3.18.2 / xen 4.4.1 dom0 - microcode oops
  2015-01-22 15:30       ` Borislav Petkov
@ 2015-01-22 17:43         ` James Dingwall
  2015-01-22 17:58           ` Borislav Petkov
  2015-01-27 22:12         ` Borislav Petkov
  1 sibling, 1 reply; 11+ messages in thread
From: James Dingwall @ 2015-01-22 17:43 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Boris Ostrovsky, linux-kernel

On Thu, Jan 22, 2015 at 04:30:02PM +0100, Borislav Petkov wrote:
> On Thu, Jan 22, 2015 at 09:53:04AM -0500, Boris Ostrovsky wrote:
> > Alternatively, we could return an error (-EINVAL?) from
> > microcode_init() when either of these two conditions is true.
> 
> Yeah, this should be the right fix.
> 
> James, does that fix your issue? (It should.)
> 
> ---
> diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
> index 15c29096136b..36a83617eb21 100644
> --- a/arch/x86/kernel/cpu/microcode/core.c
> +++ b/arch/x86/kernel/cpu/microcode/core.c
> @@ -552,7 +552,7 @@ static int __init microcode_init(void)
>  	int error;
>  
>  	if (paravirt_enabled() || dis_ucode_ldr)
> -		return 0;
> +		return -EINVAL;
>  
>  	if (c->x86_vendor == X86_VENDOR_INTEL)
>  		microcode_ops = init_intel_microcode();
> 
> -- 

This patch solves it for me on my dom0 with 3.18.3; now there is nothing
printed at all from the microcode driver, which doesn't seem surprising
given where the return is.  I'll check it on bare metal at the next
opportunity, but from my understanding of what is happening there I
don't see that it should have any impact at all.

Thanks,
James


* Re: Linux 3.18.2 / xen 4.4.1 dom0 - microcode oops
  2015-01-22 17:43         ` James Dingwall
@ 2015-01-22 17:58           ` Borislav Petkov
  2015-01-22 18:09             ` Boris Ostrovsky
  0 siblings, 1 reply; 11+ messages in thread
From: Borislav Petkov @ 2015-01-22 17:58 UTC (permalink / raw)
  To: James Dingwall; +Cc: Boris Ostrovsky, linux-kernel

On Thu, Jan 22, 2015 at 05:43:15PM +0000, James Dingwall wrote:
> This patch solves it for me on my dom0 with 3.18.3, now there is
> nothing printed at all from the microcode driver which doesn't seem
> surprising given where the return is.

Yap, xen does/will update microcode differently...

> I'll check it on bare metal at the next opportunity but from my
> understanding of what is happening there I don't see that it should
> have any impact at all.

Yeah, it shouldn't have any effect on baremetal in the sense that it
should load properly there. And it is a fix for baremetal too, as Boris
pointed out.

I'm still curious as to why it says this on your machine:

[  176.855832] microcode: CPU0 sig=0x10676, pf=0x1, revision=0x60c
[  176.855844] microcode: CPU0 update to revision 0x60f failed

?

This basically says that we do try to update with the patch but the
hardware doesn't accept it.

Is this new? Did it ever update microcode properly?

Where do you get the microcode for that machine?

Can you run the scriptlet below as root and send me the results? You'd
need the msr-tools package and the msr.ko kernel module loaded. Ask if
you need help.

Also, please send me a full dmesg with "ignore_loglevel log_buf_len=16M
debug" on the kernel command line, privately is fine too.

Thanks.

--
#!/bin/bash

echo "/proc/cpuinfo: "
cat /proc/cpuinfo

echo; echo "dmesg: "
dmesg | grep -i microcode
modprobe msr 2>/dev/null

echo ; echo "MSRs: (0x8b) "
rdmsr --all 0x8b
--
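
For illustration, a sketch of how one line of the `rdmsr --all 0x8b` output can be turned into a microcode revision. It assumes each line is a single hexadecimal 64-bit value, as rdmsr usually prints it, and relies on the revision sitting in the upper 32 bits of MSR 0x8b. The helper name `ucode_rev_from_rdmsr_line` is hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse one hexadecimal line of rdmsr output for MSR 0x8b
 * (IA32_BIOS_SIGN_ID) and return the microcode revision, which is
 * held in bits 63:32 of the register value.
 */
static uint32_t ucode_rev_from_rdmsr_line(const char *hex_line)
{
    uint64_t value = strtoull(hex_line, NULL, 16);

    return (uint32_t)(value >> 32);
}
```

A line reading `60c00000000` would correspond to revision 0x60c, the value shown in the dmesg output quoted above.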

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--


* Re: Linux 3.18.2 / xen 4.4.1 dom0 - microcode oops
  2015-01-22 17:58           ` Borislav Petkov
@ 2015-01-22 18:09             ` Boris Ostrovsky
  0 siblings, 0 replies; 11+ messages in thread
From: Boris Ostrovsky @ 2015-01-22 18:09 UTC (permalink / raw)
  To: Borislav Petkov, James Dingwall; +Cc: linux-kernel

On 01/22/2015 12:58 PM, Borislav Petkov wrote:
> On Thu, Jan 22, 2015 at 05:43:15PM +0000, James Dingwall wrote:
>> This patch solves it for me on my dom0 with 3.18.3, now there is
>> nothing printed at all from the microcode driver which doesn't seem
>> surprising given where the return is.
> Yap, xen does/will update microcode differently...
>
>> I'll check it on bare metal at the next opportunity but from my
>> understanding of what is happening there I don't see that it should
>> have any impact at all.
> Yeah, it shouldn't have any effect on baremetal in the sense that it
> should load properly there. And it is a fix for baremetal too, as Boris
> pointed out.
>
> I'm still curious as to why does it say this on your machine:
>
> [  176.855832] microcode: CPU0 sig=0x10676, pf=0x1, revision=0x60c
> [  176.855844] microcode: CPU0 update to revision 0x60f failed


If this was on dom0 (i.e., a Xen PV guest) then it's understandable, since
in that case MSR writes would have been trapped by the hypervisor and
not processed any further.

James can probably see in the hypervisor log ('xl dmesg') something like

(XEN) traps.c:2579:d0v0 Domain attempted WRMSR 0000000000000079 from 0x0000000000000000 to 0x0000000000000001.

(MSR reads should proceed fine though).
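
For illustration, a user-space model of why a dropped MSR write shows up as "update to revision 0x60f failed": the driver triggers the update through MSR 0x79, then reads the revision back from MSR 0x8b and compares it with the target. The `model_*` names are hypothetical, and `model_wrmsr` takes the target revision directly as a simplification. The real write passes the address of the microcode data.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSR_IA32_UCODE_WRITE 0x79u /* IA32_BIOS_UPDT_TRIG */
#define MSR_IA32_UCODE_REV   0x8bu /* IA32_BIOS_SIGN_ID, revision in bits 63:32 */

/* Hypothetical model of one CPU as seen by the guest kernel. */
struct model_cpu {
    uint32_t revision;
    bool     writes_dropped; /* true models a PV guest whose WRMSR is discarded */
};

/* Simplified write: a real update passes a pointer to the microcode data. */
static void model_wrmsr(struct model_cpu *cpu, uint32_t msr, uint32_t new_rev)
{
    if (cpu->writes_dropped)
        return;                  /* trapped and ignored by the hypervisor */
    if (msr == MSR_IA32_UCODE_WRITE)
        cpu->revision = new_rev;
}

/* Reads go through in both cases. */
static uint64_t model_rdmsr(const struct model_cpu *cpu, uint32_t msr)
{
    return msr == MSR_IA32_UCODE_REV ? (uint64_t)cpu->revision << 32 : 0;
}

/* Returns true if the revision read back matches the requested one. */
static bool model_apply_microcode(struct model_cpu *cpu, uint32_t target_rev)
{
    model_wrmsr(cpu, MSR_IA32_UCODE_WRITE, target_rev);
    return (uint32_t)(model_rdmsr(cpu, MSR_IA32_UCODE_REV) >> 32) == target_rev;
}
```

With `writes_dropped` set, the revision stays at 0x60c after a request for 0x60f, so the read-back check fails even though nothing is wrong with the microcode file itself.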

-boris


* Re: Linux 3.18.2 / xen 4.4.1 dom0 - microcode oops
  2015-01-22 15:30       ` Borislav Petkov
  2015-01-22 17:43         ` James Dingwall
@ 2015-01-27 22:12         ` Borislav Petkov
  2015-01-27 22:55           ` Boris Ostrovsky
  1 sibling, 1 reply; 11+ messages in thread
From: Borislav Petkov @ 2015-01-27 22:12 UTC (permalink / raw)
  To: Boris Ostrovsky; +Cc: James Dingwall, linux-kernel

Hey Boris,

On Thu, Jan 22, 2015 at 04:30:02PM +0100, Borislav Petkov wrote:
> On Thu, Jan 22, 2015 at 09:53:04AM -0500, Boris Ostrovsky wrote:
> > Alternatively, we could return an error (-EINVAL?) from
> > microcode_init() when either of these two conditions is true.
> 
> Yeah, this should be the right fix.
> 
> James, does that fix your issue? (It should.)
> 
> ---
> diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
> index 15c29096136b..36a83617eb21 100644
> --- a/arch/x86/kernel/cpu/microcode/core.c
> +++ b/arch/x86/kernel/cpu/microcode/core.c
> @@ -552,7 +552,7 @@ static int __init microcode_init(void)
>  	int error;
>  
>  	if (paravirt_enabled() || dis_ucode_ldr)
> -		return 0;
> +		return -EINVAL;
>  
>  	if (c->x86_vendor == X86_VENDOR_INTEL)
>  		microcode_ops = init_intel_microcode();

would you do the honor and write a proper patch? You found the bug so...
:-D

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--


* Re: Linux 3.18.2 / xen 4.4.1 dom0 - microcode oops
  2015-01-27 22:12         ` Borislav Petkov
@ 2015-01-27 22:55           ` Boris Ostrovsky
  2015-01-27 23:12             ` Borislav Petkov
  0 siblings, 1 reply; 11+ messages in thread
From: Boris Ostrovsky @ 2015-01-27 22:55 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: James Dingwall, linux-kernel

On 01/27/2015 05:12 PM, Borislav Petkov wrote:
> Hey Boris,
>
> On Thu, Jan 22, 2015 at 04:30:02PM +0100, Borislav Petkov wrote:
>> On Thu, Jan 22, 2015 at 09:53:04AM -0500, Boris Ostrovsky wrote:
>>> Alternatively, we could return an error (-EINVAL?) from
>>> microcode_init() when either of these two conditions is true.
>> Yeah, this should be the right fix.
>>
>> James, does that fix your issue? (It should.)
>>
>> ---
>> diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
>> index 15c29096136b..36a83617eb21 100644
>> --- a/arch/x86/kernel/cpu/microcode/core.c
>> +++ b/arch/x86/kernel/cpu/microcode/core.c
>> @@ -552,7 +552,7 @@ static int __init microcode_init(void)
>>   	int error;
>>   
>>   	if (paravirt_enabled() || dis_ucode_ldr)
>> -		return 0;
>> +		return -EINVAL;
>>   
>>   	if (c->x86_vendor == X86_VENDOR_INTEL)
>>   		microcode_ops = init_intel_microcode();
> would you do the honor and write a proper patch? You found the bug so...
> :-D


Will do. This needs to go to stable as well, right?

-boris



* Re: Linux 3.18.2 / xen 4.4.1 dom0 - microcode oops
  2015-01-27 22:55           ` Boris Ostrovsky
@ 2015-01-27 23:12             ` Borislav Petkov
  0 siblings, 0 replies; 11+ messages in thread
From: Borislav Petkov @ 2015-01-27 23:12 UTC (permalink / raw)
  To: Boris Ostrovsky; +Cc: James Dingwall, linux-kernel

On Tue, Jan 27, 2015 at 05:55:17PM -0500, Boris Ostrovsky wrote:
> Will do. This needs to go to stable as well, right?

Right, add "# 3.18" as a comment after the CC:stable line as I haven't
done the backports for the older stable kernels yet. I'll submit it for
the older ones myself, along with the rest of the commits.

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--
