* unable to boot windows with 256 cpus
@ 2020-07-15 17:45 Igor Mammedov
  2020-07-15 18:28 ` Peter Xu
  0 siblings, 1 reply; 3+ messages in thread
From: Igor Mammedov @ 2020-07-15 17:45 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, alex.williamson, peterx

While testing ACPI CPU hotplug changes I stumbled on a BSOD: when
QEMU is configured with 256 CPUs, Windows Server 2012 R2 x64 fails to boot
with bugcheck 5C.


qemu-system-x86_64 -m 4G -smp 2,sockets=64,cores=4,maxcpus=256 -M q35,kernel-irqchip=split  -enable-kvm -device intel-iommu,intremap=on,eim=on ws2012r2x64DCchk.qcow2
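
A minimal sketch of the topology arithmetic behind this command line, for readers following along. The helper name parse_smp is hypothetical, and the sketch assumes APIC IDs are assigned contiguously from 0 (which should hold here because sockets and cores are powers of two, but that is an assumption, not something stated in this thread). In xAPIC mode the 8-bit destination 0xFF is the broadcast ID, so a CPU with APIC ID 255 is only addressable in x2APIC mode, which is presumably why intremap=on,eim=on is passed.

```python
def parse_smp(arg: str) -> dict:
    """Parse a QEMU -smp value such as '2,sockets=64,cores=4,maxcpus=256'."""
    opts = {}
    for field in arg.split(","):
        if "=" in field:
            key, value = field.split("=", 1)
        else:
            # A bare number is the count of CPUs present at boot.
            key, value = "cpus", field
        opts[key] = int(value)
    return opts


smp = parse_smp("2,sockets=64,cores=4,maxcpus=256")

# threads is not given on the command line; derive it from the other values.
threads = smp["maxcpus"] // (smp["sockets"] * smp["cores"])

# Assumption: APIC IDs run contiguously from 0 to maxcpus - 1.
highest_apic_id = smp["maxcpus"] - 1

# 0xFF is the broadcast destination in xAPIC mode, so an ID of 255 or
# above can only be targeted with x2APIC (32-bit IDs).
needs_x2apic = highest_apic_id >= 0xFF

print(f"boot cpus={smp['cpus']} threads={threads} "
      f"highest APIC ID={highest_apic_id} needs x2APIC={needs_x2apic}")
```

With maxcpus=255 the same arithmetic gives a highest ID of 254, which would be one way to check whether the failure is tied to crossing the xAPIC limit.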

Looking at the stack trace, it seems that it fails when trying to initialize the IOMMU:

hal_interrupt_remapping_setup_failure_nt!initbootprocessor

Any idea what to try to figure out what QEMU is missing wrt intremap?

PS:
WS2016 boots eventually, but CPU hotplug doesn't work; the symptoms (not yet confirmed) look like the SCI isn't being delivered.
With RHEL76 the same config works fine.


---
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************
 
HAL_INITIALIZATION_FAILED (5c)
Arguments:
Arg1: 0000000000007000
Arg2: 0000000000000618
Arg3: ffffffffc00000bb
Arg4: 0000000000000000
 
Debugging Details:
------------------
 
 
DEFAULT_BUCKET_ID:  WIN8_DRIVER_FAULT
 
BUGCHECK_STR:  0x5C
 
CURRENT_IRQL:  0
 
ANALYSIS_VERSION: 6.3.9600.16384 (debuggers(dbg).130821-1623) amd64fre
 
DPC_STACK_BASE:  FFFFF80225BE3FB0
 
LAST_CONTROL_TRANSFER:  from fffff80221aa9714 to fffff80221c95801
 
STACK_TEXT:  
fffff802`25bdbcd8 fffff802`21aa9714 : fffff6fc`01114af8 fffff6fb`7e0088a0 fffff6fb`7dbf0040 fffff6fb`7dbedf80 : nt!DbgBreakPointWithStatus+0x1
fffff802`25bdbce0 fffff802`21aaabff : 00000000`00000004 00000000`00000000 00000000`00000000 ffffffff`c00000bb : nt!KiBugCheckDebugBreak+0x14
fffff802`25bdbd50 fffff802`21c8fb94 : 00000000`0000005c 00000000`00007000 00000000`00000618 ffffffff`c00000bb : nt!KeBugCheck2+0xdd7
fffff802`25bdc460 fffff802`2297c94a : 00000000`0000005c 00000000`00007000 00000000`00000618 ffffffff`c00000bb : nt!KeBugCheckEx+0x104
fffff802`25bdc4a0 fffff802`2297c9ac : fffff802`20c735d0 00000000`00000000 00000000`00000009 fffff802`20c735f0 : hal!HalpInitializeInterrupts+0x406
fffff802`25bdc4f0 fffff802`229728aa : fffff802`20c735d0 fffff802`229742ce fffff802`20c735d0 00000000`00000007 : hal!HalpInterruptInitDiscard+0x3c
fffff802`25bdc520 fffff802`2297221c : fffff802`00000008 00000000`00000000 fffff802`20c735d0 00000000`00000008 : hal!HalpInterruptInitSystem+0xf6
fffff802`25bdc560 fffff802`2297c537 : fffff802`00000006 fffff802`0000000c fffff802`20c735d0 fffff802`20c735d0 : hal!HalpInitSystemHelper+0x44
fffff802`25bdc5c0 fffff802`2285467d : fffff802`20c735d0 fffff802`20c735d0 fffff802`25bdc6f0 00000000`00010228 : hal!HalpInitSystemPhase0+0x1b
fffff802`25bdc5f0 fffff802`222d065f : fffff802`20c735d0 00000000`00000000 00000000`00000000 fffff802`21cf6180 : nt!InitBootProcessor+0x265
fffff802`25bdc870 fffff802`222e4073 : fffff802`222ae400 fffff802`222aeac0 fffff802`25bdd000 fffff802`21cf6180 : nt!KiInitializeKernel+0xd83
fffff802`25bdcc10 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemStartup+0x193
 
 
STACK_COMMAND:  kb
 
FOLLOWUP_IP:
nt!InitBootProcessor+265
fffff802`2285467d 84c0            test    al,al
 
SYMBOL_STACK_INDEX:  9
 
SYMBOL_NAME:  nt!InitBootProcessor+265
 
FOLLOWUP_NAME:  MachineOwner
 
MODULE_NAME: nt
 
IMAGE_NAME:  ntkrnlmp.exe
 
DEBUG_FLR_IMAGE_TIMESTAMP:  5215d150
 
IMAGE_VERSION:  6.3.9600.16384
 
BUCKET_ID_FUNC_OFFSET:  265
 
FAILURE_BUCKET_ID:  0x5C_HAL_INTERRUPT_REMAPPING_SETUP_FAILURE_nt!InitBootProcessor
 
BUCKET_ID:  0x5C_HAL_INTERRUPT_REMAPPING_SETUP_FAILURE_nt!InitBootProcessor
 
ANALYSIS_SOURCE:  KM
 
FAILURE_ID_HASH_STRING:  km:0x5c_hal_interrupt_remapping_setup_failure_nt!initbootprocessor
 
FAILURE_ID_HASH:  {85a33624-5d0e-3044-bbaa-0bdba50d221b}
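
A small sketch for reading the bugcheck arguments above. Arg3 (ffffffffc00000bb) looks like a 32-bit NTSTATUS sign-extended to 64 bits; treating it that way is an assumption, since the dump does not label the argument. The value 0xC00000BB is STATUS_NOT_SUPPORTED. The lookup table here is deliberately tiny and illustrative, not a complete NTSTATUS list, and the meanings of Arg1 and Arg2 are left uninterpreted.

```python
# Illustrative table only: a single well-known NTSTATUS value.
NTSTATUS_NAMES = {
    0xC00000BB: "STATUS_NOT_SUPPORTED",
}


def to_ntstatus(arg: int) -> int:
    """Drop the sign extension and keep the low 32 bits."""
    return arg & 0xFFFFFFFF


arg3 = 0xFFFFFFFFC00000BB          # Arg3 from the bugcheck analysis
status = to_ntstatus(arg3)
name = NTSTATUS_NAMES.get(status, "unknown")

# The top two bits of an NTSTATUS encode severity; 3 means error.
severity = status >> 30

print(f"Arg3 -> 0x{status:08X} ({name}), severity={severity}")
```

If that reading is right, the HAL is reporting that something it needs for interrupt remapping is not supported, rather than that an operation was attempted and failed.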




* Re: unable to boot windows with 256 cpus
  2020-07-15 17:45 unable to boot windows with 256 cpus Igor Mammedov
@ 2020-07-15 18:28 ` Peter Xu
  2020-07-16 10:20   ` Igor Mammedov
  0 siblings, 1 reply; 3+ messages in thread
From: Peter Xu @ 2020-07-15 18:28 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: Paolo Bonzini, alex.williamson, qemu-devel

On Wed, Jul 15, 2020 at 07:45:13PM +0200, Igor Mammedov wrote:
> While testing ACPI CPU hotplug changes I stumbled on a BSOD: when
> QEMU is configured with 256 CPUs, Windows Server 2012 R2 x64 fails to boot
> with bugcheck 5C.
> 
> 
> qemu-system-x86_64 -m 4G -smp 2,sockets=64,cores=4,maxcpus=256 -M q35,kernel-irqchip=split  -enable-kvm -device intel-iommu,intremap=on,eim=on ws2012r2x64DCchk.qcow2
> 
> Looking at the stack trace, it seems that it fails when trying to initialize the IOMMU:
> 
> hal_interrupt_remapping_setup_failure_nt!initbootprocessor
> 
> Any idea what to try to figure out what QEMU is missing wrt intremap?
> 
> PS:
> WS2016 boots eventually, but CPU hotplug doesn't work; the symptoms (not yet confirmed) look like the SCI isn't being delivered.
> With RHEL76 the same config works fine.

Igor,

Could you try this again but with vtd tracepoints enabled?

  -trace enable="vtd_*"

I think we don't need to capture all of the trace output; capturing it only up
to the point where the HAL error is triggered should be enough.

Thanks,

-- 
Peter Xu




* Re: unable to boot windows with 256 cpus
  2020-07-15 18:28 ` Peter Xu
@ 2020-07-16 10:20   ` Igor Mammedov
  0 siblings, 0 replies; 3+ messages in thread
From: Igor Mammedov @ 2020-07-16 10:20 UTC (permalink / raw)
  To: Peter Xu; +Cc: Paolo Bonzini, alex.williamson, qemu-devel

On Wed, 15 Jul 2020 14:28:19 -0400
Peter Xu <peterx@redhat.com> wrote:

> On Wed, Jul 15, 2020 at 07:45:13PM +0200, Igor Mammedov wrote:
> > While testing ACPI CPU hotplug changes I stumbled on a BSOD: when
> > QEMU is configured with 256 CPUs, Windows Server 2012 R2 x64 fails to boot
> > with bugcheck 5C.
> > 
> > 
> > qemu-system-x86_64 -m 4G -smp 2,sockets=64,cores=4,maxcpus=256 -M q35,kernel-irqchip=split  -enable-kvm -device intel-iommu,intremap=on,eim=on ws2012r2x64DCchk.qcow2
> > 
> > Looking at the stack trace, it seems that it fails when trying to initialize the IOMMU:
> > 
> > hal_interrupt_remapping_setup_failure_nt!initbootprocessor
> > 
> > Any idea what to try to figure out what QEMU is missing wrt intremap?
> > 
> > PS:
> > WS2016 boots eventually, but CPU hotplug doesn't work; the symptoms (not yet confirmed) look like the SCI isn't being delivered.
> > With RHEL76 the same config works fine.
> 
> Igor,
> 
> Could you try this again but with vtd tracepoints enabled?
> 
>   -trace enable="vtd_*"
> 
> I think we don't need to capture all of the trace output; capturing it only up
> to the point where the HAL error is triggered should be enough.

Here is all it outputs:

480969@1594894035.927040:vtd_context_cache_reset 
480969@1594894035.928033:vtd_switch_address_space Device 00:00.0 switching address space (iommu enabled=0)
480969@1594894035.934110:vtd_switch_address_space Device 00:01.0 switching address space (iommu enabled=0)
480969@1594894035.935382:vtd_switch_address_space Device 00:02.0 switching address space (iommu enabled=0)
480969@1594894035.936661:vtd_switch_address_space Device 00:1f.0 switching address space (iommu enabled=0)
480969@1594894035.937957:vtd_switch_address_space Device 00:1f.2 switching address space (iommu enabled=0)
480969@1594894035.939258:vtd_switch_address_space Device 00:1f.3 switching address space (iommu enabled=0)
480969@1594894035.950198:vtd_context_cache_reset 
480969@1594894035.950213:vtd_switch_address_space Device 00:00.0 switching address space (iommu enabled=0)
480969@1594894035.950219:vtd_switch_address_space Device 00:01.0 switching address space (iommu enabled=0)
480969@1594894035.950224:vtd_switch_address_space Device 00:02.0 switching address space (iommu enabled=0)
480969@1594894035.950229:vtd_switch_address_space Device 00:1f.0 switching address space (iommu enabled=0)
480969@1594894035.950234:vtd_switch_address_space Device 00:1f.2 switching address space (iommu enabled=0)
480969@1594894035.950238:vtd_switch_address_space Device 00:1f.3 switching address space (iommu enabled=0)
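
A rough sketch for summarizing a trace like the one above, assuming each line has the pid@timestamp:event args layout seen here. The function name summarize and the regular expression are illustrative, not part of QEMU. It counts events by name and collects every value of "iommu enabled=" that appears.

```python
import re
from collections import Counter

# Layout assumed from the lines above: <pid>@<timestamp>:<event> <args>
TRACE_LINE = re.compile(
    r"^(?P<pid>\d+)@(?P<ts>\d+\.\d+):(?P<event>\w+)\s?(?P<args>.*)$"
)


def summarize(lines):
    """Return (event counts, set of observed 'iommu enabled' values)."""
    events = Counter()
    enabled_values = set()
    for line in lines:
        m = TRACE_LINE.match(line.strip())
        if not m:
            continue  # skip blank or non-trace lines
        events[m["event"]] += 1
        flag = re.search(r"iommu enabled=(\d)", m["args"])
        if flag:
            enabled_values.add(int(flag.group(1)))
    return events, enabled_values


# Three lines copied from the trace above as sample input.
sample = [
    "480969@1594894035.927040:vtd_context_cache_reset ",
    "480969@1594894035.928033:vtd_switch_address_space Device 00:00.0 switching address space (iommu enabled=0)",
    "480969@1594894035.934110:vtd_switch_address_space Device 00:01.0 switching address space (iommu enabled=0)",
]

events, enabled_values = summarize(sample)
for name, count in sorted(events.items()):
    print(f"{name}: {count}")
print("iommu enabled values seen:", sorted(enabled_values))
```

On the full trace above, only vtd_context_cache_reset and vtd_switch_address_space show up, always with iommu enabled=0, and both batches are consistent with device reset. That suggests the guest fails before it programs the vIOMMU at all, although the trace alone does not prove it.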

> 
> Thanks,
> 


