* HVM guest only bring up a single vCPU
@ 2021-08-26 21:00 Julien Grall
  2021-08-26 22:51 ` Marek Marczykowski-Górecki
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Julien Grall @ 2021-08-26 21:00 UTC
  To: Andrew Cooper; +Cc: xen-devel, Jan Beulich

Hi Andrew,

While doing more testing today, I noticed that only one vCPU would be 
brought up with HVM guest with Xen 4.16 on my setup (QEMU):

[    1.122180] ================================================================================
[    1.122180] UBSAN: shift-out-of-bounds in oss/linux/arch/x86/kernel/apic/apic.c:2362:13
[    1.122180] shift exponent -1 is negative
[    1.122180] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-rc7+ #304
[    1.122180] Hardware name: Xen HVM domU, BIOS 4.16-unstable 06/07/2021
[    1.122180] Call Trace:
[    1.122180]  dump_stack_lvl+0x56/0x6c
[    1.122180]  ubsan_epilogue+0x5/0x50
[    1.122180]  __ubsan_handle_shift_out_of_bounds+0xfa/0x140
[    1.122180]  ? cgroup_kill_write+0x4d/0x150
[    1.122180]  ? cpu_up+0x6e/0x100
[    1.122180]  ? _raw_spin_unlock_irqrestore+0x30/0x50
[    1.122180]  ? rcu_read_lock_held_common+0xe/0x40
[    1.122180]  ? irq_shutdown_and_deactivate+0x11/0x30
[    1.122180]  ? lock_release+0xc7/0x2a0
[    1.122180]  ? apic_id_is_primary_thread+0x56/0x60
[    1.122180]  apic_id_is_primary_thread+0x56/0x60
[    1.122180]  cpu_up+0xbd/0x100
[    1.122180]  bringup_nonboot_cpus+0x4f/0x60
[    1.122180]  smp_init+0x26/0x74
[    1.122180]  kernel_init_freeable+0x183/0x32d
[    1.122180]  ? _raw_spin_unlock_irq+0x24/0x40
[    1.122180]  ? rest_init+0x330/0x330
[    1.122180]  kernel_init+0x17/0x140
[    1.122180]  ? rest_init+0x330/0x330
[    1.122180]  ret_from_fork+0x22/0x30
[    1.122244] ================================================================================
[    1.123176] installing Xen timer for CPU 1
[    1.123369] x86: Booting SMP configuration:
[    1.123409] .... node  #0, CPUs:      #1
[    1.154400] Callback from call_rcu_tasks_trace() invoked.
[    1.154491] smp: Brought up 1 node, 1 CPU
[    1.154526] smpboot: Max logical packages: 2
[    1.154570] smpboot: Total of 1 processors activated (5999.99 BogoMIPS)

I have tried a PV guest (same setup) and the kernel could bring up all 
the vCPUs.

Digging down, Linux will set smp_num_siblings to 0 (via
detect_ht_early()) and as a result will skip all the CPUs. The value is
retrieved from a CPUID leaf. So it sounds like we don't set the leaf
correctly.
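
To illustrate why a sibling count of 0 produces the "shift exponent -1"
report, here is a minimal C sketch. It assumes the SMT mask is built with a
shift by fls(smp_num_siblings) - 1; that assumption is consistent with the
UBSAN message above but is not quoted from the kernel source. The names
fls_sketch() and smt_shift_exponent() are hypothetical. The exponent is
returned rather than applied, because shifting by a negative amount is
undefined behaviour in C.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel's fls(): 1-based position of the
 * highest set bit, or 0 when no bit is set.
 */
static int fls_sketch(uint32_t x)
{
    int pos = 0;

    while (x) {
        pos++;
        x >>= 1;
    }
    return pos;
}

/*
 * Exponent that would be used to build the SMT mask for a given sibling
 * count (assumed formula: fls(siblings) - 1). It is only computed here,
 * never used in a shift, so the sketch itself stays well defined.
 */
static int smt_shift_exponent(uint32_t siblings)
{
    return fls_sketch(siblings) - 1;
}
```

With a sibling count of 1 or 2 the exponent is 0 or 1; only a count of 0
yields a negative exponent.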

FWIW, I have also tried on Xen 4.11 and could spot the same issue. Does 
this ring any bell to you?

Cheers,

-- 
Julien Grall


* Re: HVM guest only bring up a single vCPU
  2021-08-26 21:00 HVM guest only bring up a single vCPU Julien Grall
@ 2021-08-26 22:51 ` Marek Marczykowski-Górecki
  2021-08-27 11:01   ` Julien Grall
  2021-08-26 23:42 ` Andrew Cooper
  2021-08-27  9:26 ` Jan Beulich
  2 siblings, 1 reply; 11+ messages in thread
From: Marek Marczykowski-Górecki @ 2021-08-26 22:51 UTC
  To: Julien Grall; +Cc: Andrew Cooper, xen-devel, Jan Beulich

On Thu, Aug 26, 2021 at 10:00:58PM +0100, Julien Grall wrote:
> Hi Andrew,
> 
> While doing more testing today, I noticed that only one vCPU would be
> brought up with HVM guest with Xen 4.16 on my setup (QEMU):
> 
> [    1.122180] ================================================================================
> [    1.122180] UBSAN: shift-out-of-bounds in
> oss/linux/arch/x86/kernel/apic/apic.c:2362:13
> [    1.122180] shift exponent -1 is negative
> [    1.122180] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-rc7+ #304
> [    1.122180] Hardware name: Xen HVM domU, BIOS 4.16-unstable 06/07/2021
> [    1.122180] Call Trace:
> [    1.122180]  dump_stack_lvl+0x56/0x6c
> [    1.122180]  ubsan_epilogue+0x5/0x50
> [    1.122180]  __ubsan_handle_shift_out_of_bounds+0xfa/0x140
> [    1.122180]  ? cgroup_kill_write+0x4d/0x150
> [    1.122180]  ? cpu_up+0x6e/0x100
> [    1.122180]  ? _raw_spin_unlock_irqrestore+0x30/0x50
> [    1.122180]  ? rcu_read_lock_held_common+0xe/0x40
> [    1.122180]  ? irq_shutdown_and_deactivate+0x11/0x30
> [    1.122180]  ? lock_release+0xc7/0x2a0
> [    1.122180]  ? apic_id_is_primary_thread+0x56/0x60
> [    1.122180]  apic_id_is_primary_thread+0x56/0x60
> [    1.122180]  cpu_up+0xbd/0x100
> [    1.122180]  bringup_nonboot_cpus+0x4f/0x60
> [    1.122180]  smp_init+0x26/0x74
> [    1.122180]  kernel_init_freeable+0x183/0x32d
> [    1.122180]  ? _raw_spin_unlock_irq+0x24/0x40
> [    1.122180]  ? rest_init+0x330/0x330
> [    1.122180]  kernel_init+0x17/0x140
> [    1.122180]  ? rest_init+0x330/0x330
> [    1.122180]  ret_from_fork+0x22/0x30
> [    1.122244] ================================================================================
> [    1.123176] installing Xen timer for CPU 1
> [    1.123369] x86: Booting SMP configuration:
> [    1.123409] .... node  #0, CPUs:      #1
> [    1.154400] Callback from call_rcu_tasks_trace() invoked.
> [    1.154491] smp: Brought up 1 node, 1 CPU
> [    1.154526] smpboot: Max logical packages: 2
> [    1.154570] smpboot: Total of 1 processors activated (5999.99 BogoMIPS)
> 
> I have tried a PV guest (same setup) and the kernel could bring up all the
> vCPUs.
> 
> Digging down, Linux will set smp_num_siblings to 0 (via detect_ht_early())
> and as a result will skip all the CPUs. The value is retrieved from a CPUID
> leaf. So it sounds like we don't set the leaf correctly.
> 
> FWIW, I have also tried on Xen 4.11 and could spot the same issue. Does this
> ring any bell to you?

Is it maybe this:
https://lore.kernel.org/xen-devel/20201106003529.391649-1-bmasney@redhat.com/T/#u
?

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

* Re: HVM guest only bring up a single vCPU
  2021-08-26 21:00 HVM guest only bring up a single vCPU Julien Grall
  2021-08-26 22:51 ` Marek Marczykowski-Górecki
@ 2021-08-26 23:42 ` Andrew Cooper
  2021-08-27  6:28   ` Jan Beulich
  2021-08-27  9:26 ` Jan Beulich
  2 siblings, 1 reply; 11+ messages in thread
From: Andrew Cooper @ 2021-08-26 23:42 UTC
  To: Julien Grall; +Cc: xen-devel, Jan Beulich

On 26/08/2021 22:00, Julien Grall wrote:
> Hi Andrew,
>
> While doing more testing today, I noticed that only one vCPU would be
> brought up with HVM guest with Xen 4.16 on my setup (QEMU):
>
> [    1.122180]
> ================================================================================
> [    1.122180] UBSAN: shift-out-of-bounds in
> oss/linux/arch/x86/kernel/apic/apic.c:2362:13
> [    1.122180] shift exponent -1 is negative
> [    1.122180] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-rc7+ #304
> [    1.122180] Hardware name: Xen HVM domU, BIOS 4.16-unstable 06/07/2021
> [    1.122180] Call Trace:
> [    1.122180]  dump_stack_lvl+0x56/0x6c
> [    1.122180]  ubsan_epilogue+0x5/0x50
> [    1.122180]  __ubsan_handle_shift_out_of_bounds+0xfa/0x140
> [    1.122180]  ? cgroup_kill_write+0x4d/0x150
> [    1.122180]  ? cpu_up+0x6e/0x100
> [    1.122180]  ? _raw_spin_unlock_irqrestore+0x30/0x50
> [    1.122180]  ? rcu_read_lock_held_common+0xe/0x40
> [    1.122180]  ? irq_shutdown_and_deactivate+0x11/0x30
> [    1.122180]  ? lock_release+0xc7/0x2a0
> [    1.122180]  ? apic_id_is_primary_thread+0x56/0x60
> [    1.122180]  apic_id_is_primary_thread+0x56/0x60
> [    1.122180]  cpu_up+0xbd/0x100
> [    1.122180]  bringup_nonboot_cpus+0x4f/0x60
> [    1.122180]  smp_init+0x26/0x74
> [    1.122180]  kernel_init_freeable+0x183/0x32d
> [    1.122180]  ? _raw_spin_unlock_irq+0x24/0x40
> [    1.122180]  ? rest_init+0x330/0x330
> [    1.122180]  kernel_init+0x17/0x140
> [    1.122180]  ? rest_init+0x330/0x330
> [    1.122180]  ret_from_fork+0x22/0x30
> [    1.122244]
> ================================================================================
> [    1.123176] installing Xen timer for CPU 1
> [    1.123369] x86: Booting SMP configuration:
> [    1.123409] .... node  #0, CPUs:      #1
> [    1.154400] Callback from call_rcu_tasks_trace() invoked.
> [    1.154491] smp: Brought up 1 node, 1 CPU
> [    1.154526] smpboot: Max logical packages: 2
> [    1.154570] smpboot: Total of 1 processors activated (5999.99
> BogoMIPS)
>
> I have tried a PV guest (same setup) and the kernel could bring up all
> the vCPUs.
>
> Digging down, Linux will set smp_num_siblings to 0 (via
> detect_ht_early()) and as a result will skip all the CPUs. The value
> is retrieved from a CPUID leaf. So it sounds like we don't set the
> leaf correctly.
>
> FWIW, I have also tried on Xen 4.11 and could spot the same issue.
> Does this ring any bell to you?

The CPUID data we give to guests is generally nonsense when it comes to
topology.  By any chance does the hardware you're booting this on not
have hyperthreading enabled/active to begin with?

Fixing this is on the todo list, but it needs libxl to start using
policy objects (series for the next phase of this still pending on
xen-devel).  Exactly how you represent the topology to the guest
correctly depends on the vendor and rough generation - I believe there
are 5 different algorithms to use, and for AMD in particular, it even
depends on how many IO-APICs are visible in the guest.

~Andrew



* Re: HVM guest only bring up a single vCPU
  2021-08-26 23:42 ` Andrew Cooper
@ 2021-08-27  6:28   ` Jan Beulich
  2021-08-27 10:35     ` Julien Grall
  0 siblings, 1 reply; 11+ messages in thread
From: Jan Beulich @ 2021-08-27  6:28 UTC
  To: Andrew Cooper, Julien Grall; +Cc: xen-devel

On 27.08.2021 01:42, Andrew Cooper wrote:
> On 26/08/2021 22:00, Julien Grall wrote:
>> Hi Andrew,
>>
>> While doing more testing today, I noticed that only one vCPU would be
>> brought up with HVM guest with Xen 4.16 on my setup (QEMU):
>>
>> [    1.122180]
>> ================================================================================
>> [    1.122180] UBSAN: shift-out-of-bounds in
>> oss/linux/arch/x86/kernel/apic/apic.c:2362:13
>> [    1.122180] shift exponent -1 is negative
>> [    1.122180] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-rc7+ #304
>> [    1.122180] Hardware name: Xen HVM domU, BIOS 4.16-unstable 06/07/2021
>> [    1.122180] Call Trace:
>> [    1.122180]  dump_stack_lvl+0x56/0x6c
>> [    1.122180]  ubsan_epilogue+0x5/0x50
>> [    1.122180]  __ubsan_handle_shift_out_of_bounds+0xfa/0x140
>> [    1.122180]  ? cgroup_kill_write+0x4d/0x150
>> [    1.122180]  ? cpu_up+0x6e/0x100
>> [    1.122180]  ? _raw_spin_unlock_irqrestore+0x30/0x50
>> [    1.122180]  ? rcu_read_lock_held_common+0xe/0x40
>> [    1.122180]  ? irq_shutdown_and_deactivate+0x11/0x30
>> [    1.122180]  ? lock_release+0xc7/0x2a0
>> [    1.122180]  ? apic_id_is_primary_thread+0x56/0x60
>> [    1.122180]  apic_id_is_primary_thread+0x56/0x60
>> [    1.122180]  cpu_up+0xbd/0x100
>> [    1.122180]  bringup_nonboot_cpus+0x4f/0x60
>> [    1.122180]  smp_init+0x26/0x74
>> [    1.122180]  kernel_init_freeable+0x183/0x32d
>> [    1.122180]  ? _raw_spin_unlock_irq+0x24/0x40
>> [    1.122180]  ? rest_init+0x330/0x330
>> [    1.122180]  kernel_init+0x17/0x140
>> [    1.122180]  ? rest_init+0x330/0x330
>> [    1.122180]  ret_from_fork+0x22/0x30
>> [    1.122244]
>> ================================================================================
>> [    1.123176] installing Xen timer for CPU 1
>> [    1.123369] x86: Booting SMP configuration:
>> [    1.123409] .... node  #0, CPUs:      #1
>> [    1.154400] Callback from call_rcu_tasks_trace() invoked.
>> [    1.154491] smp: Brought up 1 node, 1 CPU
>> [    1.154526] smpboot: Max logical packages: 2
>> [    1.154570] smpboot: Total of 1 processors activated (5999.99
>> BogoMIPS)
>>
>> I have tried a PV guest (same setup) and the kernel could bring up all
>> the vCPUs.
>>
>> Digging down, Linux will set smp_num_siblings to 0 (via
>> detect_ht_early()) and as a result will skip all the CPUs. The value
>> is retrieved from a CPUID leaf. So it sounds like we don't set the
>> leaf correctly.
>>
>> FWIW, I have also tried on Xen 4.11 and could spot the same issue.
>> Does this ring any bell to you?
> 
> The CPUID data we give to guests is generally nonsense when it comes to
> topology.  By any chance does the hardware you're booting this on not
> have hyperthreading enabled/active to begin with?

Well, I'd put the question slightly differently: What CPUID data does
qemu supply to Xen here? I could easily see us making an assumption
somewhere that is met by all hardware but is theoretically wrong to
make and not met by qemu, which then leads to further issues with what
we expose to our guest.

Jan



* Re: HVM guest only bring up a single vCPU
  2021-08-26 21:00 HVM guest only bring up a single vCPU Julien Grall
  2021-08-26 22:51 ` Marek Marczykowski-Górecki
  2021-08-26 23:42 ` Andrew Cooper
@ 2021-08-27  9:26 ` Jan Beulich
  2021-08-27 10:59   ` Julien Grall
  2021-08-27 12:52   ` Andrew Cooper
  2 siblings, 2 replies; 11+ messages in thread
From: Jan Beulich @ 2021-08-27  9:26 UTC
  To: Julien Grall; +Cc: xen-devel, Andrew Cooper

On 26.08.2021 23:00, Julien Grall wrote:
> Digging down, Linux will set smp_num_siblings to 0 (via 
> detect_ht_early()) and as a result will skip all the CPUs. The value is 
> retrieved from a CPUID leaf. So it sounds like we don't set the leaf
> correctly.

Xen leaves leaf 1 EBX[23:16] untouched from what the tool stack
passes. The tool stack doubles the value coming from hardware
(or qemu in your case), unless the result would overflow. Hence
it would look to me as if the value coming from qemu has got to
be zero. Which is perfectly fine if HTT is off, just that
libxenguest isn't prepared for this. Could you see whether the
patch below helps (making our hack even hackier)?

Jan

libxenguest/x86: ensure CPUID[1].EBX[23:16] is non-zero for HVM

We unconditionally set HTT, so merely doubling the value read from
hardware isn't going to be correct if that value is zero.

Reported-by: Julien Grall <julien@xen.org>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
I question the doubling in the first place, as that leads to absurd
values when the underlying hardware has a value larger than 1 here. I'd
be inclined to suggest to double the value only if the incoming value
has bit 0 set. And then we'd want to also deal with the case of both
bit 0 and bit 7 being set (perhaps by clearing bit 0 in this case).

--- a/tools/libs/guest/xg_cpuid_x86.c
+++ b/tools/libs/guest/xg_cpuid_x86.c
@@ -594,7 +594,9 @@ int xc_cpuid_apply_policy(xc_interface *
          * Update to reflect vLAPIC_ID = vCPU_ID * 2, but make sure to avoid
          * overflow.
          */
-        if ( !(p->basic.lppp & 0x80) )
+        if ( !p->basic.lppp )
+            p->basic.lppp = 2;
+        else if ( !(p->basic.lppp & 0x80) )
             p->basic.lppp *= 2;
 
         switch ( p->x86_vendor )
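
The effect of the hunk above can be modelled on a bare 8-bit value. This is
a minimal sketch, assuming lppp is an 8-bit field (as EBX[23:16] implies);
the function names are hypothetical and not part of libxenguest. The third
function is only one possible reading of the alternative floated under the
"---" line (double only when bit 0 is set, clear bit 0 when bits 0 and 7 are
both set), not something proposed as a patch in this thread.

```c
#include <stdint.h>

/* Before the patch: double unless bit 7 is set (overflow guard). */
static uint8_t lppp_before_patch(uint8_t lppp)
{
    if (!(lppp & 0x80))
        lppp *= 2;
    return lppp;
}

/* With the patch: a zero input becomes 2, otherwise as before. */
static uint8_t lppp_after_patch(uint8_t lppp)
{
    if (!lppp)
        lppp = 2;
    else if (!(lppp & 0x80))
        lppp *= 2;
    return lppp;
}

/*
 * Hypothetical reading of the suggested alternative: only odd values are
 * touched. Odd values with bit 7 set lose bit 0; other odd values double.
 */
static uint8_t lppp_alternative(uint8_t lppp)
{
    if (lppp & 0x01) {
        if (lppp & 0x80)
            lppp &= (uint8_t)~0x01;
        else
            lppp *= 2;
    }
    return lppp;
}
```

An input of 0 stays 0 before the patch and becomes 2 with it. Under this
reading of the alternative, 0 would also stay 0, so the explicit zero check
would still be needed alongside it.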



* Re: HVM guest only bring up a single vCPU
  2021-08-27  6:28   ` Jan Beulich
@ 2021-08-27 10:35     ` Julien Grall
  2021-08-27 10:52       ` Jan Beulich
  0 siblings, 1 reply; 11+ messages in thread
From: Julien Grall @ 2021-08-27 10:35 UTC
  To: Jan Beulich, Andrew Cooper; +Cc: xen-devel

Hi Jan,

On 27/08/2021 07:28, Jan Beulich wrote:
> On 27.08.2021 01:42, Andrew Cooper wrote:
>> On 26/08/2021 22:00, Julien Grall wrote:
>>> Hi Andrew,
>>>
>>> While doing more testing today, I noticed that only one vCPU would be
>>> brought up with HVM guest with Xen 4.16 on my setup (QEMU):
>>>
>>> [    1.122180]
>>> ================================================================================
>>> [    1.122180] UBSAN: shift-out-of-bounds in
>>> oss/linux/arch/x86/kernel/apic/apic.c:2362:13
>>> [    1.122180] shift exponent -1 is negative
>>> [    1.122180] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-rc7+ #304
>>> [    1.122180] Hardware name: Xen HVM domU, BIOS 4.16-unstable 06/07/2021
>>> [    1.122180] Call Trace:
>>> [    1.122180]  dump_stack_lvl+0x56/0x6c
>>> [    1.122180]  ubsan_epilogue+0x5/0x50
>>> [    1.122180]  __ubsan_handle_shift_out_of_bounds+0xfa/0x140
>>> [    1.122180]  ? cgroup_kill_write+0x4d/0x150
>>> [    1.122180]  ? cpu_up+0x6e/0x100
>>> [    1.122180]  ? _raw_spin_unlock_irqrestore+0x30/0x50
>>> [    1.122180]  ? rcu_read_lock_held_common+0xe/0x40
>>> [    1.122180]  ? irq_shutdown_and_deactivate+0x11/0x30
>>> [    1.122180]  ? lock_release+0xc7/0x2a0
>>> [    1.122180]  ? apic_id_is_primary_thread+0x56/0x60
>>> [    1.122180]  apic_id_is_primary_thread+0x56/0x60
>>> [    1.122180]  cpu_up+0xbd/0x100
>>> [    1.122180]  bringup_nonboot_cpus+0x4f/0x60
>>> [    1.122180]  smp_init+0x26/0x74
>>> [    1.122180]  kernel_init_freeable+0x183/0x32d
>>> [    1.122180]  ? _raw_spin_unlock_irq+0x24/0x40
>>> [    1.122180]  ? rest_init+0x330/0x330
>>> [    1.122180]  kernel_init+0x17/0x140
>>> [    1.122180]  ? rest_init+0x330/0x330
>>> [    1.122180]  ret_from_fork+0x22/0x30
>>> [    1.122244]
>>> ================================================================================
>>> [    1.123176] installing Xen timer for CPU 1
>>> [    1.123369] x86: Booting SMP configuration:
>>> [    1.123409] .... node  #0, CPUs:      #1
>>> [    1.154400] Callback from call_rcu_tasks_trace() invoked.
>>> [    1.154491] smp: Brought up 1 node, 1 CPU
>>> [    1.154526] smpboot: Max logical packages: 2
>>> [    1.154570] smpboot: Total of 1 processors activated (5999.99
>>> BogoMIPS)
>>>
>>> I have tried a PV guest (same setup) and the kernel could bring up all
>>> the vCPUs.
>>>
>>> Digging down, Linux will set smp_num_siblings to 0 (via
>>> detect_ht_early()) and as a result will skip all the CPUs. The value
>>> is retrieved from a CPUID leaf. So it sounds like we don't set the
>>> leaf correctly.
>>>
>>> FWIW, I have also tried on Xen 4.11 and could spot the same issue.
>>> Does this ring any bell to you?
>>
>> The CPUID data we give to guests is generally nonsense when it comes to
>> topology.  By any chance does the hardware you're booting this on not
>> have hyperthreading enabled/active to begin with?
> 
> Well, I'd put the question slightly differently: What CPUID data does
> qemu supply to Xen here? I could easily see us making an assumption
> somewhere that is met by all hardware but is theoretically wrong to
> make and not met by qemu, which then leads to further issues with what
> we expose to our guest.
I have pasted the output from cpuid on a baremetal Linux here:

https://pastebin.com/WvaXiXuL

Cheers,

-- 
Julien Grall


* Re: HVM guest only bring up a single vCPU
  2021-08-27 10:35     ` Julien Grall
@ 2021-08-27 10:52       ` Jan Beulich
  2021-08-27 10:55         ` Julien Grall
  0 siblings, 1 reply; 11+ messages in thread
From: Jan Beulich @ 2021-08-27 10:52 UTC
  To: Julien Grall; +Cc: xen-devel, Andrew Cooper

On 27.08.2021 12:35, Julien Grall wrote:
> Hi Jan,
> 
> On 27/08/2021 07:28, Jan Beulich wrote:
>> On 27.08.2021 01:42, Andrew Cooper wrote:
>>> On 26/08/2021 22:00, Julien Grall wrote:
>>>> Hi Andrew,
>>>>
>>>> While doing more testing today, I noticed that only one vCPU would be
>>>> brought up with HVM guest with Xen 4.16 on my setup (QEMU):
>>>>
>>>> [    1.122180]
>>>> ================================================================================
>>>> [    1.122180] UBSAN: shift-out-of-bounds in
>>>> oss/linux/arch/x86/kernel/apic/apic.c:2362:13
>>>> [    1.122180] shift exponent -1 is negative
>>>> [    1.122180] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-rc7+ #304
>>>> [    1.122180] Hardware name: Xen HVM domU, BIOS 4.16-unstable 06/07/2021
>>>> [    1.122180] Call Trace:
>>>> [    1.122180]  dump_stack_lvl+0x56/0x6c
>>>> [    1.122180]  ubsan_epilogue+0x5/0x50
>>>> [    1.122180]  __ubsan_handle_shift_out_of_bounds+0xfa/0x140
>>>> [    1.122180]  ? cgroup_kill_write+0x4d/0x150
>>>> [    1.122180]  ? cpu_up+0x6e/0x100
>>>> [    1.122180]  ? _raw_spin_unlock_irqrestore+0x30/0x50
>>>> [    1.122180]  ? rcu_read_lock_held_common+0xe/0x40
>>>> [    1.122180]  ? irq_shutdown_and_deactivate+0x11/0x30
>>>> [    1.122180]  ? lock_release+0xc7/0x2a0
>>>> [    1.122180]  ? apic_id_is_primary_thread+0x56/0x60
>>>> [    1.122180]  apic_id_is_primary_thread+0x56/0x60
>>>> [    1.122180]  cpu_up+0xbd/0x100
>>>> [    1.122180]  bringup_nonboot_cpus+0x4f/0x60
>>>> [    1.122180]  smp_init+0x26/0x74
>>>> [    1.122180]  kernel_init_freeable+0x183/0x32d
>>>> [    1.122180]  ? _raw_spin_unlock_irq+0x24/0x40
>>>> [    1.122180]  ? rest_init+0x330/0x330
>>>> [    1.122180]  kernel_init+0x17/0x140
>>>> [    1.122180]  ? rest_init+0x330/0x330
>>>> [    1.122180]  ret_from_fork+0x22/0x30
>>>> [    1.122244]
>>>> ================================================================================
>>>> [    1.123176] installing Xen timer for CPU 1
>>>> [    1.123369] x86: Booting SMP configuration:
>>>> [    1.123409] .... node  #0, CPUs:      #1
>>>> [    1.154400] Callback from call_rcu_tasks_trace() invoked.
>>>> [    1.154491] smp: Brought up 1 node, 1 CPU
>>>> [    1.154526] smpboot: Max logical packages: 2
>>>> [    1.154570] smpboot: Total of 1 processors activated (5999.99
>>>> BogoMIPS)
>>>>
>>>> I have tried a PV guest (same setup) and the kernel could bring up all
>>>> the vCPUs.
>>>>
>>>> Digging down, Linux will set smp_num_siblings to 0 (via
>>>> detect_ht_early()) and as a result will skip all the CPUs. The value
>>>> is retrieved from a CPUID leaf. So it sounds like we don't set the
>>>> leaf correctly.
>>>>
>>>> FWIW, I have also tried on Xen 4.11 and could spot the same issue.
>>>> Does this ring any bell to you?
>>>
>>> The CPUID data we give to guests is generally nonsense when it comes to
>>> topology.  By any chance does the hardware you're booting this on not
>>> have hyperthreading enabled/active to begin with?
>>
>> Well, I'd put the question slightly differently: What CPUID data does
>> qemu supply to Xen here? I could easily see us making an assumption
>> somewhere that is met by all hardware but is theoretically wrong to
>> make and not met by qemu, which then leads to further issues with what
>> we expose to our guest.
> I have pasted the output from cpuid on a baremetal Linux here:

"baremetal" still meaning it was running on qemu, not itself baremetal?

> https://pastebin.com/WvaXiXuL

   miscellaneous (1/ebx):
      process local APIC physical ID = 0x0 (0)
      maximum IDs for CPUs in pkg    = 0x0 (0)
      CLFLUSH line size              = 0x8 (8)
      brand index                    = 0x0 (0)

As suspected the field is zero, and hence will remain zero after
multiplying by 2. I suppose the patch sent earlier should then get you
further.
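
The four fields in that dump map onto CPUID leaf 1 EBX one byte each: brand
index in [7:0], CLFLUSH line size in [15:8], the logical-processor count in
[23:16] and the initial APIC ID in [31:24]. A minimal sketch of the decoding
follows; the struct and function names are hypothetical, and the register
value 0x00000800 used in the note below is reconstructed from the dump above
rather than read from the pastebin.

```c
#include <stdint.h>

/* CPUID leaf 1 EBX split into its four byte-wide fields. */
struct leaf1_ebx {
    uint8_t brand_index;       /* EBX[7:0]                       */
    uint8_t clflush_line_size; /* EBX[15:8], in units of 8 bytes */
    uint8_t max_ids_in_pkg;    /* EBX[23:16]                     */
    uint8_t initial_apic_id;   /* EBX[31:24]                     */
};

/* Extract each field by shifting and masking the raw register value. */
static struct leaf1_ebx decode_leaf1_ebx(uint32_t ebx)
{
    struct leaf1_ebx f;

    f.brand_index       = (uint8_t)(ebx & 0xff);
    f.clflush_line_size = (uint8_t)((ebx >> 8) & 0xff);
    f.max_ids_in_pkg    = (uint8_t)((ebx >> 16) & 0xff);
    f.initial_apic_id   = (uint8_t)((ebx >> 24) & 0xff);
    return f;
}
```

Decoding 0x00000800 gives a CLFLUSH line size of 8 and zero for the other
three fields, matching the dump; doubling a zero max_ids_in_pkg leaves it at
zero.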

Jan



* Re: HVM guest only bring up a single vCPU
  2021-08-27 10:52       ` Jan Beulich
@ 2021-08-27 10:55         ` Julien Grall
  0 siblings, 0 replies; 11+ messages in thread
From: Julien Grall @ 2021-08-27 10:55 UTC
  To: Jan Beulich; +Cc: xen-devel, Andrew Cooper

Hi Jan,

On 27/08/2021 11:52, Jan Beulich wrote:
> On 27.08.2021 12:35, Julien Grall wrote:
>> Hi Jan,
>>
>> On 27/08/2021 07:28, Jan Beulich wrote:
>>> On 27.08.2021 01:42, Andrew Cooper wrote:
>>>> On 26/08/2021 22:00, Julien Grall wrote:
>>>>> Hi Andrew,
>>>>>
>>>>> While doing more testing today, I noticed that only one vCPU would be
>>>>> brought up with HVM guest with Xen 4.16 on my setup (QEMU):
>>>>>
>>>>> [    1.122180]
>>>>> ================================================================================
>>>>> [    1.122180] UBSAN: shift-out-of-bounds in
>>>>> oss/linux/arch/x86/kernel/apic/apic.c:2362:13
>>>>> [    1.122180] shift exponent -1 is negative
>>>>> [    1.122180] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-rc7+ #304
>>>>> [    1.122180] Hardware name: Xen HVM domU, BIOS 4.16-unstable 06/07/2021
>>>>> [    1.122180] Call Trace:
>>>>> [    1.122180]  dump_stack_lvl+0x56/0x6c
>>>>> [    1.122180]  ubsan_epilogue+0x5/0x50
>>>>> [    1.122180]  __ubsan_handle_shift_out_of_bounds+0xfa/0x140
>>>>> [    1.122180]  ? cgroup_kill_write+0x4d/0x150
>>>>> [    1.122180]  ? cpu_up+0x6e/0x100
>>>>> [    1.122180]  ? _raw_spin_unlock_irqrestore+0x30/0x50
>>>>> [    1.122180]  ? rcu_read_lock_held_common+0xe/0x40
>>>>> [    1.122180]  ? irq_shutdown_and_deactivate+0x11/0x30
>>>>> [    1.122180]  ? lock_release+0xc7/0x2a0
>>>>> [    1.122180]  ? apic_id_is_primary_thread+0x56/0x60
>>>>> [    1.122180]  apic_id_is_primary_thread+0x56/0x60
>>>>> [    1.122180]  cpu_up+0xbd/0x100
>>>>> [    1.122180]  bringup_nonboot_cpus+0x4f/0x60
>>>>> [    1.122180]  smp_init+0x26/0x74
>>>>> [    1.122180]  kernel_init_freeable+0x183/0x32d
>>>>> [    1.122180]  ? _raw_spin_unlock_irq+0x24/0x40
>>>>> [    1.122180]  ? rest_init+0x330/0x330
>>>>> [    1.122180]  kernel_init+0x17/0x140
>>>>> [    1.122180]  ? rest_init+0x330/0x330
>>>>> [    1.122180]  ret_from_fork+0x22/0x30
>>>>> [    1.122244]
>>>>> ================================================================================
>>>>> [    1.123176] installing Xen timer for CPU 1
>>>>> [    1.123369] x86: Booting SMP configuration:
>>>>> [    1.123409] .... node  #0, CPUs:      #1
>>>>> [    1.154400] Callback from call_rcu_tasks_trace() invoked.
>>>>> [    1.154491] smp: Brought up 1 node, 1 CPU
>>>>> [    1.154526] smpboot: Max logical packages: 2
>>>>> [    1.154570] smpboot: Total of 1 processors activated (5999.99
>>>>> BogoMIPS)
>>>>>
>>>>> I have tried a PV guest (same setup) and the kernel could bring up all
>>>>> the vCPUs.
>>>>>
>>>>> Digging down, Linux will set smp_num_siblings to 0 (via
>>>>> detect_ht_early()) and as a result will skip all the CPUs. The value
>>>>> is retrieved from a CPUID leaf. So it sounds like we don't set the
>>>>> leaf correctly.
>>>>>
>>>>> FWIW, I have also tried on Xen 4.11 and could spot the same issue.
>>>>> Does this ring any bell to you?
>>>>
>>>> The CPUID data we give to guests is generally nonsense when it comes to
>>>> topology.  By any chance does the hardware you're booting this on not
>>>> have hyperthreading enabled/active to begin with?
>>>
>>> Well, I'd put the question slightly differently: What CPUID data does
>>> qemu supply to Xen here? I could easily see us making an assumption
>>> somewhere that is met by all hardware but is theoretically wrong to
>>> make and not met by qemu, which then leads to further issues with what
>>> we expose to our guest.
>> I have pasted the output from cpuid on a baremetal Linux here:
> 
> "baremetal" still meaning it was running on qemu, not itself baremetal?

Correct.

> 
>> https://pastebin.com/WvaXiXuL
> 
>     miscellaneous (1/ebx):
>        process local APIC physical ID = 0x0 (0)
>        maximum IDs for CPUs in pkg    = 0x0 (0)
>        CLFLUSH line size              = 0x8 (8)
>        brand index                    = 0x0 (0)
> 
> As suspected the field is zero, and hence will remain zero after
> multiplying by 2. I suppose the patch sent earlier should then get you
> further.

I am about to try your patch. I will let you know the results.

Cheers,

-- 
Julien Grall


* Re: HVM guest only bring up a single vCPU
  2021-08-27  9:26 ` Jan Beulich
@ 2021-08-27 10:59   ` Julien Grall
  2021-08-27 12:52   ` Andrew Cooper
  1 sibling, 0 replies; 11+ messages in thread
From: Julien Grall @ 2021-08-27 10:59 UTC
  To: Jan Beulich; +Cc: xen-devel, Andrew Cooper

Hi Jan,

On 27/08/2021 10:26, Jan Beulich wrote:
> On 26.08.2021 23:00, Julien Grall wrote:
>> Digging down, Linux will set smp_num_siblings to 0 (via
>> detect_ht_early()) and as a result will skip all the CPUs. The value is
>> retrieved from a CPUID leaf. So it sounds like we don't set the leaf
>> correctly.
> 
> Xen leaves leaf 1 EBX[23:16] untouched from what the tool stack
> passes. The tool stack doubles the value coming from hardware
> (or qemu in your case), unless the result would overflow. Hence
> it would look to me as if the value coming from qemu has got to
> be zero. Which is perfectly fine if HTT is off, just that
> libxenguest isn't prepared for this. 
> Could you see whether the
> patch below helps (making our hack even hackier)?

It helps. The Linux HVM domain is now able to bring up all the CPUs.

> 
> Jan
> 
> libxenguest/x86: ensure CPUID[1].EBX[23:16] is non-zero for HVM
> 
> We unconditionally set HTT, so merely doubling the value read from
> hardware isn't going to be correct if that value is zero.
> 
> Reported-by: Julien Grall <julien@xen.org>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Feel free to add my tested-by for the patch.

Cheers,

[1] https://pastebin.com/WvaXiXuL

-- 
Julien Grall


* Re: HVM guest only bring up a single vCPU
  2021-08-26 22:51 ` Marek Marczykowski-Górecki
@ 2021-08-27 11:01   ` Julien Grall
  0 siblings, 0 replies; 11+ messages in thread
From: Julien Grall @ 2021-08-27 11:01 UTC
  To: Marek Marczykowski-Górecki; +Cc: Andrew Cooper, xen-devel, Jan Beulich

Hi Marek,

On 26/08/2021 23:51, Marek Marczykowski-Górecki wrote:
> On Thu, Aug 26, 2021 at 10:00:58PM +0100, Julien Grall wrote:
>> While doing more testing today, I noticed that only one vCPU would be
>> brought up with HVM guest with Xen 4.16 on my setup (QEMU):
>>
>> [    1.122180] ================================================================================
>> [    1.122180] UBSAN: shift-out-of-bounds in
>> oss/linux/arch/x86/kernel/apic/apic.c:2362:13
>> [    1.122180] shift exponent -1 is negative
>> [    1.122180] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-rc7+ #304
>> [    1.122180] Hardware name: Xen HVM domU, BIOS 4.16-unstable 06/07/2021
>> [    1.122180] Call Trace:
>> [    1.122180]  dump_stack_lvl+0x56/0x6c
>> [    1.122180]  ubsan_epilogue+0x5/0x50
>> [    1.122180]  __ubsan_handle_shift_out_of_bounds+0xfa/0x140
>> [    1.122180]  ? cgroup_kill_write+0x4d/0x150
>> [    1.122180]  ? cpu_up+0x6e/0x100
>> [    1.122180]  ? _raw_spin_unlock_irqrestore+0x30/0x50
>> [    1.122180]  ? rcu_read_lock_held_common+0xe/0x40
>> [    1.122180]  ? irq_shutdown_and_deactivate+0x11/0x30
>> [    1.122180]  ? lock_release+0xc7/0x2a0
>> [    1.122180]  ? apic_id_is_primary_thread+0x56/0x60
>> [    1.122180]  apic_id_is_primary_thread+0x56/0x60
>> [    1.122180]  cpu_up+0xbd/0x100
>> [    1.122180]  bringup_nonboot_cpus+0x4f/0x60
>> [    1.122180]  smp_init+0x26/0x74
>> [    1.122180]  kernel_init_freeable+0x183/0x32d
>> [    1.122180]  ? _raw_spin_unlock_irq+0x24/0x40
>> [    1.122180]  ? rest_init+0x330/0x330
>> [    1.122180]  kernel_init+0x17/0x140
>> [    1.122180]  ? rest_init+0x330/0x330
>> [    1.122180]  ret_from_fork+0x22/0x30
>> [    1.122244] ================================================================================
>> [    1.123176] installing Xen timer for CPU 1
>> [    1.123369] x86: Booting SMP configuration:
>> [    1.123409] .... node  #0, CPUs:      #1
>> [    1.154400] Callback from call_rcu_tasks_trace() invoked.
>> [    1.154491] smp: Brought up 1 node, 1 CPU
>> [    1.154526] smpboot: Max logical packages: 2
>> [    1.154570] smpboot: Total of 1 processors activated (5999.99 BogoMIPS)
>>
>> I have tried a PV guest (same setup) and the kernel could bring up all the
>> vCPUs.
>>
>> Digging down, Linux will set smp_num_siblings to 0 (via detect_ht_early())
>> and as a result will skip all the CPUs. The value is retrieved from a CPUID
>> leaf. So it sounds like we don't set the leaf correctly.
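
To make the link between smp_num_siblings == 0 and the "shift exponent -1"
splat concrete, here is a minimal user-space sketch. It assumes the SMT mask
in apic_id_is_primary_thread() is built from fls(smp_num_siblings) - 1, which
is consistent with the trace above but is not copied from the kernel source.
fls_sketch(), smt_shift_exponent() and smt_mask_guarded() are hypothetical
names used only for illustration.

```c
#include <stdint.h>

/* Stand-in for the kernel's fls(): 1-based index of the highest set bit,
 * and 0 when no bit is set. */
int fls_sketch(uint32_t x)
{
    int n = 0;

    while (x) {
        n++;
        x >>= 1;
    }
    return n;
}

/* Exponent that a "1U << (fls(siblings) - 1)" expression would shift by.
 * Assumption: this mirrors how the SMT mask is derived. */
int smt_shift_exponent(uint32_t siblings)
{
    return fls_sketch(siblings) - 1;
}

/* Guarded variant that never performs the negative shift: a sibling count
 * of 0 or 1 yields an empty mask. */
uint32_t smt_mask_guarded(uint32_t siblings)
{
    int shift = smt_shift_exponent(siblings);

    if (shift <= 0)
        return 0;
    return (1U << shift) - 1;
}
```

With a sibling count of 0 the exponent comes out as -1, which is the value
UBSAN complains about; any count of 1 or more gives a non-negative exponent.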
>>
>> FWIW, I have also tried on Xen 4.11 and could spot the same issue. Does this
>> ring any bells for you?
> 
> Is it maybe this:
> https://lore.kernel.org/xen-devel/20201106003529.391649-1-bmasney@redhat.com/T/#u
> ?

It looks to be different as I don't see the splat.

Anyway, Jan just posted a patch that allows a Linux HVM domain to bring 
up all the vCPUs.

Cheers,
-- 
Julien Grall



* Re: HVM guest only bring up a single vCPU
  2021-08-27  9:26 ` Jan Beulich
  2021-08-27 10:59   ` Julien Grall
@ 2021-08-27 12:52   ` Andrew Cooper
  1 sibling, 0 replies; 11+ messages in thread
From: Andrew Cooper @ 2021-08-27 12:52 UTC (permalink / raw)
  To: Jan Beulich, Julien Grall; +Cc: xen-devel

On 27/08/2021 10:26, Jan Beulich wrote:
> On 26.08.2021 23:00, Julien Grall wrote:
>> Digging down, Linux will set smp_num_siblings to 0 (via 
>> detect_ht_early()) and as a result will skip all the CPUs. The value is 
>> retrieved from a CPUID leaf. So it sounds like we don't set the leaf 
>> correctly.
> Xen leaves leaf 1 EBX[23:16] untouched from what the tool stack
> passes. The tool stack doubles the value coming from hardware
> (or qemu in your case), unless the result would overflow. Hence
> it would look to me as if the value coming from qemu has got to
> be zero. Which is perfectly fine if HTT is off, just that
> libxenguest isn't prepared for this. Could you see whether the
> patch below helps (making our hack even hackier)?
>
> Jan
>
> libxenguest/x86: ensure CPUID[1].EBX[23:16] is non-zero for HVM
>
> We unconditionally set HTT, so merely doubling the value read from
> hardware isn't going to be correct if that value is zero.
>
> Reported-by: Julien Grall <julien@xen.org>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

I don't particularly like this, but I don't like any of the junk we
currently have here.

This codepath ought to be limited to virtual environments which have
given us garbage in p->basic.lppp in the first place.

Therefore, Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
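
For readers following along, the behaviour Jan describes (double the value
from leaf 1 EBX[23:16] unless that would overflow, and never leave it at
zero) can be sketched roughly as below. This is an illustration, not the
libxenguest code: leaf1_lppp() and adjust_lppp() are hypothetical helpers,
and the choice of 2 as the replacement for a zero input is an assumption.

```c
#include <stdint.h>

/* Extract the 8-bit logical processor count from CPUID leaf 1 EBX[23:16]. */
uint8_t leaf1_lppp(uint32_t ebx)
{
    return (uint8_t)((ebx >> 16) & 0xff);
}

/* Hypothetical adjustment for an HVM guest that always has HTT set. */
uint8_t adjust_lppp(uint8_t lppp)
{
    if (lppp == 0)
        return 2;            /* assumption: smallest value sensible with HTT set */
    if (lppp & 0x80)
        return lppp;         /* doubling would overflow the 8-bit field */
    return (uint8_t)(lppp * 2);
}
```

A zero coming from the underlying (virtual) platform then no longer
propagates to the guest, while non-zero inputs are treated as before.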

> ---
> I question the doubling in the first place, as that leads to absurd
> values when the underlying hardware has a value larger than 1 here.

It's broken for several reasons, perhaps most obviously because it is a
gross assumption that all systems look like an Intel Core/Xeon with
Hyperthreading.

The right way to fix this is to pack the APIC IDs tightly (which
actually lets us break the 128 vcpu barrier without vIOMMU), and adjust
the SMT mask in leaf 0xb to compensate.
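
As a rough illustration of the 128 vCPU point, the sketch below compares a
spaced APIC ID layout with a tightly packed one. It assumes the current
layout is APIC ID = 2 * vcpu_id and that, without interrupt remapping, only
8-bit IDs below 0xff (the broadcast ID) are usable. All function names are
hypothetical.

```c
#include <stdint.h>

/* Assumed current layout: every other APIC ID is left unused. */
uint32_t apic_id_spaced(uint32_t vcpu_id)
{
    return vcpu_id * 2;
}

/* Tightly packed layout: APIC ID equals the vCPU index. */
uint32_t apic_id_packed(uint32_t vcpu_id)
{
    return vcpu_id;
}

/* Number of vCPUs whose APIC ID still fits below the 0xff broadcast ID. */
uint32_t max_vcpus_xapic(uint32_t (*id_of)(uint32_t))
{
    uint32_t n = 0;

    while (id_of(n) < 0xff)
        n++;
    return n;
}
```

Under these assumptions max_vcpus_xapic(apic_id_spaced) gives 128 and
max_vcpus_xapic(apic_id_packed) gives 255, which is where the extra headroom
from packing comes from.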

We need a slide of 8 on the APIC IDs to do AMD legacy topology by the
book, but as we've not had that before, I'm quite tempted to leave
implementing that algorithm to whomever first actually needs it.

> I'd be inclined to suggest to double the value only if the incoming value
> has bit 0 set. And then we'd want to also deal with the case of both
> bit 0 and bit 7 being set (perhaps by clearing bit 0 in this case).

Honestly, until someone starts the "let's do topology correctly,
following Intel and AMD's topology algorithms", I recommend not
tinkering.  It is incredibly fragile logic.

~Andrew




Thread overview: 11+ messages
2021-08-26 21:00 HVM guest only bring up a single vCPU Julien Grall
2021-08-26 22:51 ` Marek Marczykowski-Górecki
2021-08-27 11:01   ` Julien Grall
2021-08-26 23:42 ` Andrew Cooper
2021-08-27  6:28   ` Jan Beulich
2021-08-27 10:35     ` Julien Grall
2021-08-27 10:52       ` Jan Beulich
2021-08-27 10:55         ` Julien Grall
2021-08-27  9:26 ` Jan Beulich
2021-08-27 10:59   ` Julien Grall
2021-08-27 12:52   ` Andrew Cooper
