* Assertion 'cpu < nr_cpu_ids' failed at .../src/new/xen-unstable/xen/include/xen/cpumask.h:97
@ 2015-02-23 9:27 Sander Eikelenboom
2015-02-23 10:06 ` Jan Beulich
0 siblings, 1 reply; 6+ messages in thread
From: Sander Eikelenboom @ 2015-02-23 9:27 UTC
To: xen-devel
Hi,
While shutting down all guests to go for a host reboot I encountered the splat below.
This was running on Xen with:
xen_changeset: Fri Feb 20 16:21:10 2015 +0100 git:24b2b8d-dirty
--
Sander
(XEN) [2015-02-23 09:16:26.292] Assertion 'cpu < nr_cpu_ids' failed at .../src/new/xen-unstable/xen/include/xen/cpumask.h:97
(XEN) [2015-02-23 09:16:26.292] ----[ Xen-4.6-unstable x86_64 debug=y Not tainted ]----
(XEN) [2015-02-23 09:16:26.292] CPU: 1
(XEN) [2015-02-23 09:16:26.292] RIP: e008:[<ffff82d08012c018>] cpu_raise_softirq+0xd7/0xeb
(XEN) [2015-02-23 09:16:26.292] RFLAGS: 0000000000010202 CONTEXT: hypervisor
(XEN) [2015-02-23 09:16:26.292] rax: ffff82d080328e60 rbx: 0000000000000005 rcx: ffff82d0802fff80
(XEN) [2015-02-23 09:16:26.292] rdx: ffff83054eb680e0 rsi: 0000000000000007 rdi: 0000000000000000
(XEN) [2015-02-23 09:16:26.292] rbp: ffff83054eb67328 rsp: ffff83054eb67308 r8: 0000000000000001
(XEN) [2015-02-23 09:16:26.292] r9: ffff83054eb1a240 r10: 0000000000000000 r11: 0000000000000000
(XEN) [2015-02-23 09:16:26.292] r12: 0000000000000007 r13: 0000000000000001 r14: 000000000000000e
(XEN) [2015-02-23 09:16:26.292] r15: 00000000000008f8 cr0: 000000008005003b cr4: 00000000000006f0
(XEN) [2015-02-23 09:16:26.292] cr3: 0000000476850000 cr2: 00007ffd07e91f20
(XEN) [2015-02-23 09:16:26.292] ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
(XEN) [2015-02-23 09:16:26.292] Xen stack trace from rsp=ffff83054eb67308:
(XEN) [2015-02-23 09:16:26.292] 000000000000000e ffff83009fd4b000 0000000000000001 ffff83009fd4b000
(XEN) [2015-02-23 09:16:26.292] ffff83054eb67348 ffff82d080167c90 0000000000000000 ffff83009fd4b418
(XEN) [2015-02-23 09:16:26.292] ffff83054eb67378 ffff82d0801cf411 ffff83054eb673a8 ffff83009fd4b000
(XEN) [2015-02-23 09:16:26.292] ffff83009fd41418 0000000000000000 ffff83054eb67398 ffff82d0801cf51f
(XEN) [2015-02-23 09:16:26.292] ffff83009fd4b000 ffff83009fd41418 ffff83054eb673e8 ffff82d0801cfd50
(XEN) [2015-02-23 09:16:26.292] ffff83054eb67418 00000001011ef7e2 ffff830300000000 ffff83009fd41000
(XEN) [2015-02-23 09:16:26.292] 0000000000000300 00000000000008f8 00000000000008f8 ffff83009fd41000
(XEN) [2015-02-23 09:16:26.292] ffff83054eb67448 ffff82d0801d0657 ffff83054eb67458 ffff82d0801f5d11
(XEN) [2015-02-23 09:16:26.292] 000000004eb67594 0000000000000000 0000000000000000 0000000000000000
(XEN) [2015-02-23 09:16:26.292] ffff8303d7bd6000 0000000000000300 0000000000000004 ffff83009fd41000
(XEN) [2015-02-23 09:16:26.292] ffff83054eb674a8 ffff82d0801d0b58 ffff83054eb674a8 ffff82d080178a6c
(XEN) [2015-02-23 09:16:26.292] 0000000002211067 ffff83054eb674e4 80000000fee0017b ffff82d080289a68
(XEN) [2015-02-23 09:16:26.292] ffff82d0801cf175 ffff82d0801ceb87 ffff83054eb67568 ffff83009fd41000
(XEN) [2015-02-23 09:16:26.292] ffff83054eb67518 ffff82d0801c6747 ffff830500000000 ffff82d080289fa0
(XEN) [2015-02-23 09:16:26.292] ffff82d0801cf175 ffff82d0801d09d7 ffff83054eb674e8 0000000080227575
(XEN) [2015-02-23 09:16:26.292] ffff83054eb67548 ffff83009fd41000 ffff8305356de000 0000000000000004
(XEN) [2015-02-23 09:16:26.292] ffff82e008f4e3c0 0000000000000000 ffff83054eb675b8 ffff82d0801b7023
(XEN) [2015-02-23 09:16:26.292] 0000000000000004 0000000000000300 ffff83054eb67618 000000014eb67618
(XEN) [2015-02-23 09:16:26.292] 00000000fee00300 010082d000000000 ffff83054eb67578 0000000000000004
(XEN) [2015-02-23 09:16:26.292] 00000000fee00300 00000000000008f8 0000000400000001 0100000000000000
(XEN) [2015-02-23 09:16:26.292] Xen call trace:
(XEN) [2015-02-23 09:16:26.292] [<ffff82d08012c018>] cpu_raise_softirq+0xd7/0xeb
(XEN) [2015-02-23 09:16:26.292] [<ffff82d080167c90>] vcpu_kick+0x65/0x6f
(XEN) [2015-02-23 09:16:26.292] [<ffff82d0801cf411>] vlapic_set_irq+0xb6/0xc4
(XEN) [2015-02-23 09:16:26.292] [<ffff82d0801cf51f>] vlapic_accept_irq+0x91/0x1ca
(XEN) [2015-02-23 09:16:26.292] [<ffff82d0801cfd50>] vlapic_ipi+0x28b/0x2ae
(XEN) [2015-02-23 09:16:26.292] [<ffff82d0801d0657>] vlapic_reg_write+0x215/0x595
(XEN) [2015-02-23 09:16:26.292] [<ffff82d0801d0b58>] vlapic_write+0x181/0x1f7
(XEN) [2015-02-23 09:16:26.292] [<ffff82d0801c6747>] hvm_mmio_intercept+0x14d/0x36a
(XEN) [2015-02-23 09:16:26.292] [<ffff82d0801b7023>] hvmemul_do_io+0x440/0x66b
(XEN) [2015-02-23 09:16:26.292] [<ffff82d0801b727b>] hvmemul_do_mmio+0x2d/0x2f
(XEN) [2015-02-23 09:16:26.292] [<ffff82d0801b88de>] hvmemul_write+0x1d8/0x24c
(XEN) [2015-02-23 09:16:26.292] [<ffff82d0801a1c9f>] x86_emulate+0xc97b/0x1010c
(XEN) [2015-02-23 09:16:26.292] [<ffff82d0801b76da>] _hvm_emulate_one+0x197/0x2bb
(XEN) [2015-02-23 09:16:26.292] [<ffff82d0801b78bf>] hvm_emulate_one+0x10/0x12
(XEN) [2015-02-23 09:16:26.292] [<ffff82d0801c6eb4>] handle_mmio+0x54/0xd4
(XEN) [2015-02-23 09:16:26.292] [<ffff82d0801c6f78>] handle_mmio_with_translation+0x44/0x46
(XEN) [2015-02-23 09:16:26.292] [<ffff82d0801c5264>] hvm_hap_nested_page_fault+0x163/0x541
(XEN) [2015-02-23 09:16:26.292] [<ffff82d0801db3f7>] svm_vmexit_handler+0x16bf/0x19bc
(XEN) [2015-02-23 09:16:26.292] [<ffff82d0801dd352>] svm_stgi_label+0x8/0x46
(XEN) [2015-02-23 09:16:26.292]
(XEN) [2015-02-23 09:16:27.613]
(XEN) [2015-02-23 09:16:27.622] ****************************************
(XEN) [2015-02-23 09:16:27.642] Panic on CPU 1:
(XEN) [2015-02-23 09:16:27.654] Assertion 'cpu < nr_cpu_ids' failed at .../src/new/xen-unstable/xen/include/xen/cpumask.h:97
(XEN) [2015-02-23 09:16:27.687] ****************************************
(XEN) [2015-02-23 09:16:27.706]
(XEN) [2015-02-23 09:16:27.715] Manual reset required ('noreboot' specified)
* Re: Assertion 'cpu < nr_cpu_ids' failed at .../src/new/xen-unstable/xen/include/xen/cpumask.h:97
2015-02-23 9:27 Assertion 'cpu < nr_cpu_ids' failed at .../src/new/xen-unstable/xen/include/xen/cpumask.h:97 Sander Eikelenboom
@ 2015-02-23 10:06 ` Jan Beulich
2015-02-23 10:38 ` Andrew Cooper
2015-02-23 10:45 ` Sander Eikelenboom
0 siblings, 2 replies; 6+ messages in thread
From: Jan Beulich @ 2015-02-23 10:06 UTC
To: Sander Eikelenboom; +Cc: xen-devel
>>> On 23.02.15 at 10:27, <linux@eikelenboom.it> wrote:
> While shutting down all guests to go for a host reboot I encountered the
> splat below.
> This was running on Xen with:
> xen_changeset: Fri Feb 20 16:21:10 2015 +0100 git:24b2b8d-dirty
"-dirty" meaning what?
> (XEN) [2015-02-23 09:16:26.292] Assertion 'cpu < nr_cpu_ids' failed at .../src/new/xen-unstable/xen/include/xen/cpumask.h:97
Since with debug=y the callstack entries should be reliable, I can't
see how this matches up with ...
> (XEN) [2015-02-23 09:16:26.292] Xen call trace:
> (XEN) [2015-02-23 09:16:26.292] [<ffff82d08012c018>] cpu_raise_softirq+0xd7/0xeb
... this, since
void cpu_raise_softirq(unsigned int cpu, unsigned int nr)
{
    unsigned int this_cpu = smp_processor_id();

    if ( test_and_set_bit(nr, &softirq_pending(cpu))
         || (cpu == this_cpu)
         || arch_skip_send_event_check(cpu) )
        return;

    if ( !per_cpu(batching, this_cpu) || in_irq() )
        smp_send_event_check_cpu(cpu);
    else
        set_bit(nr, &per_cpu(batch_mask, this_cpu));
}
doesn't indicate any use of cpumask functions. If, however,
arch_skip_send_event_check()'s call to cpumask_test_cpu()
didn't get inlined, that might be the cause. Albeit that would mean
smp_processor_id() returned an out-of-range value... In any
event we'll need to know what exactly the above code location refers
to within the function.
Jan
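As a rough aid for reading the call trace entry discussed above, here is a minimal C sketch that splits an entry of the form "[<address>] symbol+offset/size" into its parts and derives where the enclosing function starts and how far the reported address is from its end. The names struct trace_entry, parse_trace_entry(), symbol_start() and bytes_to_end() are hypothetical helpers written for this illustration and are not part of Xen. The sketch only narrows down the location; mapping it to a source line still needs the disassembly or debug information of the exact binary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* One "[<address>] symbol+offset/size" entry from a call trace. */
struct trace_entry {
    uint64_t addr;    /* address printed between "[<" and ">]" */
    char symbol[64];  /* name of the enclosing function */
    uint64_t offset;  /* bytes from the start of the function */
    uint64_t size;    /* total size of the function in bytes */
};

/* Parse one entry; returns false if the text does not match the layout. */
static bool parse_trace_entry(const char *text, struct trace_entry *out)
{
    unsigned long long addr, offset, size;

    assert(text != NULL && out != NULL);
    if ( sscanf(text, " [<%llx>] %63[^+]+%llx/%llx",
                &addr, out->symbol, &offset, &size) != 4 )
        return false;
    /* An offset beyond the function size or the address is malformed. */
    if ( offset > size || offset > addr )
        return false;

    out->addr = addr;
    out->offset = offset;
    out->size = size;
    return true;
}

/* Address at which the enclosing function starts. */
static uint64_t symbol_start(const struct trace_entry *e)
{
    return e->addr - e->offset;
}

/* Bytes between the reported address and the end of the function. */
static uint64_t bytes_to_end(const struct trace_entry *e)
{
    return e->size - e->offset;
}
```

For the entry "[<ffff82d08012c018>] cpu_raise_softirq+0xd7/0xeb", symbol_start() gives 0xffff82d08012bf41 and bytes_to_end() gives 0x14, so the reported address lies close to the end of cpu_raise_softirq().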
* Re: Assertion 'cpu < nr_cpu_ids' failed at .../src/new/xen-unstable/xen/include/xen/cpumask.h:97
2015-02-23 10:06 ` Jan Beulich
@ 2015-02-23 10:38 ` Andrew Cooper
2015-02-23 10:56 ` Jan Beulich
2015-02-23 10:45 ` Sander Eikelenboom
1 sibling, 1 reply; 6+ messages in thread
From: Andrew Cooper @ 2015-02-23 10:38 UTC
To: Jan Beulich, Sander Eikelenboom; +Cc: xen-devel
On 23/02/15 10:06, Jan Beulich wrote:
>>>> On 23.02.15 at 10:27, <linux@eikelenboom.it> wrote:
>> While shutting down all guests to go for a host reboot I encountered the
>> splat below.
>> This was running on Xen with:
>> xen_changeset: Fri Feb 20 16:21:10 2015 +0100 git:24b2b8d-dirty
> "-dirty" meaning what?
>
>> (XEN) [2015-02-23 09:16:26.292] Assertion 'cpu < nr_cpu_ids' failed at .../src/new/xen-unstable/xen/include/xen/cpumask.h:97
> Since with debug=y the callstack entries should be reliable, I can't
> see how this matches up with ...
>
>> (XEN) [2015-02-23 09:16:26.292] Xen call trace:
>> (XEN) [2015-02-23 09:16:26.292] [<ffff82d08012c018>] cpu_raise_softirq+0xd7/0xeb
> ... this, since
>
> void cpu_raise_softirq(unsigned int cpu, unsigned int nr)
> {
> unsigned int this_cpu = smp_processor_id();
>
> if ( test_and_set_bit(nr, &softirq_pending(cpu))
> || (cpu == this_cpu)
> || arch_skip_send_event_check(cpu) )
> return;
>
> if ( !per_cpu(batching, this_cpu) || in_irq() )
> smp_send_event_check_cpu(cpu);
> else
> set_bit(nr, &per_cpu(batch_mask, this_cpu));
> }
>
> doesn't indicate any use of cpumask functions. If, however,
> arch_skip_send_event_check()'s call to cpumask_test_cpu()
> didn't get inlined, that might be the cause. Albeit that would mean
> smp_processor_id() returned an out-of-range value... In any
> event we'll need to know what exactly above code location refers
> to inside the entire function.
Are you sure your code is up to date?
Current staging has
void cpu_raise_softirq(unsigned int cpu, unsigned int nr)
{
    unsigned int this_cpu = smp_processor_id();

    if ( test_and_set_bit(nr, &softirq_pending(cpu))
         || (cpu == this_cpu)
         || arch_skip_send_event_check(cpu) )
        return;

    if ( !per_cpu(batching, this_cpu) || in_irq() )
        smp_send_event_check_cpu(cpu);
    else
        __cpumask_set_cpu(nr, &per_cpu(batch_mask, this_cpu));
}
And furthermore, I think the final __cpumask_set_cpu(...) appears
wrong. The first parameter should be 'cpu' rather than 'nr'. I am not
surprised that the ASSERT() is firing.
~Andrew
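To illustrate the point about the assertion, here is a minimal, self-contained C model of a range-checked mask setter. It is not the Xen source: model_nr_cpu_ids, model_cpu_in_range(), model_cpumask_set_cpu() and the CPU count of 6 are hypothetical names and values chosen for illustration, and the model returns false where the hypervisor's ASSERT() would panic. It shows why handing a softirq number to a helper that expects a CPU number can trip a 'cpu < nr_cpu_ids' check whenever that number is at or above the CPU count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical CPU count for illustration; must stay below the bit width
 * of unsigned long for this single-word model. */
static const unsigned int model_nr_cpu_ids = 6;

/* Models the range check quoted in the assertion message. */
static bool model_cpu_in_range(unsigned int cpu)
{
    return cpu < model_nr_cpu_ids;
}

/*
 * Models a checked "set this CPU in the mask" helper. Returns false,
 * leaving the mask untouched, where the real ASSERT() would fire.
 */
static bool model_cpumask_set_cpu(unsigned int cpu, unsigned long *mask)
{
    assert(mask != NULL);
    if ( !model_cpu_in_range(cpu) )
    {
        fprintf(stderr,
                "Assertion 'cpu < nr_cpu_ids' would fail (cpu=%u)\n", cpu);
        return false;
    }

    *mask |= 1UL << cpu;
    return true;
}
```

With the model's CPU count of 6, model_cpumask_set_cpu(1, &mask) sets bit 1, while model_cpumask_set_cpu(7, &mask) is rejected and leaves the mask unchanged. Whether a given softirq number passes the check therefore depends only on how it compares with the CPU count, not on whether it is a valid softirq.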
* Re: Assertion 'cpu < nr_cpu_ids' failed at .../src/new/xen-unstable/xen/include/xen/cpumask.h:97
2015-02-23 10:06 ` Jan Beulich
2015-02-23 10:38 ` Andrew Cooper
@ 2015-02-23 10:45 ` Sander Eikelenboom
2015-02-23 10:57 ` Jan Beulich
1 sibling, 1 reply; 6+ messages in thread
From: Sander Eikelenboom @ 2015-02-23 10:45 UTC
To: Jan Beulich; +Cc: xen-devel
Monday, February 23, 2015, 11:06:25 AM, you wrote:
>>>> On 23.02.15 at 10:27, <linux@eikelenboom.it> wrote:
>> While shutting down all guests to go for a host reboot I encountered the
>> splat below.
>> This was running on Xen with:
>> xen_changeset: Fri Feb 20 16:21:10 2015 +0100 git:24b2b8d-dirty
> "-dirty" meaning what?
Patch for re-enabling HPET, which doesn't get enabled due to a BIOS glitch, but
actually works fine (and has for over a year or so).
(And if it's not enabled, cpuidle breaks badly.)
diff --git a/xen/drivers/passthrough/amd/iommu_intr.c b/xen/drivers/passthrough/amd/iommu_intr.c
index c1b76fb..43435bc 100644
--- a/xen/drivers/passthrough/amd/iommu_intr.c
+++ b/xen/drivers/passthrough/amd/iommu_intr.c
@@ -608,7 +608,7 @@ int __init amd_setup_hpet_msi(struct msi_desc *msi_desc)
{
AMD_IOMMU_DEBUG("Failed to setup HPET MSI remapping."
" Wrong HPET.\n");
- return -ENODEV;
+ /* return -ENODEV; */
}
lock = get_intremap_lock(hpet_sbdf.seg, hpet_sbdf.bdf);
And the other one is Konrad's temp fix for the dpci softirq problem:
diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
index ae050df..ed3cfa1 100644
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -804,7 +804,19 @@ static void dpci_softirq(void)
d = pirq_dpci->dom;
smp_mb(); /* 'd' MUST be saved before we set/clear the bits. */
if ( test_and_set_bit(STATE_RUN, &pirq_dpci->state) )
- BUG();
+ {
+ unsigned long flags;
+
+ /* Put back on the list and retry. */
+ local_irq_save(flags);
+ list_add_tail(&pirq_dpci->softirq_list, &this_cpu(dpci_list));
+ local_irq_restore(flags);
+
+ raise_softirq(HVM_DPCI_SOFTIRQ);
+ continue;
+ }
+
+
/*
* The one who clears STATE_SCHED MUST refcount the domain.
*/
>> (XEN) [2015-02-23 09:16:26.292] Assertion 'cpu < nr_cpu_ids' failed at .../src/new/xen-unstable/xen/include/xen/cpumask.h:97
> Since with debug=y the callstack entries should be reliable, I can't
> see how this matches up with ...
>> (XEN) [2015-02-23 09:16:26.292] Xen call trace:
>> (XEN) [2015-02-23 09:16:26.292] [<ffff82d08012c018>] cpu_raise_softirq+0xd7/0xeb
> ... this, since
> void cpu_raise_softirq(unsigned int cpu, unsigned int nr)
> {
> unsigned int this_cpu = smp_processor_id();
> if ( test_and_set_bit(nr, &softirq_pending(cpu))
> || (cpu == this_cpu)
> || arch_skip_send_event_check(cpu) )
> return;
> if ( !per_cpu(batching, this_cpu) || in_irq() )
> smp_send_event_check_cpu(cpu);
> else
> set_bit(nr, &per_cpu(batch_mask, this_cpu));
> }
> doesn't indicate any use of cpumask functions. If, however,
> arch_skip_send_event_check()'s call to cpumask_test_cpu()
> didn't get inlined, that might be the cause. Albeit that would mean
> smp_processor_id() returned an out-of-range value... In any
> event we'll need to know what exactly above code location refers
> to inside the entire function.
Any instructions on how to figure that out?
--
Sander
> Jan
* Re: Assertion 'cpu < nr_cpu_ids' failed at .../src/new/xen-unstable/xen/include/xen/cpumask.h:97
2015-02-23 10:38 ` Andrew Cooper
@ 2015-02-23 10:56 ` Jan Beulich
0 siblings, 0 replies; 6+ messages in thread
From: Jan Beulich @ 2015-02-23 10:56 UTC
To: Andrew Cooper, Sander Eikelenboom; +Cc: xen-devel
>>> On 23.02.15 at 11:38, <andrew.cooper3@citrix.com> wrote:
> On 23/02/15 10:06, Jan Beulich wrote:
>>>>> On 23.02.15 at 10:27, <linux@eikelenboom.it> wrote:
>>> While shutting down all guests to go for a host reboot I encountered the
>>> splat below.
>>> This was running on Xen with:
>>> xen_changeset: Fri Feb 20 16:21:10 2015 +0100 git:24b2b8d-dirty
>> "-dirty" meaning what?
>>
>>> (XEN) [2015-02-23 09:16:26.292] Assertion 'cpu < nr_cpu_ids' failed at
> .../src/new/xen-unstable/xen/include/xen/cpumask.h:97
>> Since with debug=y the callstack entries should be reliable, I can't
>> see how this matches up with ...
>>
>>> (XEN) [2015-02-23 09:16:26.292] Xen call trace:
>>> (XEN) [2015-02-23 09:16:26.292] [<ffff82d08012c018>]
> cpu_raise_softirq+0xd7/0xeb
>> ... this, since
>>
>> void cpu_raise_softirq(unsigned int cpu, unsigned int nr)
>> {
>> unsigned int this_cpu = smp_processor_id();
>>
>> if ( test_and_set_bit(nr, &softirq_pending(cpu))
>> || (cpu == this_cpu)
>> || arch_skip_send_event_check(cpu) )
>> return;
>>
>> if ( !per_cpu(batching, this_cpu) || in_irq() )
>> smp_send_event_check_cpu(cpu);
>> else
>> set_bit(nr, &per_cpu(batch_mask, this_cpu));
>> }
>>
>> doesn't indicate any use of cpumask functions. If, however,
>> arch_skip_send_event_check()'s call to cpumask_test_cpu()
>> didn't get inlined, that might be the cause. Albeit that would mean
>> smp_processor_id() returned an out-of-range value... In any
>> event we'll need to know what exactly above code location refers
>> to inside the entire function.
>
> Are you sure your code is up to date?
>
> Current staging has
Ah, I looked at master, not staging.
> void cpu_raise_softirq(unsigned int cpu, unsigned int nr)
> {
> unsigned int this_cpu = smp_processor_id();
>
> if ( test_and_set_bit(nr, &softirq_pending(cpu))
> || (cpu == this_cpu)
> || arch_skip_send_event_check(cpu) )
> return;
>
> if ( !per_cpu(batching, this_cpu) || in_irq() )
> smp_send_event_check_cpu(cpu);
> else
> __cpumask_set_cpu(nr, &per_cpu(batch_mask, this_cpu));
> }
>
>
> And furthermore, I think the final __cpumask_set_cpu(...) appears
> wrong. The first parameter should be 'cpu' rather than 'nr'. I am not
> surprised that the ASSERT() is firing.
No, the conversion to __cpumask_set_cpu() was wrong here in
the first place - this ought to be __set_bit(). Will submit a fix in a
minute.
Jan
* Re: Assertion 'cpu < nr_cpu_ids' failed at .../src/new/xen-unstable/xen/include/xen/cpumask.h:97
2015-02-23 10:45 ` Sander Eikelenboom
@ 2015-02-23 10:57 ` Jan Beulich
0 siblings, 0 replies; 6+ messages in thread
From: Jan Beulich @ 2015-02-23 10:57 UTC
To: Sander Eikelenboom; +Cc: xen-devel
>>> On 23.02.15 at 11:45, <linux@eikelenboom.it> wrote:
> Any instructions on how to figure that out?
No need anymore - with Andrew's help it's now already clear what's
wrong.
Jan