* dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843
@ 2020-11-12 15:57 Manuel Bouyer
  2020-11-12 16:32 ` Roger Pau Monné
  0 siblings, 1 reply; 14+ messages in thread
From: Manuel Bouyer @ 2020-11-12 15:57 UTC (permalink / raw)
  To: xen-devel

Hello,
I'm trying to add dom0 PVH support to NetBSD. I'm testing with Xen 4.13
on a brand new Intel x86 server (Dell R440).
While the dom0 kernel configures hardware, Xen panics with:
(XEN) Xen call trace:
(XEN)    [<ffff82d08031cc28>] R vpci_msix_arch_mask_entry+0x18/0x20
(XEN)    [<ffff82d08025a38a>] S drivers/vpci/msix.c#msix_write+0x18a/0x2b0
(XEN)    [<ffff82d08030d943>] S arch/x86/hvm/intercept.c#hvm_mmio_write+0x23/0x30
(XEN)    [<ffff82d08030dd19>] S hvm_process_io_intercept+0x1e9/0x260
(XEN)    [<ffff82d08030ddad>] S hvm_io_intercept+0x1d/0x40
(XEN)    [<ffff82d0802fe7ba>] S arch/x86/hvm/emulate.c#hvmemul_do_io+0x26a/0x4d0
(XEN)    [<ffff82d080259ef9>] S drivers/vpci/msix.c#msix_accept+0x9/0x20
(XEN)    [<ffff82d0802fea56>] S arch/x86/hvm/emulate.c#hvmemul_do_io_buffer+0x36/0x70
(XEN)    [<ffff82d0802ff005>] S arch/x86/hvm/emulate.c#hvmemul_linear_mmio_access+0x1e5/0x300
(XEN)    [<ffff82d0802fff44>] S arch/x86/hvm/emulate.c#linear_write+0x84/0x160
(XEN)    [<ffff82d080301ca8>] S arch/x86/hvm/emulate.c#hvmemul_write+0xe8/0x100
(XEN)    [<ffff82d0802de6cc>] S x86_emulate+0x289dc/0x2cfb0
(XEN)    [<ffff82d08027c7ab>] S map_domain_page+0x4b/0x600
(XEN)    [<ffff82d080340eaa>] S __get_gfn_type_access+0x6a/0x100
(XEN)    [<ffff82d08034a367>] S arch/x86/mm/p2m-ept.c#ept_next_level+0x107/0x150
(XEN)    [<ffff82d0802e4961>] S x86_emulate_wrapper+0x21/0x60
(XEN)    [<ffff82d08030024f>] S arch/x86/hvm/emulate.c#_hvm_emulate_one+0x4f/0x220
(XEN)    [<ffff82d0803004ed>] S hvmemul_get_seg_reg+0x4d/0x50
(XEN)    [<ffff82d08030042e>] S hvm_emulate_one+0xe/0x10
(XEN)    [<ffff82d08030e4ca>] S hvm_emulate_one_insn+0x3a/0xf0
(XEN)    [<ffff82d0802e4af0>] S x86_insn_is_mem_access+0/0x260
(XEN)    [<ffff82d08030e5c9>] S handle_mmio_with_translation+0x49/0x60
(XEN)    [<ffff82d080305d78>] S hvm_hap_nested_page_fault+0x2c8/0x720
(XEN)    [<ffff82d0802fea56>] S arch/x86/hvm/emulate.c#hv(XEN) 
(XEN) ****************************************
(XEN) Panic on CPU 13:
(XEN) Assertion 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843
(XEN) ****************************************

This is when it configures the broadcom network interface, which interrupts
at "msix3 vec 0". It is the first MSI-X device configured; the previous
ones are MSI only.

Is it a bug on the Xen side, or something missing on the NetBSD side?
If the latter, where can I find information about it?

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference
--



* Re: dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843
  2020-11-12 15:57 dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843 Manuel Bouyer
@ 2020-11-12 16:32 ` Roger Pau Monné
  2020-11-12 17:02   ` Jan Beulich
  2020-11-12 17:27   ` Manuel Bouyer
  0 siblings, 2 replies; 14+ messages in thread
From: Roger Pau Monné @ 2020-11-12 16:32 UTC (permalink / raw)
  To: Manuel Bouyer; +Cc: xen-devel

On Thu, Nov 12, 2020 at 04:57:15PM +0100, Manuel Bouyer wrote:
> Hello,
> I'm trying to add dom0 PVH support to NetBSD. I'm testing with Xen 4.13
> on a brand new Intel x86 server (Dell R440).

I would recommend using 4.14. PVH dom0 is still very much a work in
progress, and while I don't recall any specific fix going into 4.14 that
could be related to this, there have been changes.

> While the dom0 kernel configures hardware, Xen panics with:
> (XEN) Xen call trace:
> (XEN)    [<ffff82d08031cc28>] R vpci_msix_arch_mask_entry+0x18/0x20
> (XEN)    [<ffff82d08025a38a>] S drivers/vpci/msix.c#msix_write+0x18a/0x2b0
> (XEN)    [<ffff82d08030d943>] S arch/x86/hvm/intercept.c#hvm_mmio_write+0x23/0x30
> (XEN)    [<ffff82d08030dd19>] S hvm_process_io_intercept+0x1e9/0x260
> (XEN)    [<ffff82d08030ddad>] S hvm_io_intercept+0x1d/0x40
> (XEN)    [<ffff82d0802fe7ba>] S arch/x86/hvm/emulate.c#hvmemul_do_io+0x26a/0x4d0
> (XEN)    [<ffff82d080259ef9>] S drivers/vpci/msix.c#msix_accept+0x9/0x20
> (XEN)    [<ffff82d0802fea56>] S arch/x86/hvm/emulate.c#hvmemul_do_io_buffer+0x36/0x70
> (XEN)    [<ffff82d0802ff005>] S arch/x86/hvm/emulate.c#hvmemul_linear_mmio_access+0x1e5/0x300
> (XEN)    [<ffff82d0802fff44>] S arch/x86/hvm/emulate.c#linear_write+0x84/0x160
> (XEN)    [<ffff82d080301ca8>] S arch/x86/hvm/emulate.c#hvmemul_write+0xe8/0x100
> (XEN)    [<ffff82d0802de6cc>] S x86_emulate+0x289dc/0x2cfb0
> (XEN)    [<ffff82d08027c7ab>] S map_domain_page+0x4b/0x600
> (XEN)    [<ffff82d080340eaa>] S __get_gfn_type_access+0x6a/0x100
> (XEN)    [<ffff82d08034a367>] S arch/x86/mm/p2m-ept.c#ept_next_level+0x107/0x150
> (XEN)    [<ffff82d0802e4961>] S x86_emulate_wrapper+0x21/0x60
> (XEN)    [<ffff82d08030024f>] S arch/x86/hvm/emulate.c#_hvm_emulate_one+0x4f/0x220
> (XEN)    [<ffff82d0803004ed>] S hvmemul_get_seg_reg+0x4d/0x50
> (XEN)    [<ffff82d08030042e>] S hvm_emulate_one+0xe/0x10
> (XEN)    [<ffff82d08030e4ca>] S hvm_emulate_one_insn+0x3a/0xf0
> (XEN)    [<ffff82d0802e4af0>] S x86_insn_is_mem_access+0/0x260
> (XEN)    [<ffff82d08030e5c9>] S handle_mmio_with_translation+0x49/0x60
> (XEN)    [<ffff82d080305d78>] S hvm_hap_nested_page_fault+0x2c8/0x720
> (XEN)    [<ffff82d0802fea56>] S arch/x86/hvm/emulate.c#hv(XEN) 
> (XEN) ****************************************
> (XEN) Panic on CPU 13:
> (XEN) Assertion 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843
> (XEN) ****************************************
> 
> This is when it configures the broadcom network interface, which interrupts
> at "msix3 vec 0". It is the first MSI-X device configured; the previous
> ones are MSI only.
> 
> Is it a bug on the Xen side, or something missing on the NetBSD side ?

Looks like a bug on the Xen side, do you see any relevant messages
before hitting the assert?

Can you give a try to the following debug patch and paste what you
get?

Thanks, Roger.
---8<---
diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
index 64dd0a929c..7ff76b7f59 100644
--- a/xen/drivers/vpci/msix.c
+++ b/xen/drivers/vpci/msix.c
@@ -371,7 +371,12 @@ static int msix_write(struct vcpu *v, unsigned long addr, unsigned int len,
             entry->updated = false;
         }
         else
+        {
+            printk("%pp offset %u len %u new_masked %d enabled %d masked %d updated %d\n",
+                   &pdev->sbdf, offset, len, new_masked, msix->enabled, msix->masked,
+                   entry->updated);
             vpci_msix_arch_mask_entry(entry, pdev, entry->masked);
+        }
 
         break;
     }



* Re: dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843
  2020-11-12 16:32 ` Roger Pau Monné
@ 2020-11-12 17:02   ` Jan Beulich
  2020-11-12 17:27   ` Manuel Bouyer
  1 sibling, 0 replies; 14+ messages in thread
From: Jan Beulich @ 2020-11-12 17:02 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Manuel Bouyer

On 12.11.2020 17:32, Roger Pau Monné wrote:
> --- a/xen/drivers/vpci/msix.c
> +++ b/xen/drivers/vpci/msix.c
> @@ -371,7 +371,12 @@ static int msix_write(struct vcpu *v, unsigned long addr, unsigned int len,
>              entry->updated = false;
>          }
>          else
> +        {
> +            printk("%pp offset %u len %u new_masked %d enabled %d masked %d updated %d\n",
> +                   &pdev->sbdf, offset, len, new_masked, msix->enabled, msix->masked,
> +                   entry->updated);
>              vpci_msix_arch_mask_entry(entry, pdev, entry->masked);
> +        }

What about a write of all zero as the very first write we
get to see, while msix->masked is true? I'm getting the
impression we would never have come through update_entry()
in this case, and hence vpci_msix_arch_enable_entry() - the
only function setting entry->arch.pirq to a valid value
afaics - would never have been called.

Jan
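
A minimal C sketch of the scenario Jan describes. It is a model, not the Xen source: the struct and function names are hypothetical, the branch condition is an assumption reconstructed from the fields printed by the debug patch, and the starting state relies on the PCI rule that MSI-X per-vector mask bits are set after reset. It shows a first write of zero to an entry's Vector Control, made while the function mask is set, falling through to the mask path with no PIRQ bound.

```c
#include <stdbool.h>

#define INVALID_PIRQ (-1)   /* placeholder value for this sketch only */

/* Hypothetical, heavily reduced stand-ins for the vPCI MSI-X state. */
struct model_entry {
    bool masked;    /* per-vector mask bit as last written by the guest */
    bool updated;   /* address/data changed since the entry was last set up */
    int  pirq;      /* INVALID_PIRQ until the entry has been enabled once */
};

struct model_msix {
    bool enabled;   /* MSI-X enable bit in the capability */
    bool masked;    /* function-wide mask bit in the capability */
};

enum action { ACT_NONE, ACT_UPDATE_ENTRY, ACT_MASK_ENTRY };

/* Assumed shape of the Vector Control write handling before any fix. */
static enum action vector_ctrl_write(const struct model_msix *msix,
                                     struct model_entry *entry,
                                     bool new_masked)
{
    if (entry->masked == new_masked)
        return ACT_NONE;           /* mask bit unchanged, nothing to do */

    entry->masked = new_masked;

    if (!new_masked && msix->enabled && !msix->masked && entry->updated) {
        entry->pirq = 42;          /* stands in for binding a real PIRQ */
        entry->updated = false;
        return ACT_UPDATE_ENTRY;
    }

    /* In the thread, the function called here asserts a valid PIRQ. */
    return ACT_MASK_ENTRY;
}

/* True if the write reaches the mask path while no PIRQ has been bound. */
static bool would_trip_assert(struct model_msix msix, struct model_entry entry,
                              bool new_masked)
{
    enum action act = vector_ctrl_write(&msix, &entry, new_masked);

    return act == ACT_MASK_ENTRY && entry.pirq == INVALID_PIRQ;
}
```

Calling `would_trip_assert()` with MSI-X enabled, the function mask set, and a fresh entry (masked, no PIRQ) being unmasked reports that the assertion would be hit in this model.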



* Re: dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843
  2020-11-12 16:32 ` Roger Pau Monné
  2020-11-12 17:02   ` Jan Beulich
@ 2020-11-12 17:27   ` Manuel Bouyer
  2020-11-12 17:57     ` Andrew Cooper
  2020-11-12 20:19     ` Roger Pau Monné
  1 sibling, 2 replies; 14+ messages in thread
From: Manuel Bouyer @ 2020-11-12 17:27 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel

On Thu, Nov 12, 2020 at 05:32:40PM +0100, Roger Pau Monné wrote:
> On Thu, Nov 12, 2020 at 04:57:15PM +0100, Manuel Bouyer wrote:
> > Hello,
> > I'm trying to add dom0 PVH support to NetBSD. I'm testing with Xen 4.13
> > on a brand new Intel x86 server (Dell R440).
> 
> I would recommend using 4.14. PVH dom0 is still very much a work in
> progress, and while I don't recall any specific fix going into 4.14 that
> could be related to this, there have been changes.

Packaging Xen on NetBSD requires quite a bit of work, so I don't package
every release. I still need to get the NetBSD patches in shape to contribute
back ...


> [...]
> > This is when it configures the broadcom network interface, which interrupts
> > at "msix3 vec 0". It is the first MSI-X device configured; the previous
> > ones are MSI only.
> > 
> > Is it a bug on the Xen side, or something missing on the NetBSD side ?
> 
> Looks like a bug on the Xen side, do you see any relevant messages
> before hitting the assert?

nothing from Xen

> 
> Can you give a try to the following debug patch and paste what you
> get?
> 
> Thanks, Roger.
> ---8<---
> diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
> index 64dd0a929c..7ff76b7f59 100644
> --- a/xen/drivers/vpci/msix.c
> +++ b/xen/drivers/vpci/msix.c
> @@ -371,7 +371,12 @@ static int msix_write(struct vcpu *v, unsigned long addr, unsigned int len,
>              entry->updated = false;
>          }
>          else
> +        {
> +            printk("%pp offset %u len %u new_masked %d enabled %d masked %d updated %d\n",
> +                   &pdev->sbdf, offset, len, new_masked, msix->enabled, msix->masked,
> +                   entry->updated);
>              vpci_msix_arch_mask_entry(entry, pdev, entry->masked);
> +        }
>  
>          break;
>      }

I get
(XEN) ffff83083feaf500p offset 12 len 4 new_masked 0 enabled 0 masked 0 updated 1
(XEN) Assertion 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843

You can find the full serial console log at
http://www-soc.lip6.fr/~bouyer/xen-log.txt

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference
--
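
For reference, the "offset 12 len 4" in the debug line matches the Vector Control dword of an MSI-X table entry, assuming the printed offset is the byte offset within one 16-byte entry (an assumption about the debug patch, not something stated in the thread). The layout below is the one defined by the PCI specification; the helper and constant names are made up for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* The four 32-bit fields of one MSI-X table entry, in memory order. */
enum msix_field {
    MSIX_ADDR_LO,      /* byte offset 0:  Message Address */
    MSIX_ADDR_HI,      /* byte offset 4:  Message Upper Address */
    MSIX_DATA,         /* byte offset 8:  Message Data */
    MSIX_VECTOR_CTRL   /* byte offset 12: Vector Control */
};

#define MSIX_ENTRY_SIZE          16u
#define MSIX_VECTOR_CTRL_MASKBIT 0x1u   /* bit 0 of Vector Control */

/* Which field a dword-aligned byte offset into the table falls in. */
static enum msix_field msix_field_at(unsigned int table_offset)
{
    return (enum msix_field)((table_offset % MSIX_ENTRY_SIZE) / 4u);
}

/* Which table entry a byte offset into the table belongs to. */
static unsigned int msix_entry_index(unsigned int table_offset)
{
    return table_offset / MSIX_ENTRY_SIZE;
}

/* Whether a value written to Vector Control masks the vector. */
static bool msix_write_masks_vector(uint32_t value)
{
    return (value & MSIX_VECTOR_CTRL_MASKBIT) != 0;
}
```

Read this way, the logged write is a 4-byte store to Vector Control that clears the mask bit (new_masked 0) while MSI-X itself is still disabled (enabled 0).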



* Re: dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843
  2020-11-12 17:27   ` Manuel Bouyer
@ 2020-11-12 17:57     ` Andrew Cooper
  2020-11-12 20:19     ` Roger Pau Monné
  1 sibling, 0 replies; 14+ messages in thread
From: Andrew Cooper @ 2020-11-12 17:57 UTC (permalink / raw)
  To: Manuel Bouyer, Roger Pau Monné; +Cc: xen-devel

On 12/11/2020 17:27, Manuel Bouyer wrote:
> On Thu, Nov 12, 2020 at 05:32:40PM +0100, Roger Pau Monné wrote:
>> On Thu, Nov 12, 2020 at 04:57:15PM +0100, Manuel Bouyer wrote:
>>> Hello,
>>> I'm trying to add dom0 PVH support to NetBSD. I'm testing with Xen 4.13
>>> on a brand new Intel x86 server (Dell R440).
>> I would recommend using 4.14. PVH dom0 is still very much a work in
>> progress, and while I don't recall any specific fix going into 4.14 that
>> could be related to this, there have been changes.
> Packaging Xen on NetBSD requires quite a bit of work, so I don't package
> every release. I still need to get the NetBSD patches in shape to contribute
> back ...

For issues like this, you don't need to package all of Xen.  You can
just build the hypervisor locally and boot that.  (It doesn't matter if
the dom0 userspace doesn't come up cleanly if you're debugging a general
issue between the kernel and Xen.)

~Andrew



* Re: dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843
  2020-11-12 17:27   ` Manuel Bouyer
  2020-11-12 17:57     ` Andrew Cooper
@ 2020-11-12 20:19     ` Roger Pau Monné
  2020-11-13  9:44       ` Manuel Bouyer
  2020-11-13 11:54       ` Manuel Bouyer
  1 sibling, 2 replies; 14+ messages in thread
From: Roger Pau Monné @ 2020-11-12 20:19 UTC (permalink / raw)
  To: Manuel Bouyer; +Cc: xen-devel

On Thu, Nov 12, 2020 at 06:27:04PM +0100, Manuel Bouyer wrote:
> On Thu, Nov 12, 2020 at 05:32:40PM +0100, Roger Pau Monné wrote:
> > On Thu, Nov 12, 2020 at 04:57:15PM +0100, Manuel Bouyer wrote:
> > Can you give a try to the following debug patch and paste what you
> > get?
> > 
> > Thanks, Roger.
> > ---8<---
> > diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
> > index 64dd0a929c..7ff76b7f59 100644
> > --- a/xen/drivers/vpci/msix.c
> > +++ b/xen/drivers/vpci/msix.c
> > @@ -371,7 +371,12 @@ static int msix_write(struct vcpu *v, unsigned long addr, unsigned int len,
> >              entry->updated = false;
> >          }
> >          else
> > +        {
> > +            printk("%pp offset %u len %u new_masked %d enabled %d masked %d updated %d\n",
> > +                   &pdev->sbdf, offset, len, new_masked, msix->enabled, msix->masked,
> > +                   entry->updated);
> >              vpci_msix_arch_mask_entry(entry, pdev, entry->masked);
> > +        }
> >  
> >          break;
> >      }
> 
> I get
> (XEN) ffff83083feaf500p offset 12 len 4 new_masked 0 enabled 0 masked 0 updated 1
> (XEN) Assertion 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843
> 
> You can find the full serial console log at
> http://www-soc.lip6.fr/~bouyer/xen-log.txt

The following might be able to get you going, but I think I need to
refine the logic a bit there, will have to give it some thought.

Thanks, Roger.
---8<---
diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
index 64dd0a929c..3eb6102a61 100644
--- a/xen/drivers/vpci/msix.c
+++ b/xen/drivers/vpci/msix.c
@@ -370,7 +370,7 @@ static int msix_write(struct vcpu *v, unsigned long addr, unsigned int len,
 
             entry->updated = false;
         }
-        else
+        else if ( msix->enabled )
             vpci_msix_arch_mask_entry(entry, pdev, entry->masked);
 
         break;
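
The effect of the one-line change can be sketched with a small decision function. This is a model under assumptions, not the Xen code: the first branch condition is inferred from the values the debug patch prints, the names are hypothetical, and the per-entry mask bit is assumed to have actually changed. Fed the values from the log above (new_masked 0, enabled 0, masked 0, updated 1), the unpatched logic reaches the mask call and the patched logic does nothing.

```c
#include <stdbool.h>

enum vctrl_action { VCTRL_NOTHING, VCTRL_UPDATE_ENTRY, VCTRL_MASK_ENTRY };

/*
 * Hypothetical model of what a Vector Control write leads to.
 *   new_masked: mask bit being written for the entry
 *   enabled:    MSI-X enable bit of the function
 *   fn_masked:  function-wide mask bit
 *   updated:    address/data changed since the entry was last set up
 *   patched:    true to model "else if ( msix->enabled )", false for "else"
 */
static enum vctrl_action vctrl_decide(bool new_masked, bool enabled,
                                      bool fn_masked, bool updated,
                                      bool patched)
{
    /* Assumed condition for (re)programming the entry. */
    if (!new_masked && enabled && !fn_masked && updated)
        return VCTRL_UPDATE_ENTRY;

    /* Without the patch every other write ends up in the mask call. */
    if (!patched || enabled)
        return VCTRL_MASK_ENTRY;

    /* With the patch, writes made while MSI-X is disabled are only recorded. */
    return VCTRL_NOTHING;
}
```

In this model the guard only covers the case where MSI-X is disabled; a write arriving while MSI-X is enabled but function-masked still reaches the mask call, which may be one of the cases the refinement mentioned above has to consider.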



* Re: dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843
  2020-11-12 20:19     ` Roger Pau Monné
@ 2020-11-13  9:44       ` Manuel Bouyer
  2020-11-13 11:54       ` Manuel Bouyer
  1 sibling, 0 replies; 14+ messages in thread
From: Manuel Bouyer @ 2020-11-13  9:44 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel

On Thu, Nov 12, 2020 at 09:19:39PM +0100, Roger Pau Monné wrote:
> 
> The following might be able to get you going, but I think I need to
> refine the logic a bit there, will have to give it some thought.

Thanks. It avoids the panic, but it seems that MSI and MSI-X interrupts are
not delivered to the dom0 kernel. The counters stay at 0.
I get some ioapic interrupts though, as well as some Xen events.

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference
--



* Re: dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843
  2020-11-12 20:19     ` Roger Pau Monné
  2020-11-13  9:44       ` Manuel Bouyer
@ 2020-11-13 11:54       ` Manuel Bouyer
  2020-11-13 14:33         ` Roger Pau Monné
  2020-11-13 14:35         ` Roger Pau Monné
  1 sibling, 2 replies; 14+ messages in thread
From: Manuel Bouyer @ 2020-11-13 11:54 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel

On Thu, Nov 12, 2020 at 09:19:39PM +0100, Roger Pau Monné wrote:
> The following might be able to get you going, but I think I need to
> refine the logic a bit there, will have to give it some thought.

I also tested with xen devel (Xen version 4.15-unstable, Latest ChangeSet: Wed Nov 4 09:27:22 2020 +0100 git:9ff9705647-dirty).
Your patch is needed there too to avoid the panic.

As with 4.13, I have problems with interrupts not being properly
delivered. The strange thing is that the counter is not 0, but 3 (with 4.13)
or 2 (with 4.15), which would mean that interrupts stop being delivered
at some point in the setup process. Maybe something to do with mask/unmask?

The problematic interrupt is identified as "ioapic2 pin 2" by the NetBSD kernel,
so it's not MSI/MSI-X (not sure it matters, though).

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference
--



* Re: dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843
  2020-11-13 11:54       ` Manuel Bouyer
@ 2020-11-13 14:33         ` Roger Pau Monné
  2020-11-13 15:14           ` Manuel Bouyer
  2020-11-13 16:46           ` Manuel Bouyer
  2020-11-13 14:35         ` Roger Pau Monné
  1 sibling, 2 replies; 14+ messages in thread
From: Roger Pau Monné @ 2020-11-13 14:33 UTC (permalink / raw)
  To: Manuel Bouyer; +Cc: xen-devel

On Fri, Nov 13, 2020 at 12:54:57PM +0100, Manuel Bouyer wrote:
> On Thu, Nov 12, 2020 at 09:19:39PM +0100, Roger Pau Monné wrote:
> > The following might be able to get you going, but I think I need to
> > refine the logic a bit there, will have to give it some thought.
> 
> I also tested with xen devel (Xen version 4.15-unstable, Latest ChangeSet: Wed Nov 4 09:27:22 2020 +0100 git:9ff9705647-dirty).
> Your patch is needed there too to avoid the panic.
> 
> As with 4.13, I have problems with interrupts not being properly
> delivered. The strange thing is that the counter is not 0, but 3 (with 4.13)
> or 2 (with 4.15), which would mean that interrupts stop being delivered
> at some point in the setup process. Maybe something to do with mask/unmask?
> 
> The problematic interrupt is identified as "ioapic2 pin 2" by the NetBSD kernel,
> so it's not MSI/MSI-X (not sure it matters, though).

What device do you have on that pin? Is it the only device not working
properly? I get from that that MSI/MSI-X is now working fine.

You can get some interrupt info from the 'i' and the 'z' debug keys,
albeit that won't reflect the state of the emulated IO-APIC used by
dom0, which is likely what we care about. There's also the 'M' debug
key, but that's only useful for MSI/MSI-X.

I can try to prepare a patch to dump some info from the emulated
IO-APIC, but I'm afraid I won't get to it until Monday.

Roger.



* Re: dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843
  2020-11-13 11:54       ` Manuel Bouyer
  2020-11-13 14:33         ` Roger Pau Monné
@ 2020-11-13 14:35         ` Roger Pau Monné
  2020-11-13 15:26           ` Manuel Bouyer
  1 sibling, 1 reply; 14+ messages in thread
From: Roger Pau Monné @ 2020-11-13 14:35 UTC (permalink / raw)
  To: Manuel Bouyer; +Cc: xen-devel

On Fri, Nov 13, 2020 at 12:54:57PM +0100, Manuel Bouyer wrote:
> On Thu, Nov 12, 2020 at 09:19:39PM +0100, Roger Pau Monné wrote:
> > The following might be able to get you going, but I think I need to
> > refine the logic a bit there, will have to give it some thought.
> 
> I also tested with xen devel (Xen version 4.15-unstable, Latest ChangeSet: Wed Nov 4 09:27:22 2020 +0100 git:9ff9705647-dirty).
> Your patch is needed there too to avoid the panic.
> 
> As with 4.13, I have problems with interrupts not being properly
> delivered. The strange thing is that the counter is not 0, but 3 (with 4.13)
> or 2 (with 4.15), which would mean that interrupts stop being delivered
> at some point in the setup process. Maybe something to do with mask/unmask?
> 
> The problematic interrupt is identified as "ioapic2 pin 2" by the NetBSD kernel,
> so it's not MSI/MSI-X (not sure it matters, though).

Forgot to mention, it might also be helpful to boot Xen with
iommu=debug, just in case.

Roger.



* Re: dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843
  2020-11-13 14:33         ` Roger Pau Monné
@ 2020-11-13 15:14           ` Manuel Bouyer
  2020-11-13 16:00             ` Manuel Bouyer
  2020-11-13 16:46           ` Manuel Bouyer
  1 sibling, 1 reply; 14+ messages in thread
From: Manuel Bouyer @ 2020-11-13 15:14 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel

On Fri, Nov 13, 2020 at 03:33:49PM +0100, Roger Pau Monné wrote:
> On Fri, Nov 13, 2020 at 12:54:57PM +0100, Manuel Bouyer wrote:
> > On Thu, Nov 12, 2020 at 09:19:39PM +0100, Roger Pau Monné wrote:
> > > The following might be able to get you going, but I think I need to
> > > refine the logic a bit there, will have to give it some thought.
> > 
> > I also tested with xen devel (Xen version 4.15-unstable, Latest ChangeSet: Wed Nov 4 09:27:22 2020 +0100 git:9ff9705647-dirty).
> > Your patch is needed there too to avoid the panic.
> > 
> > As with 4.13, I have problems with interrupts not being properly
> > delivered. The strange thing is that the counter is not 0, but 3 (with 4.13)
> > or 2 (with 4.15), which would mean that interrupts stop being delivered
> > at some point in the setup process. Maybe something to do with mask/unmask?
> > 
> > The problematic interrupt is identified as "ioapic2 pin 2" by the NetBSD kernel,
> > so it's not MSI/MSI-X (not sure it matters, though).
> 
> What device do you have on that pin?

The PERC H740P raid controller.

> Is it the only device not working
> properly?

I'm not sure, as the kernel is stalling because of this.
The interrupt counters of the other devices are 0.
I can try a kernel without this driver to see whether other devices report
interrupts.

> I get from that that MSI/MSI-X is now working fine.

See above.

> 
> You can get some interrupt info from the 'i' and the 'z' debug keys,
> albeit that won't reflect the state of the emulated IO-APIC used by
> dom0, which is likely what we care about. There's also the 'M' debug
> key, but that's only useful for MSI/MSI-X.
> 
> I can try to prepare a patch to dump some info from the emulated
> IO-APIC, but I'm afraid I won't get to it until Monday.

No problem.

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference
--



* Re: dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843
  2020-11-13 14:35         ` Roger Pau Monné
@ 2020-11-13 15:26           ` Manuel Bouyer
  0 siblings, 0 replies; 14+ messages in thread
From: Manuel Bouyer @ 2020-11-13 15:26 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel

On Fri, Nov 13, 2020 at 03:35:13PM +0100, Roger Pau Monné wrote:
> Forgot to mention, it might also be helpful to boot Xen with
> iommu=debug, just in case.

I put the serial console log at
http://www-soc.lip6.fr/~bouyer/xen-log2.txt
in case it helps

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference
--



* Re: dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843
  2020-11-13 15:14           ` Manuel Bouyer
@ 2020-11-13 16:00             ` Manuel Bouyer
  0 siblings, 0 replies; 14+ messages in thread
From: Manuel Bouyer @ 2020-11-13 16:00 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel

On Fri, Nov 13, 2020 at 04:14:28PM +0100, Manuel Bouyer wrote:
> I can try a kernel without this driver to see whether other devices report
> interrupts.

This didn't change anything; I don't get more interrupts with this driver
commented out.

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference
--



* Re: dom0 PVH: 'entry->arch.pirq != INVALID_PIRQ' failed at vmsi.c:843
  2020-11-13 14:33         ` Roger Pau Monné
  2020-11-13 15:14           ` Manuel Bouyer
@ 2020-11-13 16:46           ` Manuel Bouyer
  1 sibling, 0 replies; 14+ messages in thread
From: Manuel Bouyer @ 2020-11-13 16:46 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel

I just noticed that CPU0 also stops receiving clock interrupts - which may
explain why the kernel wedges. I can still enter NetBSD's debugger,
which means that console interrupts are still working (the console's event
channel is also handled by CPU 0).
A 'q' in Xen doesn't show any pending or masked events, for any CPU.

The NetBSD interrupt counters show event channel 2's counter (the CPU0 clock)
stuck at 13, while others are happily increasing.

Any idea what to look at from here?

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference
--


