linux-hyperv.vger.kernel.org archive mirror
* linux-5.1-rc3: nvme hv_pci: request for interrupt failed
From: Solio Sarabia @ 2019-04-04  0:38 UTC
  To: linux-hyperv, linux-nvme; +Cc: haiyangz, kys, decui, mikelley, shiny.sebastian

When two NVMe devices are discrete-assigned [1] to a Linux VM on a
Hyper-V RS5 host, both fail to initialize.  It worked a couple of
times, but after some reboots it started failing. `dmesg` shows:

[   13.941971] nvme nvme0: pci function 82c6:00:00.0
[   13.942802] nvme 82c6:00:00.0: can't derive routing for PCI INT A
[   13.942803] nvme 82c6:00:00.0: PCI INT A: no GSI
[   13.942844] nvme nvme1: pci function 8f8d:00:00.0
[   13.943397] nvme 8f8d:00:00.0: can't derive routing for PCI INT A
[   13.943399] nvme 8f8d:00:00.0: PCI INT A: no GSI
[   14.099310] hv_pci 96a07283-8dac-417a-82c6-111eb8b9a4c0: Request for interrupt failed: 0xc000009a
[   14.099353] hv_pci 092472da-23bf-434f-8f8d-cc7546cf6cc1: Request for interrupt failed: 0xc000009a
[   14.119391] hv_pci 96a07283-8dac-417a-82c6-111eb8b9a4c0: hv_irq_unmask() failed: 0x5
[   14.124416] hv_pci 092472da-23bf-434f-8f8d-cc7546cf6cc1: hv_irq_unmask() failed: 0x5
[   74.932888] nvme nvme1: I/O 7 QID 0 timeout, completion polled
[   74.932893] nvme nvme0: I/O 3 QID 0 timeout, completion polled
[  136.372890] nvme nvme1: I/O 4 QID 0 timeout, completion polled
[  136.372892] nvme nvme0: I/O 20 QID 0 timeout, completion polled
[  136.373280] hv_pci 092472da-23bf-434f-8f8d-cc7546cf6cc1: Request for interrupt failed: 0xc000009a
[  136.373432] hv_pci 96a07283-8dac-417a-82c6-111eb8b9a4c0: Request for interrupt failed: 0xc000009a
[  136.376262] hv_pci 092472da-23bf-434f-8f8d-cc7546cf6cc1: hv_irq_unmask() failed: 0x5
[  136.376906] hv_pci 96a07283-8dac-417a-82c6-111eb8b9a4c0: hv_irq_unmask() failed: 0x5
(the 'Request for interrupt failed' and 'hv_irq_unmask() failed' messages keep repeating in a loop)
...

The device is an Intel SSD P4608 PCIe NVMe drive, which Linux (5.1-rc3)
sees as two NVMe devices.  Some info from `lspci -v`:

82c6:00:00.0 Non-Volatile memory controller: Intel Corporation Express Flash NVMe P4500/P4600 (prog-if 02 [NVM Express])
8f8d:00:00.0 Non-Volatile memory controller: Intel Corporation Express Flash NVMe P4500/P4600 (prog-if 02 [NVM Express])
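
For anyone triaging a similar log, here is a minimal sketch that tallies the hv_pci failures per device. It assumes only the two message formats shown in the `dmesg` output above. It also relies on an observation from this log alone: the PCI domain (82c6, 8f8d) matches the fourth group of the hv_pci GUID. The names `HV_PCI_FAIL` and `summarize` are made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
# [   14.099310] hv_pci <guid>: Request for interrupt failed: 0xc000009a
# [   14.119391] hv_pci <guid>: hv_irq_unmask() failed: 0x5
HV_PCI_FAIL = re.compile(
    r"hv_pci (?P<guid>[0-9a-f-]{36}): "
    r"(?P<what>Request for interrupt|hv_irq_unmask\(\)) failed: "
    r"(?P<code>0x[0-9a-f]+)"
)


def summarize(dmesg_text):
    """Count hv_pci failures per (PCI domain, operation, status code)."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        m = HV_PCI_FAIL.search(line)
        if m:
            # Assumption taken from this log only: the fourth GUID group
            # equals the PCI domain reported by the nvme driver and lspci.
            domain = m.group("guid").split("-")[3]
            counts[(domain, m.group("what"), m.group("code"))] += 1
    return counts
```

Feeding it the output of `dmesg` gives one counter entry per device, operation, and status code, which makes it easy to see whether both devices fail in the same way.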

Let me know if other info/logs are needed.

[1] https://docs.microsoft.com/en-us/windows-server/virtualization/hyper-v/deploy/deploying-storage-devices-using-dda

Thanks,
-Solio


* RE: linux-5.1-rc3: nvme hv_pci: request for interrupt failed
From: Dexuan Cui @ 2019-04-04  2:42 UTC
  To: Solio Sarabia, linux-hyperv, linux-nvme
  Cc: Haiyang Zhang, KY Srinivasan, Michael Kelley, Shiny Sebastian

> From: Solio Sarabia <solio.sarabia@intel.com>
> Sent: Wednesday, April 3, 2019 5:38 PM
> To: linux-hyperv@vger.kernel.org; linux-nvme@lists.infradead.org
> 
> When two nvme devices are discrete-assigned [1] to a linuxvm on
> hyper-v rs5 host, it fails to initialize both.  It worked a couple of
> times and after some reboots it failed. `dmesg` shows:
> 
> [   14.099310] hv_pci 96a07283-8dac-417a-82c6-111eb8b9a4c0: Request for
> interrupt failed: 0xc000009a
> 
> Thanks,
> -Solio

0xc000009a is STATUS_INSUFFICIENT_RESOURCES.
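
As a rough illustration of how such a value breaks down, the sketch below splits a 32-bit NTSTATUS into its bit fields (severity, customer flag, facility, code). The helper name `decode_ntstatus` and the one-entry `KNOWN` table are made up for this example; only the value reported in this thread is listed.

```python
SEVERITY_NAMES = ["success", "informational", "warning", "error"]

# Only the status seen in this thread; not a complete table.
KNOWN = {0xC000009A: "STATUS_INSUFFICIENT_RESOURCES"}


def decode_ntstatus(value):
    """Split a 32-bit NTSTATUS value into its bit fields."""
    return {
        "name": KNOWN.get(value, "unknown"),
        "severity": SEVERITY_NAMES[(value >> 30) & 0x3],  # bits 31-30
        "customer": bool((value >> 29) & 0x1),            # bit 29
        "facility": (value >> 16) & 0xFFF,                # bits 27-16
        "code": value & 0xFFFF,                           # bits 15-0
    }


if __name__ == "__main__":
    print(decode_ntstatus(0xC000009A))
```

For 0xc000009a this yields error severity, facility 0, and code 0x9a.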

This is a known host resource leak bug in the RS5 host. After the issue
happens, rebooting the VM does not help, and rebooting the host may hang,
so we may have to power cycle the host by force.

The bug has been fixed in 19H1, which is in the Insider Preview phase, though:
https://docs.microsoft.com/en-us/windows-insider/at-home/whats-new-wip-at-home-19h1
https://www.microsoft.com/en-us/software-download/windowsinsiderpreviewadvanced

The fix is being backported to RS5, but I don't have an ETA yet. 
I'll try to get more info today and keep you updated.

Thanks,
-- Dexuan


* Re: linux-5.1-rc3: nvme hv_pci: request for interrupt failed
From: Solio Sarabia @ 2019-04-04  4:37 UTC
  To: Dexuan Cui
  Cc: linux-hyperv, linux-nvme, Haiyang Zhang, KY Srinivasan,
	Michael Kelley, Shiny Sebastian

On Thu, Apr 04, 2019 at 02:42:56AM +0000, Dexuan Cui wrote:
> > From: Solio Sarabia <solio.sarabia@intel.com>
> > Sent: Wednesday, April 3, 2019 5:38 PM
> > To: linux-hyperv@vger.kernel.org; linux-nvme@lists.infradead.org
> > 
> > When two nvme devices are discrete-assigned [1] to a linuxvm on
> > hyper-v rs5 host, it fails to initialize both.  It worked a couple of
> > times and after some reboots it failed. `dmesg` shows:
> > 
> > [   14.099310] hv_pci 96a07283-8dac-417a-82c6-111eb8b9a4c0: Request for
> > interrupt failed: 0xc000009a
> > 
> > Thanks,
> > -Solio
> 
> 0xc000009a is STATUS_INSUFFICIENT_RESOURCES.
> 
> This is a known host resource leakage bug of the RS5 host. After the issue
> happens, rebooting the VM can not help, and rebooting the host may hang
> and we may have to power cycle the host by force.
> 
> The bug has been fixed in 19H1, which is in the Insider Preview phase, though:
> https://docs.microsoft.com/en-us/windows-insider/at-home/whats-new-wip-at-home-19h1
> https://www.microsoft.com/en-us/software-download/windowsinsiderpreviewadvanced
> 
> The fix is being backported to RS5, but I don't have an ETA yet. 
> I'll try to get more info today and keep you updated.
> 
> Thanks,
> -- Dexuan

Great to know you're on top of things.  The guest hung and I had to reboot
the host; in all cases the guest was in a consistent state after the reboot.

It's not a blocking issue at the moment; I can work on one device at a time.

Thanks,
-Solio
