linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Dan Streetman <dan.streetman@canonical.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
	xen-devel@lists.xenproject.org, linux-pci@vger.kernel.org,
	linux-kernel@vger.kernel.org, boris.ostrovsky@oracle.com
Subject: Re: [PATCH] xen: do not re-use pirq number cached in pci device msi msg data
Date: Fri, 6 Jan 2017 20:06:51 -0500	[thread overview]
Message-ID: <20170107010651.GV16608@char.us.oracle.com> (raw)
In-Reply-To: <20170105192856.25559-1-dan.streetman@canonical.com>

On Thu, Jan 05, 2017 at 02:28:56PM -0500, Dan Streetman wrote:
> Do not read a pci device's msi message data to see if a pirq was
> previously configured for the device's msi/msix, as the old pirq was
> unmapped and may now be in use by another pci device.  The previous
> pirq should never be re-used; instead a new pirq should always be
> allocated from the hypervisor.

Won't this cause a starvation problem? That is we will run out of PIRQs
as we are not reusing them?
> 
> The xen_hvm_setup_msi_irqs() function currently checks the pci device's
> msi descriptor message data for each msi/msix vector it sets up, and if
> it finds the vector was previously configured with a pirq, and that pirq
> is mapped to an irq, it re-uses the pirq instead of requesting a new pirq
> from the hypervisor.  However, that pirq was unmapped when the pci device
> disabled its msi/msix, and it cannot be re-used; it may have been given
> to a different pci device.

Hm, but this implies that we do keep track of it.


while (true)
do
 rmmod nvme
 modprobe nvme
done

Things go boom without this patch. But with this patch does this
still work? As in we don't run out of PIRQs?

Thanks.
> 
> This exact situation is happening in a Xen guest where multiple NVMe
> controllers (pci devices) are present.  The NVMe driver configures each
> pci device's msi/msix twice; first to configure a single vector (to
> talk to the controller for its configuration info), and then it disables
> that msi/msix and re-configures with all the msi/msix it needs.  When
> multiple NVMe controllers are present, this happens concurrently on all
> of them, and in the time between controller A calling pci_disable_msix()
> and then calling pci_enable_msix_range(), controller B enables its msix
> and gets controller A's pirq allocated from the hypervisor.  Then when
> controller A re-configures its msix, its first vector tries to re-use
> the same pirq that it had before; but that pirq was allocated to
> controller B, and thus the Xen event channel for controller A's re-used
> pirq fails to map its irq to that pirq; the hypervisor already has the
> pirq mapped elsewhere.
> 
> Signed-off-by: Dan Streetman <dan.streetman@canonical.com>
> ---
>  arch/x86/pci/xen.c | 23 +++++++----------------
>  1 file changed, 7 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
> index bedfab9..a00a6c0 100644
> --- a/arch/x86/pci/xen.c
> +++ b/arch/x86/pci/xen.c
> @@ -234,23 +234,14 @@ static int xen_hvm_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
>  		return 1;
>  
>  	for_each_pci_msi_entry(msidesc, dev) {
> -		__pci_read_msi_msg(msidesc, &msg);
> -		pirq = MSI_ADDR_EXT_DEST_ID(msg.address_hi) |
> -			((msg.address_lo >> MSI_ADDR_DEST_ID_SHIFT) & 0xff);
> -		if (msg.data != XEN_PIRQ_MSI_DATA ||
> -		    xen_irq_from_pirq(pirq) < 0) {
> -			pirq = xen_allocate_pirq_msi(dev, msidesc);
> -			if (pirq < 0) {
> -				irq = -ENODEV;
> -				goto error;
> -			}
> -			xen_msi_compose_msg(dev, pirq, &msg);
> -			__pci_write_msi_msg(msidesc, &msg);
> -			dev_dbg(&dev->dev, "xen: msi bound to pirq=%d\n", pirq);
> -		} else {
> -			dev_dbg(&dev->dev,
> -				"xen: msi already bound to pirq=%d\n", pirq);
> +		pirq = xen_allocate_pirq_msi(dev, msidesc);
> +		if (pirq < 0) {
> +			irq = -ENODEV;
> +			goto error;
>  		}
> +		xen_msi_compose_msg(dev, pirq, &msg);
> +		__pci_write_msi_msg(msidesc, &msg);
> +		dev_dbg(&dev->dev, "xen: msi bound to pirq=%d\n", pirq);
>  		irq = xen_bind_pirq_msi_to_irq(dev, msidesc, pirq,
>  					       (type == PCI_CAP_ID_MSI) ? nvec : 1,
>  					       (type == PCI_CAP_ID_MSIX) ?
> -- 
> 2.9.3
> 

  reply	other threads:[~2017-01-07  1:07 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-05 19:28 [PATCH] xen: do not re-use pirq number cached in pci device msi msg data Dan Streetman
2017-01-07  1:06 ` Konrad Rzeszutek Wilk [this message]
2017-01-09 14:59   ` Boris Ostrovsky
2017-01-09 15:42     ` [Xen-devel] " Dan Streetman
2017-01-09 15:59       ` Konrad Rzeszutek Wilk
2017-01-09 19:30         ` Stefano Stabellini
2017-01-10 15:57           ` Dan Streetman
2017-01-10 18:41             ` Dan Streetman
2017-01-10 19:03               ` Stefano Stabellini
2017-01-10 21:32                 ` Dan Streetman
2017-01-10 23:28                   ` Boris Ostrovsky
2017-01-11  1:25                   ` Stefano Stabellini
2017-01-11 15:26                     ` Dan Streetman
2017-01-11 18:46                       ` Stefano Stabellini
2017-01-11 23:25                         ` Dan Streetman
2017-01-13 18:31                           ` Dan Streetman
2017-01-13 18:44                             ` Stefano Stabellini
2017-01-13 20:00                               ` Boris Ostrovsky
2017-01-13 20:07                                 ` Dan Streetman
2017-01-13 20:54                                   ` Stefano Stabellini
2017-01-13 21:49                                   ` Boris Ostrovsky
2017-01-13 22:30                                   ` Konrad Rzeszutek Wilk
2017-02-21 15:31                                     ` Dan Streetman
2017-02-21 15:45                                       ` Juergen Gross
2017-02-21 15:58                                         ` Boris Ostrovsky
2017-02-22 14:28                                           ` Bjorn Helgaas
2017-02-22 15:14                                             ` Boris Ostrovsky
2017-05-03 18:19                                               ` [Xen-devel] " David Woodhouse
2017-05-03 18:43                                                 ` Boris Ostrovsky
2017-05-03 22:59                                                   ` Stefano Stabellini
2017-05-03 23:06                                                     ` Greg KH
2017-05-03 23:12                                                       ` Stefano Stabellini
2017-05-03 23:19                                                         ` Greg KH
2017-01-13 20:13                                 ` Dan Streetman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170107010651.GV16608@char.us.oracle.com \
    --to=konrad.wilk@oracle.com \
    --cc=bhelgaas@google.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=dan.streetman@canonical.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).