netdev.vger.kernel.org archive mirror
From: Ben Hutchings <bhutchings@solarflare.com>
To: Alexander Gordeev <agordeev@redhat.com>
Cc: <linux-kernel@vger.kernel.org>,
	Bjorn Helgaas <bhelgaas@google.com>,
	"Ralf Baechle" <ralf@linux-mips.org>,
	Michael Ellerman <michael@ellerman.id.au>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	Ingo Molnar <mingo@redhat.com>, Tejun Heo <tj@kernel.org>,
	Dan Williams <dan.j.williams@intel.com>,
	Andy King <acking@vmware.com>, Jon Mason <jon.mason@intel.com>,
	Matt Porter <mporter@kernel.crashing.org>,
	<linux-pci@vger.kernel.org>, <linux-mips@linux-mips.org>,
	<linuxppc-dev@lists.ozlabs.org>, <linux390@de.ibm.com>,
	<linux-s390@vger.kernel.org>, <x86@kernel.org>,
	<linux-ide@vger.kernel.org>, <iss_storagedev@hp.com>,
	<linux-nvme@lists.infradead.org>, <linux-rdma@vger.kernel.org>,
	<netdev@vger.kernel.org>, <e1000-devel@lists.sourceforge.net>,
	<linux-driver@qlogic.com>,
	Solarflare linux maintainers <linux-net-drivers@solarfla
Subject: Re: [PATCH RFC 00/77] Re-design MSI/MSI-X interrupts enablement pattern
Date: Fri, 4 Oct 2013 22:29:16 +0100
Message-ID: <1380922156.3214.49.camel@bwh-desktop.uk.level5networks.com>
In-Reply-To: <20131004082920.GA4536@dhcp-26-207.brq.redhat.com>

On Fri, 2013-10-04 at 10:29 +0200, Alexander Gordeev wrote:
> On Thu, Oct 03, 2013 at 11:49:45PM +0100, Ben Hutchings wrote:
> > On Wed, 2013-10-02 at 12:48 +0200, Alexander Gordeev wrote:
> > > This update converts pci_enable_msix() and pci_enable_msi_block()
> > > interfaces to canonical kernel functions and makes them return an
> > > error code in case of failure or 0 in case of success.
> > [...]
> > 
> > I think this is fundamentally flawed: pci_msix_table_size() and
> > pci_get_msi_cap() can only report the limits of the *device* (which the
> > driver usually already knows), whereas MSI allocation can also be
> > constrained due to *global* limits on the number of distinct IRQs.
> 
> Even the current implementation by no means addresses it. Although it
> might seem a case for architectures to report the number of IRQs available
> for a driver to retry, in fact they all just fail. The same applies to
> *any* other type of resource involved: irq_desc's, CPU interrupt vector
> space, msi_desc's etc. No platform cares about it and just bails out once
> a constraint is met (please correct me if I am wrong here). Given that Linux
> has been doing well even on embedded I think we should not change it.
>
> The only exception to the above is pSeries platform which takes advantage
> of the current design (to implement MSI quota). There are indications we
> can satisfy pSeries requirements, but the design proposed in this RFC
> is not going to change drastically anyway. The start of the discussion
> is here: https://lkml.org/lkml/2013/9/5/293

All I can see there is that Tejun didn't think that the global limits
and positive return values were implemented by any architecture.  But
you have a counter-example, so I'm not sure what your point is.

It has been quite a while since I saw this happen on x86.  But I just
checked on a test system running RHEL 5 i386 (Linux 2.6.18).  If I ask
for 16 MSI-X vectors on a device that supports 1024, the return value is
8, and indeed I can then successfully allocate 8.

Now that's going quite a way back, and it may be that global limits
aren't a significant problem any more.  With the x86_64 build of RHEL 5
on an identical system, I can allocate 16 or even 32, so this is
apparently not a hardware limit in this case.

> > Currently pci_enable_msix() will report a positive value if it fails due
> > to the global limit.  Your patch 7 removes that.  pci_enable_msi_block()
> > unfortunately doesn't appear to do this.
> 
> pci_enable_msi_block() can do more than one MSI only on x86 (with IOMMU),
> but it does not bother to return positive numbers, indeed.
> 
> > It seems to me that a more useful interface would take a minimum and
> > maximum number of vectors from the driver.  This wouldn't allow the
> > driver to specify that it could only accept, say, any even number within
> > a certain range, but you could still leave the current functions
> > available for any driver that needs that.
> 
> Mmmm.. I am not sure I am getting it. Could you please rephrase?

Most drivers seem to either:
(a) require exactly a certain number of MSI vectors, or
(b) require a minimum number of MSI vectors, usually want to allocate
more, and work with any number in between

We can support drivers in both classes by adding new allocation
functions that allow specifying a minimum (required) and maximum
(wanted) number of MSI vectors.  Those in class (a) would just specify
the same value for both.  These new functions can take account of any
global limit or allocation policy without any further changes to the
drivers that use them.
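
As a rough illustration of the minimum/maximum idea, here is a minimal
userspace sketch in C. It is not kernel code and not a proposed patch:
mock_enable_vectors(), enable_vectors_range() and platform_vectors_free
are hypothetical names invented for this example. The mock imitates the
return convention described above (0 on success, a negative errno on hard
failure, or a positive count of vectors that could be allocated instead).

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a global limit on available vectors. */
static int platform_vectors_free = 8;

/*
 * Mock of an old-style enable call (assumption, for illustration only):
 * returns 0 on success, a negative errno on hard failure, or a positive
 * number of vectors that could be allocated instead.
 */
static int mock_enable_vectors(int nvec)
{
	if (nvec <= 0)
		return -EINVAL;
	if (platform_vectors_free == 0)
		return -ENOSPC;
	if (nvec > platform_vectors_free)
		return platform_vectors_free;
	platform_vectors_free -= nvec;
	return 0;
}

/*
 * Hypothetical range interface: the caller passes the minimum it requires
 * and the maximum it wants. Returns the number of vectors allocated
 * (between minvec and maxvec) or a negative errno.
 */
static int enable_vectors_range(int minvec, int maxvec)
{
	int nvec = maxvec;
	int rc;

	if (minvec <= 0 || maxvec < minvec)
		return -EINVAL;

	for (;;) {
		rc = mock_enable_vectors(nvec);
		if (rc == 0)
			return nvec;	/* got everything we asked for */
		if (rc < 0)
			return rc;	/* hard failure */

		/* A positive hint is always smaller than the request. */
		assert(rc < nvec);
		if (rc < minvec)
			return -ENOSPC;	/* cannot meet the minimum */
		nvec = rc;		/* retry with what is available */
	}
}
```

A class (a) driver would call enable_vectors_range(n, n); a class (b)
driver would pass different values, for example enable_vectors_range(4, 16).
The retry policy lives in one place, so it could change without touching
the callers.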

The few drivers with more specific requirements would still need to
implement the currently recommended loop, using the old allocation
functions.
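
For comparison, the driver-side loop might look roughly like the sketch
below, again as a userspace simulation rather than real kernel code. It
assumes a driver that can only use an even number of vectors, the example
mentioned earlier in the thread. old_style_enable(), enable_even_vectors()
and vectors_free are hypothetical names; the mock follows the same return
convention (0, negative errno, or a positive count that could be
allocated).

```c
#include <errno.h>

/* Hypothetical stand-in for a global limit on available vectors. */
static int vectors_free = 7;

/* Mock of an old-style enable call (assumption, for illustration only). */
static int old_style_enable(int nvec)
{
	if (nvec <= 0)
		return -EINVAL;
	if (vectors_free == 0)
		return -ENOSPC;
	if (nvec > vectors_free)
		return vectors_free;	/* hint: this many would succeed */
	vectors_free -= nvec;
	return 0;
}

/*
 * Driver-specific retry loop: the driver needs an even number of vectors
 * in [minvec, maxvec]. Returns the number allocated or a negative errno.
 */
static int enable_even_vectors(int minvec, int maxvec)
{
	int nvec = maxvec & ~1;	/* round the first request down to even */
	int rc;

	if (minvec <= 0 || maxvec < minvec)
		return -EINVAL;

	while (nvec >= minvec) {
		rc = old_style_enable(nvec);
		if (rc == 0)
			return nvec;
		if (rc < 0)
			return rc;
		/* rc vectors are available; apply the driver's own constraint. */
		nvec = rc & ~1;
	}
	return -ENOSPC;
}
```

The constraint (even counts only) is something a generic minimum/maximum
interface cannot express, which is why such a driver keeps its own loop.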

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

