linux-kernel.vger.kernel.org archive mirror
From: Bjorn Helgaas <bhelgaas@google.com>
To: Alexander Gordeev <agordeev@redhat.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"x86@kernel.org" <x86@kernel.org>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	Yinghai Lu <yinghai@kernel.org>, Joerg Roedel <joro@8bytes.org>,
	Jan Beulich <JBeulich@suse.com>, Ingo Molnar <mingo@redhat.com>
Subject: Re: [PATCH v3 -tip x86/apic 1/2] PCI/MSI: Allocate as many multiple-MSIs as requested
Date: Mon, 3 Jun 2013 14:46:59 -0600
Message-ID: <CAErSpo6bjBEK=Ta=pVzwCBMfrqfe3jB-sEfv1Kz9mt9iakdeaA@mail.gmail.com>
In-Reply-To: <CAErSpo4EkXX-UoFLyni+ZpXaTmCRq5sJCqZ4FCt7fywpDyVWqw@mail.gmail.com>

On Wed, May 29, 2013 at 2:58 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> [-cc Suresh]
>
> On Wed, May 29, 2013 at 2:36 AM, Alexander Gordeev <agordeev@redhat.com> wrote:
>> On Tue, May 28, 2013 at 03:51:52PM -0600, Bjorn Helgaas wrote:
>>> On Mon, May 13, 2013 at 3:05 AM, Alexander Gordeev <agordeev@redhat.com> wrote:
>>>
>>> The subject would make more sense as "Allocate *only* as many MSIs as
>>> requested."
>>
>> 1.
>>
>>> > When multiple MSIs are enabled with pci_enable_msi_block(), the
>>> > requested number of interrupts 'nvec' is rounded up to the nearest
>>> > power-of-two value.
>>>
>>> This rounding is just a consequence of the encodings of the Multiple
>>> Message Enable field in the Message Control register (PCI spec r3.0,
>>> sec 6.8.1.3), isn't it?
>>
>> Yes, it is.
>>
>>> > The result is then used for setting up the
>>> > number of MSI messages in the PCI device and allocation of
>>> > interrupt resources in the operating system (i.e. vector numbers).
>>> > Thus, in cases when a device driver requests some number of MSIs
>>> > and this number is not a power-of-two value, the extra operating
>>> > system resources (allocated as the result of rounding) are wasted.
>>> >
>>> > This fix introduces 'msi_desc::nvec' field to address the above
>>> > issue. When non-zero, it will report the actual number of MSIs the
>>> > device will send, as requested by the device driver. This value
>>> > should be used by architectures to properly set up and tear down
>>> > associated interrupt resources.
>>>
>>> This name needs a little more context, like "nvec_used" or something.
>>
>> I chose "nvec" to indicate it is what was passed to pci_enable_msi_block().
>> I can resend with "nvec_used", along with subject change [1], if you want.
>>
>>> I think the idea is that the Message Control register can only tell
>>> the OS that the device requires 1, 2, 4, 8, 16, or 32 vectors, and
>>> similarly the OS can only tell the device that 1, 2, 4, 8, 16, or 32
>>> vectors are assigned.  If a device can only make use of 18 vectors, it
>>> must advertise the next larger value (32 vectors).  As far as I can
>>> tell, a device *could* advertise 32 vectors in Multiple Message
>>> Capable even if it can only use 1 vector.
>>
>> Yes, that is what we have with, e.g., the ICH AHCI device - it advertises
>> 16 vectors while making use of only 6. I tried to explain this in my
>> changelog's last paragraph (below).
>>
>>> These patches are to avoid allocating resources for the unused
>>> vectors, i.e., the ones between the last one the driver requested and
>>> the last one advertised in Multiple Message Capable.
>>
>> Almost :) Rather ...between the last one the driver requested and
>> the last one *written* in Multiple Message *Enable*, not Capable.
>> IOW, between the last one the driver requested and the closest power
>> of two - which will be written to the device.
>
> Ah, right.
>
>> As of now, neither pci_enable_msi_block() nor pci_enable_msi_block_auto()
>> is able to address the case you described, but if we decide to change
>> that then 'msi_desc::nvec' is what would be used. Again, the last paragraph
>> (maybe too subtly) implies that.
>>
>>> The driver might
>>> request fewer than the maximum either because it knows the device
>>> isn't capable of using them all, or because the driver author decided
>>> not to use them all.
>>
>> Exactly. (I assume here "or the driver author decided not to use them all"
>> means the author can tell the device how many interrupts to use by means
>> other than Multiple Message Enable - otherwise it would be a bug).
>
> Yep, makes sense.  Thanks for the clarifications.
>
>>> (Sorry, just thinking out loud above, let me know if I'm not
>>> understanding this correctly.)
>>>
>>> > Note, although the existing 'msi_desc::multiple' field might seem
>>> > redundant, in fact it is not. In the general case the number of MSIs a
>>> > PCI device is initialized with is not necessarily the closest
>>> > power-of-two value of the number of MSIs the device will send. Thus, in
>>> > theory it would not always be possible to derive the former from the
>>> > latter and we need to keep them both, to stress this corner case.
>>> > Besides, since 'msi_desc::multiple' is a bitfield, throwing it out
>>> > would not save us any space.
>>
>> --
>> Regards,
>> Alexander Gordeev
>> agordeev@redhat.com
>
> No need to resend as far as I'm concerned; I can tweak those bits
> locally.  I can put these in my tree
> if Joerg or Konrad ack the iommu/irq_remapping.c bit.

I pushed these with updates to
http://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/log/?h=pci/alexander-msi

Anybody want to ack the iommu/irq_remapping.c patch?  If so, I can
merge that branch into -next for v3.11.

Bjorn
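
The rounding discussed in this thread can be sketched in a few lines of
C. This is only an illustration under stated assumptions, not the code
from the patch: 'struct msi_vec_info' and the three helper functions are
hypothetical names, 'multiple' is assumed to hold the log2 value written
to Multiple Message Enable, and 'nvec_used' follows the field name
suggested in the review.

```c
#include <assert.h>

/* Hypothetical stand-in for the two msi_desc fields discussed above. */
struct msi_vec_info {
	unsigned int multiple;   /* assumed: log2 of the count written to Multiple Message Enable */
	unsigned int nvec_used;  /* number of vectors the driver actually requested */
};

/* Smallest n such that (1 << n) >= nvec, for 1 <= nvec <= 32. */
static unsigned int mme_encoding(unsigned int nvec)
{
	unsigned int log2 = 0;

	assert(nvec >= 1 && nvec <= 32);
	while ((1u << log2) < nvec)
		log2++;
	return log2;
}

/* Record both the rounded encoding and the exact request. */
static struct msi_vec_info msi_vec_info_for(unsigned int nvec)
{
	struct msi_vec_info info = { mme_encoding(nvec), nvec };

	return info;
}

/* Vectors that would sit idle if resources were sized by 'multiple' alone. */
static unsigned int msi_wasted_vectors(struct msi_vec_info info)
{
	return (1u << info.multiple) - info.nvec_used;
}
```

For the AHCI case mentioned above (6 vectors requested), this gives an
encoding of 3, i.e. 8 vectors enabled in the device, so sizing the
allocation by 'nvec_used' instead of 'multiple' avoids 2 unused vectors.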
