* Re: [PATCH v3 -tip x86/apic 1/2] PCI/MSI: Allocate as many multiple-MSIs as requested
@ 2013-05-28 21:51 Bjorn Helgaas
  2013-05-29  8:36 ` Alexander Gordeev
  0 siblings, 1 reply; 17+ messages in thread
From: Bjorn Helgaas @ 2013-05-28 21:51 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: linux-kernel, x86, linux-pci, Suresh Siddha, Yinghai Lu,
	Joerg Roedel, Jan Beulich, Ingo Molnar

On Mon, May 13, 2013 at 3:05 AM, Alexander Gordeev <agordeev@redhat.com> wrote:

The subject would make more sense as "Allocate *only* as many MSIs as
requested."

> When multiple MSIs are enabled with pci_enable_msi_block(), the
> requested number of interrupts 'nvec' is rounded up to the nearest
> power-of-two value.

This rounding is just a consequence of the encodings of the Multiple
Message Enable field in the Message Control register (PCI spec r3.0,
sec 6.8.1.3), isn't it?

> The result is then used for setting up the
> number of MSI messages in the PCI device and allocation of
> interrupt resources in the operating system (i.e. vector numbers).
> Thus, in cases when a device driver requests some number of MSIs
> and this number is not a power-of-two value, the extra operating
> system resources (allocated as the result of rounding) are wasted.
>
> This fix introduces 'msi_desc::nvec' field to address the above
> issue. When non-zero, it will report the actual number of MSIs the
> device will send, as requested by the device driver. This value
> should be used by architectures to properly set up and tear down
> associated interrupt resources.

This name needs a little more context, like "nvec_used" or something.

I think the idea is that the Message Control register can only tell
the OS that the device requires 1, 2, 4, 8, 16, or 32 vectors, and
similarly the OS can only tell the device that 1, 2, 4, 8, 16, or 32
vectors are assigned.  If a device can only make use of 18 vectors, it
must advertise the next larger value (32 vectors).  As far as I can
tell, a device *could* advertise 32 vectors in Multiple Message
Capable even if it can only use 1 vector.
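
To make the encoding concrete, here is a minimal user-space sketch of it
(not kernel code; mme_encoding() and mme_vectors() are made-up names for
illustration).  It assumes the field holds a log2 count, which matches
the 1, 2, 4, 8, 16, 32 values listed above:

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of the patch: return the Multiple
 * Message Enable encoding for a request of nvec vectors, i.e. the
 * smallest n such that (1 << n) >= nvec.  Valid for 1 <= nvec <= 32.
 */
static unsigned int mme_encoding(unsigned int nvec)
{
	unsigned int log2 = 0;

	assert(nvec >= 1 && nvec <= 32);
	while ((1u << log2) < nvec)
		log2++;
	return log2;
}

/* Number of vectors a given encoding stands for. */
static unsigned int mme_vectors(unsigned int encoding)
{
	return 1u << encoding;
}
```

With this, a request for 18 vectors encodes as 5, which stands for 32
vectors.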

These patches are to avoid allocating resources for the unused
vectors, i.e., the ones between the last one the driver requested and
the last one advertised in Multiple Message Capable.  The driver might
request fewer than the maximum either because it knows the device
isn't capable of using them all, or because the driver author decided
not to use them all.

(Sorry, just thinking out loud above, let me know if I'm not
understanding this correctly.)

> Note, although the existing 'msi_desc::multiple' field might seem
> redundant, in fact it is not. In general case the number of MSIs a
> PCI device is initialized with is not necessarily the closest power-
> of-two value of the number of MSIs the device will send. Thus, in
> theory it would not be always possible to derive the former from the
> latter and we need to keep them both, to stress this corner case.
> Besides, since 'msi_desc::multiple' is a bitfield, throwing it out
> would not save us any space.
>
> Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
> ---
>  drivers/pci/msi.c   |   10 ++++++++--
>  include/linux/msi.h |    1 +
>  2 files changed, 9 insertions(+), 2 deletions(-)

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

I'd be happy to push these through my tree (given an Ack from Joerg),
or they can go another way.  Let me know if you want me to take them.

> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> index 00cc78c7..014b9d5 100644
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -79,7 +79,10 @@ void default_teardown_msi_irqs(struct pci_dev *dev)
>                 int i, nvec;
>                 if (entry->irq == 0)
>                         continue;
> -               nvec = 1 << entry->msi_attrib.multiple;
> +               if (entry->nvec)
> +                       nvec = entry->nvec;
> +               else
> +                       nvec = 1 << entry->msi_attrib.multiple;
>                 for (i = 0; i < nvec; i++)
>                         arch_teardown_msi_irq(entry->irq + i);
>         }
> @@ -340,7 +343,10 @@ static void free_msi_irqs(struct pci_dev *dev)
>                 int i, nvec;
>                 if (!entry->irq)
>                         continue;
> -               nvec = 1 << entry->msi_attrib.multiple;
> +               if (entry->nvec)
> +                       nvec = entry->nvec;
> +               else
> +                       nvec = 1 << entry->msi_attrib.multiple;
>  #ifdef CONFIG_GENERIC_HARDIRQS
>                 for (i = 0; i < nvec; i++)
>                         BUG_ON(irq_has_action(entry->irq + i));
> diff --git a/include/linux/msi.h b/include/linux/msi.h
> index ce93a34..0e20dfc 100644
> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -35,6 +35,7 @@ struct msi_desc {
>
>         u32 masked;                     /* mask bits */
>         unsigned int irq;
> +       unsigned int nvec;              /* number of messages */
>         struct list_head list;
>
>         union {
> --
> 1.7.7.6
>
>
> --
> Regards,
> Alexander Gordeev
> agordeev@redhat.com
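
Both msi.c hunks above apply the same fallback, so it may help to see it
in isolation.  This is only a user-space sketch: struct msi_desc_sketch
and msi_desc_nvec() are hypothetical stand-ins, not the kernel's struct
msi_desc or any existing helper.

```c
#include <assert.h>

/* Cut-down stand-in for the two fields the hunks look at. */
struct msi_desc_sketch {
	unsigned int multiple;	/* log2 of the vectors written to the device */
	unsigned int nvec;	/* vectors the driver asked for, 0 if unset */
};

/*
 * Number of vectors to tear down: the driver-requested count when it
 * was recorded, otherwise the rounded-up power of two as before.
 */
static unsigned int msi_desc_nvec(const struct msi_desc_sketch *entry)
{
	assert(entry);
	if (entry->nvec)
		return entry->nvec;
	return 1u << entry->multiple;
}
```

An entry with multiple == 3 and nvec == 6 gives 6, while the same entry
with nvec == 0 falls back to 8.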


* Re: [PATCH v3 -tip x86/apic 1/2] PCI/MSI: Allocate as many multiple-MSIs as requested
  2013-05-28 21:51 [PATCH v3 -tip x86/apic 1/2] PCI/MSI: Allocate as many multiple-MSIs as requested Bjorn Helgaas
@ 2013-05-29  8:36 ` Alexander Gordeev
  2013-05-29 20:58   ` Bjorn Helgaas
  0 siblings, 1 reply; 17+ messages in thread
From: Alexander Gordeev @ 2013-05-29  8:36 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-kernel, x86, linux-pci, Suresh Siddha, Yinghai Lu,
	Joerg Roedel, Jan Beulich, Ingo Molnar

On Tue, May 28, 2013 at 03:51:52PM -0600, Bjorn Helgaas wrote:
> On Mon, May 13, 2013 at 3:05 AM, Alexander Gordeev <agordeev@redhat.com> wrote:
> 
> The subject would make more sense as "Allocate *only* as many MSIs as
> requested."

1.

> > When multiple MSIs are enabled with pci_enable_msi_block(), the
> > requested number of interrupts 'nvec' is rounded up to the nearest
> > power-of-two value.
> 
> This rounding is just a consequence of the encodings of the Multiple
> Message Enable field in the Message Control register (PCI spec r3.0,
> sec 6.8.1.3), isn't it?

Yes, it is.

> > The result is then used for setting up the
> > number of MSI messages in the PCI device and allocation of
> > interrupt resources in the operating system (i.e. vector numbers).
> > Thus, in cases when a device driver requests some number of MSIs
> > and this number is not a power-of-two value, the extra operating
> > system resources (allocated as the result of rounding) are wasted.
> >
> > This fix introduces 'msi_desc::nvec' field to address the above
> > issue. When non-zero, it will report the actual number of MSIs the
> > device will send, as requested by the device driver. This value
> > should be used by architectures to properly set up and tear down
> > associated interrupt resources.
> 
> This name needs a little more context, like "nvec_used" or something.

I chose "nvec" to indicate it is what was passed to pci_enable_msi_block().
I can resend with "nvec_used", along with subject change [1], if you want.

> I think the idea is that the Message Control register can only tell
> the OS that the device requires 1, 2, 4, 8, 16, or 32 vectors, and
> similarly the OS can only tell the device that 1, 2, 4, 8, 16, or 32
> vectors are assigned.  If a device can only make use of 18 vectors, it
> must advertise the next larger value (32 vectors).  As far as I can
> tell, a device *could* advertise 32 vectors in Multiple Message
> Capable even if it can only use 1 vector.

Yes, that is what we have with e.g. the ICH AHCI device - it advertises
16 vectors while making use of only 6. I tried to explain this in my
changelog's last paragraph (below).

> These patches are to avoid allocating resources for the unused
> vectors, i.e., the ones between the last one the driver requested and
> the last one advertised in Multiple Message Capable.

Almost :) Rather ...between the last one the driver requested and
the last one *written* in Multiple Message *Enable*, not Capable.
IOW, between the last one the driver requested and the closest power
of two - which will be written to the device.
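
As a rough illustration of that gap, here is a user-space sketch (not
kernel code; msi_round_up() and msi_unused_vectors() are names invented
for this example) that counts the vectors lying between the driver's
request and the power of two written to the device:

```c
#include <assert.h>

/* Smallest power of two >= nvec, for 1 <= nvec <= 32. */
static unsigned int msi_round_up(unsigned int nvec)
{
	unsigned int n = 1;

	assert(nvec >= 1 && nvec <= 32);
	while (n < nvec)
		n <<= 1;
	return n;
}

/* Vectors covered by the rounded-up value but never requested. */
static unsigned int msi_unused_vectors(unsigned int nvec)
{
	return msi_round_up(nvec) - nvec;
}
```

A request for 6 vectors leaves 2 unused this way; a request that is
already a power of two leaves none.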

As of now, neither pci_enable_msi_block(), nor pci_enable_msi_block_auto()
are able to address the case you described, but if we decide to change
that then 'msi_desc::nvec' is what would be used. Again, the last paragraph
(maybe too subtly) implies that.

> The driver might
> request fewer than the maximum either because it knows the device
> isn't capable of using them all, or because the driver author decided
> not to use them all.

Exactly. (I assume here "or the driver author decided not to use them all"
means the author can tell the device how many interrupts to use by means
other than Multiple Message Enable - otherwise it would be a bug).

> (Sorry, just thinking out loud above, let me know if I'm not
> understanding this correctly.)
> 
> > Note, although the existing 'msi_desc::multiple' field might seem
> > redundant, in fact it is not. In general case the number of MSIs a
> > PCI device is initialized with is not necessarily the closest power-
> > of-two value of the number of MSIs the device will send. Thus, in
> > theory it would not be always possible to derive the former from the
> > latter and we need to keep them both, to stress this corner case.
> > Besides, since 'msi_desc::multiple' is a bitfield, throwing it out
> > would not save us any space.

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com


* Re: [PATCH v3 -tip x86/apic 1/2] PCI/MSI: Allocate as many multiple-MSIs as requested
  2013-05-29  8:36 ` Alexander Gordeev
@ 2013-05-29 20:58   ` Bjorn Helgaas
  2013-06-03 20:46     ` Bjorn Helgaas
  0 siblings, 1 reply; 17+ messages in thread
From: Bjorn Helgaas @ 2013-05-29 20:58 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: linux-kernel, x86, linux-pci, Yinghai Lu, Joerg Roedel,
	Jan Beulich, Ingo Molnar

[-cc Suresh]

On Wed, May 29, 2013 at 2:36 AM, Alexander Gordeev <agordeev@redhat.com> wrote:
> On Tue, May 28, 2013 at 03:51:52PM -0600, Bjorn Helgaas wrote:
>> On Mon, May 13, 2013 at 3:05 AM, Alexander Gordeev <agordeev@redhat.com> wrote:
>>
>> The subject would make more sense as "Allocate *only* as many MSIs as
>> requested."
>
> 1.
>
>> > When multiple MSIs are enabled with pci_enable_msi_block(), the
>> > requested number of interrupts 'nvec' is rounded up to the nearest
>> > power-of-two value.
>>
>> This rounding is just a consequence of the encodings of the Multiple
>> Message Enable field in the Message Control register (PCI spec r3.0,
>> sec 6.8.1.3), isn't it?
>
> Yes, it is.
>
>> > The result is then used for setting up the
>> > number of MSI messages in the PCI device and allocation of
>> > interrupt resources in the operating system (i.e. vector numbers).
>> > Thus, in cases when a device driver requests some number of MSIs
>> > and this number is not a power-of-two value, the extra operating
>> > system resources (allocated as the result of rounding) are wasted.
>> >
>> > This fix introduces 'msi_desc::nvec' field to address the above
>> > issue. When non-zero, it will report the actual number of MSIs the
>> > device will send, as requested by the device driver. This value
>> > should be used by architectures to properly set up and tear down
>> > associated interrupt resources.
>>
>> This name needs a little more context, like "nvec_used" or something.
>
> I chose "nvec" to indicate it is what was passed to pci_enable_msi_block().
> I can resend with "nvec_used", along with subject change [1], if you want.
>
>> I think the idea is that the Message Control register can only tell
>> the OS that the device requires 1, 2, 4, 8, 16, or 32 vectors, and
>> similarly the OS can only tell the device that 1, 2, 4, 8, 16, or 32
>> vectors are assigned.  If a device can only make use of 18 vectors, it
>> must advertise the next larger value (32 vectors).  As far as I can
>> tell, a device *could* advertise 32 vectors in Multiple Message
>> Capable even if it can only use 1 vector.
>
> Yes, that is what we have with e.g. the ICH AHCI device - it advertises
> 16 vectors while making use of only 6. I tried to explain this in my
> changelog's last paragraph (below).
>
>> These patches are to avoid allocating resources for the unused
>> vectors, i.e., the ones between the last one the driver requested and
>> the last one advertised in Multiple Message Capable.
>
> Almost :) Rather ...between the last one the driver requested and
> the last one *written* in Multiple Message *Enable*, not Capable.
> IOW, between the last one the driver requested and the closest power
> of two - which will be written to the device.

Ah, right.

> As of now, neither pci_enable_msi_block(), nor pci_enable_msi_block_auto()
> are able to address the case you described, but if we decide to change
> that then 'msi_desc::nvec' is what would be used. Again, the last paragraph
> (maybe too subtly) implies that.
>
>> The driver might
>> request fewer than the maximum either because it knows the device
>> isn't capable of using them all, or because the driver author decided
>> not to use them all.
>
> Exactly. (I assume here "or the driver author decided not to use them all"
> means the author can tell the device how many interrupts to use by means
> other than Multiple Message Enable - otherwise it would be a bug).

Yep, makes sense.  Thanks for the clarifications.

>> (Sorry, just thinking out loud above, let me know if I'm not
>> understanding this correctly.)
>>
>> > Note, although the existing 'msi_desc::multiple' field might seem
>> > redundant, in fact it is not. In general case the number of MSIs a
>> > PCI device is initialized with is not necessarily the closest power-
>> > of-two value of the number of MSIs the device will send. Thus, in
>> > theory it would not be always possible to derive the former from the
>> > latter and we need to keep them both, to stress this corner case.
>> > Besides, since 'msi_desc::multiple' is a bitfield, throwing it out
>> > would not save us any space.
>
> --
> Regards,
> Alexander Gordeev
> agordeev@redhat.com

No need to resend as far as I'm concerned; I can tweak those bits
locally.  I can put these in my tree
if Joerg or Konrad ack the iommu/irq_remapping.c bit.

Bjorn


* Re: [PATCH v3 -tip x86/apic 1/2] PCI/MSI: Allocate as many multiple-MSIs as requested
  2013-05-29 20:58   ` Bjorn Helgaas
@ 2013-06-03 20:46     ` Bjorn Helgaas
  2013-06-04 13:14       ` Alexander Gordeev
  2013-06-20 12:51       ` Joerg Roedel
  0 siblings, 2 replies; 17+ messages in thread
From: Bjorn Helgaas @ 2013-06-03 20:46 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: linux-kernel, x86, linux-pci, Yinghai Lu, Joerg Roedel,
	Jan Beulich, Ingo Molnar

On Wed, May 29, 2013 at 2:58 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> [-cc Suresh]
>
> On Wed, May 29, 2013 at 2:36 AM, Alexander Gordeev <agordeev@redhat.com> wrote:
>> On Tue, May 28, 2013 at 03:51:52PM -0600, Bjorn Helgaas wrote:
>>> On Mon, May 13, 2013 at 3:05 AM, Alexander Gordeev <agordeev@redhat.com> wrote:
>>>
>>> The subject would make more sense as "Allocate *only* as many MSIs as
>>> requested."
>>
>> 1.
>>
>>> > When multiple MSIs are enabled with pci_enable_msi_block(), the
>>> > requested number of interrupts 'nvec' is rounded up to the nearest
>>> > power-of-two value.
>>>
>>> This rounding is just a consequence of the encodings of the Multiple
>>> Message Enable field in the Message Control register (PCI spec r3.0,
>>> sec 6.8.1.3), isn't it?
>>
>> Yes, it is.
>>
>>> > The result is then used for setting up the
>>> > number of MSI messages in the PCI device and allocation of
>>> > interrupt resources in the operating system (i.e. vector numbers).
>>> > Thus, in cases when a device driver requests some number of MSIs
>>> > and this number is not a power-of-two value, the extra operating
>>> > system resources (allocated as the result of rounding) are wasted.
>>> >
>>> > This fix introduces 'msi_desc::nvec' field to address the above
>>> > issue. When non-zero, it will report the actual number of MSIs the
>>> > device will send, as requested by the device driver. This value
>>> > should be used by architectures to properly set up and tear down
>>> > associated interrupt resources.
>>>
>>> This name needs a little more context, like "nvec_used" or something.
>>
>> I chose "nvec" to indicate it is what was passed to pci_enable_msi_block().
>> I can resend with "nvec_used", along with subject change [1], if you want.
>>
>>> I think the idea is that the Message Control register can only tell
>>> the OS that the device requires 1, 2, 4, 8, 16, or 32 vectors, and
>>> similarly the OS can only tell the device that 1, 2, 4, 8, 16, or 32
>>> vectors are assigned.  If a device can only make use of 18 vectors, it
>>> must advertise the next larger value (32 vectors).  As far as I can
>>> tell, a device *could* advertise 32 vectors in Multiple Message
>>> Capable even if it can only use 1 vector.
>>
>> Yes, that is what we have with e.g. the ICH AHCI device - it advertises
>> 16 vectors while making use of only 6. I tried to explain this in my
>> changelog's last paragraph (below).
>>
>>> These patches are to avoid allocating resources for the unused
>>> vectors, i.e., the ones between the last one the driver requested and
>>> the last one advertised in Multiple Message Capable.
>>
>> Almost :) Rather ...between the last one the driver requested and
>> the last one *written* in Multiple Message *Enable*, not Capable.
>> IOW, between the last one the driver requested and the closest power
>> of two - which will be written to the device.
>
> Ah, right.
>
>> As of now, neither pci_enable_msi_block(), nor pci_enable_msi_block_auto()
>> are able to address the case you described, but if we decide to change
>> that then 'msi_desc::nvec' is what would be used. Again, the last paragraph
>> (maybe too subtly) implies that.
>>
>>> The driver might
>>> request fewer than the maximum either because it knows the device
>>> isn't capable of using them all, or because the driver author decided
>>> not to use them all.
>>
>> Exactly. (I assume here "or the driver author decided not to use them all"
>> means the author can tell the device how many interrupts to use by means
>> other than Multiple Message Enable - otherwise it would be a bug).
>
> Yep, makes sense.  Thanks for the clarifications.
>
>>> (Sorry, just thinking out loud above, let me know if I'm not
>>> understanding this correctly.)
>>>
>>> > Note, although the existing 'msi_desc::multiple' field might seem
>>> > redundant, in fact it is not. In general case the number of MSIs a
>>> > PCI device is initialized with is not necessarily the closest power-
>>> > of-two value of the number of MSIs the device will send. Thus, in
>>> > theory it would not be always possible to derive the former from the
>>> > latter and we need to keep them both, to stress this corner case.
>>> > Besides, since 'msi_desc::multiple' is a bitfield, throwing it out
>>> > would not save us any space.
>>
>> --
>> Regards,
>> Alexander Gordeev
>> agordeev@redhat.com
>
> No need to resend as far as I'm concerned; I can tweak those bits
> locally.  I can put these in my tree
> if Joerg or Konrad ack the iommu/irq_remapping.c bit.

I pushed these with updates to
http://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/log/?h=pci/alexander-msi

Anybody want to ack the iommu/irq_remapping.c patch?  If so, I can
merge that branch into -next for v3.11.

Bjorn


* Re: [PATCH v3 -tip x86/apic 1/2] PCI/MSI: Allocate as many multiple-MSIs as requested
  2013-06-03 20:46     ` Bjorn Helgaas
@ 2013-06-04 13:14       ` Alexander Gordeev
  2013-06-05 17:18         ` Bjorn Helgaas
  2013-06-20 12:51       ` Joerg Roedel
  1 sibling, 1 reply; 17+ messages in thread
From: Alexander Gordeev @ 2013-06-04 13:14 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-kernel, x86, linux-pci, Yinghai Lu, Joerg Roedel,
	Jan Beulich, Ingo Molnar, Sebastian Andrzej Siewior,
	Konrad Rzeszutek Wilk

On Mon, Jun 03, 2013 at 02:46:59PM -0600, Bjorn Helgaas wrote:
> On Wed, May 29, 2013 at 2:58 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> I pushed these with updates to
> http://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/log/?h=pci/alexander-msi
> 
> Anybody want to ack the iommu/irq_remapping.c patch?  If so, I can
> merge that branch into -next for v3.11.

Konrad, Sebastian,

Any chance to take a look at patch 2/2?

Thanks!

> Bjorn

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com


* Re: [PATCH v3 -tip x86/apic 1/2] PCI/MSI: Allocate as many multiple-MSIs as requested
  2013-06-04 13:14       ` Alexander Gordeev
@ 2013-06-05 17:18         ` Bjorn Helgaas
  2013-06-05 18:33           ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 17+ messages in thread
From: Bjorn Helgaas @ 2013-06-05 17:18 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: linux-kernel, x86, linux-pci, Yinghai Lu, Joerg Roedel,
	Jan Beulich, Ingo Molnar, Sebastian Andrzej Siewior,
	Konrad Rzeszutek Wilk

On Tue, Jun 4, 2013 at 7:14 AM, Alexander Gordeev <agordeev@redhat.com> wrote:
> On Mon, Jun 03, 2013 at 02:46:59PM -0600, Bjorn Helgaas wrote:
>> On Wed, May 29, 2013 at 2:58 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>> I pushed these with updates to
>> http://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/log/?h=pci/alexander-msi
>>
>> Anybody want to ack the iommu/irq_remapping.c patch?  If so, I can
>> merge that branch into -next for v3.11.
>
> Konrad, Sebastian,
>
> Any chance to take a look at patch 2/2?

I went out on a limb and merged this into my -next branch for v3.11.
If anybody objects, let me know and I'll drop or rework it.

Bjorn


* Re: [PATCH v3 -tip x86/apic 1/2] PCI/MSI: Allocate as many multiple-MSIs as requested
  2013-06-05 17:18         ` Bjorn Helgaas
@ 2013-06-05 18:33           ` Konrad Rzeszutek Wilk
  2013-06-05 18:35             ` Bjorn Helgaas
  0 siblings, 1 reply; 17+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-06-05 18:33 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Alexander Gordeev, linux-kernel, x86, linux-pci, Yinghai Lu,
	Joerg Roedel, Jan Beulich, Ingo Molnar,
	Sebastian Andrzej Siewior

On Wed, Jun 05, 2013 at 11:18:54AM -0600, Bjorn Helgaas wrote:
> On Tue, Jun 4, 2013 at 7:14 AM, Alexander Gordeev <agordeev@redhat.com> wrote:
> > On Mon, Jun 03, 2013 at 02:46:59PM -0600, Bjorn Helgaas wrote:
> >> On Wed, May 29, 2013 at 2:58 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> >> I pushed these with updates to
> >> http://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/log/?h=pci/alexander-msi
> >>
> >> Anybody want to ack the iommu/irq_remapping.c patch?  If so, I can
> >> merge that branch into -next for v3.11.
> >
> > Konrad, Sebastian,
> >
> > Any chance to take a look at patch 2/2?

I presume you mean:
x86/MSI: Conserve interrupt resources when using multiple-MSI

? Looks good
This one:
PCI: Allocate only as many MSI vectors as requested by driver


looks OK as well.


> 
> I went out on a limb and merged this into my -next branch for v3.11.
> If anybody objects, let me know and I'll drop or rework it.
> 
> Bjorn


* Re: [PATCH v3 -tip x86/apic 1/2] PCI/MSI: Allocate as many multiple-MSIs as requested
  2013-06-05 18:33           ` Konrad Rzeszutek Wilk
@ 2013-06-05 18:35             ` Bjorn Helgaas
  0 siblings, 0 replies; 17+ messages in thread
From: Bjorn Helgaas @ 2013-06-05 18:35 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Alexander Gordeev, linux-kernel, x86, linux-pci, Yinghai Lu,
	Joerg Roedel, Jan Beulich, Ingo Molnar,
	Sebastian Andrzej Siewior

On Wed, Jun 5, 2013 at 12:33 PM, Konrad Rzeszutek Wilk
<konrad.wilk@oracle.com> wrote:
> On Wed, Jun 05, 2013 at 11:18:54AM -0600, Bjorn Helgaas wrote:
>> On Tue, Jun 4, 2013 at 7:14 AM, Alexander Gordeev <agordeev@redhat.com> wrote:
>> > On Mon, Jun 03, 2013 at 02:46:59PM -0600, Bjorn Helgaas wrote:
>> >> On Wed, May 29, 2013 at 2:58 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>> >> I pushed these with updates to
>> >> http://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/log/?h=pci/alexander-msi
>> >>
>> >> Anybody want to ack the iommu/irq_remapping.c patch?  If so, I can
>> >> merge that branch into -next for v3.11.
>> >
>> > Konrad, Sebastian,
>> >
>> > Any chance to take a look at patch 2/2?
>
> I presume you mean:
> x86/MSI: Conserve interrupt resources when using multiple-MSI
>
> ? Looks good
> This one:
> PCI: Allocate only as many MSI vectors as requested by driver
>
>
> looks OK as well.

Great, thanks, Konrad!


* Re: [PATCH v3 -tip x86/apic 1/2] PCI/MSI: Allocate as many multiple-MSIs as requested
  2013-06-03 20:46     ` Bjorn Helgaas
  2013-06-04 13:14       ` Alexander Gordeev
@ 2013-06-20 12:51       ` Joerg Roedel
  2013-06-25 17:34         ` Bjorn Helgaas
  1 sibling, 1 reply; 17+ messages in thread
From: Joerg Roedel @ 2013-06-20 12:51 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Alexander Gordeev, linux-kernel, x86, linux-pci, Yinghai Lu,
	Jan Beulich, Ingo Molnar

On Mon, Jun 03, 2013 at 02:46:59PM -0600, Bjorn Helgaas wrote:
> I pushed these with updates to
> http://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/log/?h=pci/alexander-msi
> 
> Anybody want to ack the iommu/irq_remapping.c patch?  If so, I can
> merge that branch into -next for v3.11.

Sorry for being unresponsive, things have been pretty busy here :-(
The changes look good, you have my

Acked-by: Joerg Roedel <joro@8bytes.org>




* Re: [PATCH v3 -tip x86/apic 1/2] PCI/MSI: Allocate as many multiple-MSIs as requested
  2013-06-20 12:51       ` Joerg Roedel
@ 2013-06-25 17:34         ` Bjorn Helgaas
  0 siblings, 0 replies; 17+ messages in thread
From: Bjorn Helgaas @ 2013-06-25 17:34 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Alexander Gordeev, linux-kernel, x86, linux-pci, Yinghai Lu,
	Jan Beulich, Ingo Molnar

On Thu, Jun 20, 2013 at 6:51 AM, Joerg Roedel <joro@8bytes.org> wrote:
> On Mon, Jun 03, 2013 at 02:46:59PM -0600, Bjorn Helgaas wrote:
>> I pushed these with updates to
>> http://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/log/?h=pci/alexander-msi
>>
>> Anybody want to ack the iommu/irq_remapping.c patch?  If so, I can
>> merge that branch into -next for v3.11.
>
> Sorry for being unresponsive, things have been pretty busy here :-(
> The changes look good, you have my
>
> Acked-by: Joerg Roedel <joro@8bytes.org>

No problem, thanks!  I added your ack and merged to my -next branch for v3.11.

Bjorn


* Re: [PATCH v3 -tip x86/apic 1/2] PCI/MSI: Allocate as many multiple-MSIs as requested
  2013-06-06  8:30     ` Alexander Gordeev
@ 2013-06-06 19:51       ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 17+ messages in thread
From: Sebastian Andrzej Siewior @ 2013-06-06 19:51 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: linux-kernel, x86, linux-pci, Suresh Siddha, Yinghai Lu,
	Joerg Roedel, Jan Beulich, Ingo Molnar, Bjorn Helgaas

On Thu, Jun 06, 2013 at 10:30:20AM +0200, Alexander Gordeev wrote:
> Sebastian,
Hi Alexander,

> I re-read my comment a few times and I admit it might be confusing. You are
> right - 'multiple' is set by rounding up only. The part '...not necessarily
> the closest power-of-two value...' implied an abstract PCI device rather than
> the described code, but the wording is less than perfect, indeed. 

Good, so it is not just me :)

> In fact, at the moment of writing I kept in mind a follow-up patch that could
> help with aforementioned devices. That would be a new interface:
> 
> 	int pci_enable_msi_block_partial(struct pci_dev *dev,
> 					 unsigned int nvec_use,
> 					 unsigned int nvec_init);
> 
> In this case 'nvec_use' would go to 'msi_desc::nvec_used' and 'nvec_init'
> would translate to 'msi_desc::multiple' in case 'nvec_init' is not zero.
> In case 'nvec_init' is zero, 'msi_desc::multiple' would be initialized
> with the maximum possible value for the device (the way it is done now for
> pci_enable_msi_block_auto() interface). So, for the AHCI device (Bjorn
> mentioned) such a call would conserve on 10 of 16 vectors:
> 
> 	pci_enable_msi_block_partial(pdev, 6, 0);

Ah okay, that makes sense.

> 
> What I am not sure is whether we need to read out the maximum possible
> number of vectors like pci_enable_msi_block_auto() does:
> 
> 	int pci_enable_msi_block_partial(struct pci_dev *dev,
> 					 unsigned int nvec_use,
> 					 unsigned int nvec_init,
> 					 unsigned int *maxvec);
> 
> I can not think of any use of 'maxvec' with this interface, but the second
> variant completes the whole picture about a device...
The user of pci_enable_msi_block_auto() does not know how many it will get
so that argument seems essential. Your new function, on the other hand, says exactly
how many it requires. Anything less should be an error.

Sebastian


* Re: [PATCH v3 -tip x86/apic 1/2] PCI/MSI: Allocate as many multiple-MSIs as requested
  2013-06-05 20:56   ` Sebastian Andrzej Siewior
  2013-06-05 21:09     ` Bjorn Helgaas
@ 2013-06-06  8:30     ` Alexander Gordeev
  2013-06-06 19:51       ` Sebastian Andrzej Siewior
  1 sibling, 1 reply; 17+ messages in thread
From: Alexander Gordeev @ 2013-06-06  8:30 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, x86, linux-pci, Suresh Siddha, Yinghai Lu,
	Joerg Roedel, Jan Beulich, Ingo Molnar, Bjorn Helgaas

On Wed, Jun 05, 2013 at 10:56:38PM +0200, Sebastian Andrzej Siewior wrote:
> On Mon, May 13, 2013 at 11:05:48AM +0200, Alexander Gordeev wrote:
> > Note, although the existing 'msi_desc::multiple' field might seem
> > redundant, in fact it is not. In general case the number of MSIs a
> > PCI device is initialized with is not necessarily the closest power-
> > of-two value of the number of MSIs the device will send. Thus, in
> > theory it would not be always possible to derive the former from the
> > latter and we need to keep them both, to stress this corner case.
> > Besides, since 'msi_desc::multiple' is a bitfield, throwing it out
> > would not save us any space.
> 
> The last paragraph makes me curious. The only place where 'multiple' is set is
> in do_setup_msi_irqs() and this uses the next power of two for it. And since a
> device is not enabled twice, it is not overridden.
> So it should be possible to compute 'multiple' out of 'nvec' but it saves
> cycles not to do so. I agree to keep 'multiple' but your argument does not
> seem to make sense.
> While nitpicking, 'nvec' might deserve a better comment than 'number of
> messages' since it holds the number of allocated interrupts. :)

Sebastian,

I re-read my comment a few times and I admit it might be confusing. You are
right - 'multiple' is set by rounding up only. The part '...not necessarily
the closest power-of-two value...' implied an abstract PCI device rather than
the described code, but the wording is less than perfect, indeed. 

In fact, at the moment of writing I kept in mind a follow-up patch that could
help with aforementioned devices. That would be a new interface:

	int pci_enable_msi_block_partial(struct pci_dev *dev,
					 unsigned int nvec_use,
					 unsigned int nvec_init);

In this case 'nvec_use' would go to 'msi_desc::nvec_used' and 'nvec_init'
would translate to 'msi_desc::multiple' in case 'nvec_init' is not zero.
In case 'nvec_init' is zero, 'msi_desc::multiple' would be initialized
with the maximum possible value for the device (the way it is done now for
the pci_enable_msi_block_auto() interface). So, for the AHCI device Bjorn
mentioned, such a call would save 10 of 16 vectors:

	pci_enable_msi_block_partial(pdev, 6, 0);
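
A rough user-space sketch of how the two arguments might map onto the
descriptor fields, under the semantics described above. Nothing here is
kernel code: 'struct msi_plan', plan_log2_roundup(), plan_partial() and the
'dev_max' parameter (the maximum the device advertises) are hypothetical
names used only for illustration, and 'multiple' is assumed to hold the log2
encoding, as the existing '1 << entry->msi_attrib.multiple' suggests.

```c
#include <assert.h>

/* Hypothetical stand-in for the two msi_desc values discussed above. */
struct msi_plan {
	unsigned int nvec_used;	/* vectors the OS would actually allocate */
	unsigned int multiple;	/* log2 of the count programmed into the device */
};

/* Smallest k such that (1 << k) >= n. */
static unsigned int plan_log2_roundup(unsigned int n)
{
	unsigned int k = 0;

	while ((1u << k) < n)
		k++;
	return k;
}

/*
 * Assumed behaviour of the proposed interface: 'nvec_use' is taken as is,
 * 'nvec_init' selects 'multiple' when non-zero, and the device maximum
 * ('dev_max', an assumption of this sketch) is used when it is zero.
 */
static struct msi_plan plan_partial(unsigned int nvec_use,
				    unsigned int nvec_init,
				    unsigned int dev_max)
{
	struct msi_plan p;

	assert(nvec_use >= 1 && nvec_use <= dev_max);
	p.nvec_used = nvec_use;
	p.multiple = plan_log2_roundup(nvec_init ? nvec_init : dev_max);
	return p;
}
```

With plan_partial(6, 0, 16) the device would be initialized for 16 messages
(multiple == 4) while only 6 vectors are allocated, which is where the saving
of 10 comes from.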

What I am not sure about is whether we need to read out the maximum possible
number of vectors like pci_enable_msi_block_auto() does:

	int pci_enable_msi_block_partial(struct pci_dev *dev,
					 unsigned int nvec_use,
					 unsigned int nvec_init,
					 unsigned int *maxvec);

I cannot think of any use for 'maxvec' with this interface, but the second
variant completes the whole picture about a device...

> Sebastian

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com


* Re: [PATCH v3 -tip x86/apic 1/2] PCI/MSI: Allocate as many multiple-MSIs as requested
  2013-06-05 21:09     ` Bjorn Helgaas
@ 2013-06-05 21:28       ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 17+ messages in thread
From: Sebastian Andrzej Siewior @ 2013-06-05 21:28 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Alexander Gordeev, linux-kernel, x86, linux-pci, Yinghai Lu,
	Joerg Roedel, Jan Beulich, Ingo Molnar

-Suresh

On Wed, Jun 05, 2013 at 03:09:34PM -0600, Bjorn Helgaas wrote:
> 
> Alexander had an example device that advertised 16 vectors, but the
> driver knew that it could only generate 6.  That's a case where we
> can't compute 'multiple' from 'nvec' (assuming the driver supplies
> 'nvec == 6').  If we just rounded up to compute 'multiple', I think
> we'd compute 8 instead of 16.

Sure, but as I said: the only place where 'multiple' is computed / written
does the round-up thingy.

> > While nitpicking, 'nvec' might deserve a better comment than 'number of
> > messages' since it holds the number of allocated interrupts. :)
> 
> I did change the name 'nvec' to 'nvec_used', which should help a bit.
> But I agree that it's still somewhat confusing.
> 
> BTW, the patches actually in my tree are at
> http://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/log/?h=pci/alexander-msi
> (I tweaked this name and some comments slightly).

'nvec_used' is better, but the comment next to it is still wrong, I think.

> Bjorn

Sebastian


* Re: [PATCH v3 -tip x86/apic 1/2] PCI/MSI: Allocate as many multiple-MSIs as requested
  2013-06-05 20:56   ` Sebastian Andrzej Siewior
@ 2013-06-05 21:09     ` Bjorn Helgaas
  2013-06-05 21:28       ` Sebastian Andrzej Siewior
  2013-06-06  8:30     ` Alexander Gordeev
  1 sibling, 1 reply; 17+ messages in thread
From: Bjorn Helgaas @ 2013-06-05 21:09 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Alexander Gordeev, linux-kernel, x86, linux-pci, Suresh Siddha,
	Yinghai Lu, Joerg Roedel, Jan Beulich, Ingo Molnar

On Wed, Jun 5, 2013 at 2:56 PM, Sebastian Andrzej Siewior
<sebastian@breakpoint.cc> wrote:
> On Mon, May 13, 2013 at 11:05:48AM +0200, Alexander Gordeev wrote:
>> Note, although the existing 'msi_desc::multiple' field might seem
>> redundant, in fact in does not. In general case the number of MSIs a
>> PCI device is initialized with is not necessarily the closest power-
>> of-two value of the number of MSIs the device will send. Thus, in
>> theory it would not be always possible to derive the former from the
>> latter and we need to keep them both, to stress this corner case.
>> Besides, since 'msi_desc::multiple' is a bitfield, throwing it out
>> would not save us any space.
>
> The last paragraph makes me curious. The only place where 'multiple' is set is
> in do_setup_msi_irqs() and this uses the next power of two for it. And since a
> device is not enabled twice, it is not overridden.
> So it should be possible to compute 'multiple' out of 'nvec' but it saves
> cycles not do to so. I agree to keep 'multiple' but your argument does not
> seem to make sense.

Alexander had an example device that advertised 16 vectors, but the
driver knew that it could only generate 6.  That's a case where we
can't compute 'multiple' from 'nvec' (assuming the driver supplies
'nvec == 6').  If we just rounded up to compute 'multiple', I think
we'd compute 8 instead of 16.
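
To make the arithmetic concrete, here is a minimal user-space sketch of the
plain round-up. The helper name msi_log2_roundup() is hypothetical and is not
a kernel function; it only assumes that 'multiple' stores the log2 of the
vector count, as '1 << entry->msi_attrib.multiple' in the patch implies.

```c
#include <assert.h>

/*
 * Hypothetical helper: returns the smallest k with (1 << k) >= nvec,
 * i.e. the log2 value a plain round-up would store in 'multiple'.
 */
static unsigned int msi_log2_roundup(unsigned int nvec)
{
	unsigned int k = 0;

	/* Multiple Message Enable only encodes 1, 2, 4, 8, 16 or 32 vectors. */
	assert(nvec >= 1 && nvec <= 32);
	while ((1u << k) < nvec)
		k++;
	return k;
}
```

For 'nvec == 6' this gives 3 (8 vectors), whereas a device initialized with
its advertised 16 vectors would need 4, so the second value cannot be
recovered from the first.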

> While nitpicking, 'nvec' might deserve a better comment than 'number of
> messages' since it holds the number of allocated interrupts. :)

I did change the name 'nvec' to 'nvec_used', which should help a bit.
But I agree that it's still somewhat confusing.

BTW, the patches actually in my tree are at
http://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/log/?h=pci/alexander-msi
(I tweaked this name and some comments slightly).

Bjorn


* Re: [PATCH v3 -tip x86/apic 1/2] PCI/MSI: Allocate as many multiple-MSIs as requested
  2013-05-13  9:05 ` [PATCH v3 -tip x86/apic 1/2] " Alexander Gordeev
  2013-05-28  9:50   ` Ingo Molnar
@ 2013-06-05 20:56   ` Sebastian Andrzej Siewior
  2013-06-05 21:09     ` Bjorn Helgaas
  2013-06-06  8:30     ` Alexander Gordeev
  1 sibling, 2 replies; 17+ messages in thread
From: Sebastian Andrzej Siewior @ 2013-06-05 20:56 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: linux-kernel, x86, linux-pci, Suresh Siddha, Yinghai Lu,
	Joerg Roedel, Jan Beulich, Ingo Molnar, Bjorn Helgaas

On Mon, May 13, 2013 at 11:05:48AM +0200, Alexander Gordeev wrote:
> Note, although the existing 'msi_desc::multiple' field might seem
> redundant, in fact in does not. In general case the number of MSIs a
> PCI device is initialized with is not necessarily the closest power-
> of-two value of the number of MSIs the device will send. Thus, in
> theory it would not be always possible to derive the former from the
> latter and we need to keep them both, to stress this corner case.
> Besides, since 'msi_desc::multiple' is a bitfield, throwing it out
> would not save us any space.

The last paragraph makes me curious. The only place where 'multiple' is set is
in do_setup_msi_irqs() and this uses the next power of two for it. And since a
device is not enabled twice, it is not overridden.
So it should be possible to compute 'multiple' out of 'nvec' but it saves
cycles not to do so. I agree to keep 'multiple' but your argument does not
seem to make sense.
While nitpicking, 'nvec' might deserve a better comment than 'number of
messages' since it holds the number of allocated interrupts. :)

Sebastian


* Re: [PATCH v3 -tip x86/apic 1/2] PCI/MSI: Allocate as many multiple-MSIs as requested
  2013-05-13  9:05 ` [PATCH v3 -tip x86/apic 1/2] " Alexander Gordeev
@ 2013-05-28  9:50   ` Ingo Molnar
  2013-06-05 20:56   ` Sebastian Andrzej Siewior
  1 sibling, 0 replies; 17+ messages in thread
From: Ingo Molnar @ 2013-05-28  9:50 UTC (permalink / raw)
  To: Alexander Gordeev, Bjorn Helgaas
  Cc: linux-kernel, x86, linux-pci, Suresh Siddha, Yinghai Lu,
	Joerg Roedel, Jan Beulich, Ingo Molnar, Bjorn Helgaas


* Alexander Gordeev <agordeev@redhat.com> wrote:

> When multiple MSIs are enabled with pci_enable_msi_block(), the
> requested number of interrupts 'nvec' is rounded up to the nearest
> power-of-two value. The result is then used for setting up the
> number of MSI messages in the PCI device and allocation of
> interrupt resources in the operating system (i.e. vector numbers).
> Thus, in cases when a device driver requests some number of MSIs
> and this number is not a power-of-two value, the extra operating
> system resources (allocated as the result of rounding) are wasted.
> 
> This fix introduces 'msi_desc::nvec' field to address the above
> issue. When non-zero, it will report the actual number of MSIs the
> device will send, as requested by the device driver. This value
> should be used by architectures to properly set up and tear down
> associated interrupt resources.
> 
> Note, although the existing 'msi_desc::multiple' field might seem
> redundant, in fact in does not. In general case the number of MSIs a
> PCI device is initialized with is not necessarily the closest power-
> of-two value of the number of MSIs the device will send. Thus, in
> theory it would not be always possible to derive the former from the
> latter and we need to keep them both, to stress this corner case.
> Besides, since 'msi_desc::multiple' is a bitfield, throwing it out
> would not save us any space.
> 
> Signed-off-by: Alexander Gordeev <agordeev@redhat.com>

Would be nice to have an Acked-by from Bjorn for this patch.

Thanks,

	Ingo


* [PATCH v3 -tip x86/apic 1/2] PCI/MSI: Allocate as many multiple-MSIs as requested
  2013-05-13  9:05 [PATCH v3 -tip x86/apic 0/2] " Alexander Gordeev
@ 2013-05-13  9:05 ` Alexander Gordeev
  2013-05-28  9:50   ` Ingo Molnar
  2013-06-05 20:56   ` Sebastian Andrzej Siewior
  0 siblings, 2 replies; 17+ messages in thread
From: Alexander Gordeev @ 2013-05-13  9:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, linux-pci, Suresh Siddha, Yinghai Lu, Joerg Roedel,
	Jan Beulich, Ingo Molnar, Bjorn Helgaas

When multiple MSIs are enabled with pci_enable_msi_block(), the
requested number of interrupts 'nvec' is rounded up to the nearest
power-of-two value. The result is then used for setting up the
number of MSI messages in the PCI device and allocation of
interrupt resources in the operating system (i.e. vector numbers).
Thus, in cases when a device driver requests some number of MSIs
and this number is not a power-of-two value, the extra operating
system resources (allocated as the result of rounding) are wasted.

This fix introduces 'msi_desc::nvec' field to address the above
issue. When non-zero, it will report the actual number of MSIs the
device will send, as requested by the device driver. This value
should be used by architectures to properly set up and tear down
associated interrupt resources.

Note, although the existing 'msi_desc::multiple' field might seem
redundant, in fact it is not. In the general case the number of MSIs a
PCI device is initialized with is not necessarily the closest power-
of-two value of the number of MSIs the device will send. Thus, in
theory it would not always be possible to derive the former from the
latter and we need to keep them both to cover this corner case.
Besides, since 'msi_desc::multiple' is a bitfield, throwing it out
would not save us any space.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
---
 drivers/pci/msi.c   |   10 ++++++++--
 include/linux/msi.h |    1 +
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 00cc78c7..014b9d5 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -79,7 +79,10 @@ void default_teardown_msi_irqs(struct pci_dev *dev)
 		int i, nvec;
 		if (entry->irq == 0)
 			continue;
-		nvec = 1 << entry->msi_attrib.multiple;
+		if (entry->nvec)
+			nvec = entry->nvec;
+		else
+			nvec = 1 << entry->msi_attrib.multiple;
 		for (i = 0; i < nvec; i++)
 			arch_teardown_msi_irq(entry->irq + i);
 	}
@@ -340,7 +343,10 @@ static void free_msi_irqs(struct pci_dev *dev)
 		int i, nvec;
 		if (!entry->irq)
 			continue;
-		nvec = 1 << entry->msi_attrib.multiple;
+		if (entry->nvec)
+			nvec = entry->nvec;
+		else
+			nvec = 1 << entry->msi_attrib.multiple;
 #ifdef CONFIG_GENERIC_HARDIRQS
 		for (i = 0; i < nvec; i++)
 			BUG_ON(irq_has_action(entry->irq + i));
diff --git a/include/linux/msi.h b/include/linux/msi.h
index ce93a34..0e20dfc 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -35,6 +35,7 @@ struct msi_desc {
 
 	u32 masked;			/* mask bits */
 	unsigned int irq;
+	unsigned int nvec;		/* number of messages */
 	struct list_head list;
 
 	union {
-- 
1.7.7.6


-- 
Regards,
Alexander Gordeev
agordeev@redhat.com


end of thread, other threads:[~2013-06-25 17:35 UTC | newest]

Thread overview: 17+ messages
2013-05-28 21:51 [PATCH v3 -tip x86/apic 1/2] PCI/MSI: Allocate as many multiple-MSIs as requested Bjorn Helgaas
2013-05-29  8:36 ` Alexander Gordeev
2013-05-29 20:58   ` Bjorn Helgaas
2013-06-03 20:46     ` Bjorn Helgaas
2013-06-04 13:14       ` Alexander Gordeev
2013-06-05 17:18         ` Bjorn Helgaas
2013-06-05 18:33           ` Konrad Rzeszutek Wilk
2013-06-05 18:35             ` Bjorn Helgaas
2013-06-20 12:51       ` Joerg Roedel
2013-06-25 17:34         ` Bjorn Helgaas
  -- strict thread matches above, loose matches on Subject: below --
2013-05-13  9:05 [PATCH v3 -tip x86/apic 0/2] " Alexander Gordeev
2013-05-13  9:05 ` [PATCH v3 -tip x86/apic 1/2] " Alexander Gordeev
2013-05-28  9:50   ` Ingo Molnar
2013-06-05 20:56   ` Sebastian Andrzej Siewior
2013-06-05 21:09     ` Bjorn Helgaas
2013-06-05 21:28       ` Sebastian Andrzej Siewior
2013-06-06  8:30     ` Alexander Gordeev
2013-06-06 19:51       ` Sebastian Andrzej Siewior
