linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@nvidia.com>
To: Alexander Duyck <alexander.duyck@gmail.com>
Cc: Leon Romanovsky <leon@kernel.org>,
	Bjorn Helgaas <bhelgaas@google.com>,
	Saeed Mahameed <saeedm@nvidia.com>,
	Jakub Kicinski <kuba@kernel.org>,
	linux-pci <linux-pci@vger.kernel.org>,
	<linux-rdma@vger.kernel.org>, Netdev <netdev@vger.kernel.org>,
	Don Dutile <ddutile@redhat.com>,
	Alex Williamson <alex.williamson@redhat.com>
Subject: Re: [PATCH mlx5-next v1 2/5] PCI: Add SR-IOV sysfs entry to read number of MSI-X vectors
Date: Thu, 14 Jan 2021 14:29:45 -0400	[thread overview]
Message-ID: <20210114182945.GO4147@nvidia.com> (raw)
In-Reply-To: <CAKgT0UcKqt=EgE+eitB8-u8LvxqHBDfF+u2ZSi5urP_Aj0Btvg@mail.gmail.com>

On Thu, Jan 14, 2021 at 09:55:24AM -0800, Alexander Duyck wrote:
> On Thu, Jan 14, 2021 at 8:49 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > On Thu, Jan 14, 2021 at 08:40:14AM -0800, Alexander Duyck wrote:
> >
> > > Where I think you and I disagree is that I really think the MSI-X
> > > table size should be fixed at one value for the life of the VF.
> > > Instead of changing the table size it should be the number of vectors
> > > that are functional that should be the limit. Basically there should
> > > be only some that are functional and some that are essentially just
> > > dummy vectors that you can write values into that will never be used.
> >
> > Ignoring the PCI config space to learn the # of available MSI-X
> > vectors is big break on the how the device's programming ABI works.
> >
> > Or stated another way, that isn't compatible with any existing drivers
> > so it is basically not a useful approach as it can't be deployed.
> >
> > I don't know why you think that is better.
> >
> > Jason
> 
> First off, this is technically violating the PCIe spec section 7.7.2.2
> because this is the device driver modifying the Message Control
> register for a device, even if it is the PF firmware modifying the VF.
> The table size is something that should be set and fixed at device
> creation and not changed.

The word "violating" is rather an over-reaction, at worst this is an
extension.

> The MSI-X table is essentially just an MMIO resource, and I believe it
> should not be resized, just as you wouldn't expect any MMIO BAR to be
> dynamically resized. 

Resizing the BAR is already defined see commit 276b738deb5b ("PCI:
Add resizable BAR infrastructure")

As you say BAR and MSI vector table are not so different, it seems
perfectly in line with current PCI sig thinking to allow resizing the
MSI as well

> Many drivers don't make use of the full MSI-X table nor do they
> bother reading the size. We just populate a subset of the table
> based on the number of interrupt causes we will need to associate to
> interrupt handlers. 

This isn't about "many drivers" this is about what mlx5 does in all
the various OS drivers it has, and mlx5 has a sophisticated use of
MSI-X.

> What I see this patch doing is trying to push driver PF policy onto
> the VF PCIe device configuration space dynamically.

Huh? This is using the PF to dynamically reconfigure a child VF beyond
what the PCI spec defined. This is done safely under Linux because no
driver is bound when it is reconfigured, and any stale config data is
flushed out of any OS caches.

This is also why there is not a strong desire to standardize an ECN at
PCI-sig, the rules for how resizing can work are complicated and OS
specific.

> Having some limited number of interrupt causes should really be what
> is limiting things here. 

MSI inherently requires dedicated on-die resources to implement, so
every device has a maximum # of MSI vectors it can currently
expose. This is some consequence of various PCI rules and applies to
all devices.

To make effective use of this limited pool requires a hard restriction
enforced by the secure domain (hypervisor and FW) onto every
user. Every driver attached to the function needs to be aware of the
hard enforced limit by the secure domain to operate properly. It has
nothing to do with "limited number of interrupt causes".

The standards based way to communicate that limit is the MSI table cap
size.

To complain that changing the MSI table cap size dynamically is
non-standard then offer up a completely non-standard way to operate
MSI instead seems to miss the entire point.

The important standard is to keep the PCI config space acting per-spec
so all the various consumers can work as-is. The extension is to only
modify the rare hypervisor to support a dynamic MSI resizing extension
for VFs.

As far as applicability, any device working at high scale with MSI and
VMs is going to need this. Dynamically assigning the limited MSI HW is
really required to support the universe of VM configurations people
want. eg generally I would expect a VM to receive the number of MSI
vectors equal to the number of CPUs the VM gets.

> I see that being mostly a thing between the firmware and the VF in
> terms of configuration and not something that necessarily has to be
> pushed down onto the PCIe configuration space itself.

If mlx5 drivers had been designed long ago to never use standard based
MSI and instead did something internal with FW you might have a point,
but they weren't. All the mlx5 drivers use standards based MSI and
expect the config space to be correct.

Jason

  reply	other threads:[~2021-01-14 18:30 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-10 15:07 [PATCH mlx5-next v1 0/5] Dynamically assign MSI-X vectors count Leon Romanovsky
2021-01-10 15:07 ` [PATCH mlx5-next v1 1/5] PCI: Add sysfs callback to allow MSI-X table size change of SR-IOV VFs Leon Romanovsky
2021-01-11 19:30   ` Alexander Duyck
2021-01-12  3:25     ` Don Dutile
2021-01-12  6:15       ` Leon Romanovsky
2021-01-12  6:39     ` Leon Romanovsky
2021-01-12 21:59       ` Alexander Duyck
2021-01-13  6:09         ` Leon Romanovsky
2021-01-13 20:00           ` Alexander Duyck
2021-01-14  7:16             ` Leon Romanovsky
2021-01-14 16:51               ` Alexander Duyck
2021-01-14 17:12                 ` Leon Romanovsky
2021-01-13 17:50   ` Alex Williamson
2021-01-13 19:49     ` Leon Romanovsky
2021-01-10 15:07 ` [PATCH mlx5-next v1 2/5] PCI: Add SR-IOV sysfs entry to read number of MSI-X vectors Leon Romanovsky
2021-01-11 19:30   ` Alexander Duyck
2021-01-12  6:56     ` Leon Romanovsky
2021-01-12 21:34       ` Alexander Duyck
2021-01-13  6:19         ` Leon Romanovsky
2021-01-13 22:44           ` Alexander Duyck
2021-01-14  6:50             ` Leon Romanovsky
2021-01-14 16:40               ` Alexander Duyck
2021-01-14 16:48                 ` Jason Gunthorpe
2021-01-14 17:55                   ` Alexander Duyck
2021-01-14 18:29                     ` Jason Gunthorpe [this message]
2021-01-14 19:24                       ` Alexander Duyck
2021-01-14 19:56                         ` Leon Romanovsky
2021-01-14 20:08                         ` Jason Gunthorpe
2021-01-14 21:43                           ` Alexander Duyck
2021-01-14 23:28                             ` Alex Williamson
2021-01-15  1:56                               ` Alexander Duyck
2021-01-15 14:06                                 ` Jason Gunthorpe
2021-01-15 15:53                                   ` Leon Romanovsky
2021-01-16  1:48                                     ` Alexander Duyck
2021-01-16  8:20                                       ` Leon Romanovsky
2021-01-18  3:16                                         ` Alexander Duyck
2021-01-18  7:20                                           ` Leon Romanovsky
2021-01-18 13:28                                             ` Leon Romanovsky
2021-01-18 18:21                                               ` Alexander Duyck
2021-01-19  5:43                                                 ` Leon Romanovsky
2021-01-16  4:32                                   ` Alexander Duyck
2021-01-18 15:47                                     ` Jason Gunthorpe
2021-01-15 13:51                             ` Jason Gunthorpe
2021-01-14 17:26                 ` Leon Romanovsky
2021-01-18 17:03   ` Greg KH
2021-01-19  5:38     ` Leon Romanovsky
2021-01-10 15:07 ` [PATCH mlx5-next v1 3/5] net/mlx5: Add dynamic MSI-X capabilities bits Leon Romanovsky
2021-01-10 15:07 ` [PATCH mlx5-next v1 4/5] net/mlx5: Dynamically assign MSI-X vectors count Leon Romanovsky
2021-01-10 15:07 ` [PATCH mlx5-next v1 5/5] net/mlx5: Allow to the users to configure number of MSI-X vectors Leon Romanovsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210114182945.GO4147@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=alex.williamson@redhat.com \
    --cc=alexander.duyck@gmail.com \
    --cc=bhelgaas@google.com \
    --cc=ddutile@redhat.com \
    --cc=kuba@kernel.org \
    --cc=leon@kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=saeedm@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).