All of lore.kernel.org
 help / color / mirror / Atom feed
From: Leon Romanovsky <leon@kernel.org>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Alexander Duyck <alexander.duyck@gmail.com>,
	Bjorn Helgaas <bhelgaas@google.com>,
	Saeed Mahameed <saeedm@nvidia.com>,
	Jason Gunthorpe <jgg@nvidia.com>,
	Jakub Kicinski <kuba@kernel.org>,
	linux-pci <linux-pci@vger.kernel.org>,
	linux-rdma@vger.kernel.org, Netdev <netdev@vger.kernel.org>,
	Don Dutile <ddutile@redhat.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	"David S . Miller" <davem@davemloft.net>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Subject: Re: [PATCH mlx5-next v7 0/4] Dynamically assign MSI-X vectors count
Date: Thu, 11 Mar 2021 21:17:40 +0200	[thread overview]
Message-ID: <YEps1B+DEztFfy8R@unreal> (raw)
In-Reply-To: <20210311181729.GA2148230@bjorn-Precision-5520>

On Thu, Mar 11, 2021 at 12:17:29PM -0600, Bjorn Helgaas wrote:
> On Wed, Mar 10, 2021 at 03:34:01PM -0800, Alexander Duyck wrote:
> > On Wed, Mar 10, 2021 at 11:09 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > On Sun, Mar 07, 2021 at 10:55:24AM -0800, Alexander Duyck wrote:
> > > > On Sun, Feb 28, 2021 at 11:55 PM Leon Romanovsky <leon@kernel.org> wrote:
> > > > > From: Leon Romanovsky <leonro@nvidia.com>
> > > > >
> > > > > @Alexander Duyck, please update me if I can add your ROB tag again
> > > > > to the series, because you liked v6 more.
> > > > >
> > > > > Thanks
> > > > >
> > > > > ---------------------------------------------------------------------------------
> > > > > Changelog
> > > > > v7:
> > > > >  * Rebase on top v5.12-rc1
> > > > >  * More english fixes
> > > > >  * Returned to static sysfs creation model as was implemented in v0/v1.
> > > >
> > > > Yeah, so I am not a fan of the series. The problem is there is only
> > > > one driver that supports this, all VFs are going to expose this sysfs,
> > > > and I don't know how likely it is that any others are going to
> > > > implement this functionality. I feel like you threw out all the
> > > > progress from v2-v6.
> > >
> > > pci_enable_vfs_overlay() turned up in v4, so I think v0-v3 had static
> > > sysfs files regardless of whether the PF driver was bound.
> > >
> > > > I really feel like the big issue is that this model is broken as you
> > > > have the VFs exposing sysfs interfaces that make use of the PFs to
> > > > actually implement. Greg's complaint was the PF pushing sysfs onto the
> > > > VFs. My complaint is VFs sysfs files operating on the PF. The trick is
> > > > to find a way to address both issues.
> > > >
> > > > Maybe the compromise is to reach down into the IOV code and have it
> > > > register the sysfs interface at device creation time in something like
> > > > pci_iov_sysfs_link if the PF has the functionality present to support
> > > > it.
> > >
> > > IIUC there are two questions on the table:
> > >
> > >   1) Should the sysfs files be visible only when a PF driver that
> > >      supports MSI-X vector assignment is bound?
> > >
> > >      I think this is a cosmetic issue.  The presence of the file is
> > >      not a reliable signal to management software; it must always
> > >      tolerate files that don't exist (e.g., on old kernels) or files
> > >      that are visible but don't work (e.g., vectors may be exhausted).
> > >
> > >      If we start with the files always being visible, we should be
> > >      able to add smarts later to expose them only when the PF driver
> > >      is bound.
> > >
> > >      My concerns with pci_enable_vf_overlay() are that it uses a
> > >      little more sysfs internals than I'd like (although there are
> > >      many callers of sysfs_create_files()) and it uses
> > >      pci_get_domain_bus_and_slot(), which is generally a hack and
> > >      creates refcounting hassles.  Speaking of which, isn't v6 missing
> > >      a pci_dev_put() to match the pci_get_domain_bus_and_slot()?
> >
> > I'm not so much worried about management software as the fact that
> > this is a vendor specific implementation detail that is shaping how
> > the kernel interfaces are meant to work. Other than the mlx5 I don't
> > know if there are any other vendors really onboard with this sort of
> > solution.
>
> I know this is currently vendor-specific, but I thought the value
> proposition of dynamic configuration of VFs for different clients
> sounded compelling enough that other vendors would do something
> similar.  But I'm not an SR-IOV guy and have no vendor insight, so
> maybe that's not the case?

It is not the case, any vendor who wants to compete with Mellanox
devices in large scale clouds, will be asked to implement this feature.

You can't provide reliable VM service without it.

<...>

> > I checked it again, it will make the *_msix_count files stop working.
> > In order to guarantee it cannot happen in the middle of things though
> > we are sitting on the device locks for both the PF and the VF. I'm not
> > a fan of having to hold 2 locks while we perform a firmware operation
> > for one device, but I couldn't find anything where we would deadlock
> > so it should be fine.
>
> I agree again, it's not ideal to hold two locks.  Is it possible we
> could get by without the VF lock?  If we simply check whether a VF
> driver is bound (without a lock), a VF driver bind can race with the
> PF sriov_set_msix_vec_count().

We are holding huge amount of locks simultaneously. I don't understand
why two locks can be so big deal now.

Alex said that my locking scheme is correct, I think you also said that.
Why do we need to add races and make correct solution to be half baked?

>
> If the VF driver bind wins, it reads the old Table Size.  If it reads
> a too-small size, it won't use all the vectors.  If it reads a
> too-large size, it will try to use too many vectors and some won't
> work.  But the race would be caused by a bug in the management
> software, and the consequence doesn't seem *terrible*.

It will present to the user/administrator incorrect picture where lspci
output won't be aligned with the real situation.

Thanks

>
> Bjorn

  parent reply	other threads:[~2021-03-11 19:18 UTC|newest]

Thread overview: 74+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-01  7:55 [PATCH mlx5-next v7 0/4] Dynamically assign MSI-X vectors count Leon Romanovsky
2021-03-01  7:55 ` [PATCH mlx5-next v7 1/4] PCI: Add a sysfs file to change the MSI-X table size of SR-IOV VFs Leon Romanovsky
2021-03-01  8:14   ` Greg Kroah-Hartman
2021-03-01  8:32     ` Leon Romanovsky
2021-03-01  8:37       ` Greg Kroah-Hartman
2021-03-01  8:53         ` Leon Romanovsky
2021-03-01  7:55 ` [PATCH mlx5-next v7 2/4] net/mlx5: Add dynamic MSI-X capabilities bits Leon Romanovsky
2021-03-01  7:55 ` [PATCH mlx5-next v7 3/4] net/mlx5: Dynamically assign MSI-X vectors count Leon Romanovsky
2021-03-01  7:55 ` [PATCH mlx5-next v7 4/4] net/mlx5: Implement sriov_get_vf_total_msix/count() callbacks Leon Romanovsky
2021-03-07  8:11 ` [PATCH mlx5-next v7 0/4] Dynamically assign MSI-X vectors count Leon Romanovsky
2021-03-07 18:55 ` Alexander Duyck
2021-03-07 19:19   ` Leon Romanovsky
2021-03-08 16:33     ` Alexander Duyck
2021-03-08 19:20       ` Leon Romanovsky
2021-03-10 19:09   ` Bjorn Helgaas
2021-03-10 20:10     ` Leon Romanovsky
2021-03-10 20:21       ` Greg Kroah-Hartman
2021-03-11  8:37         ` Leon Romanovsky
2021-03-10 23:34     ` Alexander Duyck
2021-03-11 18:17       ` Bjorn Helgaas
2021-03-11 19:16         ` Keith Busch
2021-03-11 19:21           ` Leon Romanovsky
2021-03-11 20:22           ` Jason Gunthorpe
2021-03-11 20:50             ` Keith Busch
2021-03-11 21:44               ` Jason Gunthorpe
2021-03-25 17:21                 ` Bjorn Helgaas
2021-03-25 17:36                   ` Jason Gunthorpe
2021-03-25 18:20                     ` Bjorn Helgaas
2021-03-25 18:28                       ` Jason Gunthorpe
2021-03-26  6:44                         ` Leon Romanovsky
2021-03-26 16:00                           ` Alexander Duyck
2021-03-26 16:56                             ` Jason Gunthorpe
2021-03-26 17:08                             ` Bjorn Helgaas
2021-03-26 17:12                               ` Jason Gunthorpe
2021-03-27  6:00                                 ` Leon Romanovsky
2021-03-26 17:29                               ` Keith Busch
2021-03-26 17:31                                 ` Jason Gunthorpe
2021-03-26 18:50                               ` Alexander Duyck
2021-03-26 19:01                                 ` Jason Gunthorpe
2021-03-30  1:29                                   ` Bjorn Helgaas
2021-03-30 13:57                                     ` Jason Gunthorpe
2021-03-30 15:00                                       ` Bjorn Helgaas
2021-03-30 19:47                                         ` Jason Gunthorpe
2021-03-30 20:41                                           ` Bjorn Helgaas
2021-03-30 22:43                                             ` Jason Gunthorpe
2021-03-31  6:38                                               ` Greg Kroah-Hartman
2021-03-31 12:19                                                 ` Jason Gunthorpe
2021-03-31 15:03                                                   ` Greg Kroah-Hartman
2021-03-31 17:07                                                     ` Jason Gunthorpe
2021-03-31  4:08                                             ` Leon Romanovsky
2021-04-01  1:23                                               ` Bjorn Helgaas
2021-04-01 11:49                                                 ` Leon Romanovsky
2021-03-30 18:10                                     ` Keith Busch
2021-03-26 19:36                                 ` Bjorn Helgaas
2021-03-27 12:38                                   ` Greg Kroah-Hartman
2021-03-25 18:31                     ` Keith Busch
2021-03-25 18:36                       ` Jason Gunthorpe
2021-03-11 19:17         ` Leon Romanovsky [this message]
2021-03-11 19:37         ` Alexander Duyck
2021-03-11 19:51           ` Leon Romanovsky
2021-03-11 20:11             ` Alexander Duyck
2021-03-11 20:19           ` Jason Gunthorpe
2021-03-11 21:49             ` Alexander Duyck
2021-03-11 23:20               ` Jason Gunthorpe
2021-03-12  2:53                 ` Alexander Duyck
2021-03-12  6:32                   ` Leon Romanovsky
2021-03-12 16:59                     ` Alexander Duyck
2021-03-12 17:03                       ` Jason Gunthorpe
2021-03-12 18:34                         ` Leon Romanovsky
2021-03-12 18:41                       ` Leon Romanovsky
2021-03-12 13:00                   ` Jason Gunthorpe
2021-03-12 13:36                     ` Keith Busch
2021-03-11 20:31         ` Jason Gunthorpe
2021-03-10  5:58 ` Leon Romanovsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YEps1B+DEztFfy8R@unreal \
    --to=leon@kernel.org \
    --cc=alex.williamson@redhat.com \
    --cc=alexander.duyck@gmail.com \
    --cc=bhelgaas@google.com \
    --cc=davem@davemloft.net \
    --cc=ddutile@redhat.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=helgaas@kernel.org \
    --cc=jgg@nvidia.com \
    --cc=kuba@kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=saeedm@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.