linux-rdma.vger.kernel.org archive mirror
From: Bjorn Helgaas <helgaas@kernel.org>
To: Leon Romanovsky <leon@kernel.org>
Cc: Doug Ledford <dledford@redhat.com>,
	Jason Gunthorpe <jgg@nvidia.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Bjorn Helgaas <bhelgaas@google.com>,
	"David S. Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-pci@vger.kernel.org, linux-rdma@vger.kernel.org,
	netdev@vger.kernel.org, Saeed Mahameed <saeedm@nvidia.com>,
	Yishai Hadas <yishaih@nvidia.com>
Subject: Re: [PATCH mlx5-next 1/7] PCI/IOV: Provide internal VF index
Date: Sun, 26 Sep 2021 15:23:41 -0500	[thread overview]
Message-ID: <20210926202341.GA588922@bhelgaas> (raw)
In-Reply-To: <YVAVAfU3aL6JJg3i@unreal>

On Sun, Sep 26, 2021 at 09:36:49AM +0300, Leon Romanovsky wrote:
> On Sat, Sep 25, 2021 at 12:41:15PM -0500, Bjorn Helgaas wrote:
> > On Sat, Sep 25, 2021 at 01:10:39PM +0300, Leon Romanovsky wrote:
> > > On Fri, Sep 24, 2021 at 08:08:45AM -0500, Bjorn Helgaas wrote:
> > > > On Thu, Sep 23, 2021 at 09:35:32AM +0300, Leon Romanovsky wrote:
> > > > > On Wed, Sep 22, 2021 at 04:59:30PM -0500, Bjorn Helgaas wrote:
> > > > > > On Wed, Sep 22, 2021 at 01:38:50PM +0300, Leon Romanovsky wrote:
> > > > > > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > > > > > 
> > > > > > > The PCI core uses the VF index internally, often called the vf_id,
> > > > > > > during the setup of the VF, e.g. in pci_iov_add_virtfn().
> > > > > > > 
> > > > > > > This index is needed for device drivers that implement live migration
> > > > > > > for their internal operations that configure/control their VFs.
> > > > > > >
> > > > > > > Specifically, the mlx5_vfio_pci driver introduced in upcoming patches of
> > > > > > > this series needs it, rather than the bus/device/function that is
> > > > > > > exposed today.
> > > > > > > 
> > > > > > > Add pci_iov_vf_id() which computes the vf_id by reversing the math that
> > > > > > > was used to create the bus/device/function.
> > > > > > > 
> > > > > > > Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
> > > > > > > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > > > > > > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > > > > > 
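For reference, a minimal sketch of how such a helper could reverse that
math, assuming the usual pf->sriov->offset and pf->sriov->stride fields
(so it would sit next to the other IOV code in drivers/pci); not
necessarily the exact implementation:

  int pci_iov_vf_id(struct pci_dev *dev)
  {
          struct pci_dev *pf;

          if (!dev->is_virtfn)
                  return -EINVAL;

          pf = pci_physfn(dev);
          /* Undo: Routing ID of VF i = RID(PF) + First VF Offset + i * VF Stride */
          return (((dev->bus->number << 8) + dev->devfn) -
                  ((pf->bus->number << 8) + pf->devfn + pf->sriov->offset)) /
                 pf->sriov->stride;
  }
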
> > > > > > Acked-by: Bjorn Helgaas <bhelgaas@google.com>
> > > > > > 
> > > > > > mlx5_core_sriov_set_msix_vec_count() looks like it does basically the
> > > > > > same thing as pci_iov_vf_id() by iterating through VFs until it finds
> > > > > > one with a matching devfn (although it *doesn't* check for a matching
> > > > > > bus number, which seems like a bug).
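That lookup is roughly the following sketch (mlx5_devfn_to_vf_id is a
hypothetical name for illustration; struct mlx5_core_sriov and its
vfs_ctx array are assumed from the mlx5 driver, the exact code may
differ):

  /* Hypothetical helper name; vfs_ctx/enabled assumed from the mlx5 driver. */
  static int mlx5_devfn_to_vf_id(struct mlx5_core_sriov *sriov,
                                 struct pci_dev *pf, struct pci_dev *vf)
  {
          int id;

          for (id = 0; id < pci_num_vf(pf); id++) {
                  if (!sriov->vfs_ctx[id].enabled)
                          continue;
                  /* Compares devfn only; the bus number is never checked. */
                  if (vf->devfn == pci_iov_virtfn_devfn(pf, id))
                          return id;
          }
          return -EINVAL;
  }
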
> > > ...
> > 
> > > > And it still looks like the existing code is buggy.  This is called
> > > > via sysfs, so if the PF is on bus X and the user writes to
> > > > sriov_vf_msix_count for a VF on bus X+1, it looks like
> > > > mlx5_core_sriov_set_msix_vec_count() will set the count for the wrong
> > > > VF.
> > > 
> > > In mlx5_core_sriov_set_msix_vec_count(), we receive a VF that is
> > > connected to the PF which has the "struct mlx5_core_dev". My
> > > expectation is that they share the same bus, as that PF is the one
> > > that created the VFs. mlx5 devices support up to 256 VFs, which is
> > > far below the bus split mentioned in the PCI spec.
> > > 
> > > How can a VF and its respective PF have different bus numbers?
> > 
> > See PCIe r5.0, sec 9.2.1.2.  For example,
> > 
> >   PF 0 on bus 20
> >     First VF Offset   1
> >     VF Stride         1
> >     NumVFs          511
> >   VF 0,1   through VF 0,255 on bus 20
> >   VF 0,256 through VF 0,511 on bus 21
> > 
> > This is implemented in pci_iov_add_virtfn(), which computes the bus
> > number and devfn from the VF ID.
> > 
> > pci_iov_virtfn_devfn(VF 0,1) == pci_iov_virtfn_devfn(VF 0,257), so if
> > the user writes to sriov_vf_msix_count for VF 0,257, it looks like
> > we'll call mlx5_set_msix_vec_count() for VF 0,1 instead of VF 0,257.
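For reference, a sketch of that forward math, using hypothetical helper
names that mirror pci_iov_virtfn_bus()/pci_iov_virtfn_devfn() (the
in-tree helpers also validate is_physfn):

  /* Hypothetical names mirroring pci_iov_virtfn_bus()/pci_iov_virtfn_devfn(). */
  static int vf_bus(struct pci_dev *pf, int vf_id)
  {
          return pf->bus->number + ((pf->devfn + pf->sriov->offset +
                                     pf->sriov->stride * vf_id) >> 8);
  }

  static int vf_devfn(struct pci_dev *pf, int vf_id)
  {
          return (pf->devfn + pf->sriov->offset +
                  pf->sriov->stride * vf_id) & 0xff;
  }

  /* With the example above (PF at bus 20, devfn 0, offset 1, stride 1):
   *   vf_id   0 (VF 0,1)   -> bus 20, devfn 1
   *   vf_id 255 (VF 0,256) -> bus 21, devfn 0
   *   vf_id 256 (VF 0,257) -> bus 21, devfn 1, i.e. same devfn as VF 0,1
   */
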
> 
> This is the PCI spec bus split that I mentioned.
> 
> > 
> > The spec encourages devices that require no more than 256 devices to
> > locate them all on the same bus number (PCIe r5.0, sec 9.1), so if you
> > only have 255 VFs, you may avoid the problem.
> > 
> > But in mlx5_core_sriov_set_msix_vec_count(), it's not obvious that it
> > is safe to assume the bus number is the same.
> 
> No problem, we will make it more clear.

IMHO you should resolve it by using the new interface.  It performs
better and is unambiguous regardless of how many VFs the device
supports.  What's the downside?
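
For illustration, a hypothetical sketch of what that conversion could
look like, assuming the driver can reach its mlx5_core_dev via
pci_get_drvdata() on the PF and that the firmware function_id is
1-based for VFs; the real patch may differ:

  int mlx5_core_sriov_set_msix_vec_count(struct pci_dev *vf, int msix_vec_count)
  {
          struct pci_dev *pf = pci_physfn(vf);
          struct mlx5_core_dev *dev = pci_get_drvdata(pf);
          int vf_id = pci_iov_vf_id(vf);  /* bus-aware, no devfn scan */

          if (vf_id < 0)
                  return vf_id;

          /* function_id 0 is the PF, so VFs are 1-based in the command */
          return mlx5_set_msix_vec_count(dev, vf_id + 1, msix_vec_count);
  }

This drops the devfn scan entirely; whether the per-VF "enabled" check
from the existing loop still needs to be carried over is a separate
question.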


Thread overview: 57+ messages
     [not found] <cover.1632305919.git.leonro@nvidia.com>
2021-09-22 10:38 ` [PATCH mlx5-next 1/7] PCI/IOV: Provide internal VF index Leon Romanovsky
2021-09-22 21:59   ` Bjorn Helgaas
2021-09-23  6:35     ` Leon Romanovsky
2021-09-24 13:08       ` Bjorn Helgaas
2021-09-25 10:10         ` Leon Romanovsky
2021-09-25 17:41           ` Bjorn Helgaas
2021-09-26  6:36             ` Leon Romanovsky
2021-09-26 20:23               ` Bjorn Helgaas [this message]
2021-09-27 11:55                 ` Leon Romanovsky
2021-09-27 14:47                   ` Bjorn Helgaas
2021-09-22 10:38 ` [PATCH mlx5-next 2/7] vfio: Add an API to check migration state transition validity Leon Romanovsky
2021-09-23 10:33   ` Shameerali Kolothum Thodi
2021-09-23 11:17     ` Leon Romanovsky
2021-09-23 13:55       ` Max Gurtovoy
2021-09-24  7:44         ` Shameerali Kolothum Thodi
2021-09-24  9:37           ` Kirti Wankhede
2021-09-26  9:09           ` Max Gurtovoy
2021-09-26 16:17             ` Shameerali Kolothum Thodi
2021-09-27 18:24               ` Max Gurtovoy
2021-09-27 18:29                 ` Shameerali Kolothum Thodi
2021-09-27 22:46   ` Alex Williamson
2021-09-27 23:12     ` Jason Gunthorpe
2021-09-28 19:19       ` Alex Williamson
2021-09-28 19:35         ` Jason Gunthorpe
2021-09-28 20:18           ` Alex Williamson
2021-09-29 16:16             ` Jason Gunthorpe
2021-09-29 18:06               ` Alex Williamson
2021-09-29 18:26                 ` Jason Gunthorpe
2021-09-29 10:57         ` Max Gurtovoy
2021-09-29 10:44       ` Max Gurtovoy
2021-09-29 12:35         ` Alex Williamson
2021-09-29 13:26           ` Max Gurtovoy
2021-09-29 13:50             ` Alex Williamson
2021-09-29 14:36               ` Max Gurtovoy
2021-09-29 15:17                 ` Alex Williamson
2021-09-29 15:28                   ` Max Gurtovoy
2021-09-29 16:14                     ` Jason Gunthorpe
2021-09-29 21:48                       ` Max Gurtovoy
2021-09-29 22:44                         ` Alex Williamson
2021-09-30  9:25                           ` Max Gurtovoy
2021-09-30 12:41                             ` Alex Williamson
2021-09-29 23:21                         ` Jason Gunthorpe
2021-09-30  9:34                           ` Max Gurtovoy
2021-09-30 14:47                             ` Jason Gunthorpe
2021-09-30 15:32                               ` Max Gurtovoy
2021-09-30 16:24                                 ` Jason Gunthorpe
2021-09-30 16:51                                   ` Max Gurtovoy
2021-09-30 17:01                                     ` Jason Gunthorpe
2021-09-22 10:38 ` [PATCH mlx5-next 3/7] vfio/pci_core: Make the region->release() function optional Leon Romanovsky
2021-09-23 13:57   ` Max Gurtovoy
2021-09-22 10:38 ` [PATCH mlx5-next 4/7] net/mlx5: Introduce migration bits and structures Leon Romanovsky
2021-09-24  5:48   ` Mark Zhang
2021-09-22 10:38 ` [PATCH mlx5-next 5/7] net/mlx5: Expose APIs to get/put the mlx5 core device Leon Romanovsky
2021-09-22 10:38 ` [PATCH mlx5-next 6/7] mlx5_vfio_pci: Expose migration commands over mlx5 device Leon Romanovsky
2021-09-28 20:22   ` Alex Williamson
2021-09-29  5:36     ` Leon Romanovsky
2021-09-22 10:38 ` [PATCH mlx5-next 7/7] mlx5_vfio_pci: Implement vfio_pci driver for mlx5 devices Leon Romanovsky
