netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alexander Duyck <alexander.duyck@gmail.com>
To: Leon Romanovsky <leon@kernel.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
	Saeed Mahameed <saeedm@nvidia.com>,
	Jason Gunthorpe <jgg@nvidia.com>,
	Jakub Kicinski <kuba@kernel.org>,
	linux-pci <linux-pci@vger.kernel.org>,
	linux-rdma@vger.kernel.org, Netdev <netdev@vger.kernel.org>,
	Don Dutile <ddutile@redhat.com>,
	Alex Williamson <alex.williamson@redhat.com>
Subject: Re: [PATCH mlx5-next v1 1/5] PCI: Add sysfs callback to allow MSI-X table size change of SR-IOV VFs
Date: Thu, 14 Jan 2021 08:51:22 -0800	[thread overview]
Message-ID: <CAKgT0UdPZvSf0qSsU1NGcVcK_j6rPZQ8YT_UcygU+2FEq_dGpQ@mail.gmail.com> (raw)
In-Reply-To: <20210114071649.GL4678@unreal>

On Wed, Jan 13, 2021 at 11:16 PM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Wed, Jan 13, 2021 at 12:00:00PM -0800, Alexander Duyck wrote:
> > On Tue, Jan 12, 2021 at 10:09 PM Leon Romanovsky <leon@kernel.org> wrote:
> > >
> > > On Tue, Jan 12, 2021 at 01:59:51PM -0800, Alexander Duyck wrote:
> > > > On Mon, Jan 11, 2021 at 10:39 PM Leon Romanovsky <leon@kernel.org> wrote:
> > > > >
> > > > > On Mon, Jan 11, 2021 at 11:30:33AM -0800, Alexander Duyck wrote:
> > > > > > On Sun, Jan 10, 2021 at 7:12 AM Leon Romanovsky <leon@kernel.org> wrote:
> > > > > > >
> > > > > > > From: Leon Romanovsky <leonro@nvidia.com>
> > > > > > >

<snip>

> >
> > > >
> > > > Also I am not a big fan of the VF groping around looking for a PF
> > > > interface as it means the interface will likely be exposed in the
> > > > guest as well, but it just won't work.
> > >
> > > If you are referring to VF exposed to the VM, so in this case VF must be
> > > bound too vfio driver, or any other driver, and won't allow MSI-X change.
> > > If you are referring to PF exposed to the VM, it is very unlikely scenario
> > > in real world and reserved for braves among us. Even in this case, the
> > > set MSI-X won't work, because PF will be connected to the hypervisor driver
> > > that doesn't support set_msix.
> > >
> > > So both cases are handled.
> >
> > I get that they are handled. However I am not a huge fan of the sysfs
> > attributes for one device being dependent on another device. When you
> > have to start searching for another device it just makes things messy.
>
> This is pretty common way, nothing new here.

This is how writable fields within the device are handled. I am pretty
sure this is the first sysfs entry that is providing a workaround via
a device firmware to make the field editable that wasn't intended to
be.

So if in the future I define a device that has an MMIO register that
allows me to edit configuration space should I just tie it into the
same framework? That is kind of where I am going with my objection to
this. It just seems like you are adding a backdoor to allow editing
read-only configuration options.

> >
> > > >
> > > > > >
> > > > > > If you are calling this on the VFs then it doesn't really make any
> > > > > > sense anyway since the VF is not a "VF PCI dev representor" and
> > > > > > shouldn't be treated as such. In my opinion if we are going to be
> > > > > > doing per-port resource limiting that is something that might make
> > > > > > more sense as a part of the devlink configuration for the VF since the
> > > > > > actual change won't be visible to an assigned device.
> > > > >
> > > > > https://lore.kernel.org/linux-pci/20210112061535.GB4678@unreal/
> > > >
> > > > So the question I would have is if we are spawning the VFs and
> > > > expecting them to have different configs or the same configuration?
> > >
> > > By default, they have same configuration.
> > >
> > > > I'm assuming in your case you are looking for a different
> > > > configuration per port. Do I have that correct?
> > >
> > > No, per-VF as represents one device in the PCI world. For example, mlx5
> > > can have more than one physical port.
> >
> > Sorry, I meant per virtual function, not per port.
>
> Yes, PCI spec is clear, MSI-X vector count is per-device and in our case
> it means per-VF.

I think you overlooked the part about it being "read-only". It isn't
really meant to be changed and that is what this patch set is
providing.

> >
> > > >
> > > > Where this gets ugly is that SR-IOV assumes a certain uniformity per
> > > > VF so doing a per-VF custom limitation gets ugly pretty quick.
> > >
> > > I don't find any support for this "uniformity" claim in the PCI spec.
> >
> > I am referring to the PCI configuration space. Each VF ends up with
> > some fixed amount of MMIO resources per function. So typically when
> > you spawn VFs we had things setup so that all you do is say how many
> > you want.
> >
> > > > I wonder if it would make more sense if we are going this route to just
> > > > define a device-tree like schema that could be fed in to enable VFs
> > > > instead of just using echo X > sriov_numvfs and then trying to fix
> > > > things afterwards. Then you could define this and other features that
> > > > I am sure you would need in the future via json-schema like is done in
> > > > device-tree and write it once enabling the set of VFs that you need.
> > >
> > > Sorry, but this is overkill, it won't give us much and it doesn't fit
> > > the VF usage model at all.
> > >
> > > Right now, all heavy users of SR-IOV are creating many VFs up to the maximum.
> > > They do it with autoprobe disabled, because it is too time consuming to wait
> > > till all VFs probe themselves and unbind them later.
> > >
> > > After that, they wait for incoming request to provision VM on VF, they set MAC
> > > address, change MSI-X according to VM properties and bind that VF to new VM.
> > >
> > > So MSI-X change is done after VFs were created.
> >
> > So if I understand correctly based on your comments below you are
> > dynamically changing the VF's MSI-X configuration space then?
>
> I'm changing "Table Size" from "7.7.2.2 Message Control Register for
> MSI-X (Offset 02h)" and nothing more.
>
> If you do raw PCI read before and after, only this field will be changed.

I would hope there is much more going on. Otherwise the VF hardware
will be exploitable by a malicious driver in the guest since you could
read/write to registers beyond the table and see some result. I am
assuming the firmware doesn't allow triggering of any interrupts
beyond the ones defined as being in the table.

  reply	other threads:[~2021-01-14 16:52 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-10 15:07 [PATCH mlx5-next v1 0/5] Dynamically assign MSI-X vectors count Leon Romanovsky
2021-01-10 15:07 ` [PATCH mlx5-next v1 1/5] PCI: Add sysfs callback to allow MSI-X table size change of SR-IOV VFs Leon Romanovsky
2021-01-11 19:30   ` Alexander Duyck
2021-01-12  3:25     ` Don Dutile
2021-01-12  6:15       ` Leon Romanovsky
2021-01-12  6:39     ` Leon Romanovsky
2021-01-12 21:59       ` Alexander Duyck
2021-01-13  6:09         ` Leon Romanovsky
2021-01-13 20:00           ` Alexander Duyck
2021-01-14  7:16             ` Leon Romanovsky
2021-01-14 16:51               ` Alexander Duyck [this message]
2021-01-14 17:12                 ` Leon Romanovsky
2021-01-13 17:50   ` Alex Williamson
2021-01-13 19:49     ` Leon Romanovsky
2021-01-10 15:07 ` [PATCH mlx5-next v1 2/5] PCI: Add SR-IOV sysfs entry to read number of MSI-X vectors Leon Romanovsky
2021-01-11 19:30   ` Alexander Duyck
2021-01-12  6:56     ` Leon Romanovsky
2021-01-12 21:34       ` Alexander Duyck
2021-01-13  6:19         ` Leon Romanovsky
2021-01-13 22:44           ` Alexander Duyck
2021-01-14  6:50             ` Leon Romanovsky
2021-01-14 16:40               ` Alexander Duyck
2021-01-14 16:48                 ` Jason Gunthorpe
2021-01-14 17:55                   ` Alexander Duyck
2021-01-14 18:29                     ` Jason Gunthorpe
2021-01-14 19:24                       ` Alexander Duyck
2021-01-14 19:56                         ` Leon Romanovsky
2021-01-14 20:08                         ` Jason Gunthorpe
2021-01-14 21:43                           ` Alexander Duyck
2021-01-14 23:28                             ` Alex Williamson
2021-01-15  1:56                               ` Alexander Duyck
2021-01-15 14:06                                 ` Jason Gunthorpe
2021-01-15 15:53                                   ` Leon Romanovsky
2021-01-16  1:48                                     ` Alexander Duyck
2021-01-16  8:20                                       ` Leon Romanovsky
2021-01-18  3:16                                         ` Alexander Duyck
2021-01-18  7:20                                           ` Leon Romanovsky
2021-01-18 13:28                                             ` Leon Romanovsky
2021-01-18 18:21                                               ` Alexander Duyck
2021-01-19  5:43                                                 ` Leon Romanovsky
2021-01-16  4:32                                   ` Alexander Duyck
2021-01-18 15:47                                     ` Jason Gunthorpe
2021-01-15 13:51                             ` Jason Gunthorpe
2021-01-14 17:26                 ` Leon Romanovsky
2021-01-18 17:03   ` Greg KH
2021-01-19  5:38     ` Leon Romanovsky
2021-01-10 15:07 ` [PATCH mlx5-next v1 3/5] net/mlx5: Add dynamic MSI-X capabilities bits Leon Romanovsky
2021-01-10 15:07 ` [PATCH mlx5-next v1 4/5] net/mlx5: Dynamically assign MSI-X vectors count Leon Romanovsky
2021-01-10 15:07 ` [PATCH mlx5-next v1 5/5] net/mlx5: Allow to the users to configure number of MSI-X vectors Leon Romanovsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAKgT0UdPZvSf0qSsU1NGcVcK_j6rPZQ8YT_UcygU+2FEq_dGpQ@mail.gmail.com \
    --to=alexander.duyck@gmail.com \
    --cc=alex.williamson@redhat.com \
    --cc=bhelgaas@google.com \
    --cc=ddutile@redhat.com \
    --cc=jgg@nvidia.com \
    --cc=kuba@kernel.org \
    --cc=leon@kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=saeedm@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).