linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: Leon Romanovsky <leon@kernel.org>
Cc: "Bjorn Helgaas" <helgaas@kernel.org>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Saeed Mahameed" <saeedm@nvidia.com>,
	"Jason Gunthorpe" <jgg@nvidia.com>,
	"Alexander Duyck" <alexander.duyck@gmail.com>,
	"Jakub Kicinski" <kuba@kernel.org>,
	linux-pci@vger.kernel.org, linux-rdma@vger.kernel.org,
	netdev@vger.kernel.org, "Don Dutile" <ddutile@redhat.com>,
	"Alex Williamson" <alex.williamson@redhat.com>,
	"David S . Miller" <davem@davemloft.net>,
	"Krzysztof Wilczyński" <kw@linux.com>
Subject: Re: [PATCH mlx5-next v6 1/4] PCI: Add sysfs callback to allow MSI-X table size change of SR-IOV VFs
Date: Sun, 21 Feb 2021 16:01:32 +0100	[thread overview]
Message-ID: <YDJ1zOpd6xawfSwo@kroah.com> (raw)
In-Reply-To: <YDJmRp5V+TnEQUIV@unreal>

On Sun, Feb 21, 2021 at 03:55:18PM +0200, Leon Romanovsky wrote:
> On Sun, Feb 21, 2021 at 02:00:41PM +0100, Greg Kroah-Hartman wrote:
> > On Sat, Feb 20, 2021 at 01:06:00PM -0600, Bjorn Helgaas wrote:
> > > On Fri, Feb 19, 2021 at 09:20:18AM +0100, Greg Kroah-Hartman wrote:
> > >
> > > > Ok, can you step back and try to explain what problem you are trying to
> > > > solve first, before getting bogged down in odd details?  I find it
> > > > highly unlikely that this is something "unique", but I could be wrong as
> > > > I do not understand what you are wanting to do here at all.
> > >
> > > We want to add two new sysfs files:
> > >
> > >   sriov_vf_total_msix, for PF devices
> > >   sriov_vf_msix_count, for VF devices associated with the PF
> > >
> > > AFAICT it is *acceptable* if they are both present always.  But it
> > > would be *ideal* if they were only present when a driver that
> > > implements the ->sriov_get_vf_total_msix() callback is bound to the
> > > PF.
> >
> > Ok, so in the pci bus probe function, if the driver that successfully
> > binds to the device is of this type, then create the sysfs files.
> >
> > The driver core will properly emit a KOBJ_BIND message when the driver
> > is bound to the device, so userspace knows it is now safe to rescan the
> > device to see any new attributes.
> >
> > Here's some horrible pseudo-patch for where this probably should be
> > done:
> >
> >
> > diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> > index ec44a79e951a..5a854a5e3977 100644
> > --- a/drivers/pci/pci-driver.c
> > +++ b/drivers/pci/pci-driver.c
> > @@ -307,8 +307,14 @@ static long local_pci_probe(void *_ddi)
> >  	pm_runtime_get_sync(dev);
> >  	pci_dev->driver = pci_drv;
> >  	rc = pci_drv->probe(pci_dev, ddi->id);
> > -	if (!rc)
> > +	if (!rc) {
> > +		/* If PF or FV driver was bound, let's add some more sysfs files */
> > +		if (pci_drv->is_pf)
> > +			device_add_groups(pci_dev->dev, pf_groups);
> > +		if (pci_drv->is_fv)
> > +			device_add_groups(pci_dev->dev, fv_groups);
> >  		return rc;
> > +	}
> >  	if (rc < 0) {
> >  		pci_dev->driver = NULL;
> >  		pm_runtime_put_sync(dev);
> >
> >
> >
> >
> > Add some proper error handling if device_add_groups() fails, and then do
> > the same thing to remove the sysfs files when the device is unbound from
> > the driver, and you should be good to go.
> >
> > Or is this what you all are talking about already and I'm just totally
> > confused?
> 
> There are two different things here. First we need to add sysfs files
> for VF as the event of PF driver bind, not for the VF binds.
> 
> In your pseudo code, it will look:
>   	rc = pci_drv->probe(pci_dev, ddi->id);
>  -	if (!rc)
>  +	if (!rc) {
>  +		/* If PF or FV driver was bound, let's add some more sysfs files */
>  +		if (pci_drv->is_pf) {
>  +                      int i = 0;
>  +			device_add_groups(pci_dev->dev, pf_groups);
>  +                      for (i; i < pci_dev->totalVF; i++) {
>  +                              struct pci_device vf_dev = find_vf_device(pci_dev, i);
>  +
>  +				device_add_groups(vf_dev->dev, fv_groups);

Hahaha, no.

You are randomly adding new sysfs files to a _DIFFERENT_ device than the
one that is currently undergoing the probe() call?  That's crazy.  And
will break userspace.

Why would you want that?  The device should ONLY change when the device
that controls it has a driver bound/unbound to it, that should NEVER
cause random other devices on the bus to change state or sysfs files.

>  +                      }
>  +              }
>   		return rc;
> 
> Second, the code proposed by me does that but with driver callback that
> PF calls during init/uninit.

That works too, but really, why not just have the pci core do it for
you?  That way you do not have to go and modify each and every PCI
driver to get this type of support.  PCI core things belong in the PCI
core, not in each individual driver.

thanks,

greg k-h

  reply	other threads:[~2021-02-21 15:02 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-09 13:34 [PATCH mlx5-next v6 0/4] Dynamically assign MSI-X vectors count Leon Romanovsky
2021-02-09 13:34 ` [PATCH mlx5-next v6 1/4] PCI: Add sysfs callback to allow MSI-X table size change of SR-IOV VFs Leon Romanovsky
2021-02-15 21:01   ` Bjorn Helgaas
2021-02-16  7:33     ` Leon Romanovsky
2021-02-16 16:12       ` Bjorn Helgaas
2021-02-16 19:58         ` Leon Romanovsky
2021-02-17 18:02           ` Bjorn Helgaas
2021-02-17 19:25             ` Jason Gunthorpe
2021-02-17 20:28               ` Bjorn Helgaas
2021-02-17 23:52                 ` Jason Gunthorpe
2021-02-18 10:15             ` Leon Romanovsky
2021-02-18 22:39               ` Bjorn Helgaas
2021-02-19  7:52                 ` Leon Romanovsky
2021-02-19  8:20                   ` Greg Kroah-Hartman
2021-02-19 16:58                     ` Leon Romanovsky
2021-02-20 19:06                     ` Bjorn Helgaas
2021-02-21  6:59                       ` Leon Romanovsky
2021-02-23 21:07                         ` Bjorn Helgaas
2021-02-24  8:09                           ` Greg Kroah-Hartman
2021-02-24 21:37                             ` Don Dutile
2021-02-24  9:53                           ` Leon Romanovsky
2021-02-24 15:07                             ` Bjorn Helgaas
2021-02-21 13:00                       ` Greg Kroah-Hartman
2021-02-21 13:55                         ` Leon Romanovsky
2021-02-21 15:01                           ` Greg Kroah-Hartman [this message]
2021-02-21 15:30                             ` Leon Romanovsky
2021-02-16 20:37         ` Jason Gunthorpe
2021-02-09 13:34 ` [PATCH mlx5-next v6 2/4] net/mlx5: Add dynamic MSI-X capabilities bits Leon Romanovsky
2021-02-09 13:34 ` [PATCH mlx5-next v6 3/4] net/mlx5: Dynamically assign MSI-X vectors count Leon Romanovsky
2021-02-09 13:34 ` [PATCH mlx5-next v6 4/4] net/mlx5: Allow to the users to configure number of MSI-X vectors Leon Romanovsky
2021-02-09 21:06 ` [PATCH mlx5-next v6 0/4] Dynamically assign MSI-X vectors count Alexander Duyck
2021-02-14  5:24   ` Leon Romanovsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YDJ1zOpd6xawfSwo@kroah.com \
    --to=gregkh@linuxfoundation.org \
    --cc=alex.williamson@redhat.com \
    --cc=alexander.duyck@gmail.com \
    --cc=bhelgaas@google.com \
    --cc=davem@davemloft.net \
    --cc=ddutile@redhat.com \
    --cc=helgaas@kernel.org \
    --cc=jgg@nvidia.com \
    --cc=kuba@kernel.org \
    --cc=kw@linux.com \
    --cc=leon@kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=saeedm@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).