linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Bjorn Helgaas <helgaas@kernel.org>
To: CREGUT Pierre IMT/OLN <pierre.cregut@orange.com>
Cc: linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] PCI/IOV: update num_VFs earlier
Date: Thu, 13 Jun 2019 18:51:12 -0500	[thread overview]
Message-ID: <20190613235112.GO13533@google.com> (raw)
In-Reply-To: <744273fd-8045-7527-ad29-fa19adf6d015@orange.com>

Hi Pierre,

I'm really sorry, I totally missed your response.

On Fri, Apr 26, 2019 at 10:11:54AM +0200, CREGUT Pierre IMT/OLN wrote:
> I also initially thought that kobject_uevent generated the netlink event
> but this is not the case. This is generated by the specific driver in use.
> For the Intel i40e driver, this is the call to i40e_do_reset_safe in
> i40e_pci_sriov_configure that sends the event.
> It is followed by i40e_pci_sriov_enable that calls i40e_alloc_vfs that
> finally calls the generic pci_enable_sriov function.
> 
> So the proposed patch works well for the i40e driver (x710 cards) because
> the update to num_VFs is fast enough to be committed before the event is
> received. It may not work with other cards. The same is true for the zero
> value and there is no guarantee for other cards.
> 
> The clean solution would be to lock the device in sriov_numvfs_show.
> I guess that there are good reasons why locks have been avoided
> in sysfs getter functions so let us explore other approaches.
> 
> We can either return a "not settled" value (-1) or (probably better)
> do not return a value but an error (-EAGAIN returned by the show
> function).
> 
> To distinguish this "not settled" situation we can either:
> * overload the meaning of num_VFs (eg make it negative)
>   but it is an unsigned short.
> * add a bool to pci_sriov struct (rather simple but modifies a well
>   established structure).
> * use the fact that not_settled => device is locked and use
>   mutex_is_locked as an over approximation.
> 
> The later is not perfect but requires minimal changes to
> sriov_numvfs_show:
> 
>      if (mutex_is_locked(&dev->mutex))
>          return -EAGAIN;

I think this looks like a good resolution.

> In all cases, the device could be locked or the boolean set just
> after the test. But I don't think there is a case where causality
> would be violated.Thank you in advance for your recommendations.  I will
> update the patch according to your instructions.
> 
> Le 06/04/2019 à 00:33, Bjorn Helgaas a écrit :
> > On Fri, Mar 29, 2019 at 09:00:58AM +0100, Pierre Crégut wrote:
> > > Ensure that iov->num_VFs is set before a netlink message is sent
> > > when the number of VFs is changed. Only the path for num_VFs > 0
> > > is affected. The path for num_VFs = 0 is already correct.
> > > 
> > > Monitoring programs can relie on netlink messages to track interface
> > > change and query their state in /sys. But when sriov_numvfs is set to a
> > > positive value, the netlink message is sent before the value is available
> > > in sysfs. The value read after the message is received is always zero.
> > Thanks, Pierre!  Can you clue me in on where exactly the connection
> > from sriov_enable() to netlink is?
> > 
> > I see one side of the race is with sriov_numvfs_show(), but I don't
> > know where the netlink message is sent.  Is that connected with the
> > kobject_uevent(KOBJ_CHANGE)?
> > 
> > One thing this would help with is figuring out exactly how *much*
> > earlier we need to set iov->num_VFs.  It looks like the current patch
> > sets it before we actually enable the VFs, so a user could read
> > /sys/.../sriov_numvfs and get the wrong value.  Of course, that's
> > unavoidable; the question is whether it's OK to get the new value
> > *before* it actually takes effect, or whether we want to return a
> > stale value until after it takes effect.
> > 
> > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=202991
> > > Signed-off-by: Pierre Crégut <pierre.cregut@orange.com>
> > > ---
> > > note: the behaviour can be tested with the following shell script also
> > > available on the bugzilla (d being the phy device name):
> > > 
> > > ip monitor dev $d | grep --line-buffered "^[0-9]*:" | \
> > > while read line; do cat /sys/class/net/$d/device/sriov_numvfs; done
> > > 
> > >   drivers/pci/iov.c | 3 ++-
> > >   1 file changed, 2 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
> > > index 3aa115ed3a65..a9655c10e87f 100644
> > > --- a/drivers/pci/iov.c
> > > +++ b/drivers/pci/iov.c
> > > @@ -351,6 +351,7 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
> > >   		goto err_pcibios;
> > >   	}
> > > +	iov->num_VFs = nr_virtfn;
> > >   	pci_iov_set_numvfs(dev, nr_virtfn);
> > >   	iov->ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;
> > >   	pci_cfg_access_lock(dev);
> > > @@ -363,7 +364,6 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
> > >   		goto err_pcibios;
> > >   	kobject_uevent(&dev->dev.kobj, KOBJ_CHANGE);
> > > -	iov->num_VFs = nr_virtfn;
> > >   	return 0;
> > > @@ -379,6 +379,7 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
> > >   	if (iov->link != dev->devfn)
> > >   		sysfs_remove_link(&dev->dev.kobj, "dep_link");
> > > +	iov->num_VFs = 0;
> > >   	pci_iov_set_numvfs(dev, 0);
> > >   	return rc;
> > >   }
> > > -- 
> > > 2.17.1
> > > 

  reply	other threads:[~2019-06-13 23:51 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-29  8:00 [PATCH] PCI/IOV: update num_VFs earlier Pierre Crégut
2019-04-05 22:33 ` Bjorn Helgaas
2019-04-26  8:11   ` CREGUT Pierre IMT/OLN
2019-06-13 23:51     ` Bjorn Helgaas [this message]
2019-10-01 23:45     ` Bjorn Helgaas
2019-10-03  9:04       ` CREGUT Pierre IMT/OLN
2019-10-03 22:10         ` Bjorn Helgaas
2019-10-03 22:36           ` Jakub Kicinski
2019-10-03 22:37           ` Duyck, Alexander H
2019-10-08 21:38           ` Bjorn Helgaas
2019-10-08 22:06             ` Don Dutile
2019-10-09 12:31               ` Bjorn Helgaas
2019-10-09 14:20                 ` Don Dutile
  -- strict thread matches above, loose matches on Subject: below --
2019-03-25  8:18 Pierre Crégut

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190613235112.GO13533@google.com \
    --to=helgaas@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=pierre.cregut@orange.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).