From: Niklas Schnelle <schnelle@linux.ibm.com>
To: "Rafael J. Wysocki" <rafael@kernel.org>,
Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Tony Nguyen <anthony.l.nguyen@intel.com>,
ACPI Devel Maling List <linux-acpi@vger.kernel.org>,
netdev <netdev@vger.kernel.org>,
"intel-wired-lan@lists.osuosl.org"
<intel-wired-lan@lists.osuosl.org>,
"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Subject: Re: Oops in during sriov_enable with ixgbe driver
Date: Fri, 01 Oct 2021 10:23:35 +0200
Message-ID: <924c2d6ef51a83cce5c9bcf4004bbf1506c5a768.camel@linux.ibm.com>
In-Reply-To: <CAJZ5v0hsQvHp2PqFjxvyx4tPCnNC7BCWyfPj-eADFa1w68BCMQ@mail.gmail.com>
On Thu, 2021-09-30 at 20:37 +0200, Rafael J. Wysocki wrote:
> On Thu, Sep 30, 2021 at 8:20 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> > On Thu, Sep 30, 2021 at 7:38 PM Rafael J. Wysocki
> > <rafael.j.wysocki@intel.com> wrote:
> > > On 9/30/2021 7:31 PM, Jesse Brandeburg wrote:
> > > > On 9/28/2021 4:56 AM, Niklas Schnelle wrote:
> > > > > Hi Jesse, Hi Tony,
> > > > >
> > > > > Since v5.15-rc1 I've been having problems with enabling SR-IOV VFs on
> > > > > my private workstation with an Intel 82599 NIC with the ixgbe driver. I
> > > > > haven't had time to bisect or look closer but since it still happens on
> > > > > v5.15-rc3 I wanted to at least check if you're aware of the problem as
> > > > > I couldn't find anything on the web.
> > > > We haven't heard anything of this problem.
> > > >
> > > >
> > > > > I get below Oops when trying "echo 2 > /sys/bus/pci/.../sriov_numvfs"
> > > > > and suspect that the earlier ACPI messages could have something to do
> > > > > with that, absolutely not an ACPI expert though. If there is a need I
> > > > > could do a bisect.
> > > > Hi Niklas, thanks for the report, I added the Intel Driver's list for
> > > > more exposure.
> > > >
> > > > I asked the developers working on that driver to take a look and they
> > > > tried to reproduce, and were unable to do so. This might be related to
> > > > your platform, which strongly suggests that the ACPI stuff may be related.
> > > >
> > > > We have tried to reproduce this, but everything works fine: there is
> > > > no call trace in the scenario where VFs are created.
> > > >
> > > > This is good in that it doesn't seem to be a general failure, you may
> > > > want to file a kernel bugzilla (bugzilla.kernel.org) to track the issue,
> > > > and I hope that @Rafael might have some insight.
> > > >
> > > > This issue may be related to changes in acpi_pci_find_companion,
> > > > but as I say, we are not able to reproduce this.
> > > >
> > > > commit 59dc33252ee777e02332774fbdf3381b1d5d5f5d
> > > > Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > > Date: Tue Aug 24 16:43:55 2021 +0200
> > > > PCI: VMD: ACPI: Make ACPI companion lookup work for VMD bus
> > >
> > > This change doesn't affect any devices beyond the ones on the VMD bus.
> >
> > The only failing case I can see is when the device is on the VMD bus
> > and its bus pointer is NULL, so the dereference in
> > vmd_acpi_find_companion() crashes.
> >
> > Can anything like that happen?
>
> Not really, because pci_iov_add_virtfn() sets virtfn->bus.
>
> However, it doesn't set virtfn->dev.parent AFAICS, so when that gets
> dereferenced by ACPI_COMPANION(dev->parent) in
> acpi_pci_find_companion(), the crash occurs.
>
> We need a !dev->parent check in acpi_pci_find_companion() I suppose:
>
> Does the following change help?
>
> Index: linux-pm/drivers/pci/pci-acpi.c
> ===================================================================
> --- linux-pm.orig/drivers/pci/pci-acpi.c
> +++ linux-pm/drivers/pci/pci-acpi.c
> @@ -1243,6 +1243,9 @@ static struct acpi_device *acpi_pci_find
>  	bool check_children;
>  	u64 addr;
>
> +	if (!dev->parent)
> +		return NULL;
> +
>  	down_read(&pci_acpi_companion_lookup_sem);
>
>  	adev = pci_acpi_find_companion_hook ?
Yes, the above change fixes the problem for me. SR-IOV enables
successfully and the VFs are fully usable. Thanks!

Just out of curiosity, and because I use this system to test common code
PCI changes: do you have an idea what makes my system special here?
The call to pci_set_acpi_fwnode() in pci_setup_device() is
unconditional and should do the same on any ACPI-enabled system.
Also, nothing in your explanation sounds specific to my system.
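
For anyone following along, the effect of the !dev->parent check can be
sketched in plain userspace C. This is only an illustration under stated
assumptions: struct fake_device, struct fake_acpi_device and
find_parent_companion() are hypothetical stand-ins, not the kernel's
struct device, struct acpi_device or acpi_pci_find_companion(), and the
companion field merely models what ACPI_COMPANION(dev->parent) would
hand back.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct fake_acpi_device {
	int id;
};

struct fake_device {
	struct fake_device *parent;		/* NULL models the VF case from this thread */
	struct fake_acpi_device *companion;	/* models the result of ACPI_COMPANION() */
};

/*
 * Models the guarded lookup: return early when there is no parent,
 * so dev->parent is never dereferenced while it is NULL.
 */
static struct fake_acpi_device *
find_parent_companion(const struct fake_device *dev)
{
	if (!dev->parent)
		return NULL;

	return dev->parent->companion;
}
```

Without the early return, a device whose parent is NULL would hit a NULL
pointer dereference on the last line, which is the pattern described for
the VF in the quoted analysis.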