linux-kernel.vger.kernel.org archive mirror
From: Alex Williamson <alex.williamson@redhat.com>
To: Russ Weight <russell.h.weight@intel.com>
Cc: Cornelia Huck <cohuck@redhat.com>,
	"Adler, Michael" <michael.adler@intel.com>,
	"Whisonant, Tim" <tim.whisonant@intel.com>, <kvm@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Tom Rix <trix@redhat.com>
Subject: Re: BUG REPORT: vfio_pci driver
Date: Fri, 13 Aug 2021 16:09:07 -0600
Message-ID: <20210813160907.7b143b51.alex.williamson@redhat.com>
In-Reply-To: <ca6977da-5657-51aa-0ef2-fb4a7e8c15dd@intel.com>

On Fri, 13 Aug 2021 11:34:51 -0700
Russ Weight <russell.h.weight@intel.com> wrote:

> Bug Description:
> 
> A bug in the vfio_pci driver was reported in conjunction with work on FPGA

This looks like the documented behavior of an IRQ index reporting the
VFIO_IRQ_INFO_NORESIZE flag.  We can certainly work towards trying to
remove the flag from this index, but it seems the userspace driver is
currently ignoring the flag and expecting exactly the behavior the flag
indicates is not available.  Thanks,

Alex
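
For reference, a minimal sketch (assuming an already-open VFIO device file
descriptor, device_fd) of how a userspace driver can detect this restriction
up front: query the MSI-X index with VFIO_DEVICE_GET_IRQ_INFO and test
VFIO_IRQ_INFO_NORESIZE before laying out its VFIO_DEVICE_SET_IRQS calls.

#include <stdbool.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Query the MSI-X IRQ index and report whether the enabled vector range can
 * be grown by later VFIO_DEVICE_SET_IRQS calls.  device_fd is assumed to be
 * an already-open VFIO device file descriptor. */
static bool msix_range_can_grow(int device_fd)
{
        struct vfio_irq_info info;

        memset(&info, 0, sizeof(info));
        info.argsz = sizeof(info);
        info.index = VFIO_PCI_MSIX_IRQ_INDEX;

        if (ioctl(device_fd, VFIO_DEVICE_GET_IRQ_INFO, &info) < 0)
                return false;   /* be conservative on error */

        /* NORESIZE set: once some vectors are enabled, a later call cannot
         * extend the range in place. */
        return !(info.flags & VFIO_IRQ_INFO_NORESIZE);
}

When NORESIZE is reported, the robust pattern is to enable every vector the
driver will ever use in a single VFIO_DEVICE_SET_IRQS call, or to disable and
re-enable MSI-X when the range needs to grow.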

> cards. We were able to reproduce and root-cause the bug using systemtap.
> The original bug description is below. An understanding of the referenced
> DFL and OPAE tools is not required - it is the sequence of ioctl calls and
> IRQ vectors that matters:
> 
> > I’m trying to get an example AFU working that uses 2 IRQs, active at the same 
> > time. I’m hitting what looks to be a dfl_pci driver bug.
> >
> > The code tries to allocate two IRQ vectors: 0 and 1. I see opaevfio.c doing the 
> > right thing, picking the MSIX index. Allocating either IRQ 0 or IRQ 1 on its own 
> > works fine, and I confirm that the VFIO_DEVICE_SET_IRQS call looks reasonable, 
> > choosing the MSIX index with either start 0 or start 1 and count 1.
> >
> > Note that opaevfio.c always passes count 1, so it will make separate calls for 
> > each IRQ vector.
> >
> > When I try to allocate both, I see the following:
> >
> >   * If the VFIO_DEVICE_SET_IRQS ioctl is called first with start 0 and then
> >     start 1 (always count 1), the second ioctl (start 1) returns EINVAL.
> >   * If I set up the vectors in decreasing order, so start 1 followed by start 0,
> >     the program works!
> >   * I ruled out OPAE SDK user space problems by setting up my program to
> >     allocate in increasing order, which would normally fail. I changed only the
> >     ioctl call in user space opaevfio.c, inverting bit 0 of start so that the
> >     driver is called in decreasing index order. Of course this binds the wrong
> >     vectors to the fds, but I don’t care about that for now. This works! From
> >     this, I conclude that it can’t be a user space problem since the difference
> >     between working and failing is solely the order in which IRQ vectors are
> >     bound in ioctl calls.  
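
For reference, here is a minimal sketch of the VFIO_DEVICE_SET_IRQS call
described in the quoted report. The device file descriptor and the eventfds
are assumed to already exist (e.g. from eventfd(2)), and set_msix_eventfds()
is a hypothetical helper written for illustration, not part of opaevfio.c,
which issues the equivalent call with count 1 per vector.

#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Bind trigger eventfds to a range of MSI-X vectors with one
 * VFIO_DEVICE_SET_IRQS call. */
static int set_msix_eventfds(int device_fd, unsigned start, unsigned count,
                             const int32_t *eventfds)
{
        size_t sz = sizeof(struct vfio_irq_set) + count * sizeof(int32_t);
        struct vfio_irq_set *irq_set = calloc(1, sz);
        int ret;

        if (!irq_set)
                return -1;

        irq_set->argsz = sz;
        irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
        irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
        irq_set->start = start;
        irq_set->count = count;
        memcpy(irq_set->data, eventfds, count * sizeof(int32_t));

        ret = ioctl(device_fd, VFIO_DEVICE_SET_IRQS, irq_set);
        free(irq_set);
        return ret;
}

With this helper, set_msix_eventfds(fd, 0, 1, &fd0) followed by
set_msix_eventfds(fd, 1, 1, &fd1) is the failing increasing-order sequence,
swapping the two calls is the passing one, and set_msix_eventfds(fd, 0, 2, fds)
configures both vectors in a single call.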
> 
> The EINVAL is coming from vfio_msi_set_block() here:
> https://github.com/torvalds/linux/blob/master/drivers/vfio/pci/vfio_pci_intrs.c#L373
> 
> vfio_msi_set_block() is being called from vfio_pci_set_msi_trigger() here on
> the second IRQ request:
> https://github.com/torvalds/linux/blob/master/drivers/vfio/pci/vfio_pci_intrs.c#L530
> 
> We believe the bug is in vfio_pci_set_msi_trigger(), in the 2nd parameter to the call
> to vfio_msi_enable() here:
> https://github.com/torvalds/linux/blob/master/drivers/vfio/pci/vfio_pci_intrs.c#L533
> 
> In both the passing and failing cases, the first IRQ request results in a call
> to vfio_msi_enable() at line 533 and the second IRQ request results in the
> call to vfio_msi_set_block() at line 530. It is during the first IRQ request
> that vfio_msi_enable() sets vdev->num_ctx based on the 2nd parameter (nvec).
> vdev->num_ctx is part of the conditional that results in the EINVAL for the
> failing case.
> 
> In the passing case, vdev->num_ctx is 2. In the failing case, it is 1.
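
To make the order dependence concrete, here is a small standalone model of the
bookkeeping described above. It is an illustration of the behavior reported in
this thread, not the kernel code: num_ctx stands in for vdev->num_ctx,
set_irqs() for the VFIO_DEVICE_SET_IRQS trigger path, and set_block() for the
bounds check in vfio_msi_set_block().

#include <errno.h>
#include <stdio.h>

static unsigned num_ctx;        /* stands in for vdev->num_ctx */
static int msi_enabled;         /* stands in for "MSI-X already enabled" */

/* Models the bounds check that produces the EINVAL in vfio_msi_set_block(). */
static int set_block(unsigned start, unsigned count)
{
        if (start >= num_ctx || start + count > num_ctx)
                return -EINVAL;
        return 0;
}

/* Models the trigger path: the first call sizes the allocation to
 * start + count (the nvec argument to vfio_msi_enable()); later calls can
 * only fill in slots that already exist. */
static int set_irqs(unsigned start, unsigned count)
{
        if (!msi_enabled) {
                num_ctx = start + count;
                msi_enabled = 1;
        }
        return set_block(start, count);
}

int main(void)
{
        int a, b;

        /* Increasing order: num_ctx becomes 1, so the start 1 call fails. */
        a = set_irqs(0, 1);
        b = set_irqs(1, 1);
        printf("start 0 then 1: %d, %d\n", a, b);   /* 0, -22 (EINVAL) */

        num_ctx = 0;
        msi_enabled = 0;

        /* Decreasing order: num_ctx becomes 2, so both calls succeed. */
        a = set_irqs(1, 1);
        b = set_irqs(0, 1);
        printf("start 1 then 0: %d, %d\n", a, b);   /* 0, 0 */
        return 0;
}

Whether the first-call sizing should be treated as a bug or simply as the
documented NORESIZE behavior is the question Alex raises above.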
> 
> I am attaching two text files containing trace information from systemtap: one for
> the failing case and one for the passing case. They contain a lot more information
> than is needed, but if you search for vfio_pci_set_msi_trigger and vfio_msi_set_block,
> you will see values for some of the call parameters.
> 
> - Russ
> 



Thread overview: 3+ messages
2021-08-13 18:34 BUG REPORT: vfio_pci driver Russ Weight
2021-08-13 22:09 ` Alex Williamson [this message]
2021-08-17 17:16   ` Russ Weight
