All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alex Williamson <alex.williamson@redhat.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Yishai Hadas <yishaih@nvidia.com>,
	bhelgaas@google.com, saeedm@nvidia.com,
	linux-pci@vger.kernel.org, kvm@vger.kernel.org,
	netdev@vger.kernel.org, kuba@kernel.org, leonro@nvidia.com,
	kwankhede@nvidia.com, mgurtovoy@nvidia.com, maorg@nvidia.com
Subject: Re: [PATCH V4 mlx5-next 13/13] vfio/mlx5: Use its own PCI reset_done error handler
Date: Wed, 27 Oct 2021 09:29:43 -0600	[thread overview]
Message-ID: <20211027092943.4f95f220.alex.williamson@redhat.com> (raw)
In-Reply-To: <20211026235002.GC2744544@nvidia.com>

On Tue, 26 Oct 2021 20:50:02 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Tue, Oct 26, 2021 at 05:16:44PM -0600, Alex Williamson wrote:
> > > @@ -471,6 +474,47 @@ mlx5vf_pci_migration_data_rw(struct mlx5vf_pci_core_device *mvdev,
> > >  	return count;
> > >  }
> > >  
> > > +/* This function is called in all state_mutex unlock cases to
> > > + * handle a 'defered_reset' if exists.
> > > + */  
> > 
> > I refrained from noting it elsewhere, but we're not in net/ or
> > drivers/net/ here, but we're using their multi-line comment style.  Are
> > we using the strong relation to a driver that does belong there as
> > justification for the style here?  
> 
> I think it is an oversight, tell Yishai you prefer the other format in
> drivers/vfio and it can be fixed

Seems fixed in the new version.

> > > @@ -539,7 +583,7 @@ static ssize_t mlx5vf_pci_mig_rw(struct vfio_pci_core_device *vdev,
> > >  	}
> > >  
> > >  end:
> > > -	mutex_unlock(&mvdev->state_mutex);
> > > +	mlx5vf_state_mutex_unlock(mvdev);  
> > 
> > I'm a little lost here, if the operation was to read the device_state
> > and mvdev->vmig.vfio_dev_state was error, that's already been copied to
> > the user buffer, so the user continues to see the error state for the
> > first read of device_state after reset if they encounter this race?  
> 
> Yes. If the userspace races ioctls they get a deserved mess.
> 
> This race exists no matter what we do, as soon as the unlock happens a
> racing reset ioctl could run in during the system call exit path.
> 
> The purpose of the locking is to protect the kernel from hostile
> userspace, not to allow userspace to execute concurrent ioctl's in a
> sensible way.

The reset_done handler sets deferred_reset = true and if it's possible
to get the state_mutex, will reset migration data and device_state as
part of releasing that mutex.  If there's contention on state_mutex,
the deferred_reset field flags that this migration state is still stale.

So, I assume that it's possible that a user resets the device via ioctl
or config space, there was contention and the migration state is still
stale, right?

The user then goes to read device_state, but the staleness of the
migration state is not resolved until *after* the stale device state is
copied to the user buffer.

What did the user do wrong to see stale data?  Thanks,

Alex


  reply	other threads:[~2021-10-27 15:29 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-26  9:05 [PATCH V4 mlx5-next 00/13] Add mlx5 live migration driver Yishai Hadas
2021-10-26  9:05 ` [PATCH V4 mlx5-next 01/13] PCI/IOV: Add pci_iov_vf_id() to get VF index Yishai Hadas
2021-10-26  9:05 ` [PATCH V4 mlx5-next 02/13] net/mlx5: Reuse exported virtfn index function call Yishai Hadas
2021-10-26  9:05 ` [PATCH V4 mlx5-next 03/13] net/mlx5: Disable SRIOV before PF removal Yishai Hadas
2021-10-26  9:05 ` [PATCH V4 mlx5-next 04/13] PCI/IOV: Add pci_iov_get_pf_drvdata() to allow VF reaching the drvdata of a PF Yishai Hadas
2021-10-26  9:05 ` [PATCH V4 mlx5-next 05/13] net/mlx5: Expose APIs to get/put the mlx5 core device Yishai Hadas
2021-10-26  9:05 ` [PATCH V4 mlx5-next 06/13] vfio: Fix VFIO_DEVICE_STATE_SET_ERROR macro Yishai Hadas
2021-10-26 15:32   ` Cornelia Huck
2021-10-26 15:50     ` Alex Williamson
2021-10-26 15:56       ` Cornelia Huck
2021-10-26 16:18     ` Leon Romanovsky
2021-10-26 16:32       ` Cornelia Huck
2021-10-26 16:42         ` Leon Romanovsky
2021-10-26 16:57           ` Cornelia Huck
2021-10-26  9:05 ` [PATCH V4 mlx5-next 07/13] vfio: Add a macro for VFIO_DEVICE_STATE_ERROR Yishai Hadas
2021-10-26 15:37   ` Cornelia Huck
2021-10-26  9:06 ` [PATCH V4 mlx5-next 08/13] vfio/pci_core: Make the region->release() function optional Yishai Hadas
2021-10-26  9:06 ` [PATCH V4 mlx5-next 09/13] net/mlx5: Introduce migration bits and structures Yishai Hadas
2021-10-26  9:06 ` [PATCH V4 mlx5-next 10/13] vfio/mlx5: Expose migration commands over mlx5 device Yishai Hadas
2021-10-26  9:06 ` [PATCH V4 mlx5-next 11/13] vfio/mlx5: Implement vfio_pci driver for mlx5 devices Yishai Hadas
2021-10-26 22:42   ` Alex Williamson
2021-10-26 23:46     ` Jason Gunthorpe
2021-10-26  9:06 ` [PATCH V4 mlx5-next 12/13] vfio/pci: Expose vfio_pci_aer_err_detected() Yishai Hadas
2021-10-26 22:45   ` Alex Williamson
2021-10-26  9:06 ` [PATCH V4 mlx5-next 13/13] vfio/mlx5: Use its own PCI reset_done error handler Yishai Hadas
2021-10-26 23:16   ` Alex Williamson
2021-10-26 23:50     ` Jason Gunthorpe
2021-10-27 15:29       ` Alex Williamson [this message]
2021-10-27 15:53         ` Jason Gunthorpe
2021-10-27 16:48           ` Alex Williamson
2021-10-27 16:53             ` Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211027092943.4f95f220.alex.williamson@redhat.com \
    --to=alex.williamson@redhat.com \
    --cc=bhelgaas@google.com \
    --cc=jgg@nvidia.com \
    --cc=kuba@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=kwankhede@nvidia.com \
    --cc=leonro@nvidia.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=maorg@nvidia.com \
    --cc=mgurtovoy@nvidia.com \
    --cc=netdev@vger.kernel.org \
    --cc=saeedm@nvidia.com \
    --cc=yishaih@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.