All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yishai Hadas <yishaih@nvidia.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: "Tian, Kevin" <kevin.tian@intel.com>,
	"maorg@nvidia.com" <maorg@nvidia.com>,
	"cohuck@redhat.com" <cohuck@redhat.com>,
	liulongfang <liulongfang@huawei.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"jgg@nvidia.com" <jgg@nvidia.com>,
	Shameerali Kolothum Thodi  <shameerali.kolothum.thodi@huawei.com>
Subject: Re: [PATCH vfio 2/2] vfio: Split migration ops from main device ops
Date: Sun, 19 Jun 2022 12:25:50 +0300	[thread overview]
Message-ID: <4f14e015-e4f7-7632-3cd7-0e644ed05c99@nvidia.com> (raw)
In-Reply-To: <20220617084723.00298d67.alex.williamson@redhat.com>

On 17/06/2022 17:47, Alex Williamson wrote:
> On Fri, 17 Jun 2022 08:50:00 +0000
> Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> wrote:
>
>>> -----Original Message-----
>>> From: Alex Williamson [mailto:alex.williamson@redhat.com]
>>> Sent: 17 June 2022 00:01
>>> To: Yishai Hadas <yishaih@nvidia.com>
>>> Cc: Tian, Kevin <kevin.tian@intel.com>; maorg@nvidia.com;
>>> cohuck@redhat.com; Shameerali Kolothum Thodi
>>> <shameerali.kolothum.thodi@huawei.com>; liulongfang
>>> <liulongfang@huawei.com>; kvm@vger.kernel.org; jgg@nvidia.com
>>> Subject: Re: [PATCH vfio 2/2] vfio: Split migration ops from main device ops
>>>
>>> On Thu, 16 Jun 2022 17:18:16 +0300
>>> Yishai Hadas <yishaih@nvidia.com> wrote:
>>>    
>>>> On 13/06/2022 10:13, Yishai Hadas wrote:
>>>>> On 10/06/2022 6:32, Tian, Kevin wrote:
>>>>>>> From: Yishai Hadas <yishaih@nvidia.com>
>>>>>>> Sent: Monday, June 6, 2022 4:56 PM
>>>>>>>
>>>>>>> vfio core checks whether the driver sets some migration op (e.g.
>>>>>>> set_state/get_state) and accordingly calls its op.
>>>>>>>
>>>>>>> However, currently mlx5 driver sets the above ops without regards to
>>>>>>> its
>>>>>>> migration caps.
>>>>>>>
>>>>>>> This might lead to unexpected usage/Oops if user space may call to the
>>>>>>> above ops even if the driver doesn't support migration. As for example,
>>>>>>> the migration state_mutex is not initialized in that case.
>>>>>>>
>>>>>>> The cleanest way to manage that seems to split the migration ops from
>>>>>>> the main device ops, this will let the driver setting them separately
>>>>>>> from the main ops when it's applicable.
>>>>>>>
>>>>>>> As part of that, changed HISI driver to match this scheme.
>>>>>>>
>>>>>>> This scheme may enable down the road to come with some extra
>>> group of
>>>>>>> ops (e.g. DMA log) that can be set without regards to the other options
>>>>>>> based on driver caps.
>>>>>>>
>>>>>>> Fixes: 6fadb021266d ("vfio/mlx5: Implement vfio_pci driver for mlx5
>>>>>>> devices")
>>>>>>> Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
>>>>>> Reviewed-by: Kevin Tian <kevin.tian@intel.com>, with one nit:
>>>>> Thanks Kevin, please see below.
>>>>>   
>>>>>>   
>>>>>>> @@ -1534,8 +1534,8 @@
>>> vfio_ioctl_device_feature_mig_device_state(struct
>>>>>>> vfio_device *device,
>>>>>>>        struct file *filp = NULL;
>>>>>>>        int ret;
>>>>>>>
>>>>>>> -    if (!device->ops->migration_set_state ||
>>>>>>> -        !device->ops->migration_get_state)
>>>>>>> +    if (!device->mig_ops->migration_set_state ||
>>>>>>> +        !device->mig_ops->migration_get_state)
>>>>>>>            return -ENOTTY;
>>>>>> ...
>>>>>>   
>>>>>>> @@ -1582,8 +1583,8 @@ static int
>>>>>>> vfio_ioctl_device_feature_migration(struct vfio_device *device,
>>>>>>>        };
>>>>>>>        int ret;
>>>>>>>
>>>>>>> -    if (!device->ops->migration_set_state ||
>>>>>>> -        !device->ops->migration_get_state)
>>>>>>> +    if (!device->mig_ops->migration_set_state ||
>>>>>>> +        !device->mig_ops->migration_get_state)
>>>>>>>            return -ENOTTY;
>>>>>>>   
>>>>>> Above checks can be done once when the device is registered then
>>>>>> here replaced with a single check on device->mig_ops.
>>>>>>   
>>>>> I agree, it may look as of below.
>>>>>
>>>>> Theoretically, this could be done even before this patch upon device
>>>>> registration.
>>>>>
>>>>> We could check that both 'ops' were set and *not* only one of and
>>>>> later check for the specific 'op' upon the feature request.
>>>>>
>>>>> Alex,
>>>>>
>>>>> Do you prefer to switch to the below as part of V2 or stay as of
>>>>> current submission and I'll just add Kevin as Reviewed-by ?
>>>>>
>>>>> diff --git a/drivers/vfio/pci/vfio_pci_core.c
>>>>> b/drivers/vfio/pci/vfio_pci_core.c
>>>>> index a0d69ddaf90d..f42102a03851 100644
>>>>> --- a/drivers/vfio/pci/vfio_pci_core.c
>>>>> +++ b/drivers/vfio/pci/vfio_pci_core.c
>>>>> @@ -1855,6 +1855,11 @@ int vfio_pci_core_register_device(struct
>>>>> vfio_pci_core_device *vdev)
>>>>>          if (pdev->hdr_type != PCI_HEADER_TYPE_NORMAL)
>>>>>                  return -EINVAL;
>>>>>
>>>>> +       if (vdev->vdev.mig_ops &&
>>>>> +          !(vdev->vdev.mig_ops->migration_get_state &&
>>>>> +            vdev->vdev.mig_ops->migration_get_state))
>>>>> +               return -EINVAL;
>>>>> +
>>>>>
>>>>> Yishai
>>>>>   
>>>> Hi Alex,
>>>>
>>>> Did you have the chance to review the above note ?
>>>>
>>>> I would like to send V2 with Kevin's Reviewed-by tag for both patches,
>>>> just wonder about the nit.
>>> The whole mig_ops handling seems rather ad-hoc to me.  What tells a
>>> developer when mig_ops can be set?  Can they be unset?  Does this
>>> enable a device feature callout to dynamically enable migration?  Why do
>>> we init the device with vfio_device_ops, but then open code per driver
>>> setting vfio_migration_ops?
>>>
>>> For the hisi_acc changes, they're specifically defining two device_ops,
>>> one for migration and one without migration.  This patch oddly makes
>>> them identical other than the name and doesn't simplify the setup path,
>>> which should now only require a single call point for init-device with
>>> the proposed API rather than the existing three.
>> I think one of the reasons we ended up using two diff ops were to restrict
>> exposing the migration BAR regions to Guest(and for that we only expose
>> half of the BAR region now). And for nested assignments this may result
>> in invalid BAR sizes if we do it for no migration cases. Hence we have
>> overrides for read/write/mmap/ioctl functions in hisi_acc_vfio_pci_migrn_ops .
> Darn, I always forget about that and skimmed too quickly, the close,
> read, write, and mmap callbacks are all different.  Sorry about that.
>
>> If we have to simplify the setup path, I guess we retain only the _migrn_ops and
>> add explicit checks for migration support in the above override functions.
> Yes, that's an option and none of those callbacks should be terribly
> performance sensitive to an inline test, it's just a matter of where we
> want to simplify the code.  Given the additional differences in
> callbacks I'm not necessarily suggesting this is something we need to
> change.


Alex,

So, do you suggest to go via this approach for mlx5 as well ?

Means, staying with a single device_ops but just inline a check whether 
migration is really supported inside the migration get/set state 
callbacks and let the core call it unconditionally.

Following this approach may introduce extra 'if's for the coming dirty 
tracking 'ops' that may face the same issue with the need to maintain an 
extra cap bit inside the driver for 'dirty tracking' support.

Assuming that an extra 'if' is not performance issue even not on the 
'read and clear' flow (is it ?) this can work.

However, the alternative that was suggested by this patch could be 
cleaner I assume at least for mlx5 and may drop all those extra 'if's, etc.

Let's converge and I'll send V2 accordingly.

Yishai


  reply	other threads:[~2022-06-19  9:26 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-06  8:56 [PATCH vfio 0/2] Migration few enhancements Yishai Hadas
2022-06-06  8:56 ` [PATCH vfio 1/2] vfio/mlx5: Protect mlx5vf_disable_fds() upon close device Yishai Hadas
2022-06-10  3:25   ` Tian, Kevin
2022-06-06  8:56 ` [PATCH vfio 2/2] vfio: Split migration ops from main device ops Yishai Hadas
2022-06-10  3:32   ` Tian, Kevin
2022-06-13  7:13     ` Yishai Hadas
2022-06-16 14:18       ` Yishai Hadas
2022-06-16 23:01         ` Alex Williamson
2022-06-17  3:58           ` Tian, Kevin
2022-06-17  4:04           ` Tian, Kevin
2022-06-17  8:50           ` Shameerali Kolothum Thodi
2022-06-17 14:47             ` Alex Williamson
2022-06-19  9:25               ` Yishai Hadas [this message]
2022-06-20  3:49                 ` Jason Gunthorpe
2022-06-21 16:41                   ` Alex Williamson
2022-06-24 14:12                     ` Jason Gunthorpe
2022-06-24 14:41                       ` Alex Williamson
2022-06-24 14:53                         ` Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4f14e015-e4f7-7632-3cd7-0e644ed05c99@nvidia.com \
    --to=yishaih@nvidia.com \
    --cc=alex.williamson@redhat.com \
    --cc=cohuck@redhat.com \
    --cc=jgg@nvidia.com \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=liulongfang@huawei.com \
    --cc=maorg@nvidia.com \
    --cc=shameerali.kolothum.thodi@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.