From mboxrd@z Thu Jan 1 00:00:00 1970 From: Alex Williamson Subject: Re: [PATCH v7 1/4] vfio: Mediated device Core driver Date: Mon, 12 Sep 2016 09:53:14 -0600 Message-ID: <20160912095314.31b20e31@t450s.home> References: <1472097235-6332-1-git-send-email-kwankhede@nvidia.com> <1472097235-6332-2-git-send-email-kwankhede@nvidia.com> <57D11CC3.6010605@intel.com> <20160909124243.0d118541@t450s.home> <57D638C4.7020504@intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Cc: Jike Song , , , , , , , To: Kirti Wankhede Return-path: Received: from mx1.redhat.com ([209.132.183.28]:58330 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752814AbcILPxQ (ORCPT ); Mon, 12 Sep 2016 11:53:16 -0400 In-Reply-To: Sender: kvm-owner@vger.kernel.org List-ID: On Mon, 12 Sep 2016 13:19:11 +0530 Kirti Wankhede wrote: > On 9/12/2016 10:40 AM, Jike Song wrote: > > On 09/10/2016 03:55 AM, Kirti Wankhede wrote: > >> On 9/10/2016 12:12 AM, Alex Williamson wrote: > >>> On Fri, 9 Sep 2016 23:18:45 +0530 > >>> Kirti Wankhede wrote: > >>> > >>>> On 9/8/2016 1:39 PM, Jike Song wrote: > >>>>> On 08/25/2016 11:53 AM, Kirti Wankhede wrote: > >>>> > >>>>>> +---------------+ > >>>>>> | | > >>>>>> | +-----------+ | mdev_register_driver() +--------------+ > >>>>>> | | | +<------------------------+ __init() | > >>>>>> | | mdev | | | | > >>>>>> | | bus | +------------------------>+ |<-> VFIO user > >>>>>> | | driver | | probe()/remove() | vfio_mdev.ko | APIs > >>>>>> | | | | | | > >>>>>> | +-----------+ | +--------------+ > >>>>>> | | > >>>>> > >>>>> This aimed to have only one single vfio bus driver for all mediated devices, > >>>>> right? > >>>>> > >>>> > >>>> Yes. That's correct. > >>>> > >>>> > >>>>>> + > >>>>>> +static int mdev_add_attribute_group(struct device *dev, > >>>>>> + const struct attribute_group **groups) > >>>>>> +{ > >>>>>> + return sysfs_create_groups(&dev->kobj, groups); > >>>>>> +} > >>>>>> + > >>>>>> +static void mdev_remove_attribute_group(struct device *dev, > >>>>>> + const struct attribute_group **groups) > >>>>>> +{ > >>>>>> + sysfs_remove_groups(&dev->kobj, groups); > >>>>>> +} > >>>>> > >>>>> These functions are not necessary. You can always specify the attribute groups > >>>>> to dev->groups before registering a new device. > >>>>> > >>>> > >>>> At the time of mdev device create, I specifically didn't used > >>>> dev->groups because we callback in vendor driver before that, see below > >>>> code snippet, and those attributes should only be added if create() > >>>> callback returns success. > >>>> > >>>> ret = parent->ops->create(mdev, mdev_params); > >>>> if (ret) > >>>> return ret; > >>>> > >>>> ret = mdev_add_attribute_group(&mdev->dev, > >>>> parent->ops->mdev_attr_groups); > >>>> if (ret) > >>>> parent->ops->destroy(mdev); > >>>> > >>>> > >>>> > >>>>>> + > >>>>>> +static struct parent_device *mdev_get_parent_from_dev(struct device *dev) > >>>>>> +{ > >>>>>> + struct parent_device *parent; > >>>>>> + > >>>>>> + mutex_lock(&parent_list_lock); > >>>>>> + parent = mdev_get_parent(__find_parent_device(dev)); > >>>>>> + mutex_unlock(&parent_list_lock); > >>>>>> + > >>>>>> + return parent; > >>>>>> +} > >>>>> > >>>>> As we have demonstrated, all these refs and locks and release workqueue are not necessary, > >>>>> as long as you have an independent device associated with the mdev host device > >>>>> ("parent" device here). > >>>>> > >>>> > >>>> I don't think every lock will go away with that. This also changes how > >>>> mdev devices entries are created in sysfs. It adds an extra directory. > >>> > >>> Exposing the parent-child relationship through sysfs is a desirable > >>> feature, so I'm not sure how this is a negative. This part of Jike's > >>> conversion was a big improvement, I thought. Thanks, > >>> > >> > >> Jike's suggestion is to introduced a fake device over parent device i.e. > >> mdev-host, and then all mdev devices are children of 'mdev-host' not > >> children of real parent. > >> > > > > It really depends on how you define 'real parent' :) > > > > With a physical-host-mdev hierarchy, the parent of mdev devices is the host > > device, the parent of host device is the physical device. e.g. > > > > pdev mdev_host mdev_device > > dev<------------dev<------------dev > > parent parent > > > > Figure 1: device hierarchy > > > > Right, mdev-host device doesn't represent physical device nor any mdev > device. Then what is the need of such device? Is there anything implicitly wrong with using a device node to host the mdev child devices? Is the argument against it only that it's unnecessary? Can we make use of the device-core parent/child dependencies as Jike has done w/o that extra node? > >> For example, directory structure we have now is: > >> /sys/bus/pci/devices/0000\:85\:00.0/ > >> > >> mdev devices are in real parents directory. > >> > >> By introducing fake device it would be: > >> /sys/bus/pci/devices/0000\:85\:00.0/mdev-host/ > >> > >> mdev devices are in fake device's directory. > >> > > > > Yes, this is the wanted directory. > > > > I don't think so. Why? > >> Lock would be still required, to handle the race conditions like > >> 'mdev_create' is still in process and parent device is unregistered by > >> vendor driver/ parent device is unbind from vendor driver. > >> > > > > locks are provided to protect resources, would you elaborate more on > > what is the exact resource you want to protect by a lock in mdev_create? > > > > Simple, in your suggestion mdev-host device. Fake device will go away if > vendor driver unregisters the device from mdev module, right. I don't follow the reply here, but aiui there's ordering implicit in the device core that Jike is trying to take advantage of that simplifies the mdev layer significantly. In the case of an mdev_create, the device core needs to take a reference to the parent object, the mdev-host I'd guess in Jike's version, the created mdev device would also have a reference to that object, so the physical host device could not be removed so long as there are outstanding references. It's just a matter of managing references and acquiring and releasing objects. Thanks, Alex From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:53547) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bjTXv-0008M7-SM for qemu-devel@nongnu.org; Mon, 12 Sep 2016 11:53:27 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1bjTXr-00038s-1X for qemu-devel@nongnu.org; Mon, 12 Sep 2016 11:53:22 -0400 Received: from mx1.redhat.com ([209.132.183.28]:40900) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bjTXq-00035p-Pc for qemu-devel@nongnu.org; Mon, 12 Sep 2016 11:53:18 -0400 Date: Mon, 12 Sep 2016 09:53:14 -0600 From: Alex Williamson Message-ID: <20160912095314.31b20e31@t450s.home> In-Reply-To: References: <1472097235-6332-1-git-send-email-kwankhede@nvidia.com> <1472097235-6332-2-git-send-email-kwankhede@nvidia.com> <57D11CC3.6010605@intel.com> <20160909124243.0d118541@t450s.home> <57D638C4.7020504@intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH v7 1/4] vfio: Mediated device Core driver List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Kirti Wankhede Cc: Jike Song , pbonzini@redhat.com, kraxel@redhat.com, cjia@nvidia.com, qemu-devel@nongnu.org, kvm@vger.kernel.org, kevin.tian@intel.com, bjsdjshi@linux.vnet.ibm.com On Mon, 12 Sep 2016 13:19:11 +0530 Kirti Wankhede wrote: > On 9/12/2016 10:40 AM, Jike Song wrote: > > On 09/10/2016 03:55 AM, Kirti Wankhede wrote: > >> On 9/10/2016 12:12 AM, Alex Williamson wrote: > >>> On Fri, 9 Sep 2016 23:18:45 +0530 > >>> Kirti Wankhede wrote: > >>> > >>>> On 9/8/2016 1:39 PM, Jike Song wrote: > >>>>> On 08/25/2016 11:53 AM, Kirti Wankhede wrote: > >>>> > >>>>>> +---------------+ > >>>>>> | | > >>>>>> | +-----------+ | mdev_register_driver() +--------------+ > >>>>>> | | | +<------------------------+ __init() | > >>>>>> | | mdev | | | | > >>>>>> | | bus | +------------------------>+ |<-> VFIO user > >>>>>> | | driver | | probe()/remove() | vfio_mdev.ko | APIs > >>>>>> | | | | | | > >>>>>> | +-----------+ | +--------------+ > >>>>>> | | > >>>>> > >>>>> This aimed to have only one single vfio bus driver for all mediated devices, > >>>>> right? > >>>>> > >>>> > >>>> Yes. That's correct. > >>>> > >>>> > >>>>>> + > >>>>>> +static int mdev_add_attribute_group(struct device *dev, > >>>>>> + const struct attribute_group **groups) > >>>>>> +{ > >>>>>> + return sysfs_create_groups(&dev->kobj, groups); > >>>>>> +} > >>>>>> + > >>>>>> +static void mdev_remove_attribute_group(struct device *dev, > >>>>>> + const struct attribute_group **groups) > >>>>>> +{ > >>>>>> + sysfs_remove_groups(&dev->kobj, groups); > >>>>>> +} > >>>>> > >>>>> These functions are not necessary. You can always specify the attribute groups > >>>>> to dev->groups before registering a new device. > >>>>> > >>>> > >>>> At the time of mdev device create, I specifically didn't used > >>>> dev->groups because we callback in vendor driver before that, see below > >>>> code snippet, and those attributes should only be added if create() > >>>> callback returns success. > >>>> > >>>> ret = parent->ops->create(mdev, mdev_params); > >>>> if (ret) > >>>> return ret; > >>>> > >>>> ret = mdev_add_attribute_group(&mdev->dev, > >>>> parent->ops->mdev_attr_groups); > >>>> if (ret) > >>>> parent->ops->destroy(mdev); > >>>> > >>>> > >>>> > >>>>>> + > >>>>>> +static struct parent_device *mdev_get_parent_from_dev(struct device *dev) > >>>>>> +{ > >>>>>> + struct parent_device *parent; > >>>>>> + > >>>>>> + mutex_lock(&parent_list_lock); > >>>>>> + parent = mdev_get_parent(__find_parent_device(dev)); > >>>>>> + mutex_unlock(&parent_list_lock); > >>>>>> + > >>>>>> + return parent; > >>>>>> +} > >>>>> > >>>>> As we have demonstrated, all these refs and locks and release workqueue are not necessary, > >>>>> as long as you have an independent device associated with the mdev host device > >>>>> ("parent" device here). > >>>>> > >>>> > >>>> I don't think every lock will go away with that. This also changes how > >>>> mdev devices entries are created in sysfs. It adds an extra directory. > >>> > >>> Exposing the parent-child relationship through sysfs is a desirable > >>> feature, so I'm not sure how this is a negative. This part of Jike's > >>> conversion was a big improvement, I thought. Thanks, > >>> > >> > >> Jike's suggestion is to introduced a fake device over parent device i.e. > >> mdev-host, and then all mdev devices are children of 'mdev-host' not > >> children of real parent. > >> > > > > It really depends on how you define 'real parent' :) > > > > With a physical-host-mdev hierarchy, the parent of mdev devices is the host > > device, the parent of host device is the physical device. e.g. > > > > pdev mdev_host mdev_device > > dev<------------dev<------------dev > > parent parent > > > > Figure 1: device hierarchy > > > > Right, mdev-host device doesn't represent physical device nor any mdev > device. Then what is the need of such device? Is there anything implicitly wrong with using a device node to host the mdev child devices? Is the argument against it only that it's unnecessary? Can we make use of the device-core parent/child dependencies as Jike has done w/o that extra node? > >> For example, directory structure we have now is: > >> /sys/bus/pci/devices/0000\:85\:00.0/ > >> > >> mdev devices are in real parents directory. > >> > >> By introducing fake device it would be: > >> /sys/bus/pci/devices/0000\:85\:00.0/mdev-host/ > >> > >> mdev devices are in fake device's directory. > >> > > > > Yes, this is the wanted directory. > > > > I don't think so. Why? > >> Lock would be still required, to handle the race conditions like > >> 'mdev_create' is still in process and parent device is unregistered by > >> vendor driver/ parent device is unbind from vendor driver. > >> > > > > locks are provided to protect resources, would you elaborate more on > > what is the exact resource you want to protect by a lock in mdev_create? > > > > Simple, in your suggestion mdev-host device. Fake device will go away if > vendor driver unregisters the device from mdev module, right. I don't follow the reply here, but aiui there's ordering implicit in the device core that Jike is trying to take advantage of that simplifies the mdev layer significantly. In the case of an mdev_create, the device core needs to take a reference to the parent object, the mdev-host I'd guess in Jike's version, the created mdev device would also have a reference to that object, so the physical host device could not be removed so long as there are outstanding references. It's just a matter of managing references and acquiring and releasing objects. Thanks, Alex