From: "Ertman, David M" <david.m.ertman@intel.com>
To: Jason Gunthorpe <jgg@ziepe.ca>,
"Kirsher, Jeffrey T" <jeffrey.t.kirsher@intel.com>
Cc: "davem@davemloft.net" <davem@davemloft.net>,
"gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
"nhorman@redhat.com" <nhorman@redhat.com>,
"sassmann@redhat.com" <sassmann@redhat.com>,
"Nguyen, Anthony L" <anthony.l.nguyen@intel.com>,
"Bowers, AndrewX" <andrewx.bowers@intel.com>
Subject: RE: [RFC PATCH v4 02/25] ice: Create and register virtual bus for RDMA
Date: Thu, 20 Feb 2020 18:48:04 +0000 [thread overview]
Message-ID: <DM6PR11MB2841C1643C0414031941D191DD130@DM6PR11MB2841.namprd11.prod.outlook.com> (raw)
In-Reply-To: <20200214203932.GY31668@ziepe.ca>
> -----Original Message-----
> From: Jason Gunthorpe <jgg@ziepe.ca>
> Sent: Friday, February 14, 2020 12:40 PM
> To: Kirsher, Jeffrey T <jeffrey.t.kirsher@intel.com>
> Cc: davem@davemloft.net; gregkh@linuxfoundation.org; Ertman, David M
> <david.m.ertman@intel.com>; netdev@vger.kernel.org; linux-
> rdma@vger.kernel.org; nhorman@redhat.com; sassmann@redhat.com;
> Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Bowers, AndrewX
> <andrewx.bowers@intel.com>
> Subject: Re: [RFC PATCH v4 02/25] ice: Create and register virtual bus for
> RDMA
>
> On Wed, Feb 12, 2020 at 11:14:01AM -0800, Jeff Kirsher wrote:
> > +/**
> > + * ice_init_peer_devices - initializes peer devices
> > + * @pf: ptr to ice_pf
> > + *
> > + * This function initializes peer devices on the virtual bus.
> > + */
> > +int ice_init_peer_devices(struct ice_pf *pf)
> > +{
> > + struct ice_vsi *vsi = pf->vsi[0];
> > + struct pci_dev *pdev = pf->pdev;
> > + struct device *dev = &pdev->dev;
> > + int status = 0;
> > + int i;
> > +
> > + /* Reserve vector resources */
> > + status = ice_reserve_peer_qvector(pf);
> > + if (status < 0) {
> > + dev_err(dev, "failed to reserve vectors for peer drivers\n");
> > + return status;
> > + }
> > + for (i = 0; i < ARRAY_SIZE(ice_peers); i++) {
> > + struct ice_peer_dev_int *peer_dev_int;
> > + struct ice_peer_drv_int *peer_drv_int;
> > + struct iidc_qos_params *qos_info;
> > + struct iidc_virtbus_object *vbo;
> > + struct msix_entry *entry = NULL;
> > + struct iidc_peer_dev *peer_dev;
> > + struct virtbus_device *vdev;
> > + int j;
> > +
> > + /* structure layout needed for container_of's looks like:
> > + * ice_peer_dev_int (internal only ice peer superstruct)
> > + * |--> iidc_peer_dev
> > + * |--> *ice_peer_drv_int
> > + *
> > + * iidc_virtbus_object (container_of parent for vdev)
> > + * |--> virtbus_device
> > + * |--> *iidc_peer_dev (pointer from internal struct)
> > + *
> > + * ice_peer_drv_int (internal only peer_drv struct)
> > + */
> > + peer_dev_int = devm_kzalloc(dev, sizeof(*peer_dev_int),
> > + GFP_KERNEL);
> > + if (!peer_dev_int)
> > + return -ENOMEM;
> > +
> > + vbo = kzalloc(sizeof(*vbo), GFP_KERNEL);
> > + if (!vbo) {
> > + devm_kfree(dev, peer_dev_int);
> > + return -ENOMEM;
> > + }
> > +
> > + peer_drv_int = devm_kzalloc(dev, sizeof(*peer_drv_int),
> > + GFP_KERNEL);
>
> To me, this looks like a lifetime mess. All these devm allocations
> against the parent object are being referenced through the vbo with a
> different kref lifetime. The whole thing has very unclear semantics
> who should be cleaning up on error
Will cover this at the end after addressing your following points =)
In my reply, I am going to refer to the kernel object that is registering the
virtbus_device(s) as KO_device and the kernel object that is registering
the virtbus_driver(s) as KO_driver.
>
> > + if (!peer_drv_int) {
> > + devm_kfree(dev, peer_dev_int);
> > + kfree(vbo);
>
> ie here we free two things
At this point in the init flow for KO_device, there has only been kallocs done,
no device has been registered with virtbus. So, only memory cleanup is
required.
>
> > + return -ENOMEM;
> > + }
> > +
> > + pf->peers[i] = peer_dev_int;
> > + vbo->peer_dev = &peer_dev_int->peer_dev;
> > + peer_dev_int->peer_drv_int = peer_drv_int;
> > + peer_dev_int->peer_dev.vdev = &vbo->vdev;
> > +
> > + /* Initialize driver values */
> > + for (j = 0; j < IIDC_EVENT_NBITS; j++)
> > + bitmap_zero(peer_drv_int->current_events[j].type,
> > + IIDC_EVENT_NBITS);
> > +
> > + mutex_init(&peer_dev_int->peer_dev_state_mutex);
> > +
> > + peer_dev = &peer_dev_int->peer_dev;
> > + peer_dev->peer_ops = NULL;
> > + peer_dev->hw_addr = (u8 __iomem *)pf->hw.hw_addr;
> > + peer_dev->peer_dev_id = ice_peers[i].id;
> > + peer_dev->pf_vsi_num = vsi->vsi_num;
> > + peer_dev->netdev = vsi->netdev;
> > +
> > + peer_dev_int->ice_peer_wq =
> > + alloc_ordered_workqueue("ice_peer_wq_%d",
> WQ_UNBOUND,
> > + i);
> > + if (!peer_dev_int->ice_peer_wq)
> > + return -ENOMEM;
>
> Here we free nothing
This is a miss on my part. At this point we should keep consistent and free the memory
that has been allocated as we unwind.
>
> > +
> > + peer_dev->pdev = pdev;
> > + qos_info = &peer_dev->initial_qos_info;
> > +
> > + /* setup qos_info fields with defaults */
> > + qos_info->num_apps = 0;
> > + qos_info->num_tc = 1;
> > +
> > + for (j = 0; j < IIDC_MAX_USER_PRIORITY; j++)
> > + qos_info->up2tc[j] = 0;
> > +
> > + qos_info->tc_info[0].rel_bw = 100;
> > + for (j = 1; j < IEEE_8021QAZ_MAX_TCS; j++)
> > + qos_info->tc_info[j].rel_bw = 0;
> > +
> > + /* for DCB, override the qos_info defaults. */
> > + ice_setup_dcb_qos_info(pf, qos_info);
> > +
> > + /* make sure peer specific resources such as msix_count and
> > + * msix_entries are initialized
> > + */
> > + switch (ice_peers[i].id) {
> > + case IIDC_PEER_RDMA_ID:
> > + if (test_bit(ICE_FLAG_IWARP_ENA, pf->flags)) {
> > + peer_dev->msix_count = pf-
> >num_rdma_msix;
> > + entry = &pf->msix_entries[pf-
> >rdma_base_vector];
> > + }
> > + break;
> > + default:
> > + break;
> > + }
> > +
> > + peer_dev->msix_entries = entry;
> > + ice_peer_state_change(peer_dev_int,
> ICE_PEER_DEV_STATE_INIT,
> > + false);
> > +
> > + vdev = &vbo->vdev;
> > + vdev->name = ice_peers[i].name;
> > + vdev->release = ice_peer_vdev_release;
> > + vdev->dev.parent = &pdev->dev;
> > +
> > + status = virtbus_dev_register(vdev);
> > + if (status) {
> > + virtbus_dev_unregister(vdev);
> > + vdev = NULL;
>
> Here we double unregister and free nothing.
>
> You need to go through all of this really carefully and make some kind
> of sane lifetime model and fix all the error unwinding :(
Thanks for catching this. A failure in virtbus_register_device() does
*not* require a call virtbus_unregister_device. The failure path for the
register function handles this. Also, we need to remain consistent with freeing
on unwind.
>
> Why doesn't the release() function of vbo trigger the free of all this
> peer related stuff?
>
> Use a sane design model of splitting into functions to allocate single
> peices of memory, goto error unwind each function, and build things up
> properly.
>
> Jason
I am going to add this to the documentation to record the following information.
The KO_device is responsible for allocating the memory for the virtbus_device
and keeping it viable for the lifetime of the KO_device. KO_device will call
virtbus_register_device to start using the virtbus_device, and KO_device is
responslble for calling virtbus_unregister_device either on KO_device's exit
path (remove/shutdown) or when it is done using the virtbus subsystem.
The KO_driver is responsible for allocating the memory for the virtbus_driver
and keeping it viable for the lifetime of the KO_driver. KO_driver will call
virtbus_register_driver to start using the virtbus_driver, and KO_driver is
responsible for calling virtbus_unregister_driver either on KO_driver's exit
path (remove/shutdown) or when it is done using the virtbus subsystem.
The premise is that the KO_device and KO_driver can load and unload multiple
times and they can reconnect to each other through the virtbus on each
occurrence of their reloads. So one example of a flow looks like the following:
- KO_device loads (order of KO_device and KO_driver loading is irrelevant)
- KO_device allocates memory for virtbus_device(s) it expects to use and
any backing memory it is going to use to interact with KO_driver.
- KO_device performs virtbus_register_device() which is the *only* place
a device_initialize() is performed for virtbus_device.
- KO_driver loads
- KO_driver allocates memory for virtbus_driver(s) it expects to use and
any backing memory it expects to use to interact with KO_device
- KO_driver performs virtbus_register_driver()
- virtbus matches virtbus_device and virtbus_driver and calls the
virtbus_drivers's probe()
- KO_driver and KO_device interact with each other however they choose to do so.
- KO_device (for example) receives a call to its remove callback
- KO_device's unload path severs any interaction the KO_device and KO_driver
were having - implementation dependant
- KO_device's unload path is required to perform a call to
virtbus_unregister_device(). virtbus_unregister_device() is the *only*
place a put_device() is performed.
- KO_device's unload path frees memory associated with the virtbus_device
- vitbus calls KO_drivers's .remove callback defined for the virtbus_driver
So, the lifespan of the virtbus_device is controlled by KO_device and the
lifespan of virtbus_driver is controlled by KO_driver.
It is required for the KO's to "allocate -> register -> unregister -> free"
virtbus objects.
-DaveE
next prev parent reply other threads:[~2020-02-20 18:48 UTC|newest]
Thread overview: 53+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-02-12 19:13 [RFC PATCH v4 00/25] Intel Wired LAN/RDMA Driver Updates 2020-02-11 Jeff Kirsher
2020-02-12 19:14 ` [RFC PATCH v4 01/25] virtual-bus: Implementation of Virtual Bus Jeff Kirsher
2020-02-14 17:02 ` Greg KH
2020-02-14 20:34 ` Jason Gunthorpe
2020-02-14 20:43 ` Greg KH
2020-02-15 0:01 ` Jason Gunthorpe
2020-02-15 0:53 ` Greg KH
2020-02-14 20:45 ` Greg KH
2020-02-20 18:55 ` Ertman, David M
2020-02-20 19:27 ` Jason Gunthorpe
2020-02-14 21:22 ` Parav Pandit
2020-02-15 0:08 ` Jason Gunthorpe
2020-02-12 19:14 ` [RFC PATCH v4 02/25] ice: Create and register virtual bus for RDMA Jeff Kirsher
2020-02-14 20:39 ` Jason Gunthorpe
2020-02-20 18:48 ` Ertman, David M [this message]
2020-02-20 20:58 ` Jason Gunthorpe
2020-02-12 19:14 ` [RFC PATCH v4 03/25] ice: Complete RDMA peer registration Jeff Kirsher
2020-02-12 19:14 ` [RFC PATCH v4 04/25] ice: Support resource allocation requests Jeff Kirsher
2020-02-12 19:14 ` [RFC PATCH v4 05/25] ice: Enable event notifications Jeff Kirsher
2020-02-12 19:14 ` [RFC PATCH v4 06/25] ice: Allow reset operations Jeff Kirsher
2020-02-12 19:14 ` [RFC PATCH v4 07/25] ice: Pass through communications to VF Jeff Kirsher
2020-02-12 19:14 ` [RFC PATCH v4 08/25] i40e: Move client header location Jeff Kirsher
2020-02-12 19:14 ` [RFC PATCH v4 09/25] i40e: Register a virtbus device to provide RDMA Jeff Kirsher
2020-02-12 19:14 ` [RFC PATCH v4 10/25] RDMA/irdma: Add driver framework definitions Jeff Kirsher
2020-02-14 22:13 ` Parav Pandit
2020-02-18 20:42 ` Saleem, Shiraz
2020-02-20 22:24 ` Parav Pandit
2020-02-20 23:06 ` Jason Gunthorpe
2020-02-21 17:01 ` Saleem, Shiraz
2020-02-21 17:23 ` Parav Pandit
2020-02-21 18:04 ` Jason Gunthorpe
2020-03-19 11:49 ` Martin Habets
2020-02-12 19:14 ` [RFC PATCH v4 11/25] RDMA/irdma: Implement device initialization definitions Jeff Kirsher
2020-02-12 19:14 ` [RFC PATCH v4 12/25] RDMA/irdma: Implement HW Admin Queue OPs Jeff Kirsher
2020-02-12 19:14 ` [RFC PATCH v4 13/25] RDMA/irdma: Add HMC backing store setup functions Jeff Kirsher
2020-02-12 19:14 ` [RFC PATCH v4 14/25] RDMA/irdma: Add privileged UDA queue implementation Jeff Kirsher
2020-02-12 19:14 ` [RFC PATCH v4 15/25] RDMA/irdma: Add QoS definitions Jeff Kirsher
2020-02-12 19:14 ` [RFC PATCH v4 16/25] RDMA/irdma: Add connection manager Jeff Kirsher
2020-02-12 19:14 ` [RFC PATCH v4 17/25] RDMA/irdma: Add PBLE resource manager Jeff Kirsher
2020-02-12 19:14 ` [RFC PATCH v4 18/25] RDMA/irdma: Implement device supported verb APIs Jeff Kirsher
2020-02-14 14:54 ` Jason Gunthorpe
2020-02-14 15:49 ` Andrew Boyer
2020-02-14 16:45 ` Jason Gunthorpe
2020-02-18 20:43 ` Saleem, Shiraz
2020-02-12 19:14 ` [RFC PATCH v4 19/25] RDMA/irdma: Add RoCEv2 UD OP support Jeff Kirsher
2020-02-12 19:14 ` [RFC PATCH v4 20/25] RDMA/irdma: Add user/kernel shared libraries Jeff Kirsher
2020-02-12 19:14 ` [RFC PATCH v4 21/25] RDMA/irdma: Add miscellaneous utility definitions Jeff Kirsher
2020-02-12 19:14 ` [RFC PATCH v4 22/25] RDMA/irdma: Add dynamic tracing for CM Jeff Kirsher
2020-02-14 14:53 ` Jason Gunthorpe
2020-02-18 20:43 ` Saleem, Shiraz
2020-02-12 19:14 ` [RFC PATCH v4 23/25] RDMA/irdma: Add ABI definitions Jeff Kirsher
2020-02-12 19:14 ` [RFC PATCH v4 24/25] RDMA: Add irdma Kconfig/Makefile and remove i40iw Jeff Kirsher
2020-02-12 19:14 ` [RFC PATCH v4 25/25] RDMA/irdma: Update MAINTAINERS file Jeff Kirsher
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=DM6PR11MB2841C1643C0414031941D191DD130@DM6PR11MB2841.namprd11.prod.outlook.com \
--to=david.m.ertman@intel.com \
--cc=andrewx.bowers@intel.com \
--cc=anthony.l.nguyen@intel.com \
--cc=davem@davemloft.net \
--cc=gregkh@linuxfoundation.org \
--cc=jeffrey.t.kirsher@intel.com \
--cc=jgg@ziepe.ca \
--cc=linux-rdma@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=nhorman@redhat.com \
--cc=sassmann@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).