From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Guo, Jia" Subject: Re: [PATCH v7 1/2] eal: add uevent monitor for hot plug Date: Tue, 9 Jan 2018 11:45:59 +0000 Message-ID: <01BA8470C017D6468C8290E4B9C5E1E83B2D141C@shsmsx102.ccr.corp.intel.com> References: <1509567405-27439-3-git-send-email-jia.guo@intel.com> <1514943736-5931-1-git-send-email-jia.guo@intel.com> <1514943736-5931-2-git-send-email-jia.guo@intel.com> <2000997.W0TFPrnL2l@xps> <6b97ebc5-1ec5-a724-a620-96b23b126d01@intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Cc: "dev@dpdk.org" , "stephen@networkplumber.org" , "Richardson, Bruce" , "Yigit, Ferruh" , "gaetan.rivet@6wind.com" , "Ananyev, Konstantin" , "shreyansh.jain@nxp.com" , "Wu, Jingjing" , "Zhang, Helin" , "Van Haaren, Harry" To: Mordechay Haimovsky , Thomas Monjalon Return-path: Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by dpdk.org (Postfix) with ESMTP id 53B321B1A9 for ; Tue, 9 Jan 2018 12:46:04 +0100 (CET) In-Reply-To: Content-Language: en-US List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Got it, seems that would be more complex work to do later, still need to de= bugging the feature more to fulfill for more driver. Best regards, Jeff Guo From: Mordechay Haimovsky [mailto:motih@mellanox.com] Sent: Tuesday, January 9, 2018 6:32 PM To: Guo, Jia ; Thomas Monjalon Cc: dev@dpdk.org; stephen@networkplumber.org; Richardson, Bruce ; Yigit, Ferruh ; gaetan.rivet@6wi= nd.com; Ananyev, Konstantin ; shreyansh.jain@= nxp.com; Wu, Jingjing ; Zhang, Helin ; Van Haaren, Harry Subject: RE: [dpdk-dev] [PATCH v7 1/2] eal: add uevent monitor for hot plug Hi Jeff, The following will not work for Mellanox devices (see inline) "i will separate it for you all to benefit for review. for kernel bindin= g, i just let it automatically compare with the first time manually binding= , and it is the part of he hot plug flow. so i suggest to review more about= that if it is not side effect and workable, beg for keep on. " Also please note the following compilation warnings, ... lib/librte_eal/linuxapp/eal/eal_dev.c: In function 'dev_uev_monitoring= ': .. lib/librte_eal/linuxapp/eal/eal_dev.c:331:8: error: 'netlink_fd' may be= used uninitialized in this function [-Werror=3Dmaybe-uninit ... drivers/bus/pci/linux/pci.c: In function 'rte_pci_dev_bind_driver': ... /drivers/bus/pci/linux/pci.c:940:7: error: 'drv_bind_fd' may be used un= initialized in this function [-Werror=3Dmaybe-uninitialized] You are releasing uninitialized FDs in the error path of both routines. Moti H. From: Guo, Jia [mailto:jia.guo@intel.com] Sent: Tuesday, January 9, 2018 10:26 AM To: Thomas Monjalon > Cc: dev@dpdk.org; stephen@networkplumber.org; bruce.richardson@intel.com; ferruh.yigit@intel.com; gaeta= n.rivet@6wind.com; konstantin.ananyev@intel.= com; shreyansh.jain@nxp.com; jingjing.wu@intel.com; = helin.zhang@intel.com; Mordechay Haimovsky >; harry.van.haaren@intel.com Subject: Re: [dpdk-dev] [PATCH v7 1/2] eal: add uevent monitor for hot plug On 1/9/2018 8:39 AM, Thomas Monjalon wrote: Hi Jeff, I am surprised that there is not a lot of review of these very important patches. Maybe it is not easy to review. Let's try to progress in the following days. This patch is really too big with a lot of concepts spread in separate files, so it is difficult to properly review. Please, try to split in several patches, bringing only one concept per patch. At first, you can introduce the new events and the callback API. The second patch (and the most important one) would be to bring the uevent parsing for Linux (and void implementation for BSD). Then you can add and explain some patches around PCI mapping. At last there is the kernel binding effort - this one will probably be ignored for 18.02, because it is another huge topic. Without bothering with kernel binding, we can at least remove a device, get a notification, and eventually re-add it. It is a good first step. Anyway your testpmd patch tests exactly this scenario (totally new devices are not seen). i will separate it for you all to benefit for review. for kernel binding= , i just let it automatically compare with the first time manually binding,= and it is the part of he hot plug flow. so i suggest to review more about = that if it is not side effect and workable, beg for keep on. This will not work for Mellanox which uses several drivers and services in = order to map the device and device queues to user space. For example, the mlx4 PMD (PMD for ConnectX-3 devices) requires that mlx4_= core mlx4_en and mlx4_ib drivers to be loaded, and for RDM -core user-space libraries and daemons to be loaded. More comments below: 03/01/2018 02:42, Jeff Guo: --- /dev/null +++ b/lib/librte_eal/bsdapp/eal/eal_dev.c @@ -0,0 +1,64 @@ +/*- + * Copyright(c) 2010-2017 Intel Corporation. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: Please check how Intel Copyright and BSD license is newly formatted with SPDX tag. got it. --- /dev/null +++ b/lib/librte_eal/bsdapp/eal/include/exec-env/rte_dev.h +enum rte_eal_dev_event_type { + RTE_EAL_DEV_EVENT_UNKNOWN, /**< unknown event type */ + RTE_EAL_DEV_EVENT_ADD, /**< device adding event */ + RTE_EAL_DEV_EVENT_REMOVE, + /**< device removing event */ + RTE_EAL_DEV_EVENT_CHANGE, + /**< device status change event */ + RTE_EAL_DEV_EVENT_MOVE, /**< device sys path move event */ + RTE_EAL_DEV_EVENT_ONLINE, /**< device online event */ + RTE_EAL_DEV_EVENT_OFFLINE, /**< device offline event */ + RTE_EAL_DEV_EVENT_MAX /**< max value of this enum */ +}; The comments are not useful. Please better explain what is change, move, online, etc. The shorter prefix RTE_DEV is preferred over RTE_EAL_DEV. This file is full of definitions which must be common, not specific to BSD or Linux. Please move it. will move it to the better place. +int +_rte_dev_callback_process(struct rte_device *device, + enum rte_eal_dev_event_type event, + void *cb_arg, void *ret_param) cb_arg must be an opaque parameter which is registered with the callback and passed later. No need as parameter of this function. ret_param is not needed at all. The kernel event will be just translated as rte_eal_dev_event_type (rte_dev_event after rename). suggest hold one to let new param, such as device info, add by ret_param, s= o cb_arg have set when register and no use anymore, delete it. --- a/lib/librte_eal/common/include/rte_bus.h +++ b/lib/librte_eal/common/include/rte_bus.h /** + * Device iterator to find a device on a bus. + * + * This function returns an rte_device if one of those held by the bus + * matches the data passed as parameter. + * + * If the comparison function returns zero this function should stop itera= ting + * over any more devices. To continue a search the device of a previous se= arch + * can be passed via the start parameter. + * + * @param cmp + * the device name comparison function. + * + * @param data + * Data to compare each device against. + * + * @param start + * starting point for the iteration + * + * @return + * The first device matching the data, NULL if none exists. + */ +typedef struct rte_device * +(*rte_bus_find_device_by_name_t)(const struct rte_device *start, + rte_dev_cmp_name_t cmp, + const void *data); Why is it needed? There is already rte_bus_find_device_t. because the rte_bus_find_device_t just find a device structure in the devic= e list, but here need to find a device structure by device name which come = from uevent info. +/** + * Implementation specific remap function which is responsible for remmapi= ng + * devices on that bus from original share memory resource to a private me= mory + * resource for the sake of device has been removal. + * + * @param dev + * Device pointer that was returned by a previous call to find_devic= e. + * + * @return + * 0 on success. + * !0 on error. + */ +typedef int (*rte_bus_remap_device_t)(struct rte_device *dev); You need to better explain why this remap op is needed, and when it is called exactly? sure. @@ -206,9 +265,13 @@ struct rte_bus { rte_bus_scan_t scan; /**< Scan for devices attached to bus */ rte_bus_probe_t probe; /**< Probe devices on bus */ rte_bus_find_device_t find_device; /**< Find a device on the bus */ + rte_bus_find_device_by_name_t find_device_by_name; + /**< Find a device on the bus */ rte_bus_plug_t plug; /**< Probe single device for drivers */ rte_bus_unplug_t unplug; /**< Remove single device from driver */ rte_bus_parse_t parse; /**< Parse a device name */ + rte_bus_remap_device_t remap_device; /**< remap a device */ + rte_bus_bind_driver_t bind_driver; /**< bind a driver for bus device */ struct rte_bus_conf conf; /**< Bus configuration */ rte_bus_get_iommu_class_t get_iommu_class; /**< Get iommu class */ }; Every new op must be introduced in a separate patch (if not completely removed). make sense. --- a/lib/librte_eal/common/include/rte_dev.h +++ b/lib/librte_eal/common/include/rte_dev.h +enum device_state { + DEVICE_UNDEFINED, + DEVICE_FAULT, + DEVICE_PARSED, + DEVICE_PROBED, +}; These constants must prefixed with RTE_ and documented with doxygen please. got it. +/** + * It enable the device event monitoring for a specific event. This comment must be reworded. ok. + * + * @param none useless thanks. + * @return + * - On success, zero. + * - On failure, a negative value. + */ +int +rte_eal_dev_monitor_enable(void); I suggest to drop this function which is just calling rte_dev_monitor_start= . more discuss, i suggest keep on it , let rte_dev_monitor_start separately s= tay on the platform code and let user commonly call rte_eal_dev_monitor_ena= ble. --- a/lib/librte_eal/linuxapp/eal/eal_alarm.c +++ b/lib/librte_eal/linuxapp/eal/eal_alarm.c @@ -259,6 +260,10 @@ rte_eal_alarm_cancel(rte_eal_alarm_callback cb_fn, voi= d *cb_arg) } ap_prev =3D ap; } + + ret |=3D rte_intr_callback_unregister(&intr_handle, + eal_alarm_callback, NULL); + rte_spinlock_unlock(&alarm_list_lk); } while (executing !=3D 0); Looks to be unrelated. If it is a fix, please do a separate patch. ok. --- /dev/null +++ b/lib/librte_eal/linuxapp/eal/eal_dev.c +int +rte_dev_monitor_start(void) What about adding "event" in the name of this function? rte_dev_event_monitor_start dev monitor sounds shock, agree. +{ + int ret; + + if (!no_request_thread) + return 0; + no_request_thread =3D false; + + /* create the host thread to wait/handle the uevent from kernel */ + ret =3D pthread_create(&uev_monitor_thread, NULL, + dev_uev_monitoring, NULL); + return ret; +} I think you should use rte_service for thread management. thanks for your info, such a good mechanism to use that i even not know th= at before. i will study and use it. --- a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c +++ b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c @@ -354,6 +354,12 @@ igbuio_pci_release(struct uio_info *info, struct inode= *inode) struct rte_uio_pci_dev *udev =3D info->priv; struct pci_dev *dev =3D udev->pdev; + /* check if device have been remove before release */ + if ((&dev->dev.kobj)->state_remove_uevent_sent =3D=3D 1) { + pr_info("The device have been removed\n"); + return -1; + } This looks to be unrelated. Separate patch please. got it. End of first pass review. There are some basic requirements that other maintainers (especially at Intel) could have reviewed earlier. Let's try to improve it quickly for 18.02, thanks. If we are short in time, we should at least focus on adding the events/callback API and the Linux events implementation.