From mboxrd@z Thu Jan 1 00:00:00 1970 From: Matan Azrad Subject: Re: [PATCH v7 1/2] eal: add uevent monitor for hot plug Date: Mon, 8 Jan 2018 08:14:20 +0000 Message-ID: References: <1509567405-27439-3-git-send-email-jia.guo@intel.com> <1514943736-5931-1-git-send-email-jia.guo@intel.com> <1514943736-5931-2-git-send-email-jia.guo@intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Cc: "konstantin.ananyev@intel.com" , "jblunck@infradead.org" , "shreyansh.jain@nxp.com" , "jingjing.wu@intel.com" , "dev@dpdk.org" , Thomas Monjalon , "helin.zhang@intel.com" , Mordechay Haimovsky To: "Guo, Jia" , "stephen@networkplumber.org" , "bruce.richardson@intel.com" , "ferruh.yigit@intel.com" , "gaetan.rivet@6wind.com" Return-path: Received: from EUR01-HE1-obe.outbound.protection.outlook.com (mail-he1eur01on0041.outbound.protection.outlook.com [104.47.0.41]) by dpdk.org (Postfix) with ESMTP id DD6001322C for ; Mon, 8 Jan 2018 09:14:23 +0100 (CET) In-Reply-To: Content-Language: en-US List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Hi Guo I answered for the 2 threads here.=20 From: Guo, Jia, Monday, January 8, 2018 7:27 AM > On 1/3/2018 1:02 AM, Matan Azrad wrote: > > Hi Jeff > > > > Maybe I'm touching in previous discussions but please see some > comments\questions. > > > > From: Jeff Guo: > >> This patch aim to add a general uevent mechanism in eal device layer, > >> to enable all linux kernel object hot plug monitoring, so user could u= se > these > >> APIs to monitor and read out the device status info that sent from the > kernel > >> side, then corresponding to handle it, such as detach or attach the > >> device, and even benefit to use it to do smoothly fail safe work. > >> > >> 1) About uevent monitoring: > >> a: add one epolling to poll the netlink socket, to monitor the uevent = of > >> the device, add device_state in struct of rte_device, to identify = the > >> device state machine. > >> b: add enum of rte_eal_dev_event_type and struct of rte_eal_uevent. > >> c: add below API in rte eal device common layer. > >> rte_eal_dev_monitor_enable > >> rte_dev_callback_register > >> rte_dev_callback_unregister > >> _rte_dev_callback_process > >> rte_dev_monitor_start > >> rte_dev_monitor_stop > >> > >> 2) About failure handler, use pci uio for example, > >> add pci_remap_device in bus layer and below function to process it= : > >> rte_pci_remap_device > >> pci_uio_remap_resource > >> pci_map_private_resource > >> add rte_pci_dev_bind_driver to bind pci device with explicit drive= r. > >> > >> Signed-off-by: Jeff Guo > >> + user_cb->cb_arg =3D cb_arg; > >> + user_cb->event =3D event; > >> + if (event =3D=3D RTE_EAL_DEV_EVENT_ADD) > >> + dev_add_cb =3D user_cb; > > Only one dpdk entity can register to ADD callback? > > > > I suggest to add option to register all devices maybe by using dummy > device which will include all the "ALL_DEVICES" callbacks per event. > > All means past, present and future devices, by this way 1 callback can = be > called for all the devices and more than one dpdk entity could register t= o an > ADD\NEW event. > > What's about NEW instead of ADD? > > > > I also suggest to add the device pointer as a parameter to the > callback(which will be managed by EAL). > if you talk about dev_add_cb, the add means device add not cb add, if > you talk about dev event type, the ADD type is consistent with the type > form kernel side, anyway could be find a better. I'm talking about next: 1. dev_add_cb can hold only 1 callback, why? Can't 2 callbacks to be regist= ered to RTE_EAL_DEV_EVENT_ADD event? (actually there is memory leak in this= case) 2. Suggestion to register same callback to "all" devices by 1 call. 3. Suggestion to add parameter for the callback functions - the device poin= ter.=20 4. Suggestion to change name from RTE_EAL_DEV_EVENT_ADD to RTE_EAL_DEV_EVEN= T_NEW. 5. Clue how to implement 1,2 by dummy device. > but for 1 callback for all device, it is make sense , i will think about = that. > >> + else > >> + TAILQ_INSERT_TAIL(&(device->uev_cbs), user_cb, > >> next); > >> + } > >> + > >> + rte_spinlock_unlock(&rte_dev_cb_lock); > >> + return 0; > >> +} > >> +int > >> +_rte_dev_callback_process(struct rte_device *device, > >> + enum rte_eal_dev_event_type event, > >> + void *cb_arg, void *ret_param) > >> +{ > >> + struct rte_eal_dev_callback dev_cb; > >> + struct rte_eal_dev_callback *cb_lst; > >> + int rc =3D 0; > >> + > >> + rte_spinlock_lock(&rte_dev_cb_lock); > >> + if (event =3D=3D RTE_EAL_DEV_EVENT_ADD) { > >> + if (cb_arg !=3D NULL) > >> + dev_add_cb->cb_arg =3D cb_arg; > >> + > >> + if (ret_param !=3D NULL) > >> + dev_add_cb->ret_param =3D ret_param; > >> + > >> + rte_spinlock_unlock(&rte_dev_cb_lock); > > Can't someone free it when it running? > > I suggest to keep the lock locked. > > Callbacks are not allowed to use this mechanism to prevent deadlock. > seems it would bring some deadlock here, let's check it more. A deadlock should occur only when a callback tries to use this mechanism - = I think it is OK, you just need to document it for the user.=20 > >> + rc =3D dev_add_cb->cb_fn(dev_add_cb->event, > >> + dev_add_cb->cb_arg, dev_add_cb- > >>> ret_param); > >> + rte_spinlock_lock(&rte_dev_cb_lock); > >> + } else { > >> + TAILQ_FOREACH(cb_lst, &(device->uev_cbs), next) { > >> + if (cb_lst->cb_fn =3D=3D NULL || cb_lst->event !=3D event) > >> + continue; > >> + dev_cb =3D *cb_lst; > >> + cb_lst->active =3D 1; > >> + if (cb_arg !=3D NULL) > >> + dev_cb.cb_arg =3D cb_arg; > >> + if (ret_param !=3D NULL) > >> + dev_cb.ret_param =3D ret_param; > >> + > >> + rte_spinlock_unlock(&rte_dev_cb_lock); > > The current active flag doesn't do it thread safe here, I suggest to k= eep the > lock locked. > > Scenario: > > 1. Thread A see active =3D 0 in unregister function. > > 2. Context switch. > > 3. Thread B start the callback. > > 4. Context switch. > > 5. Thread A free it. > > 6. Context switch. > > 7. Seg fault in Thread B. > the same as above. The same as above, and I think the active flag doesn't solve the race and y= ou must solve it for the both cases. I suggest just to keep the lock locked and document the optional deadlock b= y the callback code. =20 > >> + rc =3D dev_cb.cb_fn(dev_cb.event, > >> + dev_cb.cb_arg, dev_cb.ret_param); > >> + rte_spinlock_lock(&rte_dev_cb_lock); > >> + cb_lst->active =3D 0; > >> + } > >> + } > >> + rte_spinlock_unlock(&rte_dev_cb_lock); > >> + return rc; > >> +} > >> return(_rte_dev_callback_process(dev, > >> + RTE_EAL_DEV_EVENT_REMOVE, > >> + NULL, NULL)); > > What is the reason to keep this device in EAL device list after the rem= oval? > > I suggest to remove it (driver remove, bus remove and EAL remove) after > the callbacks running. > > By this way EAL can initiate all device removals. > agree, device should be remove from the eal device list after the removal= . I suggest using rte_eal_hotplug_remove(). > it will do device removal from the device list by the eal device detach= =20 >function in the call backs running. does it fulfill your concerns. I mean the removal\probe should be initiated by the EAL and not by the user= s callbacks. > >> + } else if (uevent.type =3D=3D RTE_EAL_DEV_EVENT_ADD) > >> { > >> + if (dev =3D=3D NULL) { > >> + /** > >> + * bind the driver to the device > >> + * before user's add processing > >> + */ > >> + bus->bind_driver( > >> + uevent.devname, > >> + "igb_uio"); > >> + > > Similar comments here: > > EAL can initiate all device probe operations by adding the device and > probing it here before the callback running. > > Then, also the device pointer can be passed to the callbacks. > pass a device pointer could be bring some more change, let's think about > more. Yes, I know, it will help to the user especially in ADD(NEW) and REMOVE eve= nts. Here you can use rte_eal_hotplug_add(). > >> return(_rte_dev_callback_process(NULL, > >> + RTE_EAL_DEV_EVENT_ADD, > >> + uevent.devname, NULL)); > >> + } > >> + } > >> + } > >> + } > >> + return 0; > >> +} > >> +int > >> +rte_dev_monitor_start(void) > >> +{ > > Maybe add option to run it also by new EAL command line parameter? > good idea. > >> + int ret; > >> + > >> + if (!no_request_thread) > >> + return 0; Look, also here there is race, no_request_thread doesn't solve it. Maybe the EAL parameter should be the only way to run it(just don't expose = this API), I think the default should be TRUE. > >> + no_request_thread =3D false; > >> + > >> + /* create the host thread to wait/handle the uevent from kernel */ > >> + ret =3D pthread_create(&uev_monitor_thread, NULL, > >> + dev_uev_monitoring, NULL); > > What is the reason to open new thread for hotplug? > > Why not to use the current dpdk host thread by the alarm mechanism? > appropriate if you could talk about what you mean the disadvantage of > new thread here and the advantage of alarm mechanism at the case? One more thread can complicate things - the user will need to synchronize h= is alarm\interrupt callbacks code(host thread) with his hotplug callbacks c= ode(hotplug thread). =20 > >> + return ret; > >> +} > >> + > >> +int > >> +rte_dev_monitor_stop(void) > >> +{ > >> + udev_exit =3D true; > >> + no_request_thread =3D true; > >> + return 0; > >> +}