From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Zhang, Qi Z" Subject: Re: [PATCH v8 04/19] ethdev: introduce device lock Date: Thu, 5 Jul 2018 03:37:27 +0000 Message-ID: <039ED4275CED7440929022BC67E706115325291F@SHSMSX103.ccr.corp.intel.com> References: <20180607123849.14439-1-qi.z.zhang@intel.com> <6105228.vaBHSlUH0d@xps> <039ED4275CED7440929022BC67E7061153250867@SHSMSX103.ccr.corp.intel.com> <4330736.hRnLL86zNI@xps> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Cc: "dev@dpdk.org" , "Burakov, Anatoly" , "Ananyev, Konstantin" , "Richardson, Bruce" , "Yigit, Ferruh" , "Shelton, Benjamin H" , "Vangati, Narender" , "arybchenko@solarflare.com" To: Thomas Monjalon Return-path: Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by dpdk.org (Postfix) with ESMTP id 182DD1BE7C for ; Thu, 5 Jul 2018 05:37:55 +0200 (CEST) In-Reply-To: <4330736.hRnLL86zNI@xps> Content-Language: en-US List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" > -----Original Message----- > From: Thomas Monjalon [mailto:thomas@monjalon.net] > Sent: Thursday, July 5, 2018 9:55 AM > To: Zhang, Qi Z > Cc: dev@dpdk.org; Burakov, Anatoly ; Ananyev, > Konstantin ; Richardson, Bruce > ; Yigit, Ferruh ; She= lton, > Benjamin H ; Vangati, Narender > ; arybchenko@solarflare.com > Subject: Re: [dpdk-dev] [PATCH v8 04/19] ethdev: introduce device lock >=20 > 05/07/2018 03:38, Zhang, Qi Z: > > From: Thomas Monjalon [mailto:thomas@monjalon.net] > > > 04/07/2018 12:49, Zhang, Qi Z: > > > > From: Thomas Monjalon [mailto:thomas@monjalon.net] > > > > > 04/07/2018 03:47, Zhang, Qi Z: > > > > > > From: Thomas Monjalon [mailto:thomas@monjalon.net] > > > > > > > 03/07/2018 17:08, Zhang, Qi Z: > > > > > > > > From: Thomas Monjalon [mailto:thomas@monjalon.net] > > > > > > > > > 02/07/2018 07:44, Qi Zhang: > > > > > > > > > > Introduce API rte_eth_dev_lock and rte_eth_dev_unlock > > > > > > > > > > to let application lock or unlock on specific ethdev, > > > > > > > > > > a locked device can't be detached, this help > > > > > > > > > > applicaiton to prevent unexpected device detaching, > > > > > > > > > > especially in multi-process > > > envrionment. > > > > > > > > > > > > > > > > > > Trying to understand: a process of an application could > > > > > > > > > try to detach a port while another process is against thi= s > decision. > > > > > > > > > Why an application needs to be protected against itself? > > > > > > > > > > > > > > > > I think we can regard this as a help function, it help > > > > > > > > application to simplified > > > > > > > the situation when one process want to detach a device while > > > > > > > another one is still using it. > > > > > > > > Application can register a callback which can do to > > > > > > > > necessary clean up (like > > > > > > > stop traffic, release memory ...) before device be detached. > > > > > > > > > > > > > > Yes I agree such hook can be a good idea. > > > [...] > > > > > > > After all, it is just a pre-detach hook. > > > > > > > > > > > > > Wait, how is it different of RTE_ETH_EVENT_DESTROY callback? > > > > > > > Perhaps we just need to improve the handling of the DESTROY > event? > > > > > > > > > > > > I have thought about this before. > > > > > > Not like RTE_ETH_EVENT_DESTROY and other event hook, the hook > > > > > > here > > > > > need to give feedback, pass or fail will impact the following > > > > > behavior, this make it special, so I separate it from all exist > > > > > rte_eth_event_type handle mechanism. > > > > > > > > > > Look at _rte_eth_dev_callback_process, there is a "ret_param". > > > > > > > > OK, that should work. > > > > > > > > > > > The alternative solution is > > > > > > we just introduce a new event type like > > > > > > RTE_ETH_EVENT_PRE_DETACH and reuse all exist API > > > > > rte_eth_dev_callback_register/rte_eth_dev_callback_unregister. > > > > > > > > > > I don't think we need a new event. > > > > > Let's try to use RTE_ETH_EVENT_DESTROY. > > > > > > > > The problem is RTE_ETH_EVENT_DESTROY is used in > > > rte_eth_dev_release_port already. > > > > And in PMD, rte_eth_dev_release_port is called after dev_uninit, > > > > that mean its too late to reject a detach > > > > > > You're right. > > > > > > It's a real mess currently. > > > The right order should be to remove ethdev ports before removing the > > > underlying EAL device. But it's strangely not the case. > > > > > > We need to separate things. > > > The function rte_eth_dev_close can be used to remove an ethdev port > > > if we add a call to rte_eth_dev_release_port. > > > So we could call rte_eth_dev_close in PMD remove functions. > > > Is "close" a good time to ask confirmation to the application? > > > Or should we ask confirmation a step before, on "stop"? > > > > I think the confirmation should before any cleanup stage, it should at = the > beginning of driver->remove. >=20 > So you stop a port, even if the app policy is against detaching it? My understanding is, stop and detach is different, we may stop a device and= reconfigure it then restart it. but for detach, properly we will not use it, unless it be probed again. For dev_close , it should be called after dev_stop. so we have to like below. If (dev->started) { dev_stop /* but still problem here, if traffic is ongoing */ if (dev_close()) { dev_start() return -EBUSY. } } else { If (dev_close()) Return _EBUSY } So for me, neither rte_eth_dev_stop and rte_eth_dev_close is the right plac= e to check this. But rte_eth_dev_destroy looks like a good one. We can put all the ethdev ge= neral logic into it,=20 and PMD specific dev_unit will be called at last >=20 > > Also we should not put it into rte_eth_dev_stop, because, rte_eth_dev_s= top > can invoked by application directly, in that case, we don't what any call= back be > invoked. >=20 > It it the same to detach a port: it is invoked directly by application. > I thought you wanted a callback as helper for inter-process management? >=20 > > > > So , do you mean we can remove > > > > _rte_eth_dev_callback_process(RTE_ETH_EVENT_DESTROPY) in > > > > rte_eth_dev_release_port > > > > > > I would say we need RTE_ETH_EVENT_DESTROY to notify that the port is > > > really destroyed. > > > Maybe the right thing to do is to add a new event > > > RTE_ETH_EVENT_CLOSE_REQUEST or something else. > > > Note that we already have 2 removal events in ethdev: > > > - RTE_ETH_EVENT_INTR_RMV when the port cannot be used anymore > > > - RTE_ETH_EVENT_DESTROY when the port is going to be deleted > > > > > > > And where is right place to call > > > _rte_eth_dev_callback_process(RTE_ETH_EVENT_DESTROY)? > > > > If can't be called in rte_eth_dev_detach, because if device is > > > > removed by > > > rte_eal_hotplug_remove, it will be skipped. > > > > > > No, rte_eth_dev_detach and rte_eal_hotplug_remove are 2 different > things. > > > One is a mix of ethdev and EAL (and should be deprecated), the other > > > one is for the underlying device at EAL level. > > > > > > > probably we need to call this at the beginning of each PMD > > > > driver->remove?, > > > that means, we need to change all PMD drivers? > > > > > > Yes, we can call rte_eth_dev_stop and rte_eth_dev_close at the > > > beginning of PMD remove. > > > Note that there is already a helper rte_eth_dev_destroy called in > > > some PMD to achieve the removal, but curiously, it doesn't call stop = and > close functions. > > > > Currently PMD implement driver->remove with different way, > rte_eth_dev_stop / rte_eth_dev_close / rte_eth_dev_destroy is not always = be > invoked. > > So Before we standardize what ethdev API and what sequence should be > > called in driver->remove (I think this is a separate task) I will > > suggest 1. Create another help function like > > _rte_eth_dev_allow_to_remove, 2. the help function will call > _rte_eth_dev_callback_process(RTE_ETH_EVENT_PRE_REMOVE) and update > ret_param which contain a reject count. > > 3. the help function should to invoked at beginning at driver->remove a= nd > driver->remove will abort if the help function failed. > > > > But once we standardized that , we can do cleanup to merge it into anot= her > rte_eth_xxx API in next step. > > > > What do you think? >=20 > No > All the problems we have today are because we preferred add more and more > functions instead of fixing the basic stuff. And it is especially the cas= e for all the > detach crap. > So no. > Let's fix stuff first. >=20 >=20