From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jerin Jacob Subject: Re: [PATCH v11 0/7] hot-unplug failure handle mechanism Date: Mon, 1 Oct 2018 15:25:15 +0530 Message-ID: <20181001095513.GA18062@jerin> References: <1498711073-42917-1-git-send-email-jia.guo@intel.com> <1538307003-11836-1-git-send-email-jia.guo@intel.com> <20181001110012.273b38fc@shemminger-XPS-13-9360> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit Cc: Jeff Guo , bruce.richardson@intel.com, ferruh.yigit@intel.com, konstantin.ananyev@intel.com, gaetan.rivet@6wind.com, jingjing.wu@intel.com, thomas@monjalon.net, motih@mellanox.com, matan@mellanox.com, harry.van.haaren@intel.com, qi.z.zhang@intel.com, shaopeng.he@intel.com, bernard.iremonger@intel.com, arybchenko@solarflare.com, wenzhuo.lu@intel.com, anatoly.burakov@intel.com, jblunck@infradead.org, shreyansh.jain@nxp.com, dev@dpdk.org, helin.zhang@intel.com To: Stephen Hemminger Return-path: Received: from NAM03-DM3-obe.outbound.protection.outlook.com (mail-dm3nam03on0050.outbound.protection.outlook.com [104.47.41.50]) by dpdk.org (Postfix) with ESMTP id 197D12C37 for ; Mon, 1 Oct 2018 11:55:44 +0200 (CEST) Content-Disposition: inline In-Reply-To: <20181001110012.273b38fc@shemminger-XPS-13-9360> List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" -----Original Message----- > Date: Mon, 1 Oct 2018 11:00:12 +0200 > From: Stephen Hemminger > To: Jeff Guo > Cc: bruce.richardson@intel.com, ferruh.yigit@intel.com, > konstantin.ananyev@intel.com, gaetan.rivet@6wind.com, > jingjing.wu@intel.com, thomas@monjalon.net, motih@mellanox.com, > matan@mellanox.com, harry.van.haaren@intel.com, qi.z.zhang@intel.com, > shaopeng.he@intel.com, bernard.iremonger@intel.com, > arybchenko@solarflare.com, wenzhuo.lu@intel.com, > anatoly.burakov@intel.com, jblunck@infradead.org, shreyansh.jain@nxp.com, > dev@dpdk.org, helin.zhang@intel.com > Subject: Re: [dpdk-dev] [PATCH v11 0/7] hot-unplug failure handle mechanism > > > On Sun, 30 Sep 2018 19:29:56 +0800 > Jeff Guo wrote: > > > Hotplug is an important feature for use-cases like the datacenter device's > > fail-safe and for SRIOV Live Migration in SDN/NFV. It could bring higher > > flexibility and continuality to networking services in multiple use-cases > > in the industry. So let's see how DPDK can help users implement hotplug > > solutions. > > > > We already have a general device-event monitor mechanism, failsafe driver, > > and hot plug/unplug API in DPDK. We have already got the solution of > > “ethdev event + kernel PMD hotplug handler + failsafe”, but we still not > > got “eal event + hotplug handler for pci PMD + failsafe” implement, and we > > need to considerate 2 different solutions between uio pci and vfio pci. > > > > In the case of hotplug for igb_uio, when a hardware device be removed > > physically or disabled in software, the application needs to be notified > > and detach the device out of the bus, and then make the device invalidate. > > The problem is that, the removal of the device is not instantaneous in > > software. If the application data path tries to read/write to the device > > when removal is still in process, it will cause an MMIO error and > > application will crash. > > > > In this patch set, we propose a PCIe bus failure handler mechanism for > > hot-unplug in igb_uio. It aims to guarantee that, when a hot-unplug occurs, > > the application will not crash. > > > > The mechanism should work as below: > > > > First, the application enables the device event monitor, registers the > > hotplug event’s callback and enable hotplug handling before running the > > data path. Once the hot-unplug occurs, the mechanism will detect the > > removal event and then accordingly do the failure handling. In order to > > do that, the below functionality will be required: > > - Add a new bus ops “hot_unplug_handler” to handle hot-unplug failure. > > - Implement pci bus specific ops “pci_hot_unplug_handler”. For uio pci, > > it will be based on the failure address to remap memory for the corresponding > > device that unplugged. For vfio pci, could seperate implement case by case. > > > > For the data path or other unexpected behaviors from the control path > > when a hot unplug occurs: > > - Add a new bus ops “sigbus_handler”, that is responsible for handling > > the sigbus error which is either an original memory error, or a specific > > memory error that is caused by a hot unplug. When a sigbus error is > > captured, it will call this function to handle sigbus error. > > - Implement PCI bus specific ops “pci_sigbus_handler”. It will iterate all > > device on PCI bus to find which device encounter the failure. > > - Implement a "rte_bus_sigbus_handler" to iterate all buses to find a bus > > to handle the failure. > > - Add a couple of APIs “rte_dev_hotplug_handle_enable” and > > “rte_dev_hotplug_handle_diable” to enable/disable hotplug handling. > > It will monitor the sigbus error by a handler which is per-process. > > Based on the signal event principle, the control path thread and the > > data path thread will randomly receive the sigbus error, but will call the > > common sigbus process. When sigbus be captured, it will call the above API > > to find bus to handle it. > > > > The mechanism could be used by app or PMDs. For example, the whole process > > of hotplug in testpmd is: > > - Enable device event monitor->Enable hotplug handle->Register event callback > > ->attach port->start port->start forwarding->Device unplug->failure handle > > ->stop forwarding->stop port->close port->detach port. > > > > This patch set would not cover hotplug insert and binding, and it is only > > implement the igb_uio failure handler, the vfio hotplug failure handler > > will be in next coming patch set. > > > > patchset history: > > v11->v10: > > change the ops name, since both uio and vfio will use the hot-unplug ops. > > add experimental tag. > > since we plan to abandon RTE_ETH_EVENT_INTR_RMV, change to use > > RTE_DEV_EVENT_REMOVE, so modify the hotplug event and callback usage. > > move the igb_uio fixing part, since it is random issue and should be considarate > > as kernel driver defect but not include as this failure handler mechanism. > > > > v10->v9: > > modify the api name and exposure out for public use. > > add hotplug handle enable/disable APIs > > refine commit log > > > > v9->v8: > > refine commit log to be more readable. > > > > v8->v7: > > refine errno process in sigbus handler. > > refine igb uio release process > > > > v7->v6: > > delete some unused part > > > > v6->v5: > > refine some description about bus ops > > refine commit log > > add some entry check. > > > > v5->v4: > > split patches to focus on the failure handle, remove the event usage > > by testpmd to another patch. > > change the hotplug failure handler name. > > refine the sigbus handle logic. > > add lock for udev state in igb uio driver. > > > > v4->v3: > > split patches to be small and clear. > > change to use new parameter "--hotplug-mode" in testpmd to identify > > the eal hotplug and ethdev hotplug. > > > > v3->v2: > > change bus ops name to bus_hotplug_handler. > > add new API and bus ops of bus_signal_handler distingush handle generic. > > sigbus and hotplug sigbus. > > > > v2->v1(v21): > > refine some doc and commit log. > > fix igb uio kernel issue for control path failure rebase testpmd code. > > > > Since the hot plug solution be discussed serval around in the public, > > the scope be changed and the patch set be split into many times. Coming > > to the recently RFC and feature design, it just focus on the hot unplug > > failure handler at this patch set, so in order let this topic more clear > > and focus, summarize privours patch set in history “v1(v21)”, the v2 here > > go ahead for further track. > > > > "v1(21)" == v21 as below: > > v21->v20: > > split function in hot unplug ops. > > sync failure hanlde to fix multiple process issue fix attach port issue for multiple devices case. > > combind rmv callback function to be only one. > > > > v20->v19: > > clean the code. > > refine the remap logic for multiple device. > > remove the auto binding. > > > > v19->18: > > note for limitation of multiple hotplug, fix some typo, sqeeze patch. > > > > v18->v15: > > add document, add signal bus handler, refine the code to be more clear. > > > > the prior patch history please check the patch set "add device event monitor framework". > > > > Jeff Guo (7): > > bus: add hot-unplug handler > > bus/pci: implement hot-unplug handler ops > > bus: add sigbus handler > > bus/pci: implement sigbus handler ops > > bus: add helper to handle sigbus > > eal: add failure handle mechanism for hot-unplug > > testpmd: use hot-unplug failure handle mechanism > > > > app/test-pmd/testpmd.c | 39 ++++++-- > > doc/guides/rel_notes/release_18_08.rst | 5 + > > drivers/bus/pci/pci_common.c | 81 ++++++++++++++++ > > drivers/bus/pci/pci_common_uio.c | 33 +++++++ > > drivers/bus/pci/private.h | 12 +++ > > lib/librte_eal/bsdapp/eal/eal_dev.c | 14 +++ > > lib/librte_eal/common/eal_common_bus.c | 43 +++++++++ > > lib/librte_eal/common/eal_private.h | 39 ++++++++ > > lib/librte_eal/common/include/rte_bus.h | 34 +++++++ > > lib/librte_eal/common/include/rte_dev.h | 26 +++++ > > lib/librte_eal/linuxapp/eal/eal_dev.c | 162 +++++++++++++++++++++++++++++++- > > lib/librte_eal/rte_eal_version.map | 2 + > > 12 files changed, 481 insertions(+), 9 deletions(-) > > > > I am glad to see this, hotplug is needed. But have a somewhat controversial > point of view. The DPDK project needs to do more to force users to go to > more modern kernels and API's; there has been too much effort already to > support new DPDK on older kernels and distributions. This leads to higher > testing burden, technical debt and multiple API's. > > To take the extreme point of view. > * igb_uio should be deprecated and all new work only use vfio and vfio-ionommu only > * kni should be deprecated and replaced by virtio +1 I think, The only feature missing in upstream kernel for DPDK may be to enable SRIOV on PF VFIO devices controlled by DPDK PMD. I think, Binding a PF device to DPDK along with VFs(VF can be bound to netdev or DPDK PMDs, Though binding VF to netdev considered as security breach) will be useful for a) rte_flow actions like redirecting the traffic to PF or VF on the given pattern b) Some NICs can support promiscuous mode only on PF c) Enable Switch representation devices https://doc.dpdk.org/guides/prog_guide/switch_representation.html I think, igb_uio mainly used as the backdoor for this use case. I think, there was some work in this area but it is not upstreamed due to various reasons. https://lkml.org/lkml/2018/3/8/1122 > > When there are N ways of doing things against X kernel versions, > and Y distributions, and multiple device vendors; the combinational explosion of cases means > that interfaces don't get the depth of testing they deserve. > > That means why not support hotplug on VFIO only? >