From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Trahe, Fiona"
Subject: Re: [dpdk-dev, RFC] drivers: advertise kmod dependencies in pmdinfo
Date: Fri, 2 Sep 2016 13:52:46 +0000
Message-ID: <348A99DA5F5B7549AA880327E580B4358909D016@IRSMSX101.ger.corp.intel.com>
References: <1472217646-26219-1-git-send-email-olivier.matz@6wind.com>
 <20160830132352.GB30977@hmsreliant.think-freely.org>
 <48f9320b-9402-0ecd-8971-c3785778081a@6wind.com>
 <20160831132709.GA32000@hmsreliant.think-freely.org>
 <54a0164e-b242-b930-ec91-60f91b700119@6wind.com>
 <348A99DA5F5B7549AA880327E580B4358909A43A@IRSMSX101.ger.corp.intel.com>
 <20160901173519.GA11132@hmsreliant.think-freely.org>
 <20160901104122.41c131be@xeon-e3>
 <20160901191538.GB11132@hmsreliant.think-freely.org>
 <348A99DA5F5B7549AA880327E580B4358909CE45@IRSMSX101.ger.corp.intel.com>
 <20160902133327.GA980@hmsreliant.think-freely.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Cc: Stephen Hemminger, "dev@dpdk.org", Olivier Matz, Thomas Monjalon,
 "Trahe, Fiona"
To: Neil Horman
Received: from mga11.intel.com (mga11.intel.com [192.55.52.93])
 by dpdk.org (Postfix) with ESMTP id CFE484B79;
 Fri, 2 Sep 2016 15:52:50 +0200 (CEST)
In-Reply-To: <20160902133327.GA980@hmsreliant.think-freely.org>
Content-Language: en-US
List-Id: patches and discussions about DPDK
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Friday, September 2, 2016 2:33 PM
> To: Trahe, Fiona
> Cc: Stephen Hemminger; dev@dpdk.org;
> Olivier Matz; Thomas Monjalon
> Subject: Re: [dpdk-dev] [dpdk-dev, RFC] drivers: advertise kmod dependencies
> in pmdinfo
>
> On Fri, Sep 02, 2016 at 09:19:26AM +0000, Trahe, Fiona wrote:
> >
> >
> > > -----Original Message-----
> > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > Sent: Thursday,
September 1, 2016 8:16 PM
> > > To: Stephen Hemminger
> > > Cc: Trahe, Fiona; dev@dpdk.org; Olivier Matz;
> > > Thomas Monjalon
> > > Subject: Re: [dpdk-dev] [dpdk-dev, RFC] drivers: advertise kmod
> > > dependencies in pmdinfo
> > >
> > > On Thu, Sep 01, 2016 at 10:41:22AM -0700, Stephen Hemminger wrote:
> > > > On Thu, 1 Sep 2016 13:35:19 -0400
> > > > Neil Horman wrote:
> > > >
> > > > > On Thu, Sep 01, 2016 at 12:55:27PM +0000, Trahe, Fiona wrote:
> > > > > > Hi Neil and Olivier,
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Olivier
> > > > > > > Matz
> > > > > > > Sent: Wednesday, August 31, 2016 2:40 PM
> > > > > > > To: Neil Horman
> > > > > > > Cc: dev@dpdk.org; thomas.monjalon@6wind.com
> > > > > > > Subject: Re: [dpdk-dev] [dpdk-dev, RFC] drivers: advertise
> > > > > > > kmod dependencies in pmdinfo
> > > > > > >
> > > > > > > Hi Neil,
> > > > > > >
> > > > > > > On 08/31/2016 03:27 PM, Neil Horman wrote:
> > > > > > > > On Wed, Aug 31, 2016 at 11:21:18AM +0200, Olivier Matz wrote:
> > > > > > > >> Hi Neil,
> > > > > > > >>
> > > > > > > >> On 08/30/2016 03:23 PM, Neil Horman wrote:
> > > > > > > >>> On Fri, Aug 26, 2016 at 03:20:46PM +0200, Olivier Matz wrote:
> > > > > > > >>>> Add a new macro DRIVER_REGISTER_KMOD_DEP() that allows
> > > > > > > >>>> a driver to declare the list of kernel modules required
> > > > > > > >>>> to run properly.
> > > > > > > >>>>
> > > > > > > >>>> Today, most PCI drivers require uio/vfio.
> > > > > > > >>>>
> > > > > > > >>>> Signed-off-by: Olivier Matz
> > > > > > > >>>>
> > > > > > > >>>> ---
> > > > > > > >>>> In this RFC, I supposed that all PCI drivers require
> > > > > > > >>>> the loading of a uio/vfio module (except mlx*), this may
> > > > > > > >>>> be wrong.
> > > > > > > >>>> Comments are welcome!
> > > > > > > >>>>
> > > > > > > >>>>
> > > > > > > >>>>  buildtools/pmdinfogen/pmdinfogen.c      |  1 +
> > > > > > > >>>>  buildtools/pmdinfogen/pmdinfogen.h      |  1 +
> > > > > > > >>>>  drivers/crypto/qat/rte_qat_cryptodev.c  |  2 ++
> > > > > > > >>>>  drivers/net/bnx2x/bnx2x_ethdev.c        |  4 ++++
> > > > > > > >>>>  drivers/net/bnxt/bnxt_ethdev.c          |  2 ++
> > > > > > > >>>>  drivers/net/cxgbe/cxgbe_ethdev.c        |  2 ++
> > > > > > > >>>>  drivers/net/e1000/em_ethdev.c           |  2 ++
> > > > > > > >>>>  drivers/net/e1000/igb_ethdev.c          |  4 ++++
> > > > > > > >>>>  drivers/net/ena/ena_ethdev.c            |  2 ++
> > > > > > > >>>>  drivers/net/enic/enic_ethdev.c          |  2 ++
> > > > > > > >>>>  drivers/net/fm10k/fm10k_ethdev.c        |  2 ++
> > > > > > > >>>>  drivers/net/i40e/i40e_ethdev.c          |  2 ++
> > > > > > > >>>>  drivers/net/i40e/i40e_ethdev_vf.c       |  2 ++
> > > > > > > >>>>  drivers/net/ixgbe/ixgbe_ethdev.c        |  4 ++++
> > > > > > > >>>>  drivers/net/mlx4/mlx4.c                 |  2 ++
> > > > > > > >>>>  drivers/net/mlx5/mlx5.c                 |  3 +++
> > > > > > > >>>>  drivers/net/nfp/nfp_net.c               |  2 ++
> > > > > > > >>>>  drivers/net/qede/qede_ethdev.c          |  4 ++++
> > > > > > > >>>>  drivers/net/szedata2/rte_eth_szedata2.c |  2 ++
> > > > > > > >>>>  drivers/net/thunderx/nicvf_ethdev.c     |  2 ++
> > > > > > > >>>>  drivers/net/virtio/virtio_ethdev.c      |  2 ++
> > > > > > > >>>>  drivers/net/vmxnet3/vmxnet3_ethdev.c    |  2 ++
> > > > > > > >>>>  lib/librte_eal/common/include/rte_dev.h | 14 ++++++++++++++
> > > > > > > >>>>  tools/dpdk-pmdinfo.py                   |  5 ++++-
> > > > > > > >>>>  24 files changed, 69 insertions(+), 1 deletion(-)
> > > > > > > >>>>
> > > > > > > >>>
> > > > > > > >>> Generally speaking, I like the idea, it makes sense to
> > > > > > > >>> me in terms of using pmdinfo to export this information
> > > > > > > >>>
> > > > > > > >>> That said, This may need to be a set of macros.
By that
> > > > > > > >>> I mean (and correct me if I'm wrong here) that the
> > > > > > > >>> relationship between PMDs and kernel modules is, in
> > > > > > > >>> some cases, more complex than a 'requires' or 'depends'
> > > > > > > >>> relationship. That is to say, some PMDs may need user
> > > > > > > >>> space hardware access, but can use either uio OR vfio,
> > > > > > > >>> don't need both, and can continue to function if only
> > > > > > > >>> one is available. Other PMDs may be able to use vfio or
> > > > > > > >>> uio, but can still function without either. And some, as
> > > > > > > >>> your patch implements, simply require one or the other
> > > > > > > >>> to function. As such it seems like you may want a few
> > > > > > > >>> macros, in the form of:
> > > > > > > >>>
> > > > > > > >>> DRIVER_REGISTER_KMOD_REQUEST - List of modules to
> > > > > > > >>> attempt loading, ignore any failures
> > > > > > > >>> DRIVER_REGISTER_KMOD_REQUIRE - List of modules required
> > > > > > > >>> to be loaded after request macro completes, fail if any
> > > > > > > >>> are not loaded
> > > > > > > >>>
> > > > > > > >>> That's just spitballing, mind you, there's probably a
> > > > > > > >>> better way to do it, but the idea is to list a set of
> > > > > > > >>> modules you would like to have, and then create a
> > > > > > > >>> parsable syntax to describe the modules that need to be
> > > > > > > >>> loaded after the request is complete so that you can
> > > > > > > >>> accurately codify the situations I described above.
> > > > > > > >>
> > > > > > > >> Thank you for your feedback.
> > > > > > > >> However, I'm not sure I'm perfectly getting what you suggest.
> > > > > > > >>
> > > > > > > >> Do you think some PMDs could request a kernel module
> > > > > > > >> without really requiring it? Do you have an example in mind?
> > > > > > > >>
> > > > > > > > Yes, that's precisely it. The clearest example I could
> > > > > > > > think of (though I'm not sure if any pmd currently
> > > > > > > > supports this) is a pmd that supports both UIO and VFIO
> > > > > > > > communication with the kernel. Such a PMD requires that
> > > > > > > > one of those two modules be loaded, but only one (i.e.
> > > > > > > > both are not required), so if only the uio kernel module
> > > > > > > > loads, that is a success case; likewise, if only the vfio
> > > > > > > > module loads, that can be treated as success. Both
> > > > > > > > loading is clearly successful. Only if neither loads do
> > > > > > > > we have a failure case. I'm suggesting that the grammar
> > > > > > > > that your exports define should take those cases into
> > > > > > > > account. It's not always as simple as "I must have the
> > > > > > > > following modules"
> > > > > > > >
> > > > > > > >> The syntax I've submitted lets you define several lists
> > > > > > > >> of modules, so that the user or the script that starts
> > > > > > > >> the application can decide which kmod list is better
> > > > > > > >> according to the environment.
> > > > > > > >>
> > > > > > > > If you have a human intervening in the module load
> > > > > > > > process, sure, then it's fine. But it seems that this
> > > > > > > > particular feature that you're implementing might have
> > > > > > > > automated uses. That is to say, the dpdk core library
> > > > > > > > might be interested in parsing this particular
> > > > > > > > information to direct module autoloading, and if that's
> > > > > > > > desirable then you need to define these lists such that
> > > > > > > > you can codify failure and success conditions.
> > > > > > > >
> > > > > > > >> For example, most drivers will advertise
> > > > > > > >> "uio,igb_uio:uio,uio_pci_generic:vfio,vfio-pci", and the
> > > > > > > >> user or script will have to choose between loading:
> > > > > > > >> - uio igb_uio
> > > > > > > >> - uio uio_pci_generic
> > > > > > > >> - vfio vfio-pci
> > > > > > > >>
> > > > > > > > Oh, I see, so your list is a colon delimited list of
> > > > > > > > module load sets, where at least one set must succeed by
> > > > > > > > loading all modules in its set, but the failure of any one
> > > > > > > > set isn't fatal to the process? e.g. a string like this:
> > > > > > > >
> > > > > > > > uio,igb_uio:vfio,vfio-pci
> > > > > > > >
> > > > > > > > could be interpreted to mean "I must load (uio AND
> > > > > > > > igb_uio) OR (vfio AND vfio-pci)". If the evaluation of
> > > > > > > > that statement results in false, then the operation fails,
> > > > > > > > otherwise it succeeds.
> > > > > > > >
> > > > > > > > If that's the case, then, apologies, we're on the same
> > > > > > > > page, and this will work just fine.
> > > > > > >
> > > > > > > Yep, that's the idea.
> > > > > > >
> > > > > > > Colon and commas are the best separators I've thought about,
> > > > > > > but any idea to make the syntax clearer is welcome ;)
> > > > > > >
> > > > > > > Maybe a syntax like this is clearer:
> > > > > > > "(mod1 & mod2)|(mod3 & mod4)" ?
> > > > > > > But it would let the user think that more complex
> > > > > > > expressions are valid, like "(mod1 & (mod2 | mod3)) | mod4",
> > > > > > > which is probably overkill.
> > > > > > >
> > > > > > > Regards,
> > > > > > > Olivier
> > > > > >
> > > > > > This RFC seems like a good idea - and something the Intel
> > > > > > QuickAssist PMD could benefit from.
> > > > > > However the (mod1 & mod2) can handle the QAT case better in my
> > > > > > opinion.
> > > > > > i.e.
> > > > > > as well as needing one of
> > > > > > * uio igb_uio
> > > > > > * uio uio_pci_generic
> > > > > > * vfio vfio-pci
> > > > > > QAT PMD also needs one of (depending on which physical device
> > > > > > is plugged)
> > > > > > * qat_dh895xcc
> > > > > > * qat_c62x
> > > > > > * qat_c3xxx
> > > > > >
> > > > > > So the original syntax would result in a very long list of
> > > > > > possible variations.
> > > > > > What really reflects the dependencies would be ((uio &
> > > > > > igb_uio) | (uio & uio_pci_generic) | (vfio & vfio_pci)) &
> > > > > > (qat_dh895xcc | qat_c62x | qat_c3xxx)
> > > > > >
> > > > > Ah, I didn't consider that hardware specifics might create a use
> > > > > case where a pmd must have one or more kernel modules available
> > > > > for hw support. Perhaps it is worthwhile to automate hardware
> > > > > support - that is to say, any module loading script should
> > > > > automatically look at the pci table exported from a pmd, and, if
> > > > > found, load any modules that claim support for that
> > > > > device:vendor tuple? Though that might break in the case of
> > > > > uio, if there are separate driver modules that support native
> > > > > hardware and uio access.
> >
> > Actually if the script output was intended to be used to auto-load
> > dependent kmods, then even the above would not suffice for the QAT
> > driver (and presumably for other PMDs with specific HW dependencies).
> > i.e. the qat_dhxxxx modules have further dependencies themselves on an
> > intel_qat module, and there are other steps documented in the
> But any dependency chain such as what you describe is covered in the next step
> of the chain.  That is to say if the qat pmd has a hardware dependency on
> qat_dhxxx (or qat_cxxx, etc), and those modules depend on intel_qat, the pmd
> doesn't need to know that, because qat_dhxxx and companions should all list
> intel_qat as a dependency that modprobe will resolve when installing the kernel
> module.
>
> > guide which must be taken after loading the kmods.
> I'm not sure what you mean by this.  Are you referring to the qat
> documentation that comes with the DPDK?  I only see three additional items
> there to address
>
> 1) Removing other modules when using the 01.org kernel modules
>
> 2) installation of firmware
>
> 3) Binding of the device to user space for VFIO/UIO
>
> All three of these tasks fall outside the scope of what this macro is meant to do.
> We could try to create macros for them to export information for use in a
> loading script if you like, but I wouldn't.  All three of the above items fall in my
> mind under the category of administrative responsibilities.  That is to say, they
> are orthogonal to defining a module dependency structure, and if they
> aren't properly completed, the module dependency chain won't matter anyway.
>
Another manual step is documented, which must be done after insmod:
echo 32 > /sys/bus/pci/drivers/dh895xcc/0000\:03\:00.0/sriov_numvfs
(steps will vary for different hardware types)
I agree that this, like the others, is outside the scope of what this macro is
meant to do.
So using the macro to facilitate auto-loading of modules isn't a very useful
feature for the QAT driver.

> > The use-case I'd addressed was for the script to identify and just
> > throw an error where dependent modules are missing.
> >
>
> That doesn't really add much value then, since missing modules already result in
> errors when the PMD tries to initialize.
>
> > I don't see a simple solution, but also don't see a strong need to find one.
> > Documentation and if necessary a driver-specific script seem sufficient to me.
> >
> > My conclusion is the RFC is a nice feature for some drivers, but if introduced
> > needs to be optional as it doesn't handle the complexities of all drivers.
> >
>
> I agree it's an optional export.
If there are no dependencies, or if the author
> wishes to simply not supply any, that's fine, the results will be in
> accordance with that, but I strongly disagree that its being optional implies
> that we can ignore the complexities of the dependencies that can be exported.
>
> The more I think about it the more I like Stephen's idea, possibly with some
> macro assistance.  That is to say:
>
> 1) Start by loading hardware specific modules, the information for which is
> already available.  You can parse the pci table that a pmd exports and match it
> with the pci aliases retrieved via modinfo
>
> 2) Load a special virt driver if no hardware is found on the system in (1).
> Special virt drivers might be worth tagging with a VIRT/VFIO/UIO tag export for
> pmdinfo
>
> That allows us to set aside the complexities of our dependency chain, as we can
> assume hardware support modules will codify any real dependencies there, and a
> VIRT tag will let us find any modules needed for hardware that is assigned into
> our guest.
>
> Neil
>
> > > >
> > > > I ended up writing a script that went the other way.
> > > > First look at the hardware and load VFIO if IOMMU is available.
> > > > Then look for special driver needed for Xen and HyperV.
> > > > Lastly fallback to loading igb_uio if no VFIO and PCI device present.
> > > >
> > > > In other words it is a system not driver issue.
> > > >
> > > That sounds like a reasonable approach, yes.
> > > Neil
> > >
> > >
> > >
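
For reference, the colon/comma syntax discussed in the thread
("uio,igb_uio:uio,uio_pci_generic:vfio,vfio-pci", read as an OR of AND-sets)
can be checked with a few lines of Python. This is only a rough sketch, not
code from the RFC or from dpdk-pmdinfo.py: the function names are
hypothetical, and passing several strings that must all hold (to cover the
QAT case of "one userspace I/O set AND one qat_* module") is an assumption
about how the syntax might be extended.

```python
def parse_kmod_deps(spec):
    """Split 'a,b:c,d' into [['a', 'b'], ['c', 'd']] (colon = OR, comma = AND)."""
    return [[m.strip() for m in group.split(",") if m.strip()]
            for group in spec.split(":") if group.strip()]


def normalize(name):
    # Loaded modules are listed with underscores (vfio_pci), while the
    # dependency string may use hyphens (vfio-pci); compare them alike.
    return name.replace("-", "_")


def deps_satisfied(spec, loaded):
    """True if every module of at least one set is in `loaded`."""
    loaded_norm = {normalize(m) for m in loaded}
    return any(all(normalize(m) in loaded_norm for m in group)
               for group in parse_kmod_deps(spec))


def all_satisfied(specs, loaded):
    """Hypothetical extension: every dependency string must be satisfied."""
    return all(deps_satisfied(spec, loaded) for spec in specs)


# Example strings taken from the thread; the QAT one is written here in the
# same colon syntax purely for illustration.
PCI_SPEC = "uio,igb_uio:uio,uio_pci_generic:vfio,vfio-pci"
QAT_SPEC = "qat_dh895xcc:qat_c62x:qat_c3xxx"

if __name__ == "__main__":
    loaded = {"uio", "igb_uio", "qat_c62x"}
    print(all_satisfied([PCI_SPEC, QAT_SPEC], loaded))
```

Keeping two flat strings avoids the nested-expression parser that Olivier
called overkill, while still expressing the (A | B | C) & (X | Y | Z) shape of
the QAT dependencies. It only reports missing modules; it does not cover the
firmware, binding or sriov_numvfs steps mentioned above.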