From mboxrd@z Thu Jan 1 00:00:00 1970 From: Ferruh Yigit Subject: Re: [PATCH] igb_uio: revert open and release operations Date: Thu, 19 Oct 2017 18:15:15 -0700 Message-ID: References: <20171017201436.65270-1-ferruh.yigit@intel.com> <2156540.XK3Rr5JXAh@xps> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit Cc: "dev@dpdk.org" , Jianfeng Tan , Jingjing Wu , "Thotton, Shijith" , Gregory Etelson , George Prekas , "stable@dpdk.org" , Sergio Gonzalez Monroy , "Glynn, Michael J" To: "Patil, Harish" , Thomas Monjalon Return-path: In-Reply-To: Content-Language: en-US List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On 10/19/2017 3:43 PM, Patil, Harish wrote: > -----Original Message----- > From: Harish Patil > Date: Tuesday, October 17, 2017 at 9:50 PM > To: Thomas Monjalon , Ferruh Yigit > > Cc: "dev@dpdk.org" , Jianfeng Tan , > Jingjing Wu , "Thotton, Shijith" > , Gregory Etelson , George > Prekas , "stable@dpdk.org" > Subject: Re: [PATCH] igb_uio: revert open and release operations >> >> >> -----Original Message----- >> From: Thomas Monjalon >> Date: Tuesday, October 17, 2017 at 1:33 PM >> To: Ferruh Yigit , Harish Patil >> >> Cc: "dev@dpdk.org" , Jianfeng Tan , >> Jingjing Wu , "Thotton, Shijith" >> , Gregory Etelson , George >> Prekas , "stable@dpdk.org" >> Subject: Re: [PATCH] igb_uio: revert open and release operations >> >>> 17/10/2017 22:14, Ferruh Yigit: >>>> There were bug reports about terminated application may leave device in >>>> undesired state: >>>> http://dpdk.org/ml/archives/dev/2016-November/049745.html >>>> http://dpdk.org/ml/archives/dev/2016-November/050932.html >>>> >>>> And a proposal to fix: >>>> http://dpdk.org/ml/archives/dev/2016-December/051844.html >>>> >>>> Later another proposal triggered the discussion: >>>> http://dpdk.org/ml/archives/dev/2017-May/066317.html >>>> >>>> Finally a fix patch pushed into v17.08: >>>> Commit: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of >>>> device file") >>>> >>>> Later a regression report sent related to the pushed patch: >>>> http://dpdk.org/ml/archives/dev/2017-September/075236.html >>>> >>>> And a fix for regression integrated into v17.11-rc1: >>>> http://dpdk.org/ml/archives/dev/2017-October/079166.html >>>> Commit: 5f6ff30dc507 ("igb_uio: fix interrupt enablement after FLR in >>>> VM") >>>> Commit: 6b9ed026a870 ("igb_uio: fix build with kernel <= 3.17") >>>> >>>> Even after the fix qede PMD reported to be broken: >>>> http://dpdk.org/ml/archives/dev/2017-October/079359.html >>>> >>>> So this patch reverts original fix and related commits. The related >>>> igb_uio code part turns back to v17.05 base. >>> [...] >>>> --- >>>> It would be nice to solve this issue in LTS release, but being close to >>>> the release and the error report without details makes it hard to work >>>> more on this issue. >>> >>> With this revert, we are back to the original issue. >>> We must really try the proposed solution. >>> Harish, please describe your issue and think how it could be fixed. >>> Jingjing made it work for i40e. >>> I know it is less effort to request a simple revert. >>> Please let's try to fix it once for all. >> >> Hi Ferruh/Thomas, >> I’m discussing it internally, so please hold on committing this patch till >> I revert back to you. >> >> Thanks. > >> [Harish] > >> 1) With the introduction of: >> Commit: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of >> device file”) > We saw failures with qede PF & SR-IOV VF initialization in PCI passthru > mode. > >> >> PF PCI passthru mode initialization failure was resolved by: >> “Commit: 5f6ff30dc507 ("igb_uio: fix interrupt enablement after FLR in >> VM") Thank you for the update. > >> SR-IOV VF PCI passthru mode initialization issue is that PCI FLR and >> related device cleanup is not completed by the time VF driver starts >> loading. It results in the mbox command failure sent over the HW channel >> between VF and PF. This seems same reason why i40e added a check and wait loop. >> >> Even though pci_reset_function() waits for the stipulated amount of time >> per standards, VF FLR takes longer than that and pci_reset_function() & >> igb_uio_open() call returns before FLR completes and VF PMD driver tries >> to load before FLR completes leading to VF PMD initialization failure. >> >> >> We can work around this problem by adding driver delay/retry logic since >> there is no deterministic way of detecting FLR completion. But this is >> going to increase the driver load time. >> > >> 2) With the above patch ("igb_uio: issue FLR during open and release of >> device file), FLR is going to be issued unconditionally on all devices >> during igb_uio_open. We think it’s an over kill. FLR is required only for >> previous abnormal app termination. We already handle the abnormal app >> termination by doing necessary cleanup in the driver during load. This >> cleanup is more efficient as it is done only when required. So we feel >> that the drivers/devices needing such cleanup (the two cases listed >> below) should do it conditionally when required rather than >> igb_uio_open() unconditionally performing FLR all the time. >> >> e.g., > - cdb166963cae ("net/liquidio: add API for VF FLR”) Both 1) and 2) related to the pci_reset during open(). But the main functionality we are looking for is the pci_reset in release(). So we can remove reset during open() [1]. Will disabling pci_reset in open() [2] solve your problems? Jingjing, Jianfeng, What do you think? [1] Perhaps will need to revert or partially revert liquidio patch after this. [2] disable line "pci_reset_function(dev);" in the igbuio_pci_open() http://dpdk.org/browse/dpdk/tree/lib/librte_eal/linuxapp/igb_uio/igb_uio.c#n339 >> >> >> - http://dpdk.org/ml/archives/dev/2017-May/066317.html > > Thanks, > Harish >> >