From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ferruh Yigit
Subject: Re: [PATCH v4] vmxnet3: fix Rx deadlock
Date: Mon, 19 Dec 2016 12:26:21 +0000
Message-ID: <006e2ee9-f586-8b40-a06c-d386833dee10@intel.com>
References: <1481902617-16050-1-git-send-email-stefan.puiu@gmail.com>
 <1482140453-49649-1-git-send-email-stefan.puiu@gmail.com>
 <8e9b361a-566c-2c25-5497-da0ee0e7c818@intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Cc: yongwang@vmware.com, mac_leehk@yahoo.com.hk, dpdk stable
To: Stefan Puiu, dev@dpdk.org
In-Reply-To: <8e9b361a-566c-2c25-5497-da0ee0e7c818@intel.com>
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

On 12/19/2016 10:41 AM, Ferruh Yigit wrote:
> On 12/19/2016 9:40 AM, Stefan Puiu wrote:
>> Our use case is that we have an app that needs to keep mbufs around
>> for a while. We've seen cases where vmxnet3_post_rx_bufs(), called
>> from vmxnet3_recv_pkts(), fails to add mbufs to any Rx descriptors
>> (it returns -err). Since there are no mbufs that the virtual
>> hardware can use, no packets will be received after this; the
>> driver won't refill the mbufs after this, so it gets stuck in this
>> state. I call this a deadlock for lack of a better term - the virtual
>> HW waits for free mbufs, while the app waits for the hardware to
>> notify it of data (by flipping the generation bit on the used Rx
>> descriptors). Note that after this, the app can't recover.
>>
>> This fix is a rework of this patch by Marco Lee:
>> http://dpdk.org/dev/patchwork/patch/6575/. I had to forward-port
>> it, address review comments, and revert the allocation failure
>> handling to the first version of the patch
>> (http://dpdk.org/ml/archives/dev/2015-July/022079.html), since
>> that's the only approach that seems to work, and seems to be what
>> other drivers are doing (I checked ixgbe and em). Reusing the mbuf
>> that's getting passed to the application doesn't seem to make
>> sense, and it was causing weird issues in our app. Also, reusing
>> rxm without checking if it's NULL could cause the code to crash.
>>
>> Signed-off-by: Stefan Puiu
>
> Applied to dpdk-next-net/master, thanks.
>

CC: stable@dpdk.org
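
For anyone reading the archive later, the stuck state described in the quoted commit message can be reproduced with a toy model. This is only a sketch under stated assumptions and is not the vmxnet3 driver code: every toy_* name, RING_SIZE and the retry_when_idle flag are hypothetical, and "retry the refill on a poll that finds nothing" is just one assumed way of breaking the cycle, chosen to make the effect visible.

```c
#include <assert.h>
#include <stdbool.h>

#define RING_SIZE 4 /* hypothetical ring size, kept tiny for the example */

/* Toy stand-ins for the mempool and one Rx ring; these are not DPDK types. */
struct toy_pool { int free_bufs; };
struct toy_ring {
    int posted; /* descriptors holding an empty buffer the device may use */
    int filled; /* descriptors the device has written a packet into */
};

/* Post buffers until the ring is full or the pool runs dry.
 * Returns the number of buffers posted (0 means the refill failed). */
static int toy_post_rx_bufs(struct toy_ring *r, struct toy_pool *p)
{
    int n = 0;

    while (r->posted + r->filled < RING_SIZE && p->free_bufs > 0) {
        p->free_bufs--;
        r->posted++;
        n++;
    }
    return n;
}

/* Device side: a packet can only land in a posted buffer. */
static bool toy_device_deliver(struct toy_ring *r)
{
    if (r->posted == 0)
        return false; /* nothing posted, the packet is lost */
    r->posted--;
    r->filled++;
    return true;
}

/* Driver poll: hand every filled buffer to the app, then refill.
 * With retry_when_idle == false the refill runs only after packets
 * were received, which is the behaviour that gets stuck. */
static int toy_recv_pkts(struct toy_ring *r, struct toy_pool *p,
                         bool retry_when_idle)
{
    int nb_rx = r->filled;

    r->filled = 0; /* these buffers now belong to the app */
    if (nb_rx > 0 || retry_when_idle)
        toy_post_rx_bufs(r, p);
    return nb_rx;
}

/* The app holds every mbuf for a while, frees them later, and then a
 * new packet arrives. Returns how many packets the last poll received. */
static int toy_scenario(bool retry_when_idle)
{
    struct toy_pool pool = { .free_bufs = RING_SIZE };
    struct toy_ring ring = { 0, 0 };
    int held;

    toy_post_rx_bufs(&ring, &pool); /* setup: ring full, pool empty */
    for (int i = 0; i < RING_SIZE; i++)
        toy_device_deliver(&ring);

    /* The app takes all packets; the refill finds an empty pool. */
    held = toy_recv_pkts(&ring, &pool, retry_when_idle);
    assert(held == RING_SIZE && ring.posted == 0);

    pool.free_bufs += held; /* the app finally frees its mbufs */

    toy_recv_pkts(&ring, &pool, retry_when_idle); /* idle poll */
    toy_device_deliver(&ring);                    /* new packet arrives */
    return toy_recv_pkts(&ring, &pool, retry_when_idle);
}
```

Calling toy_scenario(false) models the stuck case: the pool has buffers again, but no poll ever sees a packet, so the refill never runs and the ring stays empty. Calling toy_scenario(true) lets the idle poll repost buffers, and the next packet gets through.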