From: Yongseok Koh
Subject: Re: Mellanox ConnectX-5 crashes and mbuf leak
Date: Tue, 10 Oct 2017 14:28:26 +0000
Message-ID: <7DB3162E-9A29-4668-BCB3-693CCB4D772A@mellanox.com>
To: Martin Weiser
Cc: Adrien Mazarguil, Nélio Laranjeiro, dev@dpdk.org, Ferruh Yigit
List-Id: DPDK patches and discussions

Glad to hear that helped! Then the patch will get merged soon.

Thanks,
Yongseok

> On Oct 10, 2017, at 1:10 AM, Martin Weiser wrote:
>
> Hi Yongseok,
>
> I can confirm that this patch fixes the crashes and freezes in my tests
> so far.
>
> We still see an issue: once mbufs run low, and reference counts are in
> use while mbufs are also being freed on the processing lcores, we
> suddenly lose a large number of mbufs that never return to the pool.
> But I can also reproduce this with ixgbe, so it is not specific to the
> mlx5 driver but rather an issue of the current dpdk-net-next state. I
> will write up a separate mail with details on how to reproduce it.
>
> Thank you for your support!
>
> Best regards,
> Martin
>
>
> On 08.10.17 00:19, Yongseok Koh wrote:
>>> On Oct 6, 2017, at 3:30 PM, Yongseok Koh wrote:
>>>
>>> Hi, Martin
>>>
>>> Even though I had done quite serious tests before sending out the patch,
>>> I found that a deadlock could happen if the Rx queue size is smaller. It is 128
>>> by default in testpmd, while I usually use 256.
>>>
>>> I've fixed the bug and submitted a new patch [1], which actually reverts the
>>> previous patch. So you can apply the attached one and disregard the old one.
>>>
>>> I have also done extensive tests on this new patch, but please let me know
>>> your test results.
>>>
>>> [1]
>>> "net/mlx5: fix deadlock due to buffered slots in Rx SW ring"
>>> at http://dpdk.org/dev/patchwork/patch/29847
>> Hi Martin,
>>
>> I've submitted v2 of the patch [1]. I just replaced the vector instructions with
>> regular statements. This is just for ease of maintenance, because I'm about to
>> add a vectorized PMD for ARM NEON. In terms of functionality and performance it
>> is identical.
>>
>> Please proceed with your testing on this one and let me know the result.
>>
>> [1]
>> [dpdk-dev,v2] net/mlx5: fix deadlock due to buffered slots in Rx SW ring
>> at http://dpdk.org/dev/patchwork/patch/29879/
>>
>> Thanks,
>> Yongseok