From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-by2nam03on0102.outbound.protection.outlook.com ([104.47.42.102]:48448 "EHLO NAM03-BY2-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1754094AbeDIA1V (ORCPT ); Sun, 8 Apr 2018 20:27:21 -0400 From: Sasha Levin To: "stable@vger.kernel.org" , "linux-kernel@vger.kernel.org" CC: Netanel Belgazal , "David S . Miller" , Sasha Levin Subject: [PATCH AUTOSEL for 4.9 032/293] net: ena: fix rare uncompleted admin command false alarm Date: Mon, 9 Apr 2018 00:23:15 +0000 Message-ID: <20180409002239.163177-32-alexander.levin@microsoft.com> References: <20180409002239.163177-1-alexander.levin@microsoft.com> In-Reply-To: <20180409002239.163177-1-alexander.levin@microsoft.com> Content-Language: en-US Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Sender: stable-owner@vger.kernel.org List-ID: From: Netanel Belgazal [ Upstream commit a77c1aafcc906f657d1a0890c1d898be9ee1d5c9 ] The current flow to detect admin completion is: while (command_not_completed) { if (timeout) error check_for_completion() sleep() } So in case the sleep took more than the timeout (in case the thread/workqueue was not scheduled due to higher priority task or prolonged VMexit), the driver can detect a stall even if the completion is present. The fix changes the order of this function to first check for completion and only after that check if the timeout expired. Fixes: 1738cd3ed342 ("Add a driver for Amazon Elastic Network Adapters (ENA= )") Signed-off-by: Netanel Belgazal Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/ethernet/amazon/ena/ena_com.c | 21 +++++++++++---------- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethern= et/amazon/ena/ena_com.c index e2512ab41168..564b556baba9 100644 --- a/drivers/net/ethernet/amazon/ena/ena_com.c +++ b/drivers/net/ethernet/amazon/ena/ena_com.c @@ -508,15 +508,20 @@ static int ena_com_comp_status_to_errno(u8 comp_statu= s) static int ena_com_wait_and_process_admin_cq_polling(struct ena_comp_ctx *= comp_ctx, struct ena_com_admin_queue *admin_queue) { - unsigned long flags; - u32 start_time; + unsigned long flags, timeout; int ret; =20 - start_time =3D ((u32)jiffies_to_usecs(jiffies)); + timeout =3D jiffies + ADMIN_CMD_TIMEOUT_US; + + while (1) { + spin_lock_irqsave(&admin_queue->q_lock, flags); + ena_com_handle_admin_completion(admin_queue); + spin_unlock_irqrestore(&admin_queue->q_lock, flags); =20 - while (comp_ctx->status =3D=3D ENA_CMD_SUBMITTED) { - if ((((u32)jiffies_to_usecs(jiffies)) - start_time) > - ADMIN_CMD_TIMEOUT_US) { + if (comp_ctx->status !=3D ENA_CMD_SUBMITTED) + break; + + if (time_is_before_jiffies(timeout)) { pr_err("Wait for completion (polling) timeout\n"); /* ENA didn't have any completion */ spin_lock_irqsave(&admin_queue->q_lock, flags); @@ -528,10 +533,6 @@ static int ena_com_wait_and_process_admin_cq_polling(s= truct ena_comp_ctx *comp_c goto err; } =20 - spin_lock_irqsave(&admin_queue->q_lock, flags); - ena_com_handle_admin_completion(admin_queue); - spin_unlock_irqrestore(&admin_queue->q_lock, flags); - msleep(100); } =20 --=20 2.15.1