From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 701F4C4332F for ; Tue, 22 Nov 2022 02:29:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231927AbiKVC3K (ORCPT ); Mon, 21 Nov 2022 21:29:10 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55770 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231935AbiKVC2d (ORCPT ); Mon, 21 Nov 2022 21:28:33 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1D738286E4 for ; Mon, 21 Nov 2022 18:28:32 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id AB09061541 for ; Tue, 22 Nov 2022 02:28:31 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id F02BEC433D6; Tue, 22 Nov 2022 02:28:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1669084111; bh=YrDyNjEVRCzAIvoCedWCx29+9d9trcugxXtVTwgOanE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=CLy8oO/pxDZaTZ0AfC7VhcdIMtgKOKpqPJyOanilFFn0qtjHHL0i8recBulvVEI21 oas66UTpFlwVdQkg4C/2ugq3OiYrOnRblhAgAzpTWZDegetrlO63oSF/tz52SdPzxS +zOCsV09eTpSVnsdHd/n8/agW72QO+AAyqoYyqbqtpVU5hbpjo/WfvZcjyuAYfH9Kg 4jFLS5U3vVq62k6rGrUJfr6mdpLXyhDMwmWnzBP3old5nEIC0EHcdrk3PCdAO8BvfD nCdEU1+AAqGyuXKrkMTbwPFblpZIPkSxGCs1aMIxtbrrZp45pHCpo1tqEKSFyvM6IK D3a5ggmMoOl8g== From: Saeed Mahameed To: "David S. Miller" , Jakub Kicinski , Paolo Abeni , Eric Dumazet Cc: Saeed Mahameed , netdev@vger.kernel.org, Tariq Toukan , Moshe Shemesh , Aya Levin Subject: [net 08/14] net/mlx5: Fix sync reset event handler error flow Date: Mon, 21 Nov 2022 18:25:53 -0800 Message-Id: <20221122022559.89459-9-saeed@kernel.org> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221122022559.89459-1-saeed@kernel.org> References: <20221122022559.89459-1-saeed@kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Moshe Shemesh When sync reset now event handling fails on mlx5_pci_link_toggle() then no reset was done. However, since mlx5_cmd_fast_teardown_hca() was already done, the firmware function is closed and the driver is left without firmware functionality. Fix it by setting device error state and reopen the firmware resources. Reopening is done by the thread that was called for devlink reload fw_activate as it already holds the devlink lock. Fixes: 5ec697446f46 ("net/mlx5: Add support for devlink reload action fw activate") Signed-off-by: Moshe Shemesh Reviewed-by: Aya Levin Signed-off-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c index 9d908a0ccfef..1e46f9afa40e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c @@ -9,7 +9,8 @@ enum { MLX5_FW_RESET_FLAGS_RESET_REQUESTED, MLX5_FW_RESET_FLAGS_NACK_RESET_REQUEST, MLX5_FW_RESET_FLAGS_PENDING_COMP, - MLX5_FW_RESET_FLAGS_DROP_NEW_REQUESTS + MLX5_FW_RESET_FLAGS_DROP_NEW_REQUESTS, + MLX5_FW_RESET_FLAGS_RELOAD_REQUIRED }; struct mlx5_fw_reset { @@ -406,7 +407,7 @@ static void mlx5_sync_reset_now_event(struct work_struct *work) err = mlx5_pci_link_toggle(dev); if (err) { mlx5_core_warn(dev, "mlx5_pci_link_toggle failed, no reset done, err %d\n", err); - goto done; + set_bit(MLX5_FW_RESET_FLAGS_RELOAD_REQUIRED, &fw_reset->reset_flags); } mlx5_enter_error_state(dev, true); @@ -482,6 +483,10 @@ int mlx5_fw_reset_wait_reset_done(struct mlx5_core_dev *dev) goto out; } err = fw_reset->ret; + if (test_and_clear_bit(MLX5_FW_RESET_FLAGS_RELOAD_REQUIRED, &fw_reset->reset_flags)) { + mlx5_unload_one_devl_locked(dev); + mlx5_load_one_devl_locked(dev, false); + } out: clear_bit(MLX5_FW_RESET_FLAGS_PENDING_COMP, &fw_reset->reset_flags); return err; -- 2.38.1