From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9739A749D for ; Thu, 12 Jan 2023 14:13:23 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id B8302C433EF; Thu, 12 Jan 2023 14:13:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1673532803; bh=rFrDWIvBGbD9t20L/+Zu/wpprXz9Z0jNy2KGMg8ABLM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=gVBQ0kJA8RFgm+H9HQuq8s7NNlDrRO2w2ujPQ5vWenKHGYYDEb6KS1jN42n19sK8x HHtcva6f0Ev9B5XWV97RcT2utGjKhJoLEtFDHA5Td4gJpwymqdw9NA6vi4SzidaD0r kIj9UEoEqgYnYE1nA06BDbiz7RrxAHjxLntQ/He8= From: Greg Kroah-Hartman To: stable@vger.kernel.org Cc: Greg Kroah-Hartman , patches@lists.linux.dev, Mike Christie , Keith Busch , Christoph Hellwig , Ming Lei , John Garry , Hannes Reinecke , Adrian Hunter , Bart Van Assche , "Martin K. Petersen" , Sasha Levin Subject: [PATCH 5.10 282/783] scsi: core: Fix a race between scsi_done() and scsi_timeout() Date: Thu, 12 Jan 2023 14:49:58 +0100 Message-Id: <20230112135537.465387208@linuxfoundation.org> X-Mailer: git-send-email 2.39.0 In-Reply-To: <20230112135524.143670746@linuxfoundation.org> References: <20230112135524.143670746@linuxfoundation.org> User-Agent: quilt/0.67 Precedence: bulk X-Mailing-List: patches@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit From: Bart Van Assche [ Upstream commit 978b7922d3dca672b41bb4b8ce6c06ab77112741 ] If there is a race between scsi_done() and scsi_timeout() and if scsi_timeout() loses the race, scsi_timeout() should not reset the request timer. Hence change the return value for this case from BLK_EH_RESET_TIMER into BLK_EH_DONE. Although the block layer holds a reference on a request (req->ref) while calling a timeout handler, restarting the timer (blk_add_timer()) while a request is being completed is racy. Reviewed-by: Mike Christie Cc: Keith Busch Cc: Christoph Hellwig Cc: Ming Lei Cc: John Garry Cc: Hannes Reinecke Reported-by: Adrian Hunter Fixes: 15f73f5b3e59 ("blk-mq: move failure injection out of blk_mq_complete_request") Signed-off-by: Bart Van Assche Link: https://lore.kernel.org/r/20221018202958.1902564-2-bvanassche@acm.org Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/scsi_error.c | 14 +++----------- 1 file changed, 3 insertions(+), 11 deletions(-) diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index f11f51e2465f..0c4bc42b55c2 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -306,19 +306,11 @@ enum blk_eh_timer_return scsi_times_out(struct request *req) if (rtn == BLK_EH_DONE) { /* - * Set the command to complete first in order to prevent a real - * completion from releasing the command while error handling - * is using it. If the command was already completed, then the - * lower level driver beat the timeout handler, and it is safe - * to return without escalating error recovery. - * - * If timeout handling lost the race to a real completion, the - * block layer may ignore that due to a fake timeout injection, - * so return RESET_TIMER to allow error handling another shot - * at this command. + * If scsi_done() has already set SCMD_STATE_COMPLETE, do not + * modify *scmd. */ if (test_and_set_bit(SCMD_STATE_COMPLETE, &scmd->state)) - return BLK_EH_RESET_TIMER; + return BLK_EH_DONE; if (scsi_abort_command(scmd) != SUCCESS) { set_host_byte(scmd, DID_TIME_OUT); scsi_eh_scmd_add(scmd); -- 2.35.1