Date: Wed, 13 Oct 2021 21:08:01 +0300
From: Konstantin Shelekhin
To: Mike Christie
Cc: Maurizio Lombardi
Subject: Re: iSCSI Abort Task and WRITE PENDING
In-Reply-To: <17647b68-f039-3fc3-808e-0feb652ddf8c@oracle.com>
References: <17647b68-f039-3fc3-808e-0feb652ddf8c@oracle.com>
X-Mailing-List: target-devel@vger.kernel.org

On Wed, Oct 13, 2021 at 12:51:32PM -0500, Mike Christie wrote:
> On 10/13/21 8:21 AM, Konstantin Shelekhin wrote:
> > Hi,
> >
> > I really need the collective wisdom.
> >
> > Not long ago we uncovered a problem with iSCSI ABORT TASK handling.
> > Currently it is not possible to abort a WRITE_10 command in the
> > TRANSPORT_WRITE_PENDING state, because the ABORT TASK will hang itself
> > in the process:
> >
> > # dmesg | tail -2
> > [ 83.563505] ABORT_TASK: Found referenced iSCSI task_tag: 3372979269
> > [ 84.593545] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus
> >
> > # ps aux | awk '$8 ~/D/'
> > root        32  0.0  0.0      0     0 ?        D    15:19   0:00 [kworker/0:1+events]
> > root      1187  0.0  0.0      0     0 ?        D    15:20   0:00 [iscsi_ttx]
> >
> > # cat /proc/32/stack
> > [<0>] target_put_cmd_and_wait+0x68/0xa0
> > [<0>] core_tmr_abort_task.cold+0x16b/0x192
> > [<0>] target_tmr_work+0x9e/0xe0
> > [<0>] process_one_work+0x1d4/0x370
> > [<0>] worker_thread+0x48/0x3d0
> > [<0>] kthread+0x122/0x140
> > [<0>] ret_from_fork+0x22/0x30
> >
> > # cat /proc/1187/stack
> > [<0>] __transport_wait_for_tasks+0xaf/0x100
> > [<0>] transport_generic_free_cmd+0xe9/0x180
> > [<0>] iscsit_free_cmd+0x50/0xb0
> > [<0>] iscsit_close_connection+0x47d/0x8c0
> > [<0>] iscsit_take_action_for_connection_exit+0x6f/0xf0
> > [<0>] iscsi_target_tx_thread+0x184/0x200
> > [<0>] kthread+0x122/0x140
> > [<0>] ret_from_fork+0x22/0x30
> >
> > What happens:
> >
> > 1. Initiator sends a WRITE_10 CDB
> > 2. Target parses the CDB and sends an R2T
> > 3. Target starts the Data-Out timer
> > 4. Initiator sends ABORT TASK; no new data from the Initiator after this
> > 5. Target starts aborting the WRITE_10, gets into core_tmr_abort_task()
> >    and starts waiting for the request to complete
> > 6. Nothing happens
> > 7. The Data-Out timer expires, connection teardown starts and gets
> >    stuck waiting for the ABORT TASK, which is waiting for the WRITE_10
> >
> > The ABORT TASK processing looks roughly like this:
> >
> >   iscsi_rx_opcode
> >     iscsi_handle_task_mgt_cmd
> >       iscsi_tmr_abort_task
> >       transport_generic_handle_tmr
> >         if (tmr_cmd->transport_state & CMD_T_ABORTED)
> >           target_handle_abort
> >         else
> >           target_tmr_work
> >             if (tmr_cmd->transport_state & CMD_T_ABORTED)
> >               target_handle_abort
> >             else
> >               core_tmr_abort_task
> >                 ret = __target_check_io_state
> >                   if (write_cmd->transport_state & CMD_T_STOP)
> >                     return -1
> >                   write_cmd->transport_state |= CMD_T_ABORTED
> >                   return 0
> >                 if (!ret)
> >                   list_move_tail(&write_cmd->state_list, &aborted)
> >                   target_put_cmd_and_wait(&write_cmd)
> >
> > As I see it, the main problem is that the abort path cannot initiate
> > the command termination; it simply waits for the request to handle this
> > on the execution path, as in target_execute_cmd():
> >
> >   target_execute_cmd
> >     target_cmd_interrupted
> >       INIT_WORK(&cmd->work, target_abort_work)
> >
> > However, in this case the request is not going to be executed, because
> > the Initiator will not send the Data-Out buffer.
> >
> > I have a couple of ideas on how to fix this, but they all look kinda
> > ugly. The one that currently works around this for me:
> >
> >   core_tmr_abort_task():
> >
> >   [...]
> >
> >   spin_lock_irqsave(&se_cmd->t_state_lock, flags);
> >   write_pending = se_cmd->t_state == TRANSPORT_WRITE_PENDING;
> >   spin_unlock_irqrestore(&se_cmd->t_state_lock, flags);
> >
> >   if (write_pending && se_cmd->se_tfo->abort_write_pending)
> >           se_cmd->se_tfo->abort_write_pending(se_cmd);
> >
> >   target_put_cmd_and_wait(se_cmd);
> >
> >   [...]
> >
> > The new method abort_write_pending() is defined only for iSCSI and
> > calls target_handle_abort(). However, this opens up another can of
> > worms, because this code heavily races with R2T sending and requires a
> > couple of checks to "work most of the time". Far from ideal.
> >
> > I can make this one better by introducing R2T list draining that
> > ensures the proper order during cleanup, but maybe there is a much
> > easier way that I'm not seeing here.
>
> Ccing Maurizio to make sure I don't add his original bug back.
>
> If I understand you, I think I added this bug in:
>
> commit f36199355c64a39fe82cfddc7623d827c7e050da
> Author: Mike Christie
> Date:   Fri Nov 13 19:46:18 2020 -0600
>
>     scsi: target: iscsi: Fix cmd abort fabric stop race
>
> With that patch, if the abort or a LUN reset has got to LIO core, then
> we are going to be stuck waiting for the data, which won't come because
> we killed the iSCSI threads.
>
> We can go back to always having the iSCSI target clean up the cmd, but
> if LIO has started to abort the cmd, we take an extra ref so we don't
> free the cmd from under each other.
>
> This patch is completely untested:
>
> diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
> index 2c54c5d8412d..d221e9be7468 100644
> --- a/drivers/target/iscsi/iscsi_target.c
> +++ b/drivers/target/iscsi/iscsi_target.c
> @@ -4090,12 +4090,13 @@ static void iscsit_release_commands_from_conn(struct iscsi_conn *conn)
>                  spin_lock_irq(&se_cmd->t_state_lock);
>                  if (se_cmd->transport_state & CMD_T_ABORTED) {
>                          /*
> -                         * LIO's abort path owns the cleanup for this,
> -                         * so put it back on the list and let
> -                         * aborted_task handle it.
> +                         * The LIO TMR handler owns the cmd but if
> +                         * we were waiting for data from the initiator
> +                         * then we need to internally cleanup to be
> +                         * able to complete it. Get an extra ref so
> +                         * we don't free the cmd from under LIO core.
>                          */
> -                        list_move_tail(&cmd->i_conn_node,
> -                                       &conn->conn_cmd_list);
> +                        target_get_sess_cmd(se_cmd, false);
>                  } else {
>                          se_cmd->transport_state |= CMD_T_FABRIC_STOP;
>                  }

The bug was there before this patch. I had to backport the patch in
order to introduce my fix. I can revert it and check what is different,
but the bug is there in some form.
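
To make the abort_write_pending() idea above a bit more concrete, here is
roughly what the iSCSI side would look like. This is only a sketch: the
lio_abort_write_pending() name is provisional, the R2T races I mentioned
are not handled here, and it assumes target_handle_abort() is made
callable from the fabric module (today it lives in target_core_transport.c
and would need to be exported or wrapped):

  /* include/target/target_core_fabric.h */
  struct target_core_fabric_ops {
          [...]
          /*
           * Optional. Called from the TMR abort path for a command that
           * is still in TRANSPORT_WRITE_PENDING, so the fabric can
           * terminate it instead of waiting for Data-Out that will
           * never arrive.
           */
          void (*abort_write_pending)(struct se_cmd *se_cmd);
          [...]
  };

  /* drivers/target/iscsi/iscsi_target_configfs.c */
  static void lio_abort_write_pending(struct se_cmd *se_cmd)
  {
          /*
           * The initiator aborted the task before sending the Data-Out
           * buffer; nothing will ever execute this command, so complete
           * the abort directly.
           */
          target_handle_abort(se_cmd);
  }

  const struct target_core_fabric_ops iscsi_ops = {
          [...]
          .abort_write_pending    = lio_abort_write_pending,
          [...]
  };

Keeping the callback optional means every other fabric keeps the current
behaviour, and core_tmr_abort_task() only invokes it after checking
t_state under t_state_lock, as in the hunk I posted above.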