From mboxrd@z Thu Jan 1 00:00:00 1970 From: Mike Christie Date: Tue, 28 Apr 2020 17:50:43 +0000 Subject: Re: [PATCH v3 1/2] iscsi-target: fix login error when receiving is too fast Message-Id: List-Id: References: <20200424123528.17627-1-houpu@bytedance.com> <20200424123528.17627-2-houpu@bytedance.com> <8f3b123e-1dc3-54ba-ae6d-90e76e5702b5@redhat.com> <44a02338-0923-5b57-ed26-8528bf9cde70@bytedance.com> In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 8bit To: Hou Pu , martin.petersen@oracle.com, linux-scsi@vger.kernel.org, target-devel@vger.kernel.org On 4/26/20 1:09 AM, Hou Pu wrote: > >>>>> +     */ >>>>> +    if (conn->sock) { >>>>> +        struct sock *sk = conn->sock->sk; >>>>> + >>>>> +        write_lock_bh(&sk->sk_callback_lock); >>>>> +        clear_bit(LOGIN_FLAGS_READ_ACTIVE, &conn->login_flags); >>>>> +        set_bit(LOGIN_FLAGS_WRITE_ACTIVE, &conn->login_flags); >>>>> +        write_unlock_bh(&sk->sk_callback_lock); >>>>> +    } >>>>> + >>>> Hey, >>>> >>>> I had one more question. >>>> >>>> With the above code, I think we have a race where if we clear the bit >>>> above early and iscsi_target_sk_data_ready runs while >>>> iscsi_target_do_login_rx is still running then we could queue the work >>>> an extra time and get stuck. Because the bit is now not set, if >>>> iscsi_target_sk_data_ready runs it will end up calling >>>> schedule_delayed_work which will queue up the work again since the work >>>> is running and not pending. >> >> Yes. I was trying to allow queuing the delayed work early. >> >>>> >>>> If that is correct and we hit the race what happens if this was the >>>> last >>>> login pdu, and we are supposed to go to full feature phase next? For >>>> example if iscsi_target_do_login_rx runs an extra time, will we end up >>>> stuck waiting in iscsi_target_do_login_rx's call to: >>>> >>>> rc = conn->conn_transport->iscsit_get_login_rx(conn, login); >>>> >>>> ? >> >> For the last login pdu, we may have race as you said. thanks for >> pointing it out. >> >> I was trying to image a case where we can hit the race, normally it is >> case a). >> >> a). initiator send last login pdu -> target received -> target replied >> >> b). initiator send last login pdu -> target received -> initiator send >> something -> target replied >> >> in case b). we will queue another delayed work which we should not.  >> After the target replied >> >> the last login pdu, conn->conn_login is freed. we might visited it in >> the delayed wo>> >> >>> Just answering my own question. It looks like we do not get stuck. But >>> we either get nothing on the session so the login timeout fires and we >>> drop the session. Or, we get a PDU and the login thread reads it in >>> before the normal rx thread does, but it assumes it is a login related >>> and we most likely drop the session due to invalid fields. >>> >>> I think in iscsi_target_restore_sock_callbacks we want to do a: >>> >>> cancel_delayed_work(&conn->login_work) >>> >>> after we reset the callbacks and drop the sk_callback_lock lock. >> >> I am not very sure if we could or if it is good to cancel_delayed_work >> from the work itself. >> >> If it is ok then i am ok with it. Or in another way, I think we could >> just clear >> >> LOGIN_FLAGS_READ_ACTIVE and set LOGIN_FLAGS_WRITE_ACTIVE >> >> after iscsi_target_restore_sock_callbacks when finish process last >> login pdu. > > That would look like (in iscsi_target_do_tx_login_io): > > diff --git a/drivers/target/iscsi/iscsi_target_nego.c > b/drivers/target/iscsi/iscsi_target_nego.c > index 685d771b51d4..4d0658731382 100644 > --- a/drivers/target/iscsi/iscsi_target_nego.c > +++ b/drivers/target/iscsi/iscsi_target_nego.c > @@ -328,6 +328,28 @@ static int iscsi_target_do_tx_login_io(struct > iscsi_conn *conn, struct iscsi_log >         u32 padding = 0; >         struct iscsi_login_rsp *login_rsp; > > +       /* > +        * LOGIN_FLAGS_READ_ACTIVE is cleared so that sk_data_ready > +        * could be trigger again after this. > +        * > +        * LOGIN_FLAGS_WRITE_ACTIVE is cleared after we successfully > +        * process a login pdu, so that sk_state_chage could do login > +        * cleanup as needed if the socket is closed. If a delayed work is > +        * ongoing (LOGIN_FLAGS_WRITE_ACTIVE or LOGIN_FLAGS_READ_ACTIVE), > +        * sk_state_change will leave the cleanup to the delayed work or > +        * it will schedule a delayed work to do cleanup. > +        */ > +       if (conn->sock) { > +               struct sock *sk = conn->sock->sk; > + > +               write_lock_bh(&sk->sk_callback_lock); > +               if (!test_bit(LOGIN_FLAGS_INITIAL_PDU, > &conn->login_flags)) { > +                       clear_bit(LOGIN_FLAGS_READ_ACTIVE, > &conn->login_flags); > +                       set_bit(LOGIN_FLAGS_WRITE_ACTIVE, > &conn->login_flags); > +               } > +               write_unlock_bh(&sk->sk_callback_lock); > +       } You lost me. I didn't understand this part. Would you still be doing the above bit manipulation in iscsi_target_do_login_rx too? Is the above code then to handle when iscsi_target_start_negotiation->iscsi_target_do_login->iscsi_target_do_tx_login_io runs? I was thinking when you mentioned the final login PDU you were going to do something when you detect login->login_complete is true. This code is not my expertise area, so I might just not be understanding something simple about how it all works. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.4 required=3.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6932AC83001 for ; Tue, 28 Apr 2020 17:50:54 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 44B0020757 for ; Tue, 28 Apr 2020 17:50:54 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="if1yEGpH" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728600AbgD1Rux (ORCPT ); Tue, 28 Apr 2020 13:50:53 -0400 Received: from us-smtp-delivery-1.mimecast.com ([205.139.110.120]:42933 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728590AbgD1Ruw (ORCPT ); Tue, 28 Apr 2020 13:50:52 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1588096250; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=8lRv+tCtSu1UNwULxaTKR3MF4r3f7PH3zkbzUaNx1RU=; b=if1yEGpHD/UqYI4O3ogvFv17aOdPmMOL3mQtAAxtqw9eV++o3jffSWZ+kD5G0EF3jfkcCw loBG9R14+3+/auaU6Gn0fVxkLKrOS+85+LG3+k/KyQWjjpgzBAaZlrs/gQIebwTv/715kR tiYS/8Vh0eZA9r2BQKWPq7Nc4Ee/LJg= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-62-Gudx8GXMPMqKUGcQvU2QVA-1; Tue, 28 Apr 2020 13:50:46 -0400 X-MC-Unique: Gudx8GXMPMqKUGcQvU2QVA-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 9516F108BD0B; Tue, 28 Apr 2020 17:50:44 +0000 (UTC) Received: from [10.10.112.215] (ovpn-112-215.rdu2.redhat.com [10.10.112.215]) by smtp.corp.redhat.com (Postfix) with ESMTP id A393C5C1BD; Tue, 28 Apr 2020 17:50:43 +0000 (UTC) Subject: Re: [PATCH v3 1/2] iscsi-target: fix login error when receiving is too fast To: Hou Pu , martin.petersen@oracle.com, linux-scsi@vger.kernel.org, target-devel@vger.kernel.org References: <20200424123528.17627-1-houpu@bytedance.com> <20200424123528.17627-2-houpu@bytedance.com> <8f3b123e-1dc3-54ba-ae6d-90e76e5702b5@redhat.com> <44a02338-0923-5b57-ed26-8528bf9cde70@bytedance.com> From: Mike Christie Message-ID: Date: Tue, 28 Apr 2020 12:50:43 -0500 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.5.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Content-Transfer-Encoding: quoted-printable Sender: linux-scsi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org On 4/26/20 1:09 AM, Hou Pu wrote: >=20 >>>>> +=C2=A0=C2=A0=C2=A0=C2=A0 */ >>>>> +=C2=A0=C2=A0=C2=A0 if (conn->sock) { >>>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 struct sock *sk =3D con= n->sock->sk; >>>>> + >>>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 write_lock_bh(&sk->sk_c= allback_lock); >>>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 clear_bit(LOGIN_FLAGS_R= EAD_ACTIVE, &conn->login_flags); >>>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 set_bit(LOGIN_FLAGS_WRI= TE_ACTIVE, &conn->login_flags); >>>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 write_unlock_bh(&sk->sk= _callback_lock); >>>>> +=C2=A0=C2=A0=C2=A0 } >>>>> + >>>> Hey, >>>> >>>> I had one more question. >>>> >>>> With the above code, I think we have a race where if we clear the bi= t >>>> above early and iscsi_target_sk_data_ready runs while >>>> iscsi_target_do_login_rx is still running then we could queue the wo= rk >>>> an extra time and get stuck. Because the bit is now not set, if >>>> iscsi_target_sk_data_ready runs it will end up calling >>>> schedule_delayed_work which will queue up the work again since the w= ork >>>> is running and not pending. >> >> Yes. I was trying to allow queuing the delayed work early. >> >>>> >>>> If that is correct and we hit the race what happens if this was the >>>> last >>>> login pdu, and we are supposed to go to full feature phase next? For >>>> example if iscsi_target_do_login_rx runs an extra time, will we end = up >>>> stuck waiting in iscsi_target_do_login_rx's call to: >>>> >>>> rc =3D conn->conn_transport->iscsit_get_login_rx(conn, login); >>>> >>>> ? >> >> For the last login pdu, we may have race as you said. thanks for >> pointing it out. >> >> I was trying to image a case where we can hit the race, normally it is >> case a). >> >> a). initiator send last login pdu -> target received -> target replied >> >> b). initiator send last login pdu -> target received -> initiator send >> something -> target replied >> >> in case b). we will queue another delayed work which we should not.=C2= =A0 >> After the target replied >> >> the last login pdu, conn->conn_login is freed. we might visited it in >> the delayed wo>> >> >>> Just answering my own question. It looks like we do not get stuck. Bu= t >>> we either get nothing on the session so the login timeout fires and w= e >>> drop the session. Or, we get a PDU and the login thread reads it in >>> before the normal rx thread does, but it assumes it is a login relate= d >>> and we most likely drop the session due to invalid fields. >>> >>> I think in iscsi_target_restore_sock_callbacks we want to do a: >>> >>> cancel_delayed_work(&conn->login_work) >>> >>> after we reset the callbacks and drop the sk_callback_lock lock. >> >> I am not very sure if we could or if it is good to cancel_delayed_work >> from the work itself. >> >> If it is ok then i am ok with it. Or in another way, I think we could >> just clear >> >> LOGIN_FLAGS_READ_ACTIVE and set LOGIN_FLAGS_WRITE_ACTIVE >> >> after iscsi_target_restore_sock_callbacks when finish process last >> login pdu. >=20 > That would look like (in iscsi_target_do_tx_login_io): >=20 > diff --git a/drivers/target/iscsi/iscsi_target_nego.c > b/drivers/target/iscsi/iscsi_target_nego.c > index 685d771b51d4..4d0658731382 100644 > --- a/drivers/target/iscsi/iscsi_target_nego.c > +++ b/drivers/target/iscsi/iscsi_target_nego.c > @@ -328,6 +328,28 @@ static int iscsi_target_do_tx_login_io(struct > iscsi_conn *conn, struct iscsi_log > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 u32 padding =3D 0; > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 struct iscsi_login_rsp *logi= n_rsp; >=20 > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 /* > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * LOGIN_FLAGS_READ_ACTIVE i= s cleared so that sk_data_ready > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * could be trigger again af= ter this. > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * LOGIN_FLAGS_WRITE_ACTIVE = is cleared after we successfully > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * process a login pdu, so t= hat sk_state_chage could do login > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * cleanup as needed if the = socket is closed. If a delayed work is > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * ongoing (LOGIN_FLAGS_WRIT= E_ACTIVE or LOGIN_FLAGS_READ_ACTIVE), > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * sk_state_change will leav= e the cleanup to the delayed work or > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * it will schedule a delaye= d work to do cleanup. > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 */ > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 if (conn->sock) { > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0 struct sock *sk =3D conn->sock->sk; > + > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0 write_lock_bh(&sk->sk_callback_lock); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0 if (!test_bit(LOGIN_FLAGS_INITIAL_PDU, > &conn->login_flags)) { > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 clear_bit(LO= GIN_FLAGS_READ_ACTIVE, > &conn->login_flags); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 set_bit(LOGI= N_FLAGS_WRITE_ACTIVE, > &conn->login_flags); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0 } > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0 write_unlock_bh(&sk->sk_callback_lock); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 } You lost me. I didn't understand this part. Would you still be doing the above bit manipulation in iscsi_target_do_login_rx too? Is the above code then to handle when iscsi_target_start_negotiation->iscsi_target_do_login->iscsi_target_do_tx= _login_io runs? I was thinking when you mentioned the final login PDU you were going to do something when you detect login->login_complete is true. This code is not my expertise area, so I might just not be understanding something simple about how it all works.