From: Quinn Tran <quinn.tran@qlogic.com>
To: Christoph Hellwig <hch@infradead.org>,
	Himanshu Madhani <himanshu.madhani@qlogic.com>
Cc: "target-devel@vger.kernel.org" <target-devel@vger.kernel.org>,
	"nab@linux-iscsi.org" <nab@linux-iscsi.org>,
	Giridhar Malavali <giridhar.malavali@qlogic.com>,
	linux-scsi <linux-scsi@vger.kernel.org>
Subject: Re: [PATCH 10/20] qla2xxx: Fix interaction issue between qla2xxx and Target Core Module
Date: Wed, 9 Dec 2015 22:07:32 +0000
Message-ID: <D28DCE82.1B0CE%quinn.tran@qlogic.com>
In-Reply-To: <20151208023700.GB9088@infradead.org>

On 12/7/15, 6:37 PM, Christoph Hellwig <hch@infradead.org> wrote:

>> -void qlt_abort_cmd(struct qla_tgt_cmd *cmd)
>> +int qlt_abort_cmd(struct qla_tgt_cmd *cmd)
>>  {
>>  	struct qla_tgt *tgt = cmd->tgt;
>>  	struct scsi_qla_host *vha = tgt->vha;
>>  	struct se_cmd *se_cmd = &cmd->se_cmd;
>> +	unsigned long flags,refcount;
>>  
>>  	ql_dbg(ql_dbg_tgt_mgt, vha, 0xf014,
>>  	    "qla_target(%d): terminating exchange for aborted cmd=%p "
>>  	    "(se_cmd=%p, tag=%llu)", vha->vp_idx, cmd, &cmd->se_cmd,
>>  	    se_cmd->tag);
>>  
>> +    spin_lock_irqsave(&cmd->cmd_lock, flags);
>> +    if (cmd->aborted) {
>> +        spin_unlock_irqrestore(&cmd->cmd_lock, flags);
>> +
>> +        /* It's normal to see 2 calls in this path:
>> +         *  1) XFER Rdy completion + CMD_T_ABORT
>> +         *  2) TCM TMR - drain_state_list
>> +         */
>> +        refcount = atomic_read(&cmd->se_cmd.cmd_kref.refcount);
>> +        ql_dbg(ql_dbg_tgt_mgt, vha, 0xffff,
>> +               "multiple abort. %p refcount %lx"
>> +               "transport_state %x, t_state %x, se_cmd_flags %x \n",
>> +               cmd, refcount,cmd->se_cmd.transport_state,
>> +               cmd->se_cmd.t_state,cmd->se_cmd.se_cmd_flags);
>> +
>> +        return EIO;
>> +    }
>
>Err, no.  Looking into the refcount inside a kref is never the
>right thing to do.

QT> Even for debug purposes?

>
>> +typedef enum {
>> +	/*
>> +	 * BIT_0 - Atio Arrival / schedule to work
>> +	 * BIT_1 - qlt_do_work
>> +	 * BIT_2 - qlt_do work failed
>> +	 * BIT_3 - xfer rdy/tcm_qla2xxx_write_pending
>> +	 * BIT_4 - read respond/tcm_qla2xx_queue_data_in
>> +	 * BIT_5 - status respond / tcm_qla2xx_queue_status
>> +	 * BIT_6 - tcm request to abort/Term exchange.
>> +	 *	pre_xmit_response->qlt_send_term_exchange
>> +	 * BIT_7 - SRR received (qlt_handle_srr->qlt_xmit_response)
>> +	 * BIT_8 - SRR received (qlt_handle_srr->qlt_rdy_to_xfer)
>> +	 * BIT_9 - SRR received (qla_handle_srr->qlt_send_term_exchange)
>> +	 * BIT_10 - Data in - hanlde_data->tcm_qla2xxx_handle_data
>> +
>> +	 * BIT_12 - good completion - qlt_ctio_do_completion -->free_cmd
>> +	 * BIT_13 - Bad completion -
>> +	 *	qlt_ctio_do_completion --> qlt_term_ctio_exchange
>> +	 * BIT_14 - Back end data received/sent.
>> +	 * BIT_15 - SRR prepare ctio
>> +	 * BIT_16 - complete free
>> +	 * BIT_17 - flush - qlt_abort_cmd_on_host_reset
>> +	 * BIT_18 - completion w/abort status
>> +	 * BIT_19 - completion w/unknown status
>> +	 * BIT_20 - tcm_qla2xxx_free_cmd
>
>Please use descriptive names for these flags in the source code!

QT> ACK.  We'll change the bits to more descriptive names in a follow-on
patch.

>
>> +	BUG_ON(cmd->cmd_flags & BIT_20);
>> +	cmd->cmd_flags |= BIT_20;
>> +
>
>And no craziness like this.  While we're at it: what synchronizes
>access to ->cmd_flags?

QT> These bits indicate which points in the QLA code the command has
passed through.  Each bit is set only once.  Because the TMR code is
asynchronous, it can trigger the QLA driver to run this specific free path
a second time, which is the double-free case.  This BUG_ON lets us trap
that early.

In one of the corner cases (below), I need to overload it, together with a
lock, for the cleanup process.
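
To illustrate the tracing idea described above, here is a minimal userspace
sketch in C.  It is not the driver code: the names cmd_trace_flag,
traced_cmd and trace_mark_once are hypothetical, and only three of the
trace points from the patch are shown.  It reports a repeated pass through
a trace point to the caller instead of calling BUG_ON, and it assumes the
caller serializes access to the flags word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptive names; the patch under review uses BIT_0..BIT_20. */
enum cmd_trace_flag {
	TRACE_ATIO_ARRIVAL = 1u << 0,   /* ATIO arrival / scheduled to work */
	TRACE_DO_WORK      = 1u << 1,   /* qlt_do_work */
	TRACE_FREE_CMD     = 1u << 20,  /* tcm_qla2xxx_free_cmd */
};

struct traced_cmd {
	uint32_t trace_flags;           /* assumption: caller serializes access */
};

/*
 * Record that the command passed a trace point.  Each point may be
 * recorded only once.  Returns false if the point was already recorded;
 * for TRACE_FREE_CMD that means a second pass through the free path,
 * i.e. the double-free case the BUG_ON in the patch is meant to trap.
 */
static bool trace_mark_once(struct traced_cmd *cmd, enum cmd_trace_flag flag)
{
	assert(cmd != NULL);

	if (cmd->trace_flags & (uint32_t)flag)
		return false;

	cmd->trace_flags |= (uint32_t)flag;
	return true;
}
```

A free path built on this sketch would call
trace_mark_once(cmd, TRACE_FREE_CMD) first and refuse to free the command
again when it returns false.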

>
>> @@ -466,13 +484,25 @@ static int tcm_qla2xxx_handle_cmd(scsi_qla_host_t *vha, struct qla_tgt_cmd *cmd,
>>  static void tcm_qla2xxx_handle_data_work(struct work_struct *work)
>>  {
>>  	struct qla_tgt_cmd *cmd = container_of(work, struct qla_tgt_cmd, work);
>> +	unsigned long flags;
>>  
>>  	/*
>>  	 * Ensure that the complete FCP WRITE payload has been received.
>>  	 * Otherwise return an exception via CHECK_CONDITION status.
>>  	 */
>>  	cmd->cmd_in_wq = 0;
>> -	cmd->cmd_flags |= BIT_11;
>> +
>> +	spin_lock_irqsave(&cmd->cmd_lock, flags);
>> +	cmd->cmd_flags |= CMD_FLAG_DATA_WORK;
>> +	if (cmd->aborted) {
>> +		cmd->cmd_flags |= CMD_FLAG_DATA_WORK_FREE;
>> +		spin_unlock_irqrestore(&cmd->cmd_lock, flags);
>> +
>> +		tcm_qla2xxx_free_cmd(cmd);
>> +		return;
>> +	}
>> +	spin_unlock_irqrestore(&cmd->cmd_lock, flags);
>
>All these abort flag hacks look very suspicious.  Can you explain the
>exact theory of operation behind them?

QT> The cmd->aborted flag tracks the CMD_T_ABORT flag at the TCM level.
If TCM has requested that the command be aborted, or the command has
already been aborted, we advance it to the "free" state because our
hardware has already started freeing the resources associated with this
command/exchange.  In this specific case (above), an XFER RDY was aborted
by the TMR.  Returning the cmd to TCM to generate SCSI status would
trigger an erroneous HW error because the resources have been freed.
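
The theory of operation above can be sketched in userspace C as follows.
This is only an illustration under stated assumptions: sketch_cmd,
sketch_abort, sketch_handle_data_work and the next_step values are
hypothetical names, a pthread mutex stands in for the cmd_lock spinlock,
and the abort path is assumed to set the flag under that same lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* What the data-work handler should do next with the command. */
enum next_step {
	STEP_SUBMIT_TO_TCM,     /* normal path: hand the command back to TCM */
	STEP_FREE_CMD,          /* abort seen: go straight to the free path */
};

struct sketch_cmd {
	pthread_mutex_t lock;   /* stands in for cmd->cmd_lock */
	bool aborted;           /* mirrors CMD_T_ABORT at the TCM level */
	bool data_work_seen;    /* trace point, like CMD_FLAG_DATA_WORK */
};

/*
 * Abort path.  Returns true for the first abort request and false for
 * any later one, so a repeated abort can be reported to the caller.
 */
static bool sketch_abort(struct sketch_cmd *cmd)
{
	bool first;

	pthread_mutex_lock(&cmd->lock);
	first = !cmd->aborted;
	cmd->aborted = true;
	pthread_mutex_unlock(&cmd->lock);

	return first;
}

/*
 * Data-work path.  The aborted flag is read under the same lock that the
 * abort path takes, so the decision cannot race with a concurrent abort.
 */
static enum next_step sketch_handle_data_work(struct sketch_cmd *cmd)
{
	enum next_step step;

	pthread_mutex_lock(&cmd->lock);
	cmd->data_work_seen = true;
	step = cmd->aborted ? STEP_FREE_CMD : STEP_SUBMIT_TO_TCM;
	pthread_mutex_unlock(&cmd->lock);

	return step;
}
```

In this sketch the handler never hands an aborted command back to the
upper layer; it only reports that the command should be freed, which
corresponds to the early tcm_qla2xxx_free_cmd() call in the quoted hunk.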

