All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kashyap Desai <kashyap.desai@broadcom.com>
To: Hannes Reinecke <hare@suse.de>, linux-scsi@vger.kernel.org
Cc: jejb@linux.ibm.com, martin.petersen@oracle.com,
	Steve Hagan <steve.hagan@broadcom.com>,
	Peter Rivera <peter.rivera@broadcom.com>,
	mpi3mr-drvr-developers <mpi3mr-linuxdrv.pdl@broadcom.com>,
	Sathya Prakash Veerichetty <sathya.prakash@broadcom.com>
Subject: RE: [PATCH v5 04/24] mpi3mr: add support of queue command processing
Date: Mon, 17 May 2021 21:07:50 +0530	[thread overview]
Message-ID: <5f74d8f861417c8a34b90030d4c19f80@mail.gmail.com> (raw)
In-Reply-To: <07cd627d-b661-a4a9-0929-00a594b0e8d0@suse.de>

[-- Attachment #1: Type: text/plain, Size: 6185 bytes --]

> > +		if (reply_dma)
> > +			mpi3mr_repost_reply_buf(mrioc, reply_dma);
> > +		num_op_reply++;
> > +
> > +		if (++reply_ci == op_reply_q->num_replies) {
> > +			reply_ci = 0;
> > +			exp_phase ^= 1;
> > +		}
> > +
> > +		reply_desc = mpi3mr_get_reply_desc(op_reply_q, reply_ci);
> > +
> > +		if ((le16_to_cpu(reply_desc->reply_flags) &
> > +		    MPI3_REPLY_DESCRIPT_FLAGS_PHASE_MASK) !=
> exp_phase)
> > +			break;
> > +
> > +	} while (1);
> > +
>
> Is this loop bounded by something?
> The way it looks like we might end up having to process at lot of replies,
> causing a bogus hangcheck trigger.
> Have you check for that condition?


Hi Hannes -

You are correct. There can be a bogus CPU lockup issue here.  We could have
use irq_poll interface as we used in mpt3sas, megaraid_sas driver.
Since we have used hybrid threaded ISR in mpi3mr driver, we have covered
this scenario using threaded ISR. See patch #0017. We have below check to
handle CPU lockup -

if (num_op_reply > mrioc->max_host_ios) {
                intr_info->op_reply_q->enable_irq_poll = true;
                 break;
}

>
> > +	writel(reply_ci,
> > +	    &mrioc->sysif_regs-
> >oper_queue_indexes[reply_qidx].consumer_index);
> > +	op_reply_q->ci = reply_ci;
> > +	op_reply_q->ephase = exp_phase;
> > +
> > +	return num_op_reply;
> > +}
> > +
> >  static irqreturn_t mpi3mr_isr_primary(int irq, void *privdata)  {
> >  	struct mpi3mr_intr_info *intr_info = privdata; @@ -1302,6 +1395,74
> > @@ static int mpi3mr_create_op_queues(struct mpi3mr_ioc *mrioc)
> >  	return retval;
> >  }
> >
> > +/**
> > + * mpi3mr_op_request_post - Post request to operational queue
> > + * @mrioc: Adapter reference
> > + * @op_req_q: Operational request queue info
> > + * @req: MPI3 request
> > + *
> > + * Post the MPI3 request into operational request queue and
> > + * inform the controller, if the queue is full return
> > + * appropriate error.
> > + *
> > + * Return: 0 on success, non-zero on failure.
> > + */
> > +int mpi3mr_op_request_post(struct mpi3mr_ioc *mrioc,
> > +	struct op_req_qinfo *op_req_q, u8 *req) {
> > +	u16 pi = 0, max_entries, reply_qidx = 0, midx;
> > +	int retval = 0;
> > +	unsigned long flags;
> > +	u8 *req_entry;
> > +	void *segment_base_addr;
> > +	u16 req_sz = mrioc->facts.op_req_sz;
> > +	struct segments *segments = op_req_q->q_segments;
> > +
> > +	reply_qidx = op_req_q->reply_qid - 1;
> > +
> > +	if (mrioc->unrecoverable)
> > +		return -EFAULT;
> > +
> > +	spin_lock_irqsave(&op_req_q->q_lock, flags);
> > +	pi = op_req_q->pi;
> > +	max_entries = op_req_q->num_requests;
> > +
> > +	if (mpi3mr_check_req_qfull(op_req_q)) {
> > +		midx = REPLY_QUEUE_IDX_TO_MSIX_IDX(
> > +		    reply_qidx, mrioc->op_reply_q_offset);
> > +		mpi3mr_process_op_reply_q(mrioc, &mrioc-
> >intr_info[midx]);
> > +
> > +		if (mpi3mr_check_req_qfull(op_req_q)) {
> > +			retval = -EAGAIN;
> > +			goto out;
> > +		}
> > +	}
> > +
> > +	if (mrioc->reset_in_progress) {
> > +		ioc_err(mrioc, "OpReqQ submit reset in progress\n");
> > +		retval = -EAGAIN;
> > +		goto out;
> > +	}
> > +
>
> Have you considered a different error code here?
> reset in progress and queue full are two different scenarios; especially
> the
> latter might warrant some further action like decreasing the queue depth,
> which is getting hard if you have the same error ...

There is really no difference if we  use different error code, because
caller of mpi3mr_op_request_post checks zero or non-zero.
I got your point that we should have some better error code (if possible) in
case of controller is in reset.
I am thinking of  using scsi_block_requests/scsi_unblock_requests pair in
controller reset path. We want stable first out of mpi3mr and also such
changes require additional testing, I will accommodate changes in this area
in next series.
Is it OK with you ?

> > +	switch (ioc_status) {
> > +	case MPI3_IOCSTATUS_BUSY:
> > +	case MPI3_IOCSTATUS_INSUFFICIENT_RESOURCES:
> > +		scmd->result = SAM_STAT_BUSY;
> > +		break;
> > +	case MPI3_IOCSTATUS_SCSI_DEVICE_NOT_THERE:
> > +		scmd->result = DID_NO_CONNECT << 16;
> > +		break;
> > +	case MPI3_IOCSTATUS_SCSI_IOC_TERMINATED:
> > +		scmd->result = DID_SOFT_ERROR << 16;
> > +		break;
> > +	case MPI3_IOCSTATUS_SCSI_TASK_TERMINATED:
> > +	case MPI3_IOCSTATUS_SCSI_EXT_TERMINATED:
> > +		scmd->result = DID_RESET << 16;
> > +		break;
> > +	case MPI3_IOCSTATUS_SCSI_RESIDUAL_MISMATCH:
> > +		if ((xfer_count == 0) || (scmd->underflow > xfer_count))
> > +			scmd->result = DID_SOFT_ERROR << 16;
> > +		else
> > +			scmd->result = (DID_OK << 16) | scsi_status;
> > +		break;
> > +	case MPI3_IOCSTATUS_SCSI_DATA_UNDERRUN:
> > +		scmd->result = (DID_OK << 16) | scsi_status;
> > +		if (sense_state == MPI3_SCSI_STATE_SENSE_VALID)
> > +			break;
> > +		if (xfer_count < scmd->underflow) {
> > +			if (scsi_status == SAM_STAT_BUSY)
> > +				scmd->result = SAM_STAT_BUSY;
> > +			else
> > +				scmd->result = DID_SOFT_ERROR << 16;
> > +		} else if ((scsi_state & (MPI3_SCSI_STATE_NO_SCSI_STATUS))
> ||
> > +		    (sense_state !=
> MPI3_SCSI_STATE_SENSE_NOT_AVAILABLE))
> > +			scmd->result = DID_SOFT_ERROR << 16;
> > +		else if (scsi_state & MPI3_SCSI_STATE_TERMINATED)
> > +			scmd->result = DID_RESET << 16;
> > +		else if (!xfer_count && scmd->cmnd[0] == REPORT_LUNS) {
> > +			scsi_status = SAM_STAT_CHECK_CONDITION;
> > +			scmd->result = (DRIVER_SENSE << 24) |
> > +			    SAM_STAT_CHECK_CONDITION;
> > +			scmd->sense_buffer[0] = 0x70;
> > +			scmd->sense_buffer[2] = ILLEGAL_REQUEST;
> > +			scmd->sense_buffer[12] = 0x20;
> > +			scmd->sense_buffer[13] = 0;
> > +		}
>
> Huh? A separate error handling just for REPORT LUNS?
> Did you mess up your firmware?
> And if you know REPORT LUNS is not supported, by bother sending it to the
> firmware and not completing it straightaway in queuecommand()?

This special case handling ported from past solution we used in mpt3sas. I
am not sure but it was due to mix of FW and Kernel behavior.
Some older kernel had lots of prints if we do not have above special
handling in driver.  We will remove above check from mpi3mr driver and based
on any actual issue, we can add fix in future.
I will handle it in V6 submission.

Kashyap

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4212 bytes --]

  reply	other threads:[~2021-05-17 16:36 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-13  8:35 [PATCH v5 00/24] Introducing mpi3mr driver Kashyap Desai
2021-05-13  8:35 ` [PATCH v5 01/24] mpi3mr: add mpi30 Rev-R headers and Kconfig Kashyap Desai
2021-05-14 11:31   ` Hannes Reinecke
2021-05-14 14:39   ` Tomas Henzl
2021-05-14 16:18   ` Himanshu Madhani
2021-05-17 15:52     ` Kashyap Desai
2021-05-18  7:39   ` Christoph Hellwig
2021-05-19 14:58     ` Kashyap Desai
2021-05-13  8:35 ` [PATCH v5 02/24] mpi3mr: base driver code Kashyap Desai
2021-05-14 14:43   ` Tomas Henzl
2021-05-14 20:16   ` Himanshu Madhani
2021-05-13  8:35 ` [PATCH v5 03/24] mpi3mr: create operational request and reply queue pair Kashyap Desai
2021-05-13  8:35 ` [PATCH v5 04/24] mpi3mr: add support of queue command processing Kashyap Desai
2021-05-14 11:41   ` Hannes Reinecke
2021-05-17 15:37     ` Kashyap Desai [this message]
2021-05-13  8:35 ` [PATCH v5 05/24] mpi3mr: add support of internal watchdog thread Kashyap Desai
2021-05-13  8:35 ` [PATCH v5 06/24] mpi3mr: add support of event handling part-1 Kashyap Desai
2021-05-14 11:45   ` Hannes Reinecke
2021-05-13  8:35 ` [PATCH v5 07/24] mpi3mr: add support of event handling pcie devices part-2 Kashyap Desai
2021-05-13  8:35 ` [PATCH v5 08/24] mpi3mr: add support of event handling part-3 Kashyap Desai
2021-05-13  8:35 ` [PATCH v5 09/24] mpi3mr: add support for recovering controller Kashyap Desai
2021-05-13  8:35 ` [PATCH v5 10/24] mpi3mr: add support of timestamp sync with firmware Kashyap Desai
2021-05-13  8:35 ` [PATCH v5 11/24] mpi3mr: print ioc info for debugging Kashyap Desai
2021-05-14 11:45   ` Hannes Reinecke
2021-05-13  8:35 ` [PATCH v5 12/24] mpi3mr: add bios_param shost template hook Kashyap Desai
2021-05-13  8:35 ` [PATCH v5 13/24] mpi3mr: implement scsi error handler hooks Kashyap Desai
2021-05-14 14:51   ` Tomas Henzl
2021-05-13  8:35 ` [PATCH v5 14/24] mpi3mr: add change queue depth support Kashyap Desai
2021-05-13  8:35 ` [PATCH v5 15/24] mpi3mr: allow certain commands during pci-remove hook Kashyap Desai
2021-05-13  8:36 ` [PATCH v5 16/24] mpi3mr: hardware workaround for UNMAP commands to nvme drives Kashyap Desai
2021-05-13  8:36 ` [PATCH v5 17/24] mpi3mr: add support of threaded isr Kashyap Desai
2021-05-13  8:36 ` [PATCH v5 18/24] mpi3mr: add complete support of soft reset Kashyap Desai
2021-05-13  8:36 ` [PATCH v5 19/24] mpi3mr: print pending host ios for debug Kashyap Desai
2021-05-13  8:36 ` [PATCH v5 20/24] mpi3mr: wait for pending IO completions upon detection of VD IO timeout Kashyap Desai
2021-05-13  8:36 ` [PATCH v5 21/24] mpi3mr: add support of PM suspend and resume Kashyap Desai
2021-05-13  8:36 ` [PATCH v5 22/24] mpi3mr: add support of DSN secure fw check Kashyap Desai
2021-05-13  8:36 ` [PATCH v5 23/24] mpi3mr: add eedp dif dix support Kashyap Desai
2021-05-13  8:36 ` [PATCH v5 24/24] mpi3mr: add event handling debug prints Kashyap Desai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5f74d8f861417c8a34b90030d4c19f80@mail.gmail.com \
    --to=kashyap.desai@broadcom.com \
    --cc=hare@suse.de \
    --cc=jejb@linux.ibm.com \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=mpi3mr-linuxdrv.pdl@broadcom.com \
    --cc=peter.rivera@broadcom.com \
    --cc=sathya.prakash@broadcom.com \
    --cc=steve.hagan@broadcom.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.