From: Steffen Maier <maier@linux.ibm.com>
To: Hannes Reinecke <hare@suse.de>, Benjamin Block <bblock@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>,
	Christoph Hellwig <hch@lst.de>,
	James Bottomley <james.bottomley@hansenpartnership.com>,
	linux-scsi@vger.kernel.org, Hannes Reinecke <hare@suse.com>
Subject: Re: [PATCH 08/51] zfcp: open-code fc_block_scsi_eh() for host reset
Date: Tue, 17 Aug 2021 16:03:41 +0200
Message-ID: <c9e8ad26-f78c-94d3-5c39-8e7ac15a165a@linux.ibm.com>
In-Reply-To: <fdf138d0-f730-a985-e5d5-894a14a2c978@suse.de>

On 8/17/21 2:54 PM, Hannes Reinecke wrote:
> On 8/17/21 1:53 PM, Benjamin Block wrote:
>> On Tue, Aug 17, 2021 at 11:14:13AM +0200, Hannes Reinecke wrote:
>>> @@ -383,9 +385,24 @@ static int zfcp_scsi_eh_host_reset_handler(struct scsi_cmnd *scpnt)
>>>   	}
>>>   	zfcp_erp_adapter_reopen(adapter, 0, "schrh_1");
>>>   	zfcp_erp_wait(adapter);
>>> -	fc_ret = fc_block_scsi_eh(scpnt);
>>> -	if (fc_ret)
>>> -		ret = fc_ret;
>>> +retry_rport_blocked:
>>> +	spin_lock_irqsave(host->host_lock, flags);
>>> +	list_for_each_entry(port, &adapter->port_list, list) {
>>
>> You need to take the `adapter->port_list_lock` to iterate over the `port_list`.
>>
>> i.e.: read_lock_irqsave(&adapter->port_list_lock, flags);
>>
>>> +		struct fc_rport *rport = port->rport;
>>> +
>>> +		if (rport->port_state == FC_PORTSTATE_BLOCKED) {
>>> +			if (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT)
>>> +				ret = FAST_IO_FAIL;
>>> +			else
>>> +				ret = NEEDS_RETRY;
>>> +			break;
>>> +		}
>>> +	}
>>> +	spin_unlock_irqrestore(host->host_lock, flags);
>>> +	if (ret == NEEDS_RETRY) {
>>> +		msleep(1000);
>>> +		goto retry_rport_blocked;
>>> +	}
>>
>> I really can't say I like this open coded FC code in the driver at all.
>>
>> Is there a reason we can't use `fc_block_rport()` for all the rports of
>> the adapter?

Waiting for all rports to unblock in host_reset has been on my todo list since 
we prepared the eh callbacks to get rid of scsi_cmnd with v4.18 commits:
674595d8519f ("scsi: zfcp: decouple our scsi_eh callbacks from scsi_cmnd")
42afc6527d43 ("scsi: zfcp: decouple TMFs from scsi_cmnd by using fc_block_rport")
26f5fa9d47c1 ("scsi: zfcp: decouple SCSI setup of TMF from scsi_cmnd")
39abb11aca00 ("scsi: zfcp: decouple FSF request setup of TMF from scsi_cmnd")
e0116c91c7d8 ("scsi: zfcp: split FCP_CMND IU setup between SCSI I/O and TMF again")
266883f2f7d5 ("scsi: zfcp: decouple TMF response handler from scsi_cmnd")
822121186375 ("scsi: zfcp: decouple SCSI traces for scsi_eh / TMF from scsi_cmnd")

But the synchronization is non-trivial as Benjamin's question shows. There are 
also considerations about lock order, etc.

I'm busy with other things, so don't hold your breath until I can review and 
test the code; I don't want any regression in that recovery code.

>> We already do use it for other EH callbacks in the same file, and you
>> already look up the rports in the adapter's rport-list; so using that on
>> the rports in the loop, instead of open-coding it doesn't seem bad? Or
>> is there a locking problem?
>>
>> We might waste a few cycles with that, but frankly, this is all in EH
>> and after adapter reset.. all performance concerns went out of the
>> window with that already.
>>
> 
> Question would be why we need to call fc_block_rport() at all in host reset.
> To my understanding a host reset is expected to do a full resync of the
> SAN topology, so the expectation is that after zfcp_erp_wait() the port
> list is stable (ie the HBA has finished processing all RSCNs related to
> the SAN resync).

There is more to do in zfcp than in other FC HBA drivers, e.g. LUN open 
recoveries and how they relate to rport unblock:
v4.10 6f2ce1c6af37 ("scsi: zfcp: fix rport unblock race with LUN recovery").
The rport unblock is async to our internal recovery. zfcp_erp_wait() only waits 
for the latter by design.
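
To illustrate why waiting for internal recovery alone is not enough, here is a
toy userspace model. It is only a sketch: every name in it (toy_adapter,
toy_erp_wait(), toy_run_unblock_work(), ...) is hypothetical and stands in for
the real zfcp and FC transport machinery, and the "deferred work" is an
assumption about how an asynchronous unblock can be modelled, not a copy of the
driver's code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model; none of these names exist in zfcp or scsi_transport_fc. */
enum toy_port_state { TOY_ONLINE, TOY_BLOCKED };

struct toy_adapter {
	int erp_actions_pending;         /* internal recovery still running */
	bool unblock_work_queued;        /* deferred rport unblock not yet run */
	enum toy_port_state rport_state;
};

/* One step of internal recovery; the last step only queues the unblock. */
static void toy_erp_step(struct toy_adapter *a)
{
	if (a->erp_actions_pending > 0 && --a->erp_actions_pending == 0)
		a->unblock_work_queued = true;
}

/* Stand-in for a wait that covers internal recovery only. */
static void toy_erp_wait(struct toy_adapter *a)
{
	while (a->erp_actions_pending > 0)
		toy_erp_step(a);
}

/* Stand-in for the deferred work that actually unblocks the rport. */
static void toy_run_unblock_work(struct toy_adapter *a)
{
	if (a->unblock_work_queued) {
		a->unblock_work_queued = false;
		a->rport_state = TOY_ONLINE;
	}
}
```

In this model the rport is still TOY_BLOCKED right after toy_erp_wait()
returns and only becomes TOY_ONLINE once toy_run_unblock_work() has run, which
is the window a host reset handler would have to cover by some additional wait.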

> So can't we just drop the fc_block_rport() call here?

I don't think so.

> All the other FC drivers do fine without that ...

It would have been nice to have a common interface for all scsi_eh scopes, 
i.e. an fc_block_host(struct Scsi_Host*) alongside the existing 
fc_block_scsi_eh(struct scsi_cmnd*) and fc_block_rport(struct fc_rport*) [the 
latter having been introduced at the time of the above eh callback preparations].
But if zfcp is the only one needing it for host_reset, having the code only in 
zfcp seems fine to me.
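
As a rough idea of what a host-scope helper could do, here is a minimal
userspace sketch. The per-rport loop follows the behaviour of fc_block_rport()
(wait while the rport is blocked and fast_io_fail has not timed out, then report
FAST_IO_FAIL if it has). Everything else is an assumption for illustration: the
model_* names, the constant values, and the sleep hook that replaces
msleep(1000) are hypothetical, and the sketch deliberately leaves out the
locking (host_lock, port_list_lock) that makes the real thing non-trivial.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins; the values are illustrative, not the kernel's. */
enum model_port_state { MODEL_PORTSTATE_ONLINE, MODEL_PORTSTATE_BLOCKED };
#define MODEL_RPORT_FAST_FAIL_TIMEDOUT 0x01u
#define MODEL_FAST_IO_FAIL 1

struct model_rport {
	enum model_port_state port_state;
	unsigned int flags;
};

/* Hypothetical hook: replaces msleep(1000) and whatever changes rport state. */
typedef void (*model_sleep_fn)(struct model_rport *rports, size_t n);

/* Per-rport wait, modelled on the loop in fc_block_rport(). */
static int model_block_rport(struct model_rport *rports, size_t n, size_t i,
			     model_sleep_fn sleep_fn)
{
	while (rports[i].port_state == MODEL_PORTSTATE_BLOCKED &&
	       !(rports[i].flags & MODEL_RPORT_FAST_FAIL_TIMEDOUT))
		sleep_fn(rports, n);

	return (rports[i].flags & MODEL_RPORT_FAST_FAIL_TIMEDOUT) ?
		MODEL_FAST_IO_FAIL : 0;
}

/* Hypothetical host-scope variant: wait for every rport in turn. */
static int model_block_host(struct model_rport *rports, size_t n,
			    model_sleep_fn sleep_fn)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (model_block_rport(rports, n, i, sleep_fn))
			ret = MODEL_FAST_IO_FAIL;

	return ret;
}

/* Simulated outcome 1: every rport comes back online. */
static void model_tick_unblock(struct model_rport *rports, size_t n)
{
	for (size_t i = 0; i < n; i++)
		rports[i].port_state = MODEL_PORTSTATE_ONLINE;
}

/* Simulated outcome 2: fast_io_fail expires on every blocked rport. */
static void model_tick_timeout(struct model_rport *rports, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (rports[i].port_state == MODEL_PORTSTATE_BLOCKED)
			rports[i].flags |= MODEL_RPORT_FAST_FAIL_TIMEDOUT;
}
```

Calling model_block_host() with model_tick_unblock returns 0 once all rports
are online; with model_tick_timeout it returns MODEL_FAST_IO_FAIL as soon as
any rport stays blocked past its fast_io_fail timeout. Waiting on the rports
one after another may waste a few cycles compared to a single combined loop,
but as noted in the thread this runs in EH after an adapter reset.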


-- 
Mit freundlichen Gruessen / Kind regards
Steffen Maier

Linux on IBM Z Development

https://www.ibm.com/privacy/us/en/
IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Matthias Hartmann
Management: Dirk Wittkopp
Registered office: Boeblingen
Court of registration: Amtsgericht Stuttgart, HRB 243294
