linux-nvme.lists.infradead.org archive mirror
* [RFC] qla2xxx: Add dev_loss_tmo kernel module options
@ 2021-04-19 10:00 Daniel Wagner
  2021-04-19 16:19 ` Randy Dunlap
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Daniel Wagner @ 2021-04-19 10:00 UTC (permalink / raw)
  To: linux-scsi
  Cc: GR-QLogic-Storage-Upstream, linux-nvme, Hannes Reinecke,
	Daniel Wagner, Nilesh Javali, Arun Easi

Allow setting the default dev_loss_tmo value via a kernel module option.

Cc: Nilesh Javali <njavali@marvell.com>
Cc: Arun Easi <aeasi@marvell.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
Hi,

During array upgrade tests with NVMe/FC on systems equipped with QLogic
HBAs, we ran into a problem with the default setting of dev_loss_tmo.

When the default timeout hit after 60 seconds, the file system went
into read-only mode. The fix was to set dev_loss_tmo to infinity
(note that this patch cannot express infinity).

For lpfc devices we could use the sysfs interface under
fc_remote_ports, which exposes dev_loss_tmo for both SCSI and NVMe
rports.
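
For reference, on the lpfc side this is a plain sysfs write; the rport
instance below is made up and 4294967295 is the value we use as
'infinity':

  echo 4294967295 > /sys/class/fc_remote_ports/rport-3:0-1/dev_loss_tmo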

The QLogic driver only exposes rports via fc_remote_ports when SCSI is
used. There is a debugfs interface to set dev_loss_tmo, but it has two
issues. First, it is not watched by udevd, hence no udev rules work.
This could be worked around by setting the value statically, but that
is really only an option for testing. Second, even when the debugfs
interface is used, there is a bug in the code: in
qla_nvme_register_remote() the value 0 is assigned to dev_loss_tmo, so
the NVMe core falls back to its default of 60 seconds (this code path
is exercised when the rport drops twice).
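
For illustration, the debugfs knob ends up being a per-rport node,
roughly like this (the qla2xxx_0 host directory level and the WWPN are
made up; the exact layout may differ):

  echo 120 > /sys/kernel/debug/qla2xxx/qla2xxx_0/pn-203800a09890f5bf/dev_loss_tmo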

Anyway, this patch is mainly meant to get the discussion going. Maybe
the driver could implement the fc_remote_ports interface? Hannes
pointed out that it might make sense to think about a controller sysfs
API, since there is already a host one and the NVMe protocol is all
about hosts and controllers.
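
For completeness, with this patch the new default can only be set at
module load time (the parameter is 0444, so it is read-only at
runtime), e.g. with an illustrative value of 120 seconds:

  modprobe qla2xxx ql2xdev_loss_tmo=120

  # or persistently:
  echo "options qla2xxx ql2xdev_loss_tmo=120" > /etc/modprobe.d/qla2xxx.conf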

Thanks,
Daniel

 drivers/scsi/qla2xxx/qla_attr.c | 4 ++--
 drivers/scsi/qla2xxx/qla_gbl.h  | 1 +
 drivers/scsi/qla2xxx/qla_nvme.c | 2 +-
 drivers/scsi/qla2xxx/qla_os.c   | 5 +++++
 4 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_attr.c b/drivers/scsi/qla2xxx/qla_attr.c
index 3aa9869f6fae..0d2386ba65c0 100644
--- a/drivers/scsi/qla2xxx/qla_attr.c
+++ b/drivers/scsi/qla2xxx/qla_attr.c
@@ -3036,7 +3036,7 @@ qla24xx_vport_create(struct fc_vport *fc_vport, bool disable)
 	}
 
 	/* initialize attributes */
-	fc_host_dev_loss_tmo(vha->host) = ha->port_down_retry_count;
+	fc_host_dev_loss_tmo(vha->host) = ql2xdev_loss_tmo;
 	fc_host_node_name(vha->host) = wwn_to_u64(vha->node_name);
 	fc_host_port_name(vha->host) = wwn_to_u64(vha->port_name);
 	fc_host_supported_classes(vha->host) =
@@ -3260,7 +3260,7 @@ qla2x00_init_host_attr(scsi_qla_host_t *vha)
 	struct qla_hw_data *ha = vha->hw;
 	u32 speeds = FC_PORTSPEED_UNKNOWN;
 
-	fc_host_dev_loss_tmo(vha->host) = ha->port_down_retry_count;
+	fc_host_dev_loss_tmo(vha->host) = ql2xdev_loss_tmo;
 	fc_host_node_name(vha->host) = wwn_to_u64(vha->node_name);
 	fc_host_port_name(vha->host) = wwn_to_u64(vha->port_name);
 	fc_host_supported_classes(vha->host) = ha->base_qpair->enable_class_2 ?
diff --git a/drivers/scsi/qla2xxx/qla_gbl.h b/drivers/scsi/qla2xxx/qla_gbl.h
index fae5cae6f0a8..0b9c24475711 100644
--- a/drivers/scsi/qla2xxx/qla_gbl.h
+++ b/drivers/scsi/qla2xxx/qla_gbl.h
@@ -178,6 +178,7 @@ extern int ql2xdifbundlinginternalbuffers;
 extern int ql2xfulldump_on_mpifail;
 extern int ql2xenforce_iocb_limit;
 extern int ql2xabts_wait_nvme;
+extern int ql2xdev_loss_tmo;
 
 extern int qla2x00_loop_reset(scsi_qla_host_t *);
 extern void qla2x00_abort_all_cmds(scsi_qla_host_t *, int);
diff --git a/drivers/scsi/qla2xxx/qla_nvme.c b/drivers/scsi/qla2xxx/qla_nvme.c
index 0cacb667a88b..cdc5b5075407 100644
--- a/drivers/scsi/qla2xxx/qla_nvme.c
+++ b/drivers/scsi/qla2xxx/qla_nvme.c
@@ -41,7 +41,7 @@ int qla_nvme_register_remote(struct scsi_qla_host *vha, struct fc_port *fcport)
 	req.port_name = wwn_to_u64(fcport->port_name);
 	req.node_name = wwn_to_u64(fcport->node_name);
 	req.port_role = 0;
-	req.dev_loss_tmo = 0;
+	req.dev_loss_tmo = ql2xdev_loss_tmo;
 
 	if (fcport->nvme_prli_service_param & NVME_PRLI_SP_INITIATOR)
 		req.port_role = FC_PORT_ROLE_NVME_INITIATOR;
diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index d74c32f84ef5..c686522ff64e 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -338,6 +338,11 @@ static void qla2x00_free_device(scsi_qla_host_t *);
 static int qla2xxx_map_queues(struct Scsi_Host *shost);
 static void qla2x00_destroy_deferred_work(struct qla_hw_data *);
 
+int ql2xdev_loss_tmo = 60;
+module_param(ql2xdev_loss_tmo, int, 0444);
+MODULE_PARM_DESC(ql2xdev_loss_tmo,
+		"Time to wait for device to recover before reporting\n"
+		"an error. Default is 60 seconds\n");
 
 static struct scsi_transport_template *qla2xxx_transport_template = NULL;
 struct scsi_transport_template *qla2xxx_transport_vport_template = NULL;
-- 
2.29.2



* Re: [RFC] qla2xxx: Add dev_loss_tmo kernel module options
  2021-04-19 10:00 [RFC] qla2xxx: Add dev_loss_tmo kernel module options Daniel Wagner
@ 2021-04-19 16:19 ` Randy Dunlap
  2021-04-20 12:37   ` Daniel Wagner
  2021-04-20 17:27 ` Benjamin Block
  2021-04-20 17:35 ` Roman Bolshakov
  2 siblings, 1 reply; 14+ messages in thread
From: Randy Dunlap @ 2021-04-19 16:19 UTC (permalink / raw)
  To: Daniel Wagner, linux-scsi
  Cc: GR-QLogic-Storage-Upstream, linux-nvme, Hannes Reinecke,
	Nilesh Javali, Arun Easi

Hi,

On 4/19/21 3:00 AM, Daniel Wagner wrote:
> Allow setting the default dev_loss_tmo value via a kernel module option.
> 
> Cc: Nilesh Javali <njavali@marvell.com>
> Cc: Arun Easi <aeasi@marvell.com>
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
> 
>  drivers/scsi/qla2xxx/qla_attr.c | 4 ++--
>  drivers/scsi/qla2xxx/qla_gbl.h  | 1 +
>  drivers/scsi/qla2xxx/qla_nvme.c | 2 +-
>  drivers/scsi/qla2xxx/qla_os.c   | 5 +++++
>  4 files changed, 9 insertions(+), 3 deletions(-)
> 

> diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
> index d74c32f84ef5..c686522ff64e 100644
> --- a/drivers/scsi/qla2xxx/qla_os.c
> +++ b/drivers/scsi/qla2xxx/qla_os.c
> @@ -338,6 +338,11 @@ static void qla2x00_free_device(scsi_qla_host_t *);
>  static int qla2xxx_map_queues(struct Scsi_Host *shost);
>  static void qla2x00_destroy_deferred_work(struct qla_hw_data *);
>  
> +int ql2xdev_loss_tmo = 60;
> +module_param(ql2xdev_loss_tmo, int, 0444);
> +MODULE_PARM_DESC(ql2xdev_loss_tmo,
> +		"Time to wait for device to recover before reporting\n"
> +		"an error. Default is 60 seconds\n");

No need for the \n in the quoted string. Just change it to a space.


-- 
~Randy



* Re: [RFC] qla2xxx: Add dev_loss_tmo kernel module options
  2021-04-19 16:19 ` Randy Dunlap
@ 2021-04-20 12:37   ` Daniel Wagner
  2021-04-20 14:51     ` Himanshu Madhani
  2021-04-20 15:35     ` Randy Dunlap
  0 siblings, 2 replies; 14+ messages in thread
From: Daniel Wagner @ 2021-04-20 12:37 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: linux-scsi, GR-QLogic-Storage-Upstream, linux-nvme,
	Hannes Reinecke, Nilesh Javali, Arun Easi

Hi Randy,

On Mon, Apr 19, 2021 at 09:19:13AM -0700, Randy Dunlap wrote:
> > +int ql2xdev_loss_tmo = 60;
> > +module_param(ql2xdev_loss_tmo, int, 0444);
> > +MODULE_PARM_DESC(ql2xdev_loss_tmo,
> > +		"Time to wait for device to recover before reporting\n"
> > +		"an error. Default is 60 seconds\n");
> 
> No need for the \n in the quoted string. Just change it to a space.

I just followed the current style in this file. I guess this style
question is up to the maintainers to decide what they want.

Thanks,
Daniel



* Re: [RFC] qla2xxx: Add dev_loss_tmo kernel module options
  2021-04-20 12:37   ` Daniel Wagner
@ 2021-04-20 14:51     ` Himanshu Madhani
  2021-04-20 15:35     ` Randy Dunlap
  1 sibling, 0 replies; 14+ messages in thread
From: Himanshu Madhani @ 2021-04-20 14:51 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: Randy Dunlap, linux-scsi, GR-QLogic-Storage-Upstream, linux-nvme,
	Hannes Reinecke, Nilesh Javali, Arun Easi

Hi Daniel,

> On Apr 20, 2021, at 7:37 AM, Daniel Wagner <dwagner@suse.de> wrote:
> 
> Hi Randy,
> 
> On Mon, Apr 19, 2021 at 09:19:13AM -0700, Randy Dunlap wrote:
>>> +int ql2xdev_loss_tmo = 60;
>>> +module_param(ql2xdev_loss_tmo, int, 0444);
>>> +MODULE_PARM_DESC(ql2xdev_loss_tmo,
>>> +		"Time to wait for device to recover before reporting\n"
>>> +		"an error. Default is 60 seconds\n");
>> 
>> No need for the \n in the quoted string. Just change it to a space.
> 
> I just followed the current style in this file. I guess this style
> question is up to the maintainers to decide what they want.
> 
> Thanks,
> Daniel
> 

Some of the instances in this file need the \n for readability of the
module options. In this particular instance, I would rather move the \n
so that the message is displayed on one line and the default on a
second line.

Something like the following:

>>> "Time to wait for device to recover before reporting an error.\n"
>>> "Default is 60 seconds\n");

With the change mentioned above I am OK with this patch. You can add my R-B when you send the official patch.

--
Himanshu Madhani	 Oracle Linux Engineering



* Re: [RFC] qla2xxx: Add dev_loss_tmo kernel module options
  2021-04-20 12:37   ` Daniel Wagner
  2021-04-20 14:51     ` Himanshu Madhani
@ 2021-04-20 15:35     ` Randy Dunlap
  1 sibling, 0 replies; 14+ messages in thread
From: Randy Dunlap @ 2021-04-20 15:35 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: linux-scsi, GR-QLogic-Storage-Upstream, linux-nvme,
	Hannes Reinecke, Nilesh Javali, Arun Easi

On 4/20/21 5:37 AM, Daniel Wagner wrote:
> Hi Randy,
> 
> On Mon, Apr 19, 2021 at 09:19:13AM -0700, Randy Dunlap wrote:
>>> +int ql2xdev_loss_tmo = 60;
>>> +module_param(ql2xdev_loss_tmo, int, 0444);
>>> +MODULE_PARM_DESC(ql2xdev_loss_tmo,
>>> +		"Time to wait for device to recover before reporting\n"
>>> +		"an error. Default is 60 seconds\n");
>>
>> No need for the \n in the quoted string. Just change it to a space.
> 
> I just followed the current style in this file. I guess this style
> question is up to the maintainers to decide what they want.

I see what you are saying, but the file has no consistency
of style in that regard. :)

oh well.
-- 
~Randy



* Re: [RFC] qla2xxx: Add dev_loss_tmo kernel module options
  2021-04-19 10:00 [RFC] qla2xxx: Add dev_loss_tmo kernel module options Daniel Wagner
  2021-04-19 16:19 ` Randy Dunlap
@ 2021-04-20 17:27 ` Benjamin Block
  2021-04-20 17:35 ` Roman Bolshakov
  2 siblings, 0 replies; 14+ messages in thread
From: Benjamin Block @ 2021-04-20 17:27 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: linux-scsi, GR-QLogic-Storage-Upstream, linux-nvme,
	Hannes Reinecke, Nilesh Javali, Arun Easi

On Mon, Apr 19, 2021 at 12:00:14PM +0200, Daniel Wagner wrote:
> Allow setting the default dev_loss_tmo value via a kernel module option.
> 
> Cc: Nilesh Javali <njavali@marvell.com>
> Cc: Arun Easi <aeasi@marvell.com>
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
> Hi,
> 
> During array upgrade tests with NVMe/FC on systems equipped with QLogic
> HBAs, we ran into a problem with the default setting of dev_loss_tmo.
> 
> When the default timeout hit after 60 seconds, the file system went
> into read-only mode. The fix was to set dev_loss_tmo to infinity
> (note that this patch cannot express infinity).
> 
> For lpfc devices we could use the sysfs interface under
> fc_remote_ports, which exposes dev_loss_tmo for both SCSI and NVMe
> rports.
> 
> The QLogic driver only exposes rports via fc_remote_ports when SCSI is
> used. There is a debugfs interface to set dev_loss_tmo, but it has two
> issues. First, it is not watched by udevd, hence no udev rules work.
> This could be worked around by setting the value statically, but that
> is really only an option for testing. Second, even when the debugfs
> interface is used, there is a bug in the code: in
> qla_nvme_register_remote() the value 0 is assigned to dev_loss_tmo, so
> the NVMe core falls back to its default of 60 seconds (this code path
> is exercised when the rport drops twice).
> 
> Anyway, this patch is mainly meant to get the discussion going. Maybe
> the driver could implement the fc_remote_ports interface? Hannes
> pointed out that it might make sense to think about a controller sysfs
> API, since there is already a host one and the NVMe protocol is all
> about hosts and controllers.
> 
> [...]
> 
> diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
> index d74c32f84ef5..c686522ff64e 100644
> --- a/drivers/scsi/qla2xxx/qla_os.c
> +++ b/drivers/scsi/qla2xxx/qla_os.c
> @@ -338,6 +338,11 @@ static void qla2x00_free_device(scsi_qla_host_t *);
>  static int qla2xxx_map_queues(struct Scsi_Host *shost);
>  static void qla2x00_destroy_deferred_work(struct qla_hw_data *);
>  
> +int ql2xdev_loss_tmo = 60;
> +module_param(ql2xdev_loss_tmo, int, 0444);
> +MODULE_PARM_DESC(ql2xdev_loss_tmo,
> +		"Time to wait for device to recover before reporting\n"
> +		"an error. Default is 60 seconds\n");

Wouldn't that be really confusing, setting essentially the same thing
with two different knobs for one FC HBA? We already have a
`dev_loss_tmo` kernel parameter - granted, only for scsi_transport_fc;
but doesn't qla implement that as well?
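
For reference, that transport-level default can be set like this, e.g.
(value illustrative; this assumes scsi_transport_fc is built as a
module - built in, it would be the scsi_transport_fc.dev_loss_tmo=
form on the kernel command line):

  modprobe scsi_transport_fc dev_loss_tmo=120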

I don't really have a horse in this race, but that sounds strange.


-- 
Best Regards, Benjamin Block  / Linux on IBM Z Kernel Development / IBM Systems
IBM Deutschland Research & Development GmbH    /    https://www.ibm.com/privacy
Vorsitz. AufsR.: Gregor Pillen         /        Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294


* Re: [RFC] qla2xxx: Add dev_loss_tmo kernel module options
  2021-04-19 10:00 [RFC] qla2xxx: Add dev_loss_tmo kernel module options Daniel Wagner
  2021-04-19 16:19 ` Randy Dunlap
  2021-04-20 17:27 ` Benjamin Block
@ 2021-04-20 17:35 ` Roman Bolshakov
  2021-04-20 18:28   ` Daniel Wagner
  2 siblings, 1 reply; 14+ messages in thread
From: Roman Bolshakov @ 2021-04-20 17:35 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: linux-scsi, GR-QLogic-Storage-Upstream, linux-nvme,
	Hannes Reinecke, Nilesh Javali, Arun Easi, James Smart

On Mon, Apr 19, 2021 at 12:00:14PM +0200, Daniel Wagner wrote:
> 
> The QLogic driver only exposes rports via fc_remote_ports when SCSI is
> used. There is a debugfs interface to set dev_loss_tmo, but it has two
> issues. First, it is not watched by udevd, hence no udev rules work.
> This could be worked around by setting the value statically, but that
> is really only an option for testing. Second, even when the debugfs
> interface is used, there is a bug in the code: in
> qla_nvme_register_remote() the value 0 is assigned to dev_loss_tmo, so
> the NVMe core falls back to its default of 60 seconds (this code path
> is exercised when the rport drops twice).
> 
> Anyway, this patch is mainly meant to get the discussion going. Maybe
> the driver could implement the fc_remote_ports interface? Hannes
> pointed out that it might make sense to think about a controller sysfs
> API, since there is already a host one and the NVMe protocol is all
> about hosts and controllers.
> 

Hi Daniel,

I agree that a proper fc_remote_ports interface is needed for NVMe in
qla2xxx.

I raised a similar concern when the dev_loss_tmo setting in debugfs was
added to the driver. Arun agreed the debugfs approach is a crutch, but
the discussion never took off beyond that:

https://patchwork.kernel.org/project/linux-scsi/patch/20200818123203.20361-4-njavali@marvell.com/#23553423

+ James S.

Daniel, WRT your patch, I don't think we should add yet another way to
set dev_loss_tmo (a kernel module parameter), as NVMe adopters are
going to be even more confused about the parameter. Just imagine
knowledge bases populated with all sorts of workarounds that apply to
kernel versions x, y, z, etc. :)

What exists for FCP/SCSI is quite clear and reasonable. I don't see why
FC-NVMe rports should be any different.

Thanks,
Roman


* Re: [RFC] qla2xxx: Add dev_loss_tmo kernel module options
  2021-04-20 17:35 ` Roman Bolshakov
@ 2021-04-20 18:28   ` Daniel Wagner
  2021-04-21  0:25     ` [EXT] " Arun Easi
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel Wagner @ 2021-04-20 18:28 UTC (permalink / raw)
  To: Roman Bolshakov
  Cc: linux-scsi, GR-QLogic-Storage-Upstream, linux-nvme,
	Hannes Reinecke, Nilesh Javali, Arun Easi, James Smart

Hi Roman,

On Tue, Apr 20, 2021 at 08:35:10PM +0300, Roman Bolshakov wrote:
> + James S.
> 
> Daniel, WRT your patch, I don't think we should add yet another way to
> set dev_loss_tmo (a kernel module parameter), as NVMe adopters are
> going to be even more confused about the parameter. Just imagine
> knowledge bases populated with all sorts of workarounds that apply to
> kernel versions x, y, z, etc. :)

Totally agree. I consider this patch just a hack and a way to get the
discussion going, hence the RFC :) Well, maybe we are going to add it
downstream in our kernels until we have a better way to set
dev_loss_tmo.

As explained, the debugfs interface is not working (okay, that is
something which could be fixed) and it has the big problem that it is
not under udevd's control. I am not sure whether debugfs could be
discovered automatically with some new udev rules.

> What exists for FCP/SCSI is quite clear and reasonable. I don't see why
> FC-NVMe rports should be any different.

The lpfc driver exposes the FCP/SCSI and the FC-NVMe rports nicely
via fc_remote_ports, and this is what I would like to have from the
qla2xxx driver as well. qla2xxx exposes the FCP/SCSI rports but not the
FC-NVMe rports.

Thanks,
Daniel


* Re: [EXT] Re: [RFC] qla2xxx: Add dev_loss_tmo kernel module options
  2021-04-20 18:28   ` Daniel Wagner
@ 2021-04-21  0:25     ` Arun Easi
  2021-04-21  7:56       ` Daniel Wagner
  2021-04-28 14:51       ` James Smart
  0 siblings, 2 replies; 14+ messages in thread
From: Arun Easi @ 2021-04-21  0:25 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: Roman Bolshakov, linux-scsi, GR-QLogic-Storage-Upstream,
	linux-nvme, Hannes Reinecke, Nilesh Javali, James Smart

Hi Daniel,

On Tue, 20 Apr 2021, 11:28am, Daniel Wagner wrote:

> Hi Roman,
> 
> On Tue, Apr 20, 2021 at 08:35:10PM +0300, Roman Bolshakov wrote:
> > + James S.
> > 
> > Daniel, WRT your patch, I don't think we should add yet another way to
> > set dev_loss_tmo (a kernel module parameter), as NVMe adopters are
> > going to be even more confused about the parameter. Just imagine
> > knowledge bases populated with all sorts of workarounds that apply to
> > kernel versions x, y, z, etc. :)
> 
> Totally agree. I consider this patch just a hack and a way to get the
> discussion going, hence the RFC :) Well, maybe we are going to add it
> downstream in our kernels until we have a better way to set
> dev_loss_tmo.
> 
> As explained, the debugfs interface is not working (okay, that is
> something which could be fixed) and it has the big problem that it is
> not under udevd's control. I am not sure whether debugfs could be
> discovered automatically with some new udev rules.

Curious, which udev script does this today for FC SCSI?

In theory, the existing fc nvmediscovery udev event has enough information
to find out the right qla2xxx debugfs node and set dev_loss_tmo.

> 
> > What exists for FCP/SCSI is quite clear and reasonable. I don't see why
> > FC-NVMe rports should be any different.
> 
> The lpfc driver exposes the FCP/SCSI and the FC-NVMe rports nicely
> via fc_remote_ports, and this is what I would like to have from the
> qla2xxx driver as well. qla2xxx exposes the FCP/SCSI rports but not the
> FC-NVMe rports.
> 

Given that FC-NVMe does not have a sysfs hierarchy like FC SCSI, I see
utility in making FC-NVMe ports available via fc_remote_ports. If,
though, an FC target port is dual-protocol aware, this would leave only
one knob to control both.

I think going with fc_remote_ports is better than introducing one more
way (like this patch) to set this.

Regards,
-Arun


* Re: [EXT] Re: [RFC] qla2xxx: Add dev_loss_tmo kernel module options
  2021-04-21  0:25     ` [EXT] " Arun Easi
@ 2021-04-21  7:56       ` Daniel Wagner
  2021-04-27  9:51         ` Daniel Wagner
  2021-04-28 14:51       ` James Smart
  1 sibling, 1 reply; 14+ messages in thread
From: Daniel Wagner @ 2021-04-21  7:56 UTC (permalink / raw)
  To: Arun Easi
  Cc: Roman Bolshakov, linux-scsi, GR-QLogic-Storage-Upstream,
	linux-nvme, Hannes Reinecke, Nilesh Javali, James Smart

Hi Arun,

On Tue, Apr 20, 2021 at 05:25:52PM -0700, Arun Easi wrote:
> On Tue, 20 Apr 2021, 11:28am, Daniel Wagner wrote:
> > As explained, the debugfs interface is not working (okay, that is
> > something which could be fixed) and it has the big problem that it is
> > not under udevd's control. I am not sure whether debugfs could be
> > discovered automatically with some new udev rules.
>
> Curious, which udev script does this today for FC SCSI?

I am currently figuring out the 'correct' settings for passing the
various tests our partner runs in their labs. There are no upstream
udev rules for this (yet).

Anyway, the settings are:

  ACTION!="add|change", GOTO="tmo_end"
  # SCSI fc_remote_ports
  KERNELS=="rport-?*", SUBSYSTEM=="fc_remote_ports", ATTR{fast_io_fail_tmo}="5", ATTR{dev_loss_tmo}="4294967295"
  # nvme fabrics
  KERNELS=="ctl", SUBSYSTEMS=="nvme-fabrics", ATTR{transport}=="fc", ATTR{fast_io_fail_tmo}="-1", ATTR{ctrl_loss_tmo}="-1"
  LABEL="tmo_end"

and this works for lpfc, but only the SCSI half works for qla2xxx.

> In theory, the existing fc nvmediscovery udev event has enough information
> to find out the right qla2xxx debugfs node and set dev_loss_tmo.

Ah, I didn't know about nvmediscovery until very recently. I will try
to get it working with this approach (as this patch is not really a
proper solution).

> > > What exists for FCP/SCSI is quite clear and reasonable. I don't see why
> > > FC-NVMe rports should be any different.
> >
> > The lpfc driver exposes the FCP/SCSI and the FC-NVMe rports nicely
> > via fc_remote_ports, and this is what I would like to have from the
> > qla2xxx driver as well. qla2xxx exposes the FCP/SCSI rports but not the
> > FC-NVMe rports.
> >
>
> Given that FC-NVMe does not have a sysfs hierarchy like FC SCSI, I see
> utility in making FC-NVMe ports available via fc_remote_ports. If,
> though, an FC target port is dual-protocol aware, this would leave only
> one knob to control both.

So far I haven't had the need to distinguish between the two protocols
for the dev_loss_tmo setting. I think this is what Hannes was also
trying to say: it might make sense to introduce sysfs APIs per
protocol.

> I think going with fc_remote_ports is better than introducing one more
> way (like this patch) to set this.

As I said, this patch was really an RFC. I will experiment with
nvmediscovery, though I think this is just a stopgap solution. Having
two completely different ways to configure the same thing is suboptimal
and is asking for a lot of trouble with end customers. I am really
hoping we can streamline the current APIs, so that we have only one
recommended way to configure the system, independent of the driver
involved.

Thanks,
Daniel


* Re: [EXT] Re: [RFC] qla2xxx: Add dev_loss_tmo kernel module options
  2021-04-21  7:56       ` Daniel Wagner
@ 2021-04-27  9:51         ` Daniel Wagner
  2021-04-27 22:35           ` Arun Easi
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel Wagner @ 2021-04-27  9:51 UTC (permalink / raw)
  To: Arun Easi
  Cc: Roman Bolshakov, linux-scsi, GR-QLogic-Storage-Upstream,
	linux-nvme, Hannes Reinecke, Nilesh Javali, James Smart

On Wed, Apr 21, 2021 at 09:56:59AM +0200, Daniel Wagner wrote:
> Ah, I didn't know about nvmediscovery until very recently. I will try
> to get it working with this approach (as this patch is not really a
> proper solution).

Finally found some time to play with this. FTR, the nvmediscovery
event carries the following information:

  UDEV  [65238.364677] change   /devices/virtual/fc/fc_udev_device (fc)
  ACTION=change
  DEVPATH=/devices/virtual/fc/fc_udev_device
  FC_EVENT=nvmediscovery
  NVMEFC_HOST_TRADDR=nn-0x20000024ff7fa448:pn-0x21000024ff7fa448
  NVMEFC_TRADDR=nn-0x200200a09890f5bf:pn-0x203800a09890f5bf
  SEQNUM=12357
  SUBSYSTEM=fc
  USEC_INITIALIZED=65238333374

The udev rule I came up with is:

  ACTION=="change", SUBSYSTEM=="fc", ENV{FC_EVENT}=="nvmediscovery", \
      ENV{NVMEFC_TRADDR}=="*", \
      RUN+="/usr/local/sbin/qla2xxx_dev_loss_tmo.sh $env{NVMEFC_TRADDR} 4294967295"

and the script is:

  #!/bin/sh

  TRADDR=$1
  TMO=$2

  id=$(echo $TRADDR | sed -n "s/.*pn-0x\([0-9a-f]\+\)/\1/p")
  find /sys/kernel/debug/qla2xxx -name pn-$id -exec /bin/sh -c "echo $TMO > {}/dev_loss_tmo" \;

I am sure this can be done in a more elegant way. Anyway, I am testing
this right now; the first 30 minutes look good...


* Re: [EXT] Re: [RFC] qla2xxx: Add dev_loss_tmo kernel module options
  2021-04-27  9:51         ` Daniel Wagner
@ 2021-04-27 22:35           ` Arun Easi
  2021-04-28  7:17             ` Daniel Wagner
  0 siblings, 1 reply; 14+ messages in thread
From: Arun Easi @ 2021-04-27 22:35 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: Roman Bolshakov, linux-scsi, GR-QLogic-Storage-Upstream,
	linux-nvme, Hannes Reinecke, Nilesh Javali, James Smart

On Tue, 27 Apr 2021, 2:51am, Daniel Wagner wrote:

> On Wed, Apr 21, 2021 at 09:56:59AM +0200, Daniel Wagner wrote:
> > Ah, I didn't know about nvmediscovery until very recently. I will try
> > to get it working with this approach (as this patch is not really a
> > proper solution).
> 
> Finally found some time to play with this. FTR, the nvmediscovery
> event carries the following information:
> 
>   UDEV  [65238.364677] change   /devices/virtual/fc/fc_udev_device (fc)
>   ACTION=change
>   DEVPATH=/devices/virtual/fc/fc_udev_device
>   FC_EVENT=nvmediscovery
>   NVMEFC_HOST_TRADDR=nn-0x20000024ff7fa448:pn-0x21000024ff7fa448
>   NVMEFC_TRADDR=nn-0x200200a09890f5bf:pn-0x203800a09890f5bf
>   SEQNUM=12357
>   SUBSYSTEM=fc
>   USEC_INITIALIZED=65238333374
> 
> The udev rule I came up with is:
> 
>   ACTION=="change", SUBSYSTEM=="fc", ENV{FC_EVENT}=="nvmediscovery", \
>       ENV{NVMEFC_TRADDR}=="*", \
>       RUN+="/usr/local/sbin/qla2xxx_dev_loss_tmo.sh $env{NVMEFC_TRADDR} 4294967295"
> 
> and the script is:
> 
>   #!/bin/sh
> 
>   TRADDR=$1
>   TMO=$2
> 
>   id=$(echo $TRADDR | sed -n "s/.*pn-0x\([0-9a-f]\+\)/\1/p")
>   find /sys/kernel/debug/qla2xxx -name pn-$id -exec /bin/sh -c "echo $TMO > {}/dev_loss_tmo" \;
> 
> I am sure this can be done in a more elegant way. Anyway, I am testing
> this right now; the first 30 minutes look good...
> 

Looks OK to me. Just keep in mind that with this you'd be setting all
instances of pn-XXX (the case where multiple initiator ports see the
same target). It looks like this is what you want, but I thought I'd
point it out; a narrower variant is sketched below.
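
If you ever want to restrict this to a single initiator port, the event
also carries NVMEFC_HOST_TRADDR, so something like the sketch below
could work. It is untested, and it assumes the pn-<target> nodes sit
below a path that contains the initiator's pn-<wwpn>, which may not
hold on every kernel version:

  #!/bin/sh
  # args: NVMEFC_HOST_TRADDR NVMEFC_TRADDR timeout
  HOST_TRADDR=$1
  TRADDR=$2
  TMO=$3

  host_id=$(echo $HOST_TRADDR | sed -n "s/.*pn-0x\([0-9a-f]\+\)/\1/p")
  id=$(echo $TRADDR | sed -n "s/.*pn-0x\([0-9a-f]\+\)/\1/p")

  # only touch pn-<target> nodes under the matching initiator port
  find /sys/kernel/debug/qla2xxx -path "*pn-$host_id*" -name pn-$id \
      -exec /bin/sh -c "echo $TMO > {}/dev_loss_tmo" \;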

Regards,
-Arun


* Re: [EXT] Re: [RFC] qla2xxx: Add dev_loss_tmo kernel module options
  2021-04-27 22:35           ` Arun Easi
@ 2021-04-28  7:17             ` Daniel Wagner
  0 siblings, 0 replies; 14+ messages in thread
From: Daniel Wagner @ 2021-04-28  7:17 UTC (permalink / raw)
  To: Arun Easi
  Cc: Roman Bolshakov, linux-scsi, GR-QLogic-Storage-Upstream,
	linux-nvme, Hannes Reinecke, Nilesh Javali, James Smart

Hi Arun,

On Tue, Apr 27, 2021 at 03:35:47PM -0700, Arun Easi wrote:
> > I am sure this can be done in a more elegant way. Anyway, I am testing
> > this right now; the first 30 minutes look good...
> > 
> 
> Looks OK to me. Just keep in mind that with this you'd be setting all
> instances of pn-XXX (the case where multiple initiator ports see the
> same target). It looks like this is what you want, but I thought I'd
> point it out.

Good point. Yes, that was the plan: set all ports to the same value.
BTW, an 8-hour port toggle test passed. With this setup we don't need
to add dirty patches downstream.

Thanks,
Daniel



* Re: [EXT] Re: [RFC] qla2xxx: Add dev_loss_tmo kernel module options
  2021-04-21  0:25     ` [EXT] " Arun Easi
  2021-04-21  7:56       ` Daniel Wagner
@ 2021-04-28 14:51       ` James Smart
  1 sibling, 0 replies; 14+ messages in thread
From: James Smart @ 2021-04-28 14:51 UTC (permalink / raw)
  To: Arun Easi, Daniel Wagner
  Cc: Roman Bolshakov, linux-scsi, GR-QLogic-Storage-Upstream,
	linux-nvme, Hannes Reinecke, Nilesh Javali, James Smart

On 4/20/2021 5:25 PM, Arun Easi wrote:
> Hi Daniel,
> 
> On Tue, 20 Apr 2021, 11:28am, Daniel Wagner wrote:
> 
>> Hi Roman,
>>
>> On Tue, Apr 20, 2021 at 08:35:10PM +0300, Roman Bolshakov wrote:
>>> + James S.
>>>
>>> Daniel, WRT your patch, I don't think we should add yet another way to
>>> set dev_loss_tmo (a kernel module parameter), as NVMe adopters are
>>> going to be even more confused about the parameter. Just imagine
>>> knowledge bases populated with all sorts of workarounds that apply to
>>> kernel versions x, y, z, etc. :)
>>
>> Totally agree. I consider this patch just a hack and a way to get the
>> discussion going, hence the RFC :) Well, maybe we are going to add it
>> downstream in our kernels until we have a better way to set
>> dev_loss_tmo.
>>
>> As explained, the debugfs interface is not working (okay, that is
>> something which could be fixed) and it has the big problem that it is
>> not under udevd's control. I am not sure whether debugfs could be
>> discovered automatically with some new udev rules.
> 
> Curious, which udev script does this today for FC SCSI?
> 
> In theory, the existing fc nvmediscovery udev event has enough information
> to find out the right qla2xxx debugfs node and set dev_loss_tmo.
> 
>>
>>> What exists for FCP/SCSI is quite clear and reasonable. I don't see why
>>> FC-NVMe rports should be any different.
>>
>> The lpfc driver exposes the FCP/SCSI and the FC-NVMe rports nicely
>> via fc_remote_ports, and this is what I would like to have from the
>> qla2xxx driver as well. qla2xxx exposes the FCP/SCSI rports but not the
>> FC-NVMe rports.
>>
> 
> Given that FC-NVMe does not have a sysfs hierarchy like FC SCSI, I see
> utility in making FC-NVMe ports available via fc_remote_ports. If,
> though, an FC target port is dual-protocol aware, this would leave only
> one knob to control both.
> 
> I think going with fc_remote_ports is better than introducing one more
> way (like this patch) to set this.
> 
> Regards,
> -Arun

Thanks Arun,

I think we're all better off if qla exports the NVMe nodes via the
scsi-side fc_remote_ports. In the end we will commonize an FC transport
that sits above both SCSI and NVMe and will definitely be compatible
with what's there. The registration with SCSI was rather
straightforward for lpfc, so I assume it will be for qla as well, and
the devloss interface - although it's kludgy to have the driver
propagate the SCSI callback to NVMe - also isn't much work.

I also don't think we want to keep creating new management points;
it's already ugly enough.

-- james



Thread overview: 14+ messages
2021-04-19 10:00 [RFC] qla2xxx: Add dev_loss_tmo kernel module options Daniel Wagner
2021-04-19 16:19 ` Randy Dunlap
2021-04-20 12:37   ` Daniel Wagner
2021-04-20 14:51     ` Himanshu Madhani
2021-04-20 15:35     ` Randy Dunlap
2021-04-20 17:27 ` Benjamin Block
2021-04-20 17:35 ` Roman Bolshakov
2021-04-20 18:28   ` Daniel Wagner
2021-04-21  0:25     ` [EXT] " Arun Easi
2021-04-21  7:56       ` Daniel Wagner
2021-04-27  9:51         ` Daniel Wagner
2021-04-27 22:35           ` Arun Easi
2021-04-28  7:17             ` Daniel Wagner
2021-04-28 14:51       ` James Smart
