All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] convert libsas to the libata new eh
@ 2011-01-23 15:41 James Bottomley
  2011-01-23 15:42 ` [PATCH 1/2] libata: separate error handler into usable components James Bottomley
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: James Bottomley @ 2011-01-23 15:41 UTC (permalink / raw)
  To: linux-scsi, linux-ide; +Cc: Brian King, Robert Jennings

The bad news is that this is a pretty horrific conversion.  Because of
the amount of control the new eh wants (and the slew of monolithic
assumptions it makes), the entire error handler routine has to be sliced
apart and sewn back together to make it callable as part of an existing
error handler (rather than trying to do everything on its own).

The even worse news is that unless you have an existing strategy
handler, you can't make use of this.  That means that ipr is
unconvertable in its current form.  Given the two really nasty options:
give ipr its own error handler via its own transport class, or attach
ipr to libsas and let it do the work, I'd choose the latter.
Unfortunately, my power box with IPR has been non-functional for a year
(and all pleas to IBM to fix it have so far gone unanswered), so I can't
really do this myself.

Once we carve up libata, it's still a rather complex call sequence for
libsas because of libata's annoying insistance on calling the error
handler at the drop of a hat.  The principle differentiator used is the
fact that if a SCSI command no longer has an associated SAS task, (i.e.
libsas thinks it's complete) it must be destined for libata.  This works
correctly in all the ATAPI sense processing instances I've tested.
Finally, we still have to call port recovery.  Currently, it's
impossible to tell which port libata thinks is in error (libata knows
because it has one host per port, but libsas doesn't), so the final
solution is just to call port recovery for every SATA/STP attached port.

I've tested all of this with mvsas over expander attached ATA devices
using STP (aic94xx seems to have developed some handling fault for STP,
which I'll look into next).

James

---

James Bottomley (2):
  libata: separate error handler into usable components
  libsas: convert to libata new error handler

 drivers/ata/libata-eh.c             |   53 ++++++++++++++++++----
 drivers/scsi/libsas/sas_ata.c       |   87 ++++++++++++++++++++++++++++++++--
 drivers/scsi/libsas/sas_scsi_host.c |   14 +++++-
 include/linux/libata.h              |    2 +
 include/scsi/sas_ata.h              |   22 +++++++++
 5 files changed, 161 insertions(+), 17 deletions(-)



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/2] libata: separate error handler into usable components
  2011-01-23 15:41 [PATCH 0/2] convert libsas to the libata new eh James Bottomley
@ 2011-01-23 15:42 ` James Bottomley
  2011-01-23 15:44 ` [PATCH 2/2] libsas: convert to libata new error handler James Bottomley
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: James Bottomley @ 2011-01-23 15:42 UTC (permalink / raw)
  To: linux-scsi; +Cc: linux-ide, Brian King, Robert Jennings

Right at the moment, the libata error handler is incredibly
monolithic.  This makes it impossible to use from composite drivers
like libsas and ipr which have to handle error themselves in the first
instance.

The essence of the change is to split the monolithic error handler
into two components: one which handles a queue of ata commands for
processing and the other which handles the back end of readying a
port.  This allows the upper error handler fine grained control in
calling libsas functions (and making sure they only get called for ATA
commands whose lower errors have been fixed up).

Signed-off-by: James Bottomley <James.Bottomley@suse.de>
---
 drivers/ata/libata-eh.c |   53 +++++++++++++++++++++++++++++++++++++++--------
 include/linux/libata.h  |    2 +
 2 files changed, 46 insertions(+), 9 deletions(-)

diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index fc3f339..e02455b 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -587,7 +587,6 @@ static void ata_eh_unload(struct ata_port *ap)
 void ata_scsi_error(struct Scsi_Host *host)
 {
 	struct ata_port *ap = ata_shost_to_port(host);
-	int i;
 	unsigned long flags;
 	LIST_HEAD(eh_work_q);
 
@@ -597,6 +596,34 @@ void ata_scsi_error(struct Scsi_Host *host)
 	list_splice_init(&host->eh_cmd_q, &eh_work_q);
 	spin_unlock_irqrestore(host->host_lock, flags);
 
+	ata_scsi_cmd_error_handler(host, ap, &eh_work_q);
+
+	/* If we timed raced normal completion and there is nothing to
+	   recover nr_timedout == 0 why exactly are we doing error recovery ? */
+	ata_scsi_port_error_handler(host, ap);
+
+	/* finish or retry handled scmd's and clean up */
+	WARN_ON(host->host_failed || !list_empty(&eh_work_q));
+
+	DPRINTK("EXIT\n");
+}
+
+/**
+ * ata_scsi_cmd_error_handler - error callback for a list of commands
+ * @host:	scsi host containing the port
+ * @ap:		ATA port within the host
+ * @eh_work_q:	list of commands to process
+ *
+ * process the given list of commands and return those finished to the
+ * ap->eh_done_q.  This function is the first part of the libata error
+ * handler which processes a given list of failed commands.
+ */
+void ata_scsi_cmd_error_handler(struct Scsi_Host *host, struct ata_port *ap,
+				struct list_head *eh_work_q)
+{
+	int i;
+	unsigned long flags;
+
 	/* make sure sff pio task is not running */
 	ata_sff_flush_pio_task(ap);
 
@@ -632,7 +659,7 @@ void ata_scsi_error(struct Scsi_Host *host)
 		if (ap->ops->lost_interrupt)
 			ap->ops->lost_interrupt(ap);
 
-		list_for_each_entry_safe(scmd, tmp, &eh_work_q, eh_entry) {
+		list_for_each_entry_safe(scmd, tmp, eh_work_q, eh_entry) {
 			struct ata_queued_cmd *qc;
 
 			for (i = 0; i < ATA_MAX_QUEUE; i++) {
@@ -676,8 +703,20 @@ void ata_scsi_error(struct Scsi_Host *host)
 	} else
 		spin_unlock_wait(ap->lock);
 
-	/* If we timed raced normal completion and there is nothing to
-	   recover nr_timedout == 0 why exactly are we doing error recovery ? */
+}
+EXPORT_SYMBOL(ata_scsi_cmd_error_handler);
+
+/**
+ * ata_scsi_port_error_handler - recover the port after the commands
+ * @host:	SCSI host containing the port
+ * @ap:		the ATA port
+ *
+ * Handle the recovery of the port @ap after all the commands
+ * have been recovered.
+ */
+void ata_scsi_port_error_handler(struct Scsi_Host *host, struct ata_port *ap)
+{
+	unsigned long flags;
 
 	/* invoke error handler */
 	if (ap->ops->error_handler) {
@@ -766,9 +805,6 @@ void ata_scsi_error(struct Scsi_Host *host)
 		ap->ops->eng_timeout(ap);
 	}
 
-	/* finish or retry handled scmd's and clean up */
-	WARN_ON(host->host_failed || !list_empty(&eh_work_q));
-
 	scsi_eh_flush_done_q(&ap->eh_done_q);
 
 	/* clean up */
@@ -789,9 +825,8 @@ void ata_scsi_error(struct Scsi_Host *host)
 	wake_up_all(&ap->eh_wait_q);
 
 	spin_unlock_irqrestore(ap->lock, flags);
-
-	DPRINTK("EXIT\n");
 }
+EXPORT_SYMBOL_GPL(ata_scsi_port_error_handler);
 
 /**
  *	ata_port_wait_eh - Wait for the currently pending EH to complete
diff --git a/include/linux/libata.h b/include/linux/libata.h
index c9c5d7a..9739317 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -1050,6 +1050,8 @@ extern int ata_scsi_change_queue_depth(struct scsi_device *sdev,
 				       int queue_depth, int reason);
 extern struct ata_device *ata_dev_pair(struct ata_device *adev);
 extern int ata_do_set_mode(struct ata_link *link, struct ata_device **r_failed_dev);
+extern void ata_scsi_port_error_handler(struct Scsi_Host *host, struct ata_port *ap);
+extern void ata_scsi_cmd_error_handler(struct Scsi_Host *host, struct ata_port *ap, struct list_head *eh_q);
 
 extern int ata_cable_40wire(struct ata_port *ap);
 extern int ata_cable_80wire(struct ata_port *ap);
-- 
1.7.2.3




^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/2] libsas: convert to libata new error handler
  2011-01-23 15:41 [PATCH 0/2] convert libsas to the libata new eh James Bottomley
  2011-01-23 15:42 ` [PATCH 1/2] libata: separate error handler into usable components James Bottomley
@ 2011-01-23 15:44 ` James Bottomley
  2011-01-24 21:12 ` [PATCH 0/2] convert libsas to the libata new eh Jeff Garzik
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: James Bottomley @ 2011-01-23 15:44 UTC (permalink / raw)
  To: linux-scsi; +Cc: linux-ide, Brian King, Robert Jennings

The conversion is quite complex given that the libata new error
handler has to be hooked into the current libsas timeout and error
handling.  The way this is done is to process all the failed commands
via libsas first, but if they have no underlying sas task (and they're
on a sata device) assume they are destined for the libata error
handler and send them accordingly.

Finally, activate the port recovery of the libata error handler for
each port known to the host.  This is somewhat suboptimal, since that
port may not need recovering, but given the current architecture of
the libata error handler, it's the only way; and the spurious
activation is harmless.

Signed-off-by: James Bottomley <James.Bottomley@suse.de>
---
 drivers/scsi/libsas/sas_ata.c       |   87 ++++++++++++++++++++++++++++++++--
 drivers/scsi/libsas/sas_scsi_host.c |   14 +++++-
 include/scsi/sas_ata.h              |   22 +++++++++
 3 files changed, 115 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index 571603d..6aa4a79 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -240,37 +240,43 @@ static bool sas_ata_qc_fill_rtf(struct ata_queued_cmd *qc)
 	return true;
 }
 
-static void sas_ata_phy_reset(struct ata_port *ap)
+static int sas_ata_hard_reset(struct ata_link *link, unsigned int *class,
+			       unsigned long deadline)
 {
+	struct ata_port *ap = link->ap;
 	struct domain_device *dev = ap->private_data;
 	struct sas_internal *i =
 		to_sas_internal(dev->port->ha->core.shost->transportt);
 	int res = TMF_RESP_FUNC_FAILED;
+	int ret = 0;
 
 	if (i->dft->lldd_I_T_nexus_reset)
 		res = i->dft->lldd_I_T_nexus_reset(dev);
 
-	if (res != TMF_RESP_FUNC_COMPLETE)
+	if (res != TMF_RESP_FUNC_COMPLETE) {
 		SAS_DPRINTK("%s: Unable to reset I T nexus?\n", __func__);
+		ret = -EAGAIN;
+	}
 
 	switch (dev->sata_dev.command_set) {
 		case ATA_COMMAND_SET:
 			SAS_DPRINTK("%s: Found ATA device.\n", __func__);
-			ap->link.device[0].class = ATA_DEV_ATA;
+			*class = ATA_DEV_ATA;
 			break;
 		case ATAPI_COMMAND_SET:
 			SAS_DPRINTK("%s: Found ATAPI device.\n", __func__);
-			ap->link.device[0].class = ATA_DEV_ATAPI;
+			*class = ATA_DEV_ATAPI;
 			break;
 		default:
 			SAS_DPRINTK("%s: Unknown SATA command set: %d.\n",
 				    __func__,
 				    dev->sata_dev.command_set);
-			ap->link.device[0].class = ATA_DEV_UNKNOWN;
+			*class = ATA_DEV_UNKNOWN;
 			break;
 	}
 
 	ap->cbl = ATA_CBL_SATA;
+	return ret;
 }
 
 static void sas_ata_post_internal(struct ata_queued_cmd *qc)
@@ -351,7 +357,11 @@ static int sas_ata_scr_read(struct ata_link *link, unsigned int sc_reg_in,
 }
 
 static struct ata_port_operations sas_sata_ops = {
-	.phy_reset		= sas_ata_phy_reset,
+	.prereset		= ata_std_prereset,
+	.softreset		= NULL,
+	.hardreset		= sas_ata_hard_reset,
+	.postreset		= ata_std_postreset,
+	.error_handler		= ata_std_error_handler,
 	.post_internal_cmd	= sas_ata_post_internal,
 	.qc_defer               = ata_std_qc_defer,
 	.qc_prep		= ata_noop_qc_prep,
@@ -783,3 +793,68 @@ int sas_discover_sata(struct domain_device *dev)
 
 	return res;
 }
+
+void sas_ata_strategy_handler(struct Scsi_Host *shost)
+{
+	struct scsi_device *sdev;
+
+	shost_for_each_device(sdev, shost) {
+		struct domain_device *ddev = sdev_to_domain_dev(sdev);
+		struct ata_port *ap = ddev->sata_dev.ap;
+
+		if (!dev_is_sata(ddev))
+			continue;
+		
+		ata_port_printk(ap, KERN_DEBUG, "sas eh calling libata port error handler");
+		ata_scsi_port_error_handler(shost, ap);
+	}
+}
+
+int sas_ata_timed_out(struct scsi_cmnd *cmd, struct sas_task *task,
+		      enum blk_eh_timer_return *rtn)
+{
+	struct domain_device *ddev = cmd_to_domain_dev(cmd);
+
+	if (!dev_is_sata(ddev) || task)
+		return 0;
+
+	/* we're a sata device with no task, so this must be a libata
+	 * eh timeout.  Ideally should hook into libata timeout
+	 * handling, but there's no point, it just wants to activate
+	 * the eh thread */
+	*rtn = BLK_EH_NOT_HANDLED;
+	return 1;
+}
+
+int sas_ata_eh(struct Scsi_Host *shost, struct list_head *work_q,
+	       struct list_head *done_q)
+{
+	int rtn = 0;
+	struct scsi_cmnd *cmd, *n;
+	struct ata_port *ap;
+
+	do {
+		LIST_HEAD(sata_q);
+
+		ap = NULL;
+		
+		list_for_each_entry_safe(cmd, n, work_q, eh_entry) {
+			struct domain_device *ddev = cmd_to_domain_dev(cmd);
+
+			if (!dev_is_sata(ddev) || TO_SAS_TASK(cmd))
+				continue;
+			if(ap && ap != ddev->sata_dev.ap)
+				continue;
+			ap = ddev->sata_dev.ap;
+			rtn = 1;
+			list_move(&cmd->eh_entry, &sata_q);
+		}
+
+		if (!list_empty(&sata_q)) {
+			ata_port_printk(ap, KERN_DEBUG,"sas eh calling libata cmd error handler\n");
+			ata_scsi_cmd_error_handler(shost, ap, &sata_q);
+		}
+	} while (ap);
+
+	return rtn;
+}
diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
index 9a7aaf5..67758ea 100644
--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -663,11 +663,16 @@ void sas_scsi_recover_host(struct Scsi_Host *shost)
 	 * scsi_unjam_host does, but we skip scsi_eh_abort_cmds because any
 	 * command we see here has no sas_task and is thus unknown to the HA.
 	 */
-	if (!scsi_eh_get_sense(&eh_work_q, &ha->eh_done_q))
-		scsi_eh_ready_devs(shost, &eh_work_q, &ha->eh_done_q);
+	if (!sas_ata_eh(shost, &eh_work_q, &ha->eh_done_q))
+		if (!scsi_eh_get_sense(&eh_work_q, &ha->eh_done_q))
+			scsi_eh_ready_devs(shost, &eh_work_q, &ha->eh_done_q);
 
 out:
+	/* now link into libata eh --- if we have any ata devices */
+	sas_ata_strategy_handler(shost);
+
 	scsi_eh_flush_done_q(&ha->eh_done_q);
+
 	SAS_DPRINTK("--- Exit %s\n", __func__);
 	return;
 }
@@ -676,6 +681,11 @@ enum blk_eh_timer_return sas_scsi_timed_out(struct scsi_cmnd *cmd)
 {
 	struct sas_task *task = TO_SAS_TASK(cmd);
 	unsigned long flags;
+	enum blk_eh_timer_return rtn; 
+
+	if (sas_ata_timed_out(cmd, task, &rtn))
+		return rtn;
+	
 
 	if (!task) {
 		cmd->request->timeout /= 2;
diff --git a/include/scsi/sas_ata.h b/include/scsi/sas_ata.h
index c583193..9c159f7 100644
--- a/include/scsi/sas_ata.h
+++ b/include/scsi/sas_ata.h
@@ -39,6 +39,11 @@ int sas_ata_init_host_and_port(struct domain_device *found_dev,
 			       struct scsi_target *starget);
 
 void sas_ata_task_abort(struct sas_task *task);
+void sas_ata_strategy_handler(struct Scsi_Host *shost);
+int sas_ata_timed_out(struct scsi_cmnd *cmd, struct sas_task *task,
+		      enum blk_eh_timer_return *rtn);
+int sas_ata_eh(struct Scsi_Host *shost, struct list_head *work_q,
+	       struct list_head *done_q);
 
 #else
 
@@ -55,6 +60,23 @@ static inline int sas_ata_init_host_and_port(struct domain_device *found_dev,
 static inline void sas_ata_task_abort(struct sas_task *task)
 {
 }
+
+static inline void sas_ata_strategy_handler(struct Scsi_Host *shost)
+{
+}
+
+static inline int sas_ata_timed_out(struct scsi_cmnd *cmd,
+				    struct sas_task *task,
+				    enum blk_eh_timer_return *rtn)
+{
+	return 0;
+}
+static inline int sas_ata_eh(struct Scsi_Host *shost, struct list_head *work_q,
+			     struct list_head *done_q)
+{
+	return 0;
+}
+
 #endif
 
 #endif /* _SAS_ATA_H_ */
-- 
1.7.2.3




^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/2] convert libsas to the libata new eh
  2011-01-23 15:41 [PATCH 0/2] convert libsas to the libata new eh James Bottomley
  2011-01-23 15:42 ` [PATCH 1/2] libata: separate error handler into usable components James Bottomley
  2011-01-23 15:44 ` [PATCH 2/2] libsas: convert to libata new error handler James Bottomley
@ 2011-01-24 21:12 ` Jeff Garzik
  2011-01-24 21:16   ` James Bottomley
  2011-01-25 18:56 ` Brian King
  2011-02-13 19:18 ` James Bottomley
  4 siblings, 1 reply; 12+ messages in thread
From: Jeff Garzik @ 2011-01-24 21:12 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-scsi, linux-ide, Brian King, Robert Jennings

About to dive into this.  Thanks for tackling the new-EH conversion, 
libsas is the last straggler in this regard.  As you noted, "old-EH" is 
now a synonym for "SAS EH code path".

	Jeff




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/2] convert libsas to the libata new eh
  2011-01-24 21:12 ` [PATCH 0/2] convert libsas to the libata new eh Jeff Garzik
@ 2011-01-24 21:16   ` James Bottomley
  0 siblings, 0 replies; 12+ messages in thread
From: James Bottomley @ 2011-01-24 21:16 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: linux-scsi, linux-ide, Brian King, Robert Jennings

On Mon, 2011-01-24 at 16:12 -0500, Jeff Garzik wrote:
> About to dive into this.

You're welcome.

>   Thanks for tackling the new-EH conversion, 
> libsas is the last straggler in this regard.

No it's not ... there's still ipr (and any other external consumer).
In it's current form, the new eh simply wasn't usable externally.  As I
said for ipr in the intro blurb, I can't see a way to convert it without
making it attach fully to libsas, although an alternative might be to
get the raid transport class to proxy.

>   As you noted, "old-EH" is 
> now a synonym for "SAS EH code path".

Say more "external EH code path".

James



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/2] convert libsas to the libata new eh
  2011-01-23 15:41 [PATCH 0/2] convert libsas to the libata new eh James Bottomley
                   ` (2 preceding siblings ...)
  2011-01-24 21:12 ` [PATCH 0/2] convert libsas to the libata new eh Jeff Garzik
@ 2011-01-25 18:56 ` Brian King
  2011-01-26 14:08   ` James Bottomley
  2011-02-13 19:18 ` James Bottomley
  4 siblings, 1 reply; 12+ messages in thread
From: Brian King @ 2011-01-25 18:56 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-scsi, linux-ide, Robert Jennings, Wayne Boyer

On 01/23/2011 09:41 AM, James Bottomley wrote:
> The bad news is that this is a pretty horrific conversion.  Because of
> the amount of control the new eh wants (and the slew of monolithic
> assumptions it makes), the entire error handler routine has to be sliced
> apart and sewn back together to make it callable as part of an existing
> error handler (rather than trying to do everything on its own).
> 
> The even worse news is that unless you have an existing strategy
> handler, you can't make use of this.  That means that ipr is
> unconvertable in its current form.  Given the two really nasty options:
> give ipr its own error handler via its own transport class, or attach
> ipr to libsas and let it do the work, I'd choose the latter.

I've got a patch set that's been sitting collecting dust for quite some time
that implements yet another option. The basic idea is that we move away from
the idea of having a single scsi_host for the entire SAS adapter, but
rather have one for the SAS devices and one for each SATA rphy.

This allows us to use libata's EH pretty much as is, but does introduce
some issues to solve:

1. Adapter queue depth. Since we have multiple scsi_hosts for a single
   HBA, we lose the adapter queue depth tracking done by scsi core.
   Anything in block that can help us here?
2. Locking gets messy since we have a lock for each sata rphy as well
   as one for the sas scsi_host.

There were some other smaller issues as well. I liked the idea of moving
to multiple scsi_host's since that eliminates the issue of userspace polling an
empty CD drive causing EH to get invoked and quiescing all SAS I/O as well,
but it does result in potentially a lot of scsi_hosts getting created.

-Brian


-- 
Brian King
Linux on Power Virtualization
IBM Linux Technology Center



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/2] convert libsas to the libata new eh
  2011-01-25 18:56 ` Brian King
@ 2011-01-26 14:08   ` James Bottomley
  2011-01-26 17:46     ` Jeff Garzik
  0 siblings, 1 reply; 12+ messages in thread
From: James Bottomley @ 2011-01-26 14:08 UTC (permalink / raw)
  To: Brian King; +Cc: linux-scsi, linux-ide, Robert Jennings, Wayne Boyer

On Tue, 2011-01-25 at 12:56 -0600, Brian King wrote:
> On 01/23/2011 09:41 AM, James Bottomley wrote:
> > The bad news is that this is a pretty horrific conversion.  Because of
> > the amount of control the new eh wants (and the slew of monolithic
> > assumptions it makes), the entire error handler routine has to be sliced
> > apart and sewn back together to make it callable as part of an existing
> > error handler (rather than trying to do everything on its own).
> > 
> > The even worse news is that unless you have an existing strategy
> > handler, you can't make use of this.  That means that ipr is
> > unconvertable in its current form.  Given the two really nasty options:
> > give ipr its own error handler via its own transport class, or attach
> > ipr to libsas and let it do the work, I'd choose the latter.
> 
> I've got a patch set that's been sitting collecting dust for quite some time
> that implements yet another option. The basic idea is that we move away from
> the idea of having a single scsi_host for the entire SAS adapter, but
> rather have one for the SAS devices and one for each SATA rphy.

So, as I said already, I think this is a pretty nasty hack.  A host
represents a single attachment to the bus (usually a card).  We have a
lot of logic that crawls up the device tree looking for the host.  If we
have multiple hosts in the hierarchy, that logic will always find the
dummy one.

> This allows us to use libata's EH pretty much as is, but does introduce
> some issues to solve:
> 
> 1. Adapter queue depth. Since we have multiple scsi_hosts for a single
>    HBA, we lose the adapter queue depth tracking done by scsi core.
>    Anything in block that can help us here?
> 2. Locking gets messy since we have a lock for each sata rphy as well
>    as one for the sas scsi_host.
> 
> There were some other smaller issues as well. I liked the idea of moving
> to multiple scsi_host's since that eliminates the issue of userspace polling an
> empty CD drive causing EH to get invoked and quiescing all SAS I/O as well,
> but it does result in potentially a lot of scsi_hosts getting created.

I think better might be to move the eh thread functionality into block.
What libata really wants is one eh thread per phy.  It gets this by
completely abusing the idea of a host.  There's actually good reasons
why most of our modern HBAs would like one eh per port ... which would
amount to per-phy on ata which can't do wide ports.  If we allowed the
driver to determine the correct level for this, and block to provide the
machinery, everyone should get what they want, and libata could dispense
with its current corruption of multiple SCSI hosts per physical bus
attachment.

James



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/2] convert libsas to the libata new eh
  2011-01-26 14:08   ` James Bottomley
@ 2011-01-26 17:46     ` Jeff Garzik
  2011-01-26 17:55       ` James Bottomley
  0 siblings, 1 reply; 12+ messages in thread
From: Jeff Garzik @ 2011-01-26 17:46 UTC (permalink / raw)
  To: James Bottomley
  Cc: Brian King, linux-scsi, linux-ide, Robert Jennings, Wayne Boyer

On 01/26/2011 09:08 AM, James Bottomley wrote:
> I think better might be to move the eh thread functionality into block.
> What libata really wants is one eh thread per phy.  It gets this by
[...]
> machinery, everyone should get what they want, and libata could dispense
> with its current corruption of multiple SCSI hosts per physical bus
> attachment.

That's not the whole picture.  libata originally chose one-host-per-port 
because that winds up being the best queueing/queue-busy arrangement for 
legacy IDE ports.  They behave quite similarly to independent 
controllers -- even to the point of having separate irqs -- even though 
multiple IDE ports are shoehorned into a single PCI device.

	Jeff



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/2] convert libsas to the libata new eh
  2011-01-26 17:46     ` Jeff Garzik
@ 2011-01-26 17:55       ` James Bottomley
  0 siblings, 0 replies; 12+ messages in thread
From: James Bottomley @ 2011-01-26 17:55 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Brian King, linux-scsi, linux-ide, Robert Jennings, Wayne Boyer

On Wed, 2011-01-26 at 12:46 -0500, Jeff Garzik wrote:
> On 01/26/2011 09:08 AM, James Bottomley wrote:
> > I think better might be to move the eh thread functionality into block.
> > What libata really wants is one eh thread per phy.  It gets this by
> [...]
> > machinery, everyone should get what they want, and libata could dispense
> > with its current corruption of multiple SCSI hosts per physical bus
> > attachment.
> 
> That's not the whole picture.  libata originally chose one-host-per-port 
> because that winds up being the best queueing/queue-busy arrangement for 
> legacy IDE ports.

Right, but we count things at the lun, the target and the host.  For a
two level queue setup, you could use the target.

>   They behave quite similarly to independent 
> controllers -- even to the point of having separate irqs -- even though 
> multiple IDE ports are shoehorned into a single PCI device.

So interrupts don't really have much to do with the host, that's all
part of driver setup.  IDE is really just using the two lines as a
simple completion signal ... much like the current multi queue cards do
(which also have multiple interrupts per host).

James



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/2] convert libsas to the libata new eh
  2011-01-23 15:41 [PATCH 0/2] convert libsas to the libata new eh James Bottomley
                   ` (3 preceding siblings ...)
  2011-01-25 18:56 ` Brian King
@ 2011-02-13 19:18 ` James Bottomley
  2011-02-15  7:03   ` Jeff Garzik
  4 siblings, 1 reply; 12+ messages in thread
From: James Bottomley @ 2011-02-13 19:18 UTC (permalink / raw)
  To: linux-scsi; +Cc: linux-ide, Brian King, Robert Jennings

On Sun, 2011-01-23 at 09:41 -0600, James Bottomley wrote:
> The bad news is that this is a pretty horrific conversion.  Because of
> the amount of control the new eh wants (and the slew of monolithic
> assumptions it makes), the entire error handler routine has to be sliced
> apart and sewn back together to make it callable as part of an existing
> error handler (rather than trying to do everything on its own).
> 
> The even worse news is that unless you have an existing strategy
> handler, you can't make use of this.  That means that ipr is
> unconvertable in its current form.  Given the two really nasty options:
> give ipr its own error handler via its own transport class, or attach
> ipr to libsas and let it do the work, I'd choose the latter.
> Unfortunately, my power box with IPR has been non-functional for a year
> (and all pleas to IBM to fix it have so far gone unanswered), so I can't
> really do this myself.
> 
> Once we carve up libata, it's still a rather complex call sequence for
> libsas because of libata's annoying insistance on calling the error
> handler at the drop of a hat.  The principle differentiator used is the
> fact that if a SCSI command no longer has an associated SAS task, (i.e.
> libsas thinks it's complete) it must be destined for libata.  This works
> correctly in all the ATAPI sense processing instances I've tested.
> Finally, we still have to call port recovery.  Currently, it's
> impossible to tell which port libata thinks is in error (libata knows
> because it has one host per port, but libsas doesn't), so the final
> solution is just to call port recovery for every SATA/STP attached port.
> 
> I've tested all of this with mvsas over expander attached ATA devices
> using STP (aic94xx seems to have developed some handling fault for STP,
> which I'll look into next).

Since there's been remarkably little discussion of the ata changes, I
think it's time to put this in linux-next to see what shakes out.

James



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/2] convert libsas to the libata new eh
  2011-02-13 19:18 ` James Bottomley
@ 2011-02-15  7:03   ` Jeff Garzik
  2011-02-15 14:15     ` James Bottomley
  0 siblings, 1 reply; 12+ messages in thread
From: Jeff Garzik @ 2011-02-15  7:03 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-scsi, linux-ide, Brian King, Robert Jennings

On 02/13/2011 02:18 PM, James Bottomley wrote:
> On Sun, 2011-01-23 at 09:41 -0600, James Bottomley wrote:
>> The bad news is that this is a pretty horrific conversion.  Because of
>> the amount of control the new eh wants (and the slew of monolithic
>> assumptions it makes), the entire error handler routine has to be sliced
>> apart and sewn back together to make it callable as part of an existing
>> error handler (rather than trying to do everything on its own).
>>
>> The even worse news is that unless you have an existing strategy
>> handler, you can't make use of this.  That means that ipr is
>> unconvertable in its current form.  Given the two really nasty options:
>> give ipr its own error handler via its own transport class, or attach
>> ipr to libsas and let it do the work, I'd choose the latter.
>> Unfortunately, my power box with IPR has been non-functional for a year
>> (and all pleas to IBM to fix it have so far gone unanswered), so I can't
>> really do this myself.
>>
>> Once we carve up libata, it's still a rather complex call sequence for
>> libsas because of libata's annoying insistance on calling the error
>> handler at the drop of a hat.  The principle differentiator used is the
>> fact that if a SCSI command no longer has an associated SAS task, (i.e.
>> libsas thinks it's complete) it must be destined for libata.  This works
>> correctly in all the ATAPI sense processing instances I've tested.
>> Finally, we still have to call port recovery.  Currently, it's
>> impossible to tell which port libata thinks is in error (libata knows
>> because it has one host per port, but libsas doesn't), so the final
>> solution is just to call port recovery for every SATA/STP attached port.
>>
>> I've tested all of this with mvsas over expander attached ATA devices
>> using STP (aic94xx seems to have developed some handling fault for STP,
>> which I'll look into next).
>
> Since there's been remarkably little discussion of the ata changes, I
> think it's time to put this in linux-next to see what shakes out.

The fix series was quite straightforward, and the error handler 
separation is almost mechanical.  All looks good and tests well here, so 
I threw it into libata-dev.git#upstream (and thus linux-next).

Let me know if you want to peel off patch 2/2 here into a post-merge tree...

	Jeff




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/2] convert libsas to the libata new eh
  2011-02-15  7:03   ` Jeff Garzik
@ 2011-02-15 14:15     ` James Bottomley
  0 siblings, 0 replies; 12+ messages in thread
From: James Bottomley @ 2011-02-15 14:15 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: linux-scsi, linux-ide, Brian King, Robert Jennings

On Tue, 2011-02-15 at 02:03 -0500, Jeff Garzik wrote:
> On 02/13/2011 02:18 PM, James Bottomley wrote:
> > On Sun, 2011-01-23 at 09:41 -0600, James Bottomley wrote:
> >> The bad news is that this is a pretty horrific conversion.  Because of
> >> the amount of control the new eh wants (and the slew of monolithic
> >> assumptions it makes), the entire error handler routine has to be sliced
> >> apart and sewn back together to make it callable as part of an existing
> >> error handler (rather than trying to do everything on its own).
> >>
> >> The even worse news is that unless you have an existing strategy
> >> handler, you can't make use of this.  That means that ipr is
> >> unconvertable in its current form.  Given the two really nasty options:
> >> give ipr its own error handler via its own transport class, or attach
> >> ipr to libsas and let it do the work, I'd choose the latter.
> >> Unfortunately, my power box with IPR has been non-functional for a year
> >> (and all pleas to IBM to fix it have so far gone unanswered), so I can't
> >> really do this myself.
> >>
> >> Once we carve up libata, it's still a rather complex call sequence for
> >> libsas because of libata's annoying insistance on calling the error
> >> handler at the drop of a hat.  The principle differentiator used is the
> >> fact that if a SCSI command no longer has an associated SAS task, (i.e.
> >> libsas thinks it's complete) it must be destined for libata.  This works
> >> correctly in all the ATAPI sense processing instances I've tested.
> >> Finally, we still have to call port recovery.  Currently, it's
> >> impossible to tell which port libata thinks is in error (libata knows
> >> because it has one host per port, but libsas doesn't), so the final
> >> solution is just to call port recovery for every SATA/STP attached port.
> >>
> >> I've tested all of this with mvsas over expander attached ATA devices
> >> using STP (aic94xx seems to have developed some handling fault for STP,
> >> which I'll look into next).
> >
> > Since there's been remarkably little discussion of the ata changes, I
> > think it's time to put this in linux-next to see what shakes out.
> 
> The fix series was quite straightforward, and the error handler 
> separation is almost mechanical.  All looks good and tests well here, so 
> I threw it into libata-dev.git#upstream (and thus linux-next).

Well, as you know, it's already in scsi-misc; as long as we don't get a
merge conflict in linux-next, I don't really care.

However, since it's a SCSI feature, which depends on patches in the scsi
tree both before and after, it doesn't make a whole lot of sense in the
ata tree.  It's just one actual patch that is libata impacting (the eh
split).  The other three correct the sas paths, which only have SCSI
users.

> Let me know if you want to peel off patch 2/2 here into a post-merge tree...

Well, you're missing the precursor patches to make it functional ...
what you have will hang fairly fast on an actual SAS device.  But, as
long as people with SAS only test scsi-misc or linux-next, hopefully it
won't matter.

James



^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2011-02-15 14:15 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-01-23 15:41 [PATCH 0/2] convert libsas to the libata new eh James Bottomley
2011-01-23 15:42 ` [PATCH 1/2] libata: separate error handler into usable components James Bottomley
2011-01-23 15:44 ` [PATCH 2/2] libsas: convert to libata new error handler James Bottomley
2011-01-24 21:12 ` [PATCH 0/2] convert libsas to the libata new eh Jeff Garzik
2011-01-24 21:16   ` James Bottomley
2011-01-25 18:56 ` Brian King
2011-01-26 14:08   ` James Bottomley
2011-01-26 17:46     ` Jeff Garzik
2011-01-26 17:55       ` James Bottomley
2011-02-13 19:18 ` James Bottomley
2011-02-15  7:03   ` Jeff Garzik
2011-02-15 14:15     ` James Bottomley

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.