* [PATCH v3 00/42] hpsa updates
@ 2015-03-17 20:02 Don Brace
  2015-03-17 20:02 ` [PATCH v3 01/42] hpsa: add masked physical devices into h->dev[] array Don Brace
                   ` (41 more replies)
  0 siblings, 42 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:02 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

These patches are based on Linus's tree

The changes are:
 - make function names consistent
 - refactor functions
 - clean up driver messages
 - clean up error handling
 - clean up abort management
 - enhance sense data reporting
 - enhance ioaccel command support
 - add block layer tag support
 - clean up resets
 - update copyright

---

Don Brace (2):
      hpsa: change driver version
      hpsa: add PMC to copyright

Joe Handzik (3):
      hpsa: use ioaccel2 path to submit IOs to physical drives in HBA mode.
      hpsa: Get queue depth from identify physical bmic for physical disks.
      hpsa: add more ioaccel2 error handling, including underrun statuses.

Robert Elliott (18):
      hpsa: make function names consistent
      hpsa: print accurate SSD Smart Path Enabled status
      hpsa: break hpsa_free_irqs_and_disable_msix into two functions
      hpsa: clean up error handling
      hpsa: refactor freeing of resources into more logical functions
      hpsa: do not check cmd_alloc return value - it cannot return NULL
      hpsa: correct return values from driver functions.
      hpsa: clean up driver init
      hpsa: clean up some error reporting output in abort handler
      hpsa: do not print ioaccel2 warning messages about unusual completions.
      hpsa: call pci_release_regions after pci_disable_device
      hpsa: skip free_irq calls if irqs are not allocated
      hpsa: cleanup for init_one step 2 in kdump
      hpsa: fix try_soft_reset error handling
      hpsa: create workqueue after the driver is ready for use
      hpsa: add interrupt number to /proc/interrupts interrupt name
      hpsa: use scsi host_no as hpsa controller number
      hpsa: propagate the error code in hpsa_kdump_soft_reset

Stephen Cameron (9):
      hpsa: add masked physical devices into h->dev[] array
      hpsa: clean up aborts
      hpsa: decrement h->commands_outstanding in fail_all_outstanding_cmds
      hpsa: hpsa decode sense data for io and tmf
      hpsa: allow lockup detected to be viewed via sysfs
      hpsa: factor out hpsa_init_cmd function
      hpsa: do not ignore return value of hpsa_register_scsi
      hpsa: try resubmitting down raid path on task set full
      hpsa: add support sending aborts to physical devices via the ioaccel2 path

Webb Scales (10):
      hpsa: clean up host, channel, target, lun prints
      hpsa: rework controller command submission
      hpsa: factor out hpsa_ioaccel_submit function
      hpsa: add ioaccel sg chaining for the ioaccel2 path
      hpsa: use helper routines for finishing commands
      hpsa: don't return abort request until target is complete
      hpsa: refactor and rework support for sending TEST_UNIT_READY
      hpsa: performance tweak for hpsa_scatter_gather()
      hpsa: use block layer tag for command allocation
      hpsa: cleanup reset


 drivers/scsi/hpsa.c     | 2801 ++++++++++++++++++++++++++++++++++-------------
 drivers/scsi/hpsa.h     |   22 
 drivers/scsi/hpsa_cmd.h |   37 +
 3 files changed, 2052 insertions(+), 808 deletions(-)

--
Signature

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 01/42] hpsa: add masked physical devices into h->dev[] array
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
@ 2015-03-17 20:02 ` Don Brace
  2015-03-17 20:02 ` [PATCH v3 02/42] hpsa: clean up host, channel, target, lun prints Don Brace
                   ` (40 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:02 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Stephen Cameron <stephenmcameron@gmail.com>

Cache the ioaccel handle so that when we need to abort commands sent
down the ioaccel2 path, we can look up the LUN ID in h->dev[] instead of
having to do I/O to the controller.

Add a field to elements in h->dev[] to keep track of how the device is
exposed to the SCSI mid layer: not at all, exposed without an upper-level
driver (no_uld_attach), or normally exposed.

Since masked physical devices are now present in the h->dev[] array,
it would be perfectly possible to do

	echo scsi add-single-device 2 2 0 0 > /proc/scsi/scsi

and bring them online.  This was previously not allowed for masked
physical devices.

Ensure that the mapping of physical disks to logical drives gets updated in a
consistent way when a RAID migration occurs and is not touched until updates
to it are complete.

Now, instead of doing CISS_REPORT_PHYSICAL to get the LUN ID for
the physical disk in hpsa_get_pdisk_of_ioaccel2(), just get
it out of h->dev[], where we already have it cached.

Do not touch phys_disk[] for ioaccel-enabled logical drives during rescan.

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c     |  253 +++++++++++++++++++++++++----------------------
 drivers/scsi/hpsa.h     |    6 +
 drivers/scsi/hpsa_cmd.h |    3 +
 3 files changed, 141 insertions(+), 121 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index a1cfbd3..3417b8b 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -222,6 +222,7 @@ static int hpsa_change_queue_depth(struct scsi_device *sdev, int qdepth);
 static int hpsa_eh_device_reset_handler(struct scsi_cmnd *scsicmd);
 static int hpsa_eh_abort_handler(struct scsi_cmnd *scsicmd);
 static int hpsa_slave_alloc(struct scsi_device *sdev);
+static int hpsa_slave_configure(struct scsi_device *sdev);
 static void hpsa_slave_destroy(struct scsi_device *sdev);
 
 static void hpsa_update_scsi_devices(struct ctlr_info *h, int hostno);
@@ -667,6 +668,9 @@ static struct device_attribute *hpsa_shost_attrs[] = {
 	NULL,
 };
 
+#define HPSA_NRESERVED_CMDS	(HPSA_CMDS_RESERVED_FOR_ABORTS + \
+		HPSA_CMDS_RESERVED_FOR_DRIVER + HPSA_MAX_CONCURRENT_PASSTHRUS)
+
 static struct scsi_host_template hpsa_driver_template = {
 	.module			= THIS_MODULE,
 	.name			= HPSA,
@@ -681,6 +685,7 @@ static struct scsi_host_template hpsa_driver_template = {
 	.eh_device_reset_handler = hpsa_eh_device_reset_handler,
 	.ioctl			= hpsa_ioctl,
 	.slave_alloc		= hpsa_slave_alloc,
+	.slave_configure	= hpsa_slave_configure,
 	.slave_destroy		= hpsa_slave_destroy,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl		= hpsa_compat_ioctl,
@@ -946,6 +951,8 @@ lun_assigned:
 
 	h->dev[n] = device;
 	h->ndevices++;
+	device->offload_to_be_enabled = device->offload_enabled;
+	device->offload_enabled = 0;
 	added[*nadded] = device;
 	(*nadded)++;
 
@@ -982,16 +989,20 @@ static void hpsa_scsi_update_entry(struct ctlr_info *h, int hostno,
 		 */
 		h->dev[entry]->raid_map = new_entry->raid_map;
 		h->dev[entry]->ioaccel_handle = new_entry->ioaccel_handle;
-		wmb(); /* ensure raid map updated prior to ->offload_enabled */
 	}
 	h->dev[entry]->offload_config = new_entry->offload_config;
 	h->dev[entry]->offload_to_mirror = new_entry->offload_to_mirror;
-	h->dev[entry]->offload_enabled = new_entry->offload_enabled;
 	h->dev[entry]->queue_depth = new_entry->queue_depth;
 
-	dev_info(&h->pdev->dev, "%s device c%db%dt%dl%d updated.\n",
-		scsi_device_type(new_entry->devtype), hostno, new_entry->bus,
-		new_entry->target, new_entry->lun);
+	/*
+	 * We can turn off ioaccel offload now, but need to delay turning
+	 * it on until we can update h->dev[entry]->phys_disk[], but we
+	 * can't do that until all the devices are updated.
+	 */
+	h->dev[entry]->offload_to_be_enabled = new_entry->offload_enabled;
+	if (!new_entry->offload_enabled)
+		h->dev[entry]->offload_enabled = 0;
+
 }
 
 /* Replace an entry from h->dev[] array. */
@@ -1014,6 +1025,8 @@ static void hpsa_scsi_replace_entry(struct ctlr_info *h, int hostno,
 		new_entry->lun = h->dev[entry]->lun;
 	}
 
+	new_entry->offload_to_be_enabled = new_entry->offload_enabled;
+	new_entry->offload_enabled = 0;
 	h->dev[entry] = new_entry;
 	added[*nadded] = new_entry;
 	(*nadded)++;
@@ -1312,7 +1325,8 @@ static void hpsa_figure_phys_disk_ptrs(struct ctlr_info *h,
 		 */
 		if (!logical_drive->phys_disk[i]) {
 			logical_drive->offload_enabled = 0;
-			logical_drive->queue_depth = h->nr_cmds;
+			logical_drive->offload_to_be_enabled = 0;
+			logical_drive->queue_depth = 8;
 		}
 	}
 	if (nraid_map_entries)
@@ -1335,6 +1349,16 @@ static void hpsa_update_log_drive_phys_drive_ptrs(struct ctlr_info *h,
 			continue;
 		if (!is_logical_dev_addr_mode(dev[i]->scsi3addr))
 			continue;
+
+		/*
+		 * If offload is currently enabled, the RAID map and
+		 * phys_disk[] assignment *better* not be changing
+		 * and since it isn't changing, we do not need to
+		 * update it.
+		 */
+		if (dev[i]->offload_enabled)
+			continue;
+
 		hpsa_figure_phys_disk_ptrs(h, dev, ndevices, dev[i]);
 	}
 }
@@ -1433,6 +1457,14 @@ static void adjust_hpsa_scsi_table(struct ctlr_info *h, int hostno,
 			/* but if it does happen, we just ignore that device */
 		}
 	}
+	hpsa_update_log_drive_phys_drive_ptrs(h, h->dev, h->ndevices);
+
+	/* Now that h->dev[]->phys_disk[] is coherent, we can enable
+	 * any logical drives that need it enabled.
+	 */
+	for (i = 0; i < h->ndevices; i++)
+		h->dev[i]->offload_enabled = h->dev[i]->offload_to_be_enabled;
+
 	spin_unlock_irqrestore(&h->devlock, flags);
 
 	/* Monitor devices which are in one of several NOT READY states to be
@@ -1456,20 +1488,24 @@ static void adjust_hpsa_scsi_table(struct ctlr_info *h, int hostno,
 	sh = h->scsi_host;
 	/* Notify scsi mid layer of any removed devices */
 	for (i = 0; i < nremoved; i++) {
-		struct scsi_device *sdev =
-			scsi_device_lookup(sh, removed[i]->bus,
-				removed[i]->target, removed[i]->lun);
-		if (sdev != NULL) {
-			scsi_remove_device(sdev);
-			scsi_device_put(sdev);
-		} else {
-			/* We don't expect to get here.
-			 * future cmds to this device will get selection
-			 * timeout as if the device was gone.
-			 */
-			dev_warn(&h->pdev->dev, "didn't find c%db%dt%dl%d "
-				" for removal.", hostno, removed[i]->bus,
-				removed[i]->target, removed[i]->lun);
+		if (removed[i]->expose_state & HPSA_SCSI_ADD) {
+			struct scsi_device *sdev =
+				scsi_device_lookup(sh, removed[i]->bus,
+					removed[i]->target, removed[i]->lun);
+			if (sdev != NULL) {
+				scsi_remove_device(sdev);
+				scsi_device_put(sdev);
+			} else {
+				/*
+				 * We don't expect to get here.
+				 * future cmds to this device will get selection
+				 * timeout as if the device was gone.
+				 */
+				dev_warn(&h->pdev->dev,
+					"didn't find c%db%dt%dl%d for removal.\n",
+					hostno, removed[i]->bus,
+					removed[i]->target, removed[i]->lun);
+			}
 		}
 		kfree(removed[i]);
 		removed[i] = NULL;
@@ -1477,6 +1513,8 @@ static void adjust_hpsa_scsi_table(struct ctlr_info *h, int hostno,
 
 	/* Notify scsi mid layer of any added devices */
 	for (i = 0; i < nadded; i++) {
+		if (!(added[i]->expose_state & HPSA_SCSI_ADD))
+			continue;
 		if (scsi_add_device(sh, added[i]->bus,
 			added[i]->target, added[i]->lun) == 0)
 			continue;
@@ -1512,7 +1550,6 @@ static struct hpsa_scsi_dev_t *lookup_hpsa_scsi_dev(struct ctlr_info *h,
 	return NULL;
 }
 
-/* link sdev->hostdata to our per-device structure. */
 static int hpsa_slave_alloc(struct scsi_device *sdev)
 {
 	struct hpsa_scsi_dev_t *sd;
@@ -1523,16 +1560,35 @@ static int hpsa_slave_alloc(struct scsi_device *sdev)
 	spin_lock_irqsave(&h->devlock, flags);
 	sd = lookup_hpsa_scsi_dev(h, sdev_channel(sdev),
 		sdev_id(sdev), sdev->lun);
-	if (sd != NULL) {
-		sdev->hostdata = sd;
-		if (sd->queue_depth)
-			scsi_change_queue_depth(sdev, sd->queue_depth);
+	if (likely(sd)) {
 		atomic_set(&sd->ioaccel_cmds_out, 0);
-	}
+		sdev->hostdata = (sd->expose_state & HPSA_SCSI_ADD) ? sd : NULL;
+	} else
+		sdev->hostdata = NULL;
 	spin_unlock_irqrestore(&h->devlock, flags);
 	return 0;
 }
 
+/* configure scsi device based on internal per-device structure */
+static int hpsa_slave_configure(struct scsi_device *sdev)
+{
+	struct hpsa_scsi_dev_t *sd;
+	int queue_depth;
+
+	sd = sdev->hostdata;
+	sdev->no_uld_attach = !sd || !(sd->expose_state & HPSA_ULD_ATTACH);
+
+	if (sd)
+		queue_depth = sd->queue_depth != 0 ?
+			sd->queue_depth : sdev->host->can_queue;
+	else
+		queue_depth = sdev->host->can_queue;
+
+	scsi_change_queue_depth(sdev, queue_depth);
+
+	return 0;
+}
+
 static void hpsa_slave_destroy(struct scsi_device *sdev)
 {
 	/* nothing to do. */
@@ -2438,6 +2494,7 @@ static void hpsa_get_ioaccel_status(struct ctlr_info *h,
 
 	this_device->offload_config = 0;
 	this_device->offload_enabled = 0;
+	this_device->offload_to_be_enabled = 0;
 
 	buf = kzalloc(64, GFP_KERNEL);
 	if (!buf)
@@ -2461,6 +2518,7 @@ static void hpsa_get_ioaccel_status(struct ctlr_info *h,
 		if (hpsa_get_raid_map(h, scsi3addr, this_device))
 			this_device->offload_enabled = 0;
 	}
+	this_device->offload_to_be_enabled = this_device->offload_enabled;
 out:
 	kfree(buf);
 	return;
@@ -2708,6 +2766,7 @@ static int hpsa_update_device_info(struct ctlr_info *h,
 		this_device->raid_level = RAID_UNKNOWN;
 		this_device->offload_config = 0;
 		this_device->offload_enabled = 0;
+		this_device->offload_to_be_enabled = 0;
 		this_device->volume_offline = 0;
 		this_device->queue_depth = h->nr_cmds;
 	}
@@ -2850,88 +2909,23 @@ static int add_ext_target_dev(struct ctlr_info *h,
 static int hpsa_get_pdisk_of_ioaccel2(struct ctlr_info *h,
 	struct CommandList *ioaccel2_cmd_to_abort, unsigned char *scsi3addr)
 {
-	struct ReportExtendedLUNdata *physicals = NULL;
-	int responsesize = 24;	/* size of physical extended response */
-	int reportsize = sizeof(*physicals) + HPSA_MAX_PHYS_LUN * responsesize;
-	u32 nphysicals = 0;	/* number of reported physical devs */
-	int found = 0;		/* found match (1) or not (0) */
-	u32 find;		/* handle we need to match */
+	struct io_accel2_cmd *c2 =
+			&h->ioaccel2_cmd_pool[ioaccel2_cmd_to_abort->cmdindex];
+	unsigned long flags;
 	int i;
-	struct scsi_cmnd *scmd;	/* scsi command within request being aborted */
-	struct hpsa_scsi_dev_t *d; /* device of request being aborted */
-	struct io_accel2_cmd *c2a; /* ioaccel2 command to abort */
-	__le32 it_nexus;	/* 4 byte device handle for the ioaccel2 cmd */
-	__le32 scsi_nexus;	/* 4 byte device handle for the ioaccel2 cmd */
-
-	if (ioaccel2_cmd_to_abort->cmd_type != CMD_IOACCEL2)
-		return 0; /* no match */
-
-	/* point to the ioaccel2 device handle */
-	c2a = &h->ioaccel2_cmd_pool[ioaccel2_cmd_to_abort->cmdindex];
-	if (c2a == NULL)
-		return 0; /* no match */
-
-	scmd = (struct scsi_cmnd *) ioaccel2_cmd_to_abort->scsi_cmd;
-	if (scmd == NULL)
-		return 0; /* no match */
-
-	d = scmd->device->hostdata;
-	if (d == NULL)
-		return 0; /* no match */
-
-	it_nexus = cpu_to_le32(d->ioaccel_handle);
-	scsi_nexus = c2a->scsi_nexus;
-	find = le32_to_cpu(c2a->scsi_nexus);
-
-	if (h->raid_offload_debug > 0)
-		dev_info(&h->pdev->dev,
-			"%s: scsi_nexus:0x%08x device id: 0x%02x%02x%02x%02x %02x%02x%02x%02x %02x%02x%02x%02x %02x%02x%02x%02x\n",
-			__func__, scsi_nexus,
-			d->device_id[0], d->device_id[1], d->device_id[2],
-			d->device_id[3], d->device_id[4], d->device_id[5],
-			d->device_id[6], d->device_id[7], d->device_id[8],
-			d->device_id[9], d->device_id[10], d->device_id[11],
-			d->device_id[12], d->device_id[13], d->device_id[14],
-			d->device_id[15]);
-
-	/* Get the list of physical devices */
-	physicals = kzalloc(reportsize, GFP_KERNEL);
-	if (physicals == NULL)
-		return 0;
-	if (hpsa_scsi_do_report_phys_luns(h, physicals, reportsize)) {
-		dev_err(&h->pdev->dev,
-			"Can't lookup %s device handle: report physical LUNs failed.\n",
-			"HP SSD Smart Path");
-		kfree(physicals);
-		return 0;
-	}
-	nphysicals = be32_to_cpu(*((__be32 *)physicals->LUNListLength)) /
-							responsesize;
-
-	/* find ioaccel2 handle in list of physicals: */
-	for (i = 0; i < nphysicals; i++) {
-		struct ext_report_lun_entry *entry = &physicals->LUN[i];
-
-		/* handle is in bytes 28-31 of each lun */
-		if (entry->ioaccel_handle != find)
-			continue; /* didn't match */
-		found = 1;
-		memcpy(scsi3addr, entry->lunid, 8);
-		if (h->raid_offload_debug > 0)
-			dev_info(&h->pdev->dev,
-				"%s: Searched h=0x%08x, Found h=0x%08x, scsiaddr 0x%8phN\n",
-				__func__, find,
-				entry->ioaccel_handle, scsi3addr);
-		break; /* found it */
-	}
-
-	kfree(physicals);
-	if (found)
-		return 1;
-	else
-		return 0;
 
+	spin_lock_irqsave(&h->devlock, flags);
+	for (i = 0; i < h->ndevices; i++)
+		if (h->dev[i]->ioaccel_handle == le32_to_cpu(c2->scsi_nexus)) {
+			memcpy(scsi3addr, h->dev[i]->scsi3addr,
+				sizeof(h->dev[i]->scsi3addr));
+			spin_unlock_irqrestore(&h->devlock, flags);
+			return 1;
+		}
+	spin_unlock_irqrestore(&h->devlock, flags);
+	return 0;
 }
+
 /*
  * Do CISS_REPORT_PHYS and CISS_REPORT_LOG.  Data is returned in physdev,
  * logdev.  The number of luns in physdev and logdev are returned in
@@ -3142,10 +3136,12 @@ static void hpsa_update_scsi_devices(struct ctlr_info *h, int hostno)
 		/* Figure out where the LUN ID info is coming from */
 		lunaddrbytes = figure_lunaddrbytes(h, raid_ctlr_position,
 			i, nphysicals, nlogicals, physdev_list, logdev_list);
-		/* skip masked physical devices. */
-		if (lunaddrbytes[3] & 0xC0 &&
-			i < nphysicals + (raid_ctlr_position == 0))
-			continue;
+
+		/* skip masked non-disk devices */
+		if (MASKED_DEVICE(lunaddrbytes))
+			if (i < nphysicals + (raid_ctlr_position == 0) &&
+				NON_DISK_PHYS_DEV(lunaddrbytes))
+				continue;
 
 		/* Get device type, vendor, model, device id */
 		if (hpsa_update_device_info(h, lunaddrbytes, tmpdevice,
@@ -3170,6 +3166,18 @@ static void hpsa_update_scsi_devices(struct ctlr_info *h, int hostno)
 
 		*this_device = *tmpdevice;
 
+		/* do not expose masked devices */
+		if (MASKED_DEVICE(lunaddrbytes) &&
+			i < nphysicals + (raid_ctlr_position == 0)) {
+			if (h->hba_mode_enabled)
+				dev_warn(&h->pdev->dev,
+					"Masked physical device detected\n");
+			this_device->expose_state = HPSA_DO_NOT_EXPOSE;
+		} else {
+			this_device->expose_state =
+					HPSA_SG_ATTACH | HPSA_ULD_ATTACH;
+		}
+
 		switch (this_device->devtype) {
 		case TYPE_ROM:
 			/* We don't *really* support actual CD-ROM devices,
@@ -3211,6 +3219,10 @@ static void hpsa_update_scsi_devices(struct ctlr_info *h, int hostno)
 		case TYPE_MEDIUM_CHANGER:
 			ncurrent++;
 			break;
+		case TYPE_ENCLOSURE:
+			if (h->hba_mode_enabled)
+				ncurrent++;
+			break;
 		case TYPE_RAID:
 			/* Only present the Smartarray HBA as a RAID controller.
 			 * If it's a RAID controller other than the HBA itself
@@ -3227,7 +3239,6 @@ static void hpsa_update_scsi_devices(struct ctlr_info *h, int hostno)
 		if (ncurrent >= HPSA_MAX_DEVICES)
 			break;
 	}
-	hpsa_update_log_drive_phys_drive_ptrs(h, currentsd, ncurrent);
 	adjust_hpsa_scsi_table(h, hostno, currentsd, ncurrent);
 out:
 	kfree(tmpdevice);
@@ -4252,10 +4263,7 @@ static int hpsa_register_scsi(struct ctlr_info *h)
 	sh->max_cmd_len = MAX_COMMAND_SIZE;
 	sh->max_lun = HPSA_MAX_LUN;
 	sh->max_id = HPSA_MAX_LUN;
-	sh->can_queue = h->nr_cmds -
-			HPSA_CMDS_RESERVED_FOR_ABORTS -
-			HPSA_CMDS_RESERVED_FOR_DRIVER -
-			HPSA_MAX_CONCURRENT_PASSTHRUS;
+	sh->can_queue = h->nr_cmds - HPSA_NRESERVED_CMDS;
 	sh->cmd_per_lun = sh->can_queue;
 	sh->sg_tablesize = h->maxsgentries;
 	h->scsi_host = sh;
@@ -6097,18 +6105,21 @@ static int hpsa_find_cfgtables(struct ctlr_info *h)
 
 static void hpsa_get_max_perf_mode_cmds(struct ctlr_info *h)
 {
-	h->max_commands = readl(&(h->cfgtable->MaxPerformantModeCommands));
+#define MIN_MAX_COMMANDS 16
+	BUILD_BUG_ON(MIN_MAX_COMMANDS <= HPSA_NRESERVED_CMDS);
+
+	h->max_commands = readl(&h->cfgtable->MaxPerformantModeCommands);
 
 	/* Limit commands in memory limited kdump scenario. */
 	if (reset_devices && h->max_commands > 32)
 		h->max_commands = 32;
 
-	if (h->max_commands < 16) {
-		dev_warn(&h->pdev->dev, "Controller reports "
-			"max supported commands of %d, an obvious lie. "
-			"Using 16.  Ensure that firmware is up to date.\n",
-			h->max_commands);
-		h->max_commands = 16;
+	if (h->max_commands < MIN_MAX_COMMANDS) {
+		dev_warn(&h->pdev->dev,
+			"Controller reports max supported commands of %d Using %d instead. Ensure that firmware is up to date.\n",
+			h->max_commands,
+			MIN_MAX_COMMANDS);
+		h->max_commands = MIN_MAX_COMMANDS;
 	}
 }
 
diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
index 6577130..58f3315 100644
--- a/drivers/scsi/hpsa.h
+++ b/drivers/scsi/hpsa.h
@@ -54,6 +54,7 @@ struct hpsa_scsi_dev_t {
 	u32 ioaccel_handle;
 	int offload_config;		/* I/O accel RAID offload configured */
 	int offload_enabled;		/* I/O accel RAID offload enabled */
+	int offload_to_be_enabled;
 	int offload_to_mirror;		/* Send next I/O accelerator RAID
 					 * offload request to mirror drive
 					 */
@@ -68,6 +69,11 @@ struct hpsa_scsi_dev_t {
 	 * devices in order to honor physical device queue depth limits.
 	 */
 	struct hpsa_scsi_dev_t *phys_disk[RAID_MAP_MAX_ENTRIES];
+#define HPSA_DO_NOT_EXPOSE	0x0
+#define HPSA_SG_ATTACH		0x1
+#define HPSA_ULD_ATTACH		0x2
+#define HPSA_SCSI_ADD		(HPSA_SG_ATTACH | HPSA_ULD_ATTACH)
+	u8 expose_state;
 };
 
 struct reply_queue_buffer {
diff --git a/drivers/scsi/hpsa_cmd.h b/drivers/scsi/hpsa_cmd.h
index 3a621c7..76d5499 100644
--- a/drivers/scsi/hpsa_cmd.h
+++ b/drivers/scsi/hpsa_cmd.h
@@ -240,6 +240,7 @@ struct ReportLUNdata {
 
 struct ext_report_lun_entry {
 	u8 lunid[8];
+#define MASKED_DEVICE(x) ((x)[3] & 0xC0)
 #define GET_BMIC_BUS(lunid) ((lunid)[7] & 0x3F)
 #define GET_BMIC_LEVEL_TWO_TARGET(lunid) ((lunid)[6])
 #define GET_BMIC_DRIVE_NUMBER(lunid) (((GET_BMIC_BUS((lunid)) - 1) << 8) + \
@@ -247,6 +248,8 @@ struct ext_report_lun_entry {
 	u8 wwid[8];
 	u8 device_type;
 	u8 device_flags;
+#define NON_DISK_PHYS_DEV(x) ((x)[17] & 0x01)
+#define PHYS_IOACCEL(x) ((x)[17] & 0x08)
 	u8 lun_count; /* multi-lun device, how many luns */
 	u8 redundant_paths;
 	u32 ioaccel_handle; /* ioaccel1 only uses lower 16 bits */



* [PATCH v3 02/42] hpsa: clean up host, channel, target, lun prints
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
  2015-03-17 20:02 ` [PATCH v3 01/42] hpsa: add masked physical devices into h->dev[] array Don Brace
@ 2015-03-17 20:02 ` Don Brace
  2015-03-17 20:02 ` [PATCH v3 03/42] hpsa: rework controller command submission Don Brace
                   ` (39 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:02 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Webb Scales <webbnh@hp.com>

We had a mix of formats used for specifying controller, bus, target,
and lun address of devices.

Change to the format used by the SCSI midlayer and upper layer (2:3:0:0)
so you can easily follow the information from hpsa to the SCSI midlayer
to the sd upper layer.

Also add this information:
- product ID
- vendor ID
- RAID level
- SSD Smart Path capable and enabled
- exposure level (sg-only)

Example:
hpsa 0000:04:00.0: added scsi 2:0:0:0: Direct-Access     HP LOGICAL VOLUME   RAID-0 SSDSmartPathCap+ En+ Exp=4
scsi 2:0:0:0: Direct-Access     HP       LOGICAL VOLUME   10.0 PQ: 0 ANSI: 5
sd 2:0:0:0: [sdr] 12501713072 512-byte logical blocks: (6.40 TB/5.82 TiB)
sd 2:0:0:0: [sdr] 4096-byte physical blocks
sd 2:0:0:0: [sdr] Attached SCSI disk
sd 2:0:0:0: Attached scsi generic sg20 type 0

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |  103 +++++++++++++++++++++++++++++++++++----------------
 1 file changed, 71 insertions(+), 32 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 3417b8b..9b88726 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -886,6 +886,53 @@ static int hpsa_find_target_lun(struct ctlr_info *h,
 	return !found;
 }
 
+static inline void hpsa_show_dev_msg(int type, struct ctlr_info *h,
+	struct hpsa_scsi_dev_t *dev, char *description)
+{
+#define HPSA_INFO 0
+#define HPSA_WARN 1
+#define HPSA_ERR 2
+	if (type == HPSA_INFO)
+		dev_info(&h->pdev->dev,
+			"scsi %d:%d:%d:%d: %s %s %.8s %.16s RAID-%s SSDSmartPathCap%c En%c Exp=%d\n",
+			h->scsi_host->host_no, dev->bus, dev->target, dev->lun,
+			description,
+			scsi_device_type(dev->devtype),
+			dev->vendor,
+			dev->model,
+			dev->raid_level > RAID_UNKNOWN ?
+				"RAID-?" : raid_label[dev->raid_level],
+			dev->offload_config ? '+' : '-',
+			dev->offload_enabled ? '+' : '-',
+			dev->expose_state);
+	else if (type == HPSA_WARN)
+		dev_warn(&h->pdev->dev,
+			"scsi %d:%d:%d:%d: %s %s %.8s %.16s RAID-%s SSDSmartPathCap%c En%c Exp=%d\n",
+			h->scsi_host->host_no, dev->bus, dev->target, dev->lun,
+			description,
+			scsi_device_type(dev->devtype),
+			dev->vendor,
+			dev->model,
+			dev->raid_level > RAID_UNKNOWN ?
+				"RAID-?" : raid_label[dev->raid_level],
+			dev->offload_config ? '+' : '-',
+			dev->offload_enabled ? '+' : '-',
+			dev->expose_state);
+	else if (type == HPSA_ERR)
+		dev_err(&h->pdev->dev,
+			"scsi %d:%d:%d:%d: %s %s %.8s %.16s RAID-%s SSDSmartPathCap%c En%c Exp=%d\n",
+			h->scsi_host->host_no, dev->bus, dev->target, dev->lun,
+			description,
+			scsi_device_type(dev->devtype),
+			dev->vendor,
+			dev->model,
+			dev->raid_level > RAID_UNKNOWN ?
+				"RAID-?" : raid_label[dev->raid_level],
+			dev->offload_config ? '+' : '-',
+			dev->offload_enabled ? '+' : '-',
+			dev->expose_state);
+}
+
 /* Add an entry into h->dev[] array. */
 static int hpsa_scsi_add_entry(struct ctlr_info *h, int hostno,
 		struct hpsa_scsi_dev_t *device,
@@ -955,15 +1002,8 @@ lun_assigned:
 	device->offload_enabled = 0;
 	added[*nadded] = device;
 	(*nadded)++;
-
-	/* initially, (before registering with scsi layer) we don't
-	 * know our hostno and we don't want to print anything first
-	 * time anyway (the scsi layer's inquiries will show that info)
-	 */
-	/* if (hostno != -1) */
-		dev_info(&h->pdev->dev, "%s device c%db%dt%dl%d added.\n",
-			scsi_device_type(device->devtype), hostno,
-			device->bus, device->target, device->lun);
+	hpsa_show_dev_msg(HPSA_INFO, h, device,
+		device->expose_state & HPSA_SCSI_ADD ? "added" : "masked");
 	return 0;
 }
 
@@ -1003,6 +1043,7 @@ static void hpsa_scsi_update_entry(struct ctlr_info *h, int hostno,
 	if (!new_entry->offload_enabled)
 		h->dev[entry]->offload_enabled = 0;
 
+	hpsa_show_dev_msg(HPSA_INFO, h, h->dev[entry], "updated");
 }
 
 /* Replace an entry from h->dev[] array. */
@@ -1030,9 +1071,7 @@ static void hpsa_scsi_replace_entry(struct ctlr_info *h, int hostno,
 	h->dev[entry] = new_entry;
 	added[*nadded] = new_entry;
 	(*nadded)++;
-	dev_info(&h->pdev->dev, "%s device c%db%dt%dl%d changed.\n",
-		scsi_device_type(new_entry->devtype), hostno, new_entry->bus,
-			new_entry->target, new_entry->lun);
+	hpsa_show_dev_msg(HPSA_INFO, h, new_entry, "replaced");
 }
 
 /* Remove an entry from h->dev[] array. */
@@ -1052,9 +1091,7 @@ static void hpsa_scsi_remove_entry(struct ctlr_info *h, int hostno, int entry,
 	for (i = entry; i < h->ndevices-1; i++)
 		h->dev[i] = h->dev[i+1];
 	h->ndevices--;
-	dev_info(&h->pdev->dev, "%s device c%db%dt%dl%d removed.\n",
-		scsi_device_type(sd->devtype), hostno, sd->bus, sd->target,
-		sd->lun);
+	hpsa_show_dev_msg(HPSA_INFO, h, sd, "removed");
 }
 
 #define SCSI3ADDR_EQ(a, b) ( \
@@ -1435,9 +1472,7 @@ static void adjust_hpsa_scsi_table(struct ctlr_info *h, int hostno,
 		 */
 		if (sd[i]->volume_offline) {
 			hpsa_show_volume_status(h, sd[i]);
-			dev_info(&h->pdev->dev, "c%db%dt%dl%d: temporarily offline\n",
-				h->scsi_host->host_no,
-				sd[i]->bus, sd[i]->target, sd[i]->lun);
+			hpsa_show_dev_msg(HPSA_INFO, h, sd[i], "offline");
 			continue;
 		}
 
@@ -1501,10 +1536,11 @@ static void adjust_hpsa_scsi_table(struct ctlr_info *h, int hostno,
 				 * future cmds to this device will get selection
 				 * timeout as if the device was gone.
 				 */
-				dev_warn(&h->pdev->dev,
-					"didn't find c%db%dt%dl%d for removal.\n",
+			dev_warn(&h->pdev->dev,
+					"scsi %d:%d:%d:%d %s\n",
 					hostno, removed[i]->bus,
-					removed[i]->target, removed[i]->lun);
+					removed[i]->target, removed[i]->lun,
+					"didn't find device for removal.");
 			}
 		}
 		kfree(removed[i]);
@@ -1518,8 +1554,9 @@ static void adjust_hpsa_scsi_table(struct ctlr_info *h, int hostno,
 		if (scsi_add_device(sh, added[i]->bus,
 			added[i]->target, added[i]->lun) == 0)
 			continue;
-		dev_warn(&h->pdev->dev, "scsi_add_device c%db%dt%dl%d failed, "
-			"device not added.\n", hostno, added[i]->bus,
+		dev_warn(&h->pdev->dev,
+			"scsi %d:%d:%d:%d addition failed, device not added.\n",
+			hostno, added[i]->bus,
 			added[i]->target, added[i]->lun);
 		/* now we have to remove it from h->dev,
 		 * since it didn't get added to scsi mid layer
@@ -4375,7 +4412,6 @@ static int hpsa_eh_device_reset_handler(struct scsi_cmnd *scsicmd)
 	if (rc == 0 && wait_for_device_to_become_ready(h, dev->scsi3addr) == 0)
 		return SUCCESS;
 
-	dev_warn(&h->pdev->dev, "resetting device failed.\n");
 	return FAILED;
 }
 
@@ -4491,8 +4527,9 @@ static int hpsa_send_reset_as_abort_ioaccel2(struct ctlr_info *h,
 
 	if (h->raid_offload_debug > 0)
 		dev_info(&h->pdev->dev,
-			"Reset as abort: Abort requested on C%d:B%d:T%d:L%d scsi3addr 0x%02x%02x%02x%02x%02x%02x%02x%02x\n",
+			"scsi %d:%d:%d:%d %s scsi3addr 0x%02x%02x%02x%02x%02x%02x%02x%02x\n",
 			h->scsi_host->host_no, dev->bus, dev->target, dev->lun,
+			"Reset as abort",
 			scsi3addr[0], scsi3addr[1], scsi3addr[2], scsi3addr[3],
 			scsi3addr[4], scsi3addr[5], scsi3addr[6], scsi3addr[7]);
 
@@ -4594,9 +4631,10 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
 		return FAILED;
 
 	memset(msg, 0, sizeof(msg));
-	ml += sprintf(msg+ml, "ABORT REQUEST on C%d:B%d:T%d:L%llu ",
+	ml += sprintf(msg+ml, "scsi %d:%d:%d:%llu %s",
 		h->scsi_host->host_no, sc->device->channel,
-		sc->device->id, sc->device->lun);
+		sc->device->id, sc->device->lun,
+		"Aborting command");
 
 	/* Find the device of the command to be aborted */
 	dev = sc->device->hostdata;
@@ -4624,8 +4662,9 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
 		ml += sprintf(msg+ml, "Command:0x%x SN:0x%lx ",
 			as->cmnd[0], as->serial_number);
 	dev_dbg(&h->pdev->dev, "%s\n", msg);
-	dev_warn(&h->pdev->dev, "Abort request on C%d:B%d:T%d:L%d\n",
-		h->scsi_host->host_no, dev->bus, dev->target, dev->lun);
+	dev_warn(&h->pdev->dev, "scsi %d:%d:%d:%d %s\n",
+		h->scsi_host->host_no, dev->bus, dev->target, dev->lun,
+		"Aborting command");
 	/*
 	 * Command is in flight, or possibly already completed
 	 * by the firmware (but not to the scsi mid layer) but we can't
@@ -4633,10 +4672,10 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
 	 */
 	rc = hpsa_send_abort_both_ways(h, dev->scsi3addr, abort);
 	if (rc != 0) {
-		dev_dbg(&h->pdev->dev, "%s Request FAILED.\n", msg);
-		dev_warn(&h->pdev->dev, "FAILED abort on device C%d:B%d:T%d:L%d\n",
+		dev_warn(&h->pdev->dev, "scsi %d:%d:%d:%d %s\n",
 			h->scsi_host->host_no,
-			dev->bus, dev->target, dev->lun);
+			dev->bus, dev->target, dev->lun,
+			"FAILED to abort command");
 		cmd_free(h, abort);
 		return FAILED;
 	}



* [PATCH v3 03/42] hpsa: rework controller command submission
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
  2015-03-17 20:02 ` [PATCH v3 01/42] hpsa: add masked physical devices into h->dev[] array Don Brace
  2015-03-17 20:02 ` [PATCH v3 02/42] hpsa: clean up host, channel, target, lun prints Don Brace
@ 2015-03-17 20:02 ` Don Brace
  2015-03-27 15:11   ` Tomas Henzl
  2015-03-17 20:02 ` [PATCH v3 04/42] hpsa: clean up aborts Don Brace
                   ` (38 subsequent siblings)
  41 siblings, 1 reply; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:02 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Webb Scales <webb.scales@hp.com>

Allow driver-initiated commands to have a timeout.  The driver does not
yet try to do anything with timeouts on such commands.

We are sending a reset in order to get rid of a command we want to abort.
If we make it return on the same reply queue as the command we want to abort,
the completion of the aborted command will not race with the completion of
the reset command.

Rename hpsa_scsi_do_simple_cmd_core() to hpsa_scsi_do_simple_cmd(), since
this function is the interface for issuing commands to the controller and
not the "core" of that implementation.  Add a parameter to it which allows
the caller to specify the reply queue to be used.  Modify existing callers
to specify the default reply queue.

Rename __hpsa_scsi_do_simple_cmd_core() to hpsa_scsi_do_simple_cmd_core(),
since this routine is the "core" implementation of the "do simple command"
function and there is no longer any other function with a similar name.
Modify the existing callers of this routine (other than
hpsa_scsi_do_simple_cmd()) to instead call hpsa_scsi_do_simple_cmd(), since
it will now accept the reply_queue parameter, and it provides a controller
lock-up check.  (Also, tweak two related message strings to make them
distinct from each other.)
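
The reply-queue selection that the new parameter enables can be sketched
in plain C (a model of the logic in set_performant_mode(), not driver
code; the function name pick_reply_queue() is illustrative):

```c
#include <assert.h>

/* Model of the reply-queue choice: DEFAULT_REPLY_QUEUE (-1) means
 * "derive the queue from the submitting CPU"; any other value is
 * honored directly, wrapped to the number of reply queues. */
#define DEFAULT_REPLY_QUEUE (-1)

static int pick_reply_queue(int reply_queue, int cpu, int nreply_queues)
{
	if (reply_queue == DEFAULT_REPLY_QUEUE)
		return cpu % nreply_queues;
	return reply_queue % nreply_queues;
}
```

Callers that do not care pass DEFAULT_REPLY_QUEUE; the abort path passes
the queue extracted from the command it is trying to abort.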

Submitting a command to a locked-up controller always results in a timeout,
so check for controller lock-up before submitting.

This is to enable fixing a race between command completions and
abort completions on different reply queues in a subsequent patch.
We want to be able to specify which reply queue an abort completion
should occur on so that it cannot race the completion of the command
it is trying to abort.

The following race was possible in theory:

  1. Abort command is sent to hardware.
  2. Command to be aborted simultaneously completes on another
     reply queue.
  3. Hardware receives abort command, decides command has already
     completed and indicates this to the driver via another different
     reply queue.
  4. Driver processes the abort completion, finds that the hardware does
     not know about the command, concludes that the command therefore
     cannot complete, and returns SUCCESS, indicating to the mid-layer
     that the scsi_cmnd may be re-used.
  5. Command from step 2 is processed and completed back to scsi mid
     layer (after we already promised that would never happen.)

Fix by forcing aborts to complete on the same reply queue as the command
they are aborting.
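
A toy model of why this works (purely illustrative, not driver code):
replies posted to a single queue are consumed in order, so if the
abort's completion is routed to the same queue as the command's
completion, the driver can never observe them out of order:

```c
#include <assert.h>

enum reply { CMD_COMPLETION = 1, ABORT_COMPLETION = 2 };

/* A minimal FIFO reply queue: hardware posts at tail, driver consumes
 * from head, so post order is preserved on any one queue. */
struct reply_queue {
	enum reply entries[8];
	int head, tail;
};

static void post_reply(struct reply_queue *q, enum reply r)
{
	q->entries[q->tail++ & 7] = r;
}

static enum reply next_reply(struct reply_queue *q)
{
	return q->entries[q->head++ & 7];
}
```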

Piggybacking device rescanning functionality onto the lockup
detection thread is not a good idea because if the controller
locks up during device rescanning, the thread could get
stuck and the lockup would never be detected.  Use separate work
queues for device rescanning and lockup detection.

Detect controller lockup in abort handler.

After a lockup is detected, return DID_NO_CONNECT, which results in
immediate termination of commands, rather than DID_ERROR, which results
in retries.
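
The difference matters because the SCSI mid-layer reads the command's
disposition from the host byte of scsi_cmnd->result.  A minimal sketch
(the DID_* values mirror include/scsi/scsi.h of this era; treat the
helper as illustrative):

```c
#include <assert.h>

#define DID_OK         0x00	/* command completed normally */
#define DID_NO_CONNECT 0x01	/* fail immediately, no retries */
#define DID_ERROR      0x07	/* internal error; mid-layer may retry */

/* The host byte lives in bits 16-23 of the result word, hence the
 * "DID_xxx << 16" assignments seen throughout the driver. */
static unsigned int host_byte(unsigned int result)
{
	return (result >> 16) & 0xff;
}
```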

Modify detect_controller_lockup() to return its result, removing the
need for a separate check.

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Webb Scales <webbnh@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c     |  326 ++++++++++++++++++++++++++++++++++++-----------
 drivers/scsi/hpsa_cmd.h |    5 +
 2 files changed, 254 insertions(+), 77 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 9b88726..488f81b 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -253,6 +253,8 @@ static int hpsa_scsi_ioaccel_queue_command(struct ctlr_info *h,
 	struct CommandList *c, u32 ioaccel_handle, u8 *cdb, int cdb_len,
 	u8 *scsi3addr, struct hpsa_scsi_dev_t *phys_disk);
 static void hpsa_command_resubmit_worker(struct work_struct *work);
+static u32 lockup_detected(struct ctlr_info *h);
+static int detect_controller_lockup(struct ctlr_info *h);
 
 static inline struct ctlr_info *sdev_to_hba(struct scsi_device *sdev)
 {
@@ -748,30 +750,43 @@ static inline u32 next_command(struct ctlr_info *h, u8 q)
  * a separate special register for submitting commands.
  */
 
-/* set_performant_mode: Modify the tag for cciss performant
+/*
+ * set_performant_mode: Modify the tag for cciss performant
  * set bit 0 for pull model, bits 3-1 for block fetch
  * register number
  */
-static void set_performant_mode(struct ctlr_info *h, struct CommandList *c)
+#define DEFAULT_REPLY_QUEUE (-1)
+static void set_performant_mode(struct ctlr_info *h, struct CommandList *c,
+					int reply_queue)
 {
 	if (likely(h->transMethod & CFGTBL_Trans_Performant)) {
 		c->busaddr |= 1 | (h->blockFetchTable[c->Header.SGList] << 1);
-		if (likely(h->msix_vector > 0))
+		if (unlikely(!h->msix_vector))
+			return;
+		if (likely(reply_queue == DEFAULT_REPLY_QUEUE))
 			c->Header.ReplyQueue =
 				raw_smp_processor_id() % h->nreply_queues;
+		else
+			c->Header.ReplyQueue = reply_queue % h->nreply_queues;
 	}
 }
 
 static void set_ioaccel1_performant_mode(struct ctlr_info *h,
-						struct CommandList *c)
+						struct CommandList *c,
+						int reply_queue)
 {
 	struct io_accel1_cmd *cp = &h->ioaccel_cmd_pool[c->cmdindex];
 
-	/* Tell the controller to post the reply to the queue for this
+	/*
+	 * Tell the controller to post the reply to the queue for this
 	 * processor.  This seems to give the best I/O throughput.
 	 */
-	cp->ReplyQueue = smp_processor_id() % h->nreply_queues;
-	/* Set the bits in the address sent down to include:
+	if (likely(reply_queue == DEFAULT_REPLY_QUEUE))
+		cp->ReplyQueue = smp_processor_id() % h->nreply_queues;
+	else
+		cp->ReplyQueue = reply_queue % h->nreply_queues;
+	/*
+	 * Set the bits in the address sent down to include:
 	 *  - performant mode bit (bit 0)
 	 *  - pull count (bits 1-3)
 	 *  - command type (bits 4-6)
@@ -781,15 +796,21 @@ static void set_ioaccel1_performant_mode(struct ctlr_info *h,
 }
 
 static void set_ioaccel2_performant_mode(struct ctlr_info *h,
-						struct CommandList *c)
+						struct CommandList *c,
+						int reply_queue)
 {
 	struct io_accel2_cmd *cp = &h->ioaccel2_cmd_pool[c->cmdindex];
 
-	/* Tell the controller to post the reply to the queue for this
+	/*
+	 * Tell the controller to post the reply to the queue for this
 	 * processor.  This seems to give the best I/O throughput.
 	 */
-	cp->reply_queue = smp_processor_id() % h->nreply_queues;
-	/* Set the bits in the address sent down to include:
+	if (likely(reply_queue == DEFAULT_REPLY_QUEUE))
+		cp->reply_queue = smp_processor_id() % h->nreply_queues;
+	else
+		cp->reply_queue = reply_queue % h->nreply_queues;
+	/*
+	 * Set the bits in the address sent down to include:
 	 *  - performant mode bit not used in ioaccel mode 2
 	 *  - pull count (bits 0-3)
 	 *  - command type isn't needed for ioaccel2
@@ -826,26 +847,32 @@ static void dial_up_lockup_detection_on_fw_flash_complete(struct ctlr_info *h,
 		h->heartbeat_sample_interval = HEARTBEAT_SAMPLE_INTERVAL;
 }
 
-static void enqueue_cmd_and_start_io(struct ctlr_info *h,
-	struct CommandList *c)
+static void __enqueue_cmd_and_start_io(struct ctlr_info *h,
+	struct CommandList *c, int reply_queue)
 {
 	dial_down_lockup_detection_during_fw_flash(h, c);
 	atomic_inc(&h->commands_outstanding);
 	switch (c->cmd_type) {
 	case CMD_IOACCEL1:
-		set_ioaccel1_performant_mode(h, c);
+		set_ioaccel1_performant_mode(h, c, reply_queue);
 		writel(c->busaddr, h->vaddr + SA5_REQUEST_PORT_OFFSET);
 		break;
 	case CMD_IOACCEL2:
-		set_ioaccel2_performant_mode(h, c);
+		set_ioaccel2_performant_mode(h, c, reply_queue);
 		writel(c->busaddr, h->vaddr + IOACCEL2_INBOUND_POSTQ_32);
 		break;
 	default:
-		set_performant_mode(h, c);
+		set_performant_mode(h, c, reply_queue);
 		h->access.submit_command(h, c);
 	}
 }
 
+static void enqueue_cmd_and_start_io(struct ctlr_info *h,
+					struct CommandList *c)
+{
+	__enqueue_cmd_and_start_io(h, c, DEFAULT_REPLY_QUEUE);
+}
+
 static inline int is_hba_lunid(unsigned char scsi3addr[])
 {
 	return memcmp(scsi3addr, RAID_CTLR_LUNID, 8) == 0;
@@ -1877,6 +1904,19 @@ static void complete_scsi_command(struct CommandList *cp)
 	if (cp->cmd_type == CMD_IOACCEL2 || cp->cmd_type == CMD_IOACCEL1)
 		atomic_dec(&cp->phys_disk->ioaccel_cmds_out);
 
+	/*
+	 * We check for lockup status here as it may be set for
+	 * CMD_SCSI, CMD_IOACCEL1 and CMD_IOACCEL2 commands by
+	 * fail_all_outstanding_cmds()
+	 */
+	if (unlikely(ei->CommandStatus == CMD_CTLR_LOCKUP)) {
+		/* DID_NO_CONNECT will prevent a retry */
+		cmd->result = DID_NO_CONNECT << 16;
+		cmd_free(h, cp);
+		cmd->scsi_done(cmd);
+		return;
+	}
+
 	if (cp->cmd_type == CMD_IOACCEL2)
 		return process_ioaccel2_completion(h, cp, cmd, dev);
 
@@ -2091,14 +2131,36 @@ static int hpsa_map_one(struct pci_dev *pdev,
 	return 0;
 }
 
-static inline void hpsa_scsi_do_simple_cmd_core(struct ctlr_info *h,
-	struct CommandList *c)
+#define NO_TIMEOUT ((unsigned long) -1)
+#define DEFAULT_TIMEOUT 30000 /* milliseconds */
+static int hpsa_scsi_do_simple_cmd_core(struct ctlr_info *h,
+	struct CommandList *c, int reply_queue, unsigned long timeout_msecs)
 {
 	DECLARE_COMPLETION_ONSTACK(wait);
 
 	c->waiting = &wait;
-	enqueue_cmd_and_start_io(h, c);
-	wait_for_completion(&wait);
+	__enqueue_cmd_and_start_io(h, c, reply_queue);
+	if (timeout_msecs == NO_TIMEOUT) {
+		/* TODO: get rid of this no-timeout thing */
+		wait_for_completion_io(&wait);
+		return IO_OK;
+	}
+	if (!wait_for_completion_io_timeout(&wait,
+					msecs_to_jiffies(timeout_msecs))) {
+		dev_warn(&h->pdev->dev, "Command timed out.\n");
+		return -ETIMEDOUT;
+	}
+	return IO_OK;
+}
+
+static int hpsa_scsi_do_simple_cmd(struct ctlr_info *h, struct CommandList *c,
+				   int reply_queue, unsigned long timeout_msecs)
+{
+	if (unlikely(lockup_detected(h))) {
+		c->err_info->CommandStatus = CMD_CTLR_LOCKUP;
+		return IO_OK;
+	}
+	return hpsa_scsi_do_simple_cmd_core(h, c, reply_queue, timeout_msecs);
 }
 
 static u32 lockup_detected(struct ctlr_info *h)
@@ -2113,25 +2175,19 @@ static u32 lockup_detected(struct ctlr_info *h)
 	return rc;
 }
 
-static void hpsa_scsi_do_simple_cmd_core_if_no_lockup(struct ctlr_info *h,
-	struct CommandList *c)
-{
-	/* If controller lockup detected, fake a hardware error. */
-	if (unlikely(lockup_detected(h)))
-		c->err_info->CommandStatus = CMD_HARDWARE_ERR;
-	else
-		hpsa_scsi_do_simple_cmd_core(h, c);
-}
-
 #define MAX_DRIVER_CMD_RETRIES 25
-static void hpsa_scsi_do_simple_cmd_with_retry(struct ctlr_info *h,
-	struct CommandList *c, int data_direction)
+static int hpsa_scsi_do_simple_cmd_with_retry(struct ctlr_info *h,
+	struct CommandList *c, int data_direction, unsigned long timeout_msecs)
 {
 	int backoff_time = 10, retry_count = 0;
+	int rc;
 
 	do {
 		memset(c->err_info, 0, sizeof(*c->err_info));
-		hpsa_scsi_do_simple_cmd_core(h, c);
+		rc = hpsa_scsi_do_simple_cmd(h, c, DEFAULT_REPLY_QUEUE,
+						  timeout_msecs);
+		if (rc)
+			break;
 		retry_count++;
 		if (retry_count > 3) {
 			msleep(backoff_time);
@@ -2142,6 +2198,9 @@ static void hpsa_scsi_do_simple_cmd_with_retry(struct ctlr_info *h,
 			check_for_busy(h, c)) &&
 			retry_count <= MAX_DRIVER_CMD_RETRIES);
 	hpsa_pci_unmap(h->pdev, c, 1, data_direction);
+	if (retry_count > MAX_DRIVER_CMD_RETRIES)
+		rc = -EIO;
+	return rc;
 }
 
 static void hpsa_print_cmd(struct ctlr_info *h, char *txt,
@@ -2218,6 +2277,9 @@ static void hpsa_scsi_interpret_error(struct ctlr_info *h,
 	case CMD_UNABORTABLE:
 		hpsa_print_cmd(h, "unabortable", cp);
 		break;
+	case CMD_CTLR_LOCKUP:
+		hpsa_print_cmd(h, "controller lockup detected", cp);
+		break;
 	default:
 		hpsa_print_cmd(h, "unknown status", cp);
 		dev_warn(d, "Unknown command status %x\n",
@@ -2245,7 +2307,10 @@ static int hpsa_scsi_do_inquiry(struct ctlr_info *h, unsigned char *scsi3addr,
 		rc = -1;
 		goto out;
 	}
-	hpsa_scsi_do_simple_cmd_with_retry(h, c, PCI_DMA_FROMDEVICE);
+	rc = hpsa_scsi_do_simple_cmd_with_retry(h, c,
+			PCI_DMA_FROMDEVICE, NO_TIMEOUT);
+	if (rc)
+		goto out;
 	ei = c->err_info;
 	if (ei->CommandStatus != 0 && ei->CommandStatus != CMD_DATA_UNDERRUN) {
 		hpsa_scsi_interpret_error(h, c);
@@ -2275,7 +2340,10 @@ static int hpsa_bmic_ctrl_mode_sense(struct ctlr_info *h,
 		rc = -1;
 		goto out;
 	}
-	hpsa_scsi_do_simple_cmd_with_retry(h, c, PCI_DMA_FROMDEVICE);
+	rc = hpsa_scsi_do_simple_cmd_with_retry(h, c,
+					PCI_DMA_FROMDEVICE, NO_TIMEOUT);
+	if (rc)
+		goto out;
 	ei = c->err_info;
 	if (ei->CommandStatus != 0 && ei->CommandStatus != CMD_DATA_UNDERRUN) {
 		hpsa_scsi_interpret_error(h, c);
@@ -2287,7 +2355,7 @@ out:
 	}
 
 static int hpsa_send_reset(struct ctlr_info *h, unsigned char *scsi3addr,
-	u8 reset_type)
+	u8 reset_type, int reply_queue)
 {
 	int rc = IO_OK;
 	struct CommandList *c;
@@ -2304,7 +2372,11 @@ static int hpsa_send_reset(struct ctlr_info *h, unsigned char *scsi3addr,
 	(void) fill_cmd(c, HPSA_DEVICE_RESET_MSG, h, NULL, 0, 0,
 			scsi3addr, TYPE_MSG);
 	c->Request.CDB[1] = reset_type; /* fill_cmd defaults to LUN reset */
-	hpsa_scsi_do_simple_cmd_core(h, c);
+	rc = hpsa_scsi_do_simple_cmd(h, c, reply_queue, NO_TIMEOUT);
+	if (rc) {
+		dev_warn(&h->pdev->dev, "Failed to send reset command\n");
+		goto out;
+	}
 	/* no unmap needed here because no data xfer. */
 
 	ei = c->err_info;
@@ -2312,6 +2384,7 @@ static int hpsa_send_reset(struct ctlr_info *h, unsigned char *scsi3addr,
 		hpsa_scsi_interpret_error(h, c);
 		rc = -1;
 	}
+out:
 	cmd_free(h, c);
 	return rc;
 }
@@ -2429,15 +2502,18 @@ static int hpsa_get_raid_map(struct ctlr_info *h,
 			sizeof(this_device->raid_map), 0,
 			scsi3addr, TYPE_CMD)) {
 		dev_warn(&h->pdev->dev, "Out of memory in hpsa_get_raid_map()\n");
-		cmd_free(h, c);
-		return -ENOMEM;
+		rc = -ENOMEM;
+		goto out;
 	}
-	hpsa_scsi_do_simple_cmd_with_retry(h, c, PCI_DMA_FROMDEVICE);
+	rc = hpsa_scsi_do_simple_cmd_with_retry(h, c,
+					PCI_DMA_FROMDEVICE, NO_TIMEOUT);
+	if (rc)
+		goto out;
 	ei = c->err_info;
 	if (ei->CommandStatus != 0 && ei->CommandStatus != CMD_DATA_UNDERRUN) {
 		hpsa_scsi_interpret_error(h, c);
-		cmd_free(h, c);
-		return -1;
+		rc = -1;
+		goto out;
 	}
 	cmd_free(h, c);
 
@@ -2449,6 +2525,9 @@ static int hpsa_get_raid_map(struct ctlr_info *h,
 	}
 	hpsa_debug_map_buff(h, rc, &this_device->raid_map);
 	return rc;
+out:
+	cmd_free(h, c);
+	return rc;
 }
 
 static int hpsa_bmic_id_physical_device(struct ctlr_info *h,
@@ -2468,7 +2547,8 @@ static int hpsa_bmic_id_physical_device(struct ctlr_info *h,
 	c->Request.CDB[2] = bmic_device_index & 0xff;
 	c->Request.CDB[9] = (bmic_device_index >> 8) & 0xff;
 
-	hpsa_scsi_do_simple_cmd_with_retry(h, c, PCI_DMA_FROMDEVICE);
+	hpsa_scsi_do_simple_cmd_with_retry(h, c, PCI_DMA_FROMDEVICE,
+						NO_TIMEOUT);
 	ei = c->err_info;
 	if (ei->CommandStatus != 0 && ei->CommandStatus != CMD_DATA_UNDERRUN) {
 		hpsa_scsi_interpret_error(h, c);
@@ -2603,7 +2683,10 @@ static int hpsa_scsi_do_report_luns(struct ctlr_info *h, int logical,
 	}
 	if (extended_response)
 		c->Request.CDB[1] = extended_response;
-	hpsa_scsi_do_simple_cmd_with_retry(h, c, PCI_DMA_FROMDEVICE);
+	rc = hpsa_scsi_do_simple_cmd_with_retry(h, c,
+					PCI_DMA_FROMDEVICE, NO_TIMEOUT);
+	if (rc)
+		goto out;
 	ei = c->err_info;
 	if (ei->CommandStatus != 0 &&
 	    ei->CommandStatus != CMD_DATA_UNDERRUN) {
@@ -2696,7 +2779,7 @@ static int hpsa_volume_offline(struct ctlr_info *h,
 {
 	struct CommandList *c;
 	unsigned char *sense, sense_key, asc, ascq;
-	int ldstat = 0;
+	int rc, ldstat = 0;
 	u16 cmd_status;
 	u8 scsi_status;
 #define ASC_LUN_NOT_READY 0x04
@@ -2707,7 +2790,11 @@ static int hpsa_volume_offline(struct ctlr_info *h,
 	if (!c)
 		return 0;
 	(void) fill_cmd(c, TEST_UNIT_READY, h, NULL, 0, 0, scsi3addr, TYPE_CMD);
-	hpsa_scsi_do_simple_cmd_core(h, c);
+	rc = hpsa_scsi_do_simple_cmd(h, c, DEFAULT_REPLY_QUEUE, NO_TIMEOUT);
+	if (rc) {
+		cmd_free(h, c);
+		return 0;
+	}
 	sense = c->err_info->SenseInfo;
 	sense_key = sense[2];
 	asc = sense[12];
@@ -4040,7 +4127,11 @@ static int hpsa_scsi_ioaccel_raid_map(struct ctlr_info *h,
 						dev->phys_disk[map_index]);
 }
 
-/* Submit commands down the "normal" RAID stack path */
+/*
+ * Submit commands down the "normal" RAID stack path
+ * All callers to hpsa_ciss_submit must check lockup_detected
+ * beforehand, before (opt.) and after calling cmd_alloc
+ */
 static int hpsa_ciss_submit(struct ctlr_info *h,
 	struct CommandList *c, struct scsi_cmnd *cmd,
 	unsigned char scsi3addr[])
@@ -4151,7 +4242,7 @@ static int hpsa_scsi_queue_command(struct Scsi_Host *sh, struct scsi_cmnd *cmd)
 	memcpy(scsi3addr, dev->scsi3addr, sizeof(scsi3addr));
 
 	if (unlikely(lockup_detected(h))) {
-		cmd->result = DID_ERROR << 16;
+		cmd->result = DID_NO_CONNECT << 16;
 		cmd->scsi_done(cmd);
 		return 0;
 	}
@@ -4161,7 +4252,7 @@ static int hpsa_scsi_queue_command(struct Scsi_Host *sh, struct scsi_cmnd *cmd)
 		return SCSI_MLQUEUE_HOST_BUSY;
 	}
 	if (unlikely(lockup_detected(h))) {
-		cmd->result = DID_ERROR << 16;
+		cmd->result = DID_NO_CONNECT << 16;
 		cmd_free(h, c);
 		cmd->scsi_done(cmd);
 		return 0;
@@ -4356,7 +4447,10 @@ static int wait_for_device_to_become_ready(struct ctlr_info *h,
 		/* Send the Test Unit Ready, fill_cmd can't fail, no mapping */
 		(void) fill_cmd(c, TEST_UNIT_READY, h,
 				NULL, 0, 0, lunaddr, TYPE_CMD);
-		hpsa_scsi_do_simple_cmd_core(h, c);
+		rc = hpsa_scsi_do_simple_cmd(h, c, DEFAULT_REPLY_QUEUE,
+						NO_TIMEOUT);
+		if (rc)
+			goto do_it_again;
 		/* no unmap needed here because no data xfer. */
 
 		if (c->err_info->CommandStatus == CMD_SUCCESS)
@@ -4367,7 +4461,7 @@ static int wait_for_device_to_become_ready(struct ctlr_info *h,
 			(c->err_info->SenseInfo[2] == NO_SENSE ||
 			c->err_info->SenseInfo[2] == UNIT_ATTENTION))
 			break;
-
+do_it_again:
 		dev_warn(&h->pdev->dev, "waiting %d secs "
 			"for device to become ready.\n", waittime);
 		rc = 1; /* device not ready. */
@@ -4405,13 +4499,46 @@ static int hpsa_eh_device_reset_handler(struct scsi_cmnd *scsicmd)
 			"device lookup failed.\n");
 		return FAILED;
 	}
-	dev_warn(&h->pdev->dev, "resetting device %d:%d:%d:%d\n",
-		h->scsi_host->host_no, dev->bus, dev->target, dev->lun);
+
+	/* if controller locked up, we can guarantee command won't complete */
+	if (lockup_detected(h)) {
+		dev_warn(&h->pdev->dev,
+			"scsi %d:%d:%d:%d RESET FAILED, lockup detected\n",
+			h->scsi_host->host_no, dev->bus, dev->target,
+			dev->lun);
+		return FAILED;
+	}
+
+	/* this reset request might be the result of a lockup; check */
+	if (detect_controller_lockup(h)) {
+		dev_warn(&h->pdev->dev,
+			 "scsi %d:%d:%d:%d RESET FAILED, new lockup detected\n",
+			 h->scsi_host->host_no, dev->bus, dev->target,
+			 dev->lun);
+		return FAILED;
+	}
+
+	dev_warn(&h->pdev->dev,
+		"scsi %d:%d:%d:%d: %s %.8s %.16s resetting RAID-%s SSDSmartPathCap%c En%c Exp=%d\n",
+		h->scsi_host->host_no, dev->bus, dev->target, dev->lun,
+		scsi_device_type(dev->devtype),
+		dev->vendor,
+		dev->model,
+		dev->raid_level > RAID_UNKNOWN ?
+			"RAID-?" : raid_label[dev->raid_level],
+		dev->offload_config ? '+' : '-',
+		dev->offload_enabled ? '+' : '-',
+		dev->expose_state);
+
 	/* send a reset to the SCSI LUN which the command was sent to */
-	rc = hpsa_send_reset(h, dev->scsi3addr, HPSA_RESET_TYPE_LUN);
+	rc = hpsa_send_reset(h, dev->scsi3addr, HPSA_RESET_TYPE_LUN,
+			     DEFAULT_REPLY_QUEUE);
 	if (rc == 0 && wait_for_device_to_become_ready(h, dev->scsi3addr) == 0)
 		return SUCCESS;
 
+	dev_warn(&h->pdev->dev,
+		"scsi %d:%d:%d:%d reset failed\n",
+		h->scsi_host->host_no, dev->bus, dev->target, dev->lun);
 	return FAILED;
 }
 
@@ -4456,7 +4583,7 @@ static void hpsa_get_tag(struct ctlr_info *h,
 }
 
 static int hpsa_send_abort(struct ctlr_info *h, unsigned char *scsi3addr,
-	struct CommandList *abort, int swizzle)
+	struct CommandList *abort, int swizzle, int reply_queue)
 {
 	int rc = IO_OK;
 	struct CommandList *c;
@@ -4474,9 +4601,9 @@ static int hpsa_send_abort(struct ctlr_info *h, unsigned char *scsi3addr,
 		0, 0, scsi3addr, TYPE_MSG);
 	if (swizzle)
 		swizzle_abort_tag(&c->Request.CDB[4]);
-	hpsa_scsi_do_simple_cmd_core(h, c);
+	(void) hpsa_scsi_do_simple_cmd(h, c, reply_queue, NO_TIMEOUT);
 	hpsa_get_tag(h, abort, &taglower, &tagupper);
-	dev_dbg(&h->pdev->dev, "%s: Tag:0x%08x:%08x: do_simple_cmd_core completed.\n",
+	dev_dbg(&h->pdev->dev, "%s: Tag:0x%08x:%08x: do_simple_cmd(abort) completed.\n",
 		__func__, tagupper, taglower);
 	/* no unmap needed here because no data xfer. */
 
@@ -4508,7 +4635,7 @@ static int hpsa_send_abort(struct ctlr_info *h, unsigned char *scsi3addr,
  */
 
 static int hpsa_send_reset_as_abort_ioaccel2(struct ctlr_info *h,
-	unsigned char *scsi3addr, struct CommandList *abort)
+	unsigned char *scsi3addr, struct CommandList *abort, int reply_queue)
 {
 	int rc = IO_OK;
 	struct scsi_cmnd *scmd; /* scsi command within request being aborted */
@@ -4551,7 +4678,7 @@ static int hpsa_send_reset_as_abort_ioaccel2(struct ctlr_info *h,
 			"Reset as abort: Resetting physical device at scsi3addr 0x%02x%02x%02x%02x%02x%02x%02x%02x\n",
 			psa[0], psa[1], psa[2], psa[3],
 			psa[4], psa[5], psa[6], psa[7]);
-	rc = hpsa_send_reset(h, psa, HPSA_RESET_TYPE_TARGET);
+	rc = hpsa_send_reset(h, psa, HPSA_RESET_TYPE_TARGET, reply_queue);
 	if (rc != 0) {
 		dev_warn(&h->pdev->dev,
 			"Reset as abort: Failed on physical device at scsi3addr 0x%02x%02x%02x%02x%02x%02x%02x%02x\n",
@@ -4585,7 +4712,7 @@ static int hpsa_send_reset_as_abort_ioaccel2(struct ctlr_info *h,
  * make this true someday become false.
  */
 static int hpsa_send_abort_both_ways(struct ctlr_info *h,
-	unsigned char *scsi3addr, struct CommandList *abort)
+	unsigned char *scsi3addr, struct CommandList *abort, int reply_queue)
 {
 	/* ioccelerator mode 2 commands should be aborted via the
 	 * accelerated path, since RAID path is unaware of these commands,
@@ -4593,10 +4720,20 @@ static int hpsa_send_abort_both_ways(struct ctlr_info *h,
 	 * Change abort to physical device reset.
 	 */
 	if (abort->cmd_type == CMD_IOACCEL2)
-		return hpsa_send_reset_as_abort_ioaccel2(h, scsi3addr, abort);
+		return hpsa_send_reset_as_abort_ioaccel2(h, scsi3addr,
+							abort, reply_queue);
+
+	return hpsa_send_abort(h, scsi3addr, abort, 0, reply_queue) &&
+			hpsa_send_abort(h, scsi3addr, abort, 1, reply_queue);
+}
 
-	return hpsa_send_abort(h, scsi3addr, abort, 0) &&
-			hpsa_send_abort(h, scsi3addr, abort, 1);
+/* Find out which reply queue a command was meant to return on */
+static int hpsa_extract_reply_queue(struct ctlr_info *h,
+					struct CommandList *c)
+{
+	if (c->cmd_type == CMD_IOACCEL2)
+		return h->ioaccel2_cmd_pool[c->cmdindex].reply_queue;
+	return c->Header.ReplyQueue;
 }
 
 /* Send an abort for the specified command.
@@ -4614,7 +4751,7 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
 	char msg[256];		/* For debug messaging. */
 	int ml = 0;
 	__le32 tagupper, taglower;
-	int refcount;
+	int refcount, reply_queue;
 
 	/* Find the controller of the command to be aborted */
 	h = sdev_to_hba(sc->device);
@@ -4622,8 +4759,23 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
 			"ABORT REQUEST FAILED, Controller lookup failed.\n"))
 		return FAILED;
 
-	if (lockup_detected(h))
+	/* If controller locked up, we can guarantee command won't complete */
+	if (lockup_detected(h)) {
+		dev_warn(&h->pdev->dev,
+			"scsi %d:%d:%d:%llu scmd %p ABORT FAILED, lockup detected\n",
+			h->scsi_host->host_no, sc->device->channel,
+			sc->device->id, sc->device->lun, sc);
 		return FAILED;
+	}
+
+	/* This is a good time to check if controller lockup has occurred */
+	if (detect_controller_lockup(h)) {
+		dev_warn(&h->pdev->dev,
+			 "scsi %d:%d:%d:%llu scmd %p ABORT FAILED, new lockup detected\n",
+			 h->scsi_host->host_no, sc->device->channel,
+			 sc->device->id, sc->device->lun, sc);
+		return FAILED;
+	}
 
 	/* Check that controller supports some kind of task abort */
 	if (!(HPSATMF_PHYS_TASK_ABORT & h->TMFSupportFlags) &&
@@ -4656,6 +4808,7 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
 		return SUCCESS;
 	}
 	hpsa_get_tag(h, abort, &taglower, &tagupper);
+	reply_queue = hpsa_extract_reply_queue(h, abort);
 	ml += sprintf(msg+ml, "Tag:0x%08x:%08x ", tagupper, taglower);
 	as  = abort->scsi_cmd;
 	if (as != NULL)
@@ -4670,7 +4823,7 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
 	 * by the firmware (but not to the scsi mid layer) but we can't
 	 * distinguish which.  Send the abort down.
 	 */
-	rc = hpsa_send_abort_both_ways(h, dev->scsi3addr, abort);
+	rc = hpsa_send_abort_both_ways(h, dev->scsi3addr, abort, reply_queue);
 	if (rc != 0) {
 		dev_warn(&h->pdev->dev, "scsi %d:%d:%d:%d %s\n",
 			h->scsi_host->host_no,
@@ -4995,7 +5148,9 @@ static int hpsa_passthru_ioctl(struct ctlr_info *h, void __user *argp)
 		c->SG[0].Len = cpu_to_le32(iocommand.buf_size);
 		c->SG[0].Ext = cpu_to_le32(HPSA_SG_LAST); /* not chaining */
 	}
-	hpsa_scsi_do_simple_cmd_core_if_no_lockup(h, c);
+	rc = hpsa_scsi_do_simple_cmd(h, c, DEFAULT_REPLY_QUEUE, NO_TIMEOUT);
+	if (rc)
+		rc = -EIO;
 	if (iocommand.buf_size > 0)
 		hpsa_pci_unmap(h->pdev, c, 1, PCI_DMA_BIDIRECTIONAL);
 	check_ioctl_unit_attention(h, c);
@@ -5125,7 +5280,11 @@ static int hpsa_big_passthru_ioctl(struct ctlr_info *h, void __user *argp)
 		}
 		c->SG[--i].Ext = cpu_to_le32(HPSA_SG_LAST);
 	}
-	hpsa_scsi_do_simple_cmd_core_if_no_lockup(h, c);
+	status = hpsa_scsi_do_simple_cmd(h, c, DEFAULT_REPLY_QUEUE, NO_TIMEOUT);
+	if (status) {
+		status = -EIO;
+		goto cleanup0;
+	}
 	if (sg_used)
 		hpsa_pci_unmap(h->pdev, c, sg_used, PCI_DMA_BIDIRECTIONAL);
 	check_ioctl_unit_attention(h, c);
@@ -6272,6 +6431,8 @@ static int hpsa_wait_for_mode_change_ack(struct ctlr_info *h)
 	 * as we enter this code.)
 	 */
 	for (i = 0; i < MAX_MODE_CHANGE_WAIT; i++) {
+		if (h->remove_in_progress)
+			goto done;
 		spin_lock_irqsave(&h->lock, flags);
 		doorbell_value = readl(h->vaddr + SA5_DOORBELL);
 		spin_unlock_irqrestore(&h->lock, flags);
@@ -6667,17 +6828,21 @@ static void fail_all_outstanding_cmds(struct ctlr_info *h)
 {
 	int i, refcount;
 	struct CommandList *c;
+	int failcount = 0;
 
 	flush_workqueue(h->resubmit_wq); /* ensure all cmds are fully built */
 	for (i = 0; i < h->nr_cmds; i++) {
 		c = h->cmd_pool + i;
 		refcount = atomic_inc_return(&c->refcount);
 		if (refcount > 1) {
-			c->err_info->CommandStatus = CMD_HARDWARE_ERR;
+			c->err_info->CommandStatus = CMD_CTLR_LOCKUP;
 			finish_cmd(c);
+			failcount++;
 		}
 		cmd_free(h, c);
 	}
+	dev_warn(&h->pdev->dev,
+		"failed %d commands in fail_all\n", failcount);
 }
 
 static void set_lockup_detected_for_all_cpus(struct ctlr_info *h, u32 value)
@@ -6705,18 +6870,19 @@ static void controller_lockup_detected(struct ctlr_info *h)
 	if (!lockup_detected) {
 		/* no heartbeat, but controller gave us a zero. */
 		dev_warn(&h->pdev->dev,
-			"lockup detected but scratchpad register is zero\n");
+			"lockup detected after %d but scratchpad register is zero\n",
+			h->heartbeat_sample_interval / HZ);
 		lockup_detected = 0xffffffff;
 	}
 	set_lockup_detected_for_all_cpus(h, lockup_detected);
 	spin_unlock_irqrestore(&h->lock, flags);
-	dev_warn(&h->pdev->dev, "Controller lockup detected: 0x%08x\n",
-			lockup_detected);
+	dev_warn(&h->pdev->dev, "Controller lockup detected: 0x%08x after %d\n",
+			lockup_detected, h->heartbeat_sample_interval / HZ);
 	pci_disable_device(h->pdev);
 	fail_all_outstanding_cmds(h);
 }
 
-static void detect_controller_lockup(struct ctlr_info *h)
+static int detect_controller_lockup(struct ctlr_info *h)
 {
 	u64 now;
 	u32 heartbeat;
@@ -6726,7 +6892,7 @@ static void detect_controller_lockup(struct ctlr_info *h)
 	/* If we've received an interrupt recently, we're ok. */
 	if (time_after64(h->last_intr_timestamp +
 				(h->heartbeat_sample_interval), now))
-		return;
+		return false;
 
 	/*
 	 * If we've already checked the heartbeat recently, we're ok.
@@ -6735,7 +6901,7 @@ static void detect_controller_lockup(struct ctlr_info *h)
 	 */
 	if (time_after64(h->last_heartbeat_timestamp +
 				(h->heartbeat_sample_interval), now))
-		return;
+		return false;
 
 	/* If heartbeat has not changed since we last looked, we're not ok. */
 	spin_lock_irqsave(&h->lock, flags);
@@ -6743,12 +6909,13 @@ static void detect_controller_lockup(struct ctlr_info *h)
 	spin_unlock_irqrestore(&h->lock, flags);
 	if (h->last_heartbeat == heartbeat) {
 		controller_lockup_detected(h);
-		return;
+		return true;
 	}
 
 	/* We're ok. */
 	h->last_heartbeat = heartbeat;
 	h->last_heartbeat_timestamp = now;
+	return false;
 }
 
 static void hpsa_ack_ctlr_events(struct ctlr_info *h)
@@ -7092,8 +7259,10 @@ static void hpsa_flush_cache(struct ctlr_info *h)
 {
 	char *flush_buf;
 	struct CommandList *c;
+	int rc;
 
 	/* Don't bother trying to flush the cache if locked up */
+	/* FIXME not necessary if do_simple_cmd does the check */
 	if (unlikely(lockup_detected(h)))
 		return;
 	flush_buf = kzalloc(4, GFP_KERNEL);
@@ -7109,7 +7278,10 @@ static void hpsa_flush_cache(struct ctlr_info *h)
 		RAID_CTLR_LUNID, TYPE_CMD)) {
 		goto out;
 	}
-	hpsa_scsi_do_simple_cmd_with_retry(h, c, PCI_DMA_TODEVICE);
+	rc = hpsa_scsi_do_simple_cmd_with_retry(h, c,
+					PCI_DMA_TODEVICE, NO_TIMEOUT);
+	if (rc)
+		goto out;
 	if (c->err_info->CommandStatus != 0)
 out:
 		dev_warn(&h->pdev->dev,
diff --git a/drivers/scsi/hpsa_cmd.h b/drivers/scsi/hpsa_cmd.h
index 76d5499..f52c847 100644
--- a/drivers/scsi/hpsa_cmd.h
+++ b/drivers/scsi/hpsa_cmd.h
@@ -43,6 +43,11 @@
 #define CMD_TIMEOUT             0x000B
 #define CMD_UNABORTABLE		0x000C
 #define CMD_IOACCEL_DISABLED	0x000E
+#define CMD_CTLR_LOCKUP		0xffff
+/* Note: CMD_CTLR_LOCKUP is not a value defined by the CISS spec;
+ * it is a value defined by the driver that commands can be marked
+ * with when a controller lockup has been detected by the driver.
+ */
 
 
 /* Unit Attentions ASC's as defined for the MSA2012sa */



* [PATCH v3 04/42] hpsa: clean up aborts
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (2 preceding siblings ...)
  2015-03-17 20:02 ` [PATCH v3 03/42] hpsa: rework controller command submission Don Brace
@ 2015-03-17 20:02 ` Don Brace
  2015-03-17 20:02 ` [PATCH v3 05/42] hpsa: decrement h->commands_outstanding in fail_all_outstanding_cmds Don Brace
                   ` (37 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:02 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Stephen Cameron <stephenmcameron@gmail.com>

Do not send aborts to logical devices that do not support aborts

Instead of relying on what the Smart Array claims about abort support
for logical drives, simply try an abort at device discovery time and see
how the device responds.  This way devices that do support aborts
(e.g. MSA2000) can work, and we do not waste time trying to send aborts
to logical drives that do not support them (important for high-IOPS
devices).

While rescanning devices, test whether a device supports aborts only
the first time we encounter it, rather than on every rescan.

Some Smart Arrays required aborts to be sent with the tag in the
wrong endian byte order.  To avoid having to know about this, we
would send two aborts, one with the tag in each byte order.  On
high-IOPS devices, this turns out to be not such a hot idea.  So we
now have a list of the devices that get the tag backwards, and we
only send the abort one way.
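
The byte-order fixup in question can be sketched as follows (modeled on
the driver's swizzle_abort_tag(); a sketch, not the verbatim kernel
code): each 4-byte half of the 8-byte abort tag has its bytes reversed,
and applying the swizzle twice restores the original tag.

```c
#include <assert.h>
#include <string.h>

static void swizzle_abort_tag(unsigned char *tag)
{
	unsigned char orig[8];

	/* Reverse the bytes within each 4-byte half of the tag. */
	memcpy(orig, tag, 8);
	tag[0] = orig[3]; tag[1] = orig[2]; tag[2] = orig[1]; tag[3] = orig[0];
	tag[4] = orig[7]; tag[5] = orig[6]; tag[6] = orig[5]; tag[7] = orig[4];
}
```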

If all available commands are outstanding and the abort handler
is invoked, the abort handler may not be able to allocate a command
and may busy-wait excessively.  Reserve a small number of commands
for the abort handler and limit the number of concurrent abort
requests to the number of reserved commands.
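The reservation scheme can be sketched in userspace C with C11 atomics (the names and pool size below are illustrative; the kernel code uses atomic_dec_if_positive() together with wait_event_timeout()): a slot is taken only while the counter is still positive, so abort requests beyond the reserved pool fail fast instead of consuming ordinary command slots.

```c
#include <assert.h>
#include <stdatomic.h>

/* Sketch of limiting concurrent aborts to a small reserved pool.
 * abort_cmds_available starts at the number of commands set aside
 * for the abort handler (e.g. HPSA_CMDS_RESERVED_FOR_ABORTS).
 */
static atomic_int abort_cmds_available;

/* Lock-free analogue of atomic_dec_if_positive(): decrement only
 * if the counter is currently positive.  Returns 1 on success.
 */
static int try_get_abort_slot(void)
{
	int v = atomic_load(&abort_cmds_available);

	while (v > 0)
		if (atomic_compare_exchange_weak(&abort_cmds_available,
						 &v, v - 1))
			return 1;	/* got a reserved slot */
	return 0;			/* pool exhausted: wait or fail */
}

static void put_abort_slot(void)
{
	atomic_fetch_add(&abort_cmds_available, 1);
}
```

In the driver the failing path waits on a wait queue with a timeout rather than returning immediately, and the release path wakes any waiters.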

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |  176 +++++++++++++++++++++++++++++++++++++++++----------
 drivers/scsi/hpsa.h |    4 +
 2 files changed, 147 insertions(+), 33 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 488f81b..fe860a6 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -428,7 +428,7 @@ static ssize_t host_show_hp_ssd_smart_path_status(struct device *dev,
 /* List of controllers which cannot be hard reset on kexec with reset_devices */
 static u32 unresettable_controller[] = {
 	0x324a103C, /* Smart Array P712m */
-	0x324b103C, /* SmartArray P711m */
+	0x324b103C, /* Smart Array P711m */
 	0x3223103C, /* Smart Array P800 */
 	0x3234103C, /* Smart Array P400 */
 	0x3235103C, /* Smart Array P400i */
@@ -470,24 +470,32 @@ static u32 soft_unresettable_controller[] = {
 	0x409D0E11, /* Smart Array 6400 EM */
 };
 
-static int ctlr_is_hard_resettable(u32 board_id)
+static u32 needs_abort_tags_swizzled[] = {
+	0x323D103C, /* Smart Array P700m */
+	0x324a103C, /* Smart Array P712m */
+	0x324b103C, /* Smart Array P711m */
+};
+
+static int board_id_in_array(u32 a[], int nelems, u32 board_id)
 {
 	int i;
 
-	for (i = 0; i < ARRAY_SIZE(unresettable_controller); i++)
-		if (unresettable_controller[i] == board_id)
-			return 0;
-	return 1;
+	for (i = 0; i < nelems; i++)
+		if (a[i] == board_id)
+			return 1;
+	return 0;
 }
 
-static int ctlr_is_soft_resettable(u32 board_id)
+static int ctlr_is_hard_resettable(u32 board_id)
 {
-	int i;
+	return !board_id_in_array(unresettable_controller,
+			ARRAY_SIZE(unresettable_controller), board_id);
+}
 
-	for (i = 0; i < ARRAY_SIZE(soft_unresettable_controller); i++)
-		if (soft_unresettable_controller[i] == board_id)
-			return 0;
-	return 1;
+static int ctlr_is_soft_resettable(u32 board_id)
+{
+	return !board_id_in_array(soft_unresettable_controller,
+			ARRAY_SIZE(soft_unresettable_controller), board_id);
 }
 
 static int ctlr_is_resettable(u32 board_id)
@@ -496,6 +504,12 @@ static int ctlr_is_resettable(u32 board_id)
 		ctlr_is_soft_resettable(board_id);
 }
 
+static int ctlr_needs_abort_tags_swizzled(u32 board_id)
+{
+	return board_id_in_array(needs_abort_tags_swizzled,
+			ARRAY_SIZE(needs_abort_tags_swizzled), board_id);
+}
+
 static ssize_t host_show_resettable(struct device *dev,
 	struct device_attribute *attr, char *buf)
 {
@@ -2838,6 +2852,50 @@ static int hpsa_volume_offline(struct ctlr_info *h,
 	return 0;
 }
 
+/*
+ * Find out if a logical device supports aborts by simply trying one.
+ * Smart Array may claim not to support aborts on logical drives, but
+ * if an MSA2000 is connected, the drives on it will be presented
+ * by the Smart Array as logical drives, and aborts may be sent to
+ * those devices successfully.  So the simplest way to find out is
+ * to simply try an abort and see how the device responds.
+ */
+static int hpsa_device_supports_aborts(struct ctlr_info *h,
+					unsigned char *scsi3addr)
+{
+	struct CommandList *c;
+	struct ErrorInfo *ei;
+	int rc = 0;
+
+	u64 tag = (u64) -1; /* bogus tag */
+
+	/* Assume that physical devices support aborts */
+	if (!is_logical_dev_addr_mode(scsi3addr))
+		return 1;
+
+	c = cmd_alloc(h);
+	if (!c)
+		return -ENOMEM;
+	(void) fill_cmd(c, HPSA_ABORT_MSG, h, &tag, 0, 0, scsi3addr, TYPE_MSG);
+	(void) hpsa_scsi_do_simple_cmd(h, c, DEFAULT_REPLY_QUEUE, NO_TIMEOUT);
+	/* no unmap needed here because no data xfer. */
+	ei = c->err_info;
+	switch (ei->CommandStatus) {
+	case CMD_INVALID:
+		rc = 0;
+		break;
+	case CMD_UNABORTABLE:
+	case CMD_ABORT_FAILED:
+		rc = 1;
+		break;
+	default:
+		rc = 0;
+		break;
+	}
+	cmd_free(h, c);
+	return rc;
+}
+
 static int hpsa_update_device_info(struct ctlr_info *h,
 	unsigned char scsi3addr[], struct hpsa_scsi_dev_t *this_device,
 	unsigned char *is_OBDR_device)
@@ -2904,7 +2962,6 @@ static int hpsa_update_device_info(struct ctlr_info *h,
 					strncmp(obdr_sig, OBDR_TAPE_SIG,
 						OBDR_SIG_LEN) == 0);
 	}
-
 	kfree(inq_buff);
 	return 0;
 
@@ -2913,6 +2970,31 @@ bail_out:
 	return 1;
 }
 
+static void hpsa_update_device_supports_aborts(struct ctlr_info *h,
+			struct hpsa_scsi_dev_t *dev, u8 *scsi3addr)
+{
+	unsigned long flags;
+	int rc, entry;
+	/*
+	 * See if this device supports aborts.  If we already know
+	 * the device, we already know if it supports aborts, otherwise
+	 * we have to find out if it supports aborts by trying one.
+	 */
+	spin_lock_irqsave(&h->devlock, flags);
+	rc = hpsa_scsi_find_entry(dev, h->dev, h->ndevices, &entry);
+	if ((rc == DEVICE_SAME || rc == DEVICE_UPDATED) &&
+		entry >= 0 && entry < h->ndevices) {
+		dev->supports_aborts = h->dev[entry]->supports_aborts;
+		spin_unlock_irqrestore(&h->devlock, flags);
+	} else {
+		spin_unlock_irqrestore(&h->devlock, flags);
+		dev->supports_aborts =
+				hpsa_device_supports_aborts(h, scsi3addr);
+		if (dev->supports_aborts < 0)
+			dev->supports_aborts = 0;
+	}
+}
+
 static unsigned char *ext_target_model[] = {
 	"MSA2012",
 	"MSA2024",
@@ -3018,6 +3100,7 @@ static int add_ext_target_dev(struct ctlr_info *h,
 	(*n_ext_target_devs)++;
 	hpsa_set_bus_target_lun(this_device,
 				tmpdevice->bus, tmpdevice->target, 0);
+	hpsa_update_device_supports_aborts(h, this_device, scsi3addr);
 	set_bit(tmpdevice->target, lunzerobits);
 	return 1;
 }
@@ -3272,6 +3355,7 @@ static void hpsa_update_scsi_devices(struct ctlr_info *h, int hostno)
 							&is_OBDR))
 			continue; /* skip it if we can't talk to it. */
 		figure_bus_target_lun(h, lunaddrbytes, tmpdevice);
+		hpsa_update_device_supports_aborts(h, tmpdevice, lunaddrbytes);
 		this_device = currentsd[ncurrent];
 
 		/*
@@ -4583,7 +4667,7 @@ static void hpsa_get_tag(struct ctlr_info *h,
 }
 
 static int hpsa_send_abort(struct ctlr_info *h, unsigned char *scsi3addr,
-	struct CommandList *abort, int swizzle, int reply_queue)
+	struct CommandList *abort, int reply_queue)
 {
 	int rc = IO_OK;
 	struct CommandList *c;
@@ -4597,9 +4681,9 @@ static int hpsa_send_abort(struct ctlr_info *h, unsigned char *scsi3addr,
 	}
 
 	/* fill_cmd can't fail here, no buffer to map */
-	(void) fill_cmd(c, HPSA_ABORT_MSG, h, abort,
+	(void) fill_cmd(c, HPSA_ABORT_MSG, h, &abort->Header.tag,
 		0, 0, scsi3addr, TYPE_MSG);
-	if (swizzle)
+	if (h->needs_abort_tags_swizzled)
 		swizzle_abort_tag(&c->Request.CDB[4]);
 	(void) hpsa_scsi_do_simple_cmd(h, c, reply_queue, NO_TIMEOUT);
 	hpsa_get_tag(h, abort, &taglower, &tagupper);
@@ -4705,12 +4789,6 @@ static int hpsa_send_reset_as_abort_ioaccel2(struct ctlr_info *h,
 	return rc; /* success */
 }
 
-/* Some Smart Arrays need the abort tag swizzled, and some don't.  It's hard to
- * tell which kind we're dealing with, so we send the abort both ways.  There
- * shouldn't be any collisions between swizzled and unswizzled tags due to the
- * way we construct our tags but we check anyway in case the assumptions which
- * make this true someday become false.
- */
 static int hpsa_send_abort_both_ways(struct ctlr_info *h,
 	unsigned char *scsi3addr, struct CommandList *abort, int reply_queue)
 {
@@ -4722,9 +4800,7 @@ static int hpsa_send_abort_both_ways(struct ctlr_info *h,
 	if (abort->cmd_type == CMD_IOACCEL2)
 		return hpsa_send_reset_as_abort_ioaccel2(h, scsi3addr,
 							abort, reply_queue);
-
-	return hpsa_send_abort(h, scsi3addr, abort, 0, reply_queue) &&
-			hpsa_send_abort(h, scsi3addr, abort, 1, reply_queue);
+	return hpsa_send_abort(h, scsi3addr, abort, reply_queue);
 }
 
 /* Find out which reply queue a command was meant to return on */
@@ -4736,6 +4812,18 @@ static int hpsa_extract_reply_queue(struct ctlr_info *h,
 	return c->Header.ReplyQueue;
 }
 
+/*
+ * Limit concurrency of abort commands to prevent
+ * over-subscription of commands
+ */
+static inline int wait_for_available_abort_cmd(struct ctlr_info *h)
+{
+#define ABORT_CMD_WAIT_MSECS 5000
+	return !wait_event_timeout(h->abort_cmd_wait_queue,
+			atomic_dec_if_positive(&h->abort_cmds_available) >= 0,
+			msecs_to_jiffies(ABORT_CMD_WAIT_MSECS));
+}
+
 /* Send an abort for the specified command.
  *	If the device and controller support it,
  *		send a task abort request.
@@ -4753,10 +4841,12 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
 	__le32 tagupper, taglower;
 	int refcount, reply_queue;
 
+	if (sc->device == NULL)
+		return FAILED;
+
 	/* Find the controller of the command to be aborted */
 	h = sdev_to_hba(sc->device);
-	if (WARN(h == NULL,
-			"ABORT REQUEST FAILED, Controller lookup failed.\n"))
+	if (h == NULL)
 		return FAILED;
 
 	/* If controller locked up, we can guarantee command won't complete */
@@ -4807,6 +4897,14 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
 		cmd_free(h, abort);
 		return SUCCESS;
 	}
+
+	/* Don't bother trying the abort if we know it won't work. */
+	if (abort->cmd_type != CMD_IOACCEL2 &&
+		abort->cmd_type != CMD_IOACCEL1 && !dev->supports_aborts) {
+		cmd_free(h, abort);
+		return FAILED;
+	}
+
 	hpsa_get_tag(h, abort, &taglower, &tagupper);
 	reply_queue = hpsa_extract_reply_queue(h, abort);
 	ml += sprintf(msg+ml, "Tag:0x%08x:%08x ", tagupper, taglower);
@@ -4823,7 +4921,15 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
 	 * by the firmware (but not to the scsi mid layer) but we can't
 	 * distinguish which.  Send the abort down.
 	 */
+	if (wait_for_available_abort_cmd(h)) {
+		dev_warn(&h->pdev->dev,
+			"Timed out waiting for an abort command to become available.\n");
+		cmd_free(h, abort);
+		return FAILED;
+	}
 	rc = hpsa_send_abort_both_ways(h, dev->scsi3addr, abort, reply_queue);
+	atomic_inc(&h->abort_cmds_available);
+	wake_up_all(&h->abort_cmd_wait_queue);
 	if (rc != 0) {
 		dev_warn(&h->pdev->dev, "scsi %d:%d:%d:%d %s\n",
 			h->scsi_host->host_no,
@@ -5395,7 +5501,7 @@ static int fill_cmd(struct CommandList *c, u8 cmd, struct ctlr_info *h,
 	int cmd_type)
 {
 	int pci_dir = XFER_NONE;
-	struct CommandList *a; /* for commands to be aborted */
+	u64 tag; /* for commands to be aborted */
 
 	c->cmd_type = CMD_IOCTL_PEND;
 	c->Header.ReplyQueue = 0;
@@ -5511,10 +5617,10 @@ static int fill_cmd(struct CommandList *c, u8 cmd, struct ctlr_info *h,
 			c->Request.CDB[7] = 0x00;
 			break;
 		case  HPSA_ABORT_MSG:
-			a = buff;       /* point to command to be aborted */
+			memcpy(&tag, buff, sizeof(tag));
 			dev_dbg(&h->pdev->dev,
-				"Abort Tag:0x%016llx request Tag:0x%016llx",
-				a->Header.tag, c->Header.tag);
+				"Abort Tag:0x%016llx using rqst Tag:0x%016llx",
+				tag, c->Header.tag);
 			c->Request.CDBLen = 16;
 			c->Request.type_attr_dir =
 					TYPE_ATTR_DIR(cmd_type,
@@ -5525,8 +5631,7 @@ static int fill_cmd(struct CommandList *c, u8 cmd, struct ctlr_info *h,
 			c->Request.CDB[2] = 0x00; /* reserved */
 			c->Request.CDB[3] = 0x00; /* reserved */
 			/* Tag to abort goes in CDB[4]-CDB[11] */
-			memcpy(&c->Request.CDB[4], &a->Header.tag,
-				sizeof(a->Header.tag));
+			memcpy(&c->Request.CDB[4], &tag, sizeof(tag));
 			c->Request.CDB[12] = 0x00; /* reserved */
 			c->Request.CDB[13] = 0x00; /* reserved */
 			c->Request.CDB[14] = 0x00; /* reserved */
@@ -6483,6 +6588,9 @@ static int hpsa_pci_init(struct ctlr_info *h)
 	h->product_name = products[prod_index].product_name;
 	h->access = *(products[prod_index].access);
 
+	h->needs_abort_tags_swizzled =
+		ctlr_needs_abort_tags_swizzled(h->board_id);
+
 	pci_disable_link_state(h->pdev, PCIE_LINK_STATE_L0S |
 			       PCIE_LINK_STATE_L1 | PCIE_LINK_STATE_CLKPM);
 
@@ -7097,6 +7205,7 @@ reinit_after_soft_reset:
 	spin_lock_init(&h->offline_device_lock);
 	spin_lock_init(&h->scan_lock);
 	atomic_set(&h->passthru_cmds_avail, HPSA_MAX_CONCURRENT_PASSTHRUS);
+	atomic_set(&h->abort_cmds_available, HPSA_CMDS_RESERVED_FOR_ABORTS);
 
 	h->rescan_ctlr_wq = hpsa_create_controller_wq(h, "rescan");
 	if (!h->rescan_ctlr_wq) {
@@ -7154,6 +7263,7 @@ reinit_after_soft_reset:
 	if (hpsa_allocate_sg_chain_blocks(h))
 		goto clean4;
 	init_waitqueue_head(&h->scan_wait_queue);
+	init_waitqueue_head(&h->abort_cmd_wait_queue);
 	h->scan_finished = 1; /* no scan currently in progress */
 
 	pci_set_drvdata(pdev, h);
diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
index 58f3315..df2468c 100644
--- a/drivers/scsi/hpsa.h
+++ b/drivers/scsi/hpsa.h
@@ -69,6 +69,7 @@ struct hpsa_scsi_dev_t {
 	 * devices in order to honor physical device queue depth limits.
 	 */
 	struct hpsa_scsi_dev_t *phys_disk[RAID_MAP_MAX_ENTRIES];
+	int supports_aborts;
 #define HPSA_DO_NOT_EXPOSE	0x0
 #define HPSA_SG_ATTACH		0x1
 #define HPSA_ULD_ATTACH		0x2
@@ -257,8 +258,11 @@ struct ctlr_info {
 	struct list_head offline_device_list;
 	int	acciopath_status;
 	int	raid_offload_debug;
+	int	needs_abort_tags_swizzled;
 	struct workqueue_struct *resubmit_wq;
 	struct workqueue_struct *rescan_ctlr_wq;
+	atomic_t abort_cmds_available;
+	wait_queue_head_t abort_cmd_wait_queue;
 };
 
 struct offline_device_entry {



* [PATCH v3 05/42] hpsa: decrement h->commands_outstanding in fail_all_outstanding_cmds
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (3 preceding siblings ...)
  2015-03-17 20:02 ` [PATCH v3 04/42] hpsa: clean up aborts Don Brace
@ 2015-03-17 20:02 ` Don Brace
  2015-04-02 13:33   ` Tomas Henzl
  2015-03-17 20:02 ` [PATCH v3 06/42] hpsa: hpsa decode sense data for io and tmf Don Brace
                   ` (36 subsequent siblings)
  41 siblings, 1 reply; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:02 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Stephen Cameron <stephenmcameron@gmail.com>

make tracking of outstanding commands more robust

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index fe860a6..3a06812 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -6945,7 +6945,7 @@ static void fail_all_outstanding_cmds(struct ctlr_info *h)
 		if (refcount > 1) {
 			c->err_info->CommandStatus = CMD_CTLR_LOCKUP;
 			finish_cmd(c);
-			failcount++;
+			atomic_dec(&h->commands_outstanding);
 		}
 		cmd_free(h, c);
 	}



* [PATCH v3 06/42] hpsa: hpsa decode sense data for io and tmf
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (4 preceding siblings ...)
  2015-03-17 20:02 ` [PATCH v3 05/42] hpsa: decrement h->commands_outstanding in fail_all_outstanding_cmds Don Brace
@ 2015-03-17 20:02 ` Don Brace
  2015-03-17 20:03 ` [PATCH v3 07/42] hpsa: allow lockup detected to be viewed via sysfs Don Brace
                   ` (35 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:02 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Stephen Cameron <stephenmcameron@gmail.com>

In HBA mode, we could get sense data in descriptor format, so
we need to handle that.

It's possible for CommandStatus to have value 0x0D
"TMF Function Status", which we should handle.  We will get
this from a P1224 when aborting a non-existent tag, for
example.  The "ScsiStatus" field of the error info will
contain the TMF function status value.
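A minimal sketch of the two sense-data layouts involved (this mirrors what the kernel's scsi_normalize_sense() handles; offsets follow SPC, where fixed format uses response codes 0x70/0x71 and descriptor format uses 0x72/0x73):

```c
#include <assert.h>
#include <stdint.h>

/* Extract sense key / ASC / ASCQ from either fixed-format or
 * descriptor-format sense data.  Returns 1 if the buffer was long
 * enough to decode, 0 otherwise.  (Simplified sketch: real code must
 * also honor the additional-sense-length field.)
 */
static int decode_sense(const uint8_t *s, int len,
			int *key, int *asc, int *ascq)
{
	uint8_t resp;

	if (len < 1)
		return 0;
	resp = s[0] & 0x7f;
	if (resp == 0x72 || resp == 0x73) {	/* descriptor format */
		if (len < 4)
			return 0;
		*key  = s[1] & 0x0f;
		*asc  = s[2];
		*ascq = s[3];
	} else {				/* fixed format */
		if (len < 14)
			return 0;
		*key  = s[2] & 0x0f;
		*asc  = s[12];
		*ascq = s[13];
	}
	return 1;
}
```

The old driver code read bytes 2, 12, and 13 unconditionally, which is only correct for fixed-format sense; that is exactly what breaks once HBA-mode devices return descriptor-format data.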

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c     |  143 +++++++++++++++++++++++++++++++++++------------
 drivers/scsi/hpsa_cmd.h |    9 +++
 2 files changed, 117 insertions(+), 35 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 3a06812..415ec4d 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -43,6 +43,7 @@
 #include <scsi/scsi_device.h>
 #include <scsi/scsi_host.h>
 #include <scsi/scsi_tcq.h>
+#include <scsi/scsi_eh.h>
 #include <linux/cciss_ioctl.h>
 #include <linux/string.h>
 #include <linux/bitmap.h>
@@ -268,16 +269,49 @@ static inline struct ctlr_info *shost_to_hba(struct Scsi_Host *sh)
 	return (struct ctlr_info *) *priv;
 }
 
+/* extract sense key, asc, and ascq from sense data.  -1 means invalid. */
+static void decode_sense_data(const u8 *sense_data, int sense_data_len,
+			int *sense_key, int *asc, int *ascq)
+{
+	struct scsi_sense_hdr sshdr;
+	bool rc;
+
+	*sense_key = -1;
+	*asc = -1;
+	*ascq = -1;
+
+	if (sense_data_len < 1)
+		return;
+
+	rc = scsi_normalize_sense(sense_data, sense_data_len, &sshdr);
+	if (rc) {
+		*sense_key = sshdr.sense_key;
+		*asc = sshdr.asc;
+		*ascq = sshdr.ascq;
+	}
+}
+
 static int check_for_unit_attention(struct ctlr_info *h,
 	struct CommandList *c)
 {
-	if (c->err_info->SenseInfo[2] != UNIT_ATTENTION)
+	int sense_key, asc, ascq;
+	int sense_len;
+
+	if (c->err_info->SenseLen > sizeof(c->err_info->SenseInfo))
+		sense_len = sizeof(c->err_info->SenseInfo);
+	else
+		sense_len = c->err_info->SenseLen;
+
+	decode_sense_data(c->err_info->SenseInfo, sense_len,
+				&sense_key, &asc, &ascq);
+	if (sense_key != UNIT_ATTENTION || asc == -1)
 		return 0;
 
-	switch (c->err_info->SenseInfo[12]) {
+	switch (asc) {
 	case STATE_CHANGED:
-		dev_warn(&h->pdev->dev, HPSA "%d: a state change "
-			"detected, command retried\n", h->ctlr);
+		dev_warn(&h->pdev->dev,
+			HPSA "%d: a state change detected, command retried\n",
+			h->ctlr);
 		break;
 	case LUN_FAILED:
 		dev_warn(&h->pdev->dev,
@@ -1890,6 +1924,34 @@ retry_cmd:
 	queue_work_on(raw_smp_processor_id(), h->resubmit_wq, &c->work);
 }
 
+/* Returns 0 on success, < 0 otherwise. */
+static int hpsa_evaluate_tmf_status(struct ctlr_info *h,
+					struct CommandList *cp)
+{
+	u8 tmf_status = cp->err_info->ScsiStatus;
+
+	switch (tmf_status) {
+	case CISS_TMF_COMPLETE:
+		/*
+		 * CISS_TMF_COMPLETE never happens, instead,
+		 * ei->CommandStatus == 0 for this case.
+		 */
+	case CISS_TMF_SUCCESS:
+		return 0;
+	case CISS_TMF_INVALID_FRAME:
+	case CISS_TMF_NOT_SUPPORTED:
+	case CISS_TMF_FAILED:
+	case CISS_TMF_WRONG_LUN:
+	case CISS_TMF_OVERLAPPED_TAG:
+		break;
+	default:
+		dev_warn(&h->pdev->dev, "Unknown TMF status: 0x%02x\n",
+				tmf_status);
+		break;
+	}
+	return -tmf_status;
+}
+
 static void complete_scsi_command(struct CommandList *cp)
 {
 	struct scsi_cmnd *cmd;
@@ -1897,9 +1959,9 @@ static void complete_scsi_command(struct CommandList *cp)
 	struct ErrorInfo *ei;
 	struct hpsa_scsi_dev_t *dev;
 
-	unsigned char sense_key;
-	unsigned char asc;      /* additional sense code */
-	unsigned char ascq;     /* additional sense code qualifier */
+	int sense_key;
+	int asc;      /* additional sense code */
+	int ascq;     /* additional sense code qualifier */
 	unsigned long sense_data_size;
 
 	ei = cp->err_info;
@@ -1934,8 +1996,6 @@ static void complete_scsi_command(struct CommandList *cp)
 	if (cp->cmd_type == CMD_IOACCEL2)
 		return process_ioaccel2_completion(h, cp, cmd, dev);
 
-	cmd->result |= ei->ScsiStatus;
-
 	scsi_set_resid(cmd, ei->ResidualCnt);
 	if (ei->CommandStatus == 0) {
 		if (cp->cmd_type == CMD_IOACCEL1)
@@ -1945,16 +2005,6 @@ static void complete_scsi_command(struct CommandList *cp)
 		return;
 	}
 
-	/* copy the sense data */
-	if (SCSI_SENSE_BUFFERSIZE < sizeof(ei->SenseInfo))
-		sense_data_size = SCSI_SENSE_BUFFERSIZE;
-	else
-		sense_data_size = sizeof(ei->SenseInfo);
-	if (ei->SenseLen < sense_data_size)
-		sense_data_size = ei->SenseLen;
-
-	memcpy(cmd->sense_buffer, ei->SenseInfo, sense_data_size);
-
 	/* For I/O accelerator commands, copy over some fields to the normal
 	 * CISS header used below for error handling.
 	 */
@@ -1986,14 +2036,18 @@ static void complete_scsi_command(struct CommandList *cp)
 	switch (ei->CommandStatus) {
 
 	case CMD_TARGET_STATUS:
-		if (ei->ScsiStatus) {
-			/* Get sense key */
-			sense_key = 0xf & ei->SenseInfo[2];
-			/* Get additional sense code */
-			asc = ei->SenseInfo[12];
-			/* Get addition sense code qualifier */
-			ascq = ei->SenseInfo[13];
-		}
+		cmd->result |= ei->ScsiStatus;
+		/* copy the sense data */
+		if (SCSI_SENSE_BUFFERSIZE < sizeof(ei->SenseInfo))
+			sense_data_size = SCSI_SENSE_BUFFERSIZE;
+		else
+			sense_data_size = sizeof(ei->SenseInfo);
+		if (ei->SenseLen < sense_data_size)
+			sense_data_size = ei->SenseLen;
+		memcpy(cmd->sense_buffer, ei->SenseInfo, sense_data_size);
+		if (ei->ScsiStatus)
+			decode_sense_data(ei->SenseInfo, sense_data_size,
+				&sense_key, &asc, &ascq);
 		if (ei->ScsiStatus == SAM_STAT_CHECK_CONDITION) {
 			if (sense_key == ABORTED_COMMAND) {
 				cmd->result |= DID_SOFT_ERROR << 16;
@@ -2088,6 +2142,10 @@ static void complete_scsi_command(struct CommandList *cp)
 		cmd->result = DID_ERROR << 16;
 		dev_warn(&h->pdev->dev, "Command unabortable\n");
 		break;
+	case CMD_TMF_STATUS:
+		if (hpsa_evaluate_tmf_status(h, cp)) /* TMF failed? */
+			cmd->result = DID_ERROR << 16;
+		break;
 	case CMD_IOACCEL_DISABLED:
 		/* This only handles the direct pass-through case since RAID
 		 * offload is handled above.  Just attempt a retry.
@@ -2238,16 +2296,22 @@ static void hpsa_scsi_interpret_error(struct ctlr_info *h,
 {
 	const struct ErrorInfo *ei = cp->err_info;
 	struct device *d = &cp->h->pdev->dev;
-	const u8 *sd = ei->SenseInfo;
+	int sense_key, asc, ascq, sense_len;
 
 	switch (ei->CommandStatus) {
 	case CMD_TARGET_STATUS:
+		if (ei->SenseLen > sizeof(ei->SenseInfo))
+			sense_len = sizeof(ei->SenseInfo);
+		else
+			sense_len = ei->SenseLen;
+		decode_sense_data(ei->SenseInfo, sense_len,
+					&sense_key, &asc, &ascq);
 		hpsa_print_cmd(h, "SCSI status", cp);
 		if (ei->ScsiStatus == SAM_STAT_CHECK_CONDITION)
-			dev_warn(d, "SCSI Status = 02, Sense key = %02x, ASC = %02x, ASCQ = %02x\n",
-				sd[2] & 0x0f, sd[12], sd[13]);
+			dev_warn(d, "SCSI Status = 02, Sense key = 0x%02x, ASC = 0x%02x, ASCQ = 0x%02x\n",
+				sense_key, asc, ascq);
 		else
-			dev_warn(d, "SCSI Status = %02x\n", ei->ScsiStatus);
+			dev_warn(d, "SCSI Status = 0x%02x\n", ei->ScsiStatus);
 		if (ei->ScsiStatus == 0)
 			dev_warn(d, "SCSI status is abnormally zero.  "
 			"(probably indicates selection timeout "
@@ -2792,7 +2856,8 @@ static int hpsa_volume_offline(struct ctlr_info *h,
 					unsigned char scsi3addr[])
 {
 	struct CommandList *c;
-	unsigned char *sense, sense_key, asc, ascq;
+	unsigned char *sense;
+	int sense_key, asc, ascq, sense_len;
 	int rc, ldstat = 0;
 	u16 cmd_status;
 	u8 scsi_status;
@@ -2810,9 +2875,11 @@ static int hpsa_volume_offline(struct ctlr_info *h,
 		return 0;
 	}
 	sense = c->err_info->SenseInfo;
-	sense_key = sense[2];
-	asc = sense[12];
-	ascq = sense[13];
+	if (c->err_info->SenseLen > sizeof(c->err_info->SenseInfo))
+		sense_len = sizeof(c->err_info->SenseInfo);
+	else
+		sense_len = c->err_info->SenseLen;
+	decode_sense_data(sense, sense_len, &sense_key, &asc, &ascq);
 	cmd_status = c->err_info->CommandStatus;
 	scsi_status = c->err_info->ScsiStatus;
 	cmd_free(h, c);
@@ -2888,6 +2955,9 @@ static int hpsa_device_supports_aborts(struct ctlr_info *h,
 	case CMD_ABORT_FAILED:
 		rc = 1;
 		break;
+	case CMD_TMF_STATUS:
+		rc = hpsa_evaluate_tmf_status(h, c);
+		break;
 	default:
 		rc = 0;
 		break;
@@ -4695,6 +4765,9 @@ static int hpsa_send_abort(struct ctlr_info *h, unsigned char *scsi3addr,
 	switch (ei->CommandStatus) {
 	case CMD_SUCCESS:
 		break;
+	case CMD_TMF_STATUS:
+		rc = hpsa_evaluate_tmf_status(h, c);
+		break;
 	case CMD_UNABORTABLE: /* Very common, don't make noise. */
 		rc = -1;
 		break;
diff --git a/drivers/scsi/hpsa_cmd.h b/drivers/scsi/hpsa_cmd.h
index f52c847..f6ca5fa 100644
--- a/drivers/scsi/hpsa_cmd.h
+++ b/drivers/scsi/hpsa_cmd.h
@@ -42,6 +42,7 @@
 #define CMD_UNSOLICITED_ABORT   0x000A
 #define CMD_TIMEOUT             0x000B
 #define CMD_UNABORTABLE		0x000C
+#define CMD_TMF_STATUS		0x000D
 #define CMD_IOACCEL_DISABLED	0x000E
 #define CMD_CTLR_LOCKUP		0xffff
 /* Note: CMD_CTLR_LOCKUP is not a value defined by the CISS spec
@@ -49,6 +50,14 @@
  * with when a controller lockup has been detected by the driver
  */
 
+/* TMF function status values */
+#define CISS_TMF_COMPLETE	0x00
+#define CISS_TMF_INVALID_FRAME	0x02
+#define CISS_TMF_NOT_SUPPORTED	0x04
+#define CISS_TMF_FAILED		0x05
+#define CISS_TMF_SUCCESS	0x08
+#define CISS_TMF_WRONG_LUN	0x09
+#define CISS_TMF_OVERLAPPED_TAG 0x0a
 
 /* Unit Attentions ASC's as defined for the MSA2012sa */
 #define POWER_OR_RESET			0x29



* [PATCH v3 07/42] hpsa: allow lockup detected to be viewed via sysfs
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (5 preceding siblings ...)
  2015-03-17 20:02 ` [PATCH v3 06/42] hpsa: hpsa decode sense data for io and tmf Don Brace
@ 2015-03-17 20:03 ` Don Brace
  2015-03-17 20:03 ` [PATCH v3 08/42] hpsa: make function names consistent Don Brace
                   ` (34 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:03 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Stephen Cameron <stephenmcameron@gmail.com>

expose a detected lockup via sysfs

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |   17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 415ec4d..f84d63f 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -351,6 +351,20 @@ static int check_for_busy(struct ctlr_info *h, struct CommandList *c)
 	return 1;
 }
 
+static u32 lockup_detected(struct ctlr_info *h);
+static ssize_t host_show_lockup_detected(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	int ld;
+	struct ctlr_info *h;
+	struct Scsi_Host *shost = class_to_shost(dev);
+
+	h = shost_to_hba(shost);
+	ld = lockup_detected(h);
+
+	return sprintf(buf, "ld=%d\n", ld);
+}
+
 static ssize_t host_store_hp_ssd_smart_path_status(struct device *dev,
 					 struct device_attribute *attr,
 					 const char *buf, size_t count)
@@ -698,12 +712,15 @@ static DEVICE_ATTR(transport_mode, S_IRUGO,
 	host_show_transport_mode, NULL);
 static DEVICE_ATTR(resettable, S_IRUGO,
 	host_show_resettable, NULL);
+static DEVICE_ATTR(lockup_detected, S_IRUGO,
+	host_show_lockup_detected, NULL);
 
 static struct device_attribute *hpsa_sdev_attrs[] = {
 	&dev_attr_raid_level,
 	&dev_attr_lunid,
 	&dev_attr_unique_id,
 	&dev_attr_hp_ssd_smart_path_enabled,
+	&dev_attr_lockup_detected,
 	NULL,
 };
 



* [PATCH v3 08/42] hpsa: make function names consistent
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (6 preceding siblings ...)
  2015-03-17 20:03 ` [PATCH v3 07/42] hpsa: allow lockup detected to be viewed via sysfs Don Brace
@ 2015-03-17 20:03 ` Don Brace
  2015-03-17 20:03 ` [PATCH v3 09/42] hpsa: factor out hpsa_init_cmd function Don Brace
                   ` (33 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:03 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Robert Elliott <elliott@hp.com>

make function names more consistent and meaningful

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |   14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index f84d63f..9a13525 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -6817,7 +6817,7 @@ out_disable:
 	return rc;
 }
 
-static int hpsa_allocate_cmd_pool(struct ctlr_info *h)
+static int hpsa_alloc_cmd_pool(struct ctlr_info *h)
 {
 	h->cmd_pool_bits = kzalloc(
 		DIV_ROUND_UP(h->nr_cmds, BITS_PER_LONG) *
@@ -7347,7 +7347,7 @@ reinit_after_soft_reset:
 	dev_info(&pdev->dev, "%s: <0x%x> at IRQ %d%s using DAC\n",
 	       h->devname, pdev->device,
 	       h->intr[h->intr_mode], dac ? "" : " not");
-	rc = hpsa_allocate_cmd_pool(h);
+	rc = hpsa_alloc_cmd_pool(h);
 	if (rc)
 		goto clean2_and_free_irqs;
 	if (hpsa_allocate_sg_chain_blocks(h))
@@ -7794,7 +7794,8 @@ static int hpsa_enter_performant_mode(struct ctlr_info *h, u32 trans_support)
 	return 0;
 }
 
-static int hpsa_alloc_ioaccel_cmd_and_bft(struct ctlr_info *h)
+/* Allocate ioaccel1 mode command blocks and block fetch table */
+static int hpsa_alloc_ioaccel1_cmd_and_bft(struct ctlr_info *h)
 {
 	h->ioaccel_maxsg =
 		readl(&(h->cfgtable->io_accel_max_embedded_sg_count));
@@ -7833,7 +7834,8 @@ clean_up:
 	return 1;
 }
 
-static int ioaccel2_alloc_cmds_and_bft(struct ctlr_info *h)
+/* Allocate ioaccel2 mode command blocks and block fetch table */
+static int hpsa_alloc_ioaccel2_cmd_and_bft(struct ctlr_info *h)
 {
 	/* Allocate ioaccel2 mode command blocks and block fetch table */
 
@@ -7888,13 +7890,13 @@ static void hpsa_put_ctlr_into_performant_mode(struct ctlr_info *h)
 	if (trans_support & CFGTBL_Trans_io_accel1) {
 		transMethod |= CFGTBL_Trans_io_accel1 |
 				CFGTBL_Trans_enable_directed_msix;
-		if (hpsa_alloc_ioaccel_cmd_and_bft(h))
+		if (hpsa_alloc_ioaccel1_cmd_and_bft(h))
 			goto clean_up;
 	} else {
 		if (trans_support & CFGTBL_Trans_io_accel2) {
 				transMethod |= CFGTBL_Trans_io_accel2 |
 				CFGTBL_Trans_enable_directed_msix;
-		if (ioaccel2_alloc_cmds_and_bft(h))
+		if (hpsa_alloc_ioaccel2_cmd_and_bft(h))
 			goto clean_up;
 		}
 	}



* [PATCH v3 09/42] hpsa: factor out hpsa_init_cmd function
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (7 preceding siblings ...)
  2015-03-17 20:03 ` [PATCH v3 08/42] hpsa: make function names consistent Don Brace
@ 2015-03-17 20:03 ` Don Brace
  2015-03-17 20:03 ` [PATCH v3 10/42] hpsa: do not ignore return value of hpsa_register_scsi Don Brace
                   ` (32 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:03 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Stephen Cameron <stephenmcameron@gmail.com>

Factor out hpsa_cmd_init from cmd_alloc().  We also need
this for resubmitting commands down the default RAID path
when they have returned from the ioaccel paths with errors.

In particular, reinitialize the cmd_type and busaddr fields as these
will not be correct for submitting down the RAID stack path
after ioaccel command completion.

This saves time when submitting commands.
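The split between one-time and per-resubmit initialization can be sketched like this (the struct and field names are illustrative, not the driver's actual layout): full init zeroes the whole command once at pool setup and derives the bus address from the command's index, while the partial variant clears only the fields an ioaccel completion may have dirtied.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative command block; the real struct CommandList is larger. */
struct cmd {
	uint64_t tag;
	uint32_t busaddr;
	uint8_t  cdb[16];
};

/* Full init: done once per command at pool-allocation time. */
static void cmd_init(struct cmd *c, int index, uint32_t pool_base)
{
	memset(c, 0, sizeof(*c));
	c->tag = (uint64_t)index << 1;	/* direct-lookup style tag */
	c->busaddr = pool_base + index * (uint32_t)sizeof(*c);
}

/* Partial reinit: only the fields a completed ioaccel command may
 * have changed, so the RAID-path resubmit skips the full memset.
 */
static void cmd_partial_init(struct cmd *c, int index, uint32_t pool_base)
{
	memset(c->cdb, 0, sizeof(c->cdb));
	c->busaddr = pool_base + index * (uint32_t)sizeof(*c);
}
```

The design choice is the usual hot-path one: pay the full cost once at setup, then touch only mutable state per I/O.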

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c     |   77 ++++++++++++++++++++++++++++++++---------------
 drivers/scsi/hpsa_cmd.h |    2 +
 2 files changed, 53 insertions(+), 26 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 9a13525..d57fa4b 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -4317,7 +4317,6 @@ static int hpsa_ciss_submit(struct ctlr_info *h,
 	/* Fill in the request block... */
 
 	c->Request.Timeout = 0;
-	memset(c->Request.CDB, 0, sizeof(c->Request.CDB));
 	BUG_ON(cmd->cmd_len > sizeof(c->Request.CDB));
 	c->Request.CDBLen = cmd->cmd_len;
 	memcpy(c->Request.CDB, cmd->cmnd, cmd->cmd_len);
@@ -4368,6 +4367,48 @@ static int hpsa_ciss_submit(struct ctlr_info *h,
 	return 0;
 }
 
+static void hpsa_cmd_init(struct ctlr_info *h, int index,
+				struct CommandList *c)
+{
+	dma_addr_t cmd_dma_handle, err_dma_handle;
+
+	/* Zero out all of commandlist except the last field, refcount */
+	memset(c, 0, offsetof(struct CommandList, refcount));
+	c->Header.tag = cpu_to_le64((u64) (index << DIRECT_LOOKUP_SHIFT));
+	cmd_dma_handle = h->cmd_pool_dhandle + index * sizeof(*c);
+	c->err_info = h->errinfo_pool + index;
+	memset(c->err_info, 0, sizeof(*c->err_info));
+	err_dma_handle = h->errinfo_pool_dhandle
+	    + index * sizeof(*c->err_info);
+	c->cmdindex = index;
+	c->busaddr = (u32) cmd_dma_handle;
+	c->ErrDesc.Addr = cpu_to_le64((u64) err_dma_handle);
+	c->ErrDesc.Len = cpu_to_le32((u32) sizeof(*c->err_info));
+	c->h = h;
+}
+
+static void hpsa_preinitialize_commands(struct ctlr_info *h)
+{
+	int i;
+
+	for (i = 0; i < h->nr_cmds; i++) {
+		struct CommandList *c = h->cmd_pool + i;
+
+		hpsa_cmd_init(h, i, c);
+		atomic_set(&c->refcount, 0);
+	}
+}
+
+static inline void hpsa_cmd_partial_init(struct ctlr_info *h, int index,
+				struct CommandList *c)
+{
+	dma_addr_t cmd_dma_handle = h->cmd_pool_dhandle + index * sizeof(*c);
+
+	memset(c->Request.CDB, 0, sizeof(c->Request.CDB));
+	memset(c->err_info, 0, sizeof(*c->err_info));
+	c->busaddr = (u32) cmd_dma_handle;
+}
+
 static void hpsa_command_resubmit_worker(struct work_struct *work)
 {
 	struct scsi_cmnd *cmd;
@@ -4382,6 +4423,7 @@ static void hpsa_command_resubmit_worker(struct work_struct *work)
 		cmd->scsi_done(cmd);
 		return;
 	}
+	hpsa_cmd_partial_init(c->h, c->cmdindex, c);
 	if (hpsa_ciss_submit(c->h, c, cmd, dev->scsi3addr)) {
 		/*
 		 * If we get here, it means dma mapping failed. Try
@@ -4438,10 +4480,11 @@ static int hpsa_scsi_queue_command(struct Scsi_Host *sh, struct scsi_cmnd *cmd)
 		h->acciopath_status)) {
 
 		cmd->host_scribble = (unsigned char *) c;
-		c->cmd_type = CMD_SCSI;
-		c->scsi_cmd = cmd;
 
 		if (dev->offload_enabled) {
+			hpsa_cmd_init(h, c->cmdindex, c);
+			c->cmd_type = CMD_SCSI;
+			c->scsi_cmd = cmd;
 			rc = hpsa_scsi_ioaccel_raid_map(h, c);
 			if (rc == 0)
 				return 0; /* Sent on ioaccel path */
@@ -4450,6 +4493,9 @@ static int hpsa_scsi_queue_command(struct Scsi_Host *sh, struct scsi_cmnd *cmd)
 				return SCSI_MLQUEUE_HOST_BUSY;
 			}
 		} else if (dev->ioaccel_handle) {
+			hpsa_cmd_init(h, c->cmdindex, c);
+			c->cmd_type = CMD_SCSI;
+			c->scsi_cmd = cmd;
 			rc = hpsa_scsi_ioaccel_direct_map(h, c);
 			if (rc == 0)
 				return 0; /* Sent on direct map path */
@@ -5061,10 +5107,7 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
 static struct CommandList *cmd_alloc(struct ctlr_info *h)
 {
 	struct CommandList *c;
-	int i;
-	union u64bit temp64;
-	dma_addr_t cmd_dma_handle, err_dma_handle;
-	int refcount;
+	int refcount, i;
 	unsigned long offset;
 
 	/*
@@ -5098,24 +5141,7 @@ static struct CommandList *cmd_alloc(struct ctlr_info *h)
 		break; /* it's ours now. */
 	}
 	h->last_allocation = i; /* benignly racy */
-
-	/* Zero out all of commandlist except the last field, refcount */
-	memset(c, 0, offsetof(struct CommandList, refcount));
-	c->Header.tag = cpu_to_le64((u64) (i << DIRECT_LOOKUP_SHIFT));
-	cmd_dma_handle = h->cmd_pool_dhandle + i * sizeof(*c);
-	c->err_info = h->errinfo_pool + i;
-	memset(c->err_info, 0, sizeof(*c->err_info));
-	err_dma_handle = h->errinfo_pool_dhandle
-	    + i * sizeof(*c->err_info);
-
-	c->cmdindex = i;
-
-	c->busaddr = (u32) cmd_dma_handle;
-	temp64.val = (u64) err_dma_handle;
-	c->ErrDesc.Addr = cpu_to_le64((u64) err_dma_handle);
-	c->ErrDesc.Len = cpu_to_le32((u32) sizeof(*c->err_info));
-
-	c->h = h;
+	hpsa_cmd_partial_init(h, i, c);
 	return c;
 }
 
@@ -6834,6 +6860,7 @@ static int hpsa_alloc_cmd_pool(struct ctlr_info *h)
 		dev_err(&h->pdev->dev, "out of memory in %s", __func__);
 		goto clean_up;
 	}
+	hpsa_preinitialize_commands(h);
 	return 0;
 clean_up:
 	hpsa_free_cmd_pool(h);
diff --git a/drivers/scsi/hpsa_cmd.h b/drivers/scsi/hpsa_cmd.h
index f6ca5fa..0efb6f2b 100644
--- a/drivers/scsi/hpsa_cmd.h
+++ b/drivers/scsi/hpsa_cmd.h
@@ -438,7 +438,7 @@ struct CommandList {
 	 * not used.
 	 */
 	struct hpsa_scsi_dev_t *phys_disk;
-	atomic_t refcount; /* Must be last to avoid memset in cmd_alloc */
+	atomic_t refcount; /* Must be last to avoid memset in hpsa_cmd_init() */
 } __aligned(COMMANDLIST_ALIGNMENT);
 
 /* Max S/G elements in I/O accelerator command */



* [PATCH v3 10/42] hpsa: do not ignore return value of hpsa_register_scsi
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (8 preceding siblings ...)
  2015-03-17 20:03 ` [PATCH v3 09/42] hpsa: factor out hpsa_init_cmd function Don Brace
@ 2015-03-17 20:03 ` Don Brace
  2015-03-17 20:03 ` [PATCH v3 11/42] hpsa: try resubmitting down raid path on task set full Don Brace
                   ` (31 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:03 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Stephen Cameron <stephenmcameron@gmail.com>

Add error handling for failure when registering with the SCSI subsystem.

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index d57fa4b..15bf3bcb 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -7453,7 +7453,9 @@ reinit_after_soft_reset:
 	h->access.set_intr_mask(h, HPSA_INTR_ON);
 
 	hpsa_hba_inquiry(h);
-	hpsa_register_scsi(h);	/* hook ourselves into SCSI subsystem */
+	rc = hpsa_register_scsi(h); /* hook ourselves into SCSI subsystem */
+	if (rc)
+		goto clean4;
 
 	/* Monitor the controller for firmware lockups */
 	h->heartbeat_sample_interval = HEARTBEAT_SAMPLE_INTERVAL;



* [PATCH v3 11/42] hpsa: try resubmitting down raid path on task set full
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (9 preceding siblings ...)
  2015-03-17 20:03 ` [PATCH v3 10/42] hpsa: do not ignore return value of hpsa_register_scsi Don Brace
@ 2015-03-17 20:03 ` Don Brace
  2015-03-17 20:03 ` [PATCH v3 12/42] hpsa: factor out hpsa_ioaccel_submit function Don Brace
                   ` (30 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:03 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Stephen Cameron <stephenmcameron@gmail.com>

Allow the controller firmware to queue up commands when the ioaccel
device queue is full.

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 15bf3bcb..76865df 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -1855,8 +1855,7 @@ static int handle_ioaccel_mode2_error(struct ctlr_info *h,
 			retry = 1;
 			break;
 		case IOACCEL2_STATUS_SR_TASK_COMP_SET_FULL:
-			/* Make scsi midlayer do unlimited retries */
-			cmd->result = DID_IMM_RETRY << 16;
+			retry = 1;
 			break;
 		case IOACCEL2_STATUS_SR_TASK_COMP_ABORTED:
 			dev_warn(&h->pdev->dev,



* [PATCH v3 12/42] hpsa: factor out hpsa_ioaccel_submit function
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (10 preceding siblings ...)
  2015-03-17 20:03 ` [PATCH v3 11/42] hpsa: try resubmitting down raid path on task set full Don Brace
@ 2015-03-17 20:03 ` Don Brace
  2015-03-17 20:03 ` [PATCH v3 13/42] hpsa: print accurate SSD Smart Path Enabled status Don Brace
                   ` (29 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:03 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Webb Scales <webbnh@hp.com>

Clean up command submission.

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Webb Scales <webbnh@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |   91 +++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 66 insertions(+), 25 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 76865df..813f3c8 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -4408,6 +4408,33 @@ static inline void hpsa_cmd_partial_init(struct ctlr_info *h, int index,
 	c->busaddr = (u32) cmd_dma_handle;
 }
 
+static int hpsa_ioaccel_submit(struct ctlr_info *h,
+		struct CommandList *c, struct scsi_cmnd *cmd,
+		unsigned char *scsi3addr)
+{
+	struct hpsa_scsi_dev_t *dev = cmd->device->hostdata;
+	int rc = IO_ACCEL_INELIGIBLE;
+
+	cmd->host_scribble = (unsigned char *) c;
+
+	if (dev->offload_enabled) {
+		hpsa_cmd_init(h, c->cmdindex, c);
+		c->cmd_type = CMD_SCSI;
+		c->scsi_cmd = cmd;
+		rc = hpsa_scsi_ioaccel_raid_map(h, c);
+		if (rc < 0)     /* scsi_dma_map failed. */
+			rc = SCSI_MLQUEUE_HOST_BUSY;
+	} else if (dev->ioaccel_handle) {
+		hpsa_cmd_init(h, c->cmdindex, c);
+		c->cmd_type = CMD_SCSI;
+		c->scsi_cmd = cmd;
+		rc = hpsa_scsi_ioaccel_direct_map(h, c);
+		if (rc < 0)     /* scsi_dma_map failed. */
+			rc = SCSI_MLQUEUE_HOST_BUSY;
+	}
+	return rc;
+}
+
 static void hpsa_command_resubmit_worker(struct work_struct *work)
 {
 	struct scsi_cmnd *cmd;
@@ -4419,15 +4446,46 @@ static void hpsa_command_resubmit_worker(struct work_struct *work)
 	dev = cmd->device->hostdata;
 	if (!dev) {
 		cmd->result = DID_NO_CONNECT << 16;
+		cmd_free(c->h, c);
 		cmd->scsi_done(cmd);
 		return;
 	}
+	if (c->cmd_type == CMD_IOACCEL2) {
+		struct ctlr_info *h = c->h;
+		struct io_accel2_cmd *c2 = &h->ioaccel2_cmd_pool[c->cmdindex];
+		int rc;
+
+		if (c2->error_data.serv_response ==
+				IOACCEL2_STATUS_SR_TASK_COMP_SET_FULL) {
+			rc = hpsa_ioaccel_submit(h, c, cmd, dev->scsi3addr);
+			if (rc == 0)
+				return;
+			if (rc == SCSI_MLQUEUE_HOST_BUSY) {
+				/*
+				 * If we get here, it means dma mapping failed.
+				 * Try again via scsi mid layer, which will
+				 * then get SCSI_MLQUEUE_HOST_BUSY.
+				 */
+				cmd->result = DID_IMM_RETRY << 16;
+				cmd->scsi_done(cmd);
+				cmd_free(h, c);	/* FIX-ME:  on merge, change
+						 * to cmd_tagged_free() and
+						 * ultimately to
+						 * hpsa_cmd_free_and_done(). */
+				return;
+			}
+			/* else, fall thru and resubmit down CISS path */
+		}
+	}
 	hpsa_cmd_partial_init(c->h, c->cmdindex, c);
 	if (hpsa_ciss_submit(c->h, c, cmd, dev->scsi3addr)) {
 		/*
 		 * If we get here, it means dma mapping failed. Try
 		 * again via scsi mid layer, which will then get
 		 * SCSI_MLQUEUE_HOST_BUSY.
+		 *
+		 * hpsa_ciss_submit will have already freed c
+		 * if it encountered a dma mapping failure.
 		 */
 		cmd->result = DID_IMM_RETRY << 16;
 		cmd->scsi_done(cmd);
@@ -4477,31 +4535,14 @@ static int hpsa_scsi_queue_command(struct Scsi_Host *sh, struct scsi_cmnd *cmd)
 	if (likely(cmd->retries == 0 &&
 		cmd->request->cmd_type == REQ_TYPE_FS &&
 		h->acciopath_status)) {
-
-		cmd->host_scribble = (unsigned char *) c;
-
-		if (dev->offload_enabled) {
-			hpsa_cmd_init(h, c->cmdindex, c);
-			c->cmd_type = CMD_SCSI;
-			c->scsi_cmd = cmd;
-			rc = hpsa_scsi_ioaccel_raid_map(h, c);
-			if (rc == 0)
-				return 0; /* Sent on ioaccel path */
-			if (rc < 0) {   /* scsi_dma_map failed. */
-				cmd_free(h, c);
-				return SCSI_MLQUEUE_HOST_BUSY;
-			}
-		} else if (dev->ioaccel_handle) {
-			hpsa_cmd_init(h, c->cmdindex, c);
-			c->cmd_type = CMD_SCSI;
-			c->scsi_cmd = cmd;
-			rc = hpsa_scsi_ioaccel_direct_map(h, c);
-			if (rc == 0)
-				return 0; /* Sent on direct map path */
-			if (rc < 0) {   /* scsi_dma_map failed. */
-				cmd_free(h, c);
-				return SCSI_MLQUEUE_HOST_BUSY;
-			}
+		rc = hpsa_ioaccel_submit(h, c, cmd, scsi3addr);
+		if (rc == 0)
+			return 0;
+		if (rc == SCSI_MLQUEUE_HOST_BUSY) {
+			cmd_free(h, c);	/* FIX-ME:  on merge, change to
+					 * cmd_tagged_free(), and ultimately
+					 * to hpsa_cmd_resolve_and_free(). */
+			return SCSI_MLQUEUE_HOST_BUSY;
 		}
 	}
 	return hpsa_ciss_submit(h, c, cmd, scsi3addr);



* [PATCH v3 13/42] hpsa: print accurate SSD Smart Path Enabled status
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (11 preceding siblings ...)
  2015-03-17 20:03 ` [PATCH v3 12/42] hpsa: factor out hpsa_ioaccel_submit function Don Brace
@ 2015-03-17 20:03 ` Don Brace
  2015-03-17 20:03 ` [PATCH v3 14/42] hpsa: use ioaccel2 path to submit IOs to physical drives in HBA mode Don Brace
                   ` (28 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:03 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Robert Elliott <elliott@hp.com>

The offload_enabled changes are deferred until after the added/updated
messages are printed, so the printed values are incorrect.

Defer printing the SSD Smart Path Enabled status information until the
information is correct.

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |   12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 813f3c8..e9d3d71 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -1090,12 +1090,12 @@ lun_assigned:
 
 	h->dev[n] = device;
 	h->ndevices++;
-	device->offload_to_be_enabled = device->offload_enabled;
-	device->offload_enabled = 0;
 	added[*nadded] = device;
 	(*nadded)++;
 	hpsa_show_dev_msg(HPSA_INFO, h, device,
 		device->expose_state & HPSA_SCSI_ADD ? "added" : "masked");
+	device->offload_to_be_enabled = device->offload_enabled;
+	device->offload_enabled = 0;
 	return 0;
 }
 
@@ -1103,6 +1103,7 @@ lun_assigned:
 static void hpsa_scsi_update_entry(struct ctlr_info *h, int hostno,
 	int entry, struct hpsa_scsi_dev_t *new_entry)
 {
+	int offload_enabled;
 	/* assumes h->devlock is held */
 	BUG_ON(entry < 0 || entry >= HPSA_MAX_DEVICES);
 
@@ -1135,7 +1136,10 @@ static void hpsa_scsi_update_entry(struct ctlr_info *h, int hostno,
 	if (!new_entry->offload_enabled)
 		h->dev[entry]->offload_enabled = 0;
 
+	offload_enabled = h->dev[entry]->offload_enabled;
+	h->dev[entry]->offload_enabled = h->dev[entry]->offload_to_be_enabled;
 	hpsa_show_dev_msg(HPSA_INFO, h, h->dev[entry], "updated");
+	h->dev[entry]->offload_enabled = offload_enabled;
 }
 
 /* Replace an entry from h->dev[] array. */
@@ -1158,12 +1162,12 @@ static void hpsa_scsi_replace_entry(struct ctlr_info *h, int hostno,
 		new_entry->lun = h->dev[entry]->lun;
 	}
 
-	new_entry->offload_to_be_enabled = new_entry->offload_enabled;
-	new_entry->offload_enabled = 0;
 	h->dev[entry] = new_entry;
 	added[*nadded] = new_entry;
 	(*nadded)++;
 	hpsa_show_dev_msg(HPSA_INFO, h, new_entry, "replaced");
+	new_entry->offload_to_be_enabled = new_entry->offload_enabled;
+	new_entry->offload_enabled = 0;
 }
 
 /* Remove an entry from h->dev[] array. */



* [PATCH v3 14/42] hpsa: use ioaccel2 path to submit IOs to physical drives in HBA mode.
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (12 preceding siblings ...)
  2015-03-17 20:03 ` [PATCH v3 13/42] hpsa: print accurate SSD Smart Path Enabled status Don Brace
@ 2015-03-17 20:03 ` Don Brace
  2015-03-17 20:03 ` [PATCH v3 15/42] hpsa: Get queue depth from identify physical bmic for physical disks Don Brace
                   ` (27 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:03 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Joe Handzik <joseph.t.handzik@hp.com>

Use the ioaccel2 path to submit I/O to physical drives in HBA mode.

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Joe Handzik <joseph.t.handzik@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |   10 +++++++++-
 drivers/scsi/hpsa.h |    1 +
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index e9d3d71..76f9042 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -1123,6 +1123,11 @@ static void hpsa_scsi_update_entry(struct ctlr_info *h, int hostno,
 		h->dev[entry]->raid_map = new_entry->raid_map;
 		h->dev[entry]->ioaccel_handle = new_entry->ioaccel_handle;
 	}
+	if (new_entry->hba_ioaccel_enabled) {
+		h->dev[entry]->ioaccel_handle = new_entry->ioaccel_handle;
+		wmb(); /* set ioaccel_handle *before* hba_ioaccel_enabled */
+	}
+	h->dev[entry]->hba_ioaccel_enabled = new_entry->hba_ioaccel_enabled;
 	h->dev[entry]->offload_config = new_entry->offload_config;
 	h->dev[entry]->offload_to_mirror = new_entry->offload_to_mirror;
 	h->dev[entry]->queue_depth = new_entry->queue_depth;
@@ -3039,6 +3044,7 @@ static int hpsa_update_device_info(struct ctlr_info *h,
 		this_device->offload_config = 0;
 		this_device->offload_enabled = 0;
 		this_device->offload_to_be_enabled = 0;
+		this_device->hba_ioaccel_enabled = 0;
 		this_device->volume_offline = 0;
 		this_device->queue_depth = h->nr_cmds;
 	}
@@ -3327,6 +3333,8 @@ static void hpsa_get_ioaccel_drive_info(struct ctlr_info *h,
 		(struct ext_report_lun_entry *) lunaddrbytes;
 
 	dev->ioaccel_handle = rle->ioaccel_handle;
+	if (PHYS_IOACCEL(lunaddrbytes) && dev->ioaccel_handle)
+		dev->hba_ioaccel_enabled = 1;
 	memset(id_phys, 0, sizeof(*id_phys));
 	rc = hpsa_bmic_id_physical_device(h, lunaddrbytes,
 			GET_BMIC_DRIVE_NUMBER(lunaddrbytes), id_phys,
@@ -4428,7 +4436,7 @@ static int hpsa_ioaccel_submit(struct ctlr_info *h,
 		rc = hpsa_scsi_ioaccel_raid_map(h, c);
 		if (rc < 0)     /* scsi_dma_map failed. */
 			rc = SCSI_MLQUEUE_HOST_BUSY;
-	} else if (dev->ioaccel_handle) {
+	} else if (dev->hba_ioaccel_enabled) {
 		hpsa_cmd_init(h, c->cmdindex, c);
 		c->cmd_type = CMD_SCSI;
 		c->scsi_cmd = cmd;
diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
index df2468c..87a70b5 100644
--- a/drivers/scsi/hpsa.h
+++ b/drivers/scsi/hpsa.h
@@ -55,6 +55,7 @@ struct hpsa_scsi_dev_t {
 	int offload_config;		/* I/O accel RAID offload configured */
 	int offload_enabled;		/* I/O accel RAID offload enabled */
 	int offload_to_be_enabled;
+	int hba_ioaccel_enabled;
 	int offload_to_mirror;		/* Send next I/O accelerator RAID
 					 * offload request to mirror drive
 					 */



* [PATCH v3 15/42] hpsa: Get queue depth from identify physical bmic for physical disks.
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (13 preceding siblings ...)
  2015-03-17 20:03 ` [PATCH v3 14/42] hpsa: use ioaccel2 path to submit IOs to physical drives in HBA mode Don Brace
@ 2015-03-17 20:03 ` Don Brace
  2015-03-17 20:03 ` [PATCH v3 16/42] hpsa: break hpsa_free_irqs_and_disable_msix into two functions Don Brace
                   ` (26 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:03 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Joe Handzik <joseph.t.handzik@hp.com>

Get the drive queue depth to help avoid task set full conditions.

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Joe Handzik <joseph.t.handzik@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |   33 +++++++++++++--------------------
 1 file changed, 13 insertions(+), 20 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 76f9042..3fa72a1 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -3497,29 +3497,22 @@ static void hpsa_update_scsi_devices(struct ctlr_info *h, int hostno)
 				ncurrent++;
 			break;
 		case TYPE_DISK:
-			if (h->hba_mode_enabled) {
-				/* never use raid mapper in HBA mode */
-				this_device->offload_enabled = 0;
-				ncurrent++;
-				break;
-			} else if (h->acciopath_status) {
-				if (i >= nphysicals) {
-					ncurrent++;
-					break;
-				}
-			} else {
-				if (i < nphysicals)
-					break;
+			if (i >= nphysicals) {
 				ncurrent++;
 				break;
 			}
-			if (h->transMethod & CFGTBL_Trans_io_accel1 ||
-				h->transMethod & CFGTBL_Trans_io_accel2) {
-				hpsa_get_ioaccel_drive_info(h, this_device,
-							lunaddrbytes, id_phys);
-				atomic_set(&this_device->ioaccel_cmds_out, 0);
-				ncurrent++;
-			}
+
+			if (h->hba_mode_enabled)
+				/* never use raid mapper in HBA mode */
+				this_device->offload_enabled = 0;
+			else if (!(h->transMethod & CFGTBL_Trans_io_accel1 ||
+				h->transMethod & CFGTBL_Trans_io_accel2))
+				break;
+
+			hpsa_get_ioaccel_drive_info(h, this_device,
+						lunaddrbytes, id_phys);
+			atomic_set(&this_device->ioaccel_cmds_out, 0);
+			ncurrent++;
 			break;
 		case TYPE_TAPE:
 		case TYPE_MEDIUM_CHANGER:



* [PATCH v3 16/42] hpsa: break hpsa_free_irqs_and_disable_msix into two functions
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (14 preceding siblings ...)
  2015-03-17 20:03 ` [PATCH v3 15/42] hpsa: Get queue depth from identify physical bmic for physical disks Don Brace
@ 2015-03-17 20:03 ` Don Brace
  2015-03-17 20:03 ` [PATCH v3 17/42] hpsa: clean up error handling Don Brace
                   ` (25 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:03 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Robert Elliott <elliott@hp.com>

Replace calls to hpsa_free_irqs_and_disable_msix with calls to
hpsa_free_irqs and hpsa_disable_interrupt_mode.

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |   35 ++++++++++++++++++-----------------
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 3fa72a1..c98c591 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -6402,10 +6402,20 @@ static int find_PCI_BAR_index(struct pci_dev *pdev, unsigned long pci_bar_addr)
 	return -1;
 }
 
+static void hpsa_disable_interrupt_mode(struct ctlr_info *h)
+{
+	if (h->msix_vector) {
+		if (h->pdev->msix_enabled)
+			pci_disable_msix(h->pdev);
+	} else if (h->msi_vector) {
+		if (h->pdev->msi_enabled)
+			pci_disable_msi(h->pdev);
+	}
+}
+
 /* If MSI/MSI-X is supported by the kernel we will try to enable it on
  * controllers that are capable. If not, we use legacy INTx mode.
  */
-
 static void hpsa_interrupt_mode(struct ctlr_info *h)
 {
 #ifdef CONFIG_PCI_MSI
@@ -7046,20 +7056,6 @@ static int hpsa_kdump_soft_reset(struct ctlr_info *h)
 	return 0;
 }
 
-static void hpsa_free_irqs_and_disable_msix(struct ctlr_info *h)
-{
-	hpsa_free_irqs(h);
-#ifdef CONFIG_PCI_MSI
-	if (h->msix_vector) {
-		if (h->pdev->msix_enabled)
-			pci_disable_msix(h->pdev);
-	} else if (h->msi_vector) {
-		if (h->pdev->msi_enabled)
-			pci_disable_msi(h->pdev);
-	}
-#endif /* CONFIG_PCI_MSI */
-}
-
 static void hpsa_free_reply_queues(struct ctlr_info *h)
 {
 	int i;
@@ -7076,7 +7072,7 @@ static void hpsa_free_reply_queues(struct ctlr_info *h)
 
 static void hpsa_undo_allocations_after_kdump_soft_reset(struct ctlr_info *h)
 {
-	hpsa_free_irqs_and_disable_msix(h);
+	hpsa_free_irqs(h);
 	hpsa_free_sg_chain_blocks(h);
 	hpsa_free_cmd_pool(h);
 	kfree(h->ioaccel1_blockFetchTable);
@@ -7088,6 +7084,7 @@ static void hpsa_undo_allocations_after_kdump_soft_reset(struct ctlr_info *h)
 		iounmap(h->transtable);
 	if (h->cfgtable)
 		iounmap(h->cfgtable);
+	hpsa_disable_interrupt_mode(h);
 	pci_disable_device(h->pdev);
 	pci_release_regions(h->pdev);
 	kfree(h);
@@ -7576,7 +7573,8 @@ static void hpsa_shutdown(struct pci_dev *pdev)
 	 */
 	hpsa_flush_cache(h);
 	h->access.set_intr_mask(h, HPSA_INTR_OFF);
-	hpsa_free_irqs_and_disable_msix(h);
+	hpsa_free_irqs(h);
+	hpsa_disable_interrupt_mode(h);		/* pci_init 2 */
 }
 
 static void hpsa_free_device_info(struct ctlr_info *h)
@@ -7607,7 +7605,10 @@ static void hpsa_remove_one(struct pci_dev *pdev)
 	destroy_workqueue(h->rescan_ctlr_wq);
 	destroy_workqueue(h->resubmit_wq);
 	hpsa_unregister_scsi(h);	/* unhook from SCSI subsystem */
+
+	/* includes hpsa_free_irqs and hpsa_disable_interrupt_mode */
 	hpsa_shutdown(pdev);
+
 	iounmap(h->vaddr);
 	iounmap(h->transtable);
 	iounmap(h->cfgtable);



* [PATCH v3 17/42] hpsa: clean up error handling
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (15 preceding siblings ...)
  2015-03-17 20:03 ` [PATCH v3 16/42] hpsa: break hpsa_free_irqs_and_disable_msix into two functions Don Brace
@ 2015-03-17 20:03 ` Don Brace
  2015-03-17 20:04 ` [PATCH v3 18/42] hpsa: refactor freeing of resources into more logical functions Don Brace
                   ` (24 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:03 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Robert Elliott <elliott@hp.com>

Refactor error cleanup and shutdown:
 - disable interrupts and call pci_disable_device on critical failures
 - add an hpsa_free_cfgtables function

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |   92 ++++++++++++++++++++++++++++++++-------------------
 1 file changed, 58 insertions(+), 34 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index c98c591..3559425a4 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -6546,6 +6546,17 @@ static int hpsa_find_cfg_addrs(struct pci_dev *pdev, void __iomem *vaddr,
 	return 0;
 }
 
+static void hpsa_free_cfgtables(struct ctlr_info *h)
+{
+	if (h->transtable)
+		iounmap(h->transtable);
+	if (h->cfgtable)
+		iounmap(h->cfgtable);
+}
+
+/* Find and map CISS config table and transfer table
+ * several items must be unmapped (freed) later
+ */
 static int hpsa_find_cfgtables(struct ctlr_info *h)
 {
 	u64 cfg_offset;
@@ -6572,8 +6583,11 @@ static int hpsa_find_cfgtables(struct ctlr_info *h)
 	h->transtable = remap_pci_mem(pci_resource_start(h->pdev,
 				cfg_base_addr_index)+cfg_offset+trans_offset,
 				sizeof(*h->transtable));
-	if (!h->transtable)
+	if (!h->transtable) {
+		dev_err(&h->pdev->dev, "Failed mapping transfer table\n");
+		hpsa_free_cfgtables(h);
 		return -ENOMEM;
+	}
 	return 0;
 }
 
@@ -6749,6 +6763,17 @@ error:
 	return -ENODEV;
 }
 
+/* free items allocated or mapped by hpsa_pci_init */
+static void hpsa_free_pci_init(struct ctlr_info *h)
+{
+	hpsa_free_cfgtables(h);			/* pci_init 4 */
+	iounmap(h->vaddr);			/* pci_init 3 */
+	hpsa_disable_interrupt_mode(h);		/* pci_init 2 */
+	pci_release_regions(h->pdev);		/* pci_init 2 */
+	pci_disable_device(h->pdev);		/* pci_init 1 */
+}
+
+/* several items must be freed later */
 static int hpsa_pci_init(struct ctlr_info *h)
 {
 	int prod_index, err;
@@ -6767,15 +6792,15 @@ static int hpsa_pci_init(struct ctlr_info *h)
 
 	err = pci_enable_device(h->pdev);
 	if (err) {
-		dev_warn(&h->pdev->dev, "unable to enable PCI device\n");
+		dev_err(&h->pdev->dev, "failed to enable PCI device\n");
 		return err;
 	}
 
 	err = pci_request_regions(h->pdev, HPSA);
 	if (err) {
 		dev_err(&h->pdev->dev,
-			"cannot obtain PCI resources, aborting\n");
-		return err;
+			"failed to obtain PCI resources\n");
+		goto clean1;	/* pci */
 	}
 
 	pci_set_master(h->pdev);
@@ -6783,40 +6808,41 @@ static int hpsa_pci_init(struct ctlr_info *h)
 	hpsa_interrupt_mode(h);
 	err = hpsa_pci_find_memory_BAR(h->pdev, &h->paddr);
 	if (err)
-		goto err_out_free_res;
+		goto clean2;	/* intmode+region, pci */
 	h->vaddr = remap_pci_mem(h->paddr, 0x250);
 	if (!h->vaddr) {
+		dev_err(&h->pdev->dev, "failed to remap PCI mem\n");
 		err = -ENOMEM;
-		goto err_out_free_res;
+		goto clean2;	/* intmode+region, pci */
 	}
 	err = hpsa_wait_for_board_state(h->pdev, h->vaddr, BOARD_READY);
 	if (err)
-		goto err_out_free_res;
+		goto clean3;	/* vaddr, intmode+region, pci */
 	err = hpsa_find_cfgtables(h);
 	if (err)
-		goto err_out_free_res;
+		goto clean3;	/* vaddr, intmode+region, pci */
 	hpsa_find_board_params(h);
 
 	if (!hpsa_CISS_signature_present(h)) {
 		err = -ENODEV;
-		goto err_out_free_res;
+		goto clean4;	/* cfgtables, vaddr, intmode+region, pci */
 	}
 	hpsa_set_driver_support_bits(h);
 	hpsa_p600_dma_prefetch_quirk(h);
 	err = hpsa_enter_simple_mode(h);
 	if (err)
-		goto err_out_free_res;
+		goto clean4;	/* cfgtables, vaddr, intmode+region, pci */
 	return 0;
 
-err_out_free_res:
-	if (h->transtable)
-		iounmap(h->transtable);
-	if (h->cfgtable)
-		iounmap(h->cfgtable);
-	if (h->vaddr)
-		iounmap(h->vaddr);
-	pci_disable_device(h->pdev);
+clean4:	/* cfgtables, vaddr, intmode+region, pci */
+	hpsa_free_cfgtables(h);
+clean3:	/* vaddr, intmode+region, pci */
+	iounmap(h->vaddr);
+clean2:	/* intmode+region, pci */
+	hpsa_disable_interrupt_mode(h);
 	pci_release_regions(h->pdev);
+clean1:	/* pci */
+	pci_disable_device(h->pdev);
 	return err;
 }
 
@@ -7025,8 +7051,9 @@ static int hpsa_request_irqs(struct ctlr_info *h,
 		}
 	}
 	if (rc) {
-		dev_err(&h->pdev->dev, "unable to get irq %d for %s\n",
+		dev_err(&h->pdev->dev, "failed to get irq %d for %s\n",
 		       h->intr[h->intr_mode], h->devname);
+		hpsa_free_irqs(h);
 		return -ENODEV;
 	}
 	return 0;
@@ -7078,15 +7105,11 @@ static void hpsa_undo_allocations_after_kdump_soft_reset(struct ctlr_info *h)
 	kfree(h->ioaccel1_blockFetchTable);
 	kfree(h->blockFetchTable);
 	hpsa_free_reply_queues(h);
-	if (h->vaddr)
-		iounmap(h->vaddr);
-	if (h->transtable)
-		iounmap(h->transtable);
-	if (h->cfgtable)
-		iounmap(h->cfgtable);
-	hpsa_disable_interrupt_mode(h);
+	hpsa_free_cfgtables(h);			/* pci_init 4 */
+	iounmap(h->vaddr);			/* pci_init 3 */
+	hpsa_disable_interrupt_mode(h);		/* pci_init 2 */
 	pci_disable_device(h->pdev);
-	pci_release_regions(h->pdev);
+	pci_release_regions(h->pdev);		/* pci_init 2 */
 	kfree(h);
 }
 
@@ -7404,7 +7427,7 @@ reinit_after_soft_reset:
 			dac = 0;
 		} else {
 			dev_err(&pdev->dev, "no suitable DMA available\n");
-			goto clean1;
+			goto clean2;
 		}
 	}
 
@@ -7515,6 +7538,7 @@ clean4:
 clean2_and_free_irqs:
 	hpsa_free_irqs(h);
 clean2:
+	hpsa_free_pci_init(h);
 clean1:
 	if (h->resubmit_wq)
 		destroy_workqueue(h->resubmit_wq);
@@ -7606,12 +7630,10 @@ static void hpsa_remove_one(struct pci_dev *pdev)
 	destroy_workqueue(h->resubmit_wq);
 	hpsa_unregister_scsi(h);	/* unhook from SCSI subsystem */
 
-	/* includes hpsa_free_irqs and hpsa_disable_interrupt_mode */
+	/* includes hpsa_free_irqs */
+	/* includes hpsa_disable_interrupt_mode - pci_init 2 */
 	hpsa_shutdown(pdev);
 
-	iounmap(h->vaddr);
-	iounmap(h->transtable);
-	iounmap(h->cfgtable);
 	hpsa_free_device_info(h);
 	hpsa_free_sg_chain_blocks(h);
 	pci_free_consistent(h->pdev,
@@ -7626,8 +7648,10 @@ static void hpsa_remove_one(struct pci_dev *pdev)
 	kfree(h->ioaccel1_blockFetchTable);
 	kfree(h->ioaccel2_blockFetchTable);
 	kfree(h->hba_inquiry_data);
-	pci_disable_device(pdev);
-	pci_release_regions(pdev);
+
+	/* includes hpsa_disable_interrupt_mode - pci_init 2 */
+	hpsa_free_pci_init(h);
+
 	free_percpu(h->lockup_detected);
 	kfree(h);
 }


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 18/42] hpsa: refactor freeing of resources into more logical functions
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (16 preceding siblings ...)
  2015-03-17 20:03 ` [PATCH v3 17/42] hpsa: clean up error handling Don Brace
@ 2015-03-17 20:04 ` Don Brace
  2015-03-17 20:04 ` [PATCH v3 19/42] hpsa: add ioaccel sg chaining for the ioaccel2 path Don Brace
                   ` (23 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:04 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Robert Elliott <elliott@hp.com>

Refactor the freeing of resources into more logical functions.

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |  104 +++++++++++++++++++++++++++------------------------
 1 file changed, 56 insertions(+), 48 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 3559425a4..9ca86be 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -235,6 +235,8 @@ static void check_ioctl_unit_attention(struct ctlr_info *h,
 static void calc_bucket_map(int *bucket, int num_buckets,
 	int nsgs, int min_blocks, u32 *bucket_map);
 static void hpsa_put_ctlr_into_performant_mode(struct ctlr_info *h);
+static void hpsa_free_ioaccel1_cmd_and_bft(struct ctlr_info *h);
+static void hpsa_free_ioaccel2_cmd_and_bft(struct ctlr_info *h);
 static inline u32 next_command(struct ctlr_info *h, u8 q);
 static int hpsa_find_cfg_addrs(struct pci_dev *pdev, void __iomem *vaddr,
 			       u32 *cfg_base_addr, u64 *cfg_base_addr_index,
@@ -6924,6 +6926,21 @@ out_disable:
 	return rc;
 }
 
+static void hpsa_free_cmd_pool(struct ctlr_info *h)
+{
+	kfree(h->cmd_pool_bits);
+	if (h->cmd_pool)
+		pci_free_consistent(h->pdev,
+				h->nr_cmds * sizeof(struct CommandList),
+				h->cmd_pool,
+				h->cmd_pool_dhandle);
+	if (h->errinfo_pool)
+		pci_free_consistent(h->pdev,
+				h->nr_cmds * sizeof(struct ErrorInfo),
+				h->errinfo_pool,
+				h->errinfo_pool_dhandle);
+}
+
 static int hpsa_alloc_cmd_pool(struct ctlr_info *h)
 {
 	h->cmd_pool_bits = kzalloc(
@@ -6948,28 +6965,6 @@ clean_up:
 	return -ENOMEM;
 }
 
-static void hpsa_free_cmd_pool(struct ctlr_info *h)
-{
-	kfree(h->cmd_pool_bits);
-	if (h->cmd_pool)
-		pci_free_consistent(h->pdev,
-			    h->nr_cmds * sizeof(struct CommandList),
-			    h->cmd_pool, h->cmd_pool_dhandle);
-	if (h->ioaccel2_cmd_pool)
-		pci_free_consistent(h->pdev,
-			h->nr_cmds * sizeof(*h->ioaccel2_cmd_pool),
-			h->ioaccel2_cmd_pool, h->ioaccel2_cmd_pool_dhandle);
-	if (h->errinfo_pool)
-		pci_free_consistent(h->pdev,
-			    h->nr_cmds * sizeof(struct ErrorInfo),
-			    h->errinfo_pool,
-			    h->errinfo_pool_dhandle);
-	if (h->ioaccel_cmd_pool)
-		pci_free_consistent(h->pdev,
-			h->nr_cmds * sizeof(struct io_accel1_cmd),
-			h->ioaccel_cmd_pool, h->ioaccel_cmd_pool_dhandle);
-}
-
 static void hpsa_irq_affinity_hints(struct ctlr_info *h)
 {
 	int i, cpu;
@@ -7090,8 +7085,10 @@ static void hpsa_free_reply_queues(struct ctlr_info *h)
 	for (i = 0; i < h->nreply_queues; i++) {
 		if (!h->reply_queue[i].head)
 			continue;
-		pci_free_consistent(h->pdev, h->reply_queue_size,
-			h->reply_queue[i].head, h->reply_queue[i].busaddr);
+		pci_free_consistent(h->pdev,
+					h->reply_queue_size,
+					h->reply_queue[i].head,
+					h->reply_queue[i].busaddr);
 		h->reply_queue[i].head = NULL;
 		h->reply_queue[i].busaddr = 0;
 	}
@@ -7102,9 +7099,10 @@ static void hpsa_undo_allocations_after_kdump_soft_reset(struct ctlr_info *h)
 	hpsa_free_irqs(h);
 	hpsa_free_sg_chain_blocks(h);
 	hpsa_free_cmd_pool(h);
-	kfree(h->ioaccel1_blockFetchTable);
-	kfree(h->blockFetchTable);
-	hpsa_free_reply_queues(h);
+	kfree(h->blockFetchTable);		/* perf 2 */
+	hpsa_free_reply_queues(h);		/* perf 1 */
+	hpsa_free_ioaccel1_cmd_and_bft(h);	/* perf 1 */
+	hpsa_free_ioaccel2_cmd_and_bft(h);	/* perf 1 */
 	hpsa_free_cfgtables(h);			/* pci_init 4 */
 	iounmap(h->vaddr);			/* pci_init 3 */
 	hpsa_disable_interrupt_mode(h);		/* pci_init 2 */
@@ -7535,6 +7533,8 @@ reinit_after_soft_reset:
 clean4:
 	hpsa_free_sg_chain_blocks(h);
 	hpsa_free_cmd_pool(h);
+	hpsa_free_ioaccel1_cmd_and_bft(h);
+	hpsa_free_ioaccel2_cmd_and_bft(h);
 clean2_and_free_irqs:
 	hpsa_free_irqs(h);
 clean2:
@@ -7636,17 +7636,11 @@ static void hpsa_remove_one(struct pci_dev *pdev)
 
 	hpsa_free_device_info(h);
 	hpsa_free_sg_chain_blocks(h);
-	pci_free_consistent(h->pdev,
-		h->nr_cmds * sizeof(struct CommandList),
-		h->cmd_pool, h->cmd_pool_dhandle);
-	pci_free_consistent(h->pdev,
-		h->nr_cmds * sizeof(struct ErrorInfo),
-		h->errinfo_pool, h->errinfo_pool_dhandle);
-	hpsa_free_reply_queues(h);
-	kfree(h->cmd_pool_bits);
-	kfree(h->blockFetchTable);
-	kfree(h->ioaccel1_blockFetchTable);
-	kfree(h->ioaccel2_blockFetchTable);
+	kfree(h->blockFetchTable);		/* perf 2 */
+	hpsa_free_reply_queues(h);		/* perf 1 */
+	hpsa_free_ioaccel1_cmd_and_bft(h);	/* perf 1 */
+	hpsa_free_ioaccel2_cmd_and_bft(h);	/* perf 1 */
+	hpsa_free_cmd_pool(h);			/* init_one 5 */
 	kfree(h->hba_inquiry_data);
 
 	/* includes hpsa_disable_interrupt_mode - pci_init 2 */
@@ -7893,6 +7887,17 @@ static int hpsa_enter_performant_mode(struct ctlr_info *h, u32 trans_support)
 	return 0;
 }
 
+/* Free ioaccel1 mode command blocks and block fetch table */
+static void hpsa_free_ioaccel1_cmd_and_bft(struct ctlr_info *h)
+{
+	if (h->ioaccel_cmd_pool)
+		pci_free_consistent(h->pdev,
+			h->nr_cmds * sizeof(*h->ioaccel_cmd_pool),
+			h->ioaccel_cmd_pool,
+			h->ioaccel_cmd_pool_dhandle);
+	kfree(h->ioaccel1_blockFetchTable);
+}
+
 /* Allocate ioaccel1 mode command blocks and block fetch table */
 static int hpsa_alloc_ioaccel1_cmd_and_bft(struct ctlr_info *h)
 {
@@ -7925,14 +7930,21 @@ static int hpsa_alloc_ioaccel1_cmd_and_bft(struct ctlr_info *h)
 	return 0;
 
 clean_up:
-	if (h->ioaccel_cmd_pool)
-		pci_free_consistent(h->pdev,
-			h->nr_cmds * sizeof(*h->ioaccel_cmd_pool),
-			h->ioaccel_cmd_pool, h->ioaccel_cmd_pool_dhandle);
-	kfree(h->ioaccel1_blockFetchTable);
+	hpsa_free_ioaccel1_cmd_and_bft(h);
 	return 1;
 }
 
+/* Free ioaccel2 mode command blocks and block fetch table */
+static void hpsa_free_ioaccel2_cmd_and_bft(struct ctlr_info *h)
+{
+	if (h->ioaccel2_cmd_pool)
+		pci_free_consistent(h->pdev,
+			h->nr_cmds * sizeof(*h->ioaccel2_cmd_pool),
+			h->ioaccel2_cmd_pool,
+			h->ioaccel2_cmd_pool_dhandle);
+	kfree(h->ioaccel2_blockFetchTable);
+}
+
 /* Allocate ioaccel2 mode command blocks and block fetch table */
 static int hpsa_alloc_ioaccel2_cmd_and_bft(struct ctlr_info *h)
 {
@@ -7963,11 +7975,7 @@ static int hpsa_alloc_ioaccel2_cmd_and_bft(struct ctlr_info *h)
 	return 0;
 
 clean_up:
-	if (h->ioaccel2_cmd_pool)
-		pci_free_consistent(h->pdev,
-			h->nr_cmds * sizeof(*h->ioaccel2_cmd_pool),
-			h->ioaccel2_cmd_pool, h->ioaccel2_cmd_pool_dhandle);
-	kfree(h->ioaccel2_blockFetchTable);
+	hpsa_free_ioaccel2_cmd_and_bft(h);
 	return 1;
 }
 



* [PATCH v3 19/42] hpsa: add ioaccel sg chaining for the ioaccel2 path
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (17 preceding siblings ...)
  2015-03-17 20:04 ` [PATCH v3 18/42] hpsa: refactor freeing of resources into more logical functions Don Brace
@ 2015-03-17 20:04 ` Don Brace
  2015-03-17 20:04 ` [PATCH v3 20/42] hpsa: add more ioaccel2 error handling, including underrun statuses Don Brace
                   ` (22 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:04 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Webb Scales <webbnh@hp.com>

Increase the request size for the ioaccel2 path.

Any error returned by hpsa_allocate_ioaccel2_sg_chain_blocks
to hpsa_alloc_ioaccel2_cmd_and_bft should be propagated upstream
rather than assumed to be -ENOMEM.

This differs slightly from hpsa_alloc_ioaccel1_cmd_and_bft,
which does not call another hpsa_allocate function and can
only return -ENOMEM from its kmalloc calls.
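The error-propagation point above can be sketched outside the driver as follows; the function names here are illustrative stand-ins for the hpsa helpers, not the driver's actual code, and the -EINVAL failure mode is an invented example of a non-memory error worth preserving:

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Illustrative stand-in for hpsa_allocate_ioaccel2_sg_chain_blocks():
 * it can fail for reasons other than memory exhaustion. */
static int alloc_chain_blocks(void **blocks, size_t n)
{
	if (n == 0)
		return -EINVAL;	/* not an out-of-memory condition */
	*blocks = calloc(n, sizeof(void *));
	return *blocks ? 0 : -ENOMEM;
}

/* The caller propagates the callee's return code upstream instead of
 * collapsing every failure into a hard-coded -ENOMEM. */
static int alloc_cmd_and_bft(void **blocks, size_t n)
{
	int rc = alloc_chain_blocks(blocks, n);

	if (rc)
		return rc;	/* preserves -EINVAL vs. -ENOMEM */
	return 0;
}
```

Returning `rc` rather than a constant keeps the distinction between failure causes visible to init code that may want to react differently to each.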

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |  125 +++++++++++++++++++++++++++++++++++++++++++++++----
 drivers/scsi/hpsa.h |    1 +
 2 files changed, 116 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 9ca86be..0a3ea37 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -1734,6 +1734,46 @@ static void hpsa_slave_destroy(struct scsi_device *sdev)
 	/* nothing to do. */
 }
 
+static void hpsa_free_ioaccel2_sg_chain_blocks(struct ctlr_info *h)
+{
+	int i;
+
+	if (!h->ioaccel2_cmd_sg_list)
+		return;
+	for (i = 0; i < h->nr_cmds; i++) {
+		kfree(h->ioaccel2_cmd_sg_list[i]);
+		h->ioaccel2_cmd_sg_list[i] = NULL;
+	}
+	kfree(h->ioaccel2_cmd_sg_list);
+	h->ioaccel2_cmd_sg_list = NULL;
+}
+
+static int hpsa_allocate_ioaccel2_sg_chain_blocks(struct ctlr_info *h)
+{
+	int i;
+
+	if (h->chainsize <= 0)
+		return 0;
+
+	h->ioaccel2_cmd_sg_list =
+		kzalloc(sizeof(*h->ioaccel2_cmd_sg_list) * h->nr_cmds,
+					GFP_KERNEL);
+	if (!h->ioaccel2_cmd_sg_list)
+		return -ENOMEM;
+	for (i = 0; i < h->nr_cmds; i++) {
+		h->ioaccel2_cmd_sg_list[i] =
+			kmalloc(sizeof(*h->ioaccel2_cmd_sg_list[i]) *
+					h->maxsgentries, GFP_KERNEL);
+		if (!h->ioaccel2_cmd_sg_list[i])
+			goto clean;
+	}
+	return 0;
+
+clean:
+	hpsa_free_ioaccel2_sg_chain_blocks(h);
+	return -ENOMEM;
+}
+
 static void hpsa_free_sg_chain_blocks(struct ctlr_info *h)
 {
 	int i;
@@ -1776,6 +1816,39 @@ clean:
 	return -ENOMEM;
 }
 
+static int hpsa_map_ioaccel2_sg_chain_block(struct ctlr_info *h,
+	struct io_accel2_cmd *cp, struct CommandList *c)
+{
+	struct ioaccel2_sg_element *chain_block;
+	u64 temp64;
+	u32 chain_size;
+
+	chain_block = h->ioaccel2_cmd_sg_list[c->cmdindex];
+	chain_size = le32_to_cpu(cp->data_len);
+	temp64 = pci_map_single(h->pdev, chain_block, chain_size,
+				PCI_DMA_TODEVICE);
+	if (dma_mapping_error(&h->pdev->dev, temp64)) {
+		/* prevent subsequent unmapping */
+		cp->sg->address = 0;
+		return -1;
+	}
+	cp->sg->address = cpu_to_le64(temp64);
+	return 0;
+}
+
+static void hpsa_unmap_ioaccel2_sg_chain_block(struct ctlr_info *h,
+	struct io_accel2_cmd *cp)
+{
+	struct ioaccel2_sg_element *chain_sg;
+	u64 temp64;
+	u32 chain_size;
+
+	chain_sg = cp->sg;
+	temp64 = le64_to_cpu(chain_sg->address);
+	chain_size = le32_to_cpu(cp->data_len);
+	pci_unmap_single(h->pdev, temp64, chain_size, PCI_DMA_TODEVICE);
+}
+
 static int hpsa_map_sg_chain_block(struct ctlr_info *h,
 	struct CommandList *c)
 {
@@ -1985,6 +2058,7 @@ static void complete_scsi_command(struct CommandList *cp)
 	struct ctlr_info *h;
 	struct ErrorInfo *ei;
 	struct hpsa_scsi_dev_t *dev;
+	struct io_accel2_cmd *c2;
 
 	int sense_key;
 	int asc;      /* additional sense code */
@@ -1995,12 +2069,17 @@ static void complete_scsi_command(struct CommandList *cp)
 	cmd = cp->scsi_cmd;
 	h = cp->h;
 	dev = cmd->device->hostdata;
+	c2 = &h->ioaccel2_cmd_pool[cp->cmdindex];
 
 	scsi_dma_unmap(cmd); /* undo the DMA mappings */
 	if ((cp->cmd_type == CMD_SCSI) &&
 		(le16_to_cpu(cp->Header.SGTotal) > h->max_cmd_sg_entries))
 		hpsa_unmap_sg_chain_block(h, cp);
 
+	if ((cp->cmd_type == CMD_IOACCEL2) &&
+		(c2->sg[0].chain_indicator == IOACCEL2_CHAIN))
+		hpsa_unmap_ioaccel2_sg_chain_block(h, c2);
+
 	cmd->result = (DID_OK << 16); 		/* host byte */
 	cmd->result |= (COMMAND_COMPLETE << 8);	/* msg byte */
 
@@ -3842,10 +3921,7 @@ static int hpsa_scsi_ioaccel2_queue_command(struct ctlr_info *h,
 	u32 len;
 	u32 total_len = 0;
 
-	if (scsi_sg_count(cmd) > h->ioaccel_maxsg) {
-		atomic_dec(&phys_disk->ioaccel_cmds_out);
-		return IO_ACCEL_INELIGIBLE;
-	}
+	BUG_ON(scsi_sg_count(cmd) > h->maxsgentries);
 
 	if (fixup_ioaccel_cdb(cdb, &cdb_len)) {
 		atomic_dec(&phys_disk->ioaccel_cmds_out);
@@ -3868,8 +3944,19 @@ static int hpsa_scsi_ioaccel2_queue_command(struct ctlr_info *h,
 	}
 
 	if (use_sg) {
-		BUG_ON(use_sg > IOACCEL2_MAXSGENTRIES);
 		curr_sg = cp->sg;
+		if (use_sg > h->ioaccel_maxsg) {
+			addr64 = le64_to_cpu(
+				h->ioaccel2_cmd_sg_list[c->cmdindex]->address);
+			curr_sg->address = cpu_to_le64(addr64);
+			curr_sg->length = 0;
+			curr_sg->reserved[0] = 0;
+			curr_sg->reserved[1] = 0;
+			curr_sg->reserved[2] = 0;
+			curr_sg->chain_indicator = 0x80;
+
+			curr_sg = h->ioaccel2_cmd_sg_list[c->cmdindex];
+		}
 		scsi_for_each_sg(cmd, sg, use_sg, i) {
 			addr64 = (u64) sg_dma_address(sg);
 			len  = sg_dma_len(sg);
@@ -3914,14 +4001,22 @@ static int hpsa_scsi_ioaccel2_queue_command(struct ctlr_info *h,
 	cp->Tag = cpu_to_le32(c->cmdindex << DIRECT_LOOKUP_SHIFT);
 	memcpy(cp->cdb, cdb, sizeof(cp->cdb));
 
-	/* fill in sg elements */
-	cp->sg_count = (u8) use_sg;
-
 	cp->data_len = cpu_to_le32(total_len);
 	cp->err_ptr = cpu_to_le64(c->busaddr +
 			offsetof(struct io_accel2_cmd, error_data));
 	cp->err_len = cpu_to_le32(sizeof(cp->error_data));
 
+	/* fill in sg elements */
+	if (use_sg > h->ioaccel_maxsg) {
+		cp->sg_count = 1;
+		if (hpsa_map_ioaccel2_sg_chain_block(h, cp, c)) {
+			atomic_dec(&phys_disk->ioaccel_cmds_out);
+			scsi_dma_unmap(cmd);
+			return -1;
+		}
+	} else
+		cp->sg_count = (u8) use_sg;
+
 	enqueue_cmd_and_start_io(h, c);
 	return 0;
 }
@@ -7937,6 +8032,8 @@ clean_up:
 /* Free ioaccel2 mode command blocks and block fetch table */
 static void hpsa_free_ioaccel2_cmd_and_bft(struct ctlr_info *h)
 {
+	hpsa_free_ioaccel2_sg_chain_blocks(h);
+
 	if (h->ioaccel2_cmd_pool)
 		pci_free_consistent(h->pdev,
 			h->nr_cmds * sizeof(*h->ioaccel2_cmd_pool),
@@ -7948,6 +8045,8 @@ static void hpsa_free_ioaccel2_cmd_and_bft(struct ctlr_info *h)
 /* Allocate ioaccel2 mode command blocks and block fetch table */
 static int hpsa_alloc_ioaccel2_cmd_and_bft(struct ctlr_info *h)
 {
+	int rc;
+
 	/* Allocate ioaccel2 mode command blocks and block fetch table */
 
 	h->ioaccel_maxsg =
@@ -7967,7 +8066,13 @@ static int hpsa_alloc_ioaccel2_cmd_and_bft(struct ctlr_info *h)
 				sizeof(u32)), GFP_KERNEL);
 
 	if ((h->ioaccel2_cmd_pool == NULL) ||
-		(h->ioaccel2_blockFetchTable == NULL))
+		(h->ioaccel2_blockFetchTable == NULL)) {
+		rc = -ENOMEM;
+		goto clean_up;
+	}
+
+	rc = hpsa_allocate_ioaccel2_sg_chain_blocks(h);
+	if (rc)
 		goto clean_up;
 
 	memset(h->ioaccel2_cmd_pool, 0,
@@ -7976,7 +8081,7 @@ static int hpsa_alloc_ioaccel2_cmd_and_bft(struct ctlr_info *h)
 
 clean_up:
 	hpsa_free_ioaccel2_cmd_and_bft(h);
-	return 1;
+	return rc;
 }
 
 static void hpsa_put_ctlr_into_performant_mode(struct ctlr_info *h)
diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
index 87a70b5..3acacf6 100644
--- a/drivers/scsi/hpsa.h
+++ b/drivers/scsi/hpsa.h
@@ -162,6 +162,7 @@ struct ctlr_info {
 	u8 max_cmd_sg_entries;
 	int chainsize;
 	struct SGDescriptor **cmd_sg_list;
+	struct ioaccel2_sg_element **ioaccel2_cmd_sg_list;
 
 	/* pointers to command and error info pool */
 	struct CommandList 	*cmd_pool;



* [PATCH v3 20/42] hpsa: add more ioaccel2 error handling, including underrun statuses.
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (18 preceding siblings ...)
  2015-03-17 20:04 ` [PATCH v3 19/42] hpsa: add ioaccel sg chaining for the ioaccel2 path Don Brace
@ 2015-03-17 20:04 ` Don Brace
  2015-03-17 20:04 ` [PATCH v3 21/42] hpsa: do not check cmd_alloc return value - it cannot return NULL Don Brace
                   ` (21 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:04 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Joe Handzik <joseph.t.handzik@hp.com>

Improve ioaccel2 error handling, including better handling of
underrun statuses.
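The underrun branch in this patch reassembles a 32-bit residual count from the four-byte little-endian `resid_cnt` field before passing it to `scsi_set_resid()`. A minimal host-side sketch of that byte assembly (the field layout is taken from the diff; the helper name is illustrative):

```c
#include <stdint.h>

/* Reassemble the little-endian 32-bit residual byte count from the
 * ioaccel2 error data's resid_cnt[] array, mirroring the shifts in
 * handle_ioaccel_mode2_error(). */
static uint32_t ioaccel2_resid(const uint8_t resid_cnt[4])
{
	uint32_t resid;

	resid  = (uint32_t)resid_cnt[3] << 24;
	resid |= (uint32_t)resid_cnt[2] << 16;
	resid |= (uint32_t)resid_cnt[1] << 8;
	resid |= resid_cnt[0];
	return resid;
}
```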

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Joe Handzik <joseph.t.handzik@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c     |   33 ++++++++++++++++++++++++++++-----
 drivers/scsi/hpsa_cmd.h |    6 ++++++
 2 files changed, 34 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 0a3ea37..ba89375a 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -1898,6 +1898,7 @@ static int handle_ioaccel_mode2_error(struct ctlr_info *h,
 {
 	int data_len;
 	int retry = 0;
+	u32 ioaccel2_resid = 0;
 
 	switch (c2->error_data.serv_response) {
 	case IOACCEL2_SERV_RESPONSE_COMPLETE:
@@ -1956,11 +1957,33 @@ static int handle_ioaccel_mode2_error(struct ctlr_info *h,
 		}
 		break;
 	case IOACCEL2_SERV_RESPONSE_FAILURE:
-		/* don't expect to get here. */
-		dev_warn(&h->pdev->dev,
-			"unexpected delivery or target failure, status = 0x%02x\n",
-			c2->error_data.status);
-		retry = 1;
+		switch (c2->error_data.status) {
+		case IOACCEL2_STATUS_SR_IO_ERROR:
+		case IOACCEL2_STATUS_SR_IO_ABORTED:
+		case IOACCEL2_STATUS_SR_OVERRUN:
+			retry = 1;
+			break;
+		case IOACCEL2_STATUS_SR_UNDERRUN:
+			cmd->result = (DID_OK << 16);		/* host byte */
+			cmd->result |= (COMMAND_COMPLETE << 8);	/* msg byte */
+			ioaccel2_resid = c2->error_data.resid_cnt[3] << 24;
+			ioaccel2_resid |= c2->error_data.resid_cnt[2] << 16;
+			ioaccel2_resid |= c2->error_data.resid_cnt[1] << 8;
+			ioaccel2_resid |= c2->error_data.resid_cnt[0];
+			scsi_set_resid(cmd, ioaccel2_resid);
+			break;
+		case IOACCEL2_STATUS_SR_NO_PATH_TO_DEVICE:
+		case IOACCEL2_STATUS_SR_INVALID_DEVICE:
+		case IOACCEL2_STATUS_SR_IOACCEL_DISABLED:
+			/* We will get an event from ctlr to trigger rescan */
+			retry = 1;
+			break;
+		default:
+			retry = 1;
+			dev_warn(&h->pdev->dev,
+				"unexpected delivery or target failure, status = 0x%02x\n",
+				c2->error_data.status);
+		}
 		break;
 	case IOACCEL2_SERV_RESPONSE_TMF_COMPLETE:
 		break;
diff --git a/drivers/scsi/hpsa_cmd.h b/drivers/scsi/hpsa_cmd.h
index 0efb6f2b..cecb62b 100644
--- a/drivers/scsi/hpsa_cmd.h
+++ b/drivers/scsi/hpsa_cmd.h
@@ -532,6 +532,12 @@ struct io_accel2_scsi_response {
 #define IOACCEL2_STATUS_SR_TASK_COMP_SET_FULL	0x28
 #define IOACCEL2_STATUS_SR_TASK_COMP_ABORTED	0x40
 #define IOACCEL2_STATUS_SR_IOACCEL_DISABLED	0x0E
+#define IOACCEL2_STATUS_SR_IO_ERROR		0x01
+#define IOACCEL2_STATUS_SR_IO_ABORTED		0x02
+#define IOACCEL2_STATUS_SR_NO_PATH_TO_DEVICE	0x03
+#define IOACCEL2_STATUS_SR_INVALID_DEVICE	0x04
+#define IOACCEL2_STATUS_SR_UNDERRUN		0x51
+#define IOACCEL2_STATUS_SR_OVERRUN		0x75
 	u8 data_present;		/* low 2 bits */
 #define IOACCEL2_NO_DATAPRESENT		0x000
 #define IOACCEL2_RESPONSE_DATAPRESENT	0x001



* [PATCH v3 21/42] hpsa: do not check cmd_alloc return value - it cannot return NULL
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (19 preceding siblings ...)
  2015-03-17 20:04 ` [PATCH v3 20/42] hpsa: add more ioaccel2 error handling, including underrun statuses Don Brace
@ 2015-03-17 20:04 ` Don Brace
  2015-03-17 20:04 ` [PATCH v3 22/42] hpsa: correct return values from driver functions Don Brace
                   ` (20 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:04 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Robert Elliott <elliott@hp.com>

cmd_alloc() can no longer return NULL, so remove the NULL checks;
they have become unreachable code.
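The allocator this patch relies on can be pictured as a bitmap scan that retries until a free slot appears, so a NULL return is structurally impossible. This is a simplified single-threaded sketch with invented names, not the driver's locked implementation (the real cmd_alloc() waits for another thread's cmd_free() instead of busy-looping):

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_CMDS 8

struct cmd { int index; };

static struct cmd pool[NR_CMDS];
static bool in_use[NR_CMDS];

/* Simplified cmd_alloc(): scan the bitmap for a free slot and retry
 * forever rather than ever returning NULL.  The real driver blocks
 * until cmd_free() runs on another thread. */
static struct cmd *cmd_alloc_sketch(void)
{
	for (;;) {
		for (size_t i = 0; i < NR_CMDS; i++) {
			if (!in_use[i]) {
				in_use[i] = true;
				pool[i].index = (int)i;
				return &pool[i];
			}
		}
		/* real code: wait here for a cmd_free() elsewhere */
	}
}

static void cmd_free_sketch(struct cmd *c)
{
	in_use[c->index] = false;
}
```

Because the allocator cannot fail, every `if (c == NULL)` branch after a call to it is dead code, which is exactly what this patch deletes.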

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |   77 ++++++++++-----------------------------------------
 1 file changed, 15 insertions(+), 62 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index ba89375a..886e928 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -2504,11 +2504,6 @@ static int hpsa_scsi_do_inquiry(struct ctlr_info *h, unsigned char *scsi3addr,
 
 	c = cmd_alloc(h);
 
-	if (c == NULL) {
-		dev_warn(&h->pdev->dev, "cmd_alloc returned NULL!\n");
-		return -ENOMEM;
-	}
-
 	if (fill_cmd(c, HPSA_INQUIRY, h, buf, bufsize,
 			page, scsi3addr, TYPE_CMD)) {
 		rc = -1;
@@ -2537,11 +2532,6 @@ static int hpsa_bmic_ctrl_mode_sense(struct ctlr_info *h,
 	struct ErrorInfo *ei;
 
 	c = cmd_alloc(h);
-	if (c == NULL) {			/* trouble... */
-		dev_warn(&h->pdev->dev, "cmd_alloc returned NULL!\n");
-		return -ENOMEM;
-	}
-
 	if (fill_cmd(c, BMIC_SENSE_CONTROLLER_PARAMETERS, h, buf, bufsize,
 			page, scsi3addr, TYPE_CMD)) {
 		rc = -1;
@@ -2559,7 +2549,7 @@ static int hpsa_bmic_ctrl_mode_sense(struct ctlr_info *h,
 out:
 	cmd_free(h, c);
 	return rc;
-	}
+}
 
 static int hpsa_send_reset(struct ctlr_info *h, unsigned char *scsi3addr,
 	u8 reset_type, int reply_queue)
@@ -2570,10 +2560,6 @@ static int hpsa_send_reset(struct ctlr_info *h, unsigned char *scsi3addr,
 
 	c = cmd_alloc(h);
 
-	if (c == NULL) {			/* trouble... */
-		dev_warn(&h->pdev->dev, "cmd_alloc returned NULL!\n");
-		return -ENOMEM;
-	}
 
 	/* fill_cmd can't fail here, no data buffer to map. */
 	(void) fill_cmd(c, HPSA_DEVICE_RESET_MSG, h, NULL, 0, 0,
@@ -2701,10 +2687,7 @@ static int hpsa_get_raid_map(struct ctlr_info *h,
 	struct ErrorInfo *ei;
 
 	c = cmd_alloc(h);
-	if (c == NULL) {
-		dev_warn(&h->pdev->dev, "cmd_alloc returned NULL!\n");
-		return -ENOMEM;
-	}
+
 	if (fill_cmd(c, HPSA_GET_RAID_MAP, h, &this_device->raid_map,
 			sizeof(this_device->raid_map), 0,
 			scsi3addr, TYPE_CMD)) {
@@ -2877,10 +2860,7 @@ static int hpsa_scsi_do_report_luns(struct ctlr_info *h, int logical,
 	struct ErrorInfo *ei;
 
 	c = cmd_alloc(h);
-	if (c == NULL) {			/* trouble... */
-		dev_err(&h->pdev->dev, "cmd_alloc returned NULL!\n");
-		return -1;
-	}
+
 	/* address the controller */
 	memset(scsi3addr, 0, sizeof(scsi3addr));
 	if (fill_cmd(c, logical ? HPSA_REPORT_LOG : HPSA_REPORT_PHYS, h,
@@ -2995,8 +2975,7 @@ static int hpsa_volume_offline(struct ctlr_info *h,
 #define ASCQ_LUN_NOT_READY_INITIALIZING_CMD_REQ 0x02
 
 	c = cmd_alloc(h);
-	if (!c)
-		return 0;
+
 	(void) fill_cmd(c, TEST_UNIT_READY, h, NULL, 0, 0, scsi3addr, TYPE_CMD);
 	rc = hpsa_scsi_do_simple_cmd(h, c, DEFAULT_REPLY_QUEUE, NO_TIMEOUT);
 	if (rc) {
@@ -3070,8 +3049,7 @@ static int hpsa_device_supports_aborts(struct ctlr_info *h,
 		return 1;
 
 	c = cmd_alloc(h);
-	if (!c)
-		return -ENOMEM;
+
 	(void) fill_cmd(c, HPSA_ABORT_MSG, h, &tag, 0, 0, scsi3addr, TYPE_MSG);
 	(void) hpsa_scsi_do_simple_cmd(h, c, DEFAULT_REPLY_QUEUE, NO_TIMEOUT);
 	/* no unmap needed here because no data xfer. */
@@ -4642,10 +4620,7 @@ static int hpsa_scsi_queue_command(struct Scsi_Host *sh, struct scsi_cmnd *cmd)
 		return 0;
 	}
 	c = cmd_alloc(h);
-	if (c == NULL) {			/* trouble... */
-		dev_err(&h->pdev->dev, "cmd_alloc returned NULL!\n");
-		return SCSI_MLQUEUE_HOST_BUSY;
-	}
+
 	if (unlikely(lockup_detected(h))) {
 		cmd->result = DID_NO_CONNECT << 16;
 		cmd_free(h, c);
@@ -4806,11 +4781,6 @@ static int wait_for_device_to_become_ready(struct ctlr_info *h,
 	struct CommandList *c;
 
 	c = cmd_alloc(h);
-	if (!c) {
-		dev_warn(&h->pdev->dev, "out of memory in "
-			"wait_for_device_to_become_ready.\n");
-		return IO_ERROR;
-	}
 
 	/* Send test unit ready until device ready, or give up. */
 	while (count < HPSA_TUR_RETRY_LIMIT) {
@@ -4973,10 +4943,6 @@ static int hpsa_send_abort(struct ctlr_info *h, unsigned char *scsi3addr,
 	__le32 tagupper, taglower;
 
 	c = cmd_alloc(h);
-	if (c == NULL) {	/* trouble... */
-		dev_warn(&h->pdev->dev, "cmd_alloc returned NULL!\n");
-		return -ENOMEM;
-	}
 
 	/* fill_cmd can't fail here, no buffer to map */
 	(void) fill_cmd(c, HPSA_ABORT_MSG, h, &abort->Header.tag,
@@ -5267,6 +5233,8 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
  * and managed by cmd_alloc() and cmd_free() using a simple bitmap to track
  * which ones are free or in use.  Lock must be held when calling this.
  * cmd_free() is the complement.
+ * This function never gives up and never returns NULL.  If it hangs,
+ * another thread must call cmd_free() to free some tags.
  */
 
 static struct CommandList *cmd_alloc(struct ctlr_info *h)
@@ -5500,10 +5468,7 @@ static int hpsa_passthru_ioctl(struct ctlr_info *h, void __user *argp)
 		}
 	}
 	c = cmd_alloc(h);
-	if (c == NULL) {
-		rc = -ENOMEM;
-		goto out_kfree;
-	}
+
 	/* Fill in the command type */
 	c->cmd_type = CMD_IOCTL_PEND;
 	/* Fill in Command Header */
@@ -5637,10 +5602,7 @@ static int hpsa_big_passthru_ioctl(struct ctlr_info *h, void __user *argp)
 		sg_used++;
 	}
 	c = cmd_alloc(h);
-	if (c == NULL) {
-		status = -ENOMEM;
-		goto cleanup1;
-	}
+
 	c->cmd_type = CMD_IOCTL_PEND;
 	c->Header.ReplyQueue = 0;
 	c->Header.SGList = (u8) sg_used;
@@ -5756,14 +5718,13 @@ static int hpsa_ioctl(struct scsi_device *dev, int cmd, void __user *arg)
 	}
 }
 
-static int hpsa_send_host_reset(struct ctlr_info *h, unsigned char *scsi3addr,
+static void hpsa_send_host_reset(struct ctlr_info *h, unsigned char *scsi3addr,
 				u8 reset_type)
 {
 	struct CommandList *c;
 
 	c = cmd_alloc(h);
-	if (!c)
-		return -ENOMEM;
+
 	/* fill_cmd can't fail here, no data buffer to map */
 	(void) fill_cmd(c, HPSA_DEVICE_RESET_MSG, h, NULL, 0, 0,
 		RAID_CTLR_LUNID, TYPE_MSG);
@@ -5774,7 +5735,7 @@ static int hpsa_send_host_reset(struct ctlr_info *h, unsigned char *scsi3addr,
 	 * the command either.  This is the last command we will send before
 	 * re-initializing everything, so it doesn't matter and won't leak.
 	 */
-	return 0;
+	return;
 }
 
 static int fill_cmd(struct CommandList *c, u8 cmd, struct ctlr_info *h,
@@ -7174,11 +7135,7 @@ static int hpsa_request_irqs(struct ctlr_info *h,
 
 static int hpsa_kdump_soft_reset(struct ctlr_info *h)
 {
-	if (hpsa_send_host_reset(h, RAID_CTLR_LUNID,
-		HPSA_RESET_TYPE_CONTROLLER)) {
-		dev_warn(&h->pdev->dev, "Resetting array controller failed.\n");
-		return -EIO;
-	}
+	hpsa_send_host_reset(h, RAID_CTLR_LUNID, HPSA_RESET_TYPE_CONTROLLER);
 
 	dev_info(&h->pdev->dev, "Waiting for board to soft reset.\n");
 	if (hpsa_wait_for_board_state(h->pdev, h->vaddr, BOARD_NOT_READY)) {
@@ -7683,10 +7640,7 @@ static void hpsa_flush_cache(struct ctlr_info *h)
 		return;
 
 	c = cmd_alloc(h);
-	if (!c) {
-		dev_warn(&h->pdev->dev, "cmd_alloc returned NULL!\n");
-		goto out_of_memory;
-	}
+
 	if (fill_cmd(c, HPSA_CACHE_FLUSH, h, flush_buf, 4, 0,
 		RAID_CTLR_LUNID, TYPE_CMD)) {
 		goto out;
@@ -7700,7 +7654,6 @@ out:
 		dev_warn(&h->pdev->dev,
 			"error flushing cache on controller\n");
 	cmd_free(h, c);
-out_of_memory:
 	kfree(flush_buf);
 }
 



* [PATCH v3 22/42] hpsa: correct return values from driver functions.
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (20 preceding siblings ...)
  2015-03-17 20:04 ` [PATCH v3 21/42] hpsa: do not check cmd_alloc return value - it cannnot return NULL Don Brace
@ 2015-03-17 20:04 ` Don Brace
  2015-03-17 20:04 ` [PATCH v3 23/42] hpsa: clean up driver init Don Brace
                   ` (19 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:04 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Robert Elliott <elliott@hp.com>

Correct the return codes for error conditions.

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 886e928..2cb4db7 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -2691,9 +2691,9 @@ static int hpsa_get_raid_map(struct ctlr_info *h,
 	if (fill_cmd(c, HPSA_GET_RAID_MAP, h, &this_device->raid_map,
 			sizeof(this_device->raid_map), 0,
 			scsi3addr, TYPE_CMD)) {
-		dev_warn(&h->pdev->dev, "Out of memory in hpsa_get_raid_map()\n");
-		rc = -ENOMEM;
-		goto out;
+		dev_warn(&h->pdev->dev, "hpsa_get_raid_map fill_cmd failed\n");
+		cmd_free(h, c);
+		return -1;
 	}
 	rc = hpsa_scsi_do_simple_cmd_with_retry(h, c,
 					PCI_DMA_FROMDEVICE, NO_TIMEOUT);
@@ -5455,7 +5455,7 @@ static int hpsa_passthru_ioctl(struct ctlr_info *h, void __user *argp)
 	if (iocommand.buf_size > 0) {
 		buff = kmalloc(iocommand.buf_size, GFP_KERNEL);
 		if (buff == NULL)
-			return -EFAULT;
+			return -ENOMEM;
 		if (iocommand.Request.Type.Direction & XFER_WRITE) {
 			/* Copy the data into the buffer we created */
 			if (copy_from_user(buff, iocommand.buf,
@@ -8002,7 +8002,7 @@ static int hpsa_alloc_ioaccel1_cmd_and_bft(struct ctlr_info *h)
 
 clean_up:
 	hpsa_free_ioaccel1_cmd_and_bft(h);
-	return 1;
+	return -ENOMEM;
 }
 
 /* Free ioaccel2 mode command blocks and block fetch table */



* [PATCH v3 23/42] hpsa: clean up driver init
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (21 preceding siblings ...)
  2015-03-17 20:04 ` [PATCH v3 22/42] hpsa: correct return values from driver functions Don Brace
@ 2015-03-17 20:04 ` Don Brace
  2015-03-17 20:04 ` [PATCH v3 24/42] hpsa: clean up some error reporting output in abort handler Don Brace
                   ` (18 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:04 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Robert Elliott <elliott@hp.com>

Improve initialization error handling in hpsa_init_one
Clean up style and indent issues
Rename functions for consistency
Improve error messaging on allocations
Fix return status from hpsa_put_ctlr_into_performant_mode
Correct free order in hpsa_init_one using new function
   hpsa_free_performant_mode
Prevent inadvertent use of null pointers by nulling out the parent structures
   and zeroing out associated size variables.
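The last point above is the standard free-and-clear idiom: after releasing a resource, null the pointer and zero its size so a later (possibly buggy) second teardown pass sees a benign empty state instead of a dangling pointer. A minimal sketch outside the driver, with invented struct and field names:

```c
#include <stdlib.h>

struct resource {
	void *buf;
	size_t buf_size;
};

/* Free the child buffer, then clear both the pointer and the size in
 * the parent structure.  free(NULL) is a no-op, so calling this twice
 * is safe rather than a double free. */
static void resource_free(struct resource *r)
{
	free(r->buf);
	r->buf = NULL;
	r->buf_size = 0;
}
```

This is why the hpsa free paths in this series set fields like `h->vaddr`, `h->transtable`, and `h->cfgtable` to NULL immediately after unmapping them.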

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |  243 +++++++++++++++++++++++++++++++++------------------
 1 file changed, 157 insertions(+), 86 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 2cb4db7..f980b89 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -234,9 +234,8 @@ static void check_ioctl_unit_attention(struct ctlr_info *h,
 /* performant mode helper functions */
 static void calc_bucket_map(int *bucket, int num_buckets,
 	int nsgs, int min_blocks, u32 *bucket_map);
-static void hpsa_put_ctlr_into_performant_mode(struct ctlr_info *h);
-static void hpsa_free_ioaccel1_cmd_and_bft(struct ctlr_info *h);
-static void hpsa_free_ioaccel2_cmd_and_bft(struct ctlr_info *h);
+static void hpsa_free_performant_mode(struct ctlr_info *h);
+static int hpsa_put_ctlr_into_performant_mode(struct ctlr_info *h);
 static inline u32 next_command(struct ctlr_info *h, u8 q);
 static int hpsa_find_cfg_addrs(struct pci_dev *pdev, void __iomem *vaddr,
 			       u32 *cfg_base_addr, u64 *cfg_base_addr_index,
@@ -1665,6 +1664,7 @@ static void adjust_hpsa_scsi_table(struct ctlr_info *h, int hostno,
 		 * since it didn't get added to scsi mid layer
 		 */
 		fixup_botched_add(h, added[i]);
+		added[i] = NULL;
 	}
 
 free_and_out:
@@ -1788,7 +1788,7 @@ static void hpsa_free_sg_chain_blocks(struct ctlr_info *h)
 	h->cmd_sg_list = NULL;
 }
 
-static int hpsa_allocate_sg_chain_blocks(struct ctlr_info *h)
+static int hpsa_alloc_sg_chain_blocks(struct ctlr_info *h)
 {
 	int i;
 
@@ -6488,9 +6488,11 @@ static void hpsa_disable_interrupt_mode(struct ctlr_info *h)
 	if (h->msix_vector) {
 		if (h->pdev->msix_enabled)
 			pci_disable_msix(h->pdev);
+		h->msix_vector = 0;
 	} else if (h->msi_vector) {
 		if (h->pdev->msi_enabled)
 			pci_disable_msi(h->pdev);
+		h->msi_vector = 0;
 	}
 }
 
@@ -6629,10 +6631,14 @@ static int hpsa_find_cfg_addrs(struct pci_dev *pdev, void __iomem *vaddr,
 
 static void hpsa_free_cfgtables(struct ctlr_info *h)
 {
-	if (h->transtable)
+	if (h->transtable) {
 		iounmap(h->transtable);
-	if (h->cfgtable)
+		h->transtable = NULL;
+	}
+	if (h->cfgtable) {
 		iounmap(h->cfgtable);
+		h->cfgtable = NULL;
+	}
 }
 
 /* Find and map CISS config table and transfer table
@@ -6849,6 +6855,7 @@ static void hpsa_free_pci_init(struct ctlr_info *h)
 {
 	hpsa_free_cfgtables(h);			/* pci_init 4 */
 	iounmap(h->vaddr);			/* pci_init 3 */
+	h->vaddr = NULL;
 	hpsa_disable_interrupt_mode(h);		/* pci_init 2 */
 	pci_release_regions(h->pdev);		/* pci_init 2 */
 	pci_disable_device(h->pdev);		/* pci_init 1 */
@@ -6919,6 +6926,7 @@ clean4:	/* cfgtables, vaddr, intmode+region, pci */
 	hpsa_free_cfgtables(h);
 clean3:	/* vaddr, intmode+region, pci */
 	iounmap(h->vaddr);
+	h->vaddr = NULL;
 clean2:	/* intmode+region, pci */
 	hpsa_disable_interrupt_mode(h);
 	pci_release_regions(h->pdev);
@@ -7008,16 +7016,23 @@ out_disable:
 static void hpsa_free_cmd_pool(struct ctlr_info *h)
 {
 	kfree(h->cmd_pool_bits);
-	if (h->cmd_pool)
+	h->cmd_pool_bits = NULL;
+	if (h->cmd_pool) {
 		pci_free_consistent(h->pdev,
 				h->nr_cmds * sizeof(struct CommandList),
 				h->cmd_pool,
 				h->cmd_pool_dhandle);
-	if (h->errinfo_pool)
+		h->cmd_pool = NULL;
+		h->cmd_pool_dhandle = 0;
+	}
+	if (h->errinfo_pool) {
 		pci_free_consistent(h->pdev,
 				h->nr_cmds * sizeof(struct ErrorInfo),
 				h->errinfo_pool,
 				h->errinfo_pool_dhandle);
+		h->errinfo_pool = NULL;
+		h->errinfo_pool_dhandle = 0;
+	}
 }
 
 static int hpsa_alloc_cmd_pool(struct ctlr_info *h)
@@ -7065,12 +7080,14 @@ static void hpsa_free_irqs(struct ctlr_info *h)
 		i = h->intr_mode;
 		irq_set_affinity_hint(h->intr[i], NULL);
 		free_irq(h->intr[i], &h->q[i]);
+		h->q[i] = 0;
 		return;
 	}
 
 	for (i = 0; i < h->msix_vector; i++) {
 		irq_set_affinity_hint(h->intr[i], NULL);
 		free_irq(h->intr[i], &h->q[i]);
+		h->q[i] = 0;
 	}
 	for (; i < MAX_REPLY_QUEUES; i++)
 		h->q[i] = 0;
@@ -7123,6 +7140,7 @@ static int hpsa_request_irqs(struct ctlr_info *h,
 				intxhandler, IRQF_SHARED, h->devname,
 				&h->q[h->intr_mode]);
 		}
+		irq_set_affinity_hint(h->intr[h->intr_mode], NULL);
 	}
 	if (rc) {
 		dev_err(&h->pdev->dev, "failed to get irq %d for %s\n",
@@ -7167,23 +7185,17 @@ static void hpsa_free_reply_queues(struct ctlr_info *h)
 		h->reply_queue[i].head = NULL;
 		h->reply_queue[i].busaddr = 0;
 	}
+	h->reply_queue_size = 0;
 }
 
 static void hpsa_undo_allocations_after_kdump_soft_reset(struct ctlr_info *h)
 {
-	hpsa_free_irqs(h);
-	hpsa_free_sg_chain_blocks(h);
-	hpsa_free_cmd_pool(h);
-	kfree(h->blockFetchTable);		/* perf 2 */
-	hpsa_free_reply_queues(h);		/* perf 1 */
-	hpsa_free_ioaccel1_cmd_and_bft(h);	/* perf 1 */
-	hpsa_free_ioaccel2_cmd_and_bft(h);	/* perf 1 */
-	hpsa_free_cfgtables(h);			/* pci_init 4 */
-	iounmap(h->vaddr);			/* pci_init 3 */
-	hpsa_disable_interrupt_mode(h);		/* pci_init 2 */
-	pci_disable_device(h->pdev);
-	pci_release_regions(h->pdev);		/* pci_init 2 */
-	kfree(h);
+	hpsa_free_performant_mode(h);		/* init_one 7 */
+	hpsa_free_sg_chain_blocks(h);		/* init_one 6 */
+	hpsa_free_cmd_pool(h);			/* init_one 5 */
+	hpsa_free_irqs(h);			/* init_one 4 */
+	hpsa_free_pci_init(h);			/* init_one 3 */
+	kfree(h);				/* init_one 1 */
 }
 
 /* Called when controller lockup detected. */
@@ -7450,10 +7462,13 @@ reinit_after_soft_reset:
 	 */
 	BUILD_BUG_ON(sizeof(struct CommandList) % COMMANDLIST_ALIGNMENT);
 	h = kzalloc(sizeof(*h), GFP_KERNEL);
-	if (!h)
+	if (!h) {
+		dev_err(&pdev->dev, "Failed to allocate controller head\n");
 		return -ENOMEM;
+	}
 
 	h->pdev = pdev;
+
 	h->intr_mode = hpsa_simple_mode ? SIMPLE_MODE_INT : PERF_MODE_INT;
 	INIT_LIST_HEAD(&h->offline_device_list);
 	spin_lock_init(&h->lock);
@@ -7471,20 +7486,21 @@ reinit_after_soft_reset:
 	h->resubmit_wq = hpsa_create_controller_wq(h, "resubmit");
 	if (!h->resubmit_wq) {
 		rc = -ENOMEM;
-		goto clean1;
+		goto clean1;	/* aer/h */
 	}
 
 	/* Allocate and clear per-cpu variable lockup_detected */
 	h->lockup_detected = alloc_percpu(u32);
 	if (!h->lockup_detected) {
+		dev_err(&h->pdev->dev, "Failed to allocate lockup detector\n");
 		rc = -ENOMEM;
-		goto clean1;
+		goto clean1;	/* wq/aer/h */
 	}
 	set_lockup_detected_for_all_cpus(h, 0);
 
 	rc = hpsa_pci_init(h);
-	if (rc != 0)
-		goto clean1;
+	if (rc)
+		goto clean2;	/* lockup, wq/aer/h */
 
 	sprintf(h->devname, HPSA "%d", number_of_controllers);
 	h->ctlr = number_of_controllers;
@@ -7500,23 +7516,25 @@ reinit_after_soft_reset:
 			dac = 0;
 		} else {
 			dev_err(&pdev->dev, "no suitable DMA available\n");
-			goto clean2;
+			goto clean3;	/* pci, lockup, wq/aer/h */
 		}
 	}
 
 	/* make sure the board interrupts are off */
 	h->access.set_intr_mask(h, HPSA_INTR_OFF);
 
-	if (hpsa_request_irqs(h, do_hpsa_intr_msi, do_hpsa_intr_intx))
-		goto clean2;
+	rc = hpsa_request_irqs(h, do_hpsa_intr_msi, do_hpsa_intr_intx);
+	if (rc)
+		goto clean3;	/* pci, lockup, wq/aer/h */
 	dev_info(&pdev->dev, "%s: <0x%x> at IRQ %d%s using DAC\n",
 	       h->devname, pdev->device,
 	       h->intr[h->intr_mode], dac ? "" : " not");
 	rc = hpsa_alloc_cmd_pool(h);
 	if (rc)
-		goto clean2_and_free_irqs;
-	if (hpsa_allocate_sg_chain_blocks(h))
-		goto clean4;
+		goto clean4;	/* irq, pci, lockup, wq/aer/h */
+	rc = hpsa_alloc_sg_chain_blocks(h);
+	if (rc)
+		goto clean5;	/* cmd, irq, pci, lockup, wq/aer/h */
 	init_waitqueue_head(&h->scan_wait_queue);
 	init_waitqueue_head(&h->abort_cmd_wait_queue);
 	h->scan_finished = 1; /* no scan currently in progress */
@@ -7526,9 +7544,12 @@ reinit_after_soft_reset:
 	h->hba_mode_enabled = 0;
 	h->scsi_host = NULL;
 	spin_lock_init(&h->devlock);
-	hpsa_put_ctlr_into_performant_mode(h);
+	rc = hpsa_put_ctlr_into_performant_mode(h);
+	if (rc)
+		goto clean6;	/* sg, cmd, irq, pci, lockup, wq/aer/h */
 
-	/* At this point, the controller is ready to take commands.
+	/*
+	 * At this point, the controller is ready to take commands.
 	 * Now, if reset_devices and the hard reset didn't work, try
 	 * the soft reset and see if that works.
 	 */
@@ -7583,17 +7604,17 @@ reinit_after_soft_reset:
 		goto reinit_after_soft_reset;
 	}
 
-		/* Enable Accelerated IO path at driver layer */
-		h->acciopath_status = 1;
+	/* Enable Accelerated IO path at driver layer */
+	h->acciopath_status = 1;
 
 
 	/* Turn the interrupts on so we can service requests */
 	h->access.set_intr_mask(h, HPSA_INTR_ON);
 
 	hpsa_hba_inquiry(h);
-	rc = hpsa_register_scsi(h); /* hook ourselves into SCSI subsystem */
+	rc = hpsa_register_scsi(h);	/* hook ourselves into SCSI subsystem */
 	if (rc)
-		goto clean4;
+		goto clean7;
 
 	/* Monitor the controller for firmware lockups */
 	h->heartbeat_sample_interval = HEARTBEAT_SAMPLE_INTERVAL;
@@ -7605,22 +7626,32 @@ reinit_after_soft_reset:
 				h->heartbeat_sample_interval);
 	return 0;
 
-clean4:
+clean7: /* perf, sg, cmd, irq, pci, lockup, wq/aer/h */
+	kfree(h->hba_inquiry_data);
+	hpsa_free_performant_mode(h);
+	h->access.set_intr_mask(h, HPSA_INTR_OFF);
+clean6: /* sg, cmd, irq, pci, lockup, wq/aer/h */
 	hpsa_free_sg_chain_blocks(h);
+clean5: /* cmd, irq, pci, lockup, wq/aer/h */
 	hpsa_free_cmd_pool(h);
-	hpsa_free_ioaccel1_cmd_and_bft(h);
-	hpsa_free_ioaccel2_cmd_and_bft(h);
-clean2_and_free_irqs:
+clean4: /* irq, pci, lockup, wq/aer/h */
 	hpsa_free_irqs(h);
-clean2:
+clean3: /* pci, lockup, wq/aer/h */
 	hpsa_free_pci_init(h);
-clean1:
-	if (h->resubmit_wq)
+clean2: /* lockup, wq/aer/h */
+	if (h->lockup_detected) {
+		free_percpu(h->lockup_detected);
+		h->lockup_detected = NULL;
+	}
+clean1:	/* wq/aer/h */
+	if (h->resubmit_wq) {
 		destroy_workqueue(h->resubmit_wq);
-	if (h->rescan_ctlr_wq)
+		h->resubmit_wq = NULL;
+	}
+	if (h->rescan_ctlr_wq) {
 		destroy_workqueue(h->rescan_ctlr_wq);
-	if (h->lockup_detected)
-		free_percpu(h->lockup_detected);
+		h->rescan_ctlr_wq = NULL;
+	}
 	kfree(h);
 	return rc;
 }
@@ -7668,7 +7699,7 @@ static void hpsa_shutdown(struct pci_dev *pdev)
 	 */
 	hpsa_flush_cache(h);
 	h->access.set_intr_mask(h, HPSA_INTR_OFF);
-	hpsa_free_irqs(h);
+	hpsa_free_irqs(h);			/* init_one 4 */
 	hpsa_disable_interrupt_mode(h);		/* pci_init 2 */
 }
 
@@ -7676,8 +7707,10 @@ static void hpsa_free_device_info(struct ctlr_info *h)
 {
 	int i;
 
-	for (i = 0; i < h->ndevices; i++)
+	for (i = 0; i < h->ndevices; i++) {
 		kfree(h->dev[i]);
+		h->dev[i] = NULL;
+	}
 }
 
 static void hpsa_remove_one(struct pci_dev *pdev)
@@ -7699,26 +7732,29 @@ static void hpsa_remove_one(struct pci_dev *pdev)
 	cancel_delayed_work_sync(&h->rescan_ctlr_work);
 	destroy_workqueue(h->rescan_ctlr_wq);
 	destroy_workqueue(h->resubmit_wq);
-	hpsa_unregister_scsi(h);	/* unhook from SCSI subsystem */
 
-	/* includes hpsa_free_irqs */
+	/* includes hpsa_free_irqs - init_one 4 */
 	/* includes hpsa_disable_interrupt_mode - pci_init 2 */
 	hpsa_shutdown(pdev);
 
-	hpsa_free_device_info(h);
-	hpsa_free_sg_chain_blocks(h);
-	kfree(h->blockFetchTable);		/* perf 2 */
-	hpsa_free_reply_queues(h);		/* perf 1 */
-	hpsa_free_ioaccel1_cmd_and_bft(h);	/* perf 1 */
-	hpsa_free_ioaccel2_cmd_and_bft(h);	/* perf 1 */
-	hpsa_free_cmd_pool(h);			/* init_one 5 */
-	kfree(h->hba_inquiry_data);
+	hpsa_free_device_info(h);		/* scan */
+
+	hpsa_unregister_scsi(h);			/* init_one "8" */
+	kfree(h->hba_inquiry_data);			/* init_one "8" */
+	h->hba_inquiry_data = NULL;			/* init_one "8" */
+	hpsa_free_performant_mode(h);			/* init_one 7 */
+	hpsa_free_sg_chain_blocks(h);			/* init_one 6 */
+	hpsa_free_cmd_pool(h);				/* init_one 5 */
+
+	/* hpsa_free_irqs already called via hpsa_shutdown init_one 4 */
 
 	/* includes hpsa_disable_interrupt_mode - pci_init 2 */
-	hpsa_free_pci_init(h);
+	hpsa_free_pci_init(h);				/* init_one 3 */
 
-	free_percpu(h->lockup_detected);
-	kfree(h);
+	free_percpu(h->lockup_detected);		/* init_one 2 */
+	h->lockup_detected = NULL;			/* init_one 2 */
+	/* (void) pci_disable_pcie_error_reporting(pdev); */	/* init_one 1 */
+	kfree(h);					/* init_one 1 */
 }
 
 static int hpsa_suspend(__attribute__((unused)) struct pci_dev *pdev,
@@ -7776,7 +7812,10 @@ static void  calc_bucket_map(int bucket[], int num_buckets,
 	}
 }
 
-/* return -ENODEV or other reason on error, 0 on success */
+/*
+ * return -ENODEV on err, 0 on success (or no action)
+ * allocates numerous items that must be freed later
+ */
 static int hpsa_enter_performant_mode(struct ctlr_info *h, u32 trans_support)
 {
 	int i;
@@ -7961,12 +8000,16 @@ static int hpsa_enter_performant_mode(struct ctlr_info *h, u32 trans_support)
 /* Free ioaccel1 mode command blocks and block fetch table */
 static void hpsa_free_ioaccel1_cmd_and_bft(struct ctlr_info *h)
 {
-	if (h->ioaccel_cmd_pool)
+	if (h->ioaccel_cmd_pool) {
 		pci_free_consistent(h->pdev,
 			h->nr_cmds * sizeof(*h->ioaccel_cmd_pool),
 			h->ioaccel_cmd_pool,
 			h->ioaccel_cmd_pool_dhandle);
+		h->ioaccel_cmd_pool = NULL;
+		h->ioaccel_cmd_pool_dhandle = 0;
+	}
 	kfree(h->ioaccel1_blockFetchTable);
+	h->ioaccel1_blockFetchTable = NULL;
 }
 
 /* Allocate ioaccel1 mode command blocks and block fetch table */
@@ -8010,12 +8053,16 @@ static void hpsa_free_ioaccel2_cmd_and_bft(struct ctlr_info *h)
 {
 	hpsa_free_ioaccel2_sg_chain_blocks(h);
 
-	if (h->ioaccel2_cmd_pool)
+	if (h->ioaccel2_cmd_pool) {
 		pci_free_consistent(h->pdev,
 			h->nr_cmds * sizeof(*h->ioaccel2_cmd_pool),
 			h->ioaccel2_cmd_pool,
 			h->ioaccel2_cmd_pool_dhandle);
+		h->ioaccel2_cmd_pool = NULL;
+		h->ioaccel2_cmd_pool_dhandle = 0;
+	}
 	kfree(h->ioaccel2_blockFetchTable);
+	h->ioaccel2_blockFetchTable = NULL;
 }
 
 /* Allocate ioaccel2 mode command blocks and block fetch table */
@@ -8060,33 +8107,46 @@ clean_up:
 	return rc;
 }
 
-static void hpsa_put_ctlr_into_performant_mode(struct ctlr_info *h)
+/* Free items allocated by hpsa_put_ctlr_into_performant_mode */
+static void hpsa_free_performant_mode(struct ctlr_info *h)
+{
+	kfree(h->blockFetchTable);
+	h->blockFetchTable = NULL;
+	hpsa_free_reply_queues(h);
+	hpsa_free_ioaccel1_cmd_and_bft(h);
+	hpsa_free_ioaccel2_cmd_and_bft(h);
+}
+
+/* return -ENODEV on error, 0 on success (or no action)
+ * allocates numerous items that must be freed later
+ */
+static int hpsa_put_ctlr_into_performant_mode(struct ctlr_info *h)
 {
 	u32 trans_support;
 	unsigned long transMethod = CFGTBL_Trans_Performant |
 					CFGTBL_Trans_use_short_tags;
-	int i;
+	int i, rc;
 
 	if (hpsa_simple_mode)
-		return;
+		return 0;
 
 	trans_support = readl(&(h->cfgtable->TransportSupport));
 	if (!(trans_support & PERFORMANT_MODE))
-		return;
+		return 0;
 
 	/* Check for I/O accelerator mode support */
 	if (trans_support & CFGTBL_Trans_io_accel1) {
 		transMethod |= CFGTBL_Trans_io_accel1 |
 				CFGTBL_Trans_enable_directed_msix;
-		if (hpsa_alloc_ioaccel1_cmd_and_bft(h))
-			goto clean_up;
-	} else {
-		if (trans_support & CFGTBL_Trans_io_accel2) {
-				transMethod |= CFGTBL_Trans_io_accel2 |
+		rc = hpsa_alloc_ioaccel1_cmd_and_bft(h);
+		if (rc)
+			return rc;
+	} else if (trans_support & CFGTBL_Trans_io_accel2) {
+		transMethod |= CFGTBL_Trans_io_accel2 |
 				CFGTBL_Trans_enable_directed_msix;
-		if (hpsa_alloc_ioaccel2_cmd_and_bft(h))
-			goto clean_up;
-		}
+		rc = hpsa_alloc_ioaccel2_cmd_and_bft(h);
+		if (rc)
+			return rc;
 	}
 
 	h->nreply_queues = h->msix_vector > 0 ? h->msix_vector : 1;
@@ -8098,8 +8158,10 @@ static void hpsa_put_ctlr_into_performant_mode(struct ctlr_info *h)
 		h->reply_queue[i].head = pci_alloc_consistent(h->pdev,
 						h->reply_queue_size,
 						&(h->reply_queue[i].busaddr));
-		if (!h->reply_queue[i].head)
-			goto clean_up;
+		if (!h->reply_queue[i].head) {
+			rc = -ENOMEM;
+			goto clean1;	/* rq, ioaccel */
+		}
 		h->reply_queue[i].size = h->max_commands;
 		h->reply_queue[i].wraparound = 1;  /* spec: init to 1 */
 		h->reply_queue[i].current_entry = 0;
@@ -8108,15 +8170,24 @@ static void hpsa_put_ctlr_into_performant_mode(struct ctlr_info *h)
 	/* Need a block fetch table for performant mode */
 	h->blockFetchTable = kmalloc(((SG_ENTRIES_IN_CMD + 1) *
 				sizeof(u32)), GFP_KERNEL);
-	if (!h->blockFetchTable)
-		goto clean_up;
+	if (!h->blockFetchTable) {
+		rc = -ENOMEM;
+		goto clean1;	/* rq, ioaccel */
+	}
 
-	hpsa_enter_performant_mode(h, trans_support);
-	return;
+	rc = hpsa_enter_performant_mode(h, trans_support);
+	if (rc)
+		goto clean2;	/* bft, rq, ioaccel */
+	return 0;
 
-clean_up:
-	hpsa_free_reply_queues(h);
+clean2:	/* bft, rq, ioaccel */
 	kfree(h->blockFetchTable);
+	h->blockFetchTable = NULL;
+clean1:	/* rq, ioaccel */
+	hpsa_free_reply_queues(h);
+	hpsa_free_ioaccel1_cmd_and_bft(h);
+	hpsa_free_ioaccel2_cmd_and_bft(h);
+	return rc;
 }
 
 static int is_accelerated_cmd(struct CommandList *c)



* [PATCH v3 24/42] hpsa: clean up some error reporting output in abort handler
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (22 preceding siblings ...)
  2015-03-17 20:04 ` [PATCH v3 23/42] hpsa: clean up driver init Don Brace
@ 2015-03-17 20:04 ` Don Brace
  2015-03-17 20:04 ` [PATCH v3 25/42] hpsa: do not print ioaccel2 warning messages about unusual completions Don Brace
                   ` (17 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:04 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Robert Elliott <elliott@hp.com>

Report more useful information on aborts: log the command pointer, CDB
length, and leading CDB bytes, and reuse the same message prefix at each
stage of the abort so related log lines are easy to correlate.

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |   29 ++++++++++++++---------------
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index f980b89..7ab34f8 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -5140,10 +5140,10 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
 		return FAILED;
 
 	memset(msg, 0, sizeof(msg));
-	ml += sprintf(msg+ml, "scsi %d:%d:%d:%llu %s",
+	ml += sprintf(msg+ml, "scsi %d:%d:%d:%llu %s %p",
 		h->scsi_host->host_no, sc->device->channel,
 		sc->device->id, sc->device->lun,
-		"Aborting command");
+		"Aborting command", sc);
 
 	/* Find the device of the command to be aborted */
 	dev = sc->device->hostdata;
@@ -5177,12 +5177,12 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
 	ml += sprintf(msg+ml, "Tag:0x%08x:%08x ", tagupper, taglower);
 	as  = abort->scsi_cmd;
 	if (as != NULL)
-		ml += sprintf(msg+ml, "Command:0x%x SN:0x%lx ",
-			as->cmnd[0], as->serial_number);
-	dev_dbg(&h->pdev->dev, "%s\n", msg);
-	dev_warn(&h->pdev->dev, "scsi %d:%d:%d:%d %s\n",
-		h->scsi_host->host_no, dev->bus, dev->target, dev->lun,
-		"Aborting command");
+		ml += sprintf(msg+ml,
+			"CDBLen: %d CDB: 0x%02x%02x... SN: 0x%lx ",
+			as->cmd_len, as->cmnd[0], as->cmnd[1],
+			as->serial_number);
+	dev_warn(&h->pdev->dev, "%s BEING SENT\n", msg);
+
 	/*
 	 * Command is in flight, or possibly already completed
 	 * by the firmware (but not to the scsi mid layer) but we can't
@@ -5190,7 +5190,8 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
 	 */
 	if (wait_for_available_abort_cmd(h)) {
 		dev_warn(&h->pdev->dev,
-			"Timed out waiting for an abort command to become available.\n");
+			"%s FAILED, timeout waiting for an abort command to become available.\n",
+			msg);
 		cmd_free(h, abort);
 		return FAILED;
 	}
@@ -5198,16 +5199,14 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
 	atomic_inc(&h->abort_cmds_available);
 	wake_up_all(&h->abort_cmd_wait_queue);
 	if (rc != 0) {
-		dev_warn(&h->pdev->dev, "scsi %d:%d:%d:%d %s\n",
-			h->scsi_host->host_no,
-			dev->bus, dev->target, dev->lun,
-			"FAILED to abort command");
+		dev_warn(&h->pdev->dev, "%s SENT, FAILED\n", msg);
 		cmd_free(h, abort);
 		return FAILED;
 	}
-	dev_info(&h->pdev->dev, "%s REQUEST SUCCEEDED.\n", msg);
+	dev_info(&h->pdev->dev, "%s SENT, SUCCESS\n", msg);
 
-	/* If the abort(s) above completed and actually aborted the
+	/*
+	 * If the abort(s) above completed and actually aborted the
 	 * command, then the command to be aborted should already be
 	 * completed.  If not, wait around a bit more to see if they
 	 * manage to complete normally.



* [PATCH v3 25/42] hpsa: do not print ioaccel2 warning messages about unusual completions.
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (23 preceding siblings ...)
  2015-03-17 20:04 ` [PATCH v3 24/42] hpsa: clean up some error reporting output in abort handler Don Brace
@ 2015-03-17 20:04 ` Don Brace
  2015-03-17 20:04 ` [PATCH v3 26/42] hpsa: add support sending aborts to physical devices via the ioaccel2 path Don Brace
                   ` (16 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:04 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Robert Elliott <elliott@hp.com>

The SCSI midlayer already prints more detail about completions, and has
logging-level options to filter them when they are not wanted.  These
driver messages just slow down the system when many errors occur,
stressing error handling even more.

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |   24 ------------------------
 1 file changed, 24 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 7ab34f8..7d0e226 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -1906,9 +1906,6 @@ static int handle_ioaccel_mode2_error(struct ctlr_info *h,
 		case IOACCEL2_STATUS_SR_TASK_COMP_GOOD:
 			break;
 		case IOACCEL2_STATUS_SR_TASK_COMP_CHK_COND:
-			dev_warn(&h->pdev->dev,
-				"%s: task complete with check condition.\n",
-				"HP SSD Smart Path");
 			cmd->result |= SAM_STAT_CHECK_CONDITION;
 			if (c2->error_data.data_present !=
 					IOACCEL2_SENSE_DATA_PRESENT) {
@@ -1928,30 +1925,18 @@ static int handle_ioaccel_mode2_error(struct ctlr_info *h,
 			retry = 1;
 			break;
 		case IOACCEL2_STATUS_SR_TASK_COMP_BUSY:
-			dev_warn(&h->pdev->dev,
-				"%s: task complete with BUSY status.\n",
-				"HP SSD Smart Path");
 			retry = 1;
 			break;
 		case IOACCEL2_STATUS_SR_TASK_COMP_RES_CON:
-			dev_warn(&h->pdev->dev,
-				"%s: task complete with reservation conflict.\n",
-				"HP SSD Smart Path");
 			retry = 1;
 			break;
 		case IOACCEL2_STATUS_SR_TASK_COMP_SET_FULL:
 			retry = 1;
 			break;
 		case IOACCEL2_STATUS_SR_TASK_COMP_ABORTED:
-			dev_warn(&h->pdev->dev,
-				"%s: task complete with aborted status.\n",
-				"HP SSD Smart Path");
 			retry = 1;
 			break;
 		default:
-			dev_warn(&h->pdev->dev,
-				"%s: task complete with unrecognized status: 0x%02x\n",
-				"HP SSD Smart Path", c2->error_data.status);
 			retry = 1;
 			break;
 		}
@@ -1980,9 +1965,6 @@ static int handle_ioaccel_mode2_error(struct ctlr_info *h,
 			break;
 		default:
 			retry = 1;
-			dev_warn(&h->pdev->dev,
-				"unexpected delivery or target failure, status = 0x%02x\n",
-				c2->error_data.status);
 		}
 		break;
 	case IOACCEL2_SERV_RESPONSE_TMF_COMPLETE:
@@ -1990,17 +1972,11 @@ static int handle_ioaccel_mode2_error(struct ctlr_info *h,
 	case IOACCEL2_SERV_RESPONSE_TMF_SUCCESS:
 		break;
 	case IOACCEL2_SERV_RESPONSE_TMF_REJECTED:
-		dev_warn(&h->pdev->dev, "task management function rejected.\n");
 		retry = 1;
 		break;
 	case IOACCEL2_SERV_RESPONSE_TMF_WRONG_LUN:
-		dev_warn(&h->pdev->dev, "task management function invalid LUN\n");
 		break;
 	default:
-		dev_warn(&h->pdev->dev,
-			"%s: Unrecognized server response: 0x%02x\n",
-			"HP SSD Smart Path",
-			c2->error_data.serv_response);
 		retry = 1;
 		break;
 	}



* [PATCH v3 26/42] hpsa: add support sending aborts to physical devices via the ioaccel2 path
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (24 preceding siblings ...)
  2015-03-17 20:04 ` [PATCH v3 25/42] hpsa: do not print ioaccel2 warning messages about unusual completions Don Brace
@ 2015-03-17 20:04 ` Don Brace
  2015-03-17 20:04 ` [PATCH v3 27/42] hpsa: use helper routines for finishing commands Don Brace
                   ` (15 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:04 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Stephen Cameron <stephenmcameron@gmail.com>

Add support for aborting commands via TMF (task management function)
requests when in ioaccel2 mode, falling back to a physical target reset
when the firmware does not support abort TMFs on that path.

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Joe Handzik <joseph.t.handzik@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c     |  136 +++++++++++++++++++++++++++++++++++++++++++++--
 drivers/scsi/hpsa.h     |    1 
 drivers/scsi/hpsa_cmd.h |    6 +-
 3 files changed, 135 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 7d0e226..e9642dd 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -861,6 +861,28 @@ static void set_ioaccel1_performant_mode(struct ctlr_info *h,
 					IOACCEL1_BUSADDR_CMDTYPE;
 }
 
+static void set_ioaccel2_tmf_performant_mode(struct ctlr_info *h,
+						struct CommandList *c,
+						int reply_queue)
+{
+	struct hpsa_tmf_struct *cp = (struct hpsa_tmf_struct *)
+		&h->ioaccel2_cmd_pool[c->cmdindex];
+
+	/* Tell the controller to post the reply to the queue for this
+	 * processor.  This seems to give the best I/O throughput.
+	 */
+	if (likely(reply_queue == DEFAULT_REPLY_QUEUE))
+		cp->reply_queue = smp_processor_id() % h->nreply_queues;
+	else
+		cp->reply_queue = reply_queue % h->nreply_queues;
+	/* Set the bits in the address sent down to include:
+	 *  - performant mode bit not used in ioaccel mode 2
+	 *  - pull count (bits 0-3)
+	 *  - command type isn't needed for ioaccel2
+	 */
+	c->busaddr |= h->ioaccel2_blockFetchTable[0];
+}
+
 static void set_ioaccel2_performant_mode(struct ctlr_info *h,
 						struct CommandList *c,
 						int reply_queue)
@@ -927,6 +949,10 @@ static void __enqueue_cmd_and_start_io(struct ctlr_info *h,
 		set_ioaccel2_performant_mode(h, c, reply_queue);
 		writel(c->busaddr, h->vaddr + IOACCEL2_INBOUND_POSTQ_32);
 		break;
+	case IOACCEL2_TMF:
+		set_ioaccel2_tmf_performant_mode(h, c, reply_queue);
+		writel(c->busaddr, h->vaddr + IOACCEL2_INBOUND_POSTQ_32);
+		break;
 	default:
 		set_performant_mode(h, c, reply_queue);
 		h->access.submit_command(h, c);
@@ -4954,6 +4980,47 @@ static int hpsa_send_abort(struct ctlr_info *h, unsigned char *scsi3addr,
 	return rc;
 }
 
+static void setup_ioaccel2_abort_cmd(struct CommandList *c, struct ctlr_info *h,
+	struct CommandList *command_to_abort, int reply_queue)
+{
+	struct io_accel2_cmd *c2 = &h->ioaccel2_cmd_pool[c->cmdindex];
+	struct hpsa_tmf_struct *ac = (struct hpsa_tmf_struct *) c2;
+	struct io_accel2_cmd *c2a =
+		&h->ioaccel2_cmd_pool[command_to_abort->cmdindex];
+	struct scsi_cmnd *scmd =
+		(struct scsi_cmnd *) command_to_abort->scsi_cmd;
+	struct hpsa_scsi_dev_t *dev = scmd->device->hostdata;
+
+	/*
+	 * We're overlaying struct hpsa_tmf_struct on top of something which
+	 * was allocated as a struct io_accel2_cmd, so we better be sure it
+	 * actually fits, and doesn't overrun the error info space.
+	 */
+	BUILD_BUG_ON(sizeof(struct hpsa_tmf_struct) >
+			sizeof(struct io_accel2_cmd));
+	BUG_ON(offsetof(struct io_accel2_cmd, error_data) <
+			offsetof(struct hpsa_tmf_struct, error_len) +
+				sizeof(ac->error_len));
+
+	c->cmd_type = IOACCEL2_TMF;
+	/* Adjust the DMA address to point to the accelerated command buffer */
+	c->busaddr = (u32) h->ioaccel2_cmd_pool_dhandle +
+				(c->cmdindex * sizeof(struct io_accel2_cmd));
+	BUG_ON(c->busaddr & 0x0000007F);
+
+	memset(ac, 0, sizeof(*c2)); /* yes this is correct */
+	ac->iu_type = IOACCEL2_IU_TMF_TYPE;
+	ac->reply_queue = reply_queue;
+	ac->tmf = IOACCEL2_TMF_ABORT;
+	ac->it_nexus = cpu_to_le32(dev->ioaccel_handle);
+	memset(ac->lun_id, 0, sizeof(ac->lun_id));
+	ac->tag = cpu_to_le64(c->cmdindex << DIRECT_LOOKUP_SHIFT);
+	ac->abort_tag = cpu_to_le64(le32_to_cpu(c2a->Tag));
+	ac->error_ptr = cpu_to_le64(c->busaddr +
+			offsetof(struct io_accel2_cmd, error_data));
+	ac->error_len = cpu_to_le32(sizeof(c2->error_data));
+}
+
 /* ioaccel2 path firmware cannot handle abort task requests.
  * Change abort requests to physical target reset, and send to the
  * address of the physical disk used for the ioaccel 2 command.
@@ -5032,17 +5099,72 @@ static int hpsa_send_reset_as_abort_ioaccel2(struct ctlr_info *h,
 	return rc; /* success */
 }
 
+static int hpsa_send_abort_ioaccel2(struct ctlr_info *h,
+	struct CommandList *abort, int reply_queue)
+{
+	int rc = IO_OK;
+	struct CommandList *c;
+	__le32 taglower, tagupper;
+	struct hpsa_scsi_dev_t *dev;
+	struct io_accel2_cmd *c2;
+
+	dev = abort->scsi_cmd->device->hostdata;
+	if (!dev->offload_enabled && !dev->hba_ioaccel_enabled)
+		return -1;
+
+	c = cmd_alloc(h);
+	setup_ioaccel2_abort_cmd(c, h, abort, reply_queue);
+	c2 = &h->ioaccel2_cmd_pool[c->cmdindex];
+	(void) hpsa_scsi_do_simple_cmd(h, c, reply_queue, NO_TIMEOUT);
+	hpsa_get_tag(h, abort, &taglower, &tagupper);
+	dev_dbg(&h->pdev->dev,
+		"%s: Tag:0x%08x:%08x: do_simple_cmd(ioaccel2 abort) completed.\n",
+		__func__, tagupper, taglower);
+	/* no unmap needed here because no data xfer. */
+
+	dev_dbg(&h->pdev->dev,
+		"%s: Tag:0x%08x:%08x: abort service response = 0x%02x.\n",
+		__func__, tagupper, taglower, c2->error_data.serv_response);
+	switch (c2->error_data.serv_response) {
+	case IOACCEL2_SERV_RESPONSE_TMF_COMPLETE:
+	case IOACCEL2_SERV_RESPONSE_TMF_SUCCESS:
+		rc = 0;
+		break;
+	case IOACCEL2_SERV_RESPONSE_TMF_REJECTED:
+	case IOACCEL2_SERV_RESPONSE_FAILURE:
+	case IOACCEL2_SERV_RESPONSE_TMF_WRONG_LUN:
+		rc = -1;
+		break;
+	default:
+		dev_warn(&h->pdev->dev,
+			"%s: Tag:0x%08x:%08x: unknown abort service response 0x%02x\n",
+			__func__, tagupper, taglower,
+			c2->error_data.serv_response);
+		rc = -1;
+	}
+	cmd_free(h, c);
+	dev_dbg(&h->pdev->dev, "%s: Tag:0x%08x:%08x: Finished.\n", __func__,
+		tagupper, taglower);
+	return rc;
+}
+
 static int hpsa_send_abort_both_ways(struct ctlr_info *h,
 	unsigned char *scsi3addr, struct CommandList *abort, int reply_queue)
 {
-	/* ioccelerator mode 2 commands should be aborted via the
+	/*
+	 * ioccelerator mode 2 commands should be aborted via the
 	 * accelerated path, since RAID path is unaware of these commands,
-	 * but underlying firmware can't handle abort TMF.
-	 * Change abort to physical device reset.
+	 * but not all underlying firmware can handle abort TMF.
+	 * Change abort to physical device reset when abort TMF is unsupported.
 	 */
-	if (abort->cmd_type == CMD_IOACCEL2)
-		return hpsa_send_reset_as_abort_ioaccel2(h, scsi3addr,
+	if (abort->cmd_type == CMD_IOACCEL2) {
+		if (HPSATMF_IOACCEL_ENABLED & h->TMFSupportFlags)
+			return hpsa_send_abort_ioaccel2(h, abort,
+						reply_queue);
+		else
+			return hpsa_send_reset_as_abort_ioaccel2(h, scsi3addr,
 							abort, reply_queue);
+	}
 	return hpsa_send_abort(h, scsi3addr, abort, reply_queue);
 }
 
@@ -5927,7 +6049,7 @@ static inline void finish_cmd(struct CommandList *c)
 	if (likely(c->cmd_type == CMD_IOACCEL1 || c->cmd_type == CMD_SCSI
 			|| c->cmd_type == CMD_IOACCEL2))
 		complete_scsi_command(c);
-	else if (c->cmd_type == CMD_IOCTL_PEND)
+	else if (c->cmd_type == CMD_IOCTL_PEND || c->cmd_type == IOACCEL2_TMF)
 		complete(c->waiting);
 }
 
@@ -6714,6 +6836,8 @@ static void hpsa_find_board_params(struct ctlr_info *h)
 		dev_warn(&h->pdev->dev, "Physical aborts not supported\n");
 	if (!(HPSATMF_LOG_TASK_ABORT & h->TMFSupportFlags))
 		dev_warn(&h->pdev->dev, "Logical aborts not supported\n");
+	if (!(HPSATMF_IOACCEL_ENABLED & h->TMFSupportFlags))
+		dev_warn(&h->pdev->dev, "HP SSD Smart Path aborts not supported\n");
 }
 
 static inline bool hpsa_CISS_signature_present(struct ctlr_info *h)
diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
index 3acacf6..28b5d79 100644
--- a/drivers/scsi/hpsa.h
+++ b/drivers/scsi/hpsa.h
@@ -231,6 +231,7 @@ struct ctlr_info {
 #define HPSATMF_PHYS_QRY_TASK   (1 << 7)
 #define HPSATMF_PHYS_QRY_TSET   (1 << 8)
 #define HPSATMF_PHYS_QRY_ASYNC  (1 << 9)
+#define HPSATMF_IOACCEL_ENABLED (1 << 15)
 #define HPSATMF_MASK_SUPPORTED  (1 << 16)
 #define HPSATMF_LOG_LUN_RESET   (1 << 17)
 #define HPSATMF_LOG_NEX_RESET   (1 << 18)
diff --git a/drivers/scsi/hpsa_cmd.h b/drivers/scsi/hpsa_cmd.h
index cecb62b..3719592 100644
--- a/drivers/scsi/hpsa_cmd.h
+++ b/drivers/scsi/hpsa_cmd.h
@@ -396,6 +396,7 @@ struct ErrorInfo {
 #define CMD_SCSI	0x03
 #define CMD_IOACCEL1	0x04
 #define CMD_IOACCEL2	0x05
+#define IOACCEL2_TMF	0x06
 
 #define DIRECT_LOOKUP_SHIFT 4
 #define DIRECT_LOOKUP_MASK (~((1 << DIRECT_LOOKUP_SHIFT) - 1))
@@ -590,6 +591,7 @@ struct io_accel2_cmd {
 #define IOACCEL2_DIR_NO_DATA	0x00
 #define IOACCEL2_DIR_DATA_IN	0x01
 #define IOACCEL2_DIR_DATA_OUT	0x02
+#define IOACCEL2_TMF_ABORT	0x01
 /*
  * SCSI Task Management Request format for Accelerator Mode 2
  */
@@ -598,13 +600,13 @@ struct hpsa_tmf_struct {
 	u8 reply_queue;		/* Reply Queue ID */
 	u8 tmf;			/* Task Management Function */
 	u8 reserved1;		/* byte 3 Reserved */
-	u32 it_nexus;		/* SCSI I-T Nexus */
+	__le32 it_nexus;	/* SCSI I-T Nexus */
 	u8 lun_id[8];		/* LUN ID for TMF request */
 	__le64 tag;		/* cciss tag associated w/ request */
 	__le64 abort_tag;	/* cciss tag of SCSI cmd or TMF to abort */
 	__le64 error_ptr;		/* Error Pointer */
 	__le32 error_len;		/* Error Length */
-};
+} __aligned(IOACCEL2_COMMANDLIST_ALIGNMENT);
 
 /* Configuration Table Structure */
 struct HostWrite {



* [PATCH v3 27/42] hpsa: use helper routines for finishing commands
From: Don Brace @ 2015-03-17 20:04 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Webb Scales <webbnh@hp.com>

Clean up command completions by factoring the repeated free-command-and-call-scsi_done and retry/resubmit sequences into helper routines (hpsa_cmd_free_and_done() and hpsa_retry_cmd()).

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Webb Scales <webbnh@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |   78 ++++++++++++++++++++-------------------------------
 1 file changed, 31 insertions(+), 47 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index e9642dd..b0949f7 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -2010,6 +2010,19 @@ static int handle_ioaccel_mode2_error(struct ctlr_info *h,
 	return retry;	/* retry on raid path? */
 }
 
+static void hpsa_cmd_free_and_done(struct ctlr_info *h,
+		struct CommandList *c, struct scsi_cmnd *cmd)
+{
+	cmd_free(h, c);
+	cmd->scsi_done(cmd);
+}
+
+static void hpsa_retry_cmd(struct ctlr_info *h, struct CommandList *c)
+{
+	INIT_WORK(&c->work, hpsa_command_resubmit_worker);
+	queue_work_on(raw_smp_processor_id(), h->resubmit_wq, &c->work);
+}
+
 static void process_ioaccel2_completion(struct ctlr_info *h,
 		struct CommandList *c, struct scsi_cmnd *cmd,
 		struct hpsa_scsi_dev_t *dev)
@@ -2018,13 +2031,11 @@ static void process_ioaccel2_completion(struct ctlr_info *h,
 
 	/* check for good status */
 	if (likely(c2->error_data.serv_response == 0 &&
-			c2->error_data.status == 0)) {
-		cmd_free(h, c);
-		cmd->scsi_done(cmd);
-		return;
-	}
+			c2->error_data.status == 0))
+		return hpsa_cmd_free_and_done(h, c, cmd);
 
-	/* Any RAID offload error results in retry which will use
+	/*
+	 * Any RAID offload error results in retry which will use
 	 * the normal I/O path so the controller can handle whatever's
 	 * wrong.
 	 */
@@ -2034,19 +2045,14 @@ static void process_ioaccel2_completion(struct ctlr_info *h,
 		if (c2->error_data.status ==
 			IOACCEL2_STATUS_SR_IOACCEL_DISABLED)
 			dev->offload_enabled = 0;
-		goto retry_cmd;
+
+		return hpsa_retry_cmd(h, c);
 	}
 
 	if (handle_ioaccel_mode2_error(h, c, cmd, c2))
-		goto retry_cmd;
-
-	cmd_free(h, c);
-	cmd->scsi_done(cmd);
-	return;
+		return hpsa_retry_cmd(h, c);
 
-retry_cmd:
-	INIT_WORK(&c->work, hpsa_command_resubmit_worker);
-	queue_work_on(raw_smp_processor_id(), h->resubmit_wq, &c->work);
+	return hpsa_cmd_free_and_done(h, c, cmd);
 }
 
 /* Returns 0 on success, < 0 otherwise. */
@@ -2119,22 +2125,15 @@ static void complete_scsi_command(struct CommandList *cp)
 	if (unlikely(ei->CommandStatus == CMD_CTLR_LOCKUP)) {
 		/* DID_NO_CONNECT will prevent a retry */
 		cmd->result = DID_NO_CONNECT << 16;
-		cmd_free(h, cp);
-		cmd->scsi_done(cmd);
-		return;
+		return hpsa_cmd_free_and_done(h, cp, cmd);
 	}
 
 	if (cp->cmd_type == CMD_IOACCEL2)
 		return process_ioaccel2_completion(h, cp, cmd, dev);
 
 	scsi_set_resid(cmd, ei->ResidualCnt);
-	if (ei->CommandStatus == 0) {
-		if (cp->cmd_type == CMD_IOACCEL1)
-			atomic_dec(&cp->phys_disk->ioaccel_cmds_out);
-		cmd_free(h, cp);
-		cmd->scsi_done(cmd);
-		return;
-	}
+	if (ei->CommandStatus == 0)
+		return hpsa_cmd_free_and_done(h, cp, cmd);
 
 	/* For I/O accelerator commands, copy over some fields to the normal
 	 * CISS header used below for error handling.
@@ -2156,10 +2155,7 @@ static void complete_scsi_command(struct CommandList *cp)
 		if (is_logical_dev_addr_mode(dev->scsi3addr)) {
 			if (ei->CommandStatus == CMD_IOACCEL_DISABLED)
 				dev->offload_enabled = 0;
-			INIT_WORK(&cp->work, hpsa_command_resubmit_worker);
-			queue_work_on(raw_smp_processor_id(),
-					h->resubmit_wq, &cp->work);
-			return;
+			return hpsa_retry_cmd(h, cp);
 		}
 	}
 
@@ -2290,8 +2286,8 @@ static void complete_scsi_command(struct CommandList *cp)
 		dev_warn(&h->pdev->dev, "cp %p returned unknown status %x\n",
 				cp, ei->CommandStatus);
 	}
-	cmd_free(h, cp);
-	cmd->scsi_done(cmd);
+
+	return hpsa_cmd_free_and_done(h, cp, cmd);
 }
 
 static void hpsa_pci_unmap(struct pci_dev *pdev,
@@ -4544,16 +4540,13 @@ static void hpsa_command_resubmit_worker(struct work_struct *work)
 {
 	struct scsi_cmnd *cmd;
 	struct hpsa_scsi_dev_t *dev;
-	struct CommandList *c =
-			container_of(work, struct CommandList, work);
+	struct CommandList *c = container_of(work, struct CommandList, work);
 
 	cmd = c->scsi_cmd;
 	dev = cmd->device->hostdata;
 	if (!dev) {
 		cmd->result = DID_NO_CONNECT << 16;
-		cmd_free(c->h, c);
-		cmd->scsi_done(cmd);
-		return;
+		return hpsa_cmd_free_and_done(c->h, c, cmd);
 	}
 	if (c->cmd_type == CMD_IOACCEL2) {
 		struct ctlr_info *h = c->h;
@@ -4572,12 +4565,7 @@ static void hpsa_command_resubmit_worker(struct work_struct *work)
 				 * then get SCSI_MLQUEUE_HOST_BUSY.
 				 */
 				cmd->result = DID_IMM_RETRY << 16;
-				cmd->scsi_done(cmd);
-				cmd_free(h, c);	/* FIX-ME:  on merge, change
-						 * to cmd_tagged_free() and
-						 * ultimately to
-						 * hpsa_cmd_free_and_done(). */
-				return;
+				return hpsa_cmd_free_and_done(h, c, cmd);
 			}
 			/* else, fall thru and resubmit down CISS path */
 		}
@@ -4641,9 +4629,7 @@ static int hpsa_scsi_queue_command(struct Scsi_Host *sh, struct scsi_cmnd *cmd)
 		if (rc == 0)
 			return 0;
 		if (rc == SCSI_MLQUEUE_HOST_BUSY) {
-			cmd_free(h, c);	/* FIX-ME:  on merge, change to
-					 * cmd_tagged_free(), and ultimately
-					 * to hpsa_cmd_resolve_and_free(). */
+			cmd_free(h, c);
 			return SCSI_MLQUEUE_HOST_BUSY;
 		}
 	}
@@ -7761,8 +7747,6 @@ static void hpsa_flush_cache(struct ctlr_info *h)
 	struct CommandList *c;
 	int rc;
 
-	/* Don't bother trying to flush the cache if locked up */
-	/* FIXME not necessary if do_simple_cmd does the check */
 	if (unlikely(lockup_detected(h)))
 		return;
 	flush_buf = kzalloc(4, GFP_KERNEL);



* [PATCH v3 28/42] hpsa: don't return abort request until target is complete
From: Don Brace @ 2015-03-17 20:04 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Webb Scales <webbnh@hp.com>

Don't return from the abort request until the target command is complete.
Mark outstanding commands which have a pending abort, and do not send them
to the host if we can avoid it.

If the current command has been aborted, do not call the SCSI command
completion routine from the I/O path: when the abort returns successfully,
the SCSI mid-layer will handle the completion implicitly.

The following race was possible in theory.

1. LLD is requested to abort a scsi command
2. scsi command completes
3. The struct CommandList associated with 2 is made available.
4. new io request to LLD to another LUN re-uses struct CommandList
5. abort handler follows scsi_cmnd->host_scribble and
   finds struct CommandList and tries to abort it.

Now we have aborted the wrong command.

Fix by resetting the scsi_cmd field of struct CommandList
upon completion and making the abort handler check that
the scsi_cmd pointer in the CommandList struct matches the
scsi_cmnd that it has been asked to abort.

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Webb Scales <webbnh@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c     |  120 +++++++++++++++++++++++++++++++++++------------
 drivers/scsi/hpsa.h     |    1 
 drivers/scsi/hpsa_cmd.h |    2 +
 3 files changed, 93 insertions(+), 30 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index b0949f7..1cae336 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -195,6 +195,10 @@ static struct board_type products[] = {
 	{0xFFFF103C, "Unknown Smart Array", &SA5_access},
 };
 
+#define SCSI_CMD_BUSY ((struct scsi_cmnd *)&hpsa_cmd_busy)
+static const struct scsi_cmnd hpsa_cmd_busy;
+#define SCSI_CMD_IDLE ((struct scsi_cmnd *)&hpsa_cmd_idle)
+static const struct scsi_cmnd hpsa_cmd_idle;
 static int number_of_controllers;
 
 static irqreturn_t do_hpsa_intr_intx(int irq, void *dev_id);
@@ -270,6 +274,11 @@ static inline struct ctlr_info *shost_to_hba(struct Scsi_Host *sh)
 	return (struct ctlr_info *) *priv;
 }
 
+static inline bool hpsa_is_cmd_idle(struct CommandList *c)
+{
+	return c->scsi_cmd == SCSI_CMD_IDLE;
+}
+
 /* extract sense key, asc, and ascq from sense data.  -1 means invalid. */
 static void decode_sense_data(const u8 *sense_data, int sense_data_len,
 			int *sense_key, int *asc, int *ascq)
@@ -959,9 +968,11 @@ static void __enqueue_cmd_and_start_io(struct ctlr_info *h,
 	}
 }
 
-static void enqueue_cmd_and_start_io(struct ctlr_info *h,
-					struct CommandList *c)
+static void enqueue_cmd_and_start_io(struct ctlr_info *h, struct CommandList *c)
 {
+	if (unlikely(c->abort_pending))
+		return finish_cmd(c);
+
 	__enqueue_cmd_and_start_io(h, c, DEFAULT_REPLY_QUEUE);
 }
 
@@ -2010,9 +2021,36 @@ static int handle_ioaccel_mode2_error(struct ctlr_info *h,
 	return retry;	/* retry on raid path? */
 }
 
+static void hpsa_cmd_resolve_events(struct ctlr_info *h,
+		struct CommandList *c)
+{
+	/*
+	 * Prevent the following race in the abort handler:
+	 *
+	 * 1. LLD is requested to abort a SCSI command
+	 * 2. The SCSI command completes
+	 * 3. The struct CommandList associated with step 2 is made available
+	 * 4. New I/O request to LLD to another LUN re-uses struct CommandList
+	 * 5. Abort handler follows scsi_cmnd->host_scribble and
+	 *    finds struct CommandList and tries to abort it
+	 * Now we have aborted the wrong command.
+	 *
+	 * Clear c->scsi_cmd here so that the abort handler will know this
+	 * command has completed.  Then, check to see if the abort handler is
+	 * waiting for this command, and, if so, wake it.
+	 */
+	c->scsi_cmd = SCSI_CMD_IDLE;
+	mb(); /* Ensure c->scsi_cmd is set to SCSI_CMD_IDLE */
+	if (c->abort_pending) {
+		c->abort_pending = false;
+		wake_up_all(&h->abort_sync_wait_queue);
+	}
+}
+
 static void hpsa_cmd_free_and_done(struct ctlr_info *h,
 		struct CommandList *c, struct scsi_cmnd *cmd)
 {
+	hpsa_cmd_resolve_events(h, c);
 	cmd_free(h, c);
 	cmd->scsi_done(cmd);
 }
@@ -2023,6 +2061,21 @@ static void hpsa_retry_cmd(struct ctlr_info *h, struct CommandList *c)
 	queue_work_on(raw_smp_processor_id(), h->resubmit_wq, &c->work);
 }
 
+static void hpsa_set_scsi_cmd_aborted(struct scsi_cmnd *cmd)
+{
+	cmd->result = DID_ABORT << 16;
+}
+
+static void hpsa_cmd_abort_and_free(struct ctlr_info *h, struct CommandList *c,
+				    struct scsi_cmnd *cmd)
+{
+	hpsa_set_scsi_cmd_aborted(cmd);
+	dev_warn(&h->pdev->dev, "CDB %16phN was aborted with status 0x%x\n",
+			 c->Request.CDB, c->err_info->ScsiStatus);
+	hpsa_cmd_resolve_events(h, c);
+	cmd_free(h, c);		/* FIX-ME:  change to cmd_tagged_free(h, c) */
+}
+
 static void process_ioaccel2_completion(struct ctlr_info *h,
 		struct CommandList *c, struct scsi_cmnd *cmd,
 		struct hpsa_scsi_dev_t *dev)
@@ -2034,6 +2087,10 @@ static void process_ioaccel2_completion(struct ctlr_info *h,
 			c2->error_data.status == 0))
 		return hpsa_cmd_free_and_done(h, c, cmd);
 
+	/* don't requeue a command which is being aborted */
+	if (unlikely(c->abort_pending))
+		return hpsa_cmd_abort_and_free(h, c, cmd);
+
 	/*
 	 * Any RAID offload error results in retry which will use
 	 * the normal I/O path so the controller can handle whatever's
@@ -2155,10 +2212,14 @@ static void complete_scsi_command(struct CommandList *cp)
 		if (is_logical_dev_addr_mode(dev->scsi3addr)) {
 			if (ei->CommandStatus == CMD_IOACCEL_DISABLED)
 				dev->offload_enabled = 0;
-			return hpsa_retry_cmd(h, cp);
+			if (!cp->abort_pending)
+				return hpsa_retry_cmd(h, cp);
 		}
 	}
 
+	if (cp->abort_pending)
+		ei->CommandStatus = CMD_ABORTED;
+
 	/* an error has occurred */
 	switch (ei->CommandStatus) {
 
@@ -2246,10 +2307,8 @@ static void complete_scsi_command(struct CommandList *cp)
 			cp->Request.CDB);
 		break;
 	case CMD_ABORTED:
-		cmd->result = DID_ABORT << 16;
-		dev_warn(&h->pdev->dev, "CDB %16phN was aborted with status 0x%x\n",
-				cp->Request.CDB, ei->ScsiStatus);
-		break;
+		/* Return now to avoid calling scsi_done(). */
+		return hpsa_cmd_abort_and_free(h, cp, cmd);
 	case CMD_ABORT_FAILED:
 		cmd->result = DID_ERROR << 16;
 		dev_warn(&h->pdev->dev, "CDB %16phN : abort failed\n",
@@ -4485,6 +4544,7 @@ static void hpsa_cmd_init(struct ctlr_info *h, int index,
 	c->ErrDesc.Addr = cpu_to_le64((u64) err_dma_handle);
 	c->ErrDesc.Len = cpu_to_le32((u32) sizeof(*c->err_info));
 	c->h = h;
+	c->scsi_cmd = SCSI_CMD_IDLE;
 }
 
 static void hpsa_preinitialize_commands(struct ctlr_info *h)
@@ -4548,6 +4608,8 @@ static void hpsa_command_resubmit_worker(struct work_struct *work)
 		cmd->result = DID_NO_CONNECT << 16;
 		return hpsa_cmd_free_and_done(c->h, c, cmd);
 	}
+	if (c->abort_pending)
+		return hpsa_cmd_abort_and_free(c->h, c, cmd);
 	if (c->cmd_type == CMD_IOACCEL2) {
 		struct ctlr_info *h = c->h;
 		struct io_accel2_cmd *c2 = &h->ioaccel2_cmd_pool[c->cmdindex];
@@ -4973,8 +5035,7 @@ static void setup_ioaccel2_abort_cmd(struct CommandList *c, struct ctlr_info *h,
 	struct hpsa_tmf_struct *ac = (struct hpsa_tmf_struct *) c2;
 	struct io_accel2_cmd *c2a =
 		&h->ioaccel2_cmd_pool[command_to_abort->cmdindex];
-	struct scsi_cmnd *scmd =
-		(struct scsi_cmnd *) command_to_abort->scsi_cmd;
+	struct scsi_cmnd *scmd = command_to_abort->scsi_cmd;
 	struct hpsa_scsi_dev_t *dev = scmd->device->hostdata;
 
 	/*
@@ -4989,6 +5050,8 @@ static void setup_ioaccel2_abort_cmd(struct CommandList *c, struct ctlr_info *h,
 				sizeof(ac->error_len));
 
 	c->cmd_type = IOACCEL2_TMF;
+	c->scsi_cmd = SCSI_CMD_BUSY;
+
 	/* Adjust the DMA address to point to the accelerated command buffer */
 	c->busaddr = (u32) h->ioaccel2_cmd_pool_dhandle +
 				(c->cmdindex * sizeof(struct io_accel2_cmd));
@@ -5182,7 +5245,7 @@ static inline int wait_for_available_abort_cmd(struct ctlr_info *h)
 static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
 {
 
-	int i, rc;
+	int rc;
 	struct ctlr_info *h;
 	struct hpsa_scsi_dev_t *dev;
 	struct CommandList *abort; /* pointer to command to be aborted */
@@ -5256,6 +5319,16 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
 		return FAILED;
 	}
 
+	/*
+	 * Check that we're aborting the right command.
+	 * It's possible the CommandList already completed and got re-used.
+	 */
+	if (abort->scsi_cmd != sc) {
+		cmd_free(h, abort);
+		return SUCCESS;
+	}
+
+	abort->abort_pending = true;
 	hpsa_get_tag(h, abort, &taglower, &tagupper);
 	reply_queue = hpsa_extract_reply_queue(h, abort);
 	ml += sprintf(msg+ml, "Tag:0x%08x:%08x ", tagupper, taglower);
@@ -5288,27 +5361,10 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
 		return FAILED;
 	}
 	dev_info(&h->pdev->dev, "%s SENT, SUCCESS\n", msg);
-
-	/*
-	 * If the abort(s) above completed and actually aborted the
-	 * command, then the command to be aborted should already be
-	 * completed.  If not, wait around a bit more to see if they
-	 * manage to complete normally.
-	 */
-#define ABORT_COMPLETE_WAIT_SECS 30
-	for (i = 0; i < ABORT_COMPLETE_WAIT_SECS * 10; i++) {
-		refcount = atomic_read(&abort->refcount);
-		if (refcount < 2) {
-			cmd_free(h, abort);
-			return SUCCESS;
-		} else {
-			msleep(100);
-		}
-	}
-	dev_warn(&h->pdev->dev, "%s FAILED. Aborted command has not completed after %d seconds.\n",
-		msg, ABORT_COMPLETE_WAIT_SECS);
+	wait_event(h->abort_sync_wait_queue,
+		   abort->scsi_cmd != sc || lockup_detected(h));
 	cmd_free(h, abort);
-	return FAILED;
+	return !lockup_detected(h) ? SUCCESS : FAILED;
 }
 
 /*
@@ -5554,6 +5610,7 @@ static int hpsa_passthru_ioctl(struct ctlr_info *h, void __user *argp)
 
 	/* Fill in the command type */
 	c->cmd_type = CMD_IOCTL_PEND;
+	c->scsi_cmd = SCSI_CMD_BUSY;
 	/* Fill in Command Header */
 	c->Header.ReplyQueue = 0; /* unused in simple mode */
 	if (iocommand.buf_size > 0) {	/* buffer to fill */
@@ -5687,6 +5744,7 @@ static int hpsa_big_passthru_ioctl(struct ctlr_info *h, void __user *argp)
 	c = cmd_alloc(h);
 
 	c->cmd_type = CMD_IOCTL_PEND;
+	c->scsi_cmd = SCSI_CMD_BUSY;
 	c->Header.ReplyQueue = 0;
 	c->Header.SGList = (u8) sg_used;
 	c->Header.SGTotal = cpu_to_le16(sg_used);
@@ -5829,6 +5887,7 @@ static int fill_cmd(struct CommandList *c, u8 cmd, struct ctlr_info *h,
 	u64 tag; /* for commands to be aborted */
 
 	c->cmd_type = CMD_IOCTL_PEND;
+	c->scsi_cmd = SCSI_CMD_BUSY;
 	c->Header.ReplyQueue = 0;
 	if (buff != NULL && size > 0) {
 		c->Header.SGList = 1;
@@ -7622,6 +7681,7 @@ reinit_after_soft_reset:
 		goto clean5;	/* cmd, irq, pci, lockup, wq/aer/h */
 	init_waitqueue_head(&h->scan_wait_queue);
 	init_waitqueue_head(&h->abort_cmd_wait_queue);
+	init_waitqueue_head(&h->abort_sync_wait_queue);
 	h->scan_finished = 1; /* no scan currently in progress */
 
 	pci_set_drvdata(pdev, h);
diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
index 28b5d79..7cb8586 100644
--- a/drivers/scsi/hpsa.h
+++ b/drivers/scsi/hpsa.h
@@ -266,6 +266,7 @@ struct ctlr_info {
 	struct workqueue_struct *rescan_ctlr_wq;
 	atomic_t abort_cmds_available;
 	wait_queue_head_t abort_cmd_wait_queue;
+	wait_queue_head_t abort_sync_wait_queue;
 };
 
 struct offline_device_entry {
diff --git a/drivers/scsi/hpsa_cmd.h b/drivers/scsi/hpsa_cmd.h
index 3719592..f986402 100644
--- a/drivers/scsi/hpsa_cmd.h
+++ b/drivers/scsi/hpsa_cmd.h
@@ -439,6 +439,8 @@ struct CommandList {
 	 * not used.
 	 */
 	struct hpsa_scsi_dev_t *phys_disk;
+
+	int abort_pending;
 	atomic_t refcount; /* Must be last to avoid memset in hpsa_cmd_init() */
 } __aligned(COMMANDLIST_ALIGNMENT);
 



* [PATCH v3 29/42] hpsa: refactor and rework support for sending TEST_UNIT_READY
From: Don Brace @ 2015-03-17 20:05 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Webb Scales <webbnh@hp.com>

Factor out the code which sends the TEST_UNIT_READY from
wait_for_device_to_become_ready() into its own function.

Move the code which waits for the TEST_UNIT_READY from
wait_for_device_to_become_ready() into its own function.

If a logical drive has failed, resetting it will ensure
outstanding commands are completed, but polling it with
TURs after the reset will not work because the TURs will
never report good status.  So successful TUR should not
be a condition of success for the device reset error
handler.

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Webb Scales <webbnh@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |  117 ++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 87 insertions(+), 30 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 1cae336..3751df3 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -4822,51 +4822,108 @@ static int hpsa_register_scsi(struct ctlr_info *h)
 	return -ENOMEM;
 }
 
-static int wait_for_device_to_become_ready(struct ctlr_info *h,
-	unsigned char lunaddr[])
+/*
+ * Send a TEST_UNIT_READY command to the specified LUN using the specified
+ * reply queue; returns zero if the unit is ready, and non-zero otherwise.
+ */
+static int hpsa_send_test_unit_ready(struct ctlr_info *h,
+				struct CommandList *c, unsigned char lunaddr[],
+				int reply_queue)
+{
+	int rc;
+
+	/* Send the Test Unit Ready, fill_cmd can't fail, no mapping */
+	(void) fill_cmd(c, TEST_UNIT_READY, h,
+			NULL, 0, 0, lunaddr, TYPE_CMD);
+	rc = hpsa_scsi_do_simple_cmd(h, c, reply_queue, NO_TIMEOUT);
+	if (rc)
+		return rc;
+	/* no unmap needed here because no data xfer. */
+
+	/* Check if the unit is already ready. */
+	if (c->err_info->CommandStatus == CMD_SUCCESS)
+		return 0;
+
+	/*
+	 * The first command sent after reset will receive "unit attention" to
+	 * indicate that the LUN has been reset...this is actually what we're
+	 * looking for (but, success is good too).
+	 */
+	if (c->err_info->CommandStatus == CMD_TARGET_STATUS &&
+		c->err_info->ScsiStatus == SAM_STAT_CHECK_CONDITION &&
+			(c->err_info->SenseInfo[2] == NO_SENSE ||
+			 c->err_info->SenseInfo[2] == UNIT_ATTENTION))
+		return 0;
+
+	return 1;
+}
+
+/*
+ * Wait for a TEST_UNIT_READY command to complete, retrying as necessary;
+ * returns zero when the unit is ready, and non-zero when giving up.
+ */
+static int hpsa_wait_for_test_unit_ready(struct ctlr_info *h,
+				struct CommandList *c,
+				unsigned char lunaddr[], int reply_queue)
 {
 	int rc;
 	int count = 0;
 	int waittime = 1; /* seconds */
-	struct CommandList *c;
-
-	c = cmd_alloc(h);
 
 	/* Send test unit ready until device ready, or give up. */
-	while (count < HPSA_TUR_RETRY_LIMIT) {
+	for (count = 0; count < HPSA_TUR_RETRY_LIMIT; count++) {
 
-		/* Wait for a bit.  do this first, because if we send
+		/*
+		 * Wait for a bit.  do this first, because if we send
 		 * the TUR right away, the reset will just abort it.
 		 */
 		msleep(1000 * waittime);
-		count++;
-		rc = 0; /* Device ready. */
+
+		rc = hpsa_send_test_unit_ready(h, c, lunaddr, reply_queue);
+		if (!rc)
+			break;
 
 		/* Increase wait time with each try, up to a point. */
 		if (waittime < HPSA_MAX_WAIT_INTERVAL_SECS)
-			waittime = waittime * 2;
+			waittime *= 2;
 
-		/* Send the Test Unit Ready, fill_cmd can't fail, no mapping */
-		(void) fill_cmd(c, TEST_UNIT_READY, h,
-				NULL, 0, 0, lunaddr, TYPE_CMD);
-		rc = hpsa_scsi_do_simple_cmd(h, c, DEFAULT_REPLY_QUEUE,
-						NO_TIMEOUT);
-		if (rc)
-			goto do_it_again;
-		/* no unmap needed here because no data xfer. */
+		dev_warn(&h->pdev->dev,
+			 "waiting %d secs for device to become ready.\n",
+			 waittime);
+	}
 
-		if (c->err_info->CommandStatus == CMD_SUCCESS)
-			break;
+	return rc;
+}
 
-		if (c->err_info->CommandStatus == CMD_TARGET_STATUS &&
-			c->err_info->ScsiStatus == SAM_STAT_CHECK_CONDITION &&
-			(c->err_info->SenseInfo[2] == NO_SENSE ||
-			c->err_info->SenseInfo[2] == UNIT_ATTENTION))
+static int wait_for_device_to_become_ready(struct ctlr_info *h,
+					   unsigned char lunaddr[],
+					   int reply_queue)
+{
+	int first_queue;
+	int last_queue;
+	int rq;
+	int rc = 0;
+	struct CommandList *c;
+
+	c = cmd_alloc(h);
+
+	/*
+	 * If no specific reply queue was requested, then send the TUR
+	 * repeatedly, requesting a reply on each reply queue; otherwise execute
+	 * the loop exactly once using only the specified queue.
+	 */
+	if (reply_queue == DEFAULT_REPLY_QUEUE) {
+		first_queue = 0;
+		last_queue = h->nreply_queues - 1;
+	} else {
+		first_queue = reply_queue;
+		last_queue = reply_queue;
+	}
+
+	for (rq = first_queue; rq <= last_queue; rq++) {
+		rc = hpsa_wait_for_test_unit_ready(h, c, lunaddr, rq);
+		if (rc)
 			break;
-do_it_again:
-		dev_warn(&h->pdev->dev, "waiting %d secs "
-			"for device to become ready.\n", waittime);
-		rc = 1; /* device not ready. */
 	}
 
 	if (rc)
@@ -4935,7 +4992,7 @@ static int hpsa_eh_device_reset_handler(struct scsi_cmnd *scsicmd)
 	/* send a reset to the SCSI LUN which the command was sent to */
 	rc = hpsa_send_reset(h, dev->scsi3addr, HPSA_RESET_TYPE_LUN,
 			     DEFAULT_REPLY_QUEUE);
-	if (rc == 0 && wait_for_device_to_become_ready(h, dev->scsi3addr) == 0)
+	if (rc == 0)
 		return SUCCESS;
 
 	dev_warn(&h->pdev->dev,
@@ -5131,7 +5188,7 @@ static int hpsa_send_reset_as_abort_ioaccel2(struct ctlr_info *h,
 	}
 
 	/* wait for device to recover */
-	if (wait_for_device_to_become_ready(h, psa) != 0) {
+	if (wait_for_device_to_become_ready(h, psa, reply_queue) != 0) {
 		dev_warn(&h->pdev->dev,
 			"Reset as abort: Failed: Device never recovered from reset: 0x%02x%02x%02x%02x%02x%02x%02x%02x\n",
 			psa[0], psa[1], psa[2], psa[3],



* [PATCH v3 30/42] hpsa: performance tweak for hpsa_scatter_gather()
From: Don Brace @ 2015-03-17 20:05 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Webb Scales <webbnh@hp.com>

Divide the loop in hpsa_scatter_gather() into two, one for the initial SG list
and a second one for the chained list, if any.  This allows the conditional
check which resets the indices for the chained list to be performed outside
the loop instead of being done on every iteration inside the loop.

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Webb Scales <webbnh@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |   39 ++++++++++++++++++++++++++++-----------
 1 file changed, 28 insertions(+), 11 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 3751df3..b533b8e 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -3709,7 +3709,7 @@ static int hpsa_scatter_gather(struct ctlr_info *h,
 		struct scsi_cmnd *cmd)
 {
 	struct scatterlist *sg;
-	int use_sg, i, sg_index, chained;
+	int use_sg, i, sg_limit, chained, last_sg;
 	struct SGDescriptor *curr_sg;
 
 	BUG_ON(scsi_sg_count(cmd) > h->maxsgentries);
@@ -3721,22 +3721,39 @@ static int hpsa_scatter_gather(struct ctlr_info *h,
 	if (!use_sg)
 		goto sglist_finished;
 
+	/*
+	 * If the number of entries is greater than the max for a single list,
+	 * then we have a chained list; we will set up all but one entry in the
+	 * first list (the last entry is saved for link information);
+	 * otherwise, we don't have a chained list and we'll set up at each of
+	 * the entries in the one list.
+	 */
 	curr_sg = cp->SG;
-	chained = 0;
-	sg_index = 0;
-	scsi_for_each_sg(cmd, sg, use_sg, i) {
-		if (i == h->max_cmd_sg_entries - 1 &&
-			use_sg > h->max_cmd_sg_entries) {
-			chained = 1;
-			curr_sg = h->cmd_sg_list[cp->cmdindex];
-			sg_index = 0;
-		}
+	chained = use_sg > h->max_cmd_sg_entries;
+	sg_limit = chained ? h->max_cmd_sg_entries - 1 : use_sg;
+	last_sg = scsi_sg_count(cmd) - 1;
+	scsi_for_each_sg(cmd, sg, sg_limit, i) {
 		hpsa_set_sg_descriptor(curr_sg, sg);
 		curr_sg++;
 	}
 
+	if (chained) {
+		/*
+		 * Continue with the chained list.  Set curr_sg to the chained
+		 * list.  Modify the limit to the total count less the entries
+		 * we've already set up.  Resume the scan at the list entry
+		 * where the previous loop left off.
+		 */
+		curr_sg = h->cmd_sg_list[cp->cmdindex];
+		sg_limit = use_sg - sg_limit;
+		for_each_sg(sg, sg, sg_limit, i) {
+			hpsa_set_sg_descriptor(curr_sg, sg);
+			curr_sg++;
+		}
+	}
+
 	/* Back the pointer up to the last entry and mark it as "last". */
-	(--curr_sg)->Ext = cpu_to_le32(HPSA_SG_LAST);
+	(curr_sg - 1)->Ext = cpu_to_le32(HPSA_SG_LAST);
 
 	if (use_sg + chained > h->maxSG)
 		h->maxSG = use_sg + chained;



* [PATCH v3 31/42] hpsa: call pci_release_regions after pci_disable_device
From: Don Brace @ 2015-03-17 20:05 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Robert Elliott <elliott@hp.com>

Despite the fact that PCI devices are enabled in this order:
    1. pci_enable_device
    2. pci_request_regions

    Documentation/PCI/pci.txt specifies that they be undone
    in this order:
    1. pci_disable_device
    2. pci_release_regions

    Tested by injecting an error in the call to pci_enable_device
    in hpsa_init_one -> hpsa_pci_init:
    [    9.095001] hpsa 0000:04:00.0: failed to enable PCI device
    [    9.095005] hpsa: probe of 0000:04:00.0 failed with error -22
    (-22 is -EINVAL)
    and then in the call pci_request_regions:
    [    9.178623] hpsa 0000:04:00.0: failed to obtain PCI resources
    [    9.178671] hpsa: probe of 0000:04:00.0 failed with error -16
    (-16 is -EBUSY)

    and then by adding
        reset_devices
    to the kernel command line and inject errors into the two
    calls to pci_enable_device and the call to pci_request_regions
    in hpsa_init_one -> hpsa_init_reset_devices.

    (inject on 6th call, 1st to hpsa2)
    [   62.413750] hpsa 0000:04:00.0: Failed to enable PCI device

    (inject on 7th call, 2nd to hpsa2)
    [   62.807571] hpsa 0000:04:00.0: failed to enable device.

    (inject on 8th call, 3rd to hpsa2)
    [   62.697198] hpsa 0000:04:00.0: failed to obtain PCI resources
    [   62.697234] hpsa: probe of 0000:04:00.0 failed with error -16

    The reset_devices path returns -ENODEV on failure rather
    than passing along the result, which apparently doesn't
    cause the pci driver to print anything.

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |   17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index b533b8e..33aca38 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -7075,8 +7075,12 @@ static void hpsa_free_pci_init(struct ctlr_info *h)
 	iounmap(h->vaddr);			/* pci_init 3 */
 	h->vaddr = NULL;
 	hpsa_disable_interrupt_mode(h);		/* pci_init 2 */
-	pci_release_regions(h->pdev);		/* pci_init 2 */
+	/*
+	 * call pci_disable_device before pci_release_regions per
+	 * Documentation/PCI/pci.txt
+	 */
 	pci_disable_device(h->pdev);		/* pci_init 1 */
+	pci_release_regions(h->pdev);		/* pci_init 2 */
 }
 
 /* several items must be freed later */
@@ -7099,6 +7103,7 @@ static int hpsa_pci_init(struct ctlr_info *h)
 	err = pci_enable_device(h->pdev);
 	if (err) {
 		dev_err(&h->pdev->dev, "failed to enable PCI device\n");
+		pci_disable_device(h->pdev);
 		return err;
 	}
 
@@ -7106,7 +7111,8 @@ static int hpsa_pci_init(struct ctlr_info *h)
 	if (err) {
 		dev_err(&h->pdev->dev,
 			"failed to obtain PCI resources\n");
-		goto clean1;	/* pci */
+		pci_disable_device(h->pdev);
+		return err;
 	}
 
 	pci_set_master(h->pdev);
@@ -7147,9 +7153,12 @@ clean3:	/* vaddr, intmode+region, pci */
 	h->vaddr = NULL;
 clean2:	/* intmode+region, pci */
 	hpsa_disable_interrupt_mode(h);
-	pci_release_regions(h->pdev);
-clean1:	/* pci */
+	/*
+	 * call pci_disable_device before pci_release_regions per
+	 * Documentation/PCI/pci.txt
+	 */
 	pci_disable_device(h->pdev);
+	pci_release_regions(h->pdev);
 	return err;
 }
 


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 32/42] hpsa: skip free_irq calls if irqs are not allocated
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (30 preceding siblings ...)
  2015-03-17 20:05 ` [PATCH v3 31/42] hpsa: call pci_release_regions after pci_disable_device Don Brace
@ 2015-03-17 20:05 ` Don Brace
  2015-03-17 20:05 ` [PATCH v3 33/42] hpsa: cleanup for init_one step 2 in kdump Don Brace
                   ` (9 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:05 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Robert Elliott <elliott@hp.com>

If try_soft_reset fails to re-allocate irqs, the error exit
starts with free_irq calls, which generate kernel WARN
messages because the irqs were already freed a few lines earlier.

Jump to the next exit label to skip the free_irq calls.

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 33aca38..f26e6bc 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -7799,7 +7799,12 @@ reinit_after_soft_reset:
 		if (rc) {
 			dev_warn(&h->pdev->dev,
 				"Failed to request_irq after soft reset.\n");
-			goto clean4;
+			/*
+			 * clean4 starts with free_irqs, but that was just
+			 * done. Then, request_irqs_failed, so there is
+			 * nothing to free. So, goto the next label.
+			 */
+			goto clean3;
 		}
 
 		rc = hpsa_kdump_soft_reset(h);


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 33/42] hpsa: cleanup for init_one step 2 in kdump
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (31 preceding siblings ...)
  2015-03-17 20:05 ` [PATCH v3 32/42] hpsa: skip free_irq calls if irqs are not allocated Don Brace
@ 2015-03-17 20:05 ` Don Brace
  2015-03-17 20:05 ` [PATCH v3 34/42] hpsa: fix try_soft_reset error handling Don Brace
                   ` (8 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:05 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Robert Elliott <elliott@hp.com>

In hpsa_undo_allocations_after_kdump_soft_reset, the
resources allocated in hpsa_init_one step 2
(h->resubmit_wq and h->lockup_detected) need to be
freed, in the right order.

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index f26e6bc..0afc48b 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -7422,6 +7422,16 @@ static void hpsa_undo_allocations_after_kdump_soft_reset(struct ctlr_info *h)
 	hpsa_free_cmd_pool(h);			/* init_one 5 */
 	hpsa_free_irqs(h);			/* init_one 4 */
 	hpsa_free_pci_init(h);			/* init_one 3 */
+	free_percpu(h->lockup_detected);	/* init_one 2 */
+	h->lockup_detected = NULL;		/* init_one 2 */
+	if (h->resubmit_wq) {
+		destroy_workqueue(h->resubmit_wq);	/* init_one 1 */
+		h->resubmit_wq = NULL;
+	}
+	if (h->rescan_ctlr_wq) {
+		destroy_workqueue(h->rescan_ctlr_wq);
+		h->rescan_ctlr_wq = NULL;
+	}
 	kfree(h);				/* init_one 1 */
 }
 


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 34/42] hpsa: fix try_soft_reset error handling
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (32 preceding siblings ...)
  2015-03-17 20:05 ` [PATCH v3 33/42] hpsa: cleanup for init_one step 2 in kdump Don Brace
@ 2015-03-17 20:05 ` Don Brace
  2015-03-17 20:05 ` [PATCH v3 35/42] hpsa: create workqueue after the driver is ready for use Don Brace
                   ` (7 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:05 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Robert Elliott <elliott@hp.com>

If registering the special interrupt handlers in hpsa_init_one
before a soft reset fails, the error exit needs to deallocate
everything that was allocated before.

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |   16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 0afc48b..fdf36a7 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -7810,9 +7810,15 @@ reinit_after_soft_reset:
 			dev_warn(&h->pdev->dev,
 				"Failed to request_irq after soft reset.\n");
 			/*
-			 * clean4 starts with free_irqs, but that was just
-			 * done. Then, request_irqs_failed, so there is
-			 * nothing to free. So, goto the next label.
+			 * cannot goto clean7 or free_irqs will be called
+			 * again. Instead, do its work
+			 */
+			hpsa_free_performant_mode(h);	/* clean7 */
+			hpsa_free_sg_chain_blocks(h);	/* clean6 */
+			hpsa_free_cmd_pool(h);		/* clean5 */
+			/*
+			 * skip hpsa_free_irqs(h) clean4 since that
+			 * was just called before request_irqs failed
 			 */
 			goto clean3;
 		}
@@ -7820,7 +7826,7 @@ reinit_after_soft_reset:
 		rc = hpsa_kdump_soft_reset(h);
 		if (rc)
 			/* Neither hard nor soft reset worked, we're hosed. */
-			goto clean4;
+			goto clean7;
 
 		dev_info(&h->pdev->dev, "Board READY.\n");
 		dev_info(&h->pdev->dev,
@@ -7841,7 +7847,7 @@ reinit_after_soft_reset:
 		hpsa_undo_allocations_after_kdump_soft_reset(h);
 		try_soft_reset = 0;
 		if (rc)
-			/* don't go to clean4, we already unallocated */
+			/* don't goto clean, we already unallocated */
 			return -ENODEV;
 
 		goto reinit_after_soft_reset;


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 35/42] hpsa: create workqueue after the driver is ready for use
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (33 preceding siblings ...)
  2015-03-17 20:05 ` [PATCH v3 34/42] hpsa: fix try_soft_reset error handling Don Brace
@ 2015-03-17 20:05 ` Don Brace
  2015-03-17 20:06 ` [PATCH v3 36/42] hpsa: add interrupt number to /proc/interrupts interrupt name Don Brace
                   ` (6 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:05 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Robert Elliott <elliott@hp.com>

Don't create the resubmit workqueue in hpsa_init_one until everything else
is ready for use, so that resources can be freed in reverse order of
allocation without risking freeing them while workqueue items are
still active.

Destroy the workqueue in the right order in
hpsa_undo_allocations_after_kdump_soft_reset too.

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |   60 ++++++++++++++++++++++++++-------------------------
 1 file changed, 31 insertions(+), 29 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index fdf36a7..0057236 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -7714,30 +7714,18 @@ reinit_after_soft_reset:
 	atomic_set(&h->passthru_cmds_avail, HPSA_MAX_CONCURRENT_PASSTHRUS);
 	atomic_set(&h->abort_cmds_available, HPSA_CMDS_RESERVED_FOR_ABORTS);
 
-	h->rescan_ctlr_wq = hpsa_create_controller_wq(h, "rescan");
-	if (!h->rescan_ctlr_wq) {
-		rc = -ENOMEM;
-		goto clean1;
-	}
-
-	h->resubmit_wq = hpsa_create_controller_wq(h, "resubmit");
-	if (!h->resubmit_wq) {
-		rc = -ENOMEM;
-		goto clean1;	/* aer/h */
-	}
-
 	/* Allocate and clear per-cpu variable lockup_detected */
 	h->lockup_detected = alloc_percpu(u32);
 	if (!h->lockup_detected) {
 		dev_err(&h->pdev->dev, "Failed to allocate lockup detector\n");
 		rc = -ENOMEM;
-		goto clean1;	/* wq/aer/h */
+		goto clean1;	/* aer/h */
 	}
 	set_lockup_detected_for_all_cpus(h, 0);
 
 	rc = hpsa_pci_init(h);
 	if (rc)
-		goto clean2;	/* lockup, wq/aer/h */
+		goto clean2;	/* lockup, aer/h */
 
 	sprintf(h->devname, HPSA "%d", number_of_controllers);
 	h->ctlr = number_of_controllers;
@@ -7753,7 +7741,7 @@ reinit_after_soft_reset:
 			dac = 0;
 		} else {
 			dev_err(&pdev->dev, "no suitable DMA available\n");
-			goto clean3;	/* pci, lockup, wq/aer/h */
+			goto clean3;	/* pci, lockup, aer/h */
 		}
 	}
 
@@ -7762,16 +7750,16 @@ reinit_after_soft_reset:
 
 	rc = hpsa_request_irqs(h, do_hpsa_intr_msi, do_hpsa_intr_intx);
 	if (rc)
-		goto clean3;	/* pci, lockup, wq/aer/h */
+		goto clean3;	/* pci, lockup, aer/h */
 	dev_info(&pdev->dev, "%s: <0x%x> at IRQ %d%s using DAC\n",
 	       h->devname, pdev->device,
 	       h->intr[h->intr_mode], dac ? "" : " not");
 	rc = hpsa_alloc_cmd_pool(h);
 	if (rc)
-		goto clean4;	/* irq, pci, lockup, wq/aer/h */
+		goto clean4;	/* irq, pci, lockup, aer/h */
 	rc = hpsa_alloc_sg_chain_blocks(h);
 	if (rc)
-		goto clean5;	/* cmd, irq, pci, lockup, wq/aer/h */
+		goto clean5;	/* cmd, irq, pci, lockup, aer/h */
 	init_waitqueue_head(&h->scan_wait_queue);
 	init_waitqueue_head(&h->abort_cmd_wait_queue);
 	init_waitqueue_head(&h->abort_sync_wait_queue);
@@ -7784,7 +7772,20 @@ reinit_after_soft_reset:
 	spin_lock_init(&h->devlock);
 	rc = hpsa_put_ctlr_into_performant_mode(h);
 	if (rc)
-		goto clean6;	/* sg, cmd, irq, pci, lockup, wq/aer/h */
+		goto clean6;	/* sg, cmd, irq, pci, lockup, aer/h */
+
+	/* create the resubmit workqueue */
+	h->rescan_ctlr_wq = hpsa_create_controller_wq(h, "rescan");
+	if (!h->rescan_ctlr_wq) {
+		rc = -ENOMEM;
+		goto clean7;
+	}
+
+	h->resubmit_wq = hpsa_create_controller_wq(h, "resubmit");
+	if (!h->resubmit_wq) {
+		rc = -ENOMEM;
+		goto clean7;	/* aer/h */
+	}
 
 	/*
 	 * At this point, the controller is ready to take commands.
@@ -7826,7 +7827,7 @@ reinit_after_soft_reset:
 		rc = hpsa_kdump_soft_reset(h);
 		if (rc)
 			/* Neither hard nor soft reset worked, we're hosed. */
-			goto clean7;
+			goto clean8;
 
 		dev_info(&h->pdev->dev, "Board READY.\n");
 		dev_info(&h->pdev->dev,
@@ -7863,7 +7864,7 @@ reinit_after_soft_reset:
 	hpsa_hba_inquiry(h);
 	rc = hpsa_register_scsi(h);	/* hook ourselves into SCSI subsystem */
 	if (rc)
-		goto clean7;
+		goto clean8; /* wq, perf, sg, cmd, irq, pci, lockup, aer/h */
 
 	/* Monitor the controller for firmware lockups */
 	h->heartbeat_sample_interval = HEARTBEAT_SAMPLE_INTERVAL;
@@ -7875,19 +7876,20 @@ reinit_after_soft_reset:
 				h->heartbeat_sample_interval);
 	return 0;
 
-clean7: /* perf, sg, cmd, irq, pci, lockup, wq/aer/h */
+clean8: /* perf, sg, cmd, irq, pci, lockup, aer/h */
 	kfree(h->hba_inquiry_data);
+clean7: /* perf, sg, cmd, irq, pci, lockup, aer/h */
 	hpsa_free_performant_mode(h);
 	h->access.set_intr_mask(h, HPSA_INTR_OFF);
 clean6: /* sg, cmd, irq, pci, lockup, wq/aer/h */
 	hpsa_free_sg_chain_blocks(h);
-clean5: /* cmd, irq, pci, lockup, wq/aer/h */
+clean5: /* cmd, irq, pci, lockup, aer/h */
 	hpsa_free_cmd_pool(h);
-clean4: /* irq, pci, lockup, wq/aer/h */
+clean4: /* irq, pci, lockup, aer/h */
 	hpsa_free_irqs(h);
-clean3: /* pci, lockup, wq/aer/h */
+clean3: /* pci, lockup, aer/h */
 	hpsa_free_pci_init(h);
-clean2: /* lockup, wq/aer/h */
+clean2: /* lockup, aer/h */
 	if (h->lockup_detected) {
 		free_percpu(h->lockup_detected);
 		h->lockup_detected = NULL;
@@ -7986,9 +7988,9 @@ static void hpsa_remove_one(struct pci_dev *pdev)
 
 	hpsa_free_device_info(h);		/* scan */
 
-	hpsa_unregister_scsi(h);			/* init_one "8" */
-	kfree(h->hba_inquiry_data);			/* init_one "8" */
-	h->hba_inquiry_data = NULL;			/* init_one "8" */
+	hpsa_unregister_scsi(h);			/* init_one 9 */
+	kfree(h->hba_inquiry_data);			/* init_one 9 */
+	h->hba_inquiry_data = NULL;			/* init_one 9 */
 	hpsa_free_performant_mode(h);			/* init_one 7 */
 	hpsa_free_sg_chain_blocks(h);			/* init_one 6 */
 	hpsa_free_cmd_pool(h);				/* init_one 5 */


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 36/42] hpsa: add interrupt number to /proc/interrupts interrupt name
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (34 preceding siblings ...)
  2015-03-17 20:05 ` [PATCH v3 35/42] hpsa: create workqueue after the driver is ready for use Don Brace
@ 2015-03-17 20:06 ` Don Brace
  2015-03-17 20:06 ` [PATCH v3 37/42] hpsa: use block layer tag for command allocation Don Brace
                   ` (5 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:06 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Robert Elliott <elliott@hp.com>

Add the interrupt number to the interrupt names that
appear in /proc/interrupts, so they are unique.

Also, delete the IRQ and DAC prints.  Other parts of the kernel
already print the IRQ assignments, and dual-address-cycle support
has not been interesting since the parallel PCI bus went from
32 to 64 bits wide.

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |   20 ++++++++++++++------
 drivers/scsi/hpsa.h |    1 +
 2 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 0057236..34c178c 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -7337,8 +7337,9 @@ static int hpsa_request_irqs(struct ctlr_info *h,
 	if (h->intr_mode == PERF_MODE_INT && h->msix_vector > 0) {
 		/* If performant mode and MSI-X, use multiple reply queues */
 		for (i = 0; i < h->msix_vector; i++) {
+			sprintf(h->intrname[i], "%s-msix%d", h->devname, i);
 			rc = request_irq(h->intr[i], msixhandler,
-					0, h->devname,
+					0, h->intrname[i],
 					&h->q[i]);
 			if (rc) {
 				int j;
@@ -7359,12 +7360,22 @@ static int hpsa_request_irqs(struct ctlr_info *h,
 	} else {
 		/* Use single reply pool */
 		if (h->msix_vector > 0 || h->msi_vector) {
+			if (h->msix_vector)
+				sprintf(h->intrname[h->intr_mode],
+					"%s-msix", h->devname);
+			else
+				sprintf(h->intrname[h->intr_mode],
+					"%s-msi", h->devname);
 			rc = request_irq(h->intr[h->intr_mode],
-				msixhandler, 0, h->devname,
+				msixhandler, 0,
+				h->intrname[h->intr_mode],
 				&h->q[h->intr_mode]);
 		} else {
+			sprintf(h->intrname[h->intr_mode],
+				"%s-intx", h->devname);
 			rc = request_irq(h->intr[h->intr_mode],
-				intxhandler, IRQF_SHARED, h->devname,
+				intxhandler, IRQF_SHARED,
+				h->intrname[h->intr_mode],
 				&h->q[h->intr_mode]);
 		}
 		irq_set_affinity_hint(h->intr[h->intr_mode], NULL);
@@ -7751,9 +7762,6 @@ reinit_after_soft_reset:
 	rc = hpsa_request_irqs(h, do_hpsa_intr_msi, do_hpsa_intr_intx);
 	if (rc)
 		goto clean3;	/* pci, lockup, aer/h */
-	dev_info(&pdev->dev, "%s: <0x%x> at IRQ %d%s using DAC\n",
-	       h->devname, pdev->device,
-	       h->intr[h->intr_mode], dac ? "" : " not");
 	rc = hpsa_alloc_cmd_pool(h);
 	if (rc)
 		goto clean4;	/* irq, pci, lockup, aer/h */
diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
index 7cb8586..3ec8934 100644
--- a/drivers/scsi/hpsa.h
+++ b/drivers/scsi/hpsa.h
@@ -220,6 +220,7 @@ struct ctlr_info {
 	int remove_in_progress;
 	/* Address of h->q[x] is passed to intr handler to know which queue */
 	u8 q[MAX_REPLY_QUEUES];
+	char intrname[MAX_REPLY_QUEUES][16];	/* "hpsa0-msix00" names */
 	u32 TMFSupportFlags; /* cache what task mgmt funcs are supported. */
 #define HPSATMF_BITS_SUPPORTED  (1 << 0)
 #define HPSATMF_PHYS_LUN_RESET  (1 << 1)


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 37/42] hpsa: use block layer tag for command allocation
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (35 preceding siblings ...)
  2015-03-17 20:06 ` [PATCH v3 36/42] hpsa: add interrupt number to /proc/interrupts interrupt name Don Brace
@ 2015-03-17 20:06 ` Don Brace
  2015-03-23 16:57   ` Tomas Henzl
  2015-03-17 20:06 ` [PATCH v3 38/42] hpsa: use scsi host_no as hpsa controller number Don Brace
                   ` (4 subsequent siblings)
  41 siblings, 1 reply; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:06 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Webb Scales <webbnh@hp.com>

Rework slave allocation:
  - separate the tagging support setup from the hostdata setup
  - make the hostdata setup act consistently when the lookup fails
  - make the hostdata setup act consistently when the device is not added
  - set up the queue depth consistently across these scenarios
  - if the block layer mq support is not available, explicitly enable and
    activate the SCSI layer tcq support (and do this at allocation-time so
    that the tags will be available for INQUIRY commands)

Tweak slave configuration so that devices which are masked are also
not attached.

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Webb Scales <webbnh@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |  153 +++++++++++++++++++++++++++++++++++++++++----------
 drivers/scsi/hpsa.h |    1 
 2 files changed, 123 insertions(+), 31 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 34c178c..4e34a62 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -44,6 +44,7 @@
 #include <scsi/scsi_host.h>
 #include <scsi/scsi_tcq.h>
 #include <scsi/scsi_eh.h>
+#include <scsi/scsi_dbg.h>
 #include <linux/cciss_ioctl.h>
 #include <linux/string.h>
 #include <linux/bitmap.h>
@@ -212,6 +213,9 @@ static int hpsa_compat_ioctl(struct scsi_device *dev, int cmd,
 
 static void cmd_free(struct ctlr_info *h, struct CommandList *c);
 static struct CommandList *cmd_alloc(struct ctlr_info *h);
+static void cmd_tagged_free(struct ctlr_info *h, struct CommandList *c);
+static struct CommandList *cmd_tagged_alloc(struct ctlr_info *h,
+					    struct scsi_cmnd *scmd);
 static int fill_cmd(struct CommandList *c, u8 cmd, struct ctlr_info *h,
 	void *buff, size_t size, u16 page_code, unsigned char *scsi3addr,
 	int cmd_type);
@@ -2047,11 +2051,17 @@ static void hpsa_cmd_resolve_events(struct ctlr_info *h,
 	}
 }
 
+static void hpsa_cmd_resolve_and_free(struct ctlr_info *h,
+				      struct CommandList *c)
+{
+	hpsa_cmd_resolve_events(h, c);
+	cmd_tagged_free(h, c);
+}
+
 static void hpsa_cmd_free_and_done(struct ctlr_info *h,
 		struct CommandList *c, struct scsi_cmnd *cmd)
 {
-	hpsa_cmd_resolve_events(h, c);
-	cmd_free(h, c);
+	hpsa_cmd_resolve_and_free(h, c);
 	cmd->scsi_done(cmd);
 }
 
@@ -2072,8 +2082,7 @@ static void hpsa_cmd_abort_and_free(struct ctlr_info *h, struct CommandList *c,
 	hpsa_set_scsi_cmd_aborted(cmd);
 	dev_warn(&h->pdev->dev, "CDB %16phN was aborted with status 0x%x\n",
 			 c->Request.CDB, c->err_info->ScsiStatus);
-	hpsa_cmd_resolve_events(h, c);
-	cmd_free(h, c);		/* FIX-ME:  change to cmd_tagged_free(h, c) */
+	hpsa_cmd_resolve_and_free(h, c);
 }
 
 static void process_ioaccel2_completion(struct ctlr_info *h,
@@ -4535,7 +4544,7 @@ static int hpsa_ciss_submit(struct ctlr_info *h,
 	}
 
 	if (hpsa_scatter_gather(h, c, cmd) < 0) { /* Fill SG list */
-		cmd_free(h, c);
+		hpsa_cmd_resolve_and_free(h, c);
 		return SCSI_MLQUEUE_HOST_BUSY;
 	}
 	enqueue_cmd_and_start_io(h, c);
@@ -4581,6 +4590,8 @@ static inline void hpsa_cmd_partial_init(struct ctlr_info *h, int index,
 {
 	dma_addr_t cmd_dma_handle = h->cmd_pool_dhandle + index * sizeof(*c);
 
+	BUG_ON(c->cmdindex != index);
+
 	memset(c->Request.CDB, 0, sizeof(c->Request.CDB));
 	memset(c->err_info, 0, sizeof(*c->err_info));
 	c->busaddr = (u32) cmd_dma_handle;
@@ -4675,27 +4686,24 @@ static int hpsa_scsi_queue_command(struct Scsi_Host *sh, struct scsi_cmnd *cmd)
 
 	/* Get the ptr to our adapter structure out of cmd->host. */
 	h = sdev_to_hba(cmd->device);
+
+	BUG_ON(cmd->request->tag < 0);
+
 	dev = cmd->device->hostdata;
 	if (!dev) {
 		cmd->result = DID_NO_CONNECT << 16;
 		cmd->scsi_done(cmd);
 		return 0;
 	}
-	memcpy(scsi3addr, dev->scsi3addr, sizeof(scsi3addr));
 
-	if (unlikely(lockup_detected(h))) {
-		cmd->result = DID_NO_CONNECT << 16;
-		cmd->scsi_done(cmd);
-		return 0;
-	}
-	c = cmd_alloc(h);
+	memcpy(scsi3addr, dev->scsi3addr, sizeof(scsi3addr));
 
 	if (unlikely(lockup_detected(h))) {
 		cmd->result = DID_NO_CONNECT << 16;
-		cmd_free(h, c);
 		cmd->scsi_done(cmd);
 		return 0;
 	}
+	c = cmd_tagged_alloc(h, cmd);
 
 	/*
 	 * Call alternate submit routine for I/O accelerated commands.
@@ -4708,7 +4716,7 @@ static int hpsa_scsi_queue_command(struct Scsi_Host *sh, struct scsi_cmnd *cmd)
 		if (rc == 0)
 			return 0;
 		if (rc == SCSI_MLQUEUE_HOST_BUSY) {
-			cmd_free(h, c);
+			hpsa_cmd_resolve_and_free(h, c);
 			return SCSI_MLQUEUE_HOST_BUSY;
 		}
 	}
@@ -4822,15 +4830,23 @@ static int hpsa_register_scsi(struct ctlr_info *h)
 	sh->hostdata[0] = (unsigned long) h;
 	sh->irq = h->intr[h->intr_mode];
 	sh->unique_id = sh->irq;
+	error = scsi_init_shared_tag_map(sh, sh->can_queue);
+	if (error) {
+		dev_err(&h->pdev->dev,
+			"%s: scsi_init_shared_tag_map failed for controller %d\n",
+			__func__, h->ctlr);
+		goto fail_host_put;
+	}
 	error = scsi_add_host(sh, &h->pdev->dev);
-	if (error)
+	if (error) {
+		dev_err(&h->pdev->dev, "%s: scsi_add_host failed for controller %d\n",
+			__func__, h->ctlr);
 		goto fail_host_put;
+	}
 	scsi_scan_host(sh);
 	return 0;
 
  fail_host_put:
-	dev_err(&h->pdev->dev, "%s: scsi_add_host"
-		" failed for controller %d\n", __func__, h->ctlr);
 	scsi_host_put(sh);
 	return error;
  fail:
@@ -4840,6 +4856,23 @@ static int hpsa_register_scsi(struct ctlr_info *h)
 }
 
 /*
+ * The block layer has already gone to the trouble of picking out a unique,
+ * small-integer tag for this request.  We use an offset from that value as
+ * an index to select our command block.  (The offset allows us to reserve the
+ * low-numbered entries for our own uses.)
+ */
+static int hpsa_get_cmd_index(struct scsi_cmnd *scmd)
+{
+	int idx = scmd->request->tag;
+
+	if (idx < 0)
+		return idx;
+
+	/* Offset to leave space for internal cmds. */
+	return idx += HPSA_NRESERVED_CMDS;
+}
+
+/*
  * Send a TEST_UNIT_READY command to the specified LUN using the specified
  * reply queue; returns zero if the unit is ready, and non-zero otherwise.
  */
@@ -4979,18 +5012,18 @@ static int hpsa_eh_device_reset_handler(struct scsi_cmnd *scsicmd)
 	/* if controller locked up, we can guarantee command won't complete */
 	if (lockup_detected(h)) {
 		dev_warn(&h->pdev->dev,
-			"scsi %d:%d:%d:%d RESET FAILED, lockup detected\n",
-			h->scsi_host->host_no, dev->bus, dev->target,
-			dev->lun);
+			 "scsi %d:%d:%d:%u cmd %d RESET FAILED, lockup detected\n",
+			 h->scsi_host->host_no, dev->bus, dev->target, dev->lun,
+			 hpsa_get_cmd_index(scsicmd));
 		return FAILED;
 	}
 
 	/* this reset request might be the result of a lockup; check */
 	if (detect_controller_lockup(h)) {
 		dev_warn(&h->pdev->dev,
-			 "scsi %d:%d:%d:%d RESET FAILED, new lockup detected\n",
+			 "scsi %d:%d:%d:%u cmd %d RESET FAILED, new lockup detected\n",
 			 h->scsi_host->host_no, dev->bus, dev->target,
-			 dev->lun);
+			 dev->lun, hpsa_get_cmd_index(scsicmd));
 		return FAILED;
 	}
 
@@ -5442,6 +5475,59 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
 }
 
 /*
+ * For operations with an associated SCSI command, a command block is allocated
+ * at init, and managed by cmd_tagged_alloc() and cmd_tagged_free() using the
+ * block request tag as an index into a table of entries.  cmd_tagged_free() is
+ * the complement, although cmd_free() may be called instead.
+ */
+static struct CommandList *cmd_tagged_alloc(struct ctlr_info *h,
+					    struct scsi_cmnd *scmd)
+{
+	int idx = hpsa_get_cmd_index(scmd);
+	struct CommandList *c = h->cmd_pool + idx;
+	int refcount = 0;
+
+	if (idx < HPSA_NRESERVED_CMDS || idx >= h->nr_cmds) {
+		dev_err(&h->pdev->dev, "Bad block tag: %d not in [%d..%d]\n",
+			idx, HPSA_NRESERVED_CMDS, h->nr_cmds - 1);
+		/* The index value comes from the block layer, so if it's out of
+		 * bounds, it's probably not our bug.
+		 */
+		BUG();
+	}
+
+	refcount = atomic_inc_return(&c->refcount);
+	if (unlikely(!hpsa_is_cmd_idle(c))) {
+		/*
+		 * We expect that the SCSI layer will hand us a unique tag
+		 * value.  Thus, there should never be a collision here between
+		 * two requests...because if the selected command isn't idle
+		 * then someone is going to be very disappointed.
+		 */
+		dev_err(&h->pdev->dev,
+			"tag collision (tag=%d) in cmd_tagged_alloc().\n",
+			idx);
+		if (c->scsi_cmd != NULL)
+			scsi_print_command(c->scsi_cmd);
+		scsi_print_command(scmd);
+	}
+
+	hpsa_cmd_partial_init(h, idx, c);
+	return c;
+}
+
+static void cmd_tagged_free(struct ctlr_info *h, struct CommandList *c)
+{
+	/*
+	 * Release our reference to the block.  We don't need to do anything
+	 * else to free it, because it is accessed by index.  (There's no point
+	 * in checking the result of the decrement, since we cannot guarantee
+	 * that there isn't a concurrent abort which is also accessing it.)
+	 */
+	(void)atomic_dec(&c->refcount);
+}
+
+/*
  * For operations that cannot sleep, a command block is allocated at init,
  * and managed by cmd_alloc() and cmd_free() using a simple bitmap to track
  * which ones are free or in use.  Lock must be held when calling this.
@@ -5454,7 +5540,6 @@ static struct CommandList *cmd_alloc(struct ctlr_info *h)
 {
 	struct CommandList *c;
 	int refcount, i;
-	unsigned long offset;
 
 	/*
 	 * There is some *extremely* small but non-zero chance that that
@@ -5466,31 +5551,39 @@ static struct CommandList *cmd_alloc(struct ctlr_info *h)
 	 * very unlucky thread might be starved anyway, never able to
 	 * beat the other threads.  In reality, this happens so
 	 * infrequently as to be indistinguishable from never.
+	 *
+	 * Note that we start allocating commands before the SCSI host structure
+	 * is initialized.  Since the search starts at bit zero, this
+	 * all works, since we have at least one command structure available;
+	 * however, it means that the structures with the low indexes have to be
+	 * reserved for driver-initiated requests, while requests from the block
+	 * layer will use the higher indexes.
 	 */
 
-	offset = h->last_allocation; /* benignly racy */
 	for (;;) {
-		i = find_next_zero_bit(h->cmd_pool_bits, h->nr_cmds, offset);
-		if (unlikely(i == h->nr_cmds)) {
-			offset = 0;
+		i = find_first_zero_bit(h->cmd_pool_bits, HPSA_NRESERVED_CMDS);
+		if (unlikely(i >= HPSA_NRESERVED_CMDS))
 			continue;
-		}
 		c = h->cmd_pool + i;
 		refcount = atomic_inc_return(&c->refcount);
 		if (unlikely(refcount > 1)) {
 			cmd_free(h, c); /* already in use */
-			offset = (i + 1) % h->nr_cmds;
 			continue;
 		}
 		set_bit(i & (BITS_PER_LONG - 1),
 			h->cmd_pool_bits + (i / BITS_PER_LONG));
 		break; /* it's ours now. */
 	}
-	h->last_allocation = i; /* benignly racy */
 	hpsa_cmd_partial_init(h, i, c);
 	return c;
 }
 
+/*
+ * This is the complementary operation to cmd_alloc().  Note, however, in some
+ * corner cases it may also be used to free blocks allocated by
+ * cmd_tagged_alloc() in which case the ref-count decrement does the trick and
+ * the clear-bit is harmless.
+ */
 static void cmd_free(struct ctlr_info *h, struct CommandList *c)
 {
 	if (atomic_dec_and_test(&c->refcount)) {
diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
index 3ec8934..2536b67 100644
--- a/drivers/scsi/hpsa.h
+++ b/drivers/scsi/hpsa.h
@@ -141,7 +141,6 @@ struct ctlr_info {
 	struct CfgTable __iomem *cfgtable;
 	int	interrupts_enabled;
 	int 	max_commands;
-	int last_allocation;
 	atomic_t commands_outstanding;
 #	define PERF_MODE_INT	0
 #	define DOORBELL_INT	1


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 38/42] hpsa: use scsi host_no as hpsa controller number
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (36 preceding siblings ...)
  2015-03-17 20:06 ` [PATCH v3 37/42] hpsa: use block layer tag for command allocation Don Brace
@ 2015-03-17 20:06 ` Don Brace
  2015-03-17 20:07 ` [PATCH v3 39/42] hpsa: propagate the error code in hpsa_kdump_soft_reset Don Brace
                   ` (3 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:06 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Robert Elliott <elliott@hp.com>

Rather than numbering the hpsa controllers with an
incrementing 0..n value (which shows up, e.g., in
/proc/interrupts), use the scsi midlayer
host_no (e.g. matching /sys/class/scsi_host/hostNN).

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |  134 ++++++++++++++++++++++++++++-----------------------
 1 file changed, 74 insertions(+), 60 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 4e34a62..511b7ab 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -324,32 +324,35 @@ static int check_for_unit_attention(struct ctlr_info *h,
 	switch (asc) {
 	case STATE_CHANGED:
 		dev_warn(&h->pdev->dev,
-			HPSA "%d: a state change detected, command retried\n",
-			h->ctlr);
+			"%s: a state change detected, command retried\n",
+			h->devname);
 		break;
 	case LUN_FAILED:
 		dev_warn(&h->pdev->dev,
-			HPSA "%d: LUN failure detected\n", h->ctlr);
+			"%s: LUN failure detected\n", h->devname);
 		break;
 	case REPORT_LUNS_CHANGED:
 		dev_warn(&h->pdev->dev,
-			HPSA "%d: report LUN data changed\n", h->ctlr);
+			"%s: report LUN data changed\n", h->devname);
 	/*
 	 * Note: this REPORT_LUNS_CHANGED condition only occurs on the external
 	 * target (array) devices.
 	 */
 		break;
 	case POWER_OR_RESET:
-		dev_warn(&h->pdev->dev, HPSA "%d: a power on "
-			"or device reset detected\n", h->ctlr);
+		dev_warn(&h->pdev->dev,
+			"%s: a power on or device reset detected\n",
+			h->devname);
 		break;
 	case UNIT_ATTENTION_CLEARED:
-		dev_warn(&h->pdev->dev, HPSA "%d: unit attention "
-		    "cleared by another initiator\n", h->ctlr);
+		dev_warn(&h->pdev->dev,
+			"%s: unit attention cleared by another initiator\n",
+			h->devname);
 		break;
 	default:
-		dev_warn(&h->pdev->dev, HPSA "%d: unknown "
-			"unit attention detected\n", h->ctlr);
+		dev_warn(&h->pdev->dev,
+			"%s: unknown unit attention detected\n",
+			h->devname);
 		break;
 	}
 	return 1;
@@ -4799,22 +4802,16 @@ static int hpsa_scan_finished(struct Scsi_Host *sh,
 	return finished;
 }
 
-static void hpsa_unregister_scsi(struct ctlr_info *h)
-{
-	/* we are being forcibly unloaded, and may not refuse. */
-	scsi_remove_host(h->scsi_host);
-	scsi_host_put(h->scsi_host);
-	h->scsi_host = NULL;
-}
-
-static int hpsa_register_scsi(struct ctlr_info *h)
+static int hpsa_scsi_host_alloc(struct ctlr_info *h)
 {
 	struct Scsi_Host *sh;
 	int error;
 
 	sh = scsi_host_alloc(&hpsa_driver_template, sizeof(h));
-	if (sh == NULL)
-		goto fail;
+	if (sh == NULL) {
+		dev_err(&h->pdev->dev, "scsi_host_alloc failed\n");
+		return -ENOMEM;
+	}
 
 	sh->io_port = 0;
 	sh->n_io_port = 0;
@@ -4826,7 +4823,6 @@ static int hpsa_register_scsi(struct ctlr_info *h)
 	sh->can_queue = h->nr_cmds - HPSA_NRESERVED_CMDS;
 	sh->cmd_per_lun = sh->can_queue;
 	sh->sg_tablesize = h->maxsgentries;
-	h->scsi_host = sh;
 	sh->hostdata[0] = (unsigned long) h;
 	sh->irq = h->intr[h->intr_mode];
 	sh->unique_id = sh->irq;
@@ -4835,24 +4831,24 @@ static int hpsa_register_scsi(struct ctlr_info *h)
 		dev_err(&h->pdev->dev,
 			"%s: scsi_init_shared_tag_map failed for controller %d\n",
 			__func__, h->ctlr);
-		goto fail_host_put;
-	}
-	error = scsi_add_host(sh, &h->pdev->dev);
-	if (error) {
-		dev_err(&h->pdev->dev, "%s: scsi_add_host failed for controller %d\n",
-			__func__, h->ctlr);
-		goto fail_host_put;
+			scsi_host_put(sh);
+			return error;
 	}
-	scsi_scan_host(sh);
+	h->scsi_host = sh;
 	return 0;
+}
 
- fail_host_put:
-	scsi_host_put(sh);
-	return error;
- fail:
-	dev_err(&h->pdev->dev, "%s: scsi_host_alloc"
-		" failed for controller %d\n", __func__, h->ctlr);
-	return -ENOMEM;
+static int hpsa_scsi_add_host(struct ctlr_info *h)
+{
+	int rv;
+
+	rv = scsi_add_host(h->scsi_host, &h->pdev->dev);
+	if (rv) {
+		dev_err(&h->pdev->dev, "scsi_add_host failed\n");
+		return rv;
+	}
+	scsi_scan_host(h->scsi_host);
+	return 0;
 }
 
 /*
@@ -7525,7 +7521,9 @@ static void hpsa_undo_allocations_after_kdump_soft_reset(struct ctlr_info *h)
 	hpsa_free_sg_chain_blocks(h);		/* init_one 6 */
 	hpsa_free_cmd_pool(h);			/* init_one 5 */
 	hpsa_free_irqs(h);			/* init_one 4 */
-	hpsa_free_pci_init(h);			/* init_one 3 */
+	scsi_host_put(h->scsi_host);		/* init_one 3 */
+	h->scsi_host = NULL;			/* init_one 3 */
+	hpsa_free_pci_init(h);			/* init_one 2_5 */
 	free_percpu(h->lockup_detected);	/* init_one 2 */
 	h->lockup_detected = NULL;		/* init_one 2 */
 	if (h->resubmit_wq) {
@@ -7829,9 +7827,15 @@ reinit_after_soft_reset:
 
 	rc = hpsa_pci_init(h);
 	if (rc)
-		goto clean2;	/* lockup, aer/h */
+		goto clean2;	/* lu, aer/h */
+
+	/* relies on h-> settings made by hpsa_pci_init, including
+	 * interrupt_mode h->intr */
+	rc = hpsa_scsi_host_alloc(h);
+	if (rc)
+		goto clean2_5;	/* pci, lu, aer/h */
 
-	sprintf(h->devname, HPSA "%d", number_of_controllers);
+	sprintf(h->devname, HPSA "%d", h->scsi_host->host_no);
 	h->ctlr = number_of_controllers;
 	number_of_controllers++;
 
@@ -7845,7 +7849,7 @@ reinit_after_soft_reset:
 			dac = 0;
 		} else {
 			dev_err(&pdev->dev, "no suitable DMA available\n");
-			goto clean3;	/* pci, lockup, aer/h */
+			goto clean3;	/* shost, pci, lu, aer/h */
 		}
 	}
 
@@ -7854,13 +7858,13 @@ reinit_after_soft_reset:
 
 	rc = hpsa_request_irqs(h, do_hpsa_intr_msi, do_hpsa_intr_intx);
 	if (rc)
-		goto clean3;	/* pci, lockup, aer/h */
+		goto clean3;	/* shost, pci, lu, aer/h */
 	rc = hpsa_alloc_cmd_pool(h);
 	if (rc)
-		goto clean4;	/* irq, pci, lockup, aer/h */
+		goto clean4;	/* irq, shost, pci, lu, aer/h */
 	rc = hpsa_alloc_sg_chain_blocks(h);
 	if (rc)
-		goto clean5;	/* cmd, irq, pci, lockup, aer/h */
+		goto clean5;	/* cmd, irq, shost, pci, lu, aer/h */
 	init_waitqueue_head(&h->scan_wait_queue);
 	init_waitqueue_head(&h->abort_cmd_wait_queue);
 	init_waitqueue_head(&h->abort_sync_wait_queue);
@@ -7869,11 +7873,16 @@ reinit_after_soft_reset:
 	pci_set_drvdata(pdev, h);
 	h->ndevices = 0;
 	h->hba_mode_enabled = 0;
-	h->scsi_host = NULL;
+
 	spin_lock_init(&h->devlock);
 	rc = hpsa_put_ctlr_into_performant_mode(h);
 	if (rc)
-		goto clean6;	/* sg, cmd, irq, pci, lockup, aer/h */
+		goto clean6; /* sg, cmd, irq, shost, pci, lu, aer/h */
+
+	/* hook into SCSI subsystem */
+	rc = hpsa_scsi_add_host(h);
+	if (rc)
+		goto clean7; /* perf, sg, cmd, irq, shost, pci, lu, aer/h */
 
 	/* create the resubmit workqueue */
 	h->rescan_ctlr_wq = hpsa_create_controller_wq(h, "rescan");
@@ -7928,7 +7937,7 @@ reinit_after_soft_reset:
 		rc = hpsa_kdump_soft_reset(h);
 		if (rc)
 			/* Neither hard nor soft reset worked, we're hosed. */
-			goto clean8;
+			goto clean9;
 
 		dev_info(&h->pdev->dev, "Board READY.\n");
 		dev_info(&h->pdev->dev,
@@ -7963,9 +7972,6 @@ reinit_after_soft_reset:
 	h->access.set_intr_mask(h, HPSA_INTR_ON);
 
 	hpsa_hba_inquiry(h);
-	rc = hpsa_register_scsi(h);	/* hook ourselves into SCSI subsystem */
-	if (rc)
-		goto clean8; /* wq, perf, sg, cmd, irq, pci, lockup, aer/h */
 
 	/* Monitor the controller for firmware lockups */
 	h->heartbeat_sample_interval = HEARTBEAT_SAMPLE_INTERVAL;
@@ -7977,20 +7983,23 @@ reinit_after_soft_reset:
 				h->heartbeat_sample_interval);
 	return 0;
 
-clean8: /* perf, sg, cmd, irq, pci, lockup, aer/h */
+clean9: /* wq, sh, perf, sg, cmd, irq, shost, pci, lu, aer/h */
 	kfree(h->hba_inquiry_data);
-clean7: /* perf, sg, cmd, irq, pci, lockup, aer/h */
+clean7: /* perf, sg, cmd, irq, shost, pci, lu, aer/h */
 	hpsa_free_performant_mode(h);
 	h->access.set_intr_mask(h, HPSA_INTR_OFF);
 clean6: /* sg, cmd, irq, pci, lockup, wq/aer/h */
 	hpsa_free_sg_chain_blocks(h);
-clean5: /* cmd, irq, pci, lockup, aer/h */
+clean5: /* cmd, irq, shost, pci, lu, aer/h */
 	hpsa_free_cmd_pool(h);
-clean4: /* irq, pci, lockup, aer/h */
+clean4: /* irq, shost, pci, lu, aer/h */
 	hpsa_free_irqs(h);
-clean3: /* pci, lockup, aer/h */
+clean3: /* shost, pci, lu, aer/h */
+	scsi_host_put(h->scsi_host);
+	h->scsi_host = NULL;
+clean2_5: /* pci, lu, aer/h */
 	hpsa_free_pci_init(h);
-clean2: /* lockup, aer/h */
+clean2: /* lu, aer/h */
 	if (h->lockup_detected) {
 		free_percpu(h->lockup_detected);
 		h->lockup_detected = NULL;
@@ -8089,17 +8098,22 @@ static void hpsa_remove_one(struct pci_dev *pdev)
 
 	hpsa_free_device_info(h);		/* scan */
 
-	hpsa_unregister_scsi(h);			/* init_one 9 */
-	kfree(h->hba_inquiry_data);			/* init_one 9 */
-	h->hba_inquiry_data = NULL;			/* init_one 9 */
+	kfree(h->hba_inquiry_data);			/* init_one 10 */
+	h->hba_inquiry_data = NULL;			/* init_one 10 */
+	if (h->scsi_host)
+		scsi_remove_host(h->scsi_host);		/* init_one 8 */
+	hpsa_free_ioaccel2_sg_chain_blocks(h);
 	hpsa_free_performant_mode(h);			/* init_one 7 */
 	hpsa_free_sg_chain_blocks(h);			/* init_one 6 */
 	hpsa_free_cmd_pool(h);				/* init_one 5 */
 
 	/* hpsa_free_irqs already called via hpsa_shutdown init_one 4 */
 
+	scsi_host_put(h->scsi_host);			/* init_one 3 */
+	h->scsi_host = NULL;				/* init_one 3 */
+
 	/* includes hpsa_disable_interrupt_mode - pci_init 2 */
-	hpsa_free_pci_init(h);				/* init_one 3 */
+	hpsa_free_pci_init(h);				/* init_one 2.5 */
 
 	free_percpu(h->lockup_detected);		/* init_one 2 */
 	h->lockup_detected = NULL;			/* init_one 2 */



* [PATCH v3 39/42] hpsa: propagate the error code in hpsa_kdump_soft_reset
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (37 preceding siblings ...)
  2015-03-17 20:06 ` [PATCH v3 38/42] hpsa: use scsi host_no as hpsa controller number Don Brace
@ 2015-03-17 20:07 ` Don Brace
  2015-03-17 20:07 ` [PATCH v3 40/42] hpsa: cleanup reset Don Brace
                   ` (2 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:07 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Robert Elliott <elliott@hp.com>

If hpsa_wait_for_board_state fails, hpsa_kdump_soft_reset
should propagate its return value (e.g., -ENODEV) rather
than just returning -1.

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |   11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 511b7ab..3987400 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -7480,19 +7480,22 @@ static int hpsa_request_irqs(struct ctlr_info *h,
 
 static int hpsa_kdump_soft_reset(struct ctlr_info *h)
 {
+	int rc;
 	hpsa_send_host_reset(h, RAID_CTLR_LUNID, HPSA_RESET_TYPE_CONTROLLER);
 
 	dev_info(&h->pdev->dev, "Waiting for board to soft reset.\n");
-	if (hpsa_wait_for_board_state(h->pdev, h->vaddr, BOARD_NOT_READY)) {
+	rc = hpsa_wait_for_board_state(h->pdev, h->vaddr, BOARD_NOT_READY);
+	if (rc) {
 		dev_warn(&h->pdev->dev, "Soft reset had no effect.\n");
-		return -1;
+		return rc;
 	}
 
 	dev_info(&h->pdev->dev, "Board reset, awaiting READY status.\n");
-	if (hpsa_wait_for_board_state(h->pdev, h->vaddr, BOARD_READY)) {
+	rc = hpsa_wait_for_board_state(h->pdev, h->vaddr, BOARD_READY);
+	if (rc) {
 		dev_warn(&h->pdev->dev, "Board failed to become ready "
 			"after soft reset.\n");
-		return -1;
+		return rc;
 	}
 
 	return 0;



* [PATCH v3 40/42] hpsa: cleanup reset
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (38 preceding siblings ...)
  2015-03-17 20:07 ` [PATCH v3 39/42] hpsa: propagate the error code in hpsa_kdump_soft_reset Don Brace
@ 2015-03-17 20:07 ` Don Brace
  2015-03-17 20:07 ` [PATCH v3 41/42] hpsa: change driver version Don Brace
  2015-03-17 20:07 ` [PATCH v3 42/42] hpsa: add PMC to copyright Don Brace
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:07 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

From: Webb Scales <webbnh@hp.com>

Synchronize completion of the reset with completion of outstanding commands.

Extending the newly-added synchronous abort functionality,
now also synchronize resets with the completion of outstanding commands.
Rename the wait queue to reflect the fact that it's being used for both
types of waits.  Also, don't complete commands which are terminated
due to a reset operation.

Also fix a controller lockup that could occur during reset.

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Webb Scales <webbnh@hp.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c     |  200 +++++++++++++++++++++++++++++++++++++++++------
 drivers/scsi/hpsa.h     |    5 +
 drivers/scsi/hpsa_cmd.h |    1 
 3 files changed, 178 insertions(+), 28 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 3987400..96e1d02 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -283,6 +283,11 @@ static inline bool hpsa_is_cmd_idle(struct CommandList *c)
 	return c->scsi_cmd == SCSI_CMD_IDLE;
 }
 
+static inline bool hpsa_is_pending_event(struct CommandList *c)
+{
+	return c->abort_pending || c->reset_pending;
+}
+
 /* extract sense key, asc, and ascq from sense data.  -1 means invalid. */
 static void decode_sense_data(const u8 *sense_data, int sense_data_len,
 			int *sense_key, int *asc, int *ascq)
@@ -977,7 +982,7 @@ static void __enqueue_cmd_and_start_io(struct ctlr_info *h,
 
 static void enqueue_cmd_and_start_io(struct ctlr_info *h, struct CommandList *c)
 {
-	if (unlikely(c->abort_pending))
+	if (unlikely(hpsa_is_pending_event(c)))
 		return finish_cmd(c);
 
 	__enqueue_cmd_and_start_io(h, c, DEFAULT_REPLY_QUEUE);
@@ -1479,6 +1484,8 @@ static void hpsa_figure_phys_disk_ptrs(struct ctlr_info *h,
 	if (nraid_map_entries > RAID_MAP_MAX_ENTRIES)
 		nraid_map_entries = RAID_MAP_MAX_ENTRIES;
 
+	logical_drive->nphysical_disks = nraid_map_entries;
+
 	qdepth = 0;
 	for (i = 0; i < nraid_map_entries; i++) {
 		logical_drive->phys_disk[i] = NULL;
@@ -2031,6 +2038,8 @@ static int handle_ioaccel_mode2_error(struct ctlr_info *h,
 static void hpsa_cmd_resolve_events(struct ctlr_info *h,
 		struct CommandList *c)
 {
+	bool do_wake = false;
+
 	/*
 	 * Prevent the following race in the abort handler:
 	 *
@@ -2042,16 +2051,35 @@ static void hpsa_cmd_resolve_events(struct ctlr_info *h,
 	 *    finds struct CommandList and tries to aborts it
 	 * Now we have aborted the wrong command.
 	 *
-	 * Clear c->scsi_cmd here so that the abort handler will know this
-	 * command has completed.  Then, check to see if the abort handler is
+	 * Reset c->scsi_cmd here so that the abort or reset handler will know
+	 * this command has completed.  Then, check to see if the handler is
 	 * waiting for this command, and, if so, wake it.
 	 */
 	c->scsi_cmd = SCSI_CMD_IDLE;
-	mb(); /* Ensure c->scsi_cmd is set to SCSI_CMD_IDLE */
+	mb();	/* Declare command idle before checking for pending events. */
 	if (c->abort_pending) {
+		do_wake = true;
 		c->abort_pending = false;
-		wake_up_all(&h->abort_sync_wait_queue);
 	}
+	if (c->reset_pending) {
+		unsigned long flags;
+		struct hpsa_scsi_dev_t *dev;
+
+		/*
+		 * There appears to be a reset pending; lock the lock and
+		 * reconfirm.  If so, then decrement the count of outstanding
+		 * commands and wake the reset command if this is the last one.
+		 */
+		spin_lock_irqsave(&h->lock, flags);
+		dev = c->reset_pending;		/* Re-fetch under the lock. */
+		if (dev && atomic_dec_and_test(&dev->reset_cmds_out))
+			do_wake = true;
+		c->reset_pending = NULL;
+		spin_unlock_irqrestore(&h->lock, flags);
+	}
+
+	if (do_wake)
+		wake_up_all(&h->event_sync_wait_queue);
 }
 
 static void hpsa_cmd_resolve_and_free(struct ctlr_info *h,
@@ -2099,10 +2127,6 @@ static void process_ioaccel2_completion(struct ctlr_info *h,
 			c2->error_data.status == 0))
 		return hpsa_cmd_free_and_done(h, c, cmd);
 
-	/* don't requeue a command which is being aborted */
-	if (unlikely(c->abort_pending))
-		return hpsa_cmd_abort_and_free(h, c, cmd);
-
 	/*
 	 * Any RAID offload error results in retry which will use
 	 * the normal I/O path so the controller can handle whatever's
@@ -2197,6 +2221,13 @@ static void complete_scsi_command(struct CommandList *cp)
 		return hpsa_cmd_free_and_done(h, cp, cmd);
 	}
 
+	if ((unlikely(hpsa_is_pending_event(cp)))) {
+		if (cp->reset_pending)
+			return hpsa_cmd_resolve_and_free(h, cp);
+		if (cp->abort_pending)
+			return hpsa_cmd_abort_and_free(h, cp, cmd);
+	}
+
 	if (cp->cmd_type == CMD_IOACCEL2)
 		return process_ioaccel2_completion(h, cp, cmd, dev);
 
@@ -2224,14 +2255,10 @@ static void complete_scsi_command(struct CommandList *cp)
 		if (is_logical_dev_addr_mode(dev->scsi3addr)) {
 			if (ei->CommandStatus == CMD_IOACCEL_DISABLED)
 				dev->offload_enabled = 0;
-			if (!cp->abort_pending)
-				return hpsa_retry_cmd(h, cp);
+			return hpsa_retry_cmd(h, cp);
 		}
 	}
 
-	if (cp->abort_pending)
-		ei->CommandStatus = CMD_ABORTED;
-
 	/* an error has occurred */
 	switch (ei->CommandStatus) {
 
@@ -2651,6 +2678,124 @@ out:
 	return rc;
 }
 
+static bool hpsa_cmd_dev_match(struct ctlr_info *h, struct CommandList *c,
+			       struct hpsa_scsi_dev_t *dev,
+			       unsigned char *scsi3addr)
+{
+	int i;
+	bool match = false;
+	struct io_accel2_cmd *c2 = &h->ioaccel2_cmd_pool[c->cmdindex];
+	struct hpsa_tmf_struct *ac = (struct hpsa_tmf_struct *) c2;
+
+	if (hpsa_is_cmd_idle(c))
+		return false;
+
+	switch (c->cmd_type) {
+	case CMD_SCSI:
+	case CMD_IOCTL_PEND:
+		match = !memcmp(scsi3addr, &c->Header.LUN.LunAddrBytes,
+				sizeof(c->Header.LUN.LunAddrBytes));
+		break;
+
+	case CMD_IOACCEL1:
+	case CMD_IOACCEL2:
+		if (c->phys_disk == dev) {
+			/* HBA mode match */
+			match = true;
+		} else {
+			/* Possible RAID mode -- check each phys dev. */
+			/* FIXME:  Do we need to take out a lock here?  If
+			 * so, we could just call hpsa_get_pdisk_of_ioaccel2()
+			 * instead. */
+			for (i = 0; i < dev->nphysical_disks && !match; i++) {
+				/* FIXME: an alternate test might be
+				 *
+				 * match = dev->phys_disk[i]->ioaccel_handle
+				 *              == c2->scsi_nexus;      */
+				match = dev->phys_disk[i] == c->phys_disk;
+			}
+		}
+		break;
+
+	case IOACCEL2_TMF:
+		for (i = 0; i < dev->nphysical_disks && !match; i++) {
+			match = dev->phys_disk[i]->ioaccel_handle ==
+					le32_to_cpu(ac->it_nexus);
+		}
+		break;
+
+	case 0:		/* The command is in the middle of being initialized. */
+		match = false;
+		break;
+
+	default:
+		dev_err(&h->pdev->dev, "unexpected cmd_type: %d\n",
+			c->cmd_type);
+		BUG();
+	}
+
+	return match;
+}
+
+static int hpsa_do_reset(struct ctlr_info *h, struct hpsa_scsi_dev_t *dev,
+	unsigned char *scsi3addr, u8 reset_type, int reply_queue)
+{
+	int i;
+	int rc = 0;
+
+	/* We can really only handle one reset at a time */
+	if (mutex_lock_interruptible(&h->reset_mutex) == -EINTR) {
+		dev_warn(&h->pdev->dev, "concurrent reset wait interrupted.\n");
+		return -EINTR;
+	}
+
+	BUG_ON(atomic_read(&dev->reset_cmds_out) != 0);
+
+	for (i = 0; i < h->nr_cmds; i++) {
+		struct CommandList *c = h->cmd_pool + i;
+		int refcount = atomic_inc_return(&c->refcount);
+
+		if (refcount > 1 && hpsa_cmd_dev_match(h, c, dev, scsi3addr)) {
+			unsigned long flags;
+
+			/*
+			 * Mark the target command as having a reset pending,
+			 * then lock a lock so that the command cannot complete
+			 * while we're considering it.  If the command is not
+			 * idle then count it; otherwise revoke the event.
+			 */
+			c->reset_pending = dev;
+			spin_lock_irqsave(&h->lock, flags);	/* Implied MB */
+			if (!hpsa_is_cmd_idle(c))
+				atomic_inc(&dev->reset_cmds_out);
+			else
+				c->reset_pending = NULL;
+			spin_unlock_irqrestore(&h->lock, flags);
+		}
+
+		cmd_free(h, c);
+	}
+
+	rc = hpsa_send_reset(h, scsi3addr, reset_type, reply_queue);
+	if (!rc)
+		wait_event(h->event_sync_wait_queue,
+			atomic_read(&dev->reset_cmds_out) == 0 ||
+			lockup_detected(h));
+
+	if (unlikely(lockup_detected(h))) {
+			dev_warn(&h->pdev->dev,
+				 "Controller lockup detected during reset wait\n");
+			mutex_unlock(&h->reset_mutex);
+			rc = -ENODEV;
+		}
+
+	if (unlikely(rc))
+		atomic_set(&dev->reset_cmds_out, 0);
+
+	mutex_unlock(&h->reset_mutex);
+	return rc;
+}
+
 static void hpsa_get_raid_level(struct ctlr_info *h,
 	unsigned char *scsi3addr, unsigned char *raid_level)
 {
@@ -3500,6 +3645,7 @@ static void hpsa_get_ioaccel_drive_info(struct ctlr_info *h,
 	else
 		dev->queue_depth = DRIVE_QUEUE_DEPTH; /* conservative */
 	atomic_set(&dev->ioaccel_cmds_out, 0);
+	atomic_set(&dev->reset_cmds_out, 0);
 }
 
 static void hpsa_update_scsi_devices(struct ctlr_info *h, int hostno)
@@ -4639,6 +4785,8 @@ static void hpsa_command_resubmit_worker(struct work_struct *work)
 		cmd->result = DID_NO_CONNECT << 16;
 		return hpsa_cmd_free_and_done(c->h, c, cmd);
 	}
+	if (c->reset_pending)
+		return hpsa_cmd_resolve_and_free(c->h, c);
 	if (c->abort_pending)
 		return hpsa_cmd_abort_and_free(c->h, c, cmd);
 	if (c->cmd_type == CMD_IOACCEL2) {
@@ -5000,8 +5148,7 @@ static int hpsa_eh_device_reset_handler(struct scsi_cmnd *scsicmd)
 
 	dev = scsicmd->device->hostdata;
 	if (!dev) {
-		dev_err(&h->pdev->dev, "hpsa_eh_device_reset_handler: "
-			"device lookup failed.\n");
+		dev_err(&h->pdev->dev, "%s: device lookup failed\n", __func__);
 		return FAILED;
 	}
 
@@ -5036,15 +5183,13 @@ static int hpsa_eh_device_reset_handler(struct scsi_cmnd *scsicmd)
 		dev->expose_state);
 
 	/* send a reset to the SCSI LUN which the command was sent to */
-	rc = hpsa_send_reset(h, dev->scsi3addr, HPSA_RESET_TYPE_LUN,
-			     DEFAULT_REPLY_QUEUE);
-	if (rc == 0)
-		return SUCCESS;
-
+	rc = hpsa_do_reset(h, dev, dev->scsi3addr, HPSA_RESET_TYPE_LUN,
+			   DEFAULT_REPLY_QUEUE);
 	dev_warn(&h->pdev->dev,
-		"scsi %d:%d:%d:%d reset failed\n",
-		h->scsi_host->host_no, dev->bus, dev->target, dev->lun);
-	return FAILED;
+		 "scsi %d:%d:%d:%d reset %s\n",
+		 h->scsi_host->host_no, dev->bus, dev->target, dev->lun,
+		 rc == 0 ? "completed successfully" : "failed");
+	return rc == 0 ? SUCCESS : FAILED;
 }
 
 static void swizzle_abort_tag(u8 *tag)
@@ -5224,7 +5369,7 @@ static int hpsa_send_reset_as_abort_ioaccel2(struct ctlr_info *h,
 			"Reset as abort: Resetting physical device at scsi3addr 0x%02x%02x%02x%02x%02x%02x%02x%02x\n",
 			psa[0], psa[1], psa[2], psa[3],
 			psa[4], psa[5], psa[6], psa[7]);
-	rc = hpsa_send_reset(h, psa, HPSA_RESET_TYPE_TARGET, reply_queue);
+	rc = hpsa_do_reset(h, dev, psa, HPSA_RESET_TYPE_TARGET, reply_queue);
 	if (rc != 0) {
 		dev_warn(&h->pdev->dev,
 			"Reset as abort: Failed on physical device at scsi3addr 0x%02x%02x%02x%02x%02x%02x%02x%02x\n",
@@ -5464,7 +5609,7 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
 		return FAILED;
 	}
 	dev_info(&h->pdev->dev, "%s SENT, SUCCESS\n", msg);
-	wait_event(h->abort_sync_wait_queue,
+	wait_event(h->event_sync_wait_queue,
 		   abort->scsi_cmd != sc || lockup_detected(h));
 	cmd_free(h, abort);
 	return !lockup_detected(h) ? SUCCESS : FAILED;
@@ -7870,7 +8015,8 @@ reinit_after_soft_reset:
 		goto clean5;	/* cmd, irq, shost, pci, lu, aer/h */
 	init_waitqueue_head(&h->scan_wait_queue);
 	init_waitqueue_head(&h->abort_cmd_wait_queue);
-	init_waitqueue_head(&h->abort_sync_wait_queue);
+	init_waitqueue_head(&h->event_sync_wait_queue);
+	mutex_init(&h->reset_mutex);
 	h->scan_finished = 1; /* no scan currently in progress */
 
 	pci_set_drvdata(pdev, h);
diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
index 2536b67..6ee4da6 100644
--- a/drivers/scsi/hpsa.h
+++ b/drivers/scsi/hpsa.h
@@ -47,6 +47,7 @@ struct hpsa_scsi_dev_t {
 	unsigned char raid_level;	/* from inquiry page 0xC1 */
 	unsigned char volume_offline;	/* discovered via TUR or VPD */
 	u16 queue_depth;		/* max queue_depth for this device */
+	atomic_t reset_cmds_out;	/* Count of commands to-be affected */
 	atomic_t ioaccel_cmds_out;	/* Only used for physical devices
 					 * counts commands sent to physical
 					 * device via "ioaccel" path.
@@ -70,6 +71,7 @@ struct hpsa_scsi_dev_t {
 	 * devices in order to honor physical device queue depth limits.
 	 */
 	struct hpsa_scsi_dev_t *phys_disk[RAID_MAP_MAX_ENTRIES];
+	int nphysical_disks;
 	int supports_aborts;
 #define HPSA_DO_NOT_EXPOSE	0x0
 #define HPSA_SG_ATTACH		0x1
@@ -266,7 +268,8 @@ struct ctlr_info {
 	struct workqueue_struct *rescan_ctlr_wq;
 	atomic_t abort_cmds_available;
 	wait_queue_head_t abort_cmd_wait_queue;
-	wait_queue_head_t abort_sync_wait_queue;
+	wait_queue_head_t event_sync_wait_queue;
+	struct mutex reset_mutex;
 };
 
 struct offline_device_entry {
diff --git a/drivers/scsi/hpsa_cmd.h b/drivers/scsi/hpsa_cmd.h
index f986402..c601622 100644
--- a/drivers/scsi/hpsa_cmd.h
+++ b/drivers/scsi/hpsa_cmd.h
@@ -441,6 +441,7 @@ struct CommandList {
 	struct hpsa_scsi_dev_t *phys_disk;
 
 	int abort_pending;
+	struct hpsa_scsi_dev_t *reset_pending;
 	atomic_t refcount; /* Must be last to avoid memset in hpsa_cmd_init() */
 } __aligned(COMMANDLIST_ALIGNMENT);
 



* [PATCH v3 41/42] hpsa: change driver version
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (39 preceding siblings ...)
  2015-03-17 20:07 ` [PATCH v3 40/42] hpsa: cleanup reset Don Brace
@ 2015-03-17 20:07 ` Don Brace
  2015-03-17 20:07 ` [PATCH v3 42/42] hpsa: add PMC to copyright Don Brace
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:07 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

update driver version

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 96e1d02..893bb50 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -58,7 +58,7 @@
 #include "hpsa.h"
 
 /* HPSA_DRIVER_VERSION must be 3 byte values (0-255) separated by '.' */
-#define HPSA_DRIVER_VERSION "3.4.4-1"
+#define HPSA_DRIVER_VERSION "3.4.10-0"
 #define DRIVER_NAME "HP HPSA Driver (v " HPSA_DRIVER_VERSION ")"
 #define HPSA "hpsa"
 



* [PATCH v3 42/42] hpsa: add PMC to copyright
  2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
                   ` (40 preceding siblings ...)
  2015-03-17 20:07 ` [PATCH v3 41/42] hpsa: change driver version Don Brace
@ 2015-03-17 20:07 ` Don Brace
  41 siblings, 0 replies; 54+ messages in thread
From: Don Brace @ 2015-03-17 20:07 UTC (permalink / raw)
  To: scott.teel, Kevin.Barnett, james.bottomley, hch, Justin.Lindley, brace
  Cc: linux-scsi

Add PMC to the copyright notice and update the Hewlett-Packard
copyright notification.

Reviewed-by: Scott Teel <scott.teel@pmcs.com>
Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
Reviewed-by: Justin Lindley <justin.lindley@pmcs.com>
Signed-off-by: Don Brace <don.brace@pmcs.com>
---
 drivers/scsi/hpsa.c     |    3 ++-
 drivers/scsi/hpsa.h     |    3 ++-
 drivers/scsi/hpsa_cmd.h |    3 ++-
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 893bb50..33ad6eb 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -1,6 +1,7 @@
 /*
  *    Disk Array driver for HP Smart Array SAS controllers
- *    Copyright 2000, 2014 Hewlett-Packard Development Company, L.P.
+ *    Copyright 2014-2015 PMC-Sierra, Inc.
+ *    Portions Copyright 2008-2014 Hewlett-Packard Development Company, L.P.
  *
  *    This program is free software; you can redistribute it and/or modify
  *    it under the terms of the GNU General Public License as published by
diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
index 6ee4da6..80cfc79 100644
--- a/drivers/scsi/hpsa.h
+++ b/drivers/scsi/hpsa.h
@@ -1,6 +1,7 @@
 /*
  *    Disk Array driver for HP Smart Array SAS controllers
- *    Copyright 2000, 2014 Hewlett-Packard Development Company, L.P.
+ *    Copyright 2014-2015 PMC-Sierra, Inc.
+ *    Portions Copyright 2008-2014 Hewlett-Packard Development Company, L.P.
  *
  *    This program is free software; you can redistribute it and/or modify
  *    it under the terms of the GNU General Public License as published by
diff --git a/drivers/scsi/hpsa_cmd.h b/drivers/scsi/hpsa_cmd.h
index c601622..23a8f0d 100644
--- a/drivers/scsi/hpsa_cmd.h
+++ b/drivers/scsi/hpsa_cmd.h
@@ -1,6 +1,7 @@
 /*
  *    Disk Array driver for HP Smart Array SAS controllers
- *    Copyright 2000, 2014 Hewlett-Packard Development Company, L.P.
+ *    Copyright 2014-2015 PMC-Sierra, Inc.
+ *    Portions Copyright 2008-2014 Hewlett-Packard Development Company, L.P.
  *
  *    This program is free software; you can redistribute it and/or modify
  *    it under the terms of the GNU General Public License as published by



* Re: [PATCH v3 37/42] hpsa: use block layer tag for command allocation
  2015-03-17 20:06 ` [PATCH v3 37/42] hpsa: use block layer tag for command allocation Don Brace
@ 2015-03-23 16:57   ` Tomas Henzl
       [not found]     ` <07F70BBF6832E34FA1C923241E8833AB486892F9@BBYEXM01.pmc-sierra.internal>
  2015-03-27 18:49     ` brace
  0 siblings, 2 replies; 54+ messages in thread
From: Tomas Henzl @ 2015-03-23 16:57 UTC (permalink / raw)
  To: Don Brace, scott.teel, Kevin.Barnett, james.bottomley, hch,
	Justin.Lindley
  Cc: linux-scsi

On 03/17/2015 09:06 PM, Don Brace wrote:
> From: Webb Scales <webbnh@hp.com>
>
> Rework slave allocation:
>   - separate the tagging support setup from the hostdata setup
>   - make the hostdata setup act consistently when the lookup fails
>   - make the hostdata setup act consistently when the device is not added
>   - set up the queue depth consistently across these scenarios
>   - if the block layer mq support is not available, explicitly enable and
>     activate the SCSI layer tcq support (and do this at allocation-time so
>     that the tags will be available for INQUIRY commands)
>
> Tweak slave configuration so that devices which are masked are also
> not attached.
>
> Reviewed-by: Scott Teel <scott.teel@pmcs.com>
> Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
> Signed-off-by: Webb Scales <webbnh@hp.com>
> Signed-off-by: Don Brace <don.brace@pmcs.com>
> ---
>  drivers/scsi/hpsa.c |  153 +++++++++++++++++++++++++++++++++++++++++----------
>  drivers/scsi/hpsa.h |    1 
>  2 files changed, 123 insertions(+), 31 deletions(-)
>
> diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
> index 34c178c..4e34a62 100644
> --- a/drivers/scsi/hpsa.c
> +++ b/drivers/scsi/hpsa.c
> @@ -44,6 +44,7 @@
>  #include <scsi/scsi_host.h>
>  #include <scsi/scsi_tcq.h>
>  #include <scsi/scsi_eh.h>
> +#include <scsi/scsi_dbg.h>
>  #include <linux/cciss_ioctl.h>
>  #include <linux/string.h>
>  #include <linux/bitmap.h>
> @@ -212,6 +213,9 @@ static int hpsa_compat_ioctl(struct scsi_device *dev, int cmd,
>  
>  static void cmd_free(struct ctlr_info *h, struct CommandList *c);
>  static struct CommandList *cmd_alloc(struct ctlr_info *h);
> +static void cmd_tagged_free(struct ctlr_info *h, struct CommandList *c);
> +static struct CommandList *cmd_tagged_alloc(struct ctlr_info *h,
> +					    struct scsi_cmnd *scmd);
>  static int fill_cmd(struct CommandList *c, u8 cmd, struct ctlr_info *h,
>  	void *buff, size_t size, u16 page_code, unsigned char *scsi3addr,
>  	int cmd_type);
> @@ -2047,11 +2051,17 @@ static void hpsa_cmd_resolve_events(struct ctlr_info *h,
>  	}
>  }
>  
> +static void hpsa_cmd_resolve_and_free(struct ctlr_info *h,
> +				      struct CommandList *c)
> +{
> +	hpsa_cmd_resolve_events(h, c);
> +	cmd_tagged_free(h, c);
> +}
> +
>  static void hpsa_cmd_free_and_done(struct ctlr_info *h,
>  		struct CommandList *c, struct scsi_cmnd *cmd)
>  {
> -	hpsa_cmd_resolve_events(h, c);
> -	cmd_free(h, c);
> +	hpsa_cmd_resolve_and_free(h, c);
>  	cmd->scsi_done(cmd);
>  }
>  
> @@ -2072,8 +2082,7 @@ static void hpsa_cmd_abort_and_free(struct ctlr_info *h, struct CommandList *c,
>  	hpsa_set_scsi_cmd_aborted(cmd);
>  	dev_warn(&h->pdev->dev, "CDB %16phN was aborted with status 0x%x\n",
>  			 c->Request.CDB, c->err_info->ScsiStatus);
> -	hpsa_cmd_resolve_events(h, c);
> -	cmd_free(h, c);		/* FIX-ME:  change to cmd_tagged_free(h, c) */
> +	hpsa_cmd_resolve_and_free(h, c);
>  }
>  
>  static void process_ioaccel2_completion(struct ctlr_info *h,
> @@ -4535,7 +4544,7 @@ static int hpsa_ciss_submit(struct ctlr_info *h,
>  	}
>  
>  	if (hpsa_scatter_gather(h, c, cmd) < 0) { /* Fill SG list */
> -		cmd_free(h, c);
> +		hpsa_cmd_resolve_and_free(h, c);
>  		return SCSI_MLQUEUE_HOST_BUSY;
>  	}
>  	enqueue_cmd_and_start_io(h, c);
> @@ -4581,6 +4590,8 @@ static inline void hpsa_cmd_partial_init(struct ctlr_info *h, int index,
>  {
>  	dma_addr_t cmd_dma_handle = h->cmd_pool_dhandle + index * sizeof(*c);
>  
> +	BUG_ON(c->cmdindex != index);
> +
>  	memset(c->Request.CDB, 0, sizeof(c->Request.CDB));
>  	memset(c->err_info, 0, sizeof(*c->err_info));
>  	c->busaddr = (u32) cmd_dma_handle;
> @@ -4675,27 +4686,24 @@ static int hpsa_scsi_queue_command(struct Scsi_Host *sh, struct scsi_cmnd *cmd)
>  
>  	/* Get the ptr to our adapter structure out of cmd->host. */
>  	h = sdev_to_hba(cmd->device);
> +
> +	BUG_ON(cmd->request->tag < 0);
> +
>  	dev = cmd->device->hostdata;
>  	if (!dev) {
>  		cmd->result = DID_NO_CONNECT << 16;
>  		cmd->scsi_done(cmd);
>  		return 0;
>  	}
> -	memcpy(scsi3addr, dev->scsi3addr, sizeof(scsi3addr));
>  
> -	if (unlikely(lockup_detected(h))) {
> -		cmd->result = DID_NO_CONNECT << 16;
> -		cmd->scsi_done(cmd);
> -		return 0;
> -	}
> -	c = cmd_alloc(h);
> +	memcpy(scsi3addr, dev->scsi3addr, sizeof(scsi3addr));
>  
>  	if (unlikely(lockup_detected(h))) {
>  		cmd->result = DID_NO_CONNECT << 16;
> -		cmd_free(h, c);
>  		cmd->scsi_done(cmd);
>  		return 0;
>  	}
> +	c = cmd_tagged_alloc(h, cmd);
>  
>  	/*
>  	 * Call alternate submit routine for I/O accelerated commands.
> @@ -4708,7 +4716,7 @@ static int hpsa_scsi_queue_command(struct Scsi_Host *sh, struct scsi_cmnd *cmd)
>  		if (rc == 0)
>  			return 0;
>  		if (rc == SCSI_MLQUEUE_HOST_BUSY) {
> -			cmd_free(h, c);
> +			hpsa_cmd_resolve_and_free(h, c);
>  			return SCSI_MLQUEUE_HOST_BUSY;
>  		}
>  	}
> @@ -4822,15 +4830,23 @@ static int hpsa_register_scsi(struct ctlr_info *h)
>  	sh->hostdata[0] = (unsigned long) h;
>  	sh->irq = h->intr[h->intr_mode];
>  	sh->unique_id = sh->irq;
> +	error = scsi_init_shared_tag_map(sh, sh->can_queue);
> +	if (error) {
> +		dev_err(&h->pdev->dev,
> +			"%s: scsi_init_shared_tag_map failed for controller %d\n",
> +			__func__, h->ctlr);
> +		goto fail_host_put;
> +	}
>  	error = scsi_add_host(sh, &h->pdev->dev);
> -	if (error)
> +	if (error) {
> +		dev_err(&h->pdev->dev, "%s: scsi_add_host failed for controller %d\n",
> +			__func__, h->ctlr);
>  		goto fail_host_put;
> +	}
>  	scsi_scan_host(sh);
>  	return 0;
>  
>   fail_host_put:
> -	dev_err(&h->pdev->dev, "%s: scsi_add_host"
> -		" failed for controller %d\n", __func__, h->ctlr);
>  	scsi_host_put(sh);
>  	return error;
>   fail:
> @@ -4840,6 +4856,23 @@ static int hpsa_register_scsi(struct ctlr_info *h)
>  }
>  
>  /*
> + * The block layer has already gone to the trouble of picking out a unique,
> + * small-integer tag for this request.  We use an offset from that value as
> + * an index to select our command block.  (The offset allows us to reserve the
> + * low-numbered entries for our own uses.)
> + */
> +static int hpsa_get_cmd_index(struct scsi_cmnd *scmd)
> +{
> +	int idx = scmd->request->tag;
> +
> +	if (idx < 0)
> +		return idx;
> +
> +	/* Offset to leave space for internal cmds. */
> +	return idx + HPSA_NRESERVED_CMDS;
> +}
> +
> +/*
>   * Send a TEST_UNIT_READY command to the specified LUN using the specified
>   * reply queue; returns zero if the unit is ready, and non-zero otherwise.
>   */
> @@ -4979,18 +5012,18 @@ static int hpsa_eh_device_reset_handler(struct scsi_cmnd *scsicmd)
>  	/* if controller locked up, we can guarantee command won't complete */
>  	if (lockup_detected(h)) {
>  		dev_warn(&h->pdev->dev,
> -			"scsi %d:%d:%d:%d RESET FAILED, lockup detected\n",
> -			h->scsi_host->host_no, dev->bus, dev->target,
> -			dev->lun);
> +			 "scsi %d:%d:%d:%u cmd %d RESET FAILED, lockup detected\n",
> +			 h->scsi_host->host_no, dev->bus, dev->target, dev->lun,
> +			 hpsa_get_cmd_index(scsicmd));
>  		return FAILED;
>  	}
>  
>  	/* this reset request might be the result of a lockup; check */
>  	if (detect_controller_lockup(h)) {
>  		dev_warn(&h->pdev->dev,
> -			 "scsi %d:%d:%d:%d RESET FAILED, new lockup detected\n",
> +			 "scsi %d:%d:%d:%u cmd %d RESET FAILED, new lockup detected\n",
>  			 h->scsi_host->host_no, dev->bus, dev->target,
> -			 dev->lun);
> +			 dev->lun, hpsa_get_cmd_index(scsicmd));
>  		return FAILED;
>  	}
>  
> @@ -5442,6 +5475,59 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
>  }
>  
>  /*
> + * For operations with an associated SCSI command, a command block is allocated
> + * at init, and managed by cmd_tagged_alloc() and cmd_tagged_free() using the
> + * block request tag as an index into a table of entries.  cmd_tagged_free() is
> + * the complement, although cmd_free() may be called instead.
> + */
> +static struct CommandList *cmd_tagged_alloc(struct ctlr_info *h,
> +					    struct scsi_cmnd *scmd)
> +{
> +	int idx = hpsa_get_cmd_index(scmd);
> +	struct CommandList *c = h->cmd_pool + idx;
> +	int refcount = 0;
> +
> +	if (idx < HPSA_NRESERVED_CMDS || idx >= h->nr_cmds) {
> +		dev_err(&h->pdev->dev, "Bad block tag: %d not in [%d..%d]\n",
> +			idx, HPSA_NRESERVED_CMDS, h->nr_cmds - 1);
> +		/* The index value comes from the block layer, so if it's out of
> +		 * bounds, it's probably not our bug.
> +		 */
> +		BUG();
> +	}
> +
> +	refcount = atomic_inc_return(&c->refcount);

refcount is never used, use atomic_inc(&c->refcount); instead?

> +	if (unlikely(!hpsa_is_cmd_idle(c))) {
> +		/*
> +		 * We expect that the SCSI layer will hand us a unique tag
> +		 * value.  Thus, there should never be a collision here between
> +		 * two requests...because if the selected command isn't idle
> +		 * then someone is going to be very disappointed.
> +		 */
> +		dev_err(&h->pdev->dev,
> +			"tag collision (tag=%d) in cmd_tagged_alloc().\n",
> +			idx);
> +		if (c->scsi_cmd != NULL)
> +			scsi_print_command(c->scsi_cmd);
> +		scsi_print_command(scmd);
> +	}
> +
> +	hpsa_cmd_partial_init(h, idx, c);
> +	return c;
> +}
> +
> +static void cmd_tagged_free(struct ctlr_info *h, struct CommandList *c)
> +{
> +	/*
> +	 * Release our reference to the block.  We don't need to do anything
> +	 * else to free it, because it is accessed by index.  (There's no point
> +	 * in checking the result of the decrement, since we cannot guarantee
> +	 * that there isn't a concurrent abort which is also accessing it.)
> +	 */
> +	(void)atomic_dec(&c->refcount);
> +}
> +
> +/*
>   * For operations that cannot sleep, a command block is allocated at init,
>   * and managed by cmd_alloc() and cmd_free() using a simple bitmap to track
>   * which ones are free or in use.  Lock must be held when calling this.
> @@ -5454,7 +5540,6 @@ static struct CommandList *cmd_alloc(struct ctlr_info *h)
>  {
>  	struct CommandList *c;
>  	int refcount, i;
> -	unsigned long offset;
>  
>  	/*
>  	 * There is some *extremely* small but non-zero chance that that
> @@ -5466,31 +5551,39 @@ static struct CommandList *cmd_alloc(struct ctlr_info *h)
>  	 * very unlucky thread might be starved anyway, never able to
>  	 * beat the other threads.  In reality, this happens so
>  	 * infrequently as to be indistinguishable from never.
> +	 *
> +	 * Note that we start allocating commands before the SCSI host structure
> +	 * is initialized.  Since the search starts at bit zero, this
> +	 * all works, since we have at least one command structure available;
> +	 * however, it means that the structures with the low indexes have to be
> +	 * reserved for driver-initiated requests, while requests from the block
> +	 * layer will use the higher indexes.
>  	 */
>  
> -	offset = h->last_allocation; /* benignly racy */
>  	for (;;) {
> -		i = find_next_zero_bit(h->cmd_pool_bits, h->nr_cmds, offset);
> -		if (unlikely(i == h->nr_cmds)) {
> -			offset = 0;
> +		i = find_first_zero_bit(h->cmd_pool_bits, HPSA_NRESERVED_CMDS);
> +		if (unlikely(i >= HPSA_NRESERVED_CMDS))
>  			continue;
> -		}
>  		c = h->cmd_pool + i;
>  		refcount = atomic_inc_return(&c->refcount);
>  		if (unlikely(refcount > 1)) {
>  			cmd_free(h, c); /* already in use */
> -			offset = (i + 1) % h->nr_cmds;

Hi Don,
when this happens - a command has its bitfield flag cleared but is still taken (refcount is > 1) -
it will likely stay that way for the next several thousand passes through this function, until the command is freed.
When it is the first bit in the bitfield, it will block all following commands sent to the card for that time.
The previous variant, 'find_next_zero_bit + offset = (i + 1) % h->nr_cmds', seems to handle this better.
Cheers, Tomas

>  			continue;
>  		}
>  		set_bit(i & (BITS_PER_LONG - 1),
>  			h->cmd_pool_bits + (i / BITS_PER_LONG));
>  		break; /* it's ours now. */
>  	}
> -	h->last_allocation = i; /* benignly racy */
>  	hpsa_cmd_partial_init(h, i, c);
>  	return c;
>  }
>  
> +/*
> + * This is the complementary operation to cmd_alloc().  Note, however, in some
> + * corner cases it may also be used to free blocks allocated by
> + * cmd_tagged_alloc() in which case the ref-count decrement does the trick and
> + * the clear-bit is harmless.
> + */
>  static void cmd_free(struct ctlr_info *h, struct CommandList *c)
>  {
>  	if (atomic_dec_and_test(&c->refcount)) {
> diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
> index 3ec8934..2536b67 100644
> --- a/drivers/scsi/hpsa.h
> +++ b/drivers/scsi/hpsa.h
> @@ -141,7 +141,6 @@ struct ctlr_info {
>  	struct CfgTable __iomem *cfgtable;
>  	int	interrupts_enabled;
>  	int 	max_commands;
> -	int last_allocation;
>  	atomic_t commands_outstanding;
>  #	define PERF_MODE_INT	0
>  #	define DOORBELL_INT	1
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 37/42] hpsa: use block layer tag for command allocation
       [not found]     ` <07F70BBF6832E34FA1C923241E8833AB486892F9@BBYEXM01.pmc-sierra.internal>
@ 2015-03-25 18:33       ` Webb Scales
  2015-03-26 12:47         ` Tomas Henzl
  0 siblings, 1 reply; 54+ messages in thread
From: Webb Scales @ 2015-03-25 18:33 UTC (permalink / raw)
  To: Brace, Don, Tomas Henzl
  Cc: Teel, Scott Stacy, kevin.barnett, james.bottomley,
	Christoph Hellwig, Lindley, Justin, SCSI development list

Tomas,

You are correct that the previous approach of using find_next_zero_bit() 
with a persistent offset is more run-time efficient in the worst case; 
however, given that "normal" operation doesn't call this allocation 
routine, and given that, when this routine is called, the bit mask being 
searched only has about 16 bits in it, I opted for simplicity over 
efficiency -- that is, I doubt that the difference in efficiency is 
discernible, while getting rid of the last_allocation field is a 
worthwhile savings in both memory use and code.

Regarding your earlier comment on the refcount variable, I believe that 
it was handy for debugging purposes.  The code has undergone numerous 
revisions, and the variable certainly could now be removed from the 
source per your suggestion.  (Of course, the compiler is already 
removing it, I'm sure. ;-) )


                 Webb



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 37/42] hpsa: use block layer tag for command allocation
  2015-03-25 18:33       ` Webb Scales
@ 2015-03-26 12:47         ` Tomas Henzl
  2015-03-26 14:38           ` Webb Scales
  0 siblings, 1 reply; 54+ messages in thread
From: Tomas Henzl @ 2015-03-26 12:47 UTC (permalink / raw)
  To: Webb Scales, Brace, Don
  Cc: Teel, Scott Stacy, kevin.barnett, james.bottomley,
	Christoph Hellwig, Lindley, Justin, SCSI development list

On 03/25/2015 07:33 PM, Webb Scales wrote:
> Tomas,
>
> You are correct that the previous approach of using find_next_zero_bit() 
> with a persistent offset is more run-time efficient in the worst case; 
> however, given that "normal" operation doesn't call this allocation 
> routine, and given that, when this routine is called, the bit mask being 
> searched only has about 16 bits in it, I opted for simplicity over 
> efficiency -- that is, I doubt that the difference in efficiency is 
> discernible, while getting rid of the last_allocation field is a 
> worthwhile savings in both memory use and code.

My comment is not about efficiency; I believe that if you measure it you won't find
any significant difference.
But if this '(unlikely(refcount > 1))' is true for, let's say, the first entry in the bitfield,
then this and any other command submitted later will not get past cmd_alloc until the command that
caused '(unlikely(refcount > 1))' to be true is resolved. That might cause unexpected
behavior.

> Regarding your earlier comment on the refcount variable, I believe that 
> it was handy for debugging purposes.  The code has undergone numerous 
> revisions, and the variable certainly could now be removed from the 
> source per your suggestion.  (Of course, the compiler is already 
> removing it, I'm sure. ;-) )

I'm not sure the compiler is able to switch from 'atomic_inc_return' to 'atomic_inc' :) though.
It is not important. (I wouldn't have commented on this without the other cmd_alloc inaccuracy.)

tomash 

>
>
>                  Webb
>
>
>
> -----Original Message-----
> From: Tomas Henzl [mailto:thenzl@redhat.com]
> Sent: Monday, March 23, 2015 11:58 AM
> To: Don Brace; Scott Teel; Kevin Barnett; james.bottomley@parallels.com; hch@infradead.org; Justin Lindley; brace
> Cc: linux-scsi@vger.kernel.org
> Subject: Re: [PATCH v3 37/42] hpsa: use block layer tag for command allocation
>
> On 03/17/2015 09:06 PM, Don Brace wrote:
>
>> From: Webb Scales <webbnh@hp.com>
>>
>> Rework slave allocation:
>>    - separate the tagging support setup from the hostdata setup
>>    - make the hostdata setup act consistently when the lookup fails
>>    - make the hostdata setup act consistently when the device is not added
>>    - set up the queue depth consistently across these scenarios
>>    - if the block layer mq support is not available, explicitly enable and
>>      activate the SCSI layer tcq support (and do this at allocation-time so
>>      that the tags will be available for INQUIRY commands)
>>
>> Tweak slave configuration so that devices which are masked are also
>> not attached.
>>
>> Reviewed-by: Scott Teel <scott.teel@pmcs.com>
>> Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
>> Signed-off-by: Webb Scales <webbnh@hp.com>
>> Signed-off-by: Don Brace <don.brace@pmcs.com>
>> ---
>>   drivers/scsi/hpsa.c |  153 +++++++++++++++++++++++++++++++++++++++++----------
>>   drivers/scsi/hpsa.h |    1
>>   2 files changed, 123 insertions(+), 31 deletions(-)
>>
>> diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
>> index 34c178c..4e34a62 100644
>> --- a/drivers/scsi/hpsa.c
>> +++ b/drivers/scsi/hpsa.c
>> @@ -44,6 +44,7 @@
>>   #include <scsi/scsi_host.h>
>>   #include <scsi/scsi_tcq.h>
>>   #include <scsi/scsi_eh.h>
>> +#include <scsi/scsi_dbg.h>
>>   #include <linux/cciss_ioctl.h>
>>   #include <linux/string.h>
>>   #include <linux/bitmap.h>
>> @@ -212,6 +213,9 @@ static int hpsa_compat_ioctl(struct scsi_device *dev, int cmd,
>>   
>>   static void cmd_free(struct ctlr_info *h, struct CommandList *c);
>>   static struct CommandList *cmd_alloc(struct ctlr_info *h);
>> +static void cmd_tagged_free(struct ctlr_info *h, struct CommandList *c);
>> +static struct CommandList *cmd_tagged_alloc(struct ctlr_info *h,
>> +					    struct scsi_cmnd *scmd);
>>   static int fill_cmd(struct CommandList *c, u8 cmd, struct ctlr_info *h,
>>   	void *buff, size_t size, u16 page_code, unsigned char *scsi3addr,
>>   	int cmd_type);
>> @@ -2047,11 +2051,17 @@ static void hpsa_cmd_resolve_events(struct ctlr_info *h,
>>   	}
>>   }
>>   
>> +static void hpsa_cmd_resolve_and_free(struct ctlr_info *h,
>> +				      struct CommandList *c)
>> +{
>> +	hpsa_cmd_resolve_events(h, c);
>> +	cmd_tagged_free(h, c);
>> +}
>> +
>>   static void hpsa_cmd_free_and_done(struct ctlr_info *h,
>>   		struct CommandList *c, struct scsi_cmnd *cmd)
>>   {
>> -	hpsa_cmd_resolve_events(h, c);
>> -	cmd_free(h, c);
>> +	hpsa_cmd_resolve_and_free(h, c);
>>   	cmd->scsi_done(cmd);
>>   }
>>   
>> @@ -2072,8 +2082,7 @@ static void hpsa_cmd_abort_and_free(struct ctlr_info *h, struct CommandList *c,
>>   	hpsa_set_scsi_cmd_aborted(cmd);
>>   	dev_warn(&h->pdev->dev, "CDB %16phN was aborted with status 0x%x\n",
>>   			 c->Request.CDB, c->err_info->ScsiStatus);
>> -	hpsa_cmd_resolve_events(h, c);
>> -	cmd_free(h, c);		/* FIX-ME:  change to cmd_tagged_free(h, c) */
>> +	hpsa_cmd_resolve_and_free(h, c);
>>   }
>>   
>>   static void process_ioaccel2_completion(struct ctlr_info *h,
>> @@ -4535,7 +4544,7 @@ static int hpsa_ciss_submit(struct ctlr_info *h,
>>   	}
>>   
>>   	if (hpsa_scatter_gather(h, c, cmd) < 0) { /* Fill SG list */
>> -		cmd_free(h, c);
>> +		hpsa_cmd_resolve_and_free(h, c);
>>   		return SCSI_MLQUEUE_HOST_BUSY;
>>   	}
>>   	enqueue_cmd_and_start_io(h, c);
>> @@ -4581,6 +4590,8 @@ static inline void hpsa_cmd_partial_init(struct ctlr_info *h, int index,
>>   {
>>   	dma_addr_t cmd_dma_handle = h->cmd_pool_dhandle + index * sizeof(*c);
>>   
>> +	BUG_ON(c->cmdindex != index);
>> +
>>   	memset(c->Request.CDB, 0, sizeof(c->Request.CDB));
>>   	memset(c->err_info, 0, sizeof(*c->err_info));
>>   	c->busaddr = (u32) cmd_dma_handle;
>> @@ -4675,27 +4686,24 @@ static int hpsa_scsi_queue_command(struct Scsi_Host *sh, struct scsi_cmnd *cmd)
>>   
>>   	/* Get the ptr to our adapter structure out of cmd->host. */
>>   	h = sdev_to_hba(cmd->device);
>> +
>> +	BUG_ON(cmd->request->tag < 0);
>> +
>>   	dev = cmd->device->hostdata;
>>   	if (!dev) {
>>   		cmd->result = DID_NO_CONNECT << 16;
>>   		cmd->scsi_done(cmd);
>>   		return 0;
>>   	}
>> -	memcpy(scsi3addr, dev->scsi3addr, sizeof(scsi3addr));
>>   
>> -	if (unlikely(lockup_detected(h))) {
>> -		cmd->result = DID_NO_CONNECT << 16;
>> -		cmd->scsi_done(cmd);
>> -		return 0;
>> -	}
>> -	c = cmd_alloc(h);
>> +	memcpy(scsi3addr, dev->scsi3addr, sizeof(scsi3addr));
>>   
>>   	if (unlikely(lockup_detected(h))) {
>>   		cmd->result = DID_NO_CONNECT << 16;
>> -		cmd_free(h, c);
>>   		cmd->scsi_done(cmd);
>>   		return 0;
>>   	}
>> +	c = cmd_tagged_alloc(h, cmd);
>>   
>>   	/*
>>   	 * Call alternate submit routine for I/O accelerated commands.
>> @@ -4708,7 +4716,7 @@ static int hpsa_scsi_queue_command(struct Scsi_Host *sh, struct scsi_cmnd *cmd)
>>   		if (rc == 0)
>>   			return 0;
>>   		if (rc == SCSI_MLQUEUE_HOST_BUSY) {
>> -			cmd_free(h, c);
>> +			hpsa_cmd_resolve_and_free(h, c);
>>   			return SCSI_MLQUEUE_HOST_BUSY;
>>   		}
>>   	}
>> @@ -4822,15 +4830,23 @@ static int hpsa_register_scsi(struct ctlr_info *h)
>>   	sh->hostdata[0] = (unsigned long) h;
>>   	sh->irq = h->intr[h->intr_mode];
>>   	sh->unique_id = sh->irq;
>> +	error = scsi_init_shared_tag_map(sh, sh->can_queue);
>> +	if (error) {
>> +		dev_err(&h->pdev->dev,
>> +			"%s: scsi_init_shared_tag_map failed for controller %d\n",
>> +			__func__, h->ctlr);
>> +		goto fail_host_put;
>> +	}
>>   	error = scsi_add_host(sh, &h->pdev->dev);
>> -	if (error)
>> +	if (error) {
>> +		dev_err(&h->pdev->dev, "%s: scsi_add_host failed for controller %d\n",
>> +			__func__, h->ctlr);
>>   		goto fail_host_put;
>> +	}
>>   	scsi_scan_host(sh);
>>   	return 0;
>>   
>>    fail_host_put:
>> -	dev_err(&h->pdev->dev, "%s: scsi_add_host"
>> -		" failed for controller %d\n", __func__, h->ctlr);
>>   	scsi_host_put(sh);
>>   	return error;
>>    fail:
>> @@ -4840,6 +4856,23 @@ static int hpsa_register_scsi(struct ctlr_info *h)
>>   }
>>   
>>   /*
>> + * The block layer has already gone to the trouble of picking out a unique,
>> + * small-integer tag for this request.  We use an offset from that value as
>> + * an index to select our command block.  (The offset allows us to reserve the
>> + * low-numbered entries for our own uses.)
>> + */
>> +static int hpsa_get_cmd_index(struct scsi_cmnd *scmd)
>> +{
>> +	int idx = scmd->request->tag;
>> +
>> +	if (idx < 0)
>> +		return idx;
>> +
>> +	/* Offset to leave space for internal cmds. */
>> +	return idx += HPSA_NRESERVED_CMDS;
>> +}
>> +
>> +/*
>>    * Send a TEST_UNIT_READY command to the specified LUN using the specified
>>    * reply queue; returns zero if the unit is ready, and non-zero otherwise.
>>    */
>> @@ -4979,18 +5012,18 @@ static int hpsa_eh_device_reset_handler(struct scsi_cmnd *scsicmd)
>>   	/* if controller locked up, we can guarantee command won't complete */
>>   	if (lockup_detected(h)) {
>>   		dev_warn(&h->pdev->dev,
>> -			"scsi %d:%d:%d:%d RESET FAILED, lockup detected\n",
>> -			h->scsi_host->host_no, dev->bus, dev->target,
>> -			dev->lun);
>> +			 "scsi %d:%d:%d:%u cmd %d RESET FAILED, lockup detected\n",
>> +			 h->scsi_host->host_no, dev->bus, dev->target, dev->lun,
>> +			 hpsa_get_cmd_index(scsicmd));
>>   		return FAILED;
>>   	}
>>   
>>   	/* this reset request might be the result of a lockup; check */
>>   	if (detect_controller_lockup(h)) {
>>   		dev_warn(&h->pdev->dev,
>> -			 "scsi %d:%d:%d:%d RESET FAILED, new lockup detected\n",
>> +			 "scsi %d:%d:%d:%u cmd %d RESET FAILED, new lockup detected\n",
>>   			 h->scsi_host->host_no, dev->bus, dev->target,
>> -			 dev->lun);
>> +			 dev->lun, hpsa_get_cmd_index(scsicmd));
>>   		return FAILED;
>>   	}
>>   
>> @@ -5442,6 +5475,59 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
>>   }
>>   
>>   /*
>> + * For operations with an associated SCSI command, a command block is allocated
>> + * at init, and managed by cmd_tagged_alloc() and cmd_tagged_free() using the
>> + * block request tag as an index into a table of entries.  cmd_tagged_free() is
>> + * the complement, although cmd_free() may be called instead.
>> + */
>> +static struct CommandList *cmd_tagged_alloc(struct ctlr_info *h,
>> +					    struct scsi_cmnd *scmd)
>> +{
>> +	int idx = hpsa_get_cmd_index(scmd);
>> +	struct CommandList *c = h->cmd_pool + idx;
>> +	int refcount = 0;
>> +
>> +	if (idx < HPSA_NRESERVED_CMDS || idx >= h->nr_cmds) {
>> +		dev_err(&h->pdev->dev, "Bad block tag: %d not in [%d..%d]\n",
>> +			idx, HPSA_NRESERVED_CMDS, h->nr_cmds - 1);
>> +		/* The index value comes from the block layer, so if it's out of
>> +		 * bounds, it's probably not our bug.
>> +		 */
>> +		BUG();
>> +	}
>> +
>> +	refcount = atomic_inc_return(&c->refcount);
> refcount is never used, use atomic_inc(&c->refcount); instead?
>
>> +	if (unlikely(!hpsa_is_cmd_idle(c))) {
>> +		/*
>> +		 * We expect that the SCSI layer will hand us a unique tag
>> +		 * value.  Thus, there should never be a collision here between
>> +		 * two requests...because if the selected command isn't idle
>> +		 * then someone is going to be very disappointed.
>> +		 */
>> +		dev_err(&h->pdev->dev,
>> +			"tag collision (tag=%d) in cmd_tagged_alloc().\n",
>> +			idx);
>> +		if (c->scsi_cmd != NULL)
>> +			scsi_print_command(c->scsi_cmd);
>> +		scsi_print_command(scmd);
>> +	}
>> +
>> +	hpsa_cmd_partial_init(h, idx, c);
>> +	return c;
>> +}
>> +
>> +static void cmd_tagged_free(struct ctlr_info *h, struct CommandList *c)
>> +{
>> +	/*
>> +	 * Release our reference to the block.  We don't need to do anything
>> +	 * else to free it, because it is accessed by index.  (There's no point
>> +	 * in checking the result of the decrement, since we cannot guarantee
>> +	 * that there isn't a concurrent abort which is also accessing it.)
>> +	 */
>> +	(void)atomic_dec(&c->refcount);
>> +}
>> +
>> +/*
>>    * For operations that cannot sleep, a command block is allocated at init,
>>    * and managed by cmd_alloc() and cmd_free() using a simple bitmap to track
>>    * which ones are free or in use.  Lock must be held when calling this.
>> @@ -5454,7 +5540,6 @@ static struct CommandList *cmd_alloc(struct ctlr_info *h)
>>   {
>>   	struct CommandList *c;
>>   	int refcount, i;
>> -	unsigned long offset;
>>   
>>   	/*
>>   	 * There is some *extremely* small but non-zero chance that that
>> @@ -5466,31 +5551,39 @@ static struct CommandList *cmd_alloc(struct ctlr_info *h)
>>   	 * very unlucky thread might be starved anyway, never able to
>>   	 * beat the other threads.  In reality, this happens so
>>   	 * infrequently as to be indistinguishable from never.
>> +	 *
>> +	 * Note that we start allocating commands before the SCSI host structure
>> +	 * is initialized.  Since the search starts at bit zero, this
>> +	 * all works, since we have at least one command structure available;
>> +	 * however, it means that the structures with the low indexes have to be
>> +	 * reserved for driver-initiated requests, while requests from the block
>> +	 * layer will use the higher indexes.
>>   	 */
>>   
>> -	offset = h->last_allocation; /* benignly racy */
>>   	for (;;) {
>> -		i = find_next_zero_bit(h->cmd_pool_bits, h->nr_cmds, offset);
>> -		if (unlikely(i == h->nr_cmds)) {
>> -			offset = 0;
>> +		i = find_first_zero_bit(h->cmd_pool_bits, HPSA_NRESERVED_CMDS);
>> +		if (unlikely(i >= HPSA_NRESERVED_CMDS))
>>   			continue;
>> -		}
>>   		c = h->cmd_pool + i;
>>   		refcount = atomic_inc_return(&c->refcount);
>>   		if (unlikely(refcount > 1)) {
>>   			cmd_free(h, c); /* already in use */
>> -			offset = (i + 1) % h->nr_cmds;
> Hi Don,
> when this happens - a command has its bitfield flag cleared, but it's taken - refcount is > 1 -
> it will likely stay that way for the next several thousand passes through this loop, until the command is freed.
> When it is the first bit in the bitfield, it will block all following commands sent to the card for that time.
> The previous variant 'find_next_zero_bit + offset = (i + 1) % h->nr_cmds' seems to handle this better.
> Cheers, Tomas
>
>>   			continue;
>>   		}
>>   		set_bit(i & (BITS_PER_LONG - 1),
>>   			h->cmd_pool_bits + (i / BITS_PER_LONG));
>>   		break; /* it's ours now. */
>>   	}
>> -	h->last_allocation = i; /* benignly racy */
>>   	hpsa_cmd_partial_init(h, i, c);
>>   	return c;
>>   }
>>   
>> +/*
>> + * This is the complementary operation to cmd_alloc().  Note, however, in some
>> + * corner cases it may also be used to free blocks allocated by
>> + * cmd_tagged_alloc() in which case the ref-count decrement does the trick and
>> + * the clear-bit is harmless.
>> + */
>>   static void cmd_free(struct ctlr_info *h, struct CommandList *c)
>>   {
>>   	if (atomic_dec_and_test(&c->refcount)) {
>> diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
>> index 3ec8934..2536b67 100644
>> --- a/drivers/scsi/hpsa.h
>> +++ b/drivers/scsi/hpsa.h
>> @@ -141,7 +141,6 @@ struct ctlr_info {
>>   	struct CfgTable __iomem *cfgtable;
>>   	int	interrupts_enabled;
>>   	int 	max_commands;
>> -	int last_allocation;
>>   	atomic_t commands_outstanding;
>>   #	define PERF_MODE_INT	0
>>   #	define DOORBELL_INT	1
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 37/42] hpsa: use block layer tag for command allocation
  2015-03-26 12:47         ` Tomas Henzl
@ 2015-03-26 14:38           ` Webb Scales
  2015-03-26 15:10             ` Tomas Henzl
  0 siblings, 1 reply; 54+ messages in thread
From: Webb Scales @ 2015-03-26 14:38 UTC (permalink / raw)
  To: Tomas Henzl, Brace, Don
  Cc: Teel, Scott Stacy, kevin.barnett, james.bottomley,
	Christoph Hellwig, Lindley, Justin, SCSI development list

Ah!  Tomas, you are absolutely correct!  The loop should not be 
restarting the search from the beginning of the bitfield, but rather 
should be proceeding to check the next bit.  (Otherwise, there's no 
point in having more than one bit!!)

(This code has been tweaked so many times that when I read it now I no 
longer see what it actually does....)

And, Tomas, you have a good point regarding the difference between 
atomic_inc() and atomic_inc_return(), but, again, the difference is only 
a couple of instructions in the context of a long code path, so I think 
it's a difference without a distinction.  (And, I'm looking forward to 
the day when all of the reference counting stuff can be removed...I 
think it's not far off.)


                 Webb



On 3/26/15 8:47 AM, Tomas Henzl wrote:
> On 03/25/2015 07:33 PM, Webb Scales wrote:
>> Tomas,
>>
>> You are correct that the previous approach of using find_next_zero_bit()
>> with a persistent offset is more run-time efficient in the worst case;
>> however, given that "normal" operation doesn't call this allocation
>> routine, and given that, when this routine is called, the bit mask being
>> searched only has about 16 bits in it, I opted for simplicity over
>> efficiency -- that is, I doubt that the difference in efficiency is
>> discernible, while getting rid of the last_allocation field is a
>> worthwhile savings in both memory use and code.
> My comment is not about efficiency; I believe that when you measure it you won't
> find any significant difference.
> But if this '(unlikely(refcount > 1))' is true for, let's say, the first entry in the bitfield,
> this and any other command submitted later will not get past cmd_alloc() until the command that
> caused '(unlikely(refcount > 1))' to be true is resolved. That might cause unexpected
> behavior.
>
>> Regarding your earlier comment on the refcount variable, I believe that
>> it was handy for debugging purposes.  The code has undergone numerous
>> revisions, and the variable certainly could now be removed from the
>> source per your suggestion.  (Of course, the compiler is already
>> removing it, I'm sure. ;-) )
> Not sure if the compiler is able to switch from 'atomic_inc_return' to 'atomic_inc' :) though.
> It is not important. (I wouldn't have commented on this without the other cmd_alloc inaccuracy.)
>
> tomash
>
>>
>>                   Webb
>>
>>
>>
>> -----Original Message-----
>> From: Tomas Henzl [mailto:thenzl@redhat.com]
>> Sent: Monday, March 23, 2015 11:58 AM
>> To: Don Brace; Scott Teel; Kevin Barnett; james.bottomley@parallels.com; hch@infradead.org; Justin Lindley; brace
>> Cc: linux-scsi@vger.kernel.org
>> Subject: Re: [PATCH v3 37/42] hpsa: use block layer tag for command allocation
>>
>> On 03/17/2015 09:06 PM, Don Brace wrote:
>>
>>> From: Webb Scales <webbnh@hp.com>
>>>
>>> Rework slave allocation:
>>>     - separate the tagging support setup from the hostdata setup
>>>     - make the hostdata setup act consistently when the lookup fails
>>>     - make the hostdata setup act consistently when the device is not added
>>>     - set up the queue depth consistently across these scenarios
>>>     - if the block layer mq support is not available, explicitly enable and
>>>       activate the SCSI layer tcq support (and do this at allocation-time so
>>>       that the tags will be available for INQUIRY commands)
>>>
>>> Tweak slave configuration so that devices which are masked are also
>>> not attached.
>>>
>>> Reviewed-by: Scott Teel <scott.teel@pmcs.com>
>>> Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
>>> Signed-off-by: Webb Scales <webbnh@hp.com>
>>> Signed-off-by: Don Brace <don.brace@pmcs.com>
>>> ---
>>>    drivers/scsi/hpsa.c |  153 +++++++++++++++++++++++++++++++++++++++++----------
>>>    drivers/scsi/hpsa.h |    1
>>>    2 files changed, 123 insertions(+), 31 deletions(-)
>>>
>>> diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
>>> index 34c178c..4e34a62 100644
>>> --- a/drivers/scsi/hpsa.c
>>> +++ b/drivers/scsi/hpsa.c
>>> @@ -44,6 +44,7 @@
>>>    #include <scsi/scsi_host.h>
>>>    #include <scsi/scsi_tcq.h>
>>>    #include <scsi/scsi_eh.h>
>>> +#include <scsi/scsi_dbg.h>
>>>    #include <linux/cciss_ioctl.h>
>>>    #include <linux/string.h>
>>>    #include <linux/bitmap.h>
>>> @@ -212,6 +213,9 @@ static int hpsa_compat_ioctl(struct scsi_device *dev, int cmd,
>>>    
>>>    static void cmd_free(struct ctlr_info *h, struct CommandList *c);
>>>    static struct CommandList *cmd_alloc(struct ctlr_info *h);
>>> +static void cmd_tagged_free(struct ctlr_info *h, struct CommandList *c);
>>> +static struct CommandList *cmd_tagged_alloc(struct ctlr_info *h,
>>> +					    struct scsi_cmnd *scmd);
>>>    static int fill_cmd(struct CommandList *c, u8 cmd, struct ctlr_info *h,
>>>    	void *buff, size_t size, u16 page_code, unsigned char *scsi3addr,
>>>    	int cmd_type);
>>> @@ -2047,11 +2051,17 @@ static void hpsa_cmd_resolve_events(struct ctlr_info *h,
>>>    	}
>>>    }
>>>    
>>> +static void hpsa_cmd_resolve_and_free(struct ctlr_info *h,
>>> +				      struct CommandList *c)
>>> +{
>>> +	hpsa_cmd_resolve_events(h, c);
>>> +	cmd_tagged_free(h, c);
>>> +}
>>> +
>>>    static void hpsa_cmd_free_and_done(struct ctlr_info *h,
>>>    		struct CommandList *c, struct scsi_cmnd *cmd)
>>>    {
>>> -	hpsa_cmd_resolve_events(h, c);
>>> -	cmd_free(h, c);
>>> +	hpsa_cmd_resolve_and_free(h, c);
>>>    	cmd->scsi_done(cmd);
>>>    }
>>>    
>>> @@ -2072,8 +2082,7 @@ static void hpsa_cmd_abort_and_free(struct ctlr_info *h, struct CommandList *c,
>>>    	hpsa_set_scsi_cmd_aborted(cmd);
>>>    	dev_warn(&h->pdev->dev, "CDB %16phN was aborted with status 0x%x\n",
>>>    			 c->Request.CDB, c->err_info->ScsiStatus);
>>> -	hpsa_cmd_resolve_events(h, c);
>>> -	cmd_free(h, c);		/* FIX-ME:  change to cmd_tagged_free(h, c) */
>>> +	hpsa_cmd_resolve_and_free(h, c);
>>>    }
>>>    
>>>    static void process_ioaccel2_completion(struct ctlr_info *h,
>>> @@ -4535,7 +4544,7 @@ static int hpsa_ciss_submit(struct ctlr_info *h,
>>>    	}
>>>    
>>> [...]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 37/42] hpsa: use block layer tag for command allocation
  2015-03-26 14:38           ` Webb Scales
@ 2015-03-26 15:10             ` Tomas Henzl
  2015-03-26 15:18               ` Webb Scales
  0 siblings, 1 reply; 54+ messages in thread
From: Tomas Henzl @ 2015-03-26 15:10 UTC (permalink / raw)
  To: Webb Scales, Brace, Don
  Cc: Teel, Scott Stacy, kevin.barnett, james.bottomley,
	Christoph Hellwig, Lindley, Justin, SCSI development list

On 03/26/2015 03:38 PM, Webb Scales wrote:
> Ah!  Tomas, you are absolutely correct!  The loop should not be 
> restarting the search from the beginning of the bitfield, but rather 
> should be proceeding to check the next bit.  (Otherwise, there's no 
> point in having more than one bit!!)

Most of the time it will work as expected; my comment was about a corner case.
Are you going to repost this patch? Btw, I think that a local variable
to restart the loop with an incremented value is enough.

>
> (This code has been tweaked so many times that when I read it now I no 
> longer see what it actually does....)
>
> And, Tomas, you have a good point regarding the difference between 
> atomic_inc() and atomic_inc_return(), but, again, the difference is only 
> a couple of instructions in the context of a long code path, so I think 
> it's a difference without a distinction.  (And, I'm looking forward to 
> the day when all of the reference counting stuff can be removed...I 
> think it's not far off.)

If I got it right, the refcount exists only because of the error handler;
I also think the code might be rewritten so that it could be removed.
The atomic_inc/atomic_inc_return thing is a small detail, but if you repost
the patch, why not fix it too?

tomash 

> [...]
>>>>    	}
>>>> -	memcpy(scsi3addr, dev->scsi3addr, sizeof(scsi3addr));
>>>>    
>>>> -	if (unlikely(lockup_detected(h))) {
>>>> -		cmd->result = DID_NO_CONNECT << 16;
>>>> -		cmd->scsi_done(cmd);
>>>> -		return 0;
>>>> -	}
>>>> -	c = cmd_alloc(h);
>>>> +	memcpy(scsi3addr, dev->scsi3addr, sizeof(scsi3addr));
>>>>    
>>>>    	if (unlikely(lockup_detected(h))) {
>>>>    		cmd->result = DID_NO_CONNECT << 16;
>>>> -		cmd_free(h, c);
>>>>    		cmd->scsi_done(cmd);
>>>>    		return 0;
>>>>    	}
>>>> +	c = cmd_tagged_alloc(h, cmd);
>>>>    
>>>>    	/*
>>>>    	 * Call alternate submit routine for I/O accelerated commands.
>>>> @@ -4708,7 +4716,7 @@ static int hpsa_scsi_queue_command(struct Scsi_Host *sh, struct scsi_cmnd *cmd)
>>>>    		if (rc == 0)
>>>>    			return 0;
>>>>    		if (rc == SCSI_MLQUEUE_HOST_BUSY) {
>>>> -			cmd_free(h, c);
>>>> +			hpsa_cmd_resolve_and_free(h, c);
>>>>    			return SCSI_MLQUEUE_HOST_BUSY;
>>>>    		}
>>>>    	}
>>>> @@ -4822,15 +4830,23 @@ static int hpsa_register_scsi(struct ctlr_info *h)
>>>>    	sh->hostdata[0] = (unsigned long) h;
>>>>    	sh->irq = h->intr[h->intr_mode];
>>>>    	sh->unique_id = sh->irq;
>>>> +	error = scsi_init_shared_tag_map(sh, sh->can_queue);
>>>> +	if (error) {
>>>> +		dev_err(&h->pdev->dev,
>>>> +			"%s: scsi_init_shared_tag_map failed for controller %d\n",
>>>> +			__func__, h->ctlr);
>>>> +		goto fail_host_put;
>>>> +	}
>>>>    	error = scsi_add_host(sh, &h->pdev->dev);
>>>> -	if (error)
>>>> +	if (error) {
>>>> +		dev_err(&h->pdev->dev, "%s: scsi_add_host failed for controller %d\n",
>>>> +			__func__, h->ctlr);
>>>>    		goto fail_host_put;
>>>> +	}
>>>>    	scsi_scan_host(sh);
>>>>    	return 0;
>>>>    
>>>>     fail_host_put:
>>>> -	dev_err(&h->pdev->dev, "%s: scsi_add_host"
>>>> -		" failed for controller %d\n", __func__, h->ctlr);
>>>>    	scsi_host_put(sh);
>>>>    	return error;
>>>>     fail:
>>>> @@ -4840,6 +4856,23 @@ static int hpsa_register_scsi(struct ctlr_info *h)
>>>>    }
>>>>    
>>>>    /*
>>>> + * The block layer has already gone to the trouble of picking out a unique,
>>>> + * small-integer tag for this request.  We use an offset from that value as
>>>> + * an index to select our command block.  (The offset allows us to reserve the
>>>> + * low-numbered entries for our own uses.)
>>>> + */
>>>> +static int hpsa_get_cmd_index(struct scsi_cmnd *scmd)
>>>> +{
>>>> +	int idx = scmd->request->tag;
>>>> +
>>>> +	if (idx < 0)
>>>> +		return idx;
>>>> +
>>>> +	/* Offset to leave space for internal cmds. */
>>>> +	return idx += HPSA_NRESERVED_CMDS;
>>>> +}
>>>> +
>>>> +/*
>>>>     * Send a TEST_UNIT_READY command to the specified LUN using the specified
>>>>     * reply queue; returns zero if the unit is ready, and non-zero otherwise.
>>>>     */
>>>> @@ -4979,18 +5012,18 @@ static int hpsa_eh_device_reset_handler(struct scsi_cmnd *scsicmd)
>>>>    	/* if controller locked up, we can guarantee command won't complete */
>>>>    	if (lockup_detected(h)) {
>>>>    		dev_warn(&h->pdev->dev,
>>>> -			"scsi %d:%d:%d:%d RESET FAILED, lockup detected\n",
>>>> -			h->scsi_host->host_no, dev->bus, dev->target,
>>>> -			dev->lun);
>>>> +			 "scsi %d:%d:%d:%u cmd %d RESET FAILED, lockup detected\n",
>>>> +			 h->scsi_host->host_no, dev->bus, dev->target, dev->lun,
>>>> +			 hpsa_get_cmd_index(scsicmd));
>>>>    		return FAILED;
>>>>    	}
>>>>    
>>>>    	/* this reset request might be the result of a lockup; check */
>>>>    	if (detect_controller_lockup(h)) {
>>>>    		dev_warn(&h->pdev->dev,
>>>> -			 "scsi %d:%d:%d:%d RESET FAILED, new lockup detected\n",
>>>> +			 "scsi %d:%d:%d:%u cmd %d RESET FAILED, new lockup detected\n",
>>>>    			 h->scsi_host->host_no, dev->bus, dev->target,
>>>> -			 dev->lun);
>>>> +			 dev->lun, hpsa_get_cmd_index(scsicmd));
>>>>    		return FAILED;
>>>>    	}
>>>>    
>>>> @@ -5442,6 +5475,59 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
>>>>    }
>>>>    
>>>>    /*
>>>> + * For operations with an associated SCSI command, a command block is allocated
>>>> + * at init, and managed by cmd_tagged_alloc() and cmd_tagged_free() using the
>>>> + * block request tag as an index into a table of entries.  cmd_tagged_free() is
>>>> + * the complement, although cmd_free() may be called instead.
>>>> + */
>>>> +static struct CommandList *cmd_tagged_alloc(struct ctlr_info *h,
>>>> +					    struct scsi_cmnd *scmd)
>>>> +{
>>>> +	int idx = hpsa_get_cmd_index(scmd);
>>>> +	struct CommandList *c = h->cmd_pool + idx;
>>>> +	int refcount = 0;
>>>> +
>>>> +	if (idx < HPSA_NRESERVED_CMDS || idx >= h->nr_cmds) {
>>>> +		dev_err(&h->pdev->dev, "Bad block tag: %d not in [%d..%d]\n",
>>>> +			idx, HPSA_NRESERVED_CMDS, h->nr_cmds - 1);
>>>> +		/* The index value comes from the block layer, so if it's out of
>>>> +		 * bounds, it's probably not our bug.
>>>> +		 */
>>>> +		BUG();
>>>> +	}
>>>> +
>>>> +	refcount = atomic_inc_return(&c->refcount);
>>> refcount is never used, use atomic_inc(&c->refcount); instead?
>>>
>>>> +	if (unlikely(!hpsa_is_cmd_idle(c))) {
>>>> +		/*
>>>> +		 * We expect that the SCSI layer will hand us a unique tag
>>>> +		 * value.  Thus, there should never be a collision here between
>>>> +		 * two requests...because if the selected command isn't idle
>>>> +		 * then someone is going to be very disappointed.
>>>> +		 */
>>>> +		dev_err(&h->pdev->dev,
>>>> +			"tag collision (tag=%d) in cmd_tagged_alloc().\n",
>>>> +			idx);
>>>> +		if (c->scsi_cmd != NULL)
>>>> +			scsi_print_command(c->scsi_cmd);
>>>> +		scsi_print_command(scmd);
>>>> +	}
>>>> +
>>>> +	hpsa_cmd_partial_init(h, idx, c);
>>>> +	return c;
>>>> +}
>>>> +
>>>> +static void cmd_tagged_free(struct ctlr_info *h, struct CommandList *c)
>>>> +{
>>>> +	/*
>>>> +	 * Release our reference to the block.  We don't need to do anything
>>>> +	 * else to free it, because it is accessed by index.  (There's no point
>>>> +	 * in checking the result of the decrement, since we cannot guarantee
>>>> +	 * that there isn't a concurrent abort which is also accessing it.)
>>>> +	 */
>>>> +	(void)atomic_dec(&c->refcount);
>>>> +}
>>>> +
>>>> +/*
>>>>     * For operations that cannot sleep, a command block is allocated at init,
>>>>     * and managed by cmd_alloc() and cmd_free() using a simple bitmap to track
>>>>     * which ones are free or in use.  Lock must be held when calling this.
>>>> @@ -5454,7 +5540,6 @@ static struct CommandList *cmd_alloc(struct ctlr_info *h)
>>>>    {
>>>>    	struct CommandList *c;
>>>>    	int refcount, i;
>>>> -	unsigned long offset;
>>>>    
>>>>    	/*
>>>>    	 * There is some *extremely* small but non-zero chance that that
>>>> @@ -5466,31 +5551,39 @@ static struct CommandList *cmd_alloc(struct ctlr_info *h)
>>>>    	 * very unlucky thread might be starved anyway, never able to
>>>>    	 * beat the other threads.  In reality, this happens so
>>>>    	 * infrequently as to be indistinguishable from never.
>>>> +	 *
>>>> +	 * Note that we start allocating commands before the SCSI host structure
>>>> +	 * is initialized.  Since the search starts at bit zero, this
>>>> +	 * all works, since we have at least one command structure available;
>>>> +	 * however, it means that the structures with the low indexes have to be
>>>> +	 * reserved for driver-initiated requests, while requests from the block
>>>> +	 * layer will use the higher indexes.
>>>>    	 */
>>>>    
>>>> -	offset = h->last_allocation; /* benignly racy */
>>>>    	for (;;) {
>>>> -		i = find_next_zero_bit(h->cmd_pool_bits, h->nr_cmds, offset);
>>>> -		if (unlikely(i == h->nr_cmds)) {
>>>> -			offset = 0;
>>>> +		i = find_first_zero_bit(h->cmd_pool_bits, HPSA_NRESERVED_CMDS);
>>>> +		if (unlikely(i >= HPSA_NRESERVED_CMDS))
>>>>    			continue;
>>>> -		}
>>>>    		c = h->cmd_pool + i;
>>>>    		refcount = atomic_inc_return(&c->refcount);
>>>>    		if (unlikely(refcount > 1)) {
>>>>    			cmd_free(h, c); /* already in use */
>>>> -			offset = (i + 1) % h->nr_cmds;
>>> Hi Don,
>>> when this happens - a command has its bitfield flag cleared but is still taken (refcount is > 1) -
>>> it will likely stay that way for the next several thousand passes through this function until it is freed.
>>> When it is the first bit in the bitfield, it will block all following commands sent to the card for that time.
>>> The previous variant 'find_next_zero_bit + offset = (i + 1) % h->nr_cmds' seems to handle this better.
>>> Cheers, Tomas
>>>
>>>>    			continue;
>>>>    		}
>>>>    		set_bit(i & (BITS_PER_LONG - 1),
>>>>    			h->cmd_pool_bits + (i / BITS_PER_LONG));
>>>>    		break; /* it's ours now. */
>>>>    	}
>>>> -	h->last_allocation = i; /* benignly racy */
>>>>    	hpsa_cmd_partial_init(h, i, c);
>>>>    	return c;
>>>>    }
>>>>    
>>>> +/*
>>>> + * This is the complementary operation to cmd_alloc().  Note, however, in some
>>>> + * corner cases it may also be used to free blocks allocated by
>>>> + * cmd_tagged_alloc() in which case the ref-count decrement does the trick and
>>>> + * the clear-bit is harmless.
>>>> + */
>>>>    static void cmd_free(struct ctlr_info *h, struct CommandList *c)
>>>>    {
>>>>    	if (atomic_dec_and_test(&c->refcount)) {
>>>> diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
>>>> index 3ec8934..2536b67 100644
>>>> --- a/drivers/scsi/hpsa.h
>>>> +++ b/drivers/scsi/hpsa.h
>>>> @@ -141,7 +141,6 @@ struct ctlr_info {
>>>>    	struct CfgTable __iomem *cfgtable;
>>>>    	int	interrupts_enabled;
>>>>    	int 	max_commands;
>>>> -	int last_allocation;
>>>>    	atomic_t commands_outstanding;
>>>>    #	define PERF_MODE_INT	0
>>>>    #	define DOORBELL_INT	1
>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 37/42] hpsa: use block layer tag for command allocation
  2015-03-26 15:10             ` Tomas Henzl
@ 2015-03-26 15:18               ` Webb Scales
  2015-04-10 15:13                 ` James Bottomley
  0 siblings, 1 reply; 54+ messages in thread
From: Webb Scales @ 2015-03-26 15:18 UTC (permalink / raw)
  To: Tomas Henzl, Brace, Don
  Cc: Teel, Scott Stacy, kevin.barnett, james.bottomley,
	Christoph Hellwig, Lindley, Justin, SCSI development list

At this point, I've turned this work over to Don; I presume he'll want 
to address the bitmask search. I'll leave it up to him to decide what to 
do about the refcount.


                 Webb


On 3/26/15 11:10 AM, Tomas Henzl wrote:
> On 03/26/2015 03:38 PM, Webb Scales wrote:
>> Ah!  Tomas, you are absolutely correct!  The loop should not be
>> restarting the search from the beginning of the bitfield, but rather
>> should be proceeding to check the next bit.  (Otherwise, there's no
>> point in having more than one bit!!)
> Most of the time it will work as expected, my comment was about a corner case.
> Are you going to repost this patch? Btw. I think that a local variable
> to restart the loop with an incremented value is enough.
>
>> (This code has been tweaked so many times that when I read it now I no
>> longer see what it actually does....)
>>
>> And, Tomas, you have a good point regarding the difference between
>> atomic_inc() and atomic_inc_return(), but, again, the difference is only
>> a couple of instructions in the context of a long code path, so I think
>> it's a difference without a distinction.  (And, I'm looking forward to
>> the day when all of the reference counting stuff can be removed...I
>> think it's not far off.)
> If I got it right, the refcount exists only because of the error handler;
> I also think it might be rewritten so that it could be removed.
> The atomic_inc/atomic_inc_return issue is a small detail, but if you repost
> the patch, why not fix it too?
>
> tomash
>
>>
>>                   Webb
>>
>>
>>
>> On 3/26/15 8:47 AM, Tomas Henzl wrote:
>>> On 03/25/2015 07:33 PM, Webb Scales wrote:
>>>> Tomas,
>>>>
>>>> You are correct that the previous approach of using find_next_zero_bit()
>>>> with a persistent offset is more run-time efficient in the worst case;
>>>> however, given that "normal" operation doesn't call this allocation
>>>> routine, and given that, when this routine is called, the bit mask being
>>>> searched only has about 16 bits in it, I opted for simplicity over
>>>> efficiency -- that is, I doubt that the difference in efficiency is
>>>> discernible, while getting rid of the last_allocation field is a
>>>> worthwhile savings in both memory use and code.
>>> My comment is not about efficiency; I believe that when you measure it you won't be
>>> able to measure any significant difference.
>>> But if this '(unlikely(refcount > 1))' is true for, let's say, the first entry in the bitfield,
>>> this and any other command submitted later will not get past cmd_alloc until the command that
>>> caused '(unlikely(refcount > 1))' to be true is resolved. That might cause unexpected
>>> behavior.
>>>
>>>> Regarding your earlier comment on the refcount variable, I believe that
>>>> it was handy for debugging purposes.  The code has undergone numerous
>>>> revisions, and the variable certainly could now be removed from the
>>>> source per your suggestion.  (Of course, the compiler is already
>>>> removing it, I'm sure. ;-) )
>>> Not sure if the compiler is able to switch from 'atomic_inc_return' to 'atomic_inc' :) though.
>>> It is not important. (I wouldn't have commented on this without the other cmd_alloc inaccuracy.)
>>>
>>> tomash
>>>
>>>>                    Webb
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Tomas Henzl [mailto:thenzl@redhat.com]
>>>> Sent: Monday, March 23, 2015 11:58 AM
>>>> To: Don Brace; Scott Teel; Kevin Barnett; james.bottomley@parallels.com; hch@infradead.org; Justin Lindley; brace
>>>> Cc: linux-scsi@vger.kernel.org
>>>> Subject: Re: [PATCH v3 37/42] hpsa: use block layer tag for command allocation
>>>>
>>>> On 03/17/2015 09:06 PM, Don Brace wrote:
>>>>
>>>>> From: Webb Scales <webbnh@hp.com>
>>>>>
>>>>> Rework slave allocation:
>>>>>      - separate the tagging support setup from the hostdata setup
>>>>>      - make the hostdata setup act consistently when the lookup fails
>>>>>      - make the hostdata setup act consistently when the device is not added
>>>>>      - set up the queue depth consistently across these scenarios
>>>>>      - if the block layer mq support is not available, explicitly enable and
>>>>>        activate the SCSI layer tcq support (and do this at allocation-time so
>>>>>        that the tags will be available for INQUIRY commands)
>>>>>
>>>>> Tweak slave configuration so that devices which are masked are also
>>>>> not attached.
>>>>>
>>>>> Reviewed-by: Scott Teel <scott.teel@pmcs.com>
>>>>> Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
>>>>> Signed-off-by: Webb Scales <webbnh@hp.com>
>>>>> Signed-off-by: Don Brace <don.brace@pmcs.com>
>>>>> ---
>>>>>     drivers/scsi/hpsa.c |  153 +++++++++++++++++++++++++++++++++++++++++----------
>>>>>     drivers/scsi/hpsa.h |    1
>>>>>     2 files changed, 123 insertions(+), 31 deletions(-)
>>>>>
>>>>> diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
>>>>> index 34c178c..4e34a62 100644
>>>>> --- a/drivers/scsi/hpsa.c
>>>>> +++ b/drivers/scsi/hpsa.c
>>>>> @@ -44,6 +44,7 @@
>>>>>     #include <scsi/scsi_host.h>
>>>>>     #include <scsi/scsi_tcq.h>
>>>>>     #include <scsi/scsi_eh.h>
>>>>> +#include <scsi/scsi_dbg.h>
>>>>>     #include <linux/cciss_ioctl.h>
>>>>>     #include <linux/string.h>
>>>>>     #include <linux/bitmap.h>
>>>>> @@ -212,6 +213,9 @@ static int hpsa_compat_ioctl(struct scsi_device *dev, int cmd,
>>>>>     
>>>>>     static void cmd_free(struct ctlr_info *h, struct CommandList *c);
>>>>>     static struct CommandList *cmd_alloc(struct ctlr_info *h);
>>>>> +static void cmd_tagged_free(struct ctlr_info *h, struct CommandList *c);
>>>>> +static struct CommandList *cmd_tagged_alloc(struct ctlr_info *h,
>>>>> +					    struct scsi_cmnd *scmd);
>>>>>     static int fill_cmd(struct CommandList *c, u8 cmd, struct ctlr_info *h,
>>>>>     	void *buff, size_t size, u16 page_code, unsigned char *scsi3addr,
>>>>>     	int cmd_type);
>>>>> @@ -2047,11 +2051,17 @@ static void hpsa_cmd_resolve_events(struct ctlr_info *h,
>>>>>     	}
>>>>>     }
>>>>>     
>>>>> +static void hpsa_cmd_resolve_and_free(struct ctlr_info *h,
>>>>> +				      struct CommandList *c)
>>>>> +{
>>>>> +	hpsa_cmd_resolve_events(h, c);
>>>>> +	cmd_tagged_free(h, c);
>>>>> +}
>>>>> +
>>>>>     static void hpsa_cmd_free_and_done(struct ctlr_info *h,
>>>>>     		struct CommandList *c, struct scsi_cmnd *cmd)
>>>>>     {
>>>>> -	hpsa_cmd_resolve_events(h, c);
>>>>> -	cmd_free(h, c);
>>>>> +	hpsa_cmd_resolve_and_free(h, c);
>>>>>     	cmd->scsi_done(cmd);
>>>>>     }
>>>>>     
>>>>> @@ -2072,8 +2082,7 @@ static void hpsa_cmd_abort_and_free(struct ctlr_info *h, struct CommandList *c,
>>>>>     	hpsa_set_scsi_cmd_aborted(cmd);
>>>>>     	dev_warn(&h->pdev->dev, "CDB %16phN was aborted with status 0x%x\n",
>>>>>     			 c->Request.CDB, c->err_info->ScsiStatus);
>>>>> -	hpsa_cmd_resolve_events(h, c);
>>>>> -	cmd_free(h, c);		/* FIX-ME:  change to cmd_tagged_free(h, c) */
>>>>> +	hpsa_cmd_resolve_and_free(h, c);
>>>>>     }
>>>>>     
>>>>>     static void process_ioaccel2_completion(struct ctlr_info *h,
>>>>> @@ -4535,7 +4544,7 @@ static int hpsa_ciss_submit(struct ctlr_info *h,
>>>>>     	}
>>>>>     
>>>>>     	if (hpsa_scatter_gather(h, c, cmd) < 0) { /* Fill SG list */
>>>>> -		cmd_free(h, c);
>>>>> +		hpsa_cmd_resolve_and_free(h, c);
>>>>>     		return SCSI_MLQUEUE_HOST_BUSY;
>>>>>     	}
>>>>>     	enqueue_cmd_and_start_io(h, c);
>>>>> @@ -4581,6 +4590,8 @@ static inline void hpsa_cmd_partial_init(struct ctlr_info *h, int index,
>>>>>     {
>>>>>     	dma_addr_t cmd_dma_handle = h->cmd_pool_dhandle + index * sizeof(*c);
>>>>>     
>>>>> +	BUG_ON(c->cmdindex != index);
>>>>> +
>>>>>     	memset(c->Request.CDB, 0, sizeof(c->Request.CDB));
>>>>>     	memset(c->err_info, 0, sizeof(*c->err_info));
>>>>>     	c->busaddr = (u32) cmd_dma_handle;
>>>>> @@ -4675,27 +4686,24 @@ static int hpsa_scsi_queue_command(struct Scsi_Host *sh, struct scsi_cmnd *cmd)
>>>>>     
>>>>>     	/* Get the ptr to our adapter structure out of cmd->host. */
>>>>>     	h = sdev_to_hba(cmd->device);
>>>>> +
>>>>> +	BUG_ON(cmd->request->tag < 0);
>>>>> +
>>>>>     	dev = cmd->device->hostdata;
>>>>>     	if (!dev) {
>>>>>     		cmd->result = DID_NO_CONNECT << 16;
>>>>>     		cmd->scsi_done(cmd);
>>>>>     		return 0;
>>>>>     	}
>>>>> -	memcpy(scsi3addr, dev->scsi3addr, sizeof(scsi3addr));
>>>>>     
>>>>> -	if (unlikely(lockup_detected(h))) {
>>>>> -		cmd->result = DID_NO_CONNECT << 16;
>>>>> -		cmd->scsi_done(cmd);
>>>>> -		return 0;
>>>>> -	}
>>>>> -	c = cmd_alloc(h);
>>>>> +	memcpy(scsi3addr, dev->scsi3addr, sizeof(scsi3addr));
>>>>>     
>>>>>     	if (unlikely(lockup_detected(h))) {
>>>>>     		cmd->result = DID_NO_CONNECT << 16;
>>>>> -		cmd_free(h, c);
>>>>>     		cmd->scsi_done(cmd);
>>>>>     		return 0;
>>>>>     	}
>>>>> +	c = cmd_tagged_alloc(h, cmd);
>>>>>     
>>>>>     	/*
>>>>>     	 * Call alternate submit routine for I/O accelerated commands.
>>>>> @@ -4708,7 +4716,7 @@ static int hpsa_scsi_queue_command(struct Scsi_Host *sh, struct scsi_cmnd *cmd)
>>>>>     		if (rc == 0)
>>>>>     			return 0;
>>>>>     		if (rc == SCSI_MLQUEUE_HOST_BUSY) {
>>>>> -			cmd_free(h, c);
>>>>> +			hpsa_cmd_resolve_and_free(h, c);
>>>>>     			return SCSI_MLQUEUE_HOST_BUSY;
>>>>>     		}
>>>>>     	}
>>>>> @@ -4822,15 +4830,23 @@ static int hpsa_register_scsi(struct ctlr_info *h)
>>>>>     	sh->hostdata[0] = (unsigned long) h;
>>>>>     	sh->irq = h->intr[h->intr_mode];
>>>>>     	sh->unique_id = sh->irq;
>>>>> +	error = scsi_init_shared_tag_map(sh, sh->can_queue);
>>>>> +	if (error) {
>>>>> +		dev_err(&h->pdev->dev,
>>>>> +			"%s: scsi_init_shared_tag_map failed for controller %d\n",
>>>>> +			__func__, h->ctlr);
>>>>> +		goto fail_host_put;
>>>>> +	}
>>>>>     	error = scsi_add_host(sh, &h->pdev->dev);
>>>>> -	if (error)
>>>>> +	if (error) {
>>>>> +		dev_err(&h->pdev->dev, "%s: scsi_add_host failed for controller %d\n",
>>>>> +			__func__, h->ctlr);
>>>>>     		goto fail_host_put;
>>>>> +	}
>>>>>     	scsi_scan_host(sh);
>>>>>     	return 0;
>>>>>     
>>>>>      fail_host_put:
>>>>> -	dev_err(&h->pdev->dev, "%s: scsi_add_host"
>>>>> -		" failed for controller %d\n", __func__, h->ctlr);
>>>>>     	scsi_host_put(sh);
>>>>>     	return error;
>>>>>      fail:
>>>>> @@ -4840,6 +4856,23 @@ static int hpsa_register_scsi(struct ctlr_info *h)
>>>>>     }
>>>>>     
>>>>>     /*
>>>>> + * The block layer has already gone to the trouble of picking out a unique,
>>>>> + * small-integer tag for this request.  We use an offset from that value as
>>>>> + * an index to select our command block.  (The offset allows us to reserve the
>>>>> + * low-numbered entries for our own uses.)
>>>>> + */
>>>>> +static int hpsa_get_cmd_index(struct scsi_cmnd *scmd)
>>>>> +{
>>>>> +	int idx = scmd->request->tag;
>>>>> +
>>>>> +	if (idx < 0)
>>>>> +		return idx;
>>>>> +
>>>>> +	/* Offset to leave space for internal cmds. */
>>>>> +	return idx += HPSA_NRESERVED_CMDS;
>>>>> +}
>>>>> +
>>>>> +/*
>>>>>      * Send a TEST_UNIT_READY command to the specified LUN using the specified
>>>>>      * reply queue; returns zero if the unit is ready, and non-zero otherwise.
>>>>>      */
>>>>> @@ -4979,18 +5012,18 @@ static int hpsa_eh_device_reset_handler(struct scsi_cmnd *scsicmd)
>>>>>     	/* if controller locked up, we can guarantee command won't complete */
>>>>>     	if (lockup_detected(h)) {
>>>>>     		dev_warn(&h->pdev->dev,
>>>>> -			"scsi %d:%d:%d:%d RESET FAILED, lockup detected\n",
>>>>> -			h->scsi_host->host_no, dev->bus, dev->target,
>>>>> -			dev->lun);
>>>>> +			 "scsi %d:%d:%d:%u cmd %d RESET FAILED, lockup detected\n",
>>>>> +			 h->scsi_host->host_no, dev->bus, dev->target, dev->lun,
>>>>> +			 hpsa_get_cmd_index(scsicmd));
>>>>>     		return FAILED;
>>>>>     	}
>>>>>     
>>>>>     	/* this reset request might be the result of a lockup; check */
>>>>>     	if (detect_controller_lockup(h)) {
>>>>>     		dev_warn(&h->pdev->dev,
>>>>> -			 "scsi %d:%d:%d:%d RESET FAILED, new lockup detected\n",
>>>>> +			 "scsi %d:%d:%d:%u cmd %d RESET FAILED, new lockup detected\n",
>>>>>     			 h->scsi_host->host_no, dev->bus, dev->target,
>>>>> -			 dev->lun);
>>>>> +			 dev->lun, hpsa_get_cmd_index(scsicmd));
>>>>>     		return FAILED;
>>>>>     	}
>>>>>     
>>>>> @@ -5442,6 +5475,59 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
>>>>>     }
>>>>>     
>>>>>     /*
>>>>> + * For operations with an associated SCSI command, a command block is allocated
>>>>> + * at init, and managed by cmd_tagged_alloc() and cmd_tagged_free() using the
>>>>> + * block request tag as an index into a table of entries.  cmd_tagged_free() is
>>>>> + * the complement, although cmd_free() may be called instead.
>>>>> + */
>>>>> +static struct CommandList *cmd_tagged_alloc(struct ctlr_info *h,
>>>>> +					    struct scsi_cmnd *scmd)
>>>>> +{
>>>>> +	int idx = hpsa_get_cmd_index(scmd);
>>>>> +	struct CommandList *c = h->cmd_pool + idx;
>>>>> +	int refcount = 0;
>>>>> +
>>>>> +	if (idx < HPSA_NRESERVED_CMDS || idx >= h->nr_cmds) {
>>>>> +		dev_err(&h->pdev->dev, "Bad block tag: %d not in [%d..%d]\n",
>>>>> +			idx, HPSA_NRESERVED_CMDS, h->nr_cmds - 1);
>>>>> +		/* The index value comes from the block layer, so if it's out of
>>>>> +		 * bounds, it's probably not our bug.
>>>>> +		 */
>>>>> +		BUG();
>>>>> +	}
>>>>> +
>>>>> +	refcount = atomic_inc_return(&c->refcount);
>>>> refcount is never used, use atomic_inc(&c->refcount); instead?
>>>>
>>>>> +	if (unlikely(!hpsa_is_cmd_idle(c))) {
>>>>> +		/*
>>>>> +		 * We expect that the SCSI layer will hand us a unique tag
>>>>> +		 * value.  Thus, there should never be a collision here between
>>>>> +		 * two requests...because if the selected command isn't idle
>>>>> +		 * then someone is going to be very disappointed.
>>>>> +		 */
>>>>> +		dev_err(&h->pdev->dev,
>>>>> +			"tag collision (tag=%d) in cmd_tagged_alloc().\n",
>>>>> +			idx);
>>>>> +		if (c->scsi_cmd != NULL)
>>>>> +			scsi_print_command(c->scsi_cmd);
>>>>> +		scsi_print_command(scmd);
>>>>> +	}
>>>>> +
>>>>> +	hpsa_cmd_partial_init(h, idx, c);
>>>>> +	return c;
>>>>> +}
>>>>> +
>>>>> +static void cmd_tagged_free(struct ctlr_info *h, struct CommandList *c)
>>>>> +{
>>>>> +	/*
>>>>> +	 * Release our reference to the block.  We don't need to do anything
>>>>> +	 * else to free it, because it is accessed by index.  (There's no point
>>>>> +	 * in checking the result of the decrement, since we cannot guarantee
>>>>> +	 * that there isn't a concurrent abort which is also accessing it.)
>>>>> +	 */
>>>>> +	(void)atomic_dec(&c->refcount);
>>>>> +}
>>>>> +
>>>>> +/*
>>>>>      * For operations that cannot sleep, a command block is allocated at init,
>>>>>      * and managed by cmd_alloc() and cmd_free() using a simple bitmap to track
>>>>>      * which ones are free or in use.  Lock must be held when calling this.
>>>>> @@ -5454,7 +5540,6 @@ static struct CommandList *cmd_alloc(struct ctlr_info *h)
>>>>>     {
>>>>>     	struct CommandList *c;
>>>>>     	int refcount, i;
>>>>> -	unsigned long offset;
>>>>>     
>>>>>     	/*
>>>>>     	 * There is some *extremely* small but non-zero chance that that
>>>>> @@ -5466,31 +5551,39 @@ static struct CommandList *cmd_alloc(struct ctlr_info *h)
>>>>>     	 * very unlucky thread might be starved anyway, never able to
>>>>>     	 * beat the other threads.  In reality, this happens so
>>>>>     	 * infrequently as to be indistinguishable from never.
>>>>> +	 *
>>>>> +	 * Note that we start allocating commands before the SCSI host structure
>>>>> +	 * is initialized.  Since the search starts at bit zero, this
>>>>> +	 * all works, since we have at least one command structure available;
>>>>> +	 * however, it means that the structures with the low indexes have to be
>>>>> +	 * reserved for driver-initiated requests, while requests from the block
>>>>> +	 * layer will use the higher indexes.
>>>>>     	 */
>>>>>     
>>>>> -	offset = h->last_allocation; /* benignly racy */
>>>>>     	for (;;) {
>>>>> -		i = find_next_zero_bit(h->cmd_pool_bits, h->nr_cmds, offset);
>>>>> -		if (unlikely(i == h->nr_cmds)) {
>>>>> -			offset = 0;
>>>>> +		i = find_first_zero_bit(h->cmd_pool_bits, HPSA_NRESERVED_CMDS);
>>>>> +		if (unlikely(i >= HPSA_NRESERVED_CMDS))
>>>>>     			continue;
>>>>> -		}
>>>>>     		c = h->cmd_pool + i;
>>>>>     		refcount = atomic_inc_return(&c->refcount);
>>>>>     		if (unlikely(refcount > 1)) {
>>>>>     			cmd_free(h, c); /* already in use */
>>>>> -			offset = (i + 1) % h->nr_cmds;
>>>> Hi Don,
>>>> when this happens - a command has its bitfield flag cleared but is still taken (refcount is > 1) -
>>>> it will likely stay that way for the next several thousand passes through this function until it is freed.
>>>> When it is the first bit in the bitfield, it will block all following commands sent to the card for that time.
>>>> The previous variant 'find_next_zero_bit + offset = (i + 1) % h->nr_cmds' seems to handle this better.
>>>> Cheers, Tomas
>>>>
>>>>>     			continue;
>>>>>     		}
>>>>>     		set_bit(i & (BITS_PER_LONG - 1),
>>>>>     			h->cmd_pool_bits + (i / BITS_PER_LONG));
>>>>>     		break; /* it's ours now. */
>>>>>     	}
>>>>> -	h->last_allocation = i; /* benignly racy */
>>>>>     	hpsa_cmd_partial_init(h, i, c);
>>>>>     	return c;
>>>>>     }
>>>>>     
>>>>> +/*
>>>>> + * This is the complementary operation to cmd_alloc().  Note, however, in some
>>>>> + * corner cases it may also be used to free blocks allocated by
>>>>> + * cmd_tagged_alloc() in which case the ref-count decrement does the trick and
>>>>> + * the clear-bit is harmless.
>>>>> + */
>>>>>     static void cmd_free(struct ctlr_info *h, struct CommandList *c)
>>>>>     {
>>>>>     	if (atomic_dec_and_test(&c->refcount)) {
>>>>> diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
>>>>> index 3ec8934..2536b67 100644
>>>>> --- a/drivers/scsi/hpsa.h
>>>>> +++ b/drivers/scsi/hpsa.h
>>>>> @@ -141,7 +141,6 @@ struct ctlr_info {
>>>>>     	struct CfgTable __iomem *cfgtable;
>>>>>     	int	interrupts_enabled;
>>>>>     	int 	max_commands;
>>>>> -	int last_allocation;
>>>>>     	atomic_t commands_outstanding;
>>>>>     #	define PERF_MODE_INT	0
>>>>>     #	define DOORBELL_INT	1
>>>>>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 03/42] hpsa: rework controller command submission
  2015-03-17 20:02 ` [PATCH v3 03/42] hpsa: rework controller command submission Don Brace
@ 2015-03-27 15:11   ` Tomas Henzl
  2015-03-27 18:04     ` brace
  0 siblings, 1 reply; 54+ messages in thread
From: Tomas Henzl @ 2015-03-27 15:11 UTC (permalink / raw)
  To: Don Brace, scott.teel, Kevin.Barnett, james.bottomley, hch,
	Justin.Lindley
  Cc: linux-scsi

On 03/17/2015 09:02 PM, Don Brace wrote:
> From: Webb Scales <webb.scales@hp.com>
>
> Allow driver-initiated commands to have a timeout.  It does not
> yet try to do anything with timeouts on such commands.
>
> We are sending a reset in order to get rid of a command we want to abort.
> If we make it return on the same reply queue as the command we want to abort,
> the completion of the aborted command will not race with the completion of
> the reset command.
>
> Rename hpsa_scsi_do_simple_cmd_core() to hpsa_scsi_do_simple_cmd(), since
> this function is the interface for issuing commands to the controller and
> not the "core" of that implementation.  Add a parameter to it which allows
> the caller to specify the reply queue to be used.  Modify existing callers
> to specify the default reply queue.
>
> Rename __hpsa_scsi_do_simple_cmd_core() to hpsa_scsi_do_simple_cmd_core(),
> since this routine is the "core" implementation of the "do simple command"
> function and there is no longer any other function with a similar name.
> Modify the existing callers of this routine (other than
> hpsa_scsi_do_simple_cmd()) to instead call hpsa_scsi_do_simple_cmd(), since
> it will now accept the reply_queue parameter, and it provides a controller
> lock-up check.  (Also, tweak two related message strings to make them
> distinct from each other.)
>
> Submitting a command to a locked up controller always results in a timeout,
> so check for controller lock-up before submitting.
>
> This is to enable fixing a race between command completions and
> abort completions on different reply queues in a subsequent patch.
> We want to be able to specify which reply queue an abort completion
> should occur on so that it cannot race the completion of the command
> it is trying to abort.
>
> The following race was possible in theory:
>
>   1. Abort command is sent to hardware.
>   2. Command to be aborted simultaneously completes on another
>      reply queue.
>   3. Hardware receives abort command, decides command has already
>      completed and indicates this to the driver via another
>      reply queue.
>   4. Driver processes the abort completion, finds that the hardware
>      does not know about the command, concludes that therefore the
>      command cannot complete, and returns SUCCESS, indicating to the
>      mid-layer that the scsi_cmnd may be re-used.
>   5. Command from step 2 is processed and completed back to scsi mid
>      layer (after we already promised that would never happen.)
>
> Fix by forcing aborts to complete on the same reply queue as the command
> they are aborting.
>
> Piggybacking device rescanning functionality onto the lockup
> detection thread is not a good idea because if the controller
> locks up during device rescanning, then the thread could get
> stuck, then the lockup isn't detected.  Use separate work
> queues for device rescanning and lockup detection.
>
> Detect controller lockup in abort handler.
>
> After a lockup is detected, return DID_NO_CONNECT, which results in immediate
> termination of commands, rather than DID_ERROR, which results in retries.
>
> Modify detect_controller_lockup() to return the result, to remove the need for a separate check.
>
> Reviewed-by: Scott Teel <scott.teel@pmcs.com>
> Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
> Signed-off-by: Webb Scales <webbnh@hp.com>
> Signed-off-by: Don Brace <don.brace@pmcs.com>
> ---
>  drivers/scsi/hpsa.c     |  326 ++++++++++++++++++++++++++++++++++++-----------
>  drivers/scsi/hpsa_cmd.h |    5 +
>  2 files changed, 254 insertions(+), 77 deletions(-)
>
> diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
> index 9b88726..488f81b 100644
> --- a/drivers/scsi/hpsa.c
> +++ b/drivers/scsi/hpsa.c
> @@ -253,6 +253,8 @@ static int hpsa_scsi_ioaccel_queue_command(struct ctlr_info *h,
>  	struct CommandList *c, u32 ioaccel_handle, u8 *cdb, int cdb_len,
>  	u8 *scsi3addr, struct hpsa_scsi_dev_t *phys_disk);
>  static void hpsa_command_resubmit_worker(struct work_struct *work);
> +static u32 lockup_detected(struct ctlr_info *h);
> +static int detect_controller_lockup(struct ctlr_info *h);
>  
>  static inline struct ctlr_info *sdev_to_hba(struct scsi_device *sdev)
>  {
> @@ -748,30 +750,43 @@ static inline u32 next_command(struct ctlr_info *h, u8 q)
>   * a separate special register for submitting commands.
>   */
>  
> -/* set_performant_mode: Modify the tag for cciss performant
> +/*
> + * set_performant_mode: Modify the tag for cciss performant
>   * set bit 0 for pull model, bits 3-1 for block fetch
>   * register number
>   */
> -static void set_performant_mode(struct ctlr_info *h, struct CommandList *c)
> +#define DEFAULT_REPLY_QUEUE (-1)
> +static void set_performant_mode(struct ctlr_info *h, struct CommandList *c,
> +					int reply_queue)
>  {
>  	if (likely(h->transMethod & CFGTBL_Trans_Performant)) {
>  		c->busaddr |= 1 | (h->blockFetchTable[c->Header.SGList] << 1);
> -		if (likely(h->msix_vector > 0))
> +		if (unlikely(!h->msix_vector))
> +			return;
> +		if (likely(reply_queue == DEFAULT_REPLY_QUEUE))
>  			c->Header.ReplyQueue =
>  				raw_smp_processor_id() % h->nreply_queues;
> +		else
> +			c->Header.ReplyQueue = reply_queue % h->nreply_queues;
>  	}
>  }
>  
>  static void set_ioaccel1_performant_mode(struct ctlr_info *h,
> -						struct CommandList *c)
> +						struct CommandList *c,
> +						int reply_queue)
>  {
>  	struct io_accel1_cmd *cp = &h->ioaccel_cmd_pool[c->cmdindex];
>  
> -	/* Tell the controller to post the reply to the queue for this
> +	/*
> +	 * Tell the controller to post the reply to the queue for this
>  	 * processor.  This seems to give the best I/O throughput.
>  	 */
> -	cp->ReplyQueue = smp_processor_id() % h->nreply_queues;
> -	/* Set the bits in the address sent down to include:
> +	if (likely(reply_queue == DEFAULT_REPLY_QUEUE))
> +		cp->ReplyQueue = smp_processor_id() % h->nreply_queues;
> +	else
> +		cp->ReplyQueue = reply_queue % h->nreply_queues;
> +	/*
> +	 * Set the bits in the address sent down to include:
>  	 *  - performant mode bit (bit 0)
>  	 *  - pull count (bits 1-3)
>  	 *  - command type (bits 4-6)
> @@ -781,15 +796,21 @@ static void set_ioaccel1_performant_mode(struct ctlr_info *h,
>  }
>  
>  static void set_ioaccel2_performant_mode(struct ctlr_info *h,
> -						struct CommandList *c)
> +						struct CommandList *c,
> +						int reply_queue)
>  {
>  	struct io_accel2_cmd *cp = &h->ioaccel2_cmd_pool[c->cmdindex];
>  
> -	/* Tell the controller to post the reply to the queue for this
> +	/*
> +	 * Tell the controller to post the reply to the queue for this
>  	 * processor.  This seems to give the best I/O throughput.
>  	 */
> -	cp->reply_queue = smp_processor_id() % h->nreply_queues;
> -	/* Set the bits in the address sent down to include:
> +	if (likely(reply_queue == DEFAULT_REPLY_QUEUE))
> +		cp->reply_queue = smp_processor_id() % h->nreply_queues;
> +	else
> +		cp->reply_queue = reply_queue % h->nreply_queues;
> +	/*
> +	 * Set the bits in the address sent down to include:
>  	 *  - performant mode bit not used in ioaccel mode 2
>  	 *  - pull count (bits 0-3)
>  	 *  - command type isn't needed for ioaccel2
> @@ -826,26 +847,32 @@ static void dial_up_lockup_detection_on_fw_flash_complete(struct ctlr_info *h,
>  		h->heartbeat_sample_interval = HEARTBEAT_SAMPLE_INTERVAL;
>  }
>  
> -static void enqueue_cmd_and_start_io(struct ctlr_info *h,
> -	struct CommandList *c)
> +static void __enqueue_cmd_and_start_io(struct ctlr_info *h,
> +	struct CommandList *c, int reply_queue)
>  {
>  	dial_down_lockup_detection_during_fw_flash(h, c);
>  	atomic_inc(&h->commands_outstanding);
>  	switch (c->cmd_type) {
>  	case CMD_IOACCEL1:
> -		set_ioaccel1_performant_mode(h, c);
> +		set_ioaccel1_performant_mode(h, c, reply_queue);
>  		writel(c->busaddr, h->vaddr + SA5_REQUEST_PORT_OFFSET);
>  		break;
>  	case CMD_IOACCEL2:
> -		set_ioaccel2_performant_mode(h, c);
> +		set_ioaccel2_performant_mode(h, c, reply_queue);
>  		writel(c->busaddr, h->vaddr + IOACCEL2_INBOUND_POSTQ_32);
>  		break;
>  	default:
> -		set_performant_mode(h, c);
> +		set_performant_mode(h, c, reply_queue);
>  		h->access.submit_command(h, c);
>  	}
>  }
>  
> +static void enqueue_cmd_and_start_io(struct ctlr_info *h,
> +					struct CommandList *c)
> +{
> +	__enqueue_cmd_and_start_io(h, c, DEFAULT_REPLY_QUEUE);
> +}
> +
>  static inline int is_hba_lunid(unsigned char scsi3addr[])
>  {
>  	return memcmp(scsi3addr, RAID_CTLR_LUNID, 8) == 0;
> @@ -1877,6 +1904,19 @@ static void complete_scsi_command(struct CommandList *cp)
>  	if (cp->cmd_type == CMD_IOACCEL2 || cp->cmd_type == CMD_IOACCEL1)
>  		atomic_dec(&cp->phys_disk->ioaccel_cmds_out);
>  
> +	/*
> +	 * We check for lockup status here as it may be set for
> +	 * CMD_SCSI, CMD_IOACCEL1 and CMD_IOACCEL2 commands by
> + * fail_all_outstanding_cmds()
> +	 */
> +	if (unlikely(ei->CommandStatus == CMD_CTLR_LOCKUP)) {
> +		/* DID_NO_CONNECT will prevent a retry */
> +		cmd->result = DID_NO_CONNECT << 16;
> +		cmd_free(h, cp);
> +		cmd->scsi_done(cmd);
> +		return;
> +	}
> +
>  	if (cp->cmd_type == CMD_IOACCEL2)
>  		return process_ioaccel2_completion(h, cp, cmd, dev);
>  
> @@ -2091,14 +2131,36 @@ static int hpsa_map_one(struct pci_dev *pdev,
>  	return 0;
>  }
>  
> -static inline void hpsa_scsi_do_simple_cmd_core(struct ctlr_info *h,
> -	struct CommandList *c)
> +#define NO_TIMEOUT ((unsigned long) -1)
> +#define DEFAULT_TIMEOUT 30000 /* milliseconds */
> +static int hpsa_scsi_do_simple_cmd_core(struct ctlr_info *h,
> +	struct CommandList *c, int reply_queue, unsigned long timeout_msecs)
>  {
>  	DECLARE_COMPLETION_ONSTACK(wait);
>  
>  	c->waiting = &wait;
> -	enqueue_cmd_and_start_io(h, c);
> -	wait_for_completion(&wait);
> +	__enqueue_cmd_and_start_io(h, c, reply_queue);
> +	if (timeout_msecs == NO_TIMEOUT) {
> +		/* TODO: get rid of this no-timeout thing */
> +		wait_for_completion_io(&wait);
> +		return IO_OK;
> +	}
> +	if (!wait_for_completion_io_timeout(&wait,
> +					msecs_to_jiffies(timeout_msecs))) {
> +		dev_warn(&h->pdev->dev, "Command timed out.\n");
> +		return -ETIMEDOUT;
> +	}
> +	return IO_OK;
> +}
> +
> +static int hpsa_scsi_do_simple_cmd(struct ctlr_info *h, struct CommandList *c,
> +				   int reply_queue, unsigned long timeout_msecs)
> +{
> +	if (unlikely(lockup_detected(h))) {
> +		c->err_info->CommandStatus = CMD_CTLR_LOCKUP;
> +		return IO_OK;
> +	}
> +	return hpsa_scsi_do_simple_cmd_core(h, c, reply_queue, timeout_msecs);
>  }
>  
>  static u32 lockup_detected(struct ctlr_info *h)
> @@ -2113,25 +2175,19 @@ static u32 lockup_detected(struct ctlr_info *h)
>  	return rc;
>  }
>  
> -static void hpsa_scsi_do_simple_cmd_core_if_no_lockup(struct ctlr_info *h,
> -	struct CommandList *c)
> -{
> -	/* If controller lockup detected, fake a hardware error. */
> -	if (unlikely(lockup_detected(h)))
> -		c->err_info->CommandStatus = CMD_HARDWARE_ERR;
> -	else
> -		hpsa_scsi_do_simple_cmd_core(h, c);
> -}
> -
>  #define MAX_DRIVER_CMD_RETRIES 25
> -static void hpsa_scsi_do_simple_cmd_with_retry(struct ctlr_info *h,
> -	struct CommandList *c, int data_direction)
> +static int hpsa_scsi_do_simple_cmd_with_retry(struct ctlr_info *h,
> +	struct CommandList *c, int data_direction, unsigned long timeout_msecs)
>  {
>  	int backoff_time = 10, retry_count = 0;
> +	int rc;
>  
>  	do {
>  		memset(c->err_info, 0, sizeof(*c->err_info));
> -		hpsa_scsi_do_simple_cmd_core(h, c);
> +		rc = hpsa_scsi_do_simple_cmd(h, c, DEFAULT_REPLY_QUEUE,
> +						  timeout_msecs);
> +		if (rc)
> +			break;
>  		retry_count++;
>  		if (retry_count > 3) {
>  			msleep(backoff_time);
> @@ -2142,6 +2198,9 @@ static void hpsa_scsi_do_simple_cmd_with_retry(struct ctlr_info *h,
>  			check_for_busy(h, c)) &&
>  			retry_count <= MAX_DRIVER_CMD_RETRIES);
>  	hpsa_pci_unmap(h->pdev, c, 1, data_direction);
> +	if (retry_count > MAX_DRIVER_CMD_RETRIES)
> +		rc = -EIO;
> +	return rc;
>  }
>  
>  static void hpsa_print_cmd(struct ctlr_info *h, char *txt,
> @@ -2218,6 +2277,9 @@ static void hpsa_scsi_interpret_error(struct ctlr_info *h,
>  	case CMD_UNABORTABLE:
>  		hpsa_print_cmd(h, "unabortable", cp);
>  		break;
> +	case CMD_CTLR_LOCKUP:
> +		hpsa_print_cmd(h, "controller lockup detected", cp);
> +		break;
>  	default:
>  		hpsa_print_cmd(h, "unknown status", cp);
>  		dev_warn(d, "Unknown command status %x\n",
> @@ -2245,7 +2307,10 @@ static int hpsa_scsi_do_inquiry(struct ctlr_info *h, unsigned char *scsi3addr,
>  		rc = -1;
>  		goto out;
>  	}
> -	hpsa_scsi_do_simple_cmd_with_retry(h, c, PCI_DMA_FROMDEVICE);
> +	rc = hpsa_scsi_do_simple_cmd_with_retry(h, c,
> +			PCI_DMA_FROMDEVICE, NO_TIMEOUT);
> +	if (rc)
> +		goto out;
>  	ei = c->err_info;
>  	if (ei->CommandStatus != 0 && ei->CommandStatus != CMD_DATA_UNDERRUN) {
>  		hpsa_scsi_interpret_error(h, c);
> @@ -2275,7 +2340,10 @@ static int hpsa_bmic_ctrl_mode_sense(struct ctlr_info *h,
>  		rc = -1;
>  		goto out;
>  	}
> -	hpsa_scsi_do_simple_cmd_with_retry(h, c, PCI_DMA_FROMDEVICE);
> +	rc = hpsa_scsi_do_simple_cmd_with_retry(h, c,
> +					PCI_DMA_FROMDEVICE, NO_TIMEOUT);
> +	if (rc)
> +		goto out;
>  	ei = c->err_info;
>  	if (ei->CommandStatus != 0 && ei->CommandStatus != CMD_DATA_UNDERRUN) {
>  		hpsa_scsi_interpret_error(h, c);
> @@ -2287,7 +2355,7 @@ out:
>  	}
>  
>  static int hpsa_send_reset(struct ctlr_info *h, unsigned char *scsi3addr,
> -	u8 reset_type)
> +	u8 reset_type, int reply_queue)
>  {
>  	int rc = IO_OK;
>  	struct CommandList *c;
> @@ -2304,7 +2372,11 @@ static int hpsa_send_reset(struct ctlr_info *h, unsigned char *scsi3addr,
>  	(void) fill_cmd(c, HPSA_DEVICE_RESET_MSG, h, NULL, 0, 0,
>  			scsi3addr, TYPE_MSG);
>  	c->Request.CDB[1] = reset_type; /* fill_cmd defaults to LUN reset */
> -	hpsa_scsi_do_simple_cmd_core(h, c);
> +	rc = hpsa_scsi_do_simple_cmd(h, c, reply_queue, NO_TIMEOUT);
> +	if (rc) {
> +		dev_warn(&h->pdev->dev, "Failed to send reset command\n");
> +		goto out;
> +	}
>  	/* no unmap needed here because no data xfer. */
>  
>  	ei = c->err_info;
> @@ -2312,6 +2384,7 @@ static int hpsa_send_reset(struct ctlr_info *h, unsigned char *scsi3addr,
>  		hpsa_scsi_interpret_error(h, c);
>  		rc = -1;
>  	}
> +out:
>  	cmd_free(h, c);
>  	return rc;
>  }
> @@ -2429,15 +2502,18 @@ static int hpsa_get_raid_map(struct ctlr_info *h,
>  			sizeof(this_device->raid_map), 0,
>  			scsi3addr, TYPE_CMD)) {
>  		dev_warn(&h->pdev->dev, "Out of memory in hpsa_get_raid_map()\n");
> -		cmd_free(h, c);
> -		return -ENOMEM;
> +		rc = -ENOMEM;
> +		goto out;
>  	}
> -	hpsa_scsi_do_simple_cmd_with_retry(h, c, PCI_DMA_FROMDEVICE);
> +	rc = hpsa_scsi_do_simple_cmd_with_retry(h, c,
> +					PCI_DMA_FROMDEVICE, NO_TIMEOUT);
> +	if (rc)
> +		goto out;
>  	ei = c->err_info;
>  	if (ei->CommandStatus != 0 && ei->CommandStatus != CMD_DATA_UNDERRUN) {
>  		hpsa_scsi_interpret_error(h, c);
> -		cmd_free(h, c);
> -		return -1;
> +		rc = -1;
> +		goto out;
>  	}
>  	cmd_free(h, c);
>  
> @@ -2449,6 +2525,9 @@ static int hpsa_get_raid_map(struct ctlr_info *h,
>  	}
>  	hpsa_debug_map_buff(h, rc, &this_device->raid_map);
>  	return rc;
> +out:
> +	cmd_free(h, c);
> +	return rc;
>  }
>  
>  static int hpsa_bmic_id_physical_device(struct ctlr_info *h,
> @@ -2468,7 +2547,8 @@ static int hpsa_bmic_id_physical_device(struct ctlr_info *h,
>  	c->Request.CDB[2] = bmic_device_index & 0xff;
>  	c->Request.CDB[9] = (bmic_device_index >> 8) & 0xff;
>  
> -	hpsa_scsi_do_simple_cmd_with_retry(h, c, PCI_DMA_FROMDEVICE);
> +	hpsa_scsi_do_simple_cmd_with_retry(h, c, PCI_DMA_FROMDEVICE,
> +						NO_TIMEOUT);
>  	ei = c->err_info;
>  	if (ei->CommandStatus != 0 && ei->CommandStatus != CMD_DATA_UNDERRUN) {
>  		hpsa_scsi_interpret_error(h, c);
> @@ -2603,7 +2683,10 @@ static int hpsa_scsi_do_report_luns(struct ctlr_info *h, int logical,
>  	}
>  	if (extended_response)
>  		c->Request.CDB[1] = extended_response;
> -	hpsa_scsi_do_simple_cmd_with_retry(h, c, PCI_DMA_FROMDEVICE);
> +	rc = hpsa_scsi_do_simple_cmd_with_retry(h, c,
> +					PCI_DMA_FROMDEVICE, NO_TIMEOUT);
> +	if (rc)
> +		goto out;
>  	ei = c->err_info;
>  	if (ei->CommandStatus != 0 &&
>  	    ei->CommandStatus != CMD_DATA_UNDERRUN) {
> @@ -2696,7 +2779,7 @@ static int hpsa_volume_offline(struct ctlr_info *h,
>  {
>  	struct CommandList *c;
>  	unsigned char *sense, sense_key, asc, ascq;
> -	int ldstat = 0;
> +	int rc, ldstat = 0;
>  	u16 cmd_status;
>  	u8 scsi_status;
>  #define ASC_LUN_NOT_READY 0x04
> @@ -2707,7 +2790,11 @@ static int hpsa_volume_offline(struct ctlr_info *h,
>  	if (!c)
>  		return 0;
>  	(void) fill_cmd(c, TEST_UNIT_READY, h, NULL, 0, 0, scsi3addr, TYPE_CMD);
> -	hpsa_scsi_do_simple_cmd_core(h, c);
> +	rc = hpsa_scsi_do_simple_cmd(h, c, DEFAULT_REPLY_QUEUE, NO_TIMEOUT);
> +	if (rc) {
> +		cmd_free(h, c);
> +		return 0;
> +	}
>  	sense = c->err_info->SenseInfo;
>  	sense_key = sense[2];
>  	asc = sense[12];
> @@ -4040,7 +4127,11 @@ static int hpsa_scsi_ioaccel_raid_map(struct ctlr_info *h,
>  						dev->phys_disk[map_index]);
>  }
>  
> -/* Submit commands down the "normal" RAID stack path */
> +/*
> + * Submit commands down the "normal" RAID stack path
> + * All callers to hpsa_ciss_submit must check lockup_detected
> + * beforehand, before (optionally) and after calling cmd_alloc
> + */
>  static int hpsa_ciss_submit(struct ctlr_info *h,
>  	struct CommandList *c, struct scsi_cmnd *cmd,
>  	unsigned char scsi3addr[])
> @@ -4151,7 +4242,7 @@ static int hpsa_scsi_queue_command(struct Scsi_Host *sh, struct scsi_cmnd *cmd)
>  	memcpy(scsi3addr, dev->scsi3addr, sizeof(scsi3addr));
>  
>  	if (unlikely(lockup_detected(h))) {
> -		cmd->result = DID_ERROR << 16;
> +		cmd->result = DID_NO_CONNECT << 16;
>  		cmd->scsi_done(cmd);
>  		return 0;
>  	}
> @@ -4161,7 +4252,7 @@ static int hpsa_scsi_queue_command(struct Scsi_Host *sh, struct scsi_cmnd *cmd)
>  		return SCSI_MLQUEUE_HOST_BUSY;
>  	}
>  	if (unlikely(lockup_detected(h))) {
> -		cmd->result = DID_ERROR << 16;
> +		cmd->result = DID_NO_CONNECT << 16;
>  		cmd_free(h, c);
>  		cmd->scsi_done(cmd);
>  		return 0;
> @@ -4356,7 +4447,10 @@ static int wait_for_device_to_become_ready(struct ctlr_info *h,
>  		/* Send the Test Unit Ready, fill_cmd can't fail, no mapping */
>  		(void) fill_cmd(c, TEST_UNIT_READY, h,
>  				NULL, 0, 0, lunaddr, TYPE_CMD);
> -		hpsa_scsi_do_simple_cmd_core(h, c);
> +		rc = hpsa_scsi_do_simple_cmd(h, c, DEFAULT_REPLY_QUEUE,
> +						NO_TIMEOUT);
> +		if (rc)
> +			goto do_it_again;
>  		/* no unmap needed here because no data xfer. */
>  
>  		if (c->err_info->CommandStatus == CMD_SUCCESS)
> @@ -4367,7 +4461,7 @@ static int wait_for_device_to_become_ready(struct ctlr_info *h,
>  			(c->err_info->SenseInfo[2] == NO_SENSE ||
>  			c->err_info->SenseInfo[2] == UNIT_ATTENTION))
>  			break;
> -
> +do_it_again:
>  		dev_warn(&h->pdev->dev, "waiting %d secs "
>  			"for device to become ready.\n", waittime);
>  		rc = 1; /* device not ready. */
> @@ -4405,13 +4499,46 @@ static int hpsa_eh_device_reset_handler(struct scsi_cmnd *scsicmd)
>  			"device lookup failed.\n");
>  		return FAILED;
>  	}
> -	dev_warn(&h->pdev->dev, "resetting device %d:%d:%d:%d\n",
> -		h->scsi_host->host_no, dev->bus, dev->target, dev->lun);
> +
> +	/* if controller locked up, we can guarantee command won't complete */
> +	if (lockup_detected(h)) {
> +		dev_warn(&h->pdev->dev,
> +			"scsi %d:%d:%d:%d RESET FAILED, lockup detected\n",
> +			h->scsi_host->host_no, dev->bus, dev->target,
> +			dev->lun);
> +		return FAILED;
> +	}
> +
> +	/* this reset request might be the result of a lockup; check */
> +	if (detect_controller_lockup(h)) {
> +		dev_warn(&h->pdev->dev,
> +			 "scsi %d:%d:%d:%d RESET FAILED, new lockup detected\n",
> +			 h->scsi_host->host_no, dev->bus, dev->target,
> +			 dev->lun);
> +		return FAILED;
> +	}
> +
> +	dev_warn(&h->pdev->dev,
> +		"scsi %d:%d:%d:%d: %s %.8s %.16s resetting RAID-%s SSDSmartPathCap%c En%c Exp=%d\n",
> +		h->scsi_host->host_no, dev->bus, dev->target, dev->lun,
> +		scsi_device_type(dev->devtype),
> +		dev->vendor,
> +		dev->model,
> +		dev->raid_level > RAID_UNKNOWN ?
> +			"RAID-?" : raid_label[dev->raid_level],
> +		dev->offload_config ? '+' : '-',
> +		dev->offload_enabled ? '+' : '-',
> +		dev->expose_state);
> +
>  	/* send a reset to the SCSI LUN which the command was sent to */
> -	rc = hpsa_send_reset(h, dev->scsi3addr, HPSA_RESET_TYPE_LUN);
> +	rc = hpsa_send_reset(h, dev->scsi3addr, HPSA_RESET_TYPE_LUN,
> +			     DEFAULT_REPLY_QUEUE);
>  	if (rc == 0 && wait_for_device_to_become_ready(h, dev->scsi3addr) == 0)
>  		return SUCCESS;
>  
> +	dev_warn(&h->pdev->dev,
> +		"scsi %d:%d:%d:%d reset failed\n",
> +		h->scsi_host->host_no, dev->bus, dev->target, dev->lun);
>  	return FAILED;
>  }
>  
> @@ -4456,7 +4583,7 @@ static void hpsa_get_tag(struct ctlr_info *h,
>  }
>  
>  static int hpsa_send_abort(struct ctlr_info *h, unsigned char *scsi3addr,
> -	struct CommandList *abort, int swizzle)
> +	struct CommandList *abort, int swizzle, int reply_queue)
>  {
>  	int rc = IO_OK;
>  	struct CommandList *c;
> @@ -4474,9 +4601,9 @@ static int hpsa_send_abort(struct ctlr_info *h, unsigned char *scsi3addr,
>  		0, 0, scsi3addr, TYPE_MSG);
>  	if (swizzle)
>  		swizzle_abort_tag(&c->Request.CDB[4]);
> -	hpsa_scsi_do_simple_cmd_core(h, c);
> +	(void) hpsa_scsi_do_simple_cmd(h, c, reply_queue, NO_TIMEOUT);
>  	hpsa_get_tag(h, abort, &taglower, &tagupper);
> -	dev_dbg(&h->pdev->dev, "%s: Tag:0x%08x:%08x: do_simple_cmd_core completed.\n",
> +	dev_dbg(&h->pdev->dev, "%s: Tag:0x%08x:%08x: do_simple_cmd(abort) completed.\n",
>  		__func__, tagupper, taglower);
>  	/* no unmap needed here because no data xfer. */
>  
> @@ -4508,7 +4635,7 @@ static int hpsa_send_abort(struct ctlr_info *h, unsigned char *scsi3addr,
>   */
>  
>  static int hpsa_send_reset_as_abort_ioaccel2(struct ctlr_info *h,
> -	unsigned char *scsi3addr, struct CommandList *abort)
> +	unsigned char *scsi3addr, struct CommandList *abort, int reply_queue)
>  {
>  	int rc = IO_OK;
>  	struct scsi_cmnd *scmd; /* scsi command within request being aborted */
> @@ -4551,7 +4678,7 @@ static int hpsa_send_reset_as_abort_ioaccel2(struct ctlr_info *h,
>  			"Reset as abort: Resetting physical device at scsi3addr 0x%02x%02x%02x%02x%02x%02x%02x%02x\n",
>  			psa[0], psa[1], psa[2], psa[3],
>  			psa[4], psa[5], psa[6], psa[7]);
> -	rc = hpsa_send_reset(h, psa, HPSA_RESET_TYPE_TARGET);
> +	rc = hpsa_send_reset(h, psa, HPSA_RESET_TYPE_TARGET, reply_queue);
>  	if (rc != 0) {
>  		dev_warn(&h->pdev->dev,
>  			"Reset as abort: Failed on physical device at scsi3addr 0x%02x%02x%02x%02x%02x%02x%02x%02x\n",
> @@ -4585,7 +4712,7 @@ static int hpsa_send_reset_as_abort_ioaccel2(struct ctlr_info *h,
>   * make this true someday become false.
>   */
>  static int hpsa_send_abort_both_ways(struct ctlr_info *h,
> -	unsigned char *scsi3addr, struct CommandList *abort)
> +	unsigned char *scsi3addr, struct CommandList *abort, int reply_queue)
>  {
>  	/* ioaccelerator mode 2 commands should be aborted via the
>  	 * accelerated path, since RAID path is unaware of these commands,
> @@ -4593,10 +4720,20 @@ static int hpsa_send_abort_both_ways(struct ctlr_info *h,
>  	 * Change abort to physical device reset.
>  	 */
>  	if (abort->cmd_type == CMD_IOACCEL2)
> -		return hpsa_send_reset_as_abort_ioaccel2(h, scsi3addr, abort);
> +		return hpsa_send_reset_as_abort_ioaccel2(h, scsi3addr,
> +							abort, reply_queue);
> +
> +	return hpsa_send_abort(h, scsi3addr, abort, 0, reply_queue) &&
> +			hpsa_send_abort(h, scsi3addr, abort, 1, reply_queue);
> +}
>  
> -	return hpsa_send_abort(h, scsi3addr, abort, 0) &&
> -			hpsa_send_abort(h, scsi3addr, abort, 1);
> +/* Find out which reply queue a command was meant to return on */
> +static int hpsa_extract_reply_queue(struct ctlr_info *h,
> +					struct CommandList *c)
> +{
> +	if (c->cmd_type == CMD_IOACCEL2)
> +		return h->ioaccel2_cmd_pool[c->cmdindex].reply_queue;
> +	return c->Header.ReplyQueue;
>  }
>  
>  /* Send an abort for the specified command.
> @@ -4614,7 +4751,7 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
>  	char msg[256];		/* For debug messaging. */
>  	int ml = 0;
>  	__le32 tagupper, taglower;
> -	int refcount;
> +	int refcount, reply_queue;
>  
>  	/* Find the controller of the command to be aborted */
>  	h = sdev_to_hba(sc->device);
> @@ -4622,8 +4759,23 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
>  			"ABORT REQUEST FAILED, Controller lookup failed.\n"))
>  		return FAILED;
>  
> -	if (lockup_detected(h))
> +	/* If controller locked up, we can guarantee command won't complete */
> +	if (lockup_detected(h)) {
> +		dev_warn(&h->pdev->dev,
> +			"scsi %d:%d:%d:%llu scmd %p ABORT FAILED, lockup detected\n",
> +			h->scsi_host->host_no, sc->device->channel,
> +			sc->device->id, sc->device->lun, sc);
>  		return FAILED;
> +	}
> +
> +	/* This is a good time to check if controller lockup has occurred */
> +	if (detect_controller_lockup(h)) {
> +		dev_warn(&h->pdev->dev,
> +			 "scsi %d:%d:%d:%llu scmd %p ABORT FAILED, new lockup detected\n",
> +			 h->scsi_host->host_no, sc->device->channel,
> +			 sc->device->id, sc->device->lun, sc);
> +		return FAILED;
> +	}
>  
>  	/* Check that controller supports some kind of task abort */
>  	if (!(HPSATMF_PHYS_TASK_ABORT & h->TMFSupportFlags) &&
> @@ -4656,6 +4808,7 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
>  		return SUCCESS;
>  	}
>  	hpsa_get_tag(h, abort, &taglower, &tagupper);
> +	reply_queue = hpsa_extract_reply_queue(h, abort);
>  	ml += sprintf(msg+ml, "Tag:0x%08x:%08x ", tagupper, taglower);
>  	as  = abort->scsi_cmd;
>  	if (as != NULL)
> @@ -4670,7 +4823,7 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
>  	 * by the firmware (but not to the scsi mid layer) but we can't
>  	 * distinguish which.  Send the abort down.
>  	 */
> -	rc = hpsa_send_abort_both_ways(h, dev->scsi3addr, abort);
> +	rc = hpsa_send_abort_both_ways(h, dev->scsi3addr, abort, reply_queue);
>  	if (rc != 0) {
>  		dev_warn(&h->pdev->dev, "scsi %d:%d:%d:%d %s\n",
>  			h->scsi_host->host_no,
> @@ -4995,7 +5148,9 @@ static int hpsa_passthru_ioctl(struct ctlr_info *h, void __user *argp)
>  		c->SG[0].Len = cpu_to_le32(iocommand.buf_size);
>  		c->SG[0].Ext = cpu_to_le32(HPSA_SG_LAST); /* not chaining */
>  	}
> -	hpsa_scsi_do_simple_cmd_core_if_no_lockup(h, c);
> +	rc = hpsa_scsi_do_simple_cmd(h, c, DEFAULT_REPLY_QUEUE, NO_TIMEOUT);
> +	if (rc)
> +		rc = -EIO;

We just pretend here that an error path exists; with NO_TIMEOUT the function can't fail,
but if it could, we might end up copying some random data from kernel to user space.

>  	if (iocommand.buf_size > 0)
>  		hpsa_pci_unmap(h->pdev, c, 1, PCI_DMA_BIDIRECTIONAL);
>  	check_ioctl_unit_attention(h, c);
> @@ -5125,7 +5280,11 @@ static int hpsa_big_passthru_ioctl(struct ctlr_info *h, void __user *argp)
>  		}
>  		c->SG[--i].Ext = cpu_to_le32(HPSA_SG_LAST);
>  	}
> -	hpsa_scsi_do_simple_cmd_core_if_no_lockup(h, c);
> +	status = hpsa_scsi_do_simple_cmd(h, c, DEFAULT_REPLY_QUEUE, NO_TIMEOUT);
> +	if (status) {
> +		status = -EIO;
> +		goto cleanup0;

Similarly here: via goto cleanup0; we miss a call to hpsa_pci_unmap.
None of that is an issue, because hpsa_scsi_do_simple_cmd(h, c, DEFAULT_REPLY_QUEUE, NO_TIMEOUT)
can't fail, but it is a trap laid for a future change that introduces a real timeout.

As it stands this is not a real issue; once it's fixed in the next driver update, it's fine
with me.

Tomas

> +	}
>  	if (sg_used)
>  		hpsa_pci_unmap(h->pdev, c, sg_used, PCI_DMA_BIDIRECTIONAL);
>  	check_ioctl_unit_attention(h, c);
> @@ -6272,6 +6431,8 @@ static int hpsa_wait_for_mode_change_ack(struct ctlr_info *h)
>  	 * as we enter this code.)
>  	 */
>  	for (i = 0; i < MAX_MODE_CHANGE_WAIT; i++) {
> +		if (h->remove_in_progress)
> +			goto done;
>  		spin_lock_irqsave(&h->lock, flags);
>  		doorbell_value = readl(h->vaddr + SA5_DOORBELL);
>  		spin_unlock_irqrestore(&h->lock, flags);
> @@ -6667,17 +6828,21 @@ static void fail_all_outstanding_cmds(struct ctlr_info *h)
>  {
>  	int i, refcount;
>  	struct CommandList *c;
> +	int failcount = 0;
>  
>  	flush_workqueue(h->resubmit_wq); /* ensure all cmds are fully built */
>  	for (i = 0; i < h->nr_cmds; i++) {
>  		c = h->cmd_pool + i;
>  		refcount = atomic_inc_return(&c->refcount);
>  		if (refcount > 1) {
> -			c->err_info->CommandStatus = CMD_HARDWARE_ERR;
> +			c->err_info->CommandStatus = CMD_CTLR_LOCKUP;
>  			finish_cmd(c);
> +			failcount++;
>  		}
>  		cmd_free(h, c);
>  	}
> +	dev_warn(&h->pdev->dev,
> +		"failed %d commands in fail_all\n", failcount);
>  }
>  
>  static void set_lockup_detected_for_all_cpus(struct ctlr_info *h, u32 value)
> @@ -6705,18 +6870,19 @@ static void controller_lockup_detected(struct ctlr_info *h)
>  	if (!lockup_detected) {
>  		/* no heartbeat, but controller gave us a zero. */
>  		dev_warn(&h->pdev->dev,
> -			"lockup detected but scratchpad register is zero\n");
> +			"lockup detected after %d but scratchpad register is zero\n",
> +			h->heartbeat_sample_interval / HZ);
>  		lockup_detected = 0xffffffff;
>  	}
>  	set_lockup_detected_for_all_cpus(h, lockup_detected);
>  	spin_unlock_irqrestore(&h->lock, flags);
> -	dev_warn(&h->pdev->dev, "Controller lockup detected: 0x%08x\n",
> -			lockup_detected);
> +	dev_warn(&h->pdev->dev, "Controller lockup detected: 0x%08x after %d\n",
> +			lockup_detected, h->heartbeat_sample_interval / HZ);
>  	pci_disable_device(h->pdev);
>  	fail_all_outstanding_cmds(h);
>  }
>  
> -static void detect_controller_lockup(struct ctlr_info *h)
> +static int detect_controller_lockup(struct ctlr_info *h)
>  {
>  	u64 now;
>  	u32 heartbeat;
> @@ -6726,7 +6892,7 @@ static void detect_controller_lockup(struct ctlr_info *h)
>  	/* If we've received an interrupt recently, we're ok. */
>  	if (time_after64(h->last_intr_timestamp +
>  				(h->heartbeat_sample_interval), now))
> -		return;
> +		return false;
>  
>  	/*
>  	 * If we've already checked the heartbeat recently, we're ok.
> @@ -6735,7 +6901,7 @@ static void detect_controller_lockup(struct ctlr_info *h)
>  	 */
>  	if (time_after64(h->last_heartbeat_timestamp +
>  				(h->heartbeat_sample_interval), now))
> -		return;
> +		return false;
>  
>  	/* If heartbeat has not changed since we last looked, we're not ok. */
>  	spin_lock_irqsave(&h->lock, flags);
> @@ -6743,12 +6909,13 @@ static void detect_controller_lockup(struct ctlr_info *h)
>  	spin_unlock_irqrestore(&h->lock, flags);
>  	if (h->last_heartbeat == heartbeat) {
>  		controller_lockup_detected(h);
> -		return;
> +		return true;
>  	}
>  
>  	/* We're ok. */
>  	h->last_heartbeat = heartbeat;
>  	h->last_heartbeat_timestamp = now;
> +	return false;
>  }
>  
>  static void hpsa_ack_ctlr_events(struct ctlr_info *h)
> @@ -7092,8 +7259,10 @@ static void hpsa_flush_cache(struct ctlr_info *h)
>  {
>  	char *flush_buf;
>  	struct CommandList *c;
> +	int rc;
>  
>  	/* Don't bother trying to flush the cache if locked up */
> +	/* FIXME not necessary if do_simple_cmd does the check */
>  	if (unlikely(lockup_detected(h)))
>  		return;
>  	flush_buf = kzalloc(4, GFP_KERNEL);
> @@ -7109,7 +7278,10 @@ static void hpsa_flush_cache(struct ctlr_info *h)
>  		RAID_CTLR_LUNID, TYPE_CMD)) {
>  		goto out;
>  	}
> -	hpsa_scsi_do_simple_cmd_with_retry(h, c, PCI_DMA_TODEVICE);
> +	rc = hpsa_scsi_do_simple_cmd_with_retry(h, c,
> +					PCI_DMA_TODEVICE, NO_TIMEOUT);
> +	if (rc)
> +		goto out;
>  	if (c->err_info->CommandStatus != 0)
>  out:
>  		dev_warn(&h->pdev->dev,
> diff --git a/drivers/scsi/hpsa_cmd.h b/drivers/scsi/hpsa_cmd.h
> index 76d5499..f52c847 100644
> --- a/drivers/scsi/hpsa_cmd.h
> +++ b/drivers/scsi/hpsa_cmd.h
> @@ -43,6 +43,11 @@
>  #define CMD_TIMEOUT             0x000B
>  #define CMD_UNABORTABLE		0x000C
>  #define CMD_IOACCEL_DISABLED	0x000E
> +#define CMD_CTLR_LOCKUP		0xffff
> +/* Note: CMD_CTLR_LOCKUP is not a value defined by the CISS spec;
> + * it is a driver-defined value used to mark commands when a
> + * controller lockup has been detected by the driver.
> + */
>  
>  
>  /* Unit Attentions ASC's as defined for the MSA2012sa */
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 54+ messages in thread

* RE: [PATCH v3 03/42] hpsa: rework controller command submission
  2015-03-27 15:11   ` Tomas Henzl
@ 2015-03-27 18:04     ` brace
  0 siblings, 0 replies; 54+ messages in thread
From: brace @ 2015-03-27 18:04 UTC (permalink / raw)
  To: Tomas Henzl, Scott Teel, Kevin Barnett, james.bottomley, hch,
	Justin Lindley, brace
  Cc: linux-scsi

Noted.

Thanks for your review.

Don

> -----Original Message-----
> From: Tomas Henzl [mailto:thenzl@redhat.com]
> Sent: Friday, March 27, 2015 10:11 AM
> To: Don Brace; Scott Teel; Kevin Barnett; james.bottomley@parallels.com;
> hch@infradead.org; Justin Lindley; brace
> Cc: linux-scsi@vger.kernel.org
> Subject: Re: [PATCH v3 03/42] hpsa: rework controller command submission
> 
> On 03/17/2015 09:02 PM, Don Brace wrote:
> > From: Webb Scales <webb.scales@hp.com>
> >
> > Allow driver initiated commands to have a timeout.  It does not
> > yet try to do anything with timeouts on such commands.
> >
> > We are sending a reset in order to get rid of a command we want to abort.
> > If we make it return on the same reply queue as the command we want to
> > abort, the completion of the aborted command will not race with the
> > completion of the reset command.
> >
> > Rename hpsa_scsi_do_simple_cmd_core() to hpsa_scsi_do_simple_cmd(), since
> > this function is the interface for issuing commands to the controller and
> > not the "core" of that implementation.  Add a parameter to it which allows
> > the caller to specify the reply queue to be used.  Modify existing callers
> > to specify the default reply queue.
> >
> > Rename __hpsa_scsi_do_simple_cmd_core() to hpsa_scsi_do_simple_cmd_core(),
> > since this routine is the "core" implementation of the "do simple command"
> > function and there is no longer any other function with a similar name.
> > Modify the existing callers of this routine (other than
> > hpsa_scsi_do_simple_cmd()) to instead call hpsa_scsi_do_simple_cmd(), since
> > it will now accept the reply_queue parameter, and it provides a controller
> > lock-up check.  (Also, tweak two related message strings to make them
> > distinct from each other.)
> >
> > Submitting a command to a locked up controller always results in a timeout,
> > so check for controller lock-up before submitting.
> >
> > This is to enable fixing a race between command completions and
> > abort completions on different reply queues in a subsequent patch.
> > We want to be able to specify which reply queue an abort completion
> > should occur on so that it cannot race the completion of the command
> > it is trying to abort.
> >
> > The following race was possible in theory:
> >
> >   1. Abort command is sent to hardware.
> >   2. Command to be aborted simultaneously completes on another
> >      reply queue.
> >   3. Hardware receives abort command, decides command has already
> >      completed and indicates this to the driver via another different
> >      reply queue.
> >   4. Driver processes the abort completion, finds that the hardware does
> >      not know about the command, concludes that the command therefore
> >      cannot complete, and returns SUCCESS, indicating to the mid-layer
> >      that the scsi_cmnd may be re-used.
> >   5. Command from step 2 is processed and completed back to scsi mid
> >      layer (after we already promised that would never happen).
> >
> > Fix by forcing aborts to complete on the same reply queue as the command
> > they are aborting.
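[The queue-pinning idea above can be sketched in isolation. Everything below uses stand-in names (fake_cmd, extract_reply_queue, map_reply_queue) that only approximate the driver's real structures; it is an illustration of the scheme, not the driver code.]

```c
/* Stand-in for the driver's command types. */
enum cmd_type { CMD_SCSI, CMD_IOACCEL1, CMD_IOACCEL2 };

struct fake_cmd {
	enum cmd_type cmd_type;
	int header_reply_queue;   /* CISS path: like c->Header.ReplyQueue */
	int ioaccel2_reply_queue; /* ioaccel2 path keeps its own copy */
};

/* Mirrors the idea of hpsa_extract_reply_queue(): ioaccel2 commands
 * record their reply queue in the ioaccel2 command structure, not in
 * the CISS header, so the abort path must look in the right place. */
static int extract_reply_queue(const struct fake_cmd *c)
{
	if (c->cmd_type == CMD_IOACCEL2)
		return c->ioaccel2_reply_queue;
	return c->header_reply_queue;
}

/* Mirrors the "reply_queue % h->nreply_queues" mapping applied when a
 * caller pins a command (such as an abort) to an explicit queue, so an
 * out-of-range request still lands on a valid queue. */
static int map_reply_queue(int requested, int nreply_queues)
{
	return requested % nreply_queues;
}
```

With this, sending the abort on extract_reply_queue(aborted_cmd) guarantees both completions arrive on the same queue and are processed in order.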
> >
> > Piggybacking device rescanning functionality onto the lockup
> > detection thread is not a good idea because if the controller
> > locks up during device rescanning, then the thread could get
> > stuck, then the lockup isn't detected.  Use separate work
> > queues for device rescanning and lockup detection.
> >
> > Detect controller lockup in abort handler.
> >
> > After a lockup is detected, return DID_NO_CONNECT, which results in
> > immediate termination of commands, rather than DID_ERROR, which results
> > in retries.
> >
> > Modify detect_controller_lockup() to return the result, to remove the
> > need for a separate check.
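[A minimal model of that return-value change, with uint64_t timestamps standing in for jiffies and fake_ctlr/fake_detect_lockup as hypothetical names — the real function also checks the interrupt timestamp, takes h->lock, and latches the lockup state for all CPUs:]

```c
#include <stdbool.h>
#include <stdint.h>

struct fake_ctlr {
	uint64_t last_heartbeat;
	uint64_t last_heartbeat_timestamp;
	uint64_t heartbeat_sample_interval;
	uint64_t current_heartbeat; /* stands in for readl() of the register */
};

/* Returns true when a lockup was detected, so callers such as the abort
 * and reset handlers need no separate lockup query afterwards. */
static bool fake_detect_lockup(struct fake_ctlr *h, uint64_t now)
{
	/* Checked the heartbeat recently enough: assume we are OK. */
	if (now < h->last_heartbeat_timestamp + h->heartbeat_sample_interval)
		return false;
	/* Heartbeat unchanged across a full interval: locked up. */
	if (h->current_heartbeat == h->last_heartbeat)
		return true;
	/* Progress was made; remember it for the next sample. */
	h->last_heartbeat = h->current_heartbeat;
	h->last_heartbeat_timestamp = now;
	return false;
}
```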
> >
> > Reviewed-by: Scott Teel <scott.teel@pmcs.com>
> > Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
> > Signed-off-by: Webb Scales <webbnh@hp.com>
> > Signed-off-by: Don Brace <don.brace@pmcs.com>
> > ---
> >  drivers/scsi/hpsa.c     |  326 ++++++++++++++++++++++++++++++++++++-------
> ----
> >  drivers/scsi/hpsa_cmd.h |    5 +
> >  2 files changed, 254 insertions(+), 77 deletions(-)
> >
> > diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
> > index 9b88726..488f81b 100644
> > --- a/drivers/scsi/hpsa.c
> > +++ b/drivers/scsi/hpsa.c
> > @@ -253,6 +253,8 @@ static int hpsa_scsi_ioaccel_queue_command(struct
> ctlr_info *h,
> >  	struct CommandList *c, u32 ioaccel_handle, u8 *cdb, int cdb_len,
> >  	u8 *scsi3addr, struct hpsa_scsi_dev_t *phys_disk);
> >  static void hpsa_command_resubmit_worker(struct work_struct *work);
> > +static u32 lockup_detected(struct ctlr_info *h);
> > +static int detect_controller_lockup(struct ctlr_info *h);
> >
> >  static inline struct ctlr_info *sdev_to_hba(struct scsi_device *sdev)
> >  {
> > @@ -748,30 +750,43 @@ static inline u32 next_command(struct ctlr_info *h,
> u8 q)
> >   * a separate special register for submitting commands.
> >   */
> >
> > -/* set_performant_mode: Modify the tag for cciss performant
> > +/*
> > + * set_performant_mode: Modify the tag for cciss performant
> >   * set bit 0 for pull model, bits 3-1 for block fetch
> >   * register number
> >   */
> > -static void set_performant_mode(struct ctlr_info *h, struct CommandList *c)
> > +#define DEFAULT_REPLY_QUEUE (-1)
> > +static void set_performant_mode(struct ctlr_info *h, struct CommandList *c,
> > +					int reply_queue)
> >  {
> >  	if (likely(h->transMethod & CFGTBL_Trans_Performant)) {
> >  		c->busaddr |= 1 | (h->blockFetchTable[c->Header.SGList] << 1);
> > -		if (likely(h->msix_vector > 0))
> > +		if (unlikely(!h->msix_vector))
> > +			return;
> > +		if (likely(reply_queue == DEFAULT_REPLY_QUEUE))
> >  			c->Header.ReplyQueue =
> >  				raw_smp_processor_id() % h->nreply_queues;
> > +		else
> > +			c->Header.ReplyQueue = reply_queue % h->nreply_queues;
> >  	}
> >  }
> >
> >  static void set_ioaccel1_performant_mode(struct ctlr_info *h,
> > -						struct CommandList *c)
> > +						struct CommandList *c,
> > +						int reply_queue)
> >  {
> >  	struct io_accel1_cmd *cp = &h->ioaccel_cmd_pool[c->cmdindex];
> >
> > -	/* Tell the controller to post the reply to the queue for this
> > +	/*
> > +	 * Tell the controller to post the reply to the queue for this
> >  	 * processor.  This seems to give the best I/O throughput.
> >  	 */
> > -	cp->ReplyQueue = smp_processor_id() % h->nreply_queues;
> > -	/* Set the bits in the address sent down to include:
> > +	if (likely(reply_queue == DEFAULT_REPLY_QUEUE))
> > +		cp->ReplyQueue = smp_processor_id() % h->nreply_queues;
> > +	else
> > +		cp->ReplyQueue = reply_queue % h->nreply_queues;
> > +	/*
> > +	 * Set the bits in the address sent down to include:
> >  	 *  - performant mode bit (bit 0)
> >  	 *  - pull count (bits 1-3)
> >  	 *  - command type (bits 4-6)
> > @@ -781,15 +796,21 @@ static void set_ioaccel1_performant_mode(struct
> ctlr_info *h,
> >  }
> >
> >  static void set_ioaccel2_performant_mode(struct ctlr_info *h,
> > -						struct CommandList *c)
> > +						struct CommandList *c,
> > +						int reply_queue)
> >  {
> >  	struct io_accel2_cmd *cp = &h->ioaccel2_cmd_pool[c->cmdindex];
> >
> > -	/* Tell the controller to post the reply to the queue for this
> > +	/*
> > +	 * Tell the controller to post the reply to the queue for this
> >  	 * processor.  This seems to give the best I/O throughput.
> >  	 */
> > -	cp->reply_queue = smp_processor_id() % h->nreply_queues;
> > -	/* Set the bits in the address sent down to include:
> > +	if (likely(reply_queue == DEFAULT_REPLY_QUEUE))
> > +		cp->reply_queue = smp_processor_id() % h->nreply_queues;
> > +	else
> > +		cp->reply_queue = reply_queue % h->nreply_queues;
> > +	/*
> > +	 * Set the bits in the address sent down to include:
> >  	 *  - performant mode bit not used in ioaccel mode 2
> >  	 *  - pull count (bits 0-3)
> >  	 *  - command type isn't needed for ioaccel2
> > @@ -826,26 +847,32 @@ static void
> dial_up_lockup_detection_on_fw_flash_complete(struct ctlr_info *h,
> >  		h->heartbeat_sample_interval = HEARTBEAT_SAMPLE_INTERVAL;
> >  }
> >
> > -static void enqueue_cmd_and_start_io(struct ctlr_info *h,
> > -	struct CommandList *c)
> > +static void __enqueue_cmd_and_start_io(struct ctlr_info *h,
> > +	struct CommandList *c, int reply_queue)
> >  {
> >  	dial_down_lockup_detection_during_fw_flash(h, c);
> >  	atomic_inc(&h->commands_outstanding);
> >  	switch (c->cmd_type) {
> >  	case CMD_IOACCEL1:
> > -		set_ioaccel1_performant_mode(h, c);
> > +		set_ioaccel1_performant_mode(h, c, reply_queue);
> >  		writel(c->busaddr, h->vaddr + SA5_REQUEST_PORT_OFFSET);
> >  		break;
> >  	case CMD_IOACCEL2:
> > -		set_ioaccel2_performant_mode(h, c);
> > +		set_ioaccel2_performant_mode(h, c, reply_queue);
> >  		writel(c->busaddr, h->vaddr + IOACCEL2_INBOUND_POSTQ_32);
> >  		break;
> >  	default:
> > -		set_performant_mode(h, c);
> > +		set_performant_mode(h, c, reply_queue);
> >  		h->access.submit_command(h, c);
> >  	}
> >  }
> >
> > +static void enqueue_cmd_and_start_io(struct ctlr_info *h,
> > +					struct CommandList *c)
> > +{
> > +	__enqueue_cmd_and_start_io(h, c, DEFAULT_REPLY_QUEUE);
> > +}
> > +
> >  static inline int is_hba_lunid(unsigned char scsi3addr[])
> >  {
> >  	return memcmp(scsi3addr, RAID_CTLR_LUNID, 8) == 0;
> > @@ -1877,6 +1904,19 @@ static void complete_scsi_command(struct
> CommandList *cp)
> >  	if (cp->cmd_type == CMD_IOACCEL2 || cp->cmd_type ==
> CMD_IOACCEL1)
> >  		atomic_dec(&cp->phys_disk->ioaccel_cmds_out);
> >
> > +	/*
> > +	 * We check for lockup status here as it may be set for
> > +	 * CMD_SCSI, CMD_IOACCEL1 and CMD_IOACCEL2 commands by
> > +	 * fail_all_oustanding_cmds()
> > +	 */
> > +	if (unlikely(ei->CommandStatus == CMD_CTLR_LOCKUP)) {
> > +		/* DID_NO_CONNECT will prevent a retry */
> > +		cmd->result = DID_NO_CONNECT << 16;
> > +		cmd_free(h, cp);
> > +		cmd->scsi_done(cmd);
> > +		return;
> > +	}
> > +
> >  	if (cp->cmd_type == CMD_IOACCEL2)
> >  		return process_ioaccel2_completion(h, cp, cmd, dev);
> >
> > @@ -2091,14 +2131,36 @@ static int hpsa_map_one(struct pci_dev *pdev,
> >  	return 0;
> >  }
> >
> > -static inline void hpsa_scsi_do_simple_cmd_core(struct ctlr_info *h,
> > -	struct CommandList *c)
> > +#define NO_TIMEOUT ((unsigned long) -1)
> > +#define DEFAULT_TIMEOUT 30000 /* milliseconds */
> > +static int hpsa_scsi_do_simple_cmd_core(struct ctlr_info *h,
> > +	struct CommandList *c, int reply_queue, unsigned long timeout_msecs)
> >  {
> >  	DECLARE_COMPLETION_ONSTACK(wait);
> >
> >  	c->waiting = &wait;
> > -	enqueue_cmd_and_start_io(h, c);
> > -	wait_for_completion(&wait);
> > +	__enqueue_cmd_and_start_io(h, c, reply_queue);
> > +	if (timeout_msecs == NO_TIMEOUT) {
> > +		/* TODO: get rid of this no-timeout thing */
> > +		wait_for_completion_io(&wait);
> > +		return IO_OK;
> > +	}
> > +	if (!wait_for_completion_io_timeout(&wait,
> > +					msecs_to_jiffies(timeout_msecs))) {
> > +		dev_warn(&h->pdev->dev, "Command timed out.\n");
> > +		return -ETIMEDOUT;
> > +	}
> > +	return IO_OK;
> > +}
> > +
> > +static int hpsa_scsi_do_simple_cmd(struct ctlr_info *h, struct CommandList
> *c,
> > +				   int reply_queue, unsigned long
> timeout_msecs)
> > +{
> > +	if (unlikely(lockup_detected(h))) {
> > +		c->err_info->CommandStatus = CMD_CTLR_LOCKUP;
> > +		return IO_OK;
> > +	}
> > +	return hpsa_scsi_do_simple_cmd_core(h, c, reply_queue, timeout_msecs);
> >  }
> >
> >  static u32 lockup_detected(struct ctlr_info *h)
> > @@ -2113,25 +2175,19 @@ static u32 lockup_detected(struct ctlr_info *h)
> >  	return rc;
> >  }
> >
> > -static void hpsa_scsi_do_simple_cmd_core_if_no_lockup(struct ctlr_info *h,
> > -	struct CommandList *c)
> > -{
> > -	/* If controller lockup detected, fake a hardware error. */
> > -	if (unlikely(lockup_detected(h)))
> > -		c->err_info->CommandStatus = CMD_HARDWARE_ERR;
> > -	else
> > -		hpsa_scsi_do_simple_cmd_core(h, c);
> > -}
> > -
> >  #define MAX_DRIVER_CMD_RETRIES 25
> > -static void hpsa_scsi_do_simple_cmd_with_retry(struct ctlr_info *h,
> > -	struct CommandList *c, int data_direction)
> > +static int hpsa_scsi_do_simple_cmd_with_retry(struct ctlr_info *h,
> > +	struct CommandList *c, int data_direction, unsigned long timeout_msecs)
> >  {
> >  	int backoff_time = 10, retry_count = 0;
> > +	int rc;
> >
> >  	do {
> >  		memset(c->err_info, 0, sizeof(*c->err_info));
> > -		hpsa_scsi_do_simple_cmd_core(h, c);
> > +		rc = hpsa_scsi_do_simple_cmd(h, c, DEFAULT_REPLY_QUEUE,
> > +						  timeout_msecs);
> > +		if (rc)
> > +			break;
> >  		retry_count++;
> >  		if (retry_count > 3) {
> >  			msleep(backoff_time);
> > @@ -2142,6 +2198,9 @@ static void
> hpsa_scsi_do_simple_cmd_with_retry(struct ctlr_info *h,
> >  			check_for_busy(h, c)) &&
> >  			retry_count <= MAX_DRIVER_CMD_RETRIES);
> >  	hpsa_pci_unmap(h->pdev, c, 1, data_direction);
> > +	if (retry_count > MAX_DRIVER_CMD_RETRIES)
> > +		rc = -EIO;
> > +	return rc;
> >  }
> >
> >  static void hpsa_print_cmd(struct ctlr_info *h, char *txt,
> > @@ -2218,6 +2277,9 @@ static void hpsa_scsi_interpret_error(struct ctlr_info
> *h,
> >  	case CMD_UNABORTABLE:
> >  		hpsa_print_cmd(h, "unabortable", cp);
> >  		break;
> > +	case CMD_CTLR_LOCKUP:
> > +		hpsa_print_cmd(h, "controller lockup detected", cp);
> > +		break;
> >  	default:
> >  		hpsa_print_cmd(h, "unknown status", cp);
> >  		dev_warn(d, "Unknown command status %x\n",
> > @@ -2245,7 +2307,10 @@ static int hpsa_scsi_do_inquiry(struct ctlr_info *h,
> unsigned char *scsi3addr,
> >  		rc = -1;
> >  		goto out;
> >  	}
> > -	hpsa_scsi_do_simple_cmd_with_retry(h, c, PCI_DMA_FROMDEVICE);
> > +	rc = hpsa_scsi_do_simple_cmd_with_retry(h, c,
> > +			PCI_DMA_FROMDEVICE, NO_TIMEOUT);
> > +	if (rc)
> > +		goto out;
> >  	ei = c->err_info;
> >  	if (ei->CommandStatus != 0 && ei->CommandStatus !=
> CMD_DATA_UNDERRUN) {
> >  		hpsa_scsi_interpret_error(h, c);
> > @@ -2275,7 +2340,10 @@ static int hpsa_bmic_ctrl_mode_sense(struct
> ctlr_info *h,
> >  		rc = -1;
> >  		goto out;
> >  	}
> > -	hpsa_scsi_do_simple_cmd_with_retry(h, c, PCI_DMA_FROMDEVICE);
> > +	rc = hpsa_scsi_do_simple_cmd_with_retry(h, c,
> > +					PCI_DMA_FROMDEVICE, NO_TIMEOUT);
> > +	if (rc)
> > +		goto out;
> >  	ei = c->err_info;
> >  	if (ei->CommandStatus != 0 && ei->CommandStatus !=
> CMD_DATA_UNDERRUN) {
> >  		hpsa_scsi_interpret_error(h, c);
> > @@ -2287,7 +2355,7 @@ out:
> >  	}
> >
> >  static int hpsa_send_reset(struct ctlr_info *h, unsigned char *scsi3addr,
> > -	u8 reset_type)
> > +	u8 reset_type, int reply_queue)
> >  {
> >  	int rc = IO_OK;
> >  	struct CommandList *c;
> > @@ -2304,7 +2372,11 @@ static int hpsa_send_reset(struct ctlr_info *h,
> unsigned char *scsi3addr,
> >  	(void) fill_cmd(c, HPSA_DEVICE_RESET_MSG, h, NULL, 0, 0,
> >  			scsi3addr, TYPE_MSG);
> >  	c->Request.CDB[1] = reset_type; /* fill_cmd defaults to LUN reset */
> > -	hpsa_scsi_do_simple_cmd_core(h, c);
> > +	rc = hpsa_scsi_do_simple_cmd(h, c, reply_queue, NO_TIMEOUT);
> > +	if (rc) {
> > +		dev_warn(&h->pdev->dev, "Failed to send reset command\n");
> > +		goto out;
> > +	}
> >  	/* no unmap needed here because no data xfer. */
> >
> >  	ei = c->err_info;
> > @@ -2312,6 +2384,7 @@ static int hpsa_send_reset(struct ctlr_info *h,
> unsigned char *scsi3addr,
> >  		hpsa_scsi_interpret_error(h, c);
> >  		rc = -1;
> >  	}
> > +out:
> >  	cmd_free(h, c);
> >  	return rc;
> >  }
> > @@ -2429,15 +2502,18 @@ static int hpsa_get_raid_map(struct ctlr_info *h,
> >  			sizeof(this_device->raid_map), 0,
> >  			scsi3addr, TYPE_CMD)) {
> >  		dev_warn(&h->pdev->dev, "Out of memory in
> hpsa_get_raid_map()\n");
> > -		cmd_free(h, c);
> > -		return -ENOMEM;
> > +		rc = -ENOMEM;
> > +		goto out;
> >  	}
> > -	hpsa_scsi_do_simple_cmd_with_retry(h, c, PCI_DMA_FROMDEVICE);
> > +	rc = hpsa_scsi_do_simple_cmd_with_retry(h, c,
> > +					PCI_DMA_FROMDEVICE, NO_TIMEOUT);
> > +	if (rc)
> > +		goto out;
> >  	ei = c->err_info;
> >  	if (ei->CommandStatus != 0 && ei->CommandStatus !=
> CMD_DATA_UNDERRUN) {
> >  		hpsa_scsi_interpret_error(h, c);
> > -		cmd_free(h, c);
> > -		return -1;
> > +		rc = -1;
> > +		goto out;
> >  	}
> >  	cmd_free(h, c);
> >
> > @@ -2449,6 +2525,9 @@ static int hpsa_get_raid_map(struct ctlr_info *h,
> >  	}
> >  	hpsa_debug_map_buff(h, rc, &this_device->raid_map);
> >  	return rc;
> > +out:
> > +	cmd_free(h, c);
> > +	return rc;
> >  }
> >
> >  static int hpsa_bmic_id_physical_device(struct ctlr_info *h,
> > @@ -2468,7 +2547,8 @@ static int hpsa_bmic_id_physical_device(struct
> ctlr_info *h,
> >  	c->Request.CDB[2] = bmic_device_index & 0xff;
> >  	c->Request.CDB[9] = (bmic_device_index >> 8) & 0xff;
> >
> > -	hpsa_scsi_do_simple_cmd_with_retry(h, c, PCI_DMA_FROMDEVICE);
> > +	hpsa_scsi_do_simple_cmd_with_retry(h, c, PCI_DMA_FROMDEVICE,
> > +						NO_TIMEOUT);
> >  	ei = c->err_info;
> >  	if (ei->CommandStatus != 0 && ei->CommandStatus !=
> CMD_DATA_UNDERRUN) {
> >  		hpsa_scsi_interpret_error(h, c);
> > @@ -2603,7 +2683,10 @@ static int hpsa_scsi_do_report_luns(struct ctlr_info
> *h, int logical,
> >  	}
> >  	if (extended_response)
> >  		c->Request.CDB[1] = extended_response;
> > -	hpsa_scsi_do_simple_cmd_with_retry(h, c, PCI_DMA_FROMDEVICE);
> > +	rc = hpsa_scsi_do_simple_cmd_with_retry(h, c,
> > +					PCI_DMA_FROMDEVICE, NO_TIMEOUT);
> > +	if (rc)
> > +		goto out;
> >  	ei = c->err_info;
> >  	if (ei->CommandStatus != 0 &&
> >  	    ei->CommandStatus != CMD_DATA_UNDERRUN) {
> > @@ -2696,7 +2779,7 @@ static int hpsa_volume_offline(struct ctlr_info *h,
> >  {
> >  	struct CommandList *c;
> >  	unsigned char *sense, sense_key, asc, ascq;
> > -	int ldstat = 0;
> > +	int rc, ldstat = 0;
> >  	u16 cmd_status;
> >  	u8 scsi_status;
> >  #define ASC_LUN_NOT_READY 0x04
> > @@ -2707,7 +2790,11 @@ static int hpsa_volume_offline(struct ctlr_info *h,
> >  	if (!c)
> >  		return 0;
> >  	(void) fill_cmd(c, TEST_UNIT_READY, h, NULL, 0, 0, scsi3addr,
> TYPE_CMD);
> > -	hpsa_scsi_do_simple_cmd_core(h, c);
> > +	rc = hpsa_scsi_do_simple_cmd(h, c, DEFAULT_REPLY_QUEUE, NO_TIMEOUT);
> > +	if (rc) {
> > +		cmd_free(h, c);
> > +		return 0;
> > +	}
> >  	sense = c->err_info->SenseInfo;
> >  	sense_key = sense[2];
> >  	asc = sense[12];
> > @@ -4040,7 +4127,11 @@ static int hpsa_scsi_ioaccel_raid_map(struct
> ctlr_info *h,
> >  						dev->phys_disk[map_index]);
> >  }
> >
> > -/* Submit commands down the "normal" RAID stack path */
> > +/*
> > + * Submit commands down the "normal" RAID stack path
> > + * All callers to hpsa_ciss_submit must check lockup_detected
> > + * beforehand, before (opt.) and after calling cmd_alloc
> > + */
> >  static int hpsa_ciss_submit(struct ctlr_info *h,
> >  	struct CommandList *c, struct scsi_cmnd *cmd,
> >  	unsigned char scsi3addr[])
> > @@ -4151,7 +4242,7 @@ static int hpsa_scsi_queue_command(struct
> Scsi_Host *sh, struct scsi_cmnd *cmd)
> >  	memcpy(scsi3addr, dev->scsi3addr, sizeof(scsi3addr));
> >
> >  	if (unlikely(lockup_detected(h))) {
> > -		cmd->result = DID_ERROR << 16;
> > +		cmd->result = DID_NO_CONNECT << 16;
> >  		cmd->scsi_done(cmd);
> >  		return 0;
> >  	}
> > @@ -4161,7 +4252,7 @@ static int hpsa_scsi_queue_command(struct
> Scsi_Host *sh, struct scsi_cmnd *cmd)
> >  		return SCSI_MLQUEUE_HOST_BUSY;
> >  	}
> >  	if (unlikely(lockup_detected(h))) {
> > -		cmd->result = DID_ERROR << 16;
> > +		cmd->result = DID_NO_CONNECT << 16;
> >  		cmd_free(h, c);
> >  		cmd->scsi_done(cmd);
> >  		return 0;
> > @@ -4356,7 +4447,10 @@ static int
> wait_for_device_to_become_ready(struct ctlr_info *h,
> >  		/* Send the Test Unit Ready, fill_cmd can't fail, no mapping */
> >  		(void) fill_cmd(c, TEST_UNIT_READY, h,
> >  				NULL, 0, 0, lunaddr, TYPE_CMD);
> > -		hpsa_scsi_do_simple_cmd_core(h, c);
> > +		rc = hpsa_scsi_do_simple_cmd(h, c, DEFAULT_REPLY_QUEUE,
> > +						NO_TIMEOUT);
> > +		if (rc)
> > +			goto do_it_again;
> >  		/* no unmap needed here because no data xfer. */
> >
> >  		if (c->err_info->CommandStatus == CMD_SUCCESS)
> > @@ -4367,7 +4461,7 @@ static int wait_for_device_to_become_ready(struct
> ctlr_info *h,
> >  			(c->err_info->SenseInfo[2] == NO_SENSE ||
> >  			c->err_info->SenseInfo[2] == UNIT_ATTENTION))
> >  			break;
> > -
> > +do_it_again:
> >  		dev_warn(&h->pdev->dev, "waiting %d secs "
> >  			"for device to become ready.\n", waittime);
> >  		rc = 1; /* device not ready. */
> > @@ -4405,13 +4499,46 @@ static int hpsa_eh_device_reset_handler(struct
> scsi_cmnd *scsicmd)
> >  			"device lookup failed.\n");
> >  		return FAILED;
> >  	}
> > -	dev_warn(&h->pdev->dev, "resetting device %d:%d:%d:%d\n",
> > -		h->scsi_host->host_no, dev->bus, dev->target, dev->lun);
> > +
> > +	/* if controller locked up, we can guarantee command won't complete
> */
> > +	if (lockup_detected(h)) {
> > +		dev_warn(&h->pdev->dev,
> > +			"scsi %d:%d:%d:%d RESET FAILED, lockup detected\n",
> > +			h->scsi_host->host_no, dev->bus, dev->target,
> > +			dev->lun);
> > +		return FAILED;
> > +	}
> > +
> > +	/* this reset request might be the result of a lockup; check */
> > +	if (detect_controller_lockup(h)) {
> > +		dev_warn(&h->pdev->dev,
> > +			 "scsi %d:%d:%d:%d RESET FAILED, new lockup detected\n",
> > +			 h->scsi_host->host_no, dev->bus, dev->target,
> > +			 dev->lun);
> > +		return FAILED;
> > +	}
> > +
> > +	dev_warn(&h->pdev->dev,
> > +		"scsi %d:%d:%d:%d: %s %.8s %.16s resetting RAID-%s SSDSmartPathCap%c En%c Exp=%d\n",
> > +		h->scsi_host->host_no, dev->bus, dev->target, dev->lun,
> > +		scsi_device_type(dev->devtype),
> > +		dev->vendor,
> > +		dev->model,
> > +		dev->raid_level > RAID_UNKNOWN ?
> > +			"RAID-?" : raid_label[dev->raid_level],
> > +		dev->offload_config ? '+' : '-',
> > +		dev->offload_enabled ? '+' : '-',
> > +		dev->expose_state);
> > +
> >  	/* send a reset to the SCSI LUN which the command was sent to */
> > -	rc = hpsa_send_reset(h, dev->scsi3addr, HPSA_RESET_TYPE_LUN);
> > +	rc = hpsa_send_reset(h, dev->scsi3addr, HPSA_RESET_TYPE_LUN,
> > +			     DEFAULT_REPLY_QUEUE);
> >  	if (rc == 0 && wait_for_device_to_become_ready(h, dev->scsi3addr) ==
> 0)
> >  		return SUCCESS;
> >
> > +	dev_warn(&h->pdev->dev,
> > +		"scsi %d:%d:%d:%d reset failed\n",
> > +		h->scsi_host->host_no, dev->bus, dev->target, dev->lun);
> >  	return FAILED;
> >  }
> >
> > @@ -4456,7 +4583,7 @@ static void hpsa_get_tag(struct ctlr_info *h,
> >  }
> >
> >  static int hpsa_send_abort(struct ctlr_info *h, unsigned char *scsi3addr,
> > -	struct CommandList *abort, int swizzle)
> > +	struct CommandList *abort, int swizzle, int reply_queue)
> >  {
> >  	int rc = IO_OK;
> >  	struct CommandList *c;
> > @@ -4474,9 +4601,9 @@ static int hpsa_send_abort(struct ctlr_info *h,
> unsigned char *scsi3addr,
> >  		0, 0, scsi3addr, TYPE_MSG);
> >  	if (swizzle)
> >  		swizzle_abort_tag(&c->Request.CDB[4]);
> > -	hpsa_scsi_do_simple_cmd_core(h, c);
> > +	(void) hpsa_scsi_do_simple_cmd(h, c, reply_queue, NO_TIMEOUT);
> >  	hpsa_get_tag(h, abort, &taglower, &tagupper);
> > -	dev_dbg(&h->pdev->dev, "%s: Tag:0x%08x:%08x: do_simple_cmd_core
> completed.\n",
> > +	dev_dbg(&h->pdev->dev, "%s: Tag:0x%08x:%08x: do_simple_cmd(abort) completed.\n",
> >  		__func__, tagupper, taglower);
> >  	/* no unmap needed here because no data xfer. */
> >
> > @@ -4508,7 +4635,7 @@ static int hpsa_send_abort(struct ctlr_info *h,
> unsigned char *scsi3addr,
> >   */
> >
> >  static int hpsa_send_reset_as_abort_ioaccel2(struct ctlr_info *h,
> > -	unsigned char *scsi3addr, struct CommandList *abort)
> > +	unsigned char *scsi3addr, struct CommandList *abort, int reply_queue)
> >  {
> >  	int rc = IO_OK;
> >  	struct scsi_cmnd *scmd; /* scsi command within request being aborted
> */
> > @@ -4551,7 +4678,7 @@ static int
> hpsa_send_reset_as_abort_ioaccel2(struct ctlr_info *h,
> >  			"Reset as abort: Resetting physical device at scsi3addr
> 0x%02x%02x%02x%02x%02x%02x%02x%02x\n",
> >  			psa[0], psa[1], psa[2], psa[3],
> >  			psa[4], psa[5], psa[6], psa[7]);
> > -	rc = hpsa_send_reset(h, psa, HPSA_RESET_TYPE_TARGET);
> > +	rc = hpsa_send_reset(h, psa, HPSA_RESET_TYPE_TARGET, reply_queue);
> >  	if (rc != 0) {
> >  		dev_warn(&h->pdev->dev,
> >  			"Reset as abort: Failed on physical device at scsi3addr
> 0x%02x%02x%02x%02x%02x%02x%02x%02x\n",
> > @@ -4585,7 +4712,7 @@ static int
> hpsa_send_reset_as_abort_ioaccel2(struct ctlr_info *h,
> >   * make this true someday become false.
> >   */
> >  static int hpsa_send_abort_both_ways(struct ctlr_info *h,
> > -	unsigned char *scsi3addr, struct CommandList *abort)
> > +	unsigned char *scsi3addr, struct CommandList *abort, int reply_queue)
> >  {
> >  	/* ioccelerator mode 2 commands should be aborted via the
> >  	 * accelerated path, since RAID path is unaware of these commands,
> > @@ -4593,10 +4720,20 @@ static int hpsa_send_abort_both_ways(struct
> ctlr_info *h,
> >  	 * Change abort to physical device reset.
> >  	 */
> >  	if (abort->cmd_type == CMD_IOACCEL2)
> > -		return hpsa_send_reset_as_abort_ioaccel2(h, scsi3addr, abort);
> > +		return hpsa_send_reset_as_abort_ioaccel2(h, scsi3addr,
> > +							abort, reply_queue);
> > +
> > +	return hpsa_send_abort(h, scsi3addr, abort, 0, reply_queue) &&
> > +			hpsa_send_abort(h, scsi3addr, abort, 1, reply_queue);
> > +}
> >
> > -	return hpsa_send_abort(h, scsi3addr, abort, 0) &&
> > -			hpsa_send_abort(h, scsi3addr, abort, 1);
> > +/* Find out which reply queue a command was meant to return on */
> > +static int hpsa_extract_reply_queue(struct ctlr_info *h,
> > +					struct CommandList *c)
> > +{
> > +	if (c->cmd_type == CMD_IOACCEL2)
> > +		return h->ioaccel2_cmd_pool[c->cmdindex].reply_queue;
> > +	return c->Header.ReplyQueue;
> >  }
> >
> >  /* Send an abort for the specified command.
> > @@ -4614,7 +4751,7 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd
> *sc)
> >  	char msg[256];		/* For debug messaging. */
> >  	int ml = 0;
> >  	__le32 tagupper, taglower;
> > -	int refcount;
> > +	int refcount, reply_queue;
> >
> >  	/* Find the controller of the command to be aborted */
> >  	h = sdev_to_hba(sc->device);
> > @@ -4622,8 +4759,23 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd
> *sc)
> >  			"ABORT REQUEST FAILED, Controller lookup failed.\n"))
> >  		return FAILED;
> >
> > -	if (lockup_detected(h))
> > +	/* If controller locked up, we can guarantee command won't complete
> */
> > +	if (lockup_detected(h)) {
> > +		dev_warn(&h->pdev->dev,
> > +			"scsi %d:%d:%d:%llu scmd %p ABORT FAILED, lockup detected\n",
> > +			h->scsi_host->host_no, sc->device->channel,
> > +			sc->device->id, sc->device->lun, sc);
> >  		return FAILED;
> > +	}
> > +
> > +	/* This is a good time to check if controller lockup has occurred */
> > +	if (detect_controller_lockup(h)) {
> > +		dev_warn(&h->pdev->dev,
> > +			 "scsi %d:%d:%d:%llu scmd %p ABORT FAILED, new lockup detected\n",
> > +			 h->scsi_host->host_no, sc->device->channel,
> > +			 sc->device->id, sc->device->lun, sc);
> > +		return FAILED;
> > +	}
> >
> >  	/* Check that controller supports some kind of task abort */
> >  	if (!(HPSATMF_PHYS_TASK_ABORT & h->TMFSupportFlags) &&
> > @@ -4656,6 +4808,7 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd
> *sc)
> >  		return SUCCESS;
> >  	}
> >  	hpsa_get_tag(h, abort, &taglower, &tagupper);
> > +	reply_queue = hpsa_extract_reply_queue(h, abort);
> >  	ml += sprintf(msg+ml, "Tag:0x%08x:%08x ", tagupper, taglower);
> >  	as  = abort->scsi_cmd;
> >  	if (as != NULL)
> > @@ -4670,7 +4823,7 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd
> *sc)
> >  	 * by the firmware (but not to the scsi mid layer) but we can't
> >  	 * distinguish which.  Send the abort down.
> >  	 */
> > -	rc = hpsa_send_abort_both_ways(h, dev->scsi3addr, abort);
> > +	rc = hpsa_send_abort_both_ways(h, dev->scsi3addr, abort,
> reply_queue);
> >  	if (rc != 0) {
> >  		dev_warn(&h->pdev->dev, "scsi %d:%d:%d:%d %s\n",
> >  			h->scsi_host->host_no,
> > @@ -4995,7 +5148,9 @@ static int hpsa_passthru_ioctl(struct ctlr_info *h,
> void __user *argp)
> >  		c->SG[0].Len = cpu_to_le32(iocommand.buf_size);
> >  		c->SG[0].Ext = cpu_to_le32(HPSA_SG_LAST); /* not chaining */
> >  	}
> > -	hpsa_scsi_do_simple_cmd_core_if_no_lockup(h, c);
> > +	rc = hpsa_scsi_do_simple_cmd(h, c, DEFAULT_REPLY_QUEUE, NO_TIMEOUT);
> > +	if (rc)
> > +		rc = -EIO;
> 
> We just pretend here that an error path exists; with NO_TIMEOUT the
> function can't fail, but if it could, we might end up copying random
> data from kernel to user space.
> 
> >  	if (iocommand.buf_size > 0)
> >  		hpsa_pci_unmap(h->pdev, c, 1, PCI_DMA_BIDIRECTIONAL);
> >  	check_ioctl_unit_attention(h, c);
> > @@ -5125,7 +5280,11 @@ static int hpsa_big_passthru_ioctl(struct ctlr_info
> *h, void __user *argp)
> >  		}
> >  		c->SG[--i].Ext = cpu_to_le32(HPSA_SG_LAST);
> >  	}
> > -	hpsa_scsi_do_simple_cmd_core_if_no_lockup(h, c);
> > +	status = hpsa_scsi_do_simple_cmd(h, c, DEFAULT_REPLY_QUEUE, NO_TIMEOUT);
> > +	if (status) {
> > +		status = -EIO;
> > +		goto cleanup0;
> 
> Similar here: by taking the goto cleanup0, we miss a call to hpsa_pci_unmap.
> Neither is an issue right now, because hpsa_scsi_do_simple_cmd(h, c,
> DEFAULT_REPLY_QUEUE, NO_TIMEOUT) can't fail, but it is a trap prepared for
> a future change that introduces a real timeout.
> 
> As it stands this is not a real issue; once it's fixed in the next driver
> update, it's fine with me.
> 
> Tomas
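[The hazard Tomas describes can be sketched with stand-in names — fake_submit, fake_ioctl, and the hard-coded errno values below are illustrative only, not the driver's functions. Once submission can really time out, the rc check must come before the copy back to user space, or the caller receives a buffer the controller never filled in:]

```c
#include <string.h>

/* Pretend command submission: on timeout the data buffer is left
 * untouched, exactly the case Tomas is warning about. */
static int fake_submit(int timed_out, char *buf, size_t len)
{
	if (timed_out)
		return -110; /* stand-in for -ETIMEDOUT */
	memset(buf, 0xab, len); /* "controller" filled in the data */
	return 0;
}

/* Correct shape of the ioctl path: bail out before the copy when
 * submission failed, so no stale kernel buffer reaches the caller. */
static int fake_ioctl(int timed_out, char *user_buf, size_t len)
{
	char kbuf[16] = { 0 };
	int rc = fake_submit(timed_out, kbuf, sizeof(kbuf));

	if (rc)
		return -5; /* stand-in for -EIO, nothing copied */
	memcpy(user_buf, kbuf, len); /* stands in for copy_to_user() */
	return 0;
}
```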
> 
> > +	}
> >  	if (sg_used)
> >  		hpsa_pci_unmap(h->pdev, c, sg_used,
> PCI_DMA_BIDIRECTIONAL);
> >  	check_ioctl_unit_attention(h, c);
> > @@ -6272,6 +6431,8 @@ static int hpsa_wait_for_mode_change_ack(struct
> ctlr_info *h)
> >  	 * as we enter this code.)
> >  	 */
> >  	for (i = 0; i < MAX_MODE_CHANGE_WAIT; i++) {
> > +		if (h->remove_in_progress)
> > +			goto done;
> >  		spin_lock_irqsave(&h->lock, flags);
> >  		doorbell_value = readl(h->vaddr + SA5_DOORBELL);
> >  		spin_unlock_irqrestore(&h->lock, flags);
> > @@ -6667,17 +6828,21 @@ static void fail_all_outstanding_cmds(struct
> ctlr_info *h)
> >  {
> >  	int i, refcount;
> >  	struct CommandList *c;
> > +	int failcount = 0;
> >
> >  	flush_workqueue(h->resubmit_wq); /* ensure all cmds are fully built */
> >  	for (i = 0; i < h->nr_cmds; i++) {
> >  		c = h->cmd_pool + i;
> >  		refcount = atomic_inc_return(&c->refcount);
> >  		if (refcount > 1) {
> > -			c->err_info->CommandStatus = CMD_HARDWARE_ERR;
> > +			c->err_info->CommandStatus = CMD_CTLR_LOCKUP;
> >  			finish_cmd(c);
> > +			failcount++;
> >  		}
> >  		cmd_free(h, c);
> >  	}
> > +	dev_warn(&h->pdev->dev,
> > +		"failed %d commands in fail_all\n", failcount);
> >  }
> >
> >  static void set_lockup_detected_for_all_cpus(struct ctlr_info *h, u32 value)
> > @@ -6705,18 +6870,19 @@ static void controller_lockup_detected(struct
> ctlr_info *h)
> >  	if (!lockup_detected) {
> >  		/* no heartbeat, but controller gave us a zero. */
> >  		dev_warn(&h->pdev->dev,
> > -			"lockup detected but scratchpad register is zero\n");
> > +			"lockup detected after %d but scratchpad register is zero\n",
> > +			h->heartbeat_sample_interval / HZ);
> >  		lockup_detected = 0xffffffff;
> >  	}
> >  	set_lockup_detected_for_all_cpus(h, lockup_detected);
> >  	spin_unlock_irqrestore(&h->lock, flags);
> > -	dev_warn(&h->pdev->dev, "Controller lockup detected: 0x%08x\n",
> > -			lockup_detected);
> > +	dev_warn(&h->pdev->dev, "Controller lockup detected: 0x%08x after %d\n",
> > +			lockup_detected, h->heartbeat_sample_interval / HZ);
> >  	pci_disable_device(h->pdev);
> >  	fail_all_outstanding_cmds(h);
> >  }
> >
> > -static void detect_controller_lockup(struct ctlr_info *h)
> > +static int detect_controller_lockup(struct ctlr_info *h)
> >  {
> >  	u64 now;
> >  	u32 heartbeat;
> > @@ -6726,7 +6892,7 @@ static void detect_controller_lockup(struct ctlr_info *h)
> >  	/* If we've received an interrupt recently, we're ok. */
> >  	if (time_after64(h->last_intr_timestamp +
> >  				(h->heartbeat_sample_interval), now))
> > -		return;
> > +		return false;
> >
> >  	/*
> >  	 * If we've already checked the heartbeat recently, we're ok.
> > @@ -6735,7 +6901,7 @@ static void detect_controller_lockup(struct ctlr_info *h)
> >  	 */
> >  	if (time_after64(h->last_heartbeat_timestamp +
> >  				(h->heartbeat_sample_interval), now))
> > -		return;
> > +		return false;
> >
> >  	/* If heartbeat has not changed since we last looked, we're not ok. */
> >  	spin_lock_irqsave(&h->lock, flags);
> > @@ -6743,12 +6909,13 @@ static void detect_controller_lockup(struct ctlr_info *h)
> >  	spin_unlock_irqrestore(&h->lock, flags);
> >  	if (h->last_heartbeat == heartbeat) {
> >  		controller_lockup_detected(h);
> > -		return;
> > +		return true;
> >  	}
> >
> >  	/* We're ok. */
> >  	h->last_heartbeat = heartbeat;
> >  	h->last_heartbeat_timestamp = now;
> > +	return false;
> >  }
> >
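The lockup test above reduces to three timestamp comparisons against the heartbeat sample interval. A minimal user-space sketch of the same state machine (field names and the tick type here are illustrative; the driver works in jiffies and uses time_after64(), which also copes with counter wraparound, unlike the plain comparisons below):

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for the ctlr_info fields used by the check. */
struct hb_state {
	uint64_t last_intr;      /* time of last interrupt, in ticks */
	uint64_t last_check;     /* time the heartbeat was last sampled */
	uint32_t last_heartbeat; /* heartbeat counter at the last sample */
	uint64_t interval;       /* heartbeat_sample_interval, in ticks */
};

/* Returns true when the controller appears locked up. */
static bool heartbeat_lockup(struct hb_state *s, uint64_t now, uint32_t hb)
{
	/* If we've received an interrupt recently, we're ok. */
	if (now < s->last_intr + s->interval)
		return false;
	/* If we've already checked the heartbeat recently, we're ok. */
	if (now < s->last_check + s->interval)
		return false;
	/* Heartbeat unchanged since we last looked: not ok. */
	if (s->last_heartbeat == hb)
		return true;
	/* We're ok: remember the new sample. */
	s->last_heartbeat = hb;
	s->last_check = now;
	return false;
}
```

The real function also serializes the heartbeat read under h->lock; that locking is omitted from this single-threaded sketch.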
> >  static void hpsa_ack_ctlr_events(struct ctlr_info *h)
> > @@ -7092,8 +7259,10 @@ static void hpsa_flush_cache(struct ctlr_info *h)
> >  {
> >  	char *flush_buf;
> >  	struct CommandList *c;
> > +	int rc;
> >
> >  	/* Don't bother trying to flush the cache if locked up */
> > +	/* FIXME not necessary if do_simple_cmd does the check */
> >  	if (unlikely(lockup_detected(h)))
> >  		return;
> >  	flush_buf = kzalloc(4, GFP_KERNEL);
> > @@ -7109,7 +7278,10 @@ static void hpsa_flush_cache(struct ctlr_info *h)
> >  		RAID_CTLR_LUNID, TYPE_CMD)) {
> >  		goto out;
> >  	}
> > -	hpsa_scsi_do_simple_cmd_with_retry(h, c, PCI_DMA_TODEVICE);
> > +	rc = hpsa_scsi_do_simple_cmd_with_retry(h, c,
> > +					PCI_DMA_TODEVICE, NO_TIMEOUT);
> > +	if (rc)
> > +		goto out;
> >  	if (c->err_info->CommandStatus != 0)
> >  out:
> >  		dev_warn(&h->pdev->dev,
> > diff --git a/drivers/scsi/hpsa_cmd.h b/drivers/scsi/hpsa_cmd.h
> > index 76d5499..f52c847 100644
> > --- a/drivers/scsi/hpsa_cmd.h
> > +++ b/drivers/scsi/hpsa_cmd.h
> > @@ -43,6 +43,11 @@
> >  #define CMD_TIMEOUT             0x000B
> >  #define CMD_UNABORTABLE		0x000C
> >  #define CMD_IOACCEL_DISABLED	0x000E
> > +#define CMD_CTLR_LOCKUP		0xffff
> > +/* Note: CMD_CTLR_LOCKUP is not a value defined by the CISS spec
> > + * it is a value defined by the driver that commands can be marked
> > + * with when a controller lockup has been detected by the driver
> > + */
> >
> >
> >  /* Unit Attentions ASC's as defined for the MSA2012sa */
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 54+ messages in thread

* RE: [PATCH v3 37/42] hpsa: use block layer tag for command allocation
  2015-03-23 16:57   ` Tomas Henzl
       [not found]     ` <07F70BBF6832E34FA1C923241E8833AB486892F9@BBYEXM01.pmc-sierra.internal>
@ 2015-03-27 18:49     ` brace
  1 sibling, 0 replies; 54+ messages in thread
From: brace @ 2015-03-27 18:49 UTC (permalink / raw)
  To: Tomas Henzl, Scott Teel, Kevin Barnett, james.bottomley, hch,
	Justin Lindley, brace
  Cc: linux-scsi

I'll send up another patch to fix this issue.

> -----Original Message-----
> From: Tomas Henzl [mailto:thenzl@redhat.com]
> Sent: Monday, March 23, 2015 11:58 AM
> To: Don Brace; Scott Teel; Kevin Barnett; james.bottomley@parallels.com;
> hch@infradead.org; Justin Lindley; brace
> Cc: linux-scsi@vger.kernel.org
> Subject: Re: [PATCH v3 37/42] hpsa: use block layer tag for command allocation
> 
> On 03/17/2015 09:06 PM, Don Brace wrote:
> > From: Webb Scales <webbnh@hp.com>
> >
> > Rework slave allocation:
> >   - separate the tagging support setup from the hostdata setup
> >   - make the hostdata setup act consistently when the lookup fails
> >   - make the hostdata setup act consistently when the device is not added
> >   - set up the queue depth consistently across these scenarios
> >   - if the block layer mq support is not available, explicitly enable and
> >     activate the SCSI layer tcq support (and do this at allocation-time so
> >     that the tags will be available for INQUIRY commands)
> >
> > Tweak slave configuration so that devices which are masked are also
> > not attached.
> >
> > Reviewed-by: Scott Teel <scott.teel@pmcs.com>
> > Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
> > Signed-off-by: Webb Scales <webbnh@hp.com>
> > Signed-off-by: Don Brace <don.brace@pmcs.com>
> > ---
> >  drivers/scsi/hpsa.c |  153 +++++++++++++++++++++++++++++++++++++++++----------
> >  drivers/scsi/hpsa.h |    1
> >  2 files changed, 123 insertions(+), 31 deletions(-)
> >
> > diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
> > index 34c178c..4e34a62 100644
> > --- a/drivers/scsi/hpsa.c
> > +++ b/drivers/scsi/hpsa.c
> > @@ -44,6 +44,7 @@
> >  #include <scsi/scsi_host.h>
> >  #include <scsi/scsi_tcq.h>
> >  #include <scsi/scsi_eh.h>
> > +#include <scsi/scsi_dbg.h>
> >  #include <linux/cciss_ioctl.h>
> >  #include <linux/string.h>
> >  #include <linux/bitmap.h>
> > @@ -212,6 +213,9 @@ static int hpsa_compat_ioctl(struct scsi_device *dev, int cmd,
> >
> >  static void cmd_free(struct ctlr_info *h, struct CommandList *c);
> >  static struct CommandList *cmd_alloc(struct ctlr_info *h);
> > +static void cmd_tagged_free(struct ctlr_info *h, struct CommandList *c);
> > +static struct CommandList *cmd_tagged_alloc(struct ctlr_info *h,
> > +					    struct scsi_cmnd *scmd);
> >  static int fill_cmd(struct CommandList *c, u8 cmd, struct ctlr_info *h,
> >  	void *buff, size_t size, u16 page_code, unsigned char *scsi3addr,
> >  	int cmd_type);
> > @@ -2047,11 +2051,17 @@ static void hpsa_cmd_resolve_events(struct ctlr_info *h,
> >  	}
> >  }
> >
> > +static void hpsa_cmd_resolve_and_free(struct ctlr_info *h,
> > +				      struct CommandList *c)
> > +{
> > +	hpsa_cmd_resolve_events(h, c);
> > +	cmd_tagged_free(h, c);
> > +}
> > +
> >  static void hpsa_cmd_free_and_done(struct ctlr_info *h,
> >  		struct CommandList *c, struct scsi_cmnd *cmd)
> >  {
> > -	hpsa_cmd_resolve_events(h, c);
> > -	cmd_free(h, c);
> > +	hpsa_cmd_resolve_and_free(h, c);
> >  	cmd->scsi_done(cmd);
> >  }
> >
> > @@ -2072,8 +2082,7 @@ static void hpsa_cmd_abort_and_free(struct ctlr_info *h, struct CommandList *c,
> >  	hpsa_set_scsi_cmd_aborted(cmd);
> >  	dev_warn(&h->pdev->dev, "CDB %16phN was aborted with status 0x%x\n",
> >  			 c->Request.CDB, c->err_info->ScsiStatus);
> > -	hpsa_cmd_resolve_events(h, c);
> > -	cmd_free(h, c);		/* FIX-ME:  change to cmd_tagged_free(h, c) */
> > +	hpsa_cmd_resolve_and_free(h, c);
> >  }
> >
> >  static void process_ioaccel2_completion(struct ctlr_info *h,
> > @@ -4535,7 +4544,7 @@ static int hpsa_ciss_submit(struct ctlr_info *h,
> >  	}
> >
> >  	if (hpsa_scatter_gather(h, c, cmd) < 0) { /* Fill SG list */
> > -		cmd_free(h, c);
> > +		hpsa_cmd_resolve_and_free(h, c);
> >  		return SCSI_MLQUEUE_HOST_BUSY;
> >  	}
> >  	enqueue_cmd_and_start_io(h, c);
> > @@ -4581,6 +4590,8 @@ static inline void hpsa_cmd_partial_init(struct ctlr_info *h, int index,
> >  {
> >  	dma_addr_t cmd_dma_handle = h->cmd_pool_dhandle + index * sizeof(*c);
> >
> > +	BUG_ON(c->cmdindex != index);
> > +
> >  	memset(c->Request.CDB, 0, sizeof(c->Request.CDB));
> >  	memset(c->err_info, 0, sizeof(*c->err_info));
> >  	c->busaddr = (u32) cmd_dma_handle;
> > @@ -4675,27 +4686,24 @@ static int hpsa_scsi_queue_command(struct Scsi_Host *sh, struct scsi_cmnd *cmd)
> >
> >  	/* Get the ptr to our adapter structure out of cmd->host. */
> >  	h = sdev_to_hba(cmd->device);
> > +
> > +	BUG_ON(cmd->request->tag < 0);
> > +
> >  	dev = cmd->device->hostdata;
> >  	if (!dev) {
> >  		cmd->result = DID_NO_CONNECT << 16;
> >  		cmd->scsi_done(cmd);
> >  		return 0;
> >  	}
> > -	memcpy(scsi3addr, dev->scsi3addr, sizeof(scsi3addr));
> >
> > -	if (unlikely(lockup_detected(h))) {
> > -		cmd->result = DID_NO_CONNECT << 16;
> > -		cmd->scsi_done(cmd);
> > -		return 0;
> > -	}
> > -	c = cmd_alloc(h);
> > +	memcpy(scsi3addr, dev->scsi3addr, sizeof(scsi3addr));
> >
> >  	if (unlikely(lockup_detected(h))) {
> >  		cmd->result = DID_NO_CONNECT << 16;
> > -		cmd_free(h, c);
> >  		cmd->scsi_done(cmd);
> >  		return 0;
> >  	}
> > +	c = cmd_tagged_alloc(h, cmd);
> >
> >  	/*
> >  	 * Call alternate submit routine for I/O accelerated commands.
> > @@ -4708,7 +4716,7 @@ static int hpsa_scsi_queue_command(struct Scsi_Host *sh, struct scsi_cmnd *cmd)
> >  		if (rc == 0)
> >  			return 0;
> >  		if (rc == SCSI_MLQUEUE_HOST_BUSY) {
> > -			cmd_free(h, c);
> > +			hpsa_cmd_resolve_and_free(h, c);
> >  			return SCSI_MLQUEUE_HOST_BUSY;
> >  		}
> >  	}
> > @@ -4822,15 +4830,23 @@ static int hpsa_register_scsi(struct ctlr_info *h)
> >  	sh->hostdata[0] = (unsigned long) h;
> >  	sh->irq = h->intr[h->intr_mode];
> >  	sh->unique_id = sh->irq;
> > +	error = scsi_init_shared_tag_map(sh, sh->can_queue);
> > +	if (error) {
> > +		dev_err(&h->pdev->dev,
> > +			"%s: scsi_init_shared_tag_map failed for controller %d\n",
> > +			__func__, h->ctlr);
> > +		goto fail_host_put;
> > +	}
> >  	error = scsi_add_host(sh, &h->pdev->dev);
> > -	if (error)
> > +	if (error) {
> > +		dev_err(&h->pdev->dev, "%s: scsi_add_host failed for controller %d\n",
> > +			__func__, h->ctlr);
> >  		goto fail_host_put;
> > +	}
> >  	scsi_scan_host(sh);
> >  	return 0;
> >
> >   fail_host_put:
> > -	dev_err(&h->pdev->dev, "%s: scsi_add_host"
> > -		" failed for controller %d\n", __func__, h->ctlr);
> >  	scsi_host_put(sh);
> >  	return error;
> >   fail:
> > @@ -4840,6 +4856,23 @@ static int hpsa_register_scsi(struct ctlr_info *h)
> >  }
> >
> >  /*
> > + * The block layer has already gone to the trouble of picking out a unique,
> > + * small-integer tag for this request.  We use an offset from that value as
> > + * an index to select our command block.  (The offset allows us to reserve the
> > + * low-numbered entries for our own uses.)
> > + */
> > +static int hpsa_get_cmd_index(struct scsi_cmnd *scmd)
> > +{
> > +	int idx = scmd->request->tag;
> > +
> > +	if (idx < 0)
> > +		return idx;
> > +
> > +	/* Offset to leave space for internal cmds. */
> > +	return idx += HPSA_NRESERVED_CMDS;
> > +}
> > +
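The reserved-range offset described in the comment above can be modeled in isolation. The constant below is a stand-in for HPSA_NRESERVED_CMDS (the driver's actual value may differ), and the bound check mirrors what cmd_tagged_alloc() enforces:

```c
#define NRESERVED 16	/* stand-in for HPSA_NRESERVED_CMDS */

/* Map a block-layer tag to a command-pool index, leaving the low
 * NRESERVED slots free for driver-internal commands.  A negative
 * tag (no tag assigned) is passed through unchanged. */
static int tag_to_index(int tag)
{
	if (tag < 0)
		return tag;
	return tag + NRESERVED;
}

/* The complementary bound check: a tag-derived index is valid only
 * inside [NRESERVED, nr_cmds). */
static int index_is_valid(int idx, int nr_cmds)
{
	return idx >= NRESERVED && idx < nr_cmds;
}
```

Because the mapping is a pure offset, freeing needs no bookkeeping beyond the refcount decrement in cmd_tagged_free(): the index is owned by the block layer's tag for the lifetime of the request.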
> > +/*
> >   * Send a TEST_UNIT_READY command to the specified LUN using the specified
> >   * reply queue; returns zero if the unit is ready, and non-zero otherwise.
> >   */
> > @@ -4979,18 +5012,18 @@ static int hpsa_eh_device_reset_handler(struct scsi_cmnd *scsicmd)
> >  	/* if controller locked up, we can guarantee command won't complete */
> >  	if (lockup_detected(h)) {
> >  		dev_warn(&h->pdev->dev,
> > -			"scsi %d:%d:%d:%d RESET FAILED, lockup detected\n",
> > -			h->scsi_host->host_no, dev->bus, dev->target,
> > -			dev->lun);
> > +			 "scsi %d:%d:%d:%u cmd %d RESET FAILED, lockup detected\n",
> > +			 h->scsi_host->host_no, dev->bus, dev->target, dev->lun,
> > +			 hpsa_get_cmd_index(scsicmd));
> >  		return FAILED;
> >  	}
> >
> >  	/* this reset request might be the result of a lockup; check */
> >  	if (detect_controller_lockup(h)) {
> >  		dev_warn(&h->pdev->dev,
> > -			 "scsi %d:%d:%d:%d RESET FAILED, new lockup detected\n",
> > +			 "scsi %d:%d:%d:%u cmd %d RESET FAILED, new lockup detected\n",
> >  			 h->scsi_host->host_no, dev->bus, dev->target,
> > -			 dev->lun);
> > +			 dev->lun, hpsa_get_cmd_index(scsicmd));
> >  		return FAILED;
> >  	}
> >
> > @@ -5442,6 +5475,59 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
> >  }
> >
> >  /*
> > + * For operations with an associated SCSI command, a command block is allocated
> > + * at init, and managed by cmd_tagged_alloc() and cmd_tagged_free() using the
> > + * block request tag as an index into a table of entries.  cmd_tagged_free() is
> > + * the complement, although cmd_free() may be called instead.
> > + */
> > +static struct CommandList *cmd_tagged_alloc(struct ctlr_info *h,
> > +					    struct scsi_cmnd *scmd)
> > +{
> > +	int idx = hpsa_get_cmd_index(scmd);
> > +	struct CommandList *c = h->cmd_pool + idx;
> > +	int refcount = 0;
> > +
> > +	if (idx < HPSA_NRESERVED_CMDS || idx >= h->nr_cmds) {
> > +		dev_err(&h->pdev->dev, "Bad block tag: %d not in [%d..%d]\n",
> > +			idx, HPSA_NRESERVED_CMDS, h->nr_cmds - 1);
> > +		/* The index value comes from the block layer, so if it's out of
> > +		 * bounds, it's probably not our bug.
> > +		 */
> > +		BUG();
> > +	}
> > +
> > +	refcount = atomic_inc_return(&c->refcount);
> 
> refcount is never used, use atomic_inc(&c->refcount); instead?
> 
> > +	if (unlikely(!hpsa_is_cmd_idle(c))) {
> > +		/*
> > +		 * We expect that the SCSI layer will hand us a unique tag
> > +		 * value.  Thus, there should never be a collision here between
> > +		 * two requests...because if the selected command isn't idle
> > +		 * then someone is going to be very disappointed.
> > +		 */
> > +		dev_err(&h->pdev->dev,
> > +			"tag collision (tag=%d) in cmd_tagged_alloc().\n",
> > +			idx);
> > +		if (c->scsi_cmd != NULL)
> > +			scsi_print_command(c->scsi_cmd);
> > +		scsi_print_command(scmd);
> > +	}
> > +
> > +	hpsa_cmd_partial_init(h, idx, c);
> > +	return c;
> > +}
> > +
> > +static void cmd_tagged_free(struct ctlr_info *h, struct CommandList *c)
> > +{
> > +	/*
> > +	 * Release our reference to the block.  We don't need to do anything
> > +	 * else to free it, because it is accessed by index.  (There's no point
> > +	 * in checking the result of the decrement, since we cannot guarantee
> > +	 * that there isn't a concurrent abort which is also accessing it.)
> > +	 */
> > +	(void)atomic_dec(&c->refcount);
> > +}
> > +
> > +/*
> >   * For operations that cannot sleep, a command block is allocated at init,
> >   * and managed by cmd_alloc() and cmd_free() using a simple bitmap to track
> >   * which ones are free or in use.  Lock must be held when calling this.
> > @@ -5454,7 +5540,6 @@ static struct CommandList *cmd_alloc(struct ctlr_info *h)
> >  {
> >  	struct CommandList *c;
> >  	int refcount, i;
> > -	unsigned long offset;
> >
> >  	/*
> >  	 * There is some *extremely* small but non-zero chance that that
> > @@ -5466,31 +5551,39 @@ static struct CommandList *cmd_alloc(struct ctlr_info *h)
> >  	 * very unlucky thread might be starved anyway, never able to
> >  	 * beat the other threads.  In reality, this happens so
> >  	 * infrequently as to be indistinguishable from never.
> > +	 *
> > +	 * Note that we start allocating commands before the SCSI host structure
> > +	 * is initialized.  Since the search starts at bit zero, this
> > +	 * all works, since we have at least one command structure available;
> > +	 * however, it means that the structures with the low indexes have to be
> > +	 * reserved for driver-initiated requests, while requests from the block
> > +	 * layer will use the higher indexes.
> >  	 */
> >
> > -	offset = h->last_allocation; /* benignly racy */
> >  	for (;;) {
> > -		i = find_next_zero_bit(h->cmd_pool_bits, h->nr_cmds, offset);
> > -		if (unlikely(i == h->nr_cmds)) {
> > -			offset = 0;
> > +		i = find_first_zero_bit(h->cmd_pool_bits, HPSA_NRESERVED_CMDS);
> > +		if (unlikely(i >= HPSA_NRESERVED_CMDS))
> >  			continue;
> > -		}
> >  		c = h->cmd_pool + i;
> >  		refcount = atomic_inc_return(&c->refcount);
> >  		if (unlikely(refcount > 1)) {
> >  			cmd_free(h, c); /* already in use */
> > -			offset = (i + 1) % h->nr_cmds;
> 
> Hi Don,
> when this happens - a command has its bit cleared in the bitfield but is
> still taken (refcount > 1) - it is likely to stay that way for the next
> several thousand probes of this function until the command is freed.
> When it is the first bit in the bitfield, it blocks all subsequent commands
> sent to the card for that time.
> The previous variant, 'find_next_zero_bit + offset = (i + 1) % h->nr_cmds',
> seems to handle this better.
> Cheers, Tomas
> 
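Tomas's starvation scenario can be reproduced with a small user-space model: a slot whose bitmap bit is clear but whose refcount is still held (e.g. by a concurrent abort path) stalls an allocator that always rescans from bit zero, while the rotating-offset variant steps past it. Slot count and retry bound below are arbitrary illustration, not driver values:

```c
#define NR_CMDS   8
#define MAX_TRIES 100

static int refcount[NR_CMDS]; /* >0 means taken, even if its bit is clear */
static int bit[NR_CMDS];      /* 1 = allocated in the bitmap */

/* Always restart the scan at slot 0, as the patched code's
 * find_first_zero_bit() loop effectively does. */
static int alloc_first(void)
{
	for (int tries = 0; tries < MAX_TRIES; tries++) {
		for (int i = 0; i < NR_CMDS; i++) {
			if (bit[i])
				continue;
			if (++refcount[i] > 1) { /* collision: back off */
				--refcount[i];
				break;           /* rescan from slot 0 */
			}
			bit[i] = 1;
			return i;
		}
	}
	return -1; /* starved: kept re-probing the stuck low slot */
}

/* Rotating offset, as in the pre-patch find_next_zero_bit variant. */
static int alloc_rotating(void)
{
	static int offset;

	for (int tries = 0; tries < MAX_TRIES; tries++) {
		for (int k = 0; k < NR_CMDS; k++) {
			int i = (offset + k) % NR_CMDS;

			if (bit[i])
				continue;
			if (++refcount[i] > 1) { /* collision: back off */
				--refcount[i];
				offset = (i + 1) % NR_CMDS; /* skip it */
				break;
			}
			bit[i] = 1;
			return i;
		}
	}
	return -1;
}
```

The model ignores atomicity (the driver uses atomic_inc_return() and set_bit()); it only demonstrates why advancing the scan offset after a collision avoids livelocking on one busy slot.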
> >  			continue;
> >  		}
> >  		set_bit(i & (BITS_PER_LONG - 1),
> >  			h->cmd_pool_bits + (i / BITS_PER_LONG));
> >  		break; /* it's ours now. */
> >  	}
> > -	h->last_allocation = i; /* benignly racy */
> >  	hpsa_cmd_partial_init(h, i, c);
> >  	return c;
> >  }
> >
> > +/*
> > + * This is the complementary operation to cmd_alloc().  Note, however, in some
> > + * corner cases it may also be used to free blocks allocated by
> > + * cmd_tagged_alloc() in which case the ref-count decrement does the trick and
> > + * the clear-bit is harmless.
> > + */
> >  static void cmd_free(struct ctlr_info *h, struct CommandList *c)
> >  {
> >  	if (atomic_dec_and_test(&c->refcount)) {
> > diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
> > index 3ec8934..2536b67 100644
> > --- a/drivers/scsi/hpsa.h
> > +++ b/drivers/scsi/hpsa.h
> > @@ -141,7 +141,6 @@ struct ctlr_info {
> >  	struct CfgTable __iomem *cfgtable;
> >  	int	interrupts_enabled;
> >  	int 	max_commands;
> > -	int last_allocation;
> >  	atomic_t commands_outstanding;
> >  #	define PERF_MODE_INT	0
> >  #	define DOORBELL_INT	1
> >



* Re: [PATCH v3 05/42] hpsa: decrement h->commands_outstanding in fail_all_outstanding_cmds
  2015-03-17 20:02 ` [PATCH v3 05/42] hpsa: decrement h->commands_outstanding in fail_all_outstanding_cmds Don Brace
@ 2015-04-02 13:33   ` Tomas Henzl
  0 siblings, 0 replies; 54+ messages in thread
From: Tomas Henzl @ 2015-04-02 13:33 UTC (permalink / raw)
  To: Don Brace, scott.teel, Kevin.Barnett, james.bottomley, hch,
	Justin.Lindley
  Cc: linux-scsi

On 03/17/2015 09:02 PM, Don Brace wrote:
> From: Stephen Cameron <stephenmcameron@gmail.com>
>
> make tracking of outstanding commands more robust
>
> Reviewed-by: Scott Teel <scott.teel@pmcs.com>
> Reviewed-by: Kevin Barnett <kevin.barnett@pmcs.com>
> Signed-off-by: Don Brace <don.brace@pmcs.com>
> ---
>  drivers/scsi/hpsa.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
> index fe860a6..3a06812 100644
> --- a/drivers/scsi/hpsa.c
> +++ b/drivers/scsi/hpsa.c
> @@ -6945,7 +6945,7 @@ static void fail_all_outstanding_cmds(struct ctlr_info *h)
>  		if (refcount > 1) {
>  			c->err_info->CommandStatus = CMD_CTLR_LOCKUP;
>  			finish_cmd(c);
> -			failcount++;

The failcount should stay untouched.

What is the status of this series - has it already been accepted, or does
reviewing it still make sense?
Christoph commented on 2/41 + 6/41 in the previous v2; has that been
answered/resolved?

Thanks,
Tomas


> +			atomic_dec(&h->commands_outstanding);
>  		}
>  		cmd_free(h, c);
>  	}
>



* Re: [PATCH v3 37/42] hpsa: use block layer tag for command allocation
  2015-03-26 15:18               ` Webb Scales
@ 2015-04-10 15:13                 ` James Bottomley
  0 siblings, 0 replies; 54+ messages in thread
From: James Bottomley @ 2015-04-10 15:13 UTC (permalink / raw)
  To: Webb Scales
  Cc: Tomas Henzl, Brace, Don, Teel, Scott Stacy, kevin.barnett,
	james.bottomley, Christoph Hellwig, Lindley, Justin,
	SCSI development list

On Thu, 2015-03-26 at 11:18 -0400, Webb Scales wrote:
> At this point, I've turned this work over to Don; I presume he'll want 
> to address the bitmask search. I'll leave it up to him to decide what to 
> do about the refcount.

Is there an update on this?  Firstly the delay is going to cause you to
miss the merge window (assuming it opens on Sunday) and secondly
reviewers can get demotivated if you apparently don't address their
concerns.

James




end of thread, other threads:[~2015-04-10 15:13 UTC | newest]

Thread overview: 54+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-17 20:02 [PATCH v3 00/42] hpsa updates Don Brace
2015-03-17 20:02 ` [PATCH v3 01/42] hpsa: add masked physical devices into h->dev[] array Don Brace
2015-03-17 20:02 ` [PATCH v3 02/42] hpsa: clean up host, channel, target, lun prints Don Brace
2015-03-17 20:02 ` [PATCH v3 03/42] hpsa: rework controller command submission Don Brace
2015-03-27 15:11   ` Tomas Henzl
2015-03-27 18:04     ` brace
2015-03-17 20:02 ` [PATCH v3 04/42] hpsa: clean up aborts Don Brace
2015-03-17 20:02 ` [PATCH v3 05/42] hpsa: decrement h->commands_outstanding in fail_all_outstanding_cmds Don Brace
2015-04-02 13:33   ` Tomas Henzl
2015-03-17 20:02 ` [PATCH v3 06/42] hpsa: hpsa decode sense data for io and tmf Don Brace
2015-03-17 20:03 ` [PATCH v3 07/42] hpsa: allow lockup detected to be viewed via sysfs Don Brace
2015-03-17 20:03 ` [PATCH v3 08/42] hpsa: make function names consistent Don Brace
2015-03-17 20:03 ` [PATCH v3 09/42] hpsa: factor out hpsa_init_cmd function Don Brace
2015-03-17 20:03 ` [PATCH v3 10/42] hpsa: do not ignore return value of hpsa_register_scsi Don Brace
2015-03-17 20:03 ` [PATCH v3 11/42] hpsa: try resubmitting down raid path on task set full Don Brace
2015-03-17 20:03 ` [PATCH v3 12/42] hpsa: factor out hpsa_ioaccel_submit function Don Brace
2015-03-17 20:03 ` [PATCH v3 13/42] hpsa: print accurate SSD Smart Path Enabled status Don Brace
2015-03-17 20:03 ` [PATCH v3 14/42] hpsa: use ioaccel2 path to submit IOs to physical drives in HBA mode Don Brace
2015-03-17 20:03 ` [PATCH v3 15/42] hpsa: Get queue depth from identify physical bmic for physical disks Don Brace
2015-03-17 20:03 ` [PATCH v3 16/42] hpsa: break hpsa_free_irqs_and_disable_msix into two functions Don Brace
2015-03-17 20:03 ` [PATCH v3 17/42] hpsa: clean up error handling Don Brace
2015-03-17 20:04 ` [PATCH v3 18/42] hpsa: refactor freeing of resources into more logical functions Don Brace
2015-03-17 20:04 ` [PATCH v3 19/42] hpsa: add ioaccel sg chaining for the ioaccel2 path Don Brace
2015-03-17 20:04 ` [PATCH v3 20/42] hpsa: add more ioaccel2 error handling, including underrun statuses Don Brace
2015-03-17 20:04 ` [PATCH v3 21/42] hpsa: do not check cmd_alloc return value - it cannnot return NULL Don Brace
2015-03-17 20:04 ` [PATCH v3 22/42] hpsa: correct return values from driver functions Don Brace
2015-03-17 20:04 ` [PATCH v3 23/42] hpsa: clean up driver init Don Brace
2015-03-17 20:04 ` [PATCH v3 24/42] hpsa: clean up some error reporting output in abort handler Don Brace
2015-03-17 20:04 ` [PATCH v3 25/42] hpsa: do not print ioaccel2 warning messages about unusual completions Don Brace
2015-03-17 20:04 ` [PATCH v3 26/42] hpsa: add support sending aborts to physical devices via the ioaccel2 path Don Brace
2015-03-17 20:04 ` [PATCH v3 27/42] hpsa: use helper routines for finishing commands Don Brace
2015-03-17 20:04 ` [PATCH v3 28/42] hpsa: don't return abort request until target is complete Don Brace
2015-03-17 20:05 ` [PATCH v3 29/42] hpsa: refactor and rework support for sending TEST_UNIT_READY Don Brace
2015-03-17 20:05 ` [PATCH v3 30/42] hpsa: performance tweak for hpsa_scatter_gather() Don Brace
2015-03-17 20:05 ` [PATCH v3 31/42] hpsa: call pci_release_regions after pci_disable_device Don Brace
2015-03-17 20:05 ` [PATCH v3 32/42] hpsa: skip free_irq calls if irqs are not allocated Don Brace
2015-03-17 20:05 ` [PATCH v3 33/42] hpsa: cleanup for init_one step 2 in kdump Don Brace
2015-03-17 20:05 ` [PATCH v3 34/42] hpsa: fix try_soft_reset error handling Don Brace
2015-03-17 20:05 ` [PATCH v3 35/42] hpsa: create workqueue after the driver is ready for use Don Brace
2015-03-17 20:06 ` [PATCH v3 36/42] hpsa: add interrupt number to /proc/interrupts interrupt name Don Brace
2015-03-17 20:06 ` [PATCH v3 37/42] hpsa: use block layer tag for command allocation Don Brace
2015-03-23 16:57   ` Tomas Henzl
     [not found]     ` <07F70BBF6832E34FA1C923241E8833AB486892F9@BBYEXM01.pmc-sierra.internal>
2015-03-25 18:33       ` Webb Scales
2015-03-26 12:47         ` Tomas Henzl
2015-03-26 14:38           ` Webb Scales
2015-03-26 15:10             ` Tomas Henzl
2015-03-26 15:18               ` Webb Scales
2015-04-10 15:13                 ` James Bottomley
2015-03-27 18:49     ` brace
2015-03-17 20:06 ` [PATCH v3 38/42] hpsa: use scsi host_no as hpsa controller number Don Brace
2015-03-17 20:07 ` [PATCH v3 39/42] hpsa: propagate the error code in hpsa_kdump_soft_reset Don Brace
2015-03-17 20:07 ` [PATCH v3 40/42] hpsa: cleanup reset Don Brace
2015-03-17 20:07 ` [PATCH v3 41/42] hpsa: change driver version Don Brace
2015-03-17 20:07 ` [PATCH v3 42/42] hpsa: add PMC to copyright Don Brace
