* [PATCH 00/16] aacraid: Fixes and enhancements for arc family
@ 2017-02-14 20:44 Raghava Aditya Renukunta
2017-02-14 20:44 ` [PATCH 01/16] aacraid: Fix camel case Raghava Aditya Renukunta
` (15 more replies)
0 siblings, 16 replies; 44+ messages in thread
From: Raghava Aditya Renukunta @ 2017-02-14 20:44 UTC
To: jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, jthumshirn, dan.carpenter
This patch set contains fixes, enhancements and other miscellaneous changes.
The majority of the fixes are a direct outcome of testing and work done on the
adapter reset mechanism. Initially it supported only IOP reset; the previous
patch set augmented it with IWBR soft hardware resets. The reset mechanism is
triggered from two paths: the kernel's SCSI error handler (eh) and the
driver's internal periodic health check.
Raghava Aditya Renukunta (16):
[SCSI] aacraid: Fix camel case
[SCSI] aacraid: Use correct channel number for raw srb
[SCSI] aacraid: Fix for excessive prints on EEH
[SCSI] aacraid: Prevent E3 lockup when deleting units
[SCSI] aacraid: Fix memory leak in fib init path
[SCSI] aacraid: Added sysfs for driver version
[SCSI] aacraid: Fix sync fibs time out on controller reset
[SCSI] aacraid: Skip wellness sync on controller failure
[SCSI] aacraid: Reload offlined drives after controller reset
[SCSI] aacraid: Terminate kthread on controller fw assert
[SCSI] aacraid: Decrease adapter health check interval
[SCSI] aacraid: Skip IOP reset on controller panic(SMART Family)
[SCSI] aacraid: Reorder Adapter status check
[SCSI] aacraid: Save adapter fib log before an IOP reset
[SCSI] aacraid: Fix a potential spinlock double unlock bug
[SCSI] aacraid: Update driver version
drivers/scsi/aacraid/aachba.c | 59 ++++++++++++----------
drivers/scsi/aacraid/aacraid.h | 107 +++++++++++++++++++++-------------------
drivers/scsi/aacraid/commctrl.c | 2 +-
drivers/scsi/aacraid/comminit.c | 2 +-
drivers/scsi/aacraid/commsup.c | 97 +++++++++++++++++++++++++++++-------
drivers/scsi/aacraid/linit.c | 47 ++++++++++++------
drivers/scsi/aacraid/rx.c | 2 +-
drivers/scsi/aacraid/src.c | 48 +++++++++++++++---
8 files changed, 248 insertions(+), 116 deletions(-)
--
2.7.4
* [PATCH 01/16] aacraid: Fix camel case
From: Raghava Aditya Renukunta @ 2017-02-14 20:44 UTC
To: jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, jthumshirn, dan.carpenter
Replaced camel case with snake case for init supported options.
Suggested-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
---
drivers/scsi/aacraid/aachba.c | 53 ++++++++++++-----------
drivers/scsi/aacraid/aacraid.h | 98 +++++++++++++++++++++---------------------
drivers/scsi/aacraid/commsup.c | 6 +--
drivers/scsi/aacraid/linit.c | 32 +++++++-------
drivers/scsi/aacraid/rx.c | 2 +-
drivers/scsi/aacraid/src.c | 2 +-
6 files changed, 100 insertions(+), 93 deletions(-)
diff --git a/drivers/scsi/aacraid/aachba.c b/drivers/scsi/aacraid/aachba.c
index 907f1e8..98d4ffd 100644
--- a/drivers/scsi/aacraid/aachba.c
+++ b/drivers/scsi/aacraid/aachba.c
@@ -483,7 +483,7 @@ int aac_get_containers(struct aac_dev *dev)
if (status >= 0) {
dresp = (struct aac_get_container_count_resp *)fib_data(fibptr);
maximum_num_containers = le32_to_cpu(dresp->ContainerSwitchEntries);
- if (fibptr->dev->supplement_adapter_info.SupportedOptions2 &
+ if (fibptr->dev->supplement_adapter_info.supported_options2 &
AAC_OPTION_SUPPORTED_240_VOLUMES) {
maximum_num_containers =
le32_to_cpu(dresp->MaxSimpleVolumes);
@@ -639,13 +639,16 @@ static void _aac_probe_container2(void * context, struct fib * fibptr)
fsa_dev_ptr = fibptr->dev->fsa_dev;
if (fsa_dev_ptr) {
struct aac_mount * dresp = (struct aac_mount *) fib_data(fibptr);
+ __le32 sup_options2;
+
fsa_dev_ptr += scmd_id(scsicmd);
+ sup_options2 =
+ fibptr->dev->supplement_adapter_info.supported_options2;
if ((le32_to_cpu(dresp->status) == ST_OK) &&
(le32_to_cpu(dresp->mnt[0].vol) != CT_NONE) &&
(le32_to_cpu(dresp->mnt[0].state) != FSCS_HIDDEN)) {
- if (!(fibptr->dev->supplement_adapter_info.SupportedOptions2 &
- AAC_OPTION_VARIABLE_BLOCK_SIZE)) {
+ if (!(sup_options2 & AAC_OPTION_VARIABLE_BLOCK_SIZE)) {
dresp->mnt[0].fileinfo.bdevinfo.block_size = 0x200;
fsa_dev_ptr->block_size = 0x200;
} else {
@@ -688,7 +691,7 @@ static void _aac_probe_container1(void * context, struct fib * fibptr)
int status;
dresp = (struct aac_mount *) fib_data(fibptr);
- if (!(fibptr->dev->supplement_adapter_info.SupportedOptions2 &
+ if (!(fibptr->dev->supplement_adapter_info.supported_options2 &
AAC_OPTION_VARIABLE_BLOCK_SIZE))
dresp->mnt[0].capacityhigh = 0;
if ((le32_to_cpu(dresp->status) != ST_OK) ||
@@ -705,7 +708,7 @@ static void _aac_probe_container1(void * context, struct fib * fibptr)
dinfo = (struct aac_query_mount *)fib_data(fibptr);
- if (fibptr->dev->supplement_adapter_info.SupportedOptions2 &
+ if (fibptr->dev->supplement_adapter_info.supported_options2 &
AAC_OPTION_VARIABLE_BLOCK_SIZE)
dinfo->command = cpu_to_le32(VM_NameServeAllBlk);
else
@@ -745,7 +748,7 @@ static int _aac_probe_container(struct scsi_cmnd * scsicmd, int (*callback)(stru
dinfo = (struct aac_query_mount *)fib_data(fibptr);
- if (fibptr->dev->supplement_adapter_info.SupportedOptions2 &
+ if (fibptr->dev->supplement_adapter_info.supported_options2 &
AAC_OPTION_VARIABLE_BLOCK_SIZE)
dinfo->command = cpu_to_le32(VM_NameServeAllBlk);
else
@@ -896,12 +899,14 @@ char * get_container_type(unsigned tindex)
static void setinqstr(struct aac_dev *dev, void *data, int tindex)
{
struct scsi_inq *str;
+ struct aac_supplement_adapter_info *sup_adap_info;
+ sup_adap_info = &dev->supplement_adapter_info;
str = (struct scsi_inq *)(data); /* cast data to scsi inq block */
memset(str, ' ', sizeof(*str));
- if (dev->supplement_adapter_info.AdapterTypeText[0]) {
- char * cp = dev->supplement_adapter_info.AdapterTypeText;
+ if (sup_adap_info->adapter_type_text[0]) {
+ char *cp = sup_adap_info->adapter_type_text;
int c;
if ((cp[0] == 'A') && (cp[1] == 'O') && (cp[2] == 'C'))
inqstrcpy("SMC", str->vid);
@@ -911,8 +916,7 @@ static void setinqstr(struct aac_dev *dev, void *data, int tindex)
++cp;
c = *cp;
*cp = '\0';
- inqstrcpy (dev->supplement_adapter_info.AdapterTypeText,
- str->vid);
+ inqstrcpy(sup_adap_info->adapter_type_text, str->vid);
*cp = c;
while (*cp && *cp != ' ')
++cp;
@@ -1675,8 +1679,8 @@ int aac_issue_bmic_identify(struct aac_dev *dev, u32 bus, u32 target)
if (!identify_resp)
goto fib_free_ptr;
- vbus = (u32)le16_to_cpu(dev->supplement_adapter_info.VirtDeviceBus);
- vid = (u32)le16_to_cpu(dev->supplement_adapter_info.VirtDeviceTarget);
+ vbus = (u32)le16_to_cpu(dev->supplement_adapter_info.virt_device_bus);
+ vid = (u32)le16_to_cpu(dev->supplement_adapter_info.virt_device_target);
aac_fib_init(fibptr);
@@ -1815,9 +1819,9 @@ int aac_report_phys_luns(struct aac_dev *dev, struct fib *fibptr, int rescan)
}
vbus = (u32) le16_to_cpu(
- dev->supplement_adapter_info.VirtDeviceBus);
+ dev->supplement_adapter_info.virt_device_bus);
vid = (u32) le16_to_cpu(
- dev->supplement_adapter_info.VirtDeviceTarget);
+ dev->supplement_adapter_info.virt_device_target);
aac_fib_init(fibptr);
@@ -1893,7 +1897,7 @@ int aac_get_adapter_info(struct aac_dev* dev)
}
memcpy(&dev->adapter_info, info, sizeof(*info));
- dev->supplement_adapter_info.VirtDeviceBus = 0xffff;
+ dev->supplement_adapter_info.virt_device_bus = 0xffff;
if (dev->adapter_info.options & AAC_OPT_SUPPLEMENT_ADAPTER_INFO) {
struct aac_supplement_adapter_info * sinfo;
@@ -1961,7 +1965,7 @@ int aac_get_adapter_info(struct aac_dev* dev)
}
if (!dev->sync_mode && dev->sa_firmware &&
- dev->supplement_adapter_info.VirtDeviceBus != 0xffff) {
+ dev->supplement_adapter_info.virt_device_bus != 0xffff) {
/* Thor SA Firmware -> CISS_REPORT_PHYSICAL_LUNS */
rcode = aac_report_phys_luns(dev, fibptr, AAC_INIT);
}
@@ -1976,8 +1980,8 @@ int aac_get_adapter_info(struct aac_dev* dev)
(tmp>>16)&0xff,
tmp&0xff,
le32_to_cpu(dev->adapter_info.kernelbuild),
- (int)sizeof(dev->supplement_adapter_info.BuildDate),
- dev->supplement_adapter_info.BuildDate);
+ (int)sizeof(dev->supplement_adapter_info.build_date),
+ dev->supplement_adapter_info.build_date);
tmp = le32_to_cpu(dev->adapter_info.monitorrev);
printk(KERN_INFO "%s%d: monitor %d.%d-%d[%d]\n",
dev->name, dev->id,
@@ -1993,14 +1997,15 @@ int aac_get_adapter_info(struct aac_dev* dev)
shost_to_class(dev->scsi_host_ptr), buffer))
printk(KERN_INFO "%s%d: serial %s",
dev->name, dev->id, buffer);
- if (dev->supplement_adapter_info.VpdInfo.Tsid[0]) {
+ if (dev->supplement_adapter_info.vpd_info.tsid[0]) {
printk(KERN_INFO "%s%d: TSID %.*s\n",
dev->name, dev->id,
- (int)sizeof(dev->supplement_adapter_info.VpdInfo.Tsid),
- dev->supplement_adapter_info.VpdInfo.Tsid);
+ (int)sizeof(dev->supplement_adapter_info
+ .vpd_info.tsid),
+ dev->supplement_adapter_info.vpd_info.tsid);
}
if (!aac_check_reset || ((aac_check_reset == 1) &&
- (dev->supplement_adapter_info.SupportedOptions2 &
+ (dev->supplement_adapter_info.supported_options2 &
AAC_OPTION_IGNORE_RESET))) {
printk(KERN_INFO "%s%d: Reset Adapter Ignored\n",
dev->name, dev->id);
@@ -2008,7 +2013,7 @@ int aac_get_adapter_info(struct aac_dev* dev)
}
dev->cache_protected = 0;
- dev->jbod = ((dev->supplement_adapter_info.FeatureBits &
+ dev->jbod = ((dev->supplement_adapter_info.feature_bits &
AAC_FEATURE_JBOD) != 0);
dev->nondasd_support = 0;
dev->raid_scsi_mode = 0;
@@ -2631,7 +2636,7 @@ static int aac_start_stop(struct scsi_cmnd *scsicmd)
struct scsi_device *sdev = scsicmd->device;
struct aac_dev *aac = (struct aac_dev *)sdev->host->hostdata;
- if (!(aac->supplement_adapter_info.SupportedOptions2 &
+ if (!(aac->supplement_adapter_info.supported_options2 &
AAC_OPTION_POWER_MANAGEMENT)) {
scsicmd->result = DID_OK << 16 | COMMAND_COMPLETE << 8 |
SAM_STAT_GOOD;
diff --git a/drivers/scsi/aacraid/aacraid.h b/drivers/scsi/aacraid/aacraid.h
index f234497..b5a2c87 100644
--- a/drivers/scsi/aacraid/aacraid.h
+++ b/drivers/scsi/aacraid/aacraid.h
@@ -1380,57 +1380,57 @@ struct aac_adapter_info
struct aac_supplement_adapter_info
{
- u8 AdapterTypeText[17+1];
- u8 Pad[2];
- __le32 FlashMemoryByteSize;
- __le32 FlashImageId;
- __le32 MaxNumberPorts;
- __le32 Version;
- __le32 FeatureBits;
- u8 SlotNumber;
- u8 ReservedPad0[3];
- u8 BuildDate[12];
- __le32 CurrentNumberPorts;
+ u8 adapter_type_text[17+1];
+ u8 pad[2];
+ __le32 flash_memory_byte_size;
+ __le32 flash_image_id;
+ __le32 max_number_ports;
+ __le32 version;
+ __le32 feature_bits;
+ u8 slot_number;
+ u8 reserved_pad0[3];
+ u8 build_date[12];
+ __le32 current_number_ports;
struct {
- u8 AssemblyPn[8];
- u8 FruPn[8];
- u8 BatteryFruPn[8];
- u8 EcVersionString[8];
- u8 Tsid[12];
- } VpdInfo;
- __le32 FlashFirmwareRevision;
- __le32 FlashFirmwareBuild;
- __le32 RaidTypeMorphOptions;
- __le32 FlashFirmwareBootRevision;
- __le32 FlashFirmwareBootBuild;
- u8 MfgPcbaSerialNo[12];
- u8 MfgWWNName[8];
- __le32 SupportedOptions2;
- __le32 StructExpansion;
+ u8 assembly_pn[8];
+ u8 fru_pn[8];
+ u8 battery_fru_pn[8];
+ u8 ec_version_string[8];
+ u8 tsid[12];
+ } vpd_info;
+ __le32 flash_firmware_revision;
+ __le32 flash_firmware_build;
+ __le32 raid_type_morph_options;
+ __le32 flash_firmware_boot_revision;
+ __le32 flash_firmware_boot_build;
+ u8 mfg_pcba_serial_no[12];
+ u8 mfg_wwn_name[8];
+ __le32 supported_options2;
+ __le32 struct_expansion;
/* StructExpansion == 1 */
- __le32 FeatureBits3;
- __le32 SupportedPerformanceModes;
- u8 HostBusType; /* uses HOST_BUS_TYPE_xxx defines */
- u8 HostBusWidth; /* actual width in bits or links */
- u16 HostBusSpeed; /* actual bus speed/link rate in MHz */
- u8 MaxRRCDrives; /* max. number of ITP-RRC drives/pool */
- u8 MaxDiskXtasks; /* max. possible num of DiskX Tasks */
-
- u8 CpldVerLoaded;
- u8 CpldVerInFlash;
-
- __le64 MaxRRCCapacity;
- __le32 CompiledMaxHistLogLevel;
- u8 CustomBoardName[12];
- u16 SupportedCntlrMode; /* identify supported controller mode */
- u16 ReservedForFuture16;
- __le32 SupportedOptions3; /* reserved for future options */
-
- __le16 VirtDeviceBus; /* virt. SCSI device for Thor */
- __le16 VirtDeviceTarget;
- __le16 VirtDeviceLUN;
- __le16 Unused;
- __le32 ReservedForFutureGrowth[68];
+ __le32 feature_bits3;
+ __le32 supported_performance_modes;
+ u8 host_bus_type; /* uses HOST_BUS_TYPE_xxx defines */
+ u8 host_bus_width; /* actual width in bits or links */
+ u16 host_bus_speed; /* actual bus speed/link rate in MHz */
+ u8 max_rrc_drives; /* max. number of ITP-RRC drives/pool */
+ u8 max_disk_xtasks; /* max. possible num of DiskX Tasks */
+
+ u8 cpld_ver_loaded;
+ u8 cpld_ver_in_flash;
+
+ __le64 max_rrc_capacity;
+ __le32 compiled_max_hist_log_level;
+ u8 custom_board_name[12];
+ u16 supported_cntlr_mode; /* identify supported controller mode */
+ u16 reserved_for_future16;
+ __le32 supported_options3; /* reserved for future options */
+
+ __le16 virt_device_bus; /* virt. SCSI device for Thor */
+ __le16 virt_device_target;
+ __le16 virt_device_lun;
+ __le16 unused;
+ __le32 reserved_for_future_growth[68];
};
#define AAC_FEATURE_FALCON cpu_to_le32(0x00000010)
diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
index 969727b..56090f5 100644
--- a/drivers/scsi/aacraid/commsup.c
+++ b/drivers/scsi/aacraid/commsup.c
@@ -1815,7 +1815,7 @@ int aac_check_health(struct aac_dev * aac)
printk(KERN_ERR "%s: Host adapter BLINK LED 0x%x\n", aac->name, BlinkLED);
if (!aac_check_reset || ((aac_check_reset == 1) &&
- (aac->supplement_adapter_info.SupportedOptions2 &
+ (aac->supplement_adapter_info.supported_options2 &
AAC_OPTION_IGNORE_RESET)))
goto out;
host = aac->scsi_host_ptr;
@@ -2264,8 +2264,8 @@ static int aac_send_wellness_command(struct aac_dev *dev, char *wellness_str,
aac_fib_init(fibptr);
- vbus = (u32)le16_to_cpu(dev->supplement_adapter_info.VirtDeviceBus);
- vid = (u32)le16_to_cpu(dev->supplement_adapter_info.VirtDeviceTarget);
+ vbus = (u32)le16_to_cpu(dev->supplement_adapter_info.virt_device_bus);
+ vid = (u32)le16_to_cpu(dev->supplement_adapter_info.virt_device_target);
srbcmd = (struct aac_srb *)fib_data(fibptr);
diff --git a/drivers/scsi/aacraid/linit.c b/drivers/scsi/aacraid/linit.c
index 137d22d..ab4f1e7 100644
--- a/drivers/scsi/aacraid/linit.c
+++ b/drivers/scsi/aacraid/linit.c
@@ -891,13 +891,13 @@ static int aac_eh_reset(struct scsi_cmnd* cmd)
* Adapters that support a register, instead of a commanded,
* reset.
*/
- if (((aac->supplement_adapter_info.SupportedOptions2 &
+ if (((aac->supplement_adapter_info.supported_options2 &
AAC_OPTION_MU_RESET) ||
- (aac->supplement_adapter_info.SupportedOptions2 &
+ (aac->supplement_adapter_info.supported_options2 &
AAC_OPTION_DOORBELL_RESET)) &&
aac_check_reset &&
((aac_check_reset != 1) ||
- !(aac->supplement_adapter_info.SupportedOptions2 &
+ !(aac->supplement_adapter_info.supported_options2 &
AAC_OPTION_IGNORE_RESET))) {
/* Bypass wait for command quiesce */
aac_reset_adapter(aac, 2, IOP_HWSOFT_RESET);
@@ -1029,8 +1029,8 @@ static ssize_t aac_show_model(struct device *device,
struct aac_dev *dev = (struct aac_dev*)class_to_shost(device)->hostdata;
int len;
- if (dev->supplement_adapter_info.AdapterTypeText[0]) {
- char * cp = dev->supplement_adapter_info.AdapterTypeText;
+ if (dev->supplement_adapter_info.adapter_type_text[0]) {
+ char *cp = dev->supplement_adapter_info.adapter_type_text;
while (*cp && *cp != ' ')
++cp;
while (*cp == ' ')
@@ -1046,18 +1046,20 @@ static ssize_t aac_show_vendor(struct device *device,
struct device_attribute *attr, char *buf)
{
struct aac_dev *dev = (struct aac_dev*)class_to_shost(device)->hostdata;
+ struct aac_supplement_adapter_info *sup_adap_info;
int len;
- if (dev->supplement_adapter_info.AdapterTypeText[0]) {
- char * cp = dev->supplement_adapter_info.AdapterTypeText;
+ sup_adap_info = &dev->supplement_adapter_info;
+ if (sup_adap_info->adapter_type_text[0]) {
+ char *cp = sup_adap_info->adapter_type_text;
while (*cp && *cp != ' ')
++cp;
len = snprintf(buf, PAGE_SIZE, "%.*s\n",
- (int)(cp - (char *)dev->supplement_adapter_info.AdapterTypeText),
- dev->supplement_adapter_info.AdapterTypeText);
+ (int)(cp - (char *)sup_adap_info->adapter_type_text),
+ sup_adap_info->adapter_type_text);
} else
len = snprintf(buf, PAGE_SIZE, "%s\n",
- aac_drivers[dev->cardtype].vname);
+ aac_drivers[dev->cardtype].vname);
return len;
}
@@ -1078,7 +1080,7 @@ static ssize_t aac_show_flags(struct device *cdev,
"SAI_READ_CAPACITY_16\n");
if (dev->jbod)
len += snprintf(buf + len, PAGE_SIZE - len, "SUPPORTED_JBOD\n");
- if (dev->supplement_adapter_info.SupportedOptions2 &
+ if (dev->supplement_adapter_info.supported_options2 &
AAC_OPTION_POWER_MANAGEMENT)
len += snprintf(buf + len, PAGE_SIZE - len,
"SUPPORTED_POWER_MANAGEMENT\n");
@@ -1139,12 +1141,12 @@ static ssize_t aac_show_serial_number(struct device *device,
len = snprintf(buf, 16, "%06X\n",
le32_to_cpu(dev->adapter_info.serial[0]));
if (len &&
- !memcmp(&dev->supplement_adapter_info.MfgPcbaSerialNo[
- sizeof(dev->supplement_adapter_info.MfgPcbaSerialNo)-len],
+ !memcmp(&dev->supplement_adapter_info.mfg_pcba_serial_no[
+ sizeof(dev->supplement_adapter_info.mfg_pcba_serial_no)-len],
buf, len-1))
len = snprintf(buf, 16, "%.*s\n",
- (int)sizeof(dev->supplement_adapter_info.MfgPcbaSerialNo),
- dev->supplement_adapter_info.MfgPcbaSerialNo);
+ (int)sizeof(dev->supplement_adapter_info.mfg_pcba_serial_no),
+ dev->supplement_adapter_info.mfg_pcba_serial_no);
return min(len, 16);
}
diff --git a/drivers/scsi/aacraid/rx.c b/drivers/scsi/aacraid/rx.c
index 0e69a80..5d19c31 100644
--- a/drivers/scsi/aacraid/rx.c
+++ b/drivers/scsi/aacraid/rx.c
@@ -475,7 +475,7 @@ static int aac_rx_restart_adapter(struct aac_dev *dev, int bled, u8 reset_type)
{
u32 var = 0;
- if (!(dev->supplement_adapter_info.SupportedOptions2 &
+ if (!(dev->supplement_adapter_info.supported_options2 &
AAC_OPTION_MU_RESET) || (bled >= 0) || (bled == -2)) {
if (bled)
printk(KERN_ERR "%s%d: adapter kernel panic'd %x.\n",
diff --git a/drivers/scsi/aacraid/src.c b/drivers/scsi/aacraid/src.c
index 8e4e2dd..c17b060 100644
--- a/drivers/scsi/aacraid/src.c
+++ b/drivers/scsi/aacraid/src.c
@@ -684,7 +684,7 @@ static void aac_send_iop_reset(struct aac_dev *dev, int bled)
aac_set_intx_mode(dev);
- if (!bled && (dev->supplement_adapter_info.SupportedOptions2 &
+ if (!bled && (dev->supplement_adapter_info.supported_options2 &
AAC_OPTION_DOORBELL_RESET)) {
src_writel(dev, MUnit.IDR, reset_mask);
} else {
--
2.7.4
* [PATCH 02/16] aacraid: Use correct channel number for raw srb
From: Raghava Aditya Renukunta @ 2017-02-14 20:44 UTC
To: jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, jthumshirn, dan.carpenter
The channel used for raw srb commands is retrieved from the fibs sent by the
management utility and is converted into a physical channel id. The driver
does not need to do this conversion, since the management utility sends the
correct channel id in the first place. Because of the conversion, the driver
sets inaccurate information in the cmd sent to the firmware and gets an
invalid response.
Fixed by using the channel id from the srb command as-is.
Cc: stable@vger.kernel.org
Fixes: 423400e64d377c0 ("scsi: aacraid: Include HBA direct interface")
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
---
drivers/scsi/aacraid/commctrl.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/aacraid/commctrl.c b/drivers/scsi/aacraid/commctrl.c
index 614842a..f6afd50 100644
--- a/drivers/scsi/aacraid/commctrl.c
+++ b/drivers/scsi/aacraid/commctrl.c
@@ -580,7 +580,7 @@ static int aac_send_raw_srb(struct aac_dev* dev, void __user * arg)
goto cleanup;
}
- chn = aac_logical_to_phys(user_srbcmd->channel);
+ chn = user_srbcmd->channel;
if (chn < AAC_MAX_BUSES && user_srbcmd->id < AAC_MAX_TARGETS &&
dev->hba_map[chn][user_srbcmd->id].devtype ==
AAC_DEVTYPE_NATIVE_RAW) {
--
2.7.4
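For readers following along, the effect of the one-line change above can be
sketched in plain C. This is only an illustration, not driver code: the
struct, the array bounds and the devtype values are assumptions made up for
the example. It shows the user-supplied channel being range-checked and then
used directly as the index into the map, with no translation step in between.

```c
#include <assert.h>

/* Illustrative bounds; not the driver's AAC_MAX_BUSES / AAC_MAX_TARGETS. */
#define MAX_BUSES   5
#define MAX_TARGETS 16

/* Illustrative device types; the real values live in aacraid.h. */
enum devtype { DEVTYPE_NONE = 0, DEVTYPE_NATIVE_RAW = 2 };

/* Hypothetical stand-in for the hba_map lookup table in struct aac_dev. */
struct mock_dev {
	enum devtype hba_map[MAX_BUSES][MAX_TARGETS];
};

/*
 * Returns 1 when (chn, id) is in range and refers to a native raw device.
 * The channel is taken exactly as supplied by the caller. The range check
 * runs first, so the array is never indexed out of bounds.
 */
static int is_native_raw(const struct mock_dev *dev,
			 unsigned int chn, unsigned int id)
{
	return chn < MAX_BUSES && id < MAX_TARGETS &&
	       dev->hba_map[chn][id] == DEVTYPE_NATIVE_RAW;
}
```

If the channel were remapped before the lookup, a correct (chn, id) pair from
the utility could land on a different map entry, which is the kind of
mismatch the commit message describes.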
* [PATCH 03/16] aacraid: Fix for excessive prints on EEH
From: Raghava Aditya Renukunta @ 2017-02-14 20:44 UTC
To: jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, jthumshirn, dan.carpenter
This issue showed up during kdump debugging (single CPU on PowerKVM), when
EEH errors rendered the adapter unusable. The driver correctly detected the
issue and attempted to restart the controller; in doing so, it attempted to
read the status registers of the controller. This triggered additional EEH
errors, which continued for a good 6 minutes.
Fixed by returning without waiting when an EEH error is reported.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
---
drivers/scsi/aacraid/commsup.c | 38 +++++++++++++++++++++++++++++++++++++-
1 file changed, 37 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
index 56090f5..6220b47 100644
--- a/drivers/scsi/aacraid/commsup.c
+++ b/drivers/scsi/aacraid/commsup.c
@@ -461,6 +461,30 @@ int aac_queue_get(struct aac_dev * dev, u32 * index, u32 qid, struct hw_fib * hw
return 0;
}
+#ifdef CONFIG_EEH
+static inline int aac_check_eeh_failure(struct aac_dev *dev)
+{
+ /* Check for an EEH failure for the given
+ * device node. Function eeh_dev_check_failure()
+ * returns 0 if there has not been an EEH error
+ * otherwise returns a non-zero value.
+ *
+ * Need to be called before any PCI operation,
+ * i.e.,before aac_adapter_check_health()
+ */
+ struct eeh_dev *edev = pci_dev_to_eeh_dev(dev->pdev);
+
+ if (eeh_dev_check_failure(edev)) {
+ /* The EEH mechanisms will handle this
+ * error and reset the device if
+ * necessary.
+ */
+ return 1;
+ }
+ return 0;
+}
+#endif
+
/*
* Define the highest level of host to adapter communication routines.
* These routines will support host to adapter FS commuication. These
@@ -496,7 +520,6 @@ int aac_fib_send(u16 command, struct fib *fibptr, unsigned long size,
unsigned long mflags = 0;
unsigned long sflags = 0;
-
if (!(hw_fib->header.XferState & cpu_to_le32(HostOwned)))
return -EBUSY;
/*
@@ -662,6 +685,12 @@ int aac_fib_send(u16 command, struct fib *fibptr, unsigned long size,
}
return -ETIMEDOUT;
}
+
+#if defined(CONFIG_EEH)
+ if (aac_check_eeh_failure(dev))
+ return -EFAULT;
+#endif
+
if ((blink = aac_adapter_check_health(dev)) > 0) {
if (wait == -1) {
printk(KERN_ERR "aacraid: aac_fib_send: adapter blinkLED 0x%x.\n"
@@ -755,7 +784,14 @@ int aac_hba_send(u8 command, struct fib *fibptr, fib_callback callback,
FIB_COUNTER_INCREMENT(aac_config.NativeSent);
if (wait) {
+
spin_unlock_irqrestore(&fibptr->event_lock, flags);
+
+#if defined(CONFIG_EEH)
+ if (aac_check_eeh_failure(dev))
+ return -EFAULT;
+#endif
+
/* Only set for first known interruptable command */
if (down_interruptible(&fibptr->event_wait)) {
fibptr->done = 2;
--
2.7.4
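The guard added above follows a simple pattern: ask whether the bus has
already failed before touching any device register, and bail out early if it
has. A minimal userspace sketch of that pattern is below. Everything in it
(struct mock_adapter and the mock_* functions) is hypothetical and only
stands in for eeh_dev_check_failure() and the driver's register reads.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the adapter; not the driver's struct aac_dev. */
struct mock_adapter {
	bool bus_frozen;        /* what an EEH-style check would report */
	unsigned int reg_reads; /* counts simulated register accesses */
};

/* Stand-in for eeh_dev_check_failure(): non-zero means the slot failed. */
static int mock_check_failure(const struct mock_adapter *a)
{
	return a->bus_frozen ? 1 : 0;
}

/*
 * Simulated status-register read. On real hardware this is the access
 * that would provoke further errors once the slot is frozen.
 */
static int mock_read_status(struct mock_adapter *a)
{
	a->reg_reads++;
	return 0;
}

/* Check for a bus failure first; only then touch the device. */
static int mock_send(struct mock_adapter *a)
{
	if (mock_check_failure(a))
		return -EFAULT;
	return mock_read_status(a);
}
```

With bus_frozen set, mock_send() returns -EFAULT and reg_reads stays at
zero, which mirrors the intent of returning before
aac_adapter_check_health() is reached.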
* [PATCH 04/16] aacraid: Prevent E3 lockup when deleting units
From: Raghava Aditya Renukunta @ 2017-02-14 20:44 UTC
To: jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, jthumshirn, dan.carpenter
The Arrconf management utility at times sends fibs with the AdapterProcessed
flag set. This causes the controller to panic and lock up.
Fixed by failing commands whose fibs have AdapterProcessed set.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
---
drivers/scsi/aacraid/commsup.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
index 6220b47..f7a3bcb 100644
--- a/drivers/scsi/aacraid/commsup.c
+++ b/drivers/scsi/aacraid/commsup.c
@@ -522,6 +522,10 @@ int aac_fib_send(u16 command, struct fib *fibptr, unsigned long size,
if (!(hw_fib->header.XferState & cpu_to_le32(HostOwned)))
return -EBUSY;
+
+ if (hw_fib->header.XferState & cpu_to_le32(AdapterProcessed))
+ return -EINVAL;
+
/*
* There are 5 cases with the wait and response requested flags.
* The only invalid cases are if the caller requests to wait and
--
2.7.4
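The two checks at the top of aac_fib_send() after this patch can be modelled
as a small validation function. The sketch below is an assumption-laden
illustration: the bit positions are picked for the example (the real values
are defined in aacraid.h), and it works on a host-order value, whereas the
driver compares against cpu_to_le32() constants because XferState is stored
little-endian.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Illustrative bit positions only; see aacraid.h for the real flags. */
#define XFER_HOST_OWNED        (UINT32_C(1) << 0)
#define XFER_ADAPTER_PROCESSED (UINT32_C(1) << 9)

/*
 * Mirrors the order of the checks in the patch:
 *   1. a fib that is not host owned is rejected with -EBUSY;
 *   2. a fib already marked as processed by the adapter is rejected
 *      with -EINVAL instead of being passed on to the controller.
 */
static int validate_xfer_state(uint32_t xfer_state)
{
	if (!(xfer_state & XFER_HOST_OWNED))
		return -EBUSY;
	if (xfer_state & XFER_ADAPTER_PROCESSED)
		return -EINVAL;
	return 0;
}
```

Note that the ownership check runs first, so a fib with AdapterProcessed set
but HostOwned clear still yields -EBUSY.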
* [PATCH 05/16] aacraid: Fix memory leak in fib init path
From: Raghava Aditya Renukunta @ 2017-02-14 20:44 UTC
To: jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, jthumshirn, dan.carpenter
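aac_fib_map_free() frees the fib dma memory through an address that has been
adjusted for alignment rather than the address that was originally allocated.
In addition, it does not free the whole allocation.
Fixed by changing the code to free the correct address and the full memory
allocation.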
Cc: stable@vger.kernel.org
Fixes: e8b12f0fb835223 ([SCSI] aacraid: Add new code for PMC-Sierra's SRC based controller family)
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
---
drivers/scsi/aacraid/commsup.c | 20 +++++++++-----------
1 file changed, 9 insertions(+), 11 deletions(-)
diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
index f7a3bcb..863c98d 100644
--- a/drivers/scsi/aacraid/commsup.c
+++ b/drivers/scsi/aacraid/commsup.c
@@ -97,8 +97,8 @@ void aac_fib_map_free(struct aac_dev *dev)
{
if (dev->hw_fib_va && dev->max_cmd_size) {
pci_free_consistent(dev->pdev,
- (dev->max_cmd_size *
- (dev->scsi_host_ptr->can_queue + AAC_NUM_MGT_FIB)),
+ (dev->max_cmd_size + sizeof(struct aac_fib_xporthdr))
+ * (dev->scsi_host_ptr->can_queue + AAC_NUM_MGT_FIB) + 31,
dev->hw_fib_va, dev->hw_fib_pa);
}
dev->hw_fib_va = NULL;
@@ -153,22 +153,20 @@ int aac_fib_setup(struct aac_dev * dev)
if (i<0)
return -ENOMEM;
- /* 32 byte alignment for PMC */
- hw_fib_pa = (dev->hw_fib_pa + (ALIGN32 - 1)) & ~(ALIGN32 - 1);
- dev->hw_fib_va = (struct hw_fib *)((unsigned char *)dev->hw_fib_va +
- (hw_fib_pa - dev->hw_fib_pa));
- dev->hw_fib_pa = hw_fib_pa;
memset(dev->hw_fib_va, 0,
(dev->max_cmd_size + sizeof(struct aac_fib_xporthdr)) *
(dev->scsi_host_ptr->can_queue + AAC_NUM_MGT_FIB));
+ /* 32 byte alignment for PMC */
+ hw_fib_pa = (dev->hw_fib_pa + (ALIGN32 - 1)) & ~(ALIGN32 - 1);
+ hw_fib = (struct hw_fib *)((unsigned char *)dev->hw_fib_va +
+ (hw_fib_pa - dev->hw_fib_pa));
+
/* add Xport header */
- dev->hw_fib_va = (struct hw_fib *)((unsigned char *)dev->hw_fib_va +
+ hw_fib = (struct hw_fib *)((unsigned char *)hw_fib +
sizeof(struct aac_fib_xporthdr));
- dev->hw_fib_pa += sizeof(struct aac_fib_xporthdr);
+ hw_fib_pa += sizeof(struct aac_fib_xporthdr);
- hw_fib = dev->hw_fib_va;
- hw_fib_pa = dev->hw_fib_pa;
/*
* Initialise the fibs
*/
--
2.7.4
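The underlying rule in this fix is general: keep the pointer and size that
the allocator handed back, and derive any aligned working pointer into a
separate variable. A rough userspace sketch follows, using malloc()/free()
in place of the pci_alloc_consistent()/pci_free_consistent() pair. The
struct pool type and the pool_* helpers are hypothetical names for the
example; the 31 bytes of slack presumably correspond to the "+ 31" in the
patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define ALIGNMENT 32u /* power of two, like the 32-byte requirement above */

/* Hypothetical bookkeeping for one over-allocated, aligned region. */
struct pool {
	void *base;    /* exactly what the allocator returned; free this */
	size_t size;   /* full allocation size, including alignment slack */
	void *aligned; /* derived working pointer; never pass this to free */
};

static int pool_init(struct pool *p, size_t payload)
{
	/* Over-allocate so an aligned address always fits in the region. */
	p->size = payload + (ALIGNMENT - 1);
	p->base = malloc(p->size);
	if (!p->base)
		return -1;

	/* Round the address up to the next multiple of ALIGNMENT. */
	uintptr_t addr = (uintptr_t)p->base;
	uintptr_t up = (addr + (ALIGNMENT - 1)) & ~(uintptr_t)(ALIGNMENT - 1);

	/* Store the result separately instead of overwriting base. */
	p->aligned = (unsigned char *)p->base + (up - addr);
	return 0;
}

static void pool_free(struct pool *p)
{
	free(p->base); /* original pointer, so the whole region is released */
	p->base = NULL;
	p->aligned = NULL;
	p->size = 0;
}
```

This is the same shape as the change to aac_fib_setup(): the aligned
addresses go into the local hw_fib and hw_fib_pa, while dev->hw_fib_va and
dev->hw_fib_pa keep the values needed for the later free.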
* [PATCH 06/16] aacraid: Added sysfs for driver version
From: Raghava Aditya Renukunta @ 2017-02-14 20:44 UTC
To: jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, jthumshirn, dan.carpenter
Added support for retrieving the driver version from a new sysfs attribute
called driver_version. This makes it easier for the user to find out which
driver version is currently running.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
---
drivers/scsi/aacraid/linit.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/drivers/scsi/aacraid/linit.c b/drivers/scsi/aacraid/linit.c
index ab4f1e7..df02784 100644
--- a/drivers/scsi/aacraid/linit.c
+++ b/drivers/scsi/aacraid/linit.c
@@ -1131,6 +1131,13 @@ static ssize_t aac_show_bios_version(struct device *device,
return len;
}
+static ssize_t aac_show_driver_version(struct device *device,
+ struct device_attribute *attr,
+ char *buf)
+{
+ return snprintf(buf, PAGE_SIZE, "%s\n", aac_driver_version);
+}
+
static ssize_t aac_show_serial_number(struct device *device,
struct device_attribute *attr, char *buf)
{
@@ -1241,6 +1248,13 @@ static struct device_attribute aac_bios_version = {
},
.show = aac_show_bios_version,
};
+static struct device_attribute aac_lld_version = {
+ .attr = {
+ .name = "driver_version",
+ .mode = 0444,
+ },
+ .show = aac_show_driver_version,
+};
static struct device_attribute aac_serial_number = {
.attr = {
.name = "serial_number",
@@ -1278,6 +1292,7 @@ static struct device_attribute *aac_attrs[] = {
&aac_kernel_version,
&aac_monitor_version,
&aac_bios_version,
+ &aac_lld_version,
&aac_serial_number,
&aac_max_channel,
&aac_max_id,
--
2.7.4
* [PATCH 07/16] aacraid: Fix sync fibs time out on controller reset
From: Raghava Aditya Renukunta @ 2017-02-14 20:44 UTC (permalink / raw)
To: jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, jthumshirn, dan.carpenter
After controller shutdown, all sync fibs time out because the driver is not
aware of the switch to INT-x mode.
Fixed by replacing the aac_src_access_devreg() call with a call to
aac_set_intx_mode().
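Illustration (not part of the patch): a toy model of why the helper matters.
It assumes the helper also clears the driver-side msi_enabled flag, which
the hunk quoted in this mail does not show. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the adapter state; not the driver's struct. */
struct model_dev {
	bool hw_msi;		/* interrupt mode the hardware is really in */
	bool msi_enabled;	/* interrupt mode the driver believes is active */
};

/* Register write only: the hardware switches, the driver flag does not. */
static void model_access_devreg_intx(struct model_dev *dev)
{
	assert(dev);
	dev->hw_msi = false;
}

/* Helper: switch the hardware and keep the driver flag in step. */
static void model_set_intx_mode(struct model_dev *dev)
{
	assert(dev);
	if (dev->msi_enabled) {
		model_access_devreg_intx(dev);
		dev->msi_enabled = false;	/* assumed side effect */
	}
}

/* In this model a sync command completes only if both sides agree. */
static bool model_sync_cmd_completes(const struct model_dev *dev)
{
	assert(dev);
	return dev->hw_msi == dev->msi_enabled;
}
```

With the bare register write the two fields disagree and the model reports
a timeout; with the helper they stay consistent.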
Cc: stable@vger.kernel.org
Fixes: 495c021767bd78c998 (aacraid: MSI-x support)
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
---
drivers/scsi/aacraid/aacraid.h | 1 +
drivers/scsi/aacraid/comminit.c | 2 +-
drivers/scsi/aacraid/src.c | 2 +-
3 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/aacraid/aacraid.h b/drivers/scsi/aacraid/aacraid.h
index b5a2c87..9281e72 100644
--- a/drivers/scsi/aacraid/aacraid.h
+++ b/drivers/scsi/aacraid/aacraid.h
@@ -2639,6 +2639,7 @@ void aac_hba_callback(void *context, struct fib *fibptr);
#define fib_data(fibctx) ((void *)(fibctx)->hw_fib_va->data)
struct aac_dev *aac_init_adapter(struct aac_dev *dev);
void aac_src_access_devreg(struct aac_dev *dev, int mode);
+void aac_set_intx_mode(struct aac_dev *dev);
int aac_get_config_status(struct aac_dev *dev, int commit_flag);
int aac_get_containers(struct aac_dev *dev);
int aac_scsi_cmd(struct scsi_cmnd *cmd);
diff --git a/drivers/scsi/aacraid/comminit.c b/drivers/scsi/aacraid/comminit.c
index d0c7724..cd3456e 100644
--- a/drivers/scsi/aacraid/comminit.c
+++ b/drivers/scsi/aacraid/comminit.c
@@ -326,7 +326,7 @@ int aac_send_shutdown(struct aac_dev * dev)
dev->pdev->device == PMC_DEVICE_S8 ||
dev->pdev->device == PMC_DEVICE_S9) &&
dev->msi_enabled)
- aac_src_access_devreg(dev, AAC_ENABLE_INTX);
+ aac_set_intx_mode(dev);
return status;
}
diff --git a/drivers/scsi/aacraid/src.c b/drivers/scsi/aacraid/src.c
index c17b060..b23c818 100644
--- a/drivers/scsi/aacraid/src.c
+++ b/drivers/scsi/aacraid/src.c
@@ -657,7 +657,7 @@ static int aac_srcv_ioremap(struct aac_dev *dev, u32 size)
return 0;
}
-static void aac_set_intx_mode(struct aac_dev *dev)
+void aac_set_intx_mode(struct aac_dev *dev)
{
if (dev->msi_enabled) {
aac_src_access_devreg(dev, AAC_ENABLE_INTX);
--
2.7.4
* [PATCH 08/16] aacraid: Skip wellness sync on controller failure
2017-02-14 20:44 [PATCH 00/16] aacraid: Fixes and enhancements for arc family Raghava Aditya Renukunta
` (6 preceding siblings ...)
2017-02-14 20:44 ` [PATCH 07/16] aacraid: Fix sync fibs time out on controller reset Raghava Aditya Renukunta
@ 2017-02-14 20:44 ` Raghava Aditya Renukunta
2017-02-15 8:35 ` Johannes Thumshirn
2017-02-14 20:44 ` [PATCH 09/16] aacraid: Reload offlined drives after controller reset Raghava Aditya Renukunta
` (7 subsequent siblings)
15 siblings, 1 reply; 44+ messages in thread
From: Raghava Aditya Renukunta @ 2017-02-14 20:44 UTC (permalink / raw)
To: jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, jthumshirn, dan.carpenter
aac_command_thread periodically checks the health of the controller using
aac_check_health. If the status is an error state (KERNEL_PANIC or anything
else), the driver will attempt to restart the adapter, but the response is
not checked in aac_command_thread. This allows the periodic sync to go
through and leads the driver to a hung state.
Fixed by terminating the periodic loop (intended per original design)
if the controller is not restored to a healthy state.
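Illustration (not part of the patch): a small user-space model of the loop,
under the assumption that a non-zero health result means the controller was
not restored. The function and its parameters are invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Model of the periodic loop. health[] holds the value a hypothetical
 * health check returns on each pass (0 = healthy, non-zero = not
 * restored). Returns how many wellness syncs were issued.
 */
static int model_command_thread(const int *health, size_t passes,
				int check_ret)
{
	int syncs = 0;
	size_t i;

	assert(health);
	for (i = 0; i < passes; i++) {
		int ret = health[i];

		/* check_ret == 0 models the old code, which ignored ret. */
		if (check_ret && ret)
			break;
		syncs++;	/* the periodic sync goes to the adapter */
	}
	return syncs;
}
```

With the result ignored, syncs keep being issued to a failed controller;
with the check in place the loop stops at the first unhealthy pass.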
Cc: stable@vger.kernel.org
Fixes: 3d77d8404478353358 (scsi: aacraid: Added support for periodic wellness sync)
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
---
drivers/scsi/aacraid/commsup.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
index 863c98d..de4285d 100644
--- a/drivers/scsi/aacraid/commsup.c
+++ b/drivers/scsi/aacraid/commsup.c
@@ -2472,7 +2472,7 @@ int aac_command_thread(void *data)
/* Don't even try to talk to adapter if its sick */
ret = aac_check_health(dev);
- if (!dev->queues)
+ if (ret || !dev->queues)
break;
next_check_jiffies = jiffies
+ ((long)(unsigned)check_interval)
--
2.7.4
* [PATCH 09/16] aacraid: Reload offlined drives after controller reset
2017-02-14 20:44 [PATCH 00/16] aacraid: Fixes and enhancements for arc family Raghava Aditya Renukunta
` (7 preceding siblings ...)
2017-02-14 20:44 ` [PATCH 08/16] aacraid: Skip wellness sync on controller failure Raghava Aditya Renukunta
@ 2017-02-14 20:44 ` Raghava Aditya Renukunta
2017-02-15 8:38 ` Johannes Thumshirn
2017-02-14 20:44 ` [PATCH 10/16] aacraid: Terminate kthread on controller fw assert Raghava Aditya Renukunta
` (6 subsequent siblings)
15 siblings, 1 reply; 44+ messages in thread
From: Raghava Aditya Renukunta @ 2017-02-14 20:44 UTC (permalink / raw)
To: jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, jthumshirn, dan.carpenter
During IOP reset stress testing, it was found that drives can be
marked offline when the adapter controller crashes while I/Os are running
in parallel. When the controller does come back from the reset, a drive
that is marked offline is not exposed.
Fixed by removing and re-adding drives that are marked offline. In addition,
invoke a SCSI host bus rescan to capture any additional configuration
changes.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
---
drivers/scsi/aacraid/commsup.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
index de4285d..78588e4 100644
--- a/drivers/scsi/aacraid/commsup.c
+++ b/drivers/scsi/aacraid/commsup.c
@@ -1628,11 +1628,29 @@ static int _aac_reset_adapter(struct aac_dev *aac, int forced, u8 reset_type)
command->SCp.phase = AAC_OWNER_ERROR_HANDLER;
command->scsi_done(command);
}
+ /*
+ * Any Device that was already marked offline needs to be cleaned up
+ */
+ __shost_for_each_device(dev, host) {
+ if (!scsi_device_online(dev)) {
+ sdev_printk(KERN_INFO, dev, "Removing offline device\n");
+ scsi_remove_device(dev);
+ scsi_device_put(dev);
+ }
+ }
retval = 0;
out:
aac->in_reset = 0;
scsi_unblock_requests(host);
+ /*
+ * Issue bus rescan to catch any configuration changes that might
+ * have occurred
+ */
+ if (!retval) {
+ dev_info(&aac->pdev->dev, "Issuing bus rescan\n");
+ scsi_scan_host(host);
+ }
if (jafo) {
spin_lock_irq(host->host_lock);
}
--
2.7.4
* [PATCH 10/16] aacraid: Terminate kthread on controller fw assert
2017-02-14 20:44 [PATCH 00/16] aacraid: Fixes and enhancements for arc family Raghava Aditya Renukunta
` (8 preceding siblings ...)
2017-02-14 20:44 ` [PATCH 09/16] aacraid: Reload offlined drives after controller reset Raghava Aditya Renukunta
@ 2017-02-14 20:44 ` Raghava Aditya Renukunta
2017-02-15 8:44 ` Johannes Thumshirn
2017-02-14 20:44 ` [PATCH 11/16] aacraid: Decrease adapter health check interval Raghava Aditya Renukunta
` (5 subsequent siblings)
15 siblings, 1 reply; 44+ messages in thread
From: Raghava Aditya Renukunta @ 2017-02-14 20:44 UTC (permalink / raw)
To: jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, jthumshirn, dan.carpenter
When the command thread performs a periodic time sync while the firmware is
going through an assert, the command thread waits for a response that will
never arrive. The SCSI mid layer's error handler would eventually reset the
controller, but the eh_handler just issues a "thread stop" to the command
thread, which is stuck on a semaphore, and the eh_thread in turn goes to
sleep waiting for the command_thread to respond to the stop, which never
happens.
Fixed by allowing SIGTERM for the command thread and sending a termination
signal to the command thread during the eh_reset call. As a follow-up, the
eh_reset handler takes care of the controller reset.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
---
drivers/scsi/aacraid/commsup.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
index 78588e4..0ee91d0 100644
--- a/drivers/scsi/aacraid/commsup.c
+++ b/drivers/scsi/aacraid/commsup.c
@@ -1519,8 +1519,15 @@ static int _aac_reset_adapter(struct aac_dev *aac, int forced, u8 reset_type)
scsi_block_requests(host);
aac_adapter_disable_int(aac);
if (aac->thread->pid != current->pid) {
+ struct task_struct *tsk;
+
spin_unlock_irq(host->host_lock);
+ tsk = pid_task(find_vpid(aac->thread->pid), PIDTYPE_PID);
+ if (tsk)
+ send_sig(SIGTERM, tsk, 1);
kthread_stop(aac->thread);
+
+ dev_info(&aac->pdev->dev, "Command Thread Terminated\n");
jafo = 1;
}
--
2.7.4
* [PATCH 11/16] aacraid: Decrease adapter health check interval
2017-02-14 20:44 [PATCH 00/16] aacraid: Fixes and enhancements for arc family Raghava Aditya Renukunta
` (9 preceding siblings ...)
2017-02-14 20:44 ` [PATCH 10/16] aacraid: Terminate kthread on controller fw assert Raghava Aditya Renukunta
@ 2017-02-14 20:44 ` Raghava Aditya Renukunta
2017-02-15 8:45 ` Johannes Thumshirn
2017-02-14 20:44 ` [PATCH 12/16] aacraid: Skip IOP reset on controller panic(SMART Family) Raghava Aditya Renukunta
` (4 subsequent siblings)
15 siblings, 1 reply; 44+ messages in thread
From: Raghava Aditya Renukunta @ 2017-02-14 20:44 UTC (permalink / raw)
To: jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, jthumshirn, dan.carpenter
Currently the driver checks the health status of the adapter once every 24
hours. Between checks, the driver depends on the kernel to figure out if
the adapter is misbehaving, which might take some time (when the adapter
is idle). The driver already has support to restart/recover the controller
when it fails, and decreasing the check interval lets it do so sooner.
Fixed by decreasing the check interval from 24 hours to 1 minute.
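Illustration (not part of the patch): the arithmetic behind the change,
assuming the interval in seconds is scaled by the tick rate to get the next
check time. MODEL_HZ = 250 is an assumption for the example only.

```c
#include <assert.h>

/* Assumption for the example only: the timer tick rate (HZ) is 250. */
#define MODEL_HZ 250L

/* Next check time in ticks: now + interval in seconds * tick rate. */
static long model_next_check(long now_jiffies, int interval_seconds)
{
	assert(interval_seconds > 0);
	return now_jiffies + (long)(unsigned)interval_seconds * MODEL_HZ;
}
```

Since check_interval is a writable module parameter, the old behaviour
remains available by setting it back to 86400 (24 * 60 * 60) seconds.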
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
---
drivers/scsi/aacraid/aachba.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/aacraid/aachba.c b/drivers/scsi/aacraid/aachba.c
index 98d4ffd..3ede50f 100644
--- a/drivers/scsi/aacraid/aachba.c
+++ b/drivers/scsi/aacraid/aachba.c
@@ -311,7 +311,7 @@ module_param(update_interval, int, S_IRUGO|S_IWUSR);
MODULE_PARM_DESC(update_interval, "Interval in seconds between time sync"
" updates issued to adapter.");
-int check_interval = 24 * 60 * 60;
+int check_interval = 60;
module_param(check_interval, int, S_IRUGO|S_IWUSR);
MODULE_PARM_DESC(check_interval, "Interval in seconds between adapter health"
" checks.");
--
2.7.4
* [PATCH 12/16] aacraid: Skip IOP reset on controller panic(SMART Family)
2017-02-14 20:44 [PATCH 00/16] aacraid: Fixes and enhancements for arc family Raghava Aditya Renukunta
` (10 preceding siblings ...)
2017-02-14 20:44 ` [PATCH 11/16] aacraid: Decrease adapter health check interval Raghava Aditya Renukunta
@ 2017-02-14 20:44 ` Raghava Aditya Renukunta
2017-02-15 8:49 ` Johannes Thumshirn
2017-02-14 20:44 ` [PATCH 13/16] aacraid: Reorder Adapter status check Raghava Aditya Renukunta
` (3 subsequent siblings)
15 siblings, 1 reply; 44+ messages in thread
From: Raghava Aditya Renukunta @ 2017-02-14 20:44 UTC (permalink / raw)
To: jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, jthumshirn, dan.carpenter
When controllers of the SMART family panic (KERNEL_PANIC), they do not
honor IOP resets. So it is better to skip the IOP reset and directly
perform an IWBR reset.
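Illustration (not part of the patch): the mask manipulation in isolation.
The bit values and the MODEL_ names are made up for this sketch; the real
definitions live in the driver headers.

```c
#include <assert.h>

/* Illustrative bit values; the real ones live in the driver headers. */
#define MODEL_HW_IOP_RESET  0x01u
#define MODEL_HW_SOFT_RESET 0x02u

/*
 * Drop the IOP reset from the requested reset types when a SMART
 * family controller (sa_firmware) reports a BlinkLED code of 2 or more.
 */
static unsigned model_filter_reset_type(int bled, int sa_firmware,
					unsigned reset_type)
{
	/* The two reset types must be independent bits for this to work. */
	assert((MODEL_HW_IOP_RESET & MODEL_HW_SOFT_RESET) == 0);

	if (bled >= 2 && sa_firmware && (reset_type & MODEL_HW_IOP_RESET))
		reset_type &= ~MODEL_HW_IOP_RESET;
	return reset_type;
}
```

Only the IOP bit is cleared, so any other reset type requested by the
caller is still attempted.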
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
---
drivers/scsi/aacraid/src.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/scsi/aacraid/src.c b/drivers/scsi/aacraid/src.c
index b23c818..5bb9865 100644
--- a/drivers/scsi/aacraid/src.c
+++ b/drivers/scsi/aacraid/src.c
@@ -714,6 +714,12 @@ static int aac_src_restart_adapter(struct aac_dev *dev, int bled, u8 reset_type)
pr_err("%s%d: adapter kernel panic'd %x.\n",
dev->name, dev->id, bled);
+ /*
+ * When there is a BlinkLED, IOP_RESET has no effect
+ */
+ if (bled >= 2 && dev->sa_firmware && (reset_type & HW_IOP_RESET))
+ reset_type &= ~HW_IOP_RESET;
+
dev->a_ops.adapter_enable_int = aac_src_disable_interrupt;
switch (reset_type) {
--
2.7.4
* [PATCH 13/16] aacraid: Reorder Adapter status check
2017-02-14 20:44 [PATCH 00/16] aacraid: Fixes and enhancements for arc family Raghava Aditya Renukunta
` (11 preceding siblings ...)
2017-02-14 20:44 ` [PATCH 12/16] aacraid: Skip IOP reset on controller panic(SMART Family) Raghava Aditya Renukunta
@ 2017-02-14 20:44 ` Raghava Aditya Renukunta
2017-02-15 8:50 ` Johannes Thumshirn
2017-02-14 20:44 ` [PATCH 14/16] aacraid: Save adapter fib log before an IOP reset Raghava Aditya Renukunta
` (2 subsequent siblings)
15 siblings, 1 reply; 44+ messages in thread
From: Raghava Aditya Renukunta @ 2017-02-14 20:44 UTC (permalink / raw)
To: jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, jthumshirn, dan.carpenter
The driver currently checks SELF_TEST_FAILED first and KERNEL_PANIC
next. Under error conditions (boot code failure) both
SELF_TEST_FAILED and KERNEL_PANIC can be set at the same time.
The driver has the capability to reset the controller on a KERNEL_PANIC,
but not on SELF_TEST_FAILED.
Fixed by checking KERNEL_PANIC first and then the others.
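Illustration (not part of the patch): the effect of the check order on a
status word that has both flags set. The flag values are assumptions for
this sketch, and the "up and running" check of the real function is left
out.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative flag bits; treat the exact values as assumptions. */
#define MODEL_SELF_TEST_FAILED 0x00000004u
#define MODEL_MONITOR_PANIC    0x00000020u
#define MODEL_KERNEL_PANIC     0x00000100u

/*
 * Returns 0 when healthy, -1 for a failure with no recovery path, and
 * the BlinkLED code (bits 16..23 of the status word) on a kernel panic.
 */
static int model_check_health(uint32_t status)
{
	/* Panic first, so a simultaneous self-test flag cannot mask it. */
	if (status & MODEL_KERNEL_PANIC)
		return (int)((status >> 16) & 0xFF);
	if (status & (MODEL_SELF_TEST_FAILED | MODEL_MONITOR_PANIC))
		return -1;
	return 0;
}
```

With the old order the same status word would have produced -1, and the
caller would never have seen the BlinkLED code it needs to start a reset.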
Cc: stable@vger.kernel.org
Fixes: e8b12f0fb835223752 ([SCSI] aacraid: Add new code for PMC-Sierra's SRC base controller family)
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
---
drivers/scsi/aacraid/src.c | 21 +++++++++++++++++----
1 file changed, 17 insertions(+), 4 deletions(-)
diff --git a/drivers/scsi/aacraid/src.c b/drivers/scsi/aacraid/src.c
index 5bb9865..0990117 100644
--- a/drivers/scsi/aacraid/src.c
+++ b/drivers/scsi/aacraid/src.c
@@ -437,16 +437,23 @@ static int aac_src_check_health(struct aac_dev *dev)
u32 status = src_readl(dev, MUnit.OMR);
/*
+ * Check to see if the board panic'd.
+ */
+ if (unlikely(status & KERNEL_PANIC))
+ goto err_blink;
+
+ /*
* Check to see if the board failed any self tests.
*/
if (unlikely(status & SELF_TEST_FAILED))
- return -1;
+ goto err_out;
/*
- * Check to see if the board panic'd.
+ * Check to see if the monitor panic'd.
*/
- if (unlikely(status & KERNEL_PANIC))
- return (status >> 16) & 0xFF;
+ if (unlikely(status & MONITOR_PANIC))
+ goto err_out;
+
/*
* Wait for the adapter to be up and running.
*/
@@ -456,6 +463,12 @@ static int aac_src_check_health(struct aac_dev *dev)
* Everything is OK
*/
return 0;
+
+err_out:
+ return -1;
+
+err_blink:
+ return (status >> 16) & 0xFF;
}
static inline u32 aac_get_vector(struct aac_dev *dev)
--
2.7.4
* [PATCH 14/16] aacraid: Save adapter fib log before an IOP reset
2017-02-14 20:44 [PATCH 00/16] aacraid: Fixes and enhancements for arc family Raghava Aditya Renukunta
` (12 preceding siblings ...)
2017-02-14 20:44 ` [PATCH 13/16] aacraid: Reorder Adapter status check Raghava Aditya Renukunta
@ 2017-02-14 20:44 ` Raghava Aditya Renukunta
2017-02-15 8:53 ` Johannes Thumshirn
2017-02-14 20:44 ` [PATCH 15/16] aacraid: Fix a potential spinlock double unlock bug Raghava Aditya Renukunta
2017-02-14 20:44 ` [PATCH 16/16] aacraid: Update driver version Raghava Aditya Renukunta
15 siblings, 1 reply; 44+ messages in thread
From: Raghava Aditya Renukunta @ 2017-02-14 20:44 UTC (permalink / raw)
To: jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, jthumshirn, dan.carpenter
Currently the adapter firmware does not save log information for
outstanding I/Os when an IOP reset is triggered. This is problematic when
trying to root-cause and debug issues.
Fixed by adding a sync command that triggers an I/O log file save in the
adapter firmware before an IOP reset is issued.
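Illustration (not part of the patch): the two conditions that gate the dump
command, as a stand-alone predicate. The bit value is the one from the diff
in this mail; the names are local to the sketch, and endianness conversion
is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit taken from the diff; the names below are local to this sketch. */
#define MODEL_OPT3_IOP_RESET_FIB_DUMP 0x00004000u

/* Decide whether the dump command should precede the IOP reset. */
static bool model_should_dump_fibs(int fib_dump_param,
				   uint32_t supported_options3)
{
	assert(fib_dump_param == 0 || fib_dump_param == 1);

	if (!fib_dump_param)
		return false;	/* module parameter is off (the default) */
	if (!(supported_options3 & MODEL_OPT3_IOP_RESET_FIB_DUMP))
		return false;	/* firmware does not advertise support */
	return true;
}
```

Both the module parameter and the firmware capability bit have to be set,
so existing setups see no change unless the dump is explicitly enabled.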
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
---
drivers/scsi/aacraid/aachba.c | 4 ++++
drivers/scsi/aacraid/aacraid.h | 6 ++++++
drivers/scsi/aacraid/src.c | 17 +++++++++++++++++
3 files changed, 27 insertions(+)
diff --git a/drivers/scsi/aacraid/aachba.c b/drivers/scsi/aacraid/aachba.c
index 3ede50f..e3e93de 100644
--- a/drivers/scsi/aacraid/aachba.c
+++ b/drivers/scsi/aacraid/aachba.c
@@ -294,6 +294,10 @@ MODULE_PARM_DESC(aif_timeout, "The duration of time in seconds to wait for"
"deregistering them. This is typically adjusted for heavily burdened"
" systems.");
+int aac_fib_dump;
+module_param(aac_fib_dump, int, 0644);
+MODULE_PARM_DESC(aac_fib_dump, "Dump controller fibs prior to IOP_RESET 0=off, 1=on");
+
int numacb = -1;
module_param(numacb, int, S_IRUGO|S_IWUSR);
MODULE_PARM_DESC(numacb, "Request a limit to the number of adapter control"
diff --git a/drivers/scsi/aacraid/aacraid.h b/drivers/scsi/aacraid/aacraid.h
index 9281e72..622fd69 100644
--- a/drivers/scsi/aacraid/aacraid.h
+++ b/drivers/scsi/aacraid/aacraid.h
@@ -1444,6 +1444,10 @@ struct aac_supplement_adapter_info
#define AAC_OPTION_VARIABLE_BLOCK_SIZE cpu_to_le32(0x00040000)
/* 240 simple volume support */
#define AAC_OPTION_SUPPORTED_240_VOLUMES cpu_to_le32(0x10000000)
+/*
+ * Supports FIB dump sync command sent prior to IOP_RESET
+ */
+#define AAC_OPTION_SUPPORTED3_IOP_RESET_FIB_DUMP cpu_to_le32(0x00004000)
#define AAC_SIS_VERSION_V3 3
#define AAC_SIS_SLOT_UNKNOWN 0xFF
@@ -2483,6 +2487,7 @@ struct aac_hba_info {
#define GET_DRIVER_BUFFER_PROPERTIES 0x00000023
#define RCV_TEMP_READINGS 0x00000025
#define GET_COMM_PREFERRED_SETTINGS 0x00000026
+#define IOP_RESET_FW_FIB_DUMP 0x00000034
#define IOP_RESET 0x00001000
#define IOP_RESET_ALWAYS 0x00001001
#define RE_INIT_ADAPTER 0x000000ee
@@ -2686,4 +2691,5 @@ extern int aac_commit;
extern int update_interval;
extern int check_interval;
extern int aac_check_reset;
+extern int aac_fib_dump;
#endif
diff --git a/drivers/scsi/aacraid/src.c b/drivers/scsi/aacraid/src.c
index 0990117..fa03cdc 100644
--- a/drivers/scsi/aacraid/src.c
+++ b/drivers/scsi/aacraid/src.c
@@ -679,10 +679,27 @@ void aac_set_intx_mode(struct aac_dev *dev)
}
}
+static void aac_dump_fw_fib_iop_reset(struct aac_dev *dev)
+{
+ __le32 supported_options3;
+
+ if (!aac_fib_dump)
+ return;
+
+ supported_options3 = dev->supplement_adapter_info.supported_options3;
+ if (!(supported_options3 & AAC_OPTION_SUPPORTED3_IOP_RESET_FIB_DUMP))
+ return;
+
+ aac_adapter_sync_cmd(dev, IOP_RESET_FW_FIB_DUMP,
+ 0, 0, 0, 0, 0, 0, NULL, NULL, NULL, NULL, NULL);
+}
+
static void aac_send_iop_reset(struct aac_dev *dev, int bled)
{
u32 var, reset_mask;
+ aac_dump_fw_fib_iop_reset(dev);
+
bled = aac_adapter_sync_cmd(dev, IOP_RESET_ALWAYS,
0, 0, 0, 0, 0, 0, &var,
&reset_mask, NULL, NULL, NULL);
--
2.7.4
* [PATCH 15/16] aacraid: Fix a potential spinlock double unlock bug
2017-02-14 20:44 [PATCH 00/16] aacraid: Fixes and enhancements for arc family Raghava Aditya Renukunta
` (13 preceding siblings ...)
2017-02-14 20:44 ` [PATCH 14/16] aacraid: Save adapter fib log before an IOP reset Raghava Aditya Renukunta
@ 2017-02-14 20:44 ` Raghava Aditya Renukunta
2017-02-15 8:54 ` Johannes Thumshirn
2017-02-14 20:44 ` [PATCH 16/16] aacraid: Update driver version Raghava Aditya Renukunta
15 siblings, 1 reply; 44+ messages in thread
From: Raghava Aditya Renukunta @ 2017-02-14 20:44 UTC (permalink / raw)
To: jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, jthumshirn, dan.carpenter
The driver does not re-acquire the reply queue spin lock after handling
SMART adapter events. As a result it might attempt to unlock an already
unlocked spin lock.
Fixed by making sure the driver takes the spin lock again before releasing
it. Thanks to Dan for finding this issue.
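Illustration (not part of the patch): a toy model of the locking pattern.
The loop shape (lock held at the top of each pass, dropped while an event
is handled, retaken at the free_fib label) is an assumption, since the hunk
in this mail shows only the changed line. All names are invented.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock that counts unlock calls made while it is not held. */
struct model_lock {
	bool held;
	int bad_unlocks;
};

static void model_lock_take(struct model_lock *l)
{
	assert(l);
	l->held = true;
}

static void model_lock_release(struct model_lock *l)
{
	assert(l);
	if (!l->held)
		l->bad_unlocks++;	/* unlocking an unlocked lock */
	l->held = false;
}

/*
 * use_goto selects the fixed behaviour. Returns the number of unlock
 * calls that hit a lock which was not held.
 */
static int model_process_events(int events, bool use_goto)
{
	struct model_lock lock = { false, 0 };
	int i;

	model_lock_take(&lock);
	for (i = 0; i < events; i++) {
		model_lock_release(&lock);	/* handle event unlocked */
		if (!use_goto)
			continue;		/* old path: lock not retaken */
		model_lock_take(&lock);		/* free_fib: retake the lock */
	}
	model_lock_release(&lock);		/* final unlock after loop */
	return lock.bad_unlocks;
}
```

In this model every event handled through the old path leaves the lock
released, so the next unlock is unbalanced.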
Fixes: 6223a39fe6fbbeef (scsi: aacraid: Added support for hotplug)
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
---
drivers/scsi/aacraid/commsup.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
index 0ee91d0..4141de0 100644
--- a/drivers/scsi/aacraid/commsup.c
+++ b/drivers/scsi/aacraid/commsup.c
@@ -2213,7 +2213,7 @@ static void aac_process_events(struct aac_dev *dev)
/* Thor AIF */
aac_handle_sa_aif(dev, fib);
aac_fib_adapter_complete(fib, (u16)sizeof(u32));
- continue;
+ goto free_fib;
}
/*
* We will process the FIB here or pass it to a
--
2.7.4
* [PATCH 16/16] aacraid: Update driver version
2017-02-14 20:44 [PATCH 00/16] aacraid: Fixes and enhancements for arc family Raghava Aditya Renukunta
` (14 preceding siblings ...)
2017-02-14 20:44 ` [PATCH 15/16] aacraid: Fix a potential spinlock double unlock bug Raghava Aditya Renukunta
@ 2017-02-14 20:44 ` Raghava Aditya Renukunta
2017-02-15 8:55 ` Johannes Thumshirn
15 siblings, 1 reply; 44+ messages in thread
From: Raghava Aditya Renukunta @ 2017-02-14 20:44 UTC (permalink / raw)
To: jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, jthumshirn, dan.carpenter
Updated driver version to 50792
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
---
drivers/scsi/aacraid/aacraid.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/aacraid/aacraid.h b/drivers/scsi/aacraid/aacraid.h
index 622fd69..d036a80 100644
--- a/drivers/scsi/aacraid/aacraid.h
+++ b/drivers/scsi/aacraid/aacraid.h
@@ -97,7 +97,7 @@ enum {
#define PMC_GLOBAL_INT_BIT0 0x00000001
#ifndef AAC_DRIVER_BUILD
-# define AAC_DRIVER_BUILD 50740
+# define AAC_DRIVER_BUILD 50792
# define AAC_DRIVER_BRANCH "-custom"
#endif
#define MAXIMUM_NUM_CONTAINERS 32
--
2.7.4
* Re: [PATCH 01/16] aacraid: Fix camel case
2017-02-14 20:44 ` [PATCH 01/16] aacraid: Fix camel case Raghava Aditya Renukunta
@ 2017-02-15 8:02 ` Johannes Thumshirn
0 siblings, 0 replies; 44+ messages in thread
From: Johannes Thumshirn @ 2017-02-15 8:02 UTC (permalink / raw)
To: Raghava Aditya Renukunta, jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, dan.carpenter
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote:
> Replaced camel case with snake case for init supported options.
>
> Suggested-by: Johannes Thumshirn <jthumshirn@suse.de>
> Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
> Reviewed-by: David Carroll <David.Carroll@microsemi.com>
> ---
\o/
Bonus-points-awarded-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
--
Johannes Thumshirn Storage
jthumshirn@suse.de +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
* Re: [PATCH 02/16] aacraid: Use correct channel number for raw srb
2017-02-14 20:44 ` [PATCH 02/16] aacraid: Use correct channel number for raw srb Raghava Aditya Renukunta
@ 2017-02-15 8:03 ` Johannes Thumshirn
0 siblings, 0 replies; 44+ messages in thread
From: Johannes Thumshirn @ 2017-02-15 8:03 UTC (permalink / raw)
To: Raghava Aditya Renukunta, jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, dan.carpenter
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote:
> The channel being used for raw srb commands is retrieved from the utility
> sent fibs and is converted into physical channel id. The driver does not
> need to do this since the management utility sends the correct channel
> id in the first place and in addition the driver sets inaccurate
> information in the cmd sent to the firmware and gets an invalid response.
>
> Fixed by using channel id from srb command.
>
> Cc: stable@vger.kernel.org
> Fixes: 423400e64d377c0 ("scsi: aacraid: Include HBA direct interface")
> Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
> Reviewed-by: David Carroll <David.Carroll@microsemi.com>
> ---
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
--
Johannes Thumshirn Storage
jthumshirn@suse.de +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
* Re: [PATCH 03/16] aacraid: Fix for excessive prints on EEH
2017-02-14 20:44 ` [PATCH 03/16] aacraid: Fix for excessive prints on EEH Raghava Aditya Renukunta
@ 2017-02-15 8:07 ` Johannes Thumshirn
2017-02-15 18:06 ` Raghava Aditya Renukunta
0 siblings, 1 reply; 44+ messages in thread
From: Johannes Thumshirn @ 2017-02-15 8:07 UTC (permalink / raw)
To: Raghava Aditya Renukunta, jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, dan.carpenter
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote:
> This issue showed up on a kdump debug(single CPU on powerkvm), when EEH
> errors rendered the adapter unusable. The driver correctly detected the
> issue and attempted to restart the controller, in doing so the driver
> attempted to read the status registers of the controller. This triggered
> additional eeh errors which continued for a good 6 minutes.
>
> Fixed by returning without waiting when EEH error is reported.
>
> Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
> Reviewed-by: David Carroll <David.Carroll@microsemi.com>
> ---
> drivers/scsi/aacraid/commsup.c | 38 +++++++++++++++++++++++++++++++++++++-
> 1 file changed, 37 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
> index 56090f5..6220b47 100644
> --- a/drivers/scsi/aacraid/commsup.c
> +++ b/drivers/scsi/aacraid/commsup.c
> @@ -461,6 +461,30 @@ int aac_queue_get(struct aac_dev * dev, u32 * index, u32 qid, struct hw_fib * hw
> return 0;
> }
Please do
> +#ifdef CONFIG_EEH
> +static inline int aac_check_eeh_failure(struct aac_dev *dev)
> +{
> + /* Check for an EEH failure for the given
> + * device node. Function eeh_dev_check_failure()
> + * returns 0 if there has not been an EEH error
> + * otherwise returns a non-zero value.
> + *
> + * Need to be called before any PCI operation,
> + * i.e.,before aac_adapter_check_health()
> + */
> + struct eeh_dev *edev = pci_dev_to_eeh_dev(dev->pdev);
> +
> + if (eeh_dev_check_failure(edev)) {
> + /* The EEH mechanisms will handle this
> + * error and reset the device if
> + * necessary.
> + */
> + return 1;
> + }
> + return 0;
> +}
#else
static inline int aac_check_eeh_failure(struct aac_dev *dev)
{
return 0;
}
> +#endif
> +
[...]
> +
> +#if defined(CONFIG_EEH)
> + if (aac_check_eeh_failure(dev))
> + return -EFAULT;
> +#endif
> +
So the #if defined() blocks become unnecessary.
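Illustration (not part of the review): the stub pattern in a self-contained
form. MODEL_CONFIG_EEH and the function names are placeholders; the error
value returned by the caller is arbitrary.

```c
#include <assert.h>

/* Define MODEL_CONFIG_EEH at build time to select the real check. */
#ifdef MODEL_CONFIG_EEH
static inline int model_check_eeh_failure(int eeh_error_pending)
{
	return eeh_error_pending ? 1 : 0;
}
#else
/* Stub: without the feature there is never a failure to report. */
static inline int model_check_eeh_failure(int eeh_error_pending)
{
	(void)eeh_error_pending;
	return 0;
}
#endif

/* The caller needs no preprocessor guard in either configuration. */
static int model_fib_send(int eeh_error_pending)
{
	if (model_check_eeh_failure(eeh_error_pending))
		return -1;	/* stands in for the driver's error return */
	return 0;
}
```

When the option is off, the compiler can see that the stub always returns
0 and drop the branch in the caller.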
Thanks,
Johannes
--
Johannes Thumshirn Storage
jthumshirn@suse.de +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
* Re: [PATCH 04/16] aacraid: Prevent E3 lockup when deleting units
2017-02-14 20:44 ` [PATCH 04/16] aacraid: Prevent E3 lockup when deleting units Raghava Aditya Renukunta
@ 2017-02-15 8:20 ` Johannes Thumshirn
2017-02-15 18:08 ` Raghava Aditya Renukunta
0 siblings, 1 reply; 44+ messages in thread
From: Johannes Thumshirn @ 2017-02-15 8:20 UTC (permalink / raw)
To: Raghava Aditya Renukunta, jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, dan.carpenter
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote:
> The Arrconf management utility at times sends fibs with AdapterProcessed
> set. This causes the controller to panic and lock up.
>
> Fixed by failing the commands that have AdapterProcessed set in their flags.
>
> Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
> Reviewed-by: David Carroll <David.Carroll@microsemi.com>
> ---
> drivers/scsi/aacraid/commsup.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
> index 6220b47..f7a3bcb 100644
> --- a/drivers/scsi/aacraid/commsup.c
> +++ b/drivers/scsi/aacraid/commsup.c
> @@ -522,6 +522,10 @@ int aac_fib_send(u16 command, struct fib *fibptr, unsigned long size,
>
> if (!(hw_fib->header.XferState & cpu_to_le32(HostOwned)))
> return -EBUSY;
> +
> + if (hw_fib->header.XferState & cpu_to_le32(AdapterProcessed))
> + return -EINVAL;
> +
As far as I can see the fib_xfer_state enum isn't exported as an
official ABI, so it's a good candidate (whole of aacraid.h actually) for
the next round of camel case removals.
Anyways, this can wait:
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
--
Johannes Thumshirn Storage
jthumshirn@suse.de +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
* Re: [PATCH 05/16] aacraid: Fix memory leak in fib init path
2017-02-14 20:44 ` [PATCH 05/16] aacraid: Fix memory leak in fib init path Raghava Aditya Renukunta
@ 2017-02-15 8:31 ` Johannes Thumshirn
2017-02-15 18:08 ` Raghava Aditya Renukunta
0 siblings, 1 reply; 44+ messages in thread
From: Johannes Thumshirn @ 2017-02-15 8:31 UTC (permalink / raw)
To: Raghava Aditya Renukunta, jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, dan.carpenter
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote:
> aac_fib_map_free frees misaligned fib DMA memory; additionally, it does not
> free up the whole memory.
>
> Fixed by changing the code to free up the correct and full memory
> allocation.
>
> Cc: stable@vger.kernel.org
> Fixes: e8b12f0fb835223 ([SCSI] aacraid: Add new code for PMC-Sierra's SRC based controller family)
> Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
> Reviewed-by: David Carroll <David.Carroll@microsemi.com>
> ---
> drivers/scsi/aacraid/commsup.c | 20 +++++++++-----------
> 1 file changed, 9 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
> index f7a3bcb..863c98d 100644
> --- a/drivers/scsi/aacraid/commsup.c
> +++ b/drivers/scsi/aacraid/commsup.c
> @@ -97,8 +97,8 @@ void aac_fib_map_free(struct aac_dev *dev)
> {
> if (dev->hw_fib_va && dev->max_cmd_size) {
> pci_free_consistent(dev->pdev,
> - (dev->max_cmd_size *
> - (dev->scsi_host_ptr->can_queue + AAC_NUM_MGT_FIB)),
> + (dev->max_cmd_size + sizeof(struct aac_fib_xporthdr))
> + * (dev->scsi_host_ptr->can_queue + AAC_NUM_MGT_FIB) + 31,
> dev->hw_fib_va, dev->hw_fib_pa);
Can you please do something like:
	size_t alloc_size;
	int numtags;

	numtags = dev->scsi_host_ptr->can_queue + AAC_NUM_MGT_FIB;
	alloc_size = (dev->max_cmd_size + sizeof(struct aac_fib_xporthdr)) *
		numtags + 31;
	pci_free_consistent(dev->pdev, alloc_size, dev->hw_fib_va,
			    dev->hw_fib_pa);
And please indent correctly. If the indentation doesn't work correctly
because you hit the 80 chars limit, that's a sign something should be
reconsidered.
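The suggestion above amounts to computing the size once, in named steps, so the free path cannot drift from the allocation path. A minimal userspace sketch of that idea follows. The stand-in struct `xport_hdr_sketch` and the helper `fib_pool_size` are hypothetical names, not part of the driver, and the `+ 31` is presumably slack for aligning the start of the pool:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the per-FIB transport header. The real
 * struct aac_fib_xporthdr is defined in aacraid.h and may differ.
 */
struct xport_hdr_sketch {
	unsigned long long host_address;
	unsigned int size;
	unsigned int handle;
	unsigned long long reserved[2];
};

#define ALIGN_SLACK 31 /* assumed: spare bytes for rounding the start up */

/*
 * Bytes needed for num_fibs FIBs of max_cmd_size bytes each, with one
 * header per FIB, plus the alignment slack. Both the allocation and
 * the free path would call this so the two sizes always agree.
 */
static size_t fib_pool_size(size_t max_cmd_size, size_t num_fibs)
{
	assert(max_cmd_size > 0 && num_fibs > 0);

	return (max_cmd_size + sizeof(struct xport_hdr_sketch)) * num_fibs
		+ ALIGN_SLACK;
}
```

With a single helper like this, the old `max_cmd_size * numtags` expression would have nowhere to hide: any caller that frees the pool gets the same byte count that was used to allocate it.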
Thanks,
Johannes
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 06/16] aacraid: Added sysfs for driver version
2017-02-14 20:44 ` [PATCH 06/16] aacraid: Added sysfs for driver version Raghava Aditya Renukunta
@ 2017-02-15 8:32 ` Johannes Thumshirn
2017-02-15 18:12 ` Raghava Aditya Renukunta
0 siblings, 1 reply; 44+ messages in thread
From: Johannes Thumshirn @ 2017-02-15 8:32 UTC (permalink / raw)
To: Raghava Aditya Renukunta, jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, dan.carpenter
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote:
> Added support to retrieve driver version from a new sysfs variable called
> driver_version. It makes it easier for the user to figure out the driver
> version that is currently running.
>
> Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
> Reviewed-by: David Carroll <David.Carroll@microsemi.com>
> ---
Can't this be retrieved via modinfo?
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 07/16] aacraid: Fix sync fibs time out on controller reset
2017-02-14 20:44 ` [PATCH 07/16] aacraid: Fix sync fibs time out on controller reset Raghava Aditya Renukunta
@ 2017-02-15 8:34 ` Johannes Thumshirn
0 siblings, 0 replies; 44+ messages in thread
From: Johannes Thumshirn @ 2017-02-15 8:34 UTC (permalink / raw)
To: Raghava Aditya Renukunta, jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, dan.carpenter
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote:
> After controller shutdown, all sync fibs time out due to not knowing
> about the switch to INT-x mode
>
> Fixed by replacing aac_src_access_devreg() to aac_set_intx_mode() call.
>
> Cc: stable@vger.kernel.org
> Fixes: 495c021767bd78c998 (aacraid: MSI-x support)
> Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
> Reviewed-by: David Carroll <David.Carroll@microsemi.com>
> ---
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 08/16] aacraid: Skip wellness sync on controller failure
2017-02-14 20:44 ` [PATCH 08/16] aacraid: Skip wellness sync on controller failure Raghava Aditya Renukunta
@ 2017-02-15 8:35 ` Johannes Thumshirn
0 siblings, 0 replies; 44+ messages in thread
From: Johannes Thumshirn @ 2017-02-15 8:35 UTC (permalink / raw)
To: Raghava Aditya Renukunta, jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, dan.carpenter
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote:
> aac_command_thread checks on the health of controller periodically,
> using aac_check_health. If the status is an error state KERNEL_PANIC or
> anything else. The driver will attempt to restart the adapter, but the
> response is not checked in aac_command_thread. This allows the periodic
> sync to go thru and lead the driver to a hung state.
>
> Fixed by terminating the periodic loop(intended per original design),
> if the controller is not restored to a healthy state.
>
> Cc: stable@vger.kernel.org
> Fixes: 3d77d8404478353358 (scsi: aacraid: Added support for periodic wellness sync)
> Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
> Reviewed-by: David Carroll <David.Carroll@microsemi.com>
> ---
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 09/16] aacraid: Reload offlined drives after controller reset
2017-02-14 20:44 ` [PATCH 09/16] aacraid: Reload offlined drives after controller reset Raghava Aditya Renukunta
@ 2017-02-15 8:38 ` Johannes Thumshirn
0 siblings, 0 replies; 44+ messages in thread
From: Johannes Thumshirn @ 2017-02-15 8:38 UTC (permalink / raw)
To: Raghava Aditya Renukunta, jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, dan.carpenter
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote:
> During the IOP reset stress testing, it was found that the drives can be
> marked offline when the adapter controller crashes and IO's are running
> in parallel. When the controller does come back from the reset, the drive
> that is marked offline is not exposed.
>
> Fixed by removing and adding drives that are marked offline. In addition
> invoke a scsi host bus rescan to capture any additional configuration
> changes.
>
> Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
> Reviewed-by: David Carroll <David.Carroll@microsemi.com>
> ---
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 10/16] aacraid: Terminate kthread on controller fw assert
2017-02-14 20:44 ` [PATCH 10/16] aacraid: Terminate kthread on controller fw assert Raghava Aditya Renukunta
@ 2017-02-15 8:44 ` Johannes Thumshirn
2017-02-15 22:22 ` Raghava Aditya Renukunta
0 siblings, 1 reply; 44+ messages in thread
From: Johannes Thumshirn @ 2017-02-15 8:44 UTC (permalink / raw)
To: Raghava Aditya Renukunta, jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, dan.carpenter
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote:
> When the command thread performs a periodic time sync and the firmware is
> going through an assert during that time, the command thread waits for the
> response that would never arrive. The SCSI Mid layer's error handler would
> eventually reset the controller, but the eh_handler just issues a
> "thread stop" to the command thread which is stuck on a semaphore and the
> eh_thread would in turn goes to sleep waiting for the command_thread to
> respond to the stop which never happens.
>
> Fixed by allowing SIGTERM for the command thread, and during the eh_reset
> call, sends termination signal to the command thread. As a follow-up, the
> eh_reset handler would take care of the controller reset.
>
> Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
> Reviewed-by: David Carroll <David.Carroll@microsemi.com>
> ---
This looks a bit scary. Can't the kthread be converted to a workqueue so
we could call cancel_work_sync()?
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 11/16] aacraid: Decrease adapter health check interval
2017-02-14 20:44 ` [PATCH 11/16] aacraid: Decrease adapter health check interval Raghava Aditya Renukunta
@ 2017-02-15 8:45 ` Johannes Thumshirn
0 siblings, 0 replies; 44+ messages in thread
From: Johannes Thumshirn @ 2017-02-15 8:45 UTC (permalink / raw)
To: Raghava Aditya Renukunta, jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, dan.carpenter
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote:
> Currently driver checks the health status of the adapter once every 24
> hours. When that happens the driver becomes dependent on the kernel to
> figure out if the adapter is misbehaving. This might take some time
> (when the adapter is idle). The driver currently has support to
> restart/recover the controller when it fails, and decreasing the time
> interval will help.
>
> Fixed by decreasing check interval from 24 hours to 1 minute
>
> Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
> Reviewed-by: David Carroll <David.Carroll@microsemi.com>
> ---
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 12/16] aacraid: Skip IOP reset on controller panic(SMART Family)
2017-02-14 20:44 ` [PATCH 12/16] aacraid: Skip IOP reset on controller panic(SMART Family) Raghava Aditya Renukunta
@ 2017-02-15 8:49 ` Johannes Thumshirn
2017-02-15 18:14 ` Raghava Aditya Renukunta
0 siblings, 1 reply; 44+ messages in thread
From: Johannes Thumshirn @ 2017-02-15 8:49 UTC (permalink / raw)
To: Raghava Aditya Renukunta, jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, dan.carpenter
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote:
> When the SMART family of controller panic (KERNEL_PANIC) , they do not
^ "controllers"? Also, there is an extra space before the comma.
> honor IOP resets. So better to skip it and directly perform a IWBR reset.
>
> Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
> Reviewed-by: David Carroll <David.Carroll@microsemi.com>
> ---
> drivers/scsi/aacraid/src.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/scsi/aacraid/src.c b/drivers/scsi/aacraid/src.c
> index b23c818..5bb9865 100644
> --- a/drivers/scsi/aacraid/src.c
> +++ b/drivers/scsi/aacraid/src.c
> @@ -714,6 +714,12 @@ static int aac_src_restart_adapter(struct aac_dev *dev, int bled, u8 reset_type)
> pr_err("%s%d: adapter kernel panic'd %x.\n",
> dev->name, dev->id, bled);
>
> + /*
> + * WHen there is a BlinkLED, IOP_RESET has not effect
^ When
> + */
> + if (bled >= 2 && dev->sa_firmware && (reset_type & HW_IOP_RESET))
^ No need for the parentheses around (reset_type & HW_IOP_RESET)
> + reset_type &= ~HW_IOP_RESET;
> +
> dev->a_ops.adapter_enable_int = aac_src_disable_interrupt;
>
> switch (reset_type) {
>
Apart from that,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
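The check discussed above can be modelled in userspace to see why the inner parentheses are optional: `&` binds tighter than `&&` in C. This is only a sketch. The bit values below and the helper name `pick_reset_type` are assumptions for illustration, and the driver's own header is authoritative:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag values, for illustration only. */
#define HW_IOP_RESET  0x01u
#define HW_SOFT_RESET 0x02u

/*
 * Return the reset types that are still worth attempting. When SMART
 * (sa_firmware) controllers report a BlinkLED code of 2 or more, the
 * IOP reset bit is dropped so that only the remaining types are tried.
 */
static uint8_t pick_reset_type(int bled, bool sa_firmware, uint8_t reset_type)
{
	/* Only the two flags modelled here are expected. */
	assert((reset_type & ~(HW_IOP_RESET | HW_SOFT_RESET)) == 0);

	/* No parentheses needed: '&' has higher precedence than '&&'. */
	if (bled >= 2 && sa_firmware && reset_type & HW_IOP_RESET)
		reset_type &= (uint8_t)~HW_IOP_RESET;

	return reset_type;
}
```

Clearing a bit that is already clear is a no-op, so the last term of the condition could arguably be dropped altogether. That would be a variation on the patch, not what was posted.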
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 13/16] aacraid: Reorder Adapter status check
2017-02-14 20:44 ` [PATCH 13/16] aacraid: Reorder Adapter status check Raghava Aditya Renukunta
@ 2017-02-15 8:50 ` Johannes Thumshirn
0 siblings, 0 replies; 44+ messages in thread
From: Johannes Thumshirn @ 2017-02-15 8:50 UTC (permalink / raw)
To: Raghava Aditya Renukunta, jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, dan.carpenter
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote:
> The driver currently checks the SELF_TEST_FAILED first and then
> KERNEL_PANIC next. Under error conditions(boot code failure) both
> SELF_TEST_FAILED and KERNEL_PANIC can be set at the same time.
>
> The driver has the capability to reset the controller on an KERNEL_PANIC
> , but not on SELF_TEST_FAILED.
>
> Fixed by first checking KERNEL_PANIC and then the others.
>
> Cc: stable@vger.kernel.org
> Fixes: e8b12f0fb835223752 ([SCSI] aacraid: Add new code for PMC-Sierra's SRC base controller family)
> Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
> Reviewed-by: David Carroll <David.Carroll@microsemi.com>
> ---
Apart from the odd comma placement in the 2nd paragraph of the commit
message,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
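Why the order of the checks matters can be shown with a small userspace model. When both bits are set, whichever test comes first decides the outcome, and only the panic outcome leads to a recoverable path. The bit values, the enum, and the function name `classify_status` below are assumptions made for this sketch and are not taken from the driver:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed status bits, for illustration only. */
#define SELF_TEST_FAILED 0x00000004u
#define KERNEL_PANIC     0x00000100u

enum health_sketch {
	HEALTH_OK,
	HEALTH_PANIC,            /* recoverable: a reset can be attempted */
	HEALTH_SELF_TEST_FAILED  /* not recoverable by the driver */
};

/*
 * Classify a raw status word. KERNEL_PANIC is tested first so that a
 * status with both bits set is reported as the recoverable condition.
 */
static enum health_sketch classify_status(uint32_t status)
{
	if (status & KERNEL_PANIC)
		return HEALTH_PANIC;
	if (status & SELF_TEST_FAILED)
		return HEALTH_SELF_TEST_FAILED;
	return HEALTH_OK;
}
```

Swapping the two `if` statements would make the combined status resolve to `HEALTH_SELF_TEST_FAILED`, which matches the behaviour the commit message describes as the problem.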
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 14/16] aacraid: Save adapter fib log before an IOP reset
2017-02-14 20:44 ` [PATCH 14/16] aacraid: Save adapter fib log before an IOP reset Raghava Aditya Renukunta
@ 2017-02-15 8:53 ` Johannes Thumshirn
0 siblings, 0 replies; 44+ messages in thread
From: Johannes Thumshirn @ 2017-02-15 8:53 UTC (permalink / raw)
To: Raghava Aditya Renukunta, jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, dan.carpenter
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote:
> Currently the adapter firmware does not save outstanding I/O's log
> information when an IOP reset is triggered. This is problematic when
> trying to root cause and debug issues.
>
> Fixed by adding sync command to trigger I/O log file save in the adapter
> firmware before issuing an IOP reset.
>
> Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
> Reviewed-by: David Carroll <David.Carroll@microsemi.com>
> ---
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 15/16] aacraid: Fix a potential spinlock double unlock bug
2017-02-14 20:44 ` [PATCH 15/16] aacraid: Fix a potential spinlock double unlock bug Raghava Aditya Renukunta
@ 2017-02-15 8:54 ` Johannes Thumshirn
0 siblings, 0 replies; 44+ messages in thread
From: Johannes Thumshirn @ 2017-02-15 8:54 UTC (permalink / raw)
To: Raghava Aditya Renukunta, jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, dan.carpenter
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote:
> The driver does not unlock the reply queue spin lock after handling SMART
> adapter events. Instead it might attempt to unlock an already unlocked
> spin lock.
>
> Fixed by making sure the driver locks the spin lock before freeing it.
>
> Thank you dan for finding this issue out.
>
> Fixes: 6223a39fe6fbbeef (scsi: aacraid: Added support for hotplug)
> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
> Reviewed-by: David Carroll <David.Carroll@microsemi.com>
> ---
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 16/16] aacraid: Update driver version
2017-02-14 20:44 ` [PATCH 16/16] aacraid: Update driver version Raghava Aditya Renukunta
@ 2017-02-15 8:55 ` Johannes Thumshirn
0 siblings, 0 replies; 44+ messages in thread
From: Johannes Thumshirn @ 2017-02-15 8:55 UTC (permalink / raw)
To: Raghava Aditya Renukunta, jejb, martin.petersen, linux-scsi
Cc: David.Carroll, Gana.Sridaran, Scott.Benesh, dan.carpenter
On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote:
> Updated driver version to 50792
>
> Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
> Reviewed-by: David Carroll <David.Carroll@microsemi.com>
> ---
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
^ permalink raw reply [flat|nested] 44+ messages in thread
* RE: [PATCH 03/16] aacraid: Fix for excessive prints on EEH
2017-02-15 8:07 ` Johannes Thumshirn
@ 2017-02-15 18:06 ` Raghava Aditya Renukunta
0 siblings, 0 replies; 44+ messages in thread
From: Raghava Aditya Renukunta @ 2017-02-15 18:06 UTC (permalink / raw)
To: Johannes Thumshirn, jejb, martin.petersen, linux-scsi
Cc: Dave Carroll, Gana Sridaran, Scott Benesh, dan.carpenter
> On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote:
> > This issue showed up on a kdump debug(single CPU on powerkvm), when EEH
> > errors rendered the adapter unusable. The driver correctly detected the
> > issue and attempted to restart the controller, in doing so the driver
> > attempted to read the status registers of the controller. This triggered
> > additional eeh errors which continued for a good 6 minutes.
> >
> > Fixed by returning without waiting when EEH error is reported.
> >
> > Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
> > Reviewed-by: David Carroll <David.Carroll@microsemi.com>
> > ---
> > drivers/scsi/aacraid/commsup.c | 38 +++++++++++++++++++++++++++++++++++++-
> > 1 file changed, 37 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
> > index 56090f5..6220b47 100644
> > --- a/drivers/scsi/aacraid/commsup.c
> > +++ b/drivers/scsi/aacraid/commsup.c
> > @@ -461,6 +461,30 @@ int aac_queue_get(struct aac_dev * dev, u32 * index, u32 qid, struct hw_fib * hw
> > return 0;
> > }
>
> Please do
>
> > +#ifdef CONFIG_EEH
> > +static inline int aac_check_eeh_failure(struct aac_dev *dev)
> > +{
> > + /* Check for an EEH failure for the given
> > + * device node. Function eeh_dev_check_failure()
> > + * returns 0 if there has not been an EEH error
> > + * otherwise returns a non-zero value.
> > + *
> > + * Need to be called before any PCI operation,
> > + * i.e.,before aac_adapter_check_health()
> > + */
> > + struct eeh_dev *edev = pci_dev_to_eeh_dev(dev->pdev);
> > +
> > + if (eeh_dev_check_failure(edev)) {
> > + /* The EEH mechanisms will handle this
> > + * error and reset the device if
> > + * necessary.
> > + */
> > + return 1;
> > + }
> > + return 0;
> > +}
>
> #else
> static inline int aac_check_eeh_failure(struct aac_dev *dev)
> {
> return 0;
> }
>
> > +#endif
> > +
>
> [...]
>
> > +
> > +#if defined(CONFIG_EEH)
> > + if (aac_check_eeh_failure(dev))
> > + return -EFAULT;
> > +#endif
> > +
>
> So the #if defined() blocks become unnecessary.
Yes, I can do that.
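The pattern requested in the review (a real implementation under the config option, a no-op stub otherwise) can be tried out in plain userspace C. In this sketch the macro `HAVE_BUS_ERROR_CHECK` plays the role of `CONFIG_EEH`. It, the struct `dev_sketch`, and both function names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct dev_sketch {
	bool bus_frozen; /* hypothetical flag standing in for EEH state */
};

#ifdef HAVE_BUS_ERROR_CHECK
/* Real check: non-zero when the bus has reported a failure. */
static inline int check_bus_failure(const struct dev_sketch *dev)
{
	assert(dev != NULL);
	return dev->bus_frozen ? 1 : 0;
}
#else
/* Stub: without the feature compiled in, there is never a failure. */
static inline int check_bus_failure(const struct dev_sketch *dev)
{
	(void)dev;
	return 0;
}
#endif

/* The caller needs no #ifdef of its own; the stub compiles away. */
static int read_health(const struct dev_sketch *dev)
{
	if (check_bus_failure(dev))
		return -1; /* bail out before touching device registers */
	return 0;
}
```

Because both variants share one signature, every call site stays free of preprocessor conditionals, which is the point of the review comment.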
Regards,
Raghava Aditya
> Thanks,
> Johannes
^ permalink raw reply [flat|nested] 44+ messages in thread
* RE: [PATCH 04/16] aacraid: Prevent E3 lockup when deleting units
2017-02-15 8:20 ` Johannes Thumshirn
@ 2017-02-15 18:08 ` Raghava Aditya Renukunta
2017-02-16 7:40 ` Johannes Thumshirn
0 siblings, 1 reply; 44+ messages in thread
From: Raghava Aditya Renukunta @ 2017-02-15 18:08 UTC (permalink / raw)
To: Johannes Thumshirn, jejb, martin.petersen, linux-scsi
Cc: Dave Carroll, Gana Sridaran, Scott Benesh, dan.carpenter
> -----Original Message-----
> From: Johannes Thumshirn [mailto:jthumshirn@suse.de]
> Sent: Wednesday, February 15, 2017 12:20 AM
> To: Raghava Aditya Renukunta
> <RaghavaAditya.Renukunta@microsemi.com>; jejb@linux.vnet.ibm.com;
> martin.petersen@oracle.com; linux-scsi@vger.kernel.org
> Cc: Dave Carroll <david.carroll@microsemi.com>; Gana Sridaran
> <gana.sridaran@microsemi.com>; Scott Benesh
> <scott.benesh@microsemi.com>; dan.carpenter@oracle.com
> Subject: Re: [PATCH 04/16] aacraid: Prevent E3 lockup when deleting units
>
> On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote:
> > Arrconf management utility at times sends fibs with AdapterProcessed set
> > in its fibs. This causes the controller to panic and lockup.
> >
> > Fixed by failing the commands that have AdapterProcessed set in its flag.
> >
> > Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
> > Reviewed-by: David Carroll <David.Carroll@microsemi.com>
> > ---
> > drivers/scsi/aacraid/commsup.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> > diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
> > index 6220b47..f7a3bcb 100644
> > --- a/drivers/scsi/aacraid/commsup.c
> > +++ b/drivers/scsi/aacraid/commsup.c
> > @@ -522,6 +522,10 @@ int aac_fib_send(u16 command, struct fib *fibptr, unsigned long size,
> >
> > if (!(hw_fib->header.XferState & cpu_to_le32(HostOwned)))
> > return -EBUSY;
> > +
> > + if (hw_fib->header.XferState & cpu_to_le32(AdapterProcessed))
> > + return -EINVAL;
> > +
>
> As far as I can see the fib_xfer_state enum isn't exported as an
> official ABI, so it's a good candidate (whole of aacraid.h actually) for
> the next round of camel case removals.
>
> Anyways, this can wait:
> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
I will put it in the queue for the next set of patch submissions and chip away
at it bit by bit.
Thank you,
Raghava Aditya Renukunta
^ permalink raw reply [flat|nested] 44+ messages in thread
* RE: [PATCH 05/16] aacraid: Fix memory leak in fib init path
2017-02-15 8:31 ` Johannes Thumshirn
@ 2017-02-15 18:08 ` Raghava Aditya Renukunta
0 siblings, 0 replies; 44+ messages in thread
From: Raghava Aditya Renukunta @ 2017-02-15 18:08 UTC (permalink / raw)
To: Johannes Thumshirn, jejb, martin.petersen, linux-scsi
Cc: Dave Carroll, Gana Sridaran, Scott Benesh, dan.carpenter
> -----Original Message-----
> From: Johannes Thumshirn [mailto:jthumshirn@suse.de]
> Sent: Wednesday, February 15, 2017 12:32 AM
> To: Raghava Aditya Renukunta
> <RaghavaAditya.Renukunta@microsemi.com>; jejb@linux.vnet.ibm.com;
> martin.petersen@oracle.com; linux-scsi@vger.kernel.org
> Cc: Dave Carroll <david.carroll@microsemi.com>; Gana Sridaran
> <gana.sridaran@microsemi.com>; Scott Benesh
> <scott.benesh@microsemi.com>; dan.carpenter@oracle.com
> Subject: Re: [PATCH 05/16] aacraid: Fix memory leak in fib init path
>
> On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote:
> > aac_fib_map_free frees misaligned fib dma memory, additionally it does not
> > free up the whole memory.
> >
> > Fixed by changing the code to free up the correct and full memory
> > allocation.
> >
> > Cc: stable@vger.kernel.org
> > Fixes: e8b12f0fb835223 ([SCSI] aacraid: Add new code for PMC-Sierra's SRC based controller family)
> > Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
> > Reviewed-by: David Carroll <David.Carroll@microsemi.com>
> > ---
> > drivers/scsi/aacraid/commsup.c | 20 +++++++++-----------
> > 1 file changed, 9 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
> > index f7a3bcb..863c98d 100644
> > --- a/drivers/scsi/aacraid/commsup.c
> > +++ b/drivers/scsi/aacraid/commsup.c
> > @@ -97,8 +97,8 @@ void aac_fib_map_free(struct aac_dev *dev)
> > {
> > if (dev->hw_fib_va && dev->max_cmd_size) {
> > pci_free_consistent(dev->pdev,
> > - (dev->max_cmd_size *
> > - (dev->scsi_host_ptr->can_queue + AAC_NUM_MGT_FIB)),
> > + (dev->max_cmd_size + sizeof(struct aac_fib_xporthdr))
> > + * (dev->scsi_host_ptr->can_queue + AAC_NUM_MGT_FIB) + 31,
> > dev->hw_fib_va, dev->hw_fib_pa);
>
> Can you please do something like:
>
> size_t alloc_size;
> int numtags;
>
> numtags = dev->scsi_host_ptr->can_queue + AAC_NUM_MGT_FIB;
> alloc_size = (dev->max_cmd_size + sizeof(struct aac_fib_xporthdr)) *
> numtags + 31;
> pci_free_consistent(dev->pdev, alloc_size, dev->hw_fib_va,
> dev->hw_fib_pa);
>
> And please indent correctly. If the indentation doesn't work correctly
> because you hit the 80 chars limit, that's a sign something should be
> reconsidered.
Yes, I will rework this.
Regards,
Raghava Aditya
> Thanks,
> Johannes
^ permalink raw reply [flat|nested] 44+ messages in thread
* RE: [PATCH 06/16] aacraid: Added sysfs for driver version
2017-02-15 8:32 ` Johannes Thumshirn
@ 2017-02-15 18:12 ` Raghava Aditya Renukunta
2017-02-16 7:43 ` Johannes Thumshirn
0 siblings, 1 reply; 44+ messages in thread
From: Raghava Aditya Renukunta @ 2017-02-15 18:12 UTC (permalink / raw)
To: Johannes Thumshirn, jejb, martin.petersen, linux-scsi
Cc: Dave Carroll, Gana Sridaran, Scott Benesh, dan.carpenter
> -----Original Message-----
> From: Johannes Thumshirn [mailto:jthumshirn@suse.de]
> Sent: Wednesday, February 15, 2017 12:33 AM
> To: Raghava Aditya Renukunta
> <RaghavaAditya.Renukunta@microsemi.com>; jejb@linux.vnet.ibm.com;
> martin.petersen@oracle.com; linux-scsi@vger.kernel.org
> Cc: Dave Carroll <david.carroll@microsemi.com>; Gana Sridaran
> <gana.sridaran@microsemi.com>; Scott Benesh
> <scott.benesh@microsemi.com>; dan.carpenter@oracle.com
> Subject: Re: [PATCH 06/16] aacraid: Added sysfs for driver version
>
> On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote:
> > Added support to retrieve driver version from a new sysfs variable called
> > driver_version. It makes it easier for the user to figure out the driver
> > version that is currently running.
> >
> > Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
> > Reviewed-by: David Carroll <David.Carroll@microsemi.com>
> > ---
>
> Can't this be retrieved via modinfo?
I agree, but it makes it easier to get the driver version when I am developing
and I don't know which driver version is currently loaded.
In addition, our internal test automation suites use this information, as
opposed to modinfo, to get the version of the running driver.
Regards,
Raghava Aditya
^ permalink raw reply [flat|nested] 44+ messages in thread
* RE: [PATCH 12/16] aacraid: Skip IOP reset on controller panic(SMART Family)
2017-02-15 8:49 ` Johannes Thumshirn
@ 2017-02-15 18:14 ` Raghava Aditya Renukunta
0 siblings, 0 replies; 44+ messages in thread
From: Raghava Aditya Renukunta @ 2017-02-15 18:14 UTC (permalink / raw)
To: Johannes Thumshirn, jejb, martin.petersen, linux-scsi
Cc: Dave Carroll, Gana Sridaran, Scott Benesh, dan.carpenter
> -----Original Message-----
> From: Johannes Thumshirn [mailto:jthumshirn@suse.de]
> Sent: Wednesday, February 15, 2017 12:49 AM
> To: Raghava Aditya Renukunta
> <RaghavaAditya.Renukunta@microsemi.com>; jejb@linux.vnet.ibm.com;
> martin.petersen@oracle.com; linux-scsi@vger.kernel.org
> Cc: Dave Carroll <david.carroll@microsemi.com>; Gana Sridaran
> <gana.sridaran@microsemi.com>; Scott Benesh
> <scott.benesh@microsemi.com>; dan.carpenter@oracle.com
> Subject: Re: [PATCH 12/16] aacraid: Skip IOP reset on controller panic(SMART
> Family)
>
> On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote:
> > When the SMART family of controller panic (KERNEL_PANIC) , they do not
> ^ "controllers"? Also, there is an extra space before the comma.
> > honor IOP resets. So better to skip it and directly perform a IWBR reset.
> >
> > Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
> > Reviewed-by: David Carroll <David.Carroll@microsemi.com>
> > ---
> > drivers/scsi/aacraid/src.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/drivers/scsi/aacraid/src.c b/drivers/scsi/aacraid/src.c
> > index b23c818..5bb9865 100644
> > --- a/drivers/scsi/aacraid/src.c
> > +++ b/drivers/scsi/aacraid/src.c
> > @@ -714,6 +714,12 @@ static int aac_src_restart_adapter(struct aac_dev *dev, int bled, u8 reset_type)
> > pr_err("%s%d: adapter kernel panic'd %x.\n",
> > dev->name, dev->id, bled);
> >
> > + /*
> > + * WHen there is a BlinkLED, IOP_RESET has not effect
> ^ When
> > + */
> > + if (bled >= 2 && dev->sa_firmware && (reset_type & HW_IOP_RESET))
> ^ No need for the parentheses around (reset_type & HW_IOP_RESET)
> > + reset_type &= ~HW_IOP_RESET;
> > +
> > dev->a_ops.adapter_enable_int = aac_src_disable_interrupt;
> >
> > switch (reset_type) {
> >
>
> Apart from that,
> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Yes, I will fix this in the next patch set.
Regards,
Raghava Aditya
^ permalink raw reply [flat|nested] 44+ messages in thread
* RE: [PATCH 10/16] aacraid: Terminate kthread on controller fw assert
2017-02-15 8:44 ` Johannes Thumshirn
@ 2017-02-15 22:22 ` Raghava Aditya Renukunta
2017-02-16 9:31 ` Johannes Thumshirn
0 siblings, 1 reply; 44+ messages in thread
From: Raghava Aditya Renukunta @ 2017-02-15 22:22 UTC (permalink / raw)
To: Johannes Thumshirn, jejb, martin.petersen, linux-scsi
Cc: Dave Carroll, Gana Sridaran, Scott Benesh, dan.carpenter
> -----Original Message-----
> From: Johannes Thumshirn [mailto:jthumshirn@suse.de]
> Sent: Wednesday, February 15, 2017 12:44 AM
> To: Raghava Aditya Renukunta
> <RaghavaAditya.Renukunta@microsemi.com>; jejb@linux.vnet.ibm.com;
> martin.petersen@oracle.com; linux-scsi@vger.kernel.org
> Cc: Dave Carroll <david.carroll@microsemi.com>; Gana Sridaran
> <gana.sridaran@microsemi.com>; Scott Benesh
> <scott.benesh@microsemi.com>; dan.carpenter@oracle.com
> Subject: Re: [PATCH 10/16] aacraid: Terminate kthread on controller fw
> assert
>
> On 02/14/2017 09:44 PM, Raghava Aditya Renukunta wrote:
> > When the command thread performs a periodic time sync and the firmware is
> > going through an assert during that time, the command thread waits for the
> > response that would never arrive. The SCSI Mid layer's error handler would
> > eventually reset the controller, but the eh_handler just issues a
> > "thread stop" to the command thread which is stuck on a semaphore and the
> > eh_thread would in turn goes to sleep waiting for the command_thread to
> > respond to the stop which never happens.
> >
> > Fixed by allowing SIGTERM for the command thread, and during the eh_reset
> > call, sends termination signal to the command thread. As a follow-up, the
> > eh_reset handler would take care of the controller reset.
> >
> > Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
> > Reviewed-by: David Carroll <David.Carroll@microsemi.com>
> > ---
>
> This looks a bit scary. Can't the kthread be converted to a workqueue so
> we could call cancel_work_sync()?
Could you please elaborate on the reasons why this fix is scary?
I understand that killing a thread is not standard (for any reason),
and if there are other nuanced issues I would like to understand them.
Thank you,
Raghava Aditya
* Re: [PATCH 04/16] aacraid: Prevent E3 lockup when deleting units
2017-02-15 18:08 ` Raghava Aditya Renukunta
@ 2017-02-16 7:40 ` Johannes Thumshirn
0 siblings, 0 replies; 44+ messages in thread
From: Johannes Thumshirn @ 2017-02-16 7:40 UTC (permalink / raw)
To: Raghava Aditya Renukunta, jejb, martin.petersen, linux-scsi
Cc: Dave Carroll, Gana Sridaran, Scott Benesh, dan.carpenter
On 02/15/2017 07:08 PM, Raghava Aditya Renukunta wrote:
>
> I will put it in the queue for the next set of patch submissions and chip away
> at it bit by bit.
Yay, thanks.
--
Johannes Thumshirn Storage
jthumshirn@suse.de +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
* Re: [PATCH 06/16] aacraid: Added sysfs for driver version
2017-02-15 18:12 ` Raghava Aditya Renukunta
@ 2017-02-16 7:43 ` Johannes Thumshirn
2017-02-16 19:38 ` Raghava Aditya Renukunta
0 siblings, 1 reply; 44+ messages in thread
From: Johannes Thumshirn @ 2017-02-16 7:43 UTC (permalink / raw)
To: Raghava Aditya Renukunta, jejb, martin.petersen, linux-scsi
Cc: Dave Carroll, Gana Sridaran, Scott Benesh, dan.carpenter
On 02/15/2017 07:12 PM, Raghava Aditya Renukunta wrote:
>
> I agree, but it makes it easier to get the driver version when I am developing
> and I don't know which driver version is currently loaded.
>
> In addition internally our test automation suites use this information as
> opposed to modinfo to get the driver version running.
OK then.
Speaking of test automation, is there something you may be able to share?
Thanks,
Johannes
--
Johannes Thumshirn Storage
jthumshirn@suse.de +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
* Re: [PATCH 10/16] aacraid: Terminate kthread on controller fw assert
2017-02-15 22:22 ` Raghava Aditya Renukunta
@ 2017-02-16 9:31 ` Johannes Thumshirn
2017-02-16 19:53 ` Raghava Aditya Renukunta
0 siblings, 1 reply; 44+ messages in thread
From: Johannes Thumshirn @ 2017-02-16 9:31 UTC (permalink / raw)
To: Raghava Aditya Renukunta, jejb, martin.petersen, linux-scsi
Cc: Dave Carroll, Gana Sridaran, Scott Benesh, dan.carpenter
On 02/15/2017 11:22 PM, Raghava Aditya Renukunta wrote:
>>
>> This looks a bit scary. Can't the kthread be converted to a workqueue so
>> we could call cancel_work_sync()?
>
> Could you please elaborate on the reasons why this fix is scary?
> I understand that killing a thread is not standard (for any reason),
> and if there are other nuanced issues I would like to understand them.
I'm actually concerned that this could have all kinds of side effects.
But this is just a gut feeling. I see some drm drivers are doing the
same, so it might be possible, but IMHO this is not a good design.
And IIRC kthreads do have more downsides (e.g. CPU hotplugging and
issues with kernel live patching).
I think most kthreads (I haven't looked too closely at the aacraid kthread, I
must admit, but I'll be doing so) can be converted to either workqueues
or timers (or a combination of both).
Thanks,
Johannes
--
Johannes Thumshirn Storage
jthumshirn@suse.de +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
* RE: [PATCH 06/16] aacraid: Added sysfs for driver version
2017-02-16 7:43 ` Johannes Thumshirn
@ 2017-02-16 19:38 ` Raghava Aditya Renukunta
0 siblings, 0 replies; 44+ messages in thread
From: Raghava Aditya Renukunta @ 2017-02-16 19:38 UTC (permalink / raw)
To: Johannes Thumshirn, jejb, martin.petersen, linux-scsi
Cc: Dave Carroll, Gana Sridaran, Scott Benesh, dan.carpenter
> -----Original Message-----
> From: Johannes Thumshirn [mailto:jthumshirn@suse.de]
> Sent: Wednesday, February 15, 2017 11:43 PM
> To: Raghava Aditya Renukunta
> <RaghavaAditya.Renukunta@microsemi.com>; jejb@linux.vnet.ibm.com;
> martin.petersen@oracle.com; linux-scsi@vger.kernel.org
> Cc: Dave Carroll <david.carroll@microsemi.com>; Gana Sridaran
> <gana.sridaran@microsemi.com>; Scott Benesh
> <scott.benesh@microsemi.com>; dan.carpenter@oracle.com
> Subject: Re: [PATCH 06/16] aacraid: Added sysfs for driver version
>
> On 02/15/2017 07:12 PM, Raghava Aditya Renukunta wrote:
> >
> > I agree, but it makes it easier to get the driver version when I am developing
> > and I don't know which driver version is currently loaded.
> >
> > In addition internally our test automation suites use this information as
> > opposed to modinfo to get the driver version running.
>
> OK then.
>
> Speaking of test automation, is there something you may be able to share?
>
> Thanks,
> Johannes
Well, we are still in the very initial phases, so nothing concrete yet.
The plan is to compile the source and install it on various kernels for testing,
and modinfo will not be able to retrieve the version info.
Additionally, it seems to be a cleaner method when compared to modinfo.
Regards,
Raghava Aditya
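Illustrative aside, not part of the original message: the reply above prefers
reading the driver version from sysfs over modinfo. A minimal userspace sketch
of that kind of check in C might look like the following. The helper name
read_attr and the attribute path in the usage note are assumptions; the thread
does not state where the attribute is exposed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of a sysfs-style text attribute into buf and strip
 * the trailing newline. Returns 0 on success, -1 if the file cannot be
 * opened or contains no data. read_attr is a hypothetical helper name.
 */
static int read_attr(const char *path, char *buf, size_t len)
{
	FILE *f;

	/* Caller must pass a usable path and a non-empty buffer. */
	assert(path != NULL && buf != NULL && len > 0);

	f = fopen(path, "r");
	if (!f)
		return -1;

	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	/* sysfs attributes normally end in '\n'; cut the string there. */
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}
```

A test harness could call read_attr() with whatever path the driver exposes,
for example a hypothetical /sys/class/scsi_host/host0/driver_version, and
compare the result against the version of the build under test.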
* RE: [PATCH 10/16] aacraid: Terminate kthread on controller fw assert
2017-02-16 9:31 ` Johannes Thumshirn
@ 2017-02-16 19:53 ` Raghava Aditya Renukunta
0 siblings, 0 replies; 44+ messages in thread
From: Raghava Aditya Renukunta @ 2017-02-16 19:53 UTC (permalink / raw)
To: Johannes Thumshirn, jejb, martin.petersen, linux-scsi
Cc: Dave Carroll, Gana Sridaran, Scott Benesh, dan.carpenter
> -----Original Message-----
> From: Johannes Thumshirn [mailto:jthumshirn@suse.de]
> Sent: Thursday, February 16, 2017 1:31 AM
> To: Raghava Aditya Renukunta
> <RaghavaAditya.Renukunta@microsemi.com>; jejb@linux.vnet.ibm.com;
> martin.petersen@oracle.com; linux-scsi@vger.kernel.org
> Cc: Dave Carroll <david.carroll@microsemi.com>; Gana Sridaran
> <gana.sridaran@microsemi.com>; Scott Benesh
> <scott.benesh@microsemi.com>; dan.carpenter@oracle.com
> Subject: Re: [PATCH 10/16] aacraid: Terminate kthread on controller fw
> assert
>
> On 02/15/2017 11:22 PM, Raghava Aditya Renukunta wrote:
> >>
> >> This looks a bit scary. Can't the kthread be converted to a workqueue so
> >> we could call cancel_work_sync()?
> >
> > Could you please elaborate on the reasons why this fix is scary?
> > I understand that killing a thread is not standard (for any reason),
> > and if there are other nuanced issues I would like to understand them.
>
> I'm actually concerned that this could have all kinds of side effects.
> But this is just a gut feeling. I see some drm drivers are doing the
> same, so it might be possible, but IMHO this is not a good design.
>
> And IIRC kthreads do have more downsides (e.g. CPU hotplugging and
> issues with kernel live patching).
>
> I think most kthreads (I haven't looked too closely at the aacraid kthread, I
> must admit, but I'll be doing so) can be converted to either workqueues
> or timers (or a combination of both).
>
> Thanks,
> Johannes
Makes sense, and I agree. With that being said, I will withdraw this patch
and resend it in a different patch series once we rework aac_command_thread
into a workqueue/timers.
Regards,
Raghava Aditya
end of thread, last message 2017-02-16 19:54 UTC
Thread overview: 44+ messages
2017-02-14 20:44 [PATCH 00/16] aacraid: Fixes and enhancements for arc family Raghava Aditya Renukunta
2017-02-14 20:44 ` [PATCH 01/16] aacraid: Fix camel case Raghava Aditya Renukunta
2017-02-15 8:02 ` Johannes Thumshirn
2017-02-14 20:44 ` [PATCH 02/16] aacraid: Use correct channel number for raw srb Raghava Aditya Renukunta
2017-02-15 8:03 ` Johannes Thumshirn
2017-02-14 20:44 ` [PATCH 03/16] aacraid: Fix for excessive prints on EEH Raghava Aditya Renukunta
2017-02-15 8:07 ` Johannes Thumshirn
2017-02-15 18:06 ` Raghava Aditya Renukunta
2017-02-14 20:44 ` [PATCH 04/16] aacraid: Prevent E3 lockup when deleting units Raghava Aditya Renukunta
2017-02-15 8:20 ` Johannes Thumshirn
2017-02-15 18:08 ` Raghava Aditya Renukunta
2017-02-16 7:40 ` Johannes Thumshirn
2017-02-14 20:44 ` [PATCH 05/16] aacraid: Fix memory leak in fib init path Raghava Aditya Renukunta
2017-02-15 8:31 ` Johannes Thumshirn
2017-02-15 18:08 ` Raghava Aditya Renukunta
2017-02-14 20:44 ` [PATCH 06/16] aacraid: Added sysfs for driver version Raghava Aditya Renukunta
2017-02-15 8:32 ` Johannes Thumshirn
2017-02-15 18:12 ` Raghava Aditya Renukunta
2017-02-16 7:43 ` Johannes Thumshirn
2017-02-16 19:38 ` Raghava Aditya Renukunta
2017-02-14 20:44 ` [PATCH 07/16] aacraid: Fix sync fibs time out on controller reset Raghava Aditya Renukunta
2017-02-15 8:34 ` Johannes Thumshirn
2017-02-14 20:44 ` [PATCH 08/16] aacraid: Skip wellness sync on controller failure Raghava Aditya Renukunta
2017-02-15 8:35 ` Johannes Thumshirn
2017-02-14 20:44 ` [PATCH 09/16] aacraid: Reload offlined drives after controller reset Raghava Aditya Renukunta
2017-02-15 8:38 ` Johannes Thumshirn
2017-02-14 20:44 ` [PATCH 10/16] aacraid: Terminate kthread on controller fw assert Raghava Aditya Renukunta
2017-02-15 8:44 ` Johannes Thumshirn
2017-02-15 22:22 ` Raghava Aditya Renukunta
2017-02-16 9:31 ` Johannes Thumshirn
2017-02-16 19:53 ` Raghava Aditya Renukunta
2017-02-14 20:44 ` [PATCH 11/16] aacraid: Decrease adapter health check interval Raghava Aditya Renukunta
2017-02-15 8:45 ` Johannes Thumshirn
2017-02-14 20:44 ` [PATCH 12/16] aacraid: Skip IOP reset on controller panic(SMART Family) Raghava Aditya Renukunta
2017-02-15 8:49 ` Johannes Thumshirn
2017-02-15 18:14 ` Raghava Aditya Renukunta
2017-02-14 20:44 ` [PATCH 13/16] aacraid: Reorder Adapter status check Raghava Aditya Renukunta
2017-02-15 8:50 ` Johannes Thumshirn
2017-02-14 20:44 ` [PATCH 14/16] aacraid: Save adapter fib log before an IOP reset Raghava Aditya Renukunta
2017-02-15 8:53 ` Johannes Thumshirn
2017-02-14 20:44 ` [PATCH 15/16] aacraid: Fix a potential spinlock double unlock bug Raghava Aditya Renukunta
2017-02-15 8:54 ` Johannes Thumshirn
2017-02-14 20:44 ` [PATCH 16/16] aacraid: Update driver version Raghava Aditya Renukunta
2017-02-15 8:55 ` Johannes Thumshirn