linux-kernel.vger.kernel.org archive mirror
* Re: [PATCH] mpt2sas: mpt3sas: Fix memory corruption during initialization
@ 2015-05-04 15:05 Sreekanth Reddy
  2015-05-05 15:35 ` Tomas Henzl
  2015-05-06 18:48 ` Calvin Owens
  0 siblings, 2 replies; 52+ messages in thread
From: Sreekanth Reddy @ 2015-05-04 15:05 UTC
  To: calvinowens
  Cc: martin.petersen, linux-scsi, jejb, JBottomley, Sathya.Prakash,
	chaitra.basappa, linux-kernel, hch, Sreekanth Reddy

I applied this patch to the latest upstream mpt3sas driver, then compiled and loaded the driver.
In the driver logs I didn't see any of the attached drives being added to the OS, and 'fdisk -l' doesn't list
 the drives that are actually attached to the HBA either.

When I debugged this issue, I saw that in '_scsih_target_alloc' the driver searches the lists
 'sas_device_init_list' and 'sas_device_list' for the sas_device, based on the device's SAS address,
 using the function mpt3sas_scsih_sas_device_find_by_sas_address(). Since the device is no longer on
 'sas_device_init_list' (as it has been moved to the head list), the driver exits from this function
 without updating the required device addition information.

To solve the original problem (i.e. memory corruption), I have attached a patch here.
 In this patch I have added an atomic flag, is_on_sas_device_init_list, to the _sas_device structure,
 and I followed the algorithm below.

1. Whenever a device is added to sas_device_init_list, the driver sets this device's atomic flag to one.

2. During the addition of this device to the SCSI mid layer (SML):
        a. If the device is successfully added to the OS, the driver moves it from sas_device_init_list to sas_device_list and resets the flag to zero.
        b. If the device fails to register with the SCSI mid layer, the driver also resets the flag to zero in the function _scsih_sas_device_remove, removes the device entry from sas_device_init_list, and frees the device structure.

3. When a device is removed, the driver receives a target not responding event, and in the function _scsih_device_remove_by_handle:
        a. The driver checks whether the addition of discovered devices to the SML is currently running.
               i. If the addition (or registration) of discovered devices to the SML is running, the driver checks whether the device is on sas_device_init_list by reading the atomic flag.
                    If it is on sas_device_init_list, the driver ignores this device removal event (since the device's registration with the SML will fail and the device is removed in the function _scsih_sas_device_remove, as mentioned in step 2).
              ii. If the device is not on sas_device_init_list, or the addition (or registration) of discovered devices to the SML has already completed, the driver removes the device entry from sas_device_list and removes the device structure in this function.

4. If the device removal event is received after the device structure has been freed due to a failed registration with the SML, then in the function _scsih_device_remove_by_handle the driver won't find this device on sas_device_list or on sas_device_init_list, so it ignores the device removal event.
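
As a rough illustration of the four steps above, here is a minimal user-space sketch in C. It is only a model under stated assumptions: struct model_device, struct model_adapter and the model_* functions are hypothetical names that do not exist in the driver, the lists are reduced to an enum, and locking is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are driver API. */
enum which_list { ON_NONE, ON_INIT_LIST, ON_DEVICE_LIST };

struct model_device {
	unsigned short handle;
	enum which_list where;	/* which list currently holds the device */
	bool on_init_list;	/* models is_on_sas_device_init_list */
};

struct model_adapter {
	bool addition_running;	/* models discovered_device_addition_on */
};

enum removal_result { REMOVAL_IGNORED, REMOVAL_DONE };

/* Step 1: discovery puts the device on the init list and sets the flag. */
static void model_init_add(struct model_device *dev)
{
	dev->where = ON_INIT_LIST;
	dev->on_init_list = true;
}

/*
 * Step 2: registration with the mid layer. On success the device moves
 * to the device list; on failure it leaves both lists (the real driver
 * would also free it). The flag is cleared in both cases.
 */
static void model_register(struct model_device *dev, bool registered)
{
	assert(dev->where == ON_INIT_LIST);
	dev->on_init_list = false;
	dev->where = registered ? ON_DEVICE_LIST : ON_NONE;
}

/* Steps 3 and 4: handling of a "target not responding" event. */
static enum removal_result
model_remove_event(const struct model_adapter *ioc, struct model_device *dev)
{
	if (dev->where == ON_NONE)
		return REMOVAL_IGNORED;	/* step 4: device not found */
	if (ioc->addition_running && dev->on_init_list)
		return REMOVAL_IGNORED;	/* step 3.a.i: probe loop owns it */
	dev->where = ON_NONE;		/* step 3.a.ii: remove it here */
	dev->on_init_list = false;
	return REMOVAL_DONE;
}
```

In this model, a removal event that arrives while the addition is running and the flag is set is ignored, the same event after a successful registration removes the device, and a second event for the same device is ignored again.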

Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
---
 drivers/scsi/mpt2sas/mpt2sas_base.h  |  2 ++
 drivers/scsi/mpt2sas/mpt2sas_scsih.c | 45 +++++++++++++++++++++++++++---------
 drivers/scsi/mpt3sas/mpt3sas_base.h  |  2 ++
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 43 ++++++++++++++++++++++++++--------
 4 files changed, 71 insertions(+), 21 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
index caff8d1..1aa10d2 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.h
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
@@ -376,6 +376,7 @@ struct _sas_device {
 	u8	phy;
 	u8	responding;
 	u8	pfa_led_on;
+	atomic_t is_on_sas_device_init_list;
 };
 
 /**
@@ -833,6 +834,7 @@ struct MPT2SAS_ADAPTER {
 	u8		broadcast_aen_busy;
 	u16		broadcast_aen_pending;
 	u8		shost_recovery;
+	u8		discovered_device_addition_on;
 
 	struct mutex	reset_in_progress_mutex;
 	spinlock_t 	ioc_reset_in_progress_lock;
diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 3f26147..2a61286 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -590,13 +590,20 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
     struct _sas_device *sas_device)
 {
 	unsigned long flags;
+	struct _sas_device *same_sas_device;
 
 	if (!sas_device)
 		return;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	list_del(&sas_device->list);
-	kfree(sas_device);
+	same_sas_device = _scsih_sas_device_find_by_handle(ioc,
+						sas_device->handle);
+	if (same_sas_device) {
+		list_del(&same_sas_device->list);
+		if (atomic_read(&sas_device->is_on_sas_device_init_list))
+			atomic_set(&sas_device->is_on_sas_device_init_list, 0);
+		kfree(same_sas_device);
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 }
 
@@ -658,6 +664,7 @@ _scsih_sas_device_init_add(struct MPT2SAS_ADAPTER *ioc,
 	    "(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
 	    sas_device->handle, (unsigned long long)sas_device->sas_address));
 
+	atomic_set(&sas_device->is_on_sas_device_init_list, 1);
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	list_add_tail(&sas_device->list, &ioc->sas_device_init_list);
 	_scsih_determine_boot_device(ioc, sas_device, 0);
@@ -5364,8 +5371,14 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
-	if (sas_device)
-		list_del(&sas_device->list);
+	if (sas_device) {
+		if (ioc->discovered_device_addition_on &&
+		    atomic_read(&sas_device->is_on_sas_device_init_list)) {
+			spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+			return;
+		} else
+			list_del(&sas_device->list);
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	if (sas_device)
 		_scsih_remove_device(ioc, sas_device);
@@ -5391,8 +5404,14 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
 	    sas_address);
-	if (sas_device)
-		list_del(&sas_device->list);
+	if (sas_device) {
+		if (ioc->discovered_device_addition_on &&
+		    atomic_read(&sas_device->is_on_sas_device_init_list)) {
+			spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+			return;
+		} else
+			list_del(&sas_device->list);
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	if (sas_device)
 		_scsih_remove_device(ioc, sas_device);
@@ -7978,32 +7997,36 @@ _scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
 	struct _sas_device *sas_device, *next;
 	unsigned long flags;
 
+	ioc->discovered_device_addition_on = 1;
 	/* SAS Device List */
 	list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
 	    list) {
 
 		if (ioc->hide_drives)
 			continue;
-
+
 		if (!mpt2sas_transport_port_add(ioc, sas_device->handle,
 		    sas_device->sas_address_parent)) {
-			list_del(&sas_device->list);
-			kfree(sas_device);
+			mpt2sas_transport_port_remove(ioc,
+					sas_device->sas_address,
+					sas_device->sas_address_parent);
+			_scsih_sas_device_remove(ioc, sas_device);
 			continue;
 		} else if (!sas_device->starget) {
 			if (!ioc->is_driver_loading) {
 				mpt2sas_transport_port_remove(ioc,
 					sas_device->sas_address,
 					sas_device->sas_address_parent);
-				list_del(&sas_device->list);
-				kfree(sas_device);
+				_scsih_sas_device_remove(ioc, sas_device);
 				continue;
 			}
 		}
 		spin_lock_irqsave(&ioc->sas_device_lock, flags);
 		list_move_tail(&sas_device->list, &ioc->sas_device_list);
+		atomic_dec(&sas_device->is_on_sas_device_init_list);
 		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	}
+	ioc->discovered_device_addition_on = 0;
 }
 
 /**
diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
index afa8816..6188490 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
@@ -315,6 +315,7 @@ struct _sas_device {
 	u8	responding;
 	u8	fast_path;
 	u8	pfa_led_on;
+	atomic_t is_on_sas_device_init_list;
 };
 
 /**
@@ -766,6 +767,7 @@ struct MPT3SAS_ADAPTER {
 	u8		broadcast_aen_busy;
 	u16		broadcast_aen_pending;
 	u8		shost_recovery;
+	u8		discovered_device_addition_on;
 
 	struct mutex	reset_in_progress_mutex;
 	spinlock_t	ioc_reset_in_progress_lock;
diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 5a97e32..53cc9ea 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -582,13 +582,20 @@ _scsih_sas_device_remove(struct MPT3SAS_ADAPTER *ioc,
 	struct _sas_device *sas_device)
 {
 	unsigned long flags;
+	struct _sas_device *same_sas_device;
 
 	if (!sas_device)
 		return;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	list_del(&sas_device->list);
-	kfree(sas_device);
+	same_sas_device = _scsih_sas_device_find_by_handle(ioc,
+						sas_device->handle);
+	if (same_sas_device) {
+		list_del(&same_sas_device->list);
+		if (atomic_read(&sas_device->is_on_sas_device_init_list))
+			atomic_set(&sas_device->is_on_sas_device_init_list, 0);
+		kfree(same_sas_device);
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 }
 
@@ -610,8 +616,14 @@ _scsih_device_remove_by_handle(struct MPT3SAS_ADAPTER *ioc, u16 handle)
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
-	if (sas_device)
-		list_del(&sas_device->list);
+	if (sas_device) {
+		if (ioc->discovered_device_addition_on &&
+		    atomic_read(&sas_device->is_on_sas_device_init_list)) {
+			spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+			return;
+		} else
+			list_del(&sas_device->list);
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	if (sas_device)
 		_scsih_remove_device(ioc, sas_device);
@@ -637,8 +649,14 @@ mpt3sas_device_remove_by_sas_address(struct MPT3SAS_ADAPTER *ioc,
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	sas_device = mpt3sas_scsih_sas_device_find_by_sas_address(ioc,
 	    sas_address);
-	if (sas_device)
-		list_del(&sas_device->list);
+	if (sas_device) {
+		if (ioc->discovered_device_addition_on &&
+		    atomic_read(&sas_device->is_on_sas_device_init_list)) {
+			spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+			return;
+		} else
+			list_del(&sas_device->list);
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	if (sas_device)
 		_scsih_remove_device(ioc, sas_device);
@@ -663,6 +681,7 @@ _scsih_sas_device_add(struct MPT3SAS_ADAPTER *ioc,
 		ioc->name, __func__, sas_device->handle,
 		(unsigned long long)sas_device->sas_address));
 
+	atomic_set(&sas_device->is_on_sas_device_init_list, 1);
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	list_add_tail(&sas_device->list, &ioc->sas_device_list);
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
@@ -7610,14 +7629,17 @@ _scsih_probe_sas(struct MPT3SAS_ADAPTER *ioc)
 	struct _sas_device *sas_device, *next;
 	unsigned long flags;
 
+	ioc->discovered_device_addition_on = 1;
 	/* SAS Device List */
 	list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
 	    list) {
 
 		if (!mpt3sas_transport_port_add(ioc, sas_device->handle,
 		    sas_device->sas_address_parent)) {
-			list_del(&sas_device->list);
-			kfree(sas_device);
+			mpt3sas_transport_port_remove(ioc,
+					sas_device->sas_address,
+					sas_device->sas_address_parent);
+			_scsih_sas_device_remove(ioc, sas_device);
 			continue;
 		} else if (!sas_device->starget) {
 			/*
@@ -7630,16 +7652,17 @@ _scsih_probe_sas(struct MPT3SAS_ADAPTER *ioc)
 				mpt3sas_transport_port_remove(ioc,
 				    sas_device->sas_address,
 				    sas_device->sas_address_parent);
-				list_del(&sas_device->list);
-				kfree(sas_device);
+				_scsih_sas_device_remove(ioc, sas_device);
 				continue;
 			}
 		}
 
 		spin_lock_irqsave(&ioc->sas_device_lock, flags);
 		list_move_tail(&sas_device->list, &ioc->sas_device_list);
+		atomic_dec(&sas_device->is_on_sas_device_init_list);
 		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	}
+	ioc->discovered_device_addition_on = 0;
 }
 
 /**
-- 
2.0.2



* Re: [PATCH] mpt2sas: mpt3sas: Fix memory corruption during initialization
  2015-05-04 15:05 [PATCH] mpt2sas: mpt3sas: Fix memory corruption during initialization Sreekanth Reddy
@ 2015-05-05 15:35 ` Tomas Henzl
  2015-05-12  9:38   ` Sreekanth Reddy
  2015-05-06 18:48 ` Calvin Owens
  1 sibling, 1 reply; 52+ messages in thread
From: Tomas Henzl @ 2015-05-05 15:35 UTC
  To: Sreekanth Reddy, calvinowens
  Cc: martin.petersen, linux-scsi, jejb, JBottomley, Sathya.Prakash,
	chaitra.basappa, linux-kernel, hch

On 05/04/2015 05:05 PM, Sreekanth Reddy wrote:
> I have applied this patch on the latest upstream mpt3sas driver, then I have compiled and loaded the driver.
> In the driver logs I didn't see any attached drives are added to the OS, 'fdisk -l' command also doesn't list
>  the drives which are actually attached to the HBA.
>
> When I debug this issue then I see that in '_scsih_target_alloc'
>  driver is searching for sas_device from the lists 'sas_device_init_list' & 'sas_device_list'
>  based on the device sas address using the function mpt3sas_scsih_sas_device_find_by_sas_address(),
>  since this device is not in the 'sas_device_init_list' (as it is moved it to head list) driver exit
>  from this function without updating the required device addition information.
>
> To solve the original problem (i.e memory corruption), here I have attached the patch,
>  in this patch I have added one atomic flag is_on_sas_device_init_list in _sas_device_structure
>  and I followed below algorithm.
>
> 1. when ever a device is added to sas_device_init_list then driver will set this atomic flag of this device to one.
>
> 2. And during the addition of this device to SCSI mid layer,
>         if the device is successfully added to the OS then driver will move this device list in to sas_device_list list from sas_device_init_list list and at this time driver will reset this flag to zero.
>         if device is failed to register with SCSI mid layer then also driver will reset this flag to zero in function _scsih_sas_device_remove and will remove the device entry from sas_device_init_list and will free the device structure.
>
> 3. Now when a device is removed then driver will receive target not responding event and in the function _scsih_device_remove_by_handle,
>          a. driver will check whether addition of discovered devices to SML process is currently running or not,
>                i. if addition (or registration) of discovered devices to SML process is running then driver will check whether device is in sas_device_init_list or not (by reading the atomic flag)?.
>                     if it is in a sas_device_init_list then driver will ignore this device removal event (since device registration with SML will fail and it is removed in function _scsih_sas_device_remove as mentioned in step 2).
>              ii. if the device is not in a sas_device_init_list or addition (or registration) of discovered devices to SML process is already completed then device structure is removed from this function and this device entry is removed from sas_device_list.
>
> 4. if the device removal event is received after device structure is freed due to failure of device registration with SML them in the function _scsih_device_remove_by_handle driver won't find this device in the sas_device_list or in a sas_device_init_list and so driver will ignore this  device removal event.
>
> Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
> ---
>  drivers/scsi/mpt2sas/mpt2sas_base.h  |  2 ++
>  drivers/scsi/mpt2sas/mpt2sas_scsih.c | 45 +++++++++++++++++++++++++++---------
>  drivers/scsi/mpt3sas/mpt3sas_base.h  |  2 ++
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 43 ++++++++++++++++++++++++++--------
>  4 files changed, 71 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
> index caff8d1..1aa10d2 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_base.h
> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
> @@ -376,6 +376,7 @@ struct _sas_device {
>  	u8	phy;
>  	u8	responding;
>  	u8	pfa_led_on;
> +	atomic_t is_on_sas_device_init_list;

Hi Sreekanth,
when is_on_sas_device_init_list is used it's protected
by ioc->sas_device_lock - why do you need an atomic_t?
There is one exception, but it is easily fixable.
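
To make the point concrete, here is a minimal user-space sketch in C of a plain (non-atomic) flag that is only ever touched under one lock. It assumes the exception meant above is the flag being set before the lock is taken in the init-add path; that reading is a guess. A pthread mutex stands in for ioc->sas_device_lock, and every name in the sketch is hypothetical, not driver code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space model; a mutex stands in for the spinlock. */
struct model_device {
	bool on_init_list;	/* plain flag, guarded by device_lock */
};

struct model_adapter {
	pthread_mutex_t device_lock;
	int init_list_len;	/* stands in for the init list itself */
};

/* The flag is set inside the critical section, together with the add. */
static void model_init_add(struct model_adapter *ioc,
			   struct model_device *dev)
{
	pthread_mutex_lock(&ioc->device_lock);
	dev->on_init_list = true;
	ioc->init_list_len++;
	pthread_mutex_unlock(&ioc->device_lock);
}

/* The flag is cleared with a plain store under the same lock. */
static void model_move_to_device_list(struct model_adapter *ioc,
				      struct model_device *dev)
{
	pthread_mutex_lock(&ioc->device_lock);
	if (dev->on_init_list) {
		assert(ioc->init_list_len > 0);
		dev->on_init_list = false;
		ioc->init_list_len--;
	}
	pthread_mutex_unlock(&ioc->device_lock);
}

/* Readers take the lock too, so no atomic access is required. */
static bool model_is_on_init_list(struct model_adapter *ioc,
				  struct model_device *dev)
{
	bool on;

	pthread_mutex_lock(&ioc->device_lock);
	on = dev->on_init_list;
	pthread_mutex_unlock(&ioc->device_lock);
	return on;
}
```

Because every reader and writer holds the same lock, the flag and the list membership change together and cannot be observed out of step.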

>  };
>  
>  /**
> @@ -833,6 +834,7 @@ struct MPT2SAS_ADAPTER {
>  	u8		broadcast_aen_busy;
>  	u16		broadcast_aen_pending;
>  	u8		shost_recovery;
> +	u8		discovered_device_addition_on;
>  
>  	struct mutex	reset_in_progress_mutex;
>  	spinlock_t 	ioc_reset_in_progress_lock;
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> index 3f26147..2a61286 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> +++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> @@ -590,13 +590,20 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
>      struct _sas_device *sas_device)
>  {
>  	unsigned long flags;
> +	struct _sas_device *same_sas_device;
>  
>  	if (!sas_device)
>  		return;
>  
>  	spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -	list_del(&sas_device->list);
> -	kfree(sas_device);
> +	same_sas_device = _scsih_sas_device_find_by_handle(ioc,
> +						sas_device->handle);

Is it possible that, when same_sas_device is not null, its
value is not the same as sas_device?

> +	if (same_sas_device) {
> +		list_del(&same_sas_device->list);
> +		if (atomic_read(&sas_device->is_on_sas_device_init_list))

Seems easier to just set the variable without a test.

> +			atomic_set(&sas_device->is_on_sas_device_init_list, 0);
> +		kfree(same_sas_device);
> +	}
>  	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>  }
>  
> @@ -658,6 +664,7 @@ _scsih_sas_device_init_add(struct MPT2SAS_ADAPTER *ioc,
>  	    "(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
>  	    sas_device->handle, (unsigned long long)sas_device->sas_address));
>  
> +	atomic_set(&sas_device->is_on_sas_device_init_list, 1);
>  	spin_lock_irqsave(&ioc->sas_device_lock, flags);
>  	list_add_tail(&sas_device->list, &ioc->sas_device_init_list);
>  	_scsih_determine_boot_device(ioc, sas_device, 0);
> @@ -5364,8 +5371,14 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>  
>  	spin_lock_irqsave(&ioc->sas_device_lock, flags);
>  	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> -	if (sas_device)
> -		list_del(&sas_device->list);
> +	if (sas_device) {
> +		if (ioc->discovered_device_addition_on &&
> +		    atomic_read(&sas_device->is_on_sas_device_init_list)) {
> +			spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +			return;
> +		} else
> +			list_del(&sas_device->list);
> +	}
>  	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>  	if (sas_device)
>  		_scsih_remove_device(ioc, sas_device);
> @@ -5391,8 +5404,14 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
>  	spin_lock_irqsave(&ioc->sas_device_lock, flags);
>  	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
>  	    sas_address);
> -	if (sas_device)
> -		list_del(&sas_device->list);
> +	if (sas_device) {
> +		if (ioc->discovered_device_addition_on &&
> +		    atomic_read(&sas_device->is_on_sas_device_init_list)) {
> +			spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +			return;
> +		} else
> +			list_del(&sas_device->list);
> +	}
>  	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>  	if (sas_device)
>  		_scsih_remove_device(ioc, sas_device);
> @@ -7978,32 +7997,36 @@ _scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
>  	struct _sas_device *sas_device, *next;
>  	unsigned long flags;
>  
> +	ioc->discovered_device_addition_on = 1;
>  	/* SAS Device List */
>  	list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
>  	    list) {
>  
>  		if (ioc->hide_drives)
>  			continue;
> -
> +
>  		if (!mpt2sas_transport_port_add(ioc, sas_device->handle,
>  		    sas_device->sas_address_parent)) {
> -			list_del(&sas_device->list);
> -			kfree(sas_device);
> +			mpt2sas_transport_port_remove(ioc,
> +					sas_device->sas_address,
> +					sas_device->sas_address_parent);
> +			_scsih_sas_device_remove(ioc, sas_device);
>  			continue;
>  		} else if (!sas_device->starget) {
>  			if (!ioc->is_driver_loading) {
>  				mpt2sas_transport_port_remove(ioc,
>  					sas_device->sas_address,
>  					sas_device->sas_address_parent);
> -				list_del(&sas_device->list);
> -				kfree(sas_device);
> +				_scsih_sas_device_remove(ioc, sas_device);
>  				continue;
>  			}
>  		}
>  		spin_lock_irqsave(&ioc->sas_device_lock, flags);
>  		list_move_tail(&sas_device->list, &ioc->sas_device_list);
> +		atomic_dec(&sas_device->is_on_sas_device_init_list);

Why not 'atomic_set(&sas_device->is_on_sas_device_init_list, 0);' ?
There is no place where you set the value of is_on_sas_device_init_list
higher than '1'.

Cheers,
Tomas

>  		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>  	}
> +	ioc->discovered_device_addition_on = 0;
>  }
>  
>  /**
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
> index afa8816..6188490 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_base.h
> +++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
> @@ -315,6 +315,7 @@ struct _sas_device {
>  	u8	responding;
>  	u8	fast_path;
>  	u8	pfa_led_on;
> +	atomic_t is_on_sas_device_init_list;
>  };
>  
>  /**
> @@ -766,6 +767,7 @@ struct MPT3SAS_ADAPTER {
>  	u8		broadcast_aen_busy;
>  	u16		broadcast_aen_pending;
>  	u8		shost_recovery;
> +	u8		discovered_device_addition_on;
>  
>  	struct mutex	reset_in_progress_mutex;
>  	spinlock_t	ioc_reset_in_progress_lock;
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> index 5a97e32..53cc9ea 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> @@ -582,13 +582,20 @@ _scsih_sas_device_remove(struct MPT3SAS_ADAPTER *ioc,
>  	struct _sas_device *sas_device)
>  {
>  	unsigned long flags;
> +	struct _sas_device *same_sas_device;
>  
>  	if (!sas_device)
>  		return;
>  
>  	spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -	list_del(&sas_device->list);
> -	kfree(sas_device);
> +	same_sas_device = _scsih_sas_device_find_by_handle(ioc,
> +						sas_device->handle);
> +	if (same_sas_device) {
> +		list_del(&same_sas_device->list);
> +		if (atomic_read(&sas_device->is_on_sas_device_init_list))
> +			atomic_set(&sas_device->is_on_sas_device_init_list, 0);
> +		kfree(same_sas_device);
> +	}
>  	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>  }
>  
> @@ -610,8 +616,14 @@ _scsih_device_remove_by_handle(struct MPT3SAS_ADAPTER *ioc, u16 handle)
>  
>  	spin_lock_irqsave(&ioc->sas_device_lock, flags);
>  	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> -	if (sas_device)
> -		list_del(&sas_device->list);
> +	if (sas_device) {
> +		if (ioc->discovered_device_addition_on &&
> +		    atomic_read(&sas_device->is_on_sas_device_init_list)) {
> +			spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +			return;
> +		} else
> +			list_del(&sas_device->list);
> +	}
>  	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>  	if (sas_device)
>  		_scsih_remove_device(ioc, sas_device);
> @@ -637,8 +649,14 @@ mpt3sas_device_remove_by_sas_address(struct MPT3SAS_ADAPTER *ioc,
>  	spin_lock_irqsave(&ioc->sas_device_lock, flags);
>  	sas_device = mpt3sas_scsih_sas_device_find_by_sas_address(ioc,
>  	    sas_address);
> -	if (sas_device)
> -		list_del(&sas_device->list);
> +	if (sas_device) {
> +		if (ioc->discovered_device_addition_on &&
> +		    atomic_read(&sas_device->is_on_sas_device_init_list)) {
> +			spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +			return;
> +		} else
> +			list_del(&sas_device->list);
> +	}
>  	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>  	if (sas_device)
>  		_scsih_remove_device(ioc, sas_device);
> @@ -663,6 +681,7 @@ _scsih_sas_device_add(struct MPT3SAS_ADAPTER *ioc,
>  		ioc->name, __func__, sas_device->handle,
>  		(unsigned long long)sas_device->sas_address));
>  
> +	atomic_set(&sas_device->is_on_sas_device_init_list, 1);
>  	spin_lock_irqsave(&ioc->sas_device_lock, flags);
>  	list_add_tail(&sas_device->list, &ioc->sas_device_list);
>  	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> @@ -7610,14 +7629,17 @@ _scsih_probe_sas(struct MPT3SAS_ADAPTER *ioc)
>  	struct _sas_device *sas_device, *next;
>  	unsigned long flags;
>  
> +	ioc->discovered_device_addition_on = 1;
>  	/* SAS Device List */
>  	list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
>  	    list) {
>  
>  		if (!mpt3sas_transport_port_add(ioc, sas_device->handle,
>  		    sas_device->sas_address_parent)) {
> -			list_del(&sas_device->list);
> -			kfree(sas_device);
> +			mpt3sas_transport_port_remove(ioc,
> +					sas_device->sas_address,
> +					sas_device->sas_address_parent);
> +			_scsih_sas_device_remove(ioc, sas_device);
>  			continue;
>  		} else if (!sas_device->starget) {
>  			/*
> @@ -7630,16 +7652,17 @@ _scsih_probe_sas(struct MPT3SAS_ADAPTER *ioc)
>  				mpt3sas_transport_port_remove(ioc,
>  				    sas_device->sas_address,
>  				    sas_device->sas_address_parent);
> -				list_del(&sas_device->list);
> -				kfree(sas_device);
> +				_scsih_sas_device_remove(ioc, sas_device);
>  				continue;
>  			}
>  		}
>  
>  		spin_lock_irqsave(&ioc->sas_device_lock, flags);
>  		list_move_tail(&sas_device->list, &ioc->sas_device_list);
> +		atomic_dec(&sas_device->is_on_sas_device_init_list);
>  		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>  	}
> +	ioc->discovered_device_addition_on = 0;
>  }
>  
>  /**



* Re: [PATCH] mpt2sas: mpt3sas: Fix memory corruption during initialization
  2015-05-04 15:05 [PATCH] mpt2sas: mpt3sas: Fix memory corruption during initialization Sreekanth Reddy
  2015-05-05 15:35 ` Tomas Henzl
@ 2015-05-06 18:48 ` Calvin Owens
  2015-05-15  3:41   ` [PATCH 0/6] Fixes for memory corruption in mpt2sas Calvin Owens
  1 sibling, 1 reply; 52+ messages in thread
From: Calvin Owens @ 2015-05-06 18:48 UTC
  To: Sreekanth Reddy
  Cc: martin.petersen, linux-scsi, jejb, JBottomley, Sathya.Prakash,
	chaitra.basappa, linux-kernel, hch, calvinowens

On Monday 05/04 at 20:35 +0530, Sreekanth Reddy wrote:
> I have applied this patch on the latest upstream mpt3sas driver, then
> I have compiled and loaded the driver.  In the driver logs I didn't
> see any attached drives are added to the OS, 'fdisk -l' command also
> doesn't list the drives which are actually attached to the HBA.
>
> When I debug this issue then I see that in '_scsih_target_alloc'
> driver is searching for sas_device from the lists
> 'sas_device_init_list' & 'sas_device_list' based on the device sas
> address using the function
> mpt3sas_scsih_sas_device_find_by_sas_address(), since this device is
> not in the 'sas_device_init_list' (as it is moved it to head list)
> driver exit from this function without updating the required device
> addition information.

Yes, I misunderstood that the initialization depended on the devices
still being on the init_list.

What's interesting about this is that when I tested it, it still worked.
I think the MPT2SAS_PORT_ENABLE_COMPLETE fw_event might zero
ioc->start_scan and allow scsih_scan_finished() to start probing devices
before all the devices are actually on the init_list. It seems to be
very repeatable per-machine whether or not it works.

But in any case, my patch was wrong.

> To solve the original problem (i.e memory corruption), here I have
> attached the patch, in this patch I have added one atomic flag
> is_on_sas_device_init_list in _sas_device_structure and I followed
> below algorithm.

The problem is that this only solves a single case. There isn't anything
to enforce that this or a similar chain of events can't happen elsewhere
in the code.

I think the best general solution would be to add a refcount to these
objects.  They sit on a list that can be concurrently accessed from
multiple threads, so I think a refcount is the best way to ensure that
objects aren't freed out from under other users.

I'm working on a patchset that does this. I'm starting by adding a
refcount to the sas_device object only, and refactoring the code in
mpt2sas_scsih.c to use it. I should be able to send up a first version
of that pretty soon to get some feedback.
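
As a rough picture of the refcount idea, here is a minimal user-space sketch in C using C11 atomics. It is not the patchset itself: struct model_device and the model_device_* helpers are hypothetical names. In the kernel, struct kref with kref_init(), kref_get() and kref_put() is the usual facility for this, though nothing here says which one the patchset will use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical user-space model of a refcounted device object. */
struct model_device {
	atomic_int refcount;
	unsigned short handle;
};

/* A new object starts with one reference, owned by the caller. */
static struct model_device *model_device_alloc(unsigned short handle)
{
	struct model_device *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	atomic_init(&dev->refcount, 1);
	dev->handle = handle;
	return dev;
}

/* Anyone who keeps a pointer (a list, a lookup result) takes a reference. */
static void model_device_get(struct model_device *dev)
{
	int old = atomic_fetch_add(&dev->refcount, 1);

	assert(old > 0);	/* taking a reference on a dead object is a bug */
	(void)old;
}

/*
 * Drop a reference. The object is freed only when the last user lets go,
 * so it cannot be freed out from under another thread that still holds
 * a reference. Returns 1 if this call freed the object, 0 otherwise.
 */
static int model_device_put(struct model_device *dev)
{
	if (atomic_fetch_sub(&dev->refcount, 1) == 1) {
		free(dev);
		return 1;
	}
	return 0;
}
```

With this pattern, a lookup would take a reference while still holding the list lock, and removing an object from a list would drop only the list's own reference rather than freeing the object directly.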

Thanks,
Calvin

 
> 1. when ever a device is added to sas_device_init_list then driver
> will set this atomic flag of this device to one.
> 
> 2. And during the addition of this device to SCSI mid layer, if the
> device is successfully added to the OS then driver will move this
> device list in to sas_device_list list from sas_device_init_list list
> and at this time driver will reset this flag to zero.  if device is
> failed to register with SCSI mid layer then also driver will reset
> this flag to zero in function _scsih_sas_device_remove and will remove
> the device entry from sas_device_init_list and will free the device
> structure.
> 
> 3. Now when a device is removed then driver will receive target not responding event and in the function _scsih_device_remove_by_handle,
>          a. driver will check whether addition of discovered devices to SML process is currently running or not,
>                i. if addition (or registration) of discovered devices to SML process is running then driver will check whether device is in sas_device_init_list or not (by reading the atomic flag)?.
>                     if it is in a sas_device_init_list then driver will ignore this device removal event (since device registration with SML will fail and it is removed in function _scsih_sas_device_remove as mentioned in step 2).
>              ii. if the device is not in a sas_device_init_list or addition (or registration) of discovered devices to SML process is already completed then device structure is removed from this function and this device entry is removed from sas_device_list.
> 
> 4. if the device removal event is received after device structure is freed due to failure of device registration with SML them in the function _scsih_device_remove_by_handle driver won't find this device in the sas_device_list or in a sas_device_init_list and so driver will ignore this  device removal event.
> 
> Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
> ---
>  drivers/scsi/mpt2sas/mpt2sas_base.h  |  2 ++
>  drivers/scsi/mpt2sas/mpt2sas_scsih.c | 45 +++++++++++++++++++++++++++---------
>  drivers/scsi/mpt3sas/mpt3sas_base.h  |  2 ++
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 43 ++++++++++++++++++++++++++--------
>  4 files changed, 71 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
> index caff8d1..1aa10d2 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_base.h
> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
> @@ -376,6 +376,7 @@ struct _sas_device {
>  	u8	phy;
>  	u8	responding;
>  	u8	pfa_led_on;
> +	atomic_t is_on_sas_device_init_list;
>  };
>  
>  /**
> @@ -833,6 +834,7 @@ struct MPT2SAS_ADAPTER {
>  	u8		broadcast_aen_busy;
>  	u16		broadcast_aen_pending;
>  	u8		shost_recovery;
> +	u8		discovered_device_addition_on;
>  
>  	struct mutex	reset_in_progress_mutex;
>  	spinlock_t 	ioc_reset_in_progress_lock;
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> index 3f26147..2a61286 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> +++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> @@ -590,13 +590,20 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
>      struct _sas_device *sas_device)
>  {
>  	unsigned long flags;
> +	struct _sas_device *same_sas_device;
>  
>  	if (!sas_device)
>  		return;
>  
>  	spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -	list_del(&sas_device->list);
> -	kfree(sas_device);
> +	same_sas_device = _scsih_sas_device_find_by_handle(ioc,
> +						sas_device->handle);
> +	if (same_sas_device) {
> +		list_del(&same_sas_device->list);
> +		if (atomic_read(&sas_device->is_on_sas_device_init_list))
> +			atomic_set(&sas_device->is_on_sas_device_init_list, 0);
> +		kfree(same_sas_device);
> +	}
>  	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>  }
>  
> @@ -658,6 +664,7 @@ _scsih_sas_device_init_add(struct MPT2SAS_ADAPTER *ioc,
>  	    "(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
>  	    sas_device->handle, (unsigned long long)sas_device->sas_address));
>  
> +	atomic_set(&sas_device->is_on_sas_device_init_list, 1);
>  	spin_lock_irqsave(&ioc->sas_device_lock, flags);
>  	list_add_tail(&sas_device->list, &ioc->sas_device_init_list);
>  	_scsih_determine_boot_device(ioc, sas_device, 0);
> @@ -5364,8 +5371,14 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>  
>  	spin_lock_irqsave(&ioc->sas_device_lock, flags);
>  	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> -	if (sas_device)
> -		list_del(&sas_device->list);
> +	if (sas_device) {
> +		if (ioc->discovered_device_addition_on &&
> +		    atomic_read(&sas_device->is_on_sas_device_init_list)) {
> +			spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +			return;
> +		} else
> +			list_del(&sas_device->list);
> +	}
>  	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>  	if (sas_device)
>  		_scsih_remove_device(ioc, sas_device);
> @@ -5391,8 +5404,14 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
>  	spin_lock_irqsave(&ioc->sas_device_lock, flags);
>  	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
>  	    sas_address);
> -	if (sas_device)
> -		list_del(&sas_device->list);
> +	if (sas_device) {
> +		if (ioc->discovered_device_addition_on &&
> +		    atomic_read(&sas_device->is_on_sas_device_init_list)) {
> +			spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +			return;
> +		} else
> +			list_del(&sas_device->list);
> +	}
>  	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>  	if (sas_device)
>  		_scsih_remove_device(ioc, sas_device);
> @@ -7978,32 +7997,36 @@ _scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
>  	struct _sas_device *sas_device, *next;
>  	unsigned long flags;
>  
> +	ioc->discovered_device_addition_on = 1;
>  	/* SAS Device List */
>  	list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
>  	    list) {
>  
>  		if (ioc->hide_drives)
>  			continue;
> -
> +
>  		if (!mpt2sas_transport_port_add(ioc, sas_device->handle,
>  		    sas_device->sas_address_parent)) {
> -			list_del(&sas_device->list);
> -			kfree(sas_device);
> +			mpt2sas_transport_port_remove(ioc,
> +					sas_device->sas_address,
> +					sas_device->sas_address_parent);
> +			_scsih_sas_device_remove(ioc, sas_device);
>  			continue;
>  		} else if (!sas_device->starget) {
>  			if (!ioc->is_driver_loading) {
>  				mpt2sas_transport_port_remove(ioc,
>  					sas_device->sas_address,
>  					sas_device->sas_address_parent);
> -				list_del(&sas_device->list);
> -				kfree(sas_device);
> +				_scsih_sas_device_remove(ioc, sas_device);
>  				continue;
>  			}
>  		}
>  		spin_lock_irqsave(&ioc->sas_device_lock, flags);
>  		list_move_tail(&sas_device->list, &ioc->sas_device_list);
> +		atomic_dec(&sas_device->is_on_sas_device_init_list);
>  		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>  	}
> +	ioc->discovered_device_addition_on = 0;
>  }
>  
>  /**
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
> index afa8816..6188490 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_base.h
> +++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
> @@ -315,6 +315,7 @@ struct _sas_device {
>  	u8	responding;
>  	u8	fast_path;
>  	u8	pfa_led_on;
> +	atomic_t is_on_sas_device_init_list;
>  };
>  
>  /**
> @@ -766,6 +767,7 @@ struct MPT3SAS_ADAPTER {
>  	u8		broadcast_aen_busy;
>  	u16		broadcast_aen_pending;
>  	u8		shost_recovery;
> +	u8		discovered_device_addition_on;
>  
>  	struct mutex	reset_in_progress_mutex;
>  	spinlock_t	ioc_reset_in_progress_lock;
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> index 5a97e32..53cc9ea 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> @@ -582,13 +582,20 @@ _scsih_sas_device_remove(struct MPT3SAS_ADAPTER *ioc,
>  	struct _sas_device *sas_device)
>  {
>  	unsigned long flags;
> +	struct _sas_device *same_sas_device;
>  
>  	if (!sas_device)
>  		return;
>  
>  	spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -	list_del(&sas_device->list);
> -	kfree(sas_device);
> +	same_sas_device = _scsih_sas_device_find_by_handle(ioc,
> +						sas_device->handle);
> +	if (same_sas_device) {
> +		list_del(&same_sas_device->list);
> +		if (atomic_read(&sas_device->is_on_sas_device_init_list))
> +			atomic_set(&sas_device->is_on_sas_device_init_list, 0);
> +		kfree(same_sas_device);
> +	}
>  	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>  }
>  
> @@ -610,8 +616,14 @@ _scsih_device_remove_by_handle(struct MPT3SAS_ADAPTER *ioc, u16 handle)
>  
>  	spin_lock_irqsave(&ioc->sas_device_lock, flags);
>  	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> -	if (sas_device)
> -		list_del(&sas_device->list);
> +	if (sas_device) {
> +		if (ioc->discovered_device_addition_on &&
> +		    atomic_read(&sas_device->is_on_sas_device_init_list)) {
> +			spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +			return;
> +		} else
> +			list_del(&sas_device->list);
> +	}
>  	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>  	if (sas_device)
>  		_scsih_remove_device(ioc, sas_device);
> @@ -637,8 +649,14 @@ mpt3sas_device_remove_by_sas_address(struct MPT3SAS_ADAPTER *ioc,
>  	spin_lock_irqsave(&ioc->sas_device_lock, flags);
>  	sas_device = mpt3sas_scsih_sas_device_find_by_sas_address(ioc,
>  	    sas_address);
> -	if (sas_device)
> -		list_del(&sas_device->list);
> +	if (sas_device) {
> +		if (ioc->discovered_device_addition_on &&
> +		    atomic_read(&sas_device->is_on_sas_device_init_list)) {
> +			spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +			return;
> +		} else
> +			list_del(&sas_device->list);
> +	}
>  	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>  	if (sas_device)
>  		_scsih_remove_device(ioc, sas_device);
> @@ -663,6 +681,7 @@ _scsih_sas_device_add(struct MPT3SAS_ADAPTER *ioc,
>  		ioc->name, __func__, sas_device->handle,
>  		(unsigned long long)sas_device->sas_address));
>  
> +	atomic_set(&sas_device->is_on_sas_device_init_list, 1);
>  	spin_lock_irqsave(&ioc->sas_device_lock, flags);
>  	list_add_tail(&sas_device->list, &ioc->sas_device_list);
>  	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> @@ -7610,14 +7629,17 @@ _scsih_probe_sas(struct MPT3SAS_ADAPTER *ioc)
>  	struct _sas_device *sas_device, *next;
>  	unsigned long flags;
>  
> +	ioc->discovered_device_addition_on = 1;
>  	/* SAS Device List */
>  	list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
>  	    list) {
>  
>  		if (!mpt3sas_transport_port_add(ioc, sas_device->handle,
>  		    sas_device->sas_address_parent)) {
> -			list_del(&sas_device->list);
> -			kfree(sas_device);
> +			mpt3sas_transport_port_remove(ioc,
> +					sas_device->sas_address,
> +					sas_device->sas_address_parent);
> +			_scsih_sas_device_remove(ioc, sas_device);
>  			continue;
>  		} else if (!sas_device->starget) {
>  			/*
> @@ -7630,16 +7652,17 @@ _scsih_probe_sas(struct MPT3SAS_ADAPTER *ioc)
>  				mpt3sas_transport_port_remove(ioc,
>  				    sas_device->sas_address,
>  				    sas_device->sas_address_parent);
> -				list_del(&sas_device->list);
> -				kfree(sas_device);
> +				_scsih_sas_device_remove(ioc, sas_device);
>  				continue;
>  			}
>  		}
>  
>  		spin_lock_irqsave(&ioc->sas_device_lock, flags);
>  		list_move_tail(&sas_device->list, &ioc->sas_device_list);
> +		atomic_dec(&sas_device->is_on_sas_device_init_list);
>  		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>  	}
> +	ioc->discovered_device_addition_on = 0;
>  }
>  
>  /**
> -- 
> 2.0.2
> 


* Re: [PATCH] mpt2sas: mpt3sas: Fix memory corruption during initialization
  2015-05-05 15:35 ` Tomas Henzl
@ 2015-05-12  9:38   ` Sreekanth Reddy
  0 siblings, 0 replies; 52+ messages in thread
From: Sreekanth Reddy @ 2015-05-12  9:38 UTC (permalink / raw)
  To: Tomas Henzl
  Cc: calvinowens, Martin K. Petersen, linux-scsi, jejb,
	James E.J. Bottomley, Sathya Prakash, Chaitra Basappa,
	linux-kernel, Christoph Hellwig

Hi Tomas & Calvin,

Thanks for reviewing this patch.

There is a problem with this patch: because the driver ignores the
device removal event (when the ioc's discovered_device_addition_on
flag and the device's is_on_sas_device_init_list flag are both one),
it does not free the sas_device structure from the
sas_device_init_list list.

Due to this, whenever the same device is hot plugged again, the
device is not added to the SML.

I have one more patch to fix the original issue and I will post it today.

Regards,
Sreekanth
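
The failure described above can be sketched with a toy user-space
model. This is only an illustration: every name here (demo_table,
demo_add, demo_remove) is hypothetical, the single addition_on flag
stands in for both flags in the patch, and the sketch assumes the add
path skips a device it already tracks, which the message does not
spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX 8

/* Hypothetical toy model of a device list; not driver code. */
struct demo_table {
	unsigned short handles[DEMO_MAX];
	size_t count;
	bool addition_on;	/* models discovered_device_addition_on */
};

static bool demo_tracked(const struct demo_table *t, unsigned short handle)
{
	for (size_t i = 0; i < t->count; i++)
		if (t->handles[i] == handle)
			return true;
	return false;
}

/* Returns true if newly added. Assumption: tracked handles are skipped. */
static bool demo_add(struct demo_table *t, unsigned short handle)
{
	if (demo_tracked(t, handle) || t->count == DEMO_MAX)
		return false;
	t->handles[t->count++] = handle;
	return true;
}

/* Returns true if the entry was dropped. */
static bool demo_remove(struct demo_table *t, unsigned short handle)
{
	if (t->addition_on)
		return false;	/* event ignored, so the entry stays behind */
	for (size_t i = 0; i < t->count; i++) {
		if (t->handles[i] == handle) {
			t->handles[i] = t->handles[--t->count];
			return true;
		}
	}
	return false;
}
```

In this model, a removal that arrives while addition_on is set leaves
the handle in the table, and a later demo_add() for the same handle
returns false until something finally drops the stale entry.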

On Tue, May 5, 2015 at 9:05 PM, Tomas Henzl <thenzl@redhat.com> wrote:
>
> On 05/04/2015 05:05 PM, Sreekanth Reddy wrote:
> > I have applied this patch on the latest upstream mpt3sas driver, then I have compiled and loaded the driver.
> > In the driver logs I didn't see any attached drives are added to the OS, 'fdisk -l' command also doesn't list
> >  the drives which are actually attached to the HBA.
> >
> > When I debug this issue then I see that in '_scsih_target_alloc'
> >  driver is searching for sas_device from the lists 'sas_device_init_list' & 'sas_device_list'
> >  based on the device sas address using the function mpt3sas_scsih_sas_device_find_by_sas_address(),
> >  since this device is not in the 'sas_device_init_list' (as it is moved it to head list) driver exit
> >  from this function without updating the required device addition information.
> >
> > To solve the original problem (i.e memory corruption), here I have attached the patch,
> >  in this patch I have added one atomic flag is_on_sas_device_init_list in _sas_device_structure
> >  and I followed below algorithm.
> >
> > 1. when ever a device is added to sas_device_init_list then driver will set this atomic flag of this device to one.
> >
> > 2. And during the addition of this device to SCSI mid layer,
> >         if the device is successfully added to the OS then driver will move this device list in to sas_device_list list from sas_device_init_list list and at this time driver will reset this flag to zero.
> >         if device is failed to register with SCSI mid layer then also driver will reset this flag to zero in function _scsih_sas_device_remove and will remove the device entry from sas_device_init_list and will free the device structure.
> >
> > 3. Now when a device is removed then driver will receive target not responding event and in the function _scsih_device_remove_by_handle,
> >          a. driver will check whether addition of discovered devices to SML process is currently running or not,
> >                i. if addition (or registration) of discovered devices to SML process is running then driver will check whether device is in sas_device_init_list or not (by reading the atomic flag)?.
> >                     if it is in a sas_device_init_list then driver will ignore this device removal event (since device registration with SML will fail and it is removed in function _scsih_sas_device_remove as mentioned in step 2).
> >              ii. if the device is not in a sas_device_init_list or addition (or registration) of discovered devices to SML process is already completed then device structure is removed from this function and this device entry is removed from sas_device_list.
> >
> > 4. if the device removal event is received after device structure is freed due to failure of device registration with SML then in the function _scsih_device_remove_by_handle driver won't find this device in the sas_device_list or in a sas_device_init_list and so driver will ignore this device removal event.
> >
> > Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
> > ---
> >  drivers/scsi/mpt2sas/mpt2sas_base.h  |  2 ++
> >  drivers/scsi/mpt2sas/mpt2sas_scsih.c | 45 +++++++++++++++++++++++++++---------
> >  drivers/scsi/mpt3sas/mpt3sas_base.h  |  2 ++
> >  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 43 ++++++++++++++++++++++++++--------
> >  4 files changed, 71 insertions(+), 21 deletions(-)
> >
> > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
> > index caff8d1..1aa10d2 100644
> > --- a/drivers/scsi/mpt2sas/mpt2sas_base.h
> > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
> > @@ -376,6 +376,7 @@ struct _sas_device {
> >       u8      phy;
> >       u8      responding;
> >       u8      pfa_led_on;
> > +     atomic_t is_on_sas_device_init_list;
>
> Hi Sreekanth,
> when is_on_sas_device_init_list is used it's protected
> by ioc->sas_device_lock - why do you need a atomic_t ?
> There is one exception, but easily fixable.
>
> >  };
> >
> >  /**
> > @@ -833,6 +834,7 @@ struct MPT2SAS_ADAPTER {
> >       u8              broadcast_aen_busy;
> >       u16             broadcast_aen_pending;
> >       u8              shost_recovery;
> > +     u8              discovered_device_addition_on;
> >
> >       struct mutex    reset_in_progress_mutex;
> >       spinlock_t      ioc_reset_in_progress_lock;
> > diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> > index 3f26147..2a61286 100644
> > --- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> > +++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> > @@ -590,13 +590,20 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
> >      struct _sas_device *sas_device)
> >  {
> >       unsigned long flags;
> > +     struct _sas_device *same_sas_device;
> >
> >       if (!sas_device)
> >               return;
> >
> >       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -     list_del(&sas_device->list);
> > -     kfree(sas_device);
> > +     same_sas_device = _scsih_sas_device_find_by_handle(ioc,
> > +                                             sas_device->handle);
>
> Is it possible that when same_sas_device is not null, that the
> value is not the same as for the sas_device ?
>
> > +     if (same_sas_device) {
> > +             list_del(&same_sas_device->list);
> > +             if (atomic_read(&sas_device->is_on_sas_device_init_list))
>
> Seems easier to just set the variable without a test.
>
> > +                     atomic_set(&sas_device->is_on_sas_device_init_list, 0);
> > +             kfree(same_sas_device);
> > +     }
> >       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >  }
> >
> > @@ -658,6 +664,7 @@ _scsih_sas_device_init_add(struct MPT2SAS_ADAPTER *ioc,
> >           "(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
> >           sas_device->handle, (unsigned long long)sas_device->sas_address));
> >
> > +     atomic_set(&sas_device->is_on_sas_device_init_list, 1);
> >       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> >       list_add_tail(&sas_device->list, &ioc->sas_device_init_list);
> >       _scsih_determine_boot_device(ioc, sas_device, 0);
> > @@ -5364,8 +5371,14 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >
> >       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> >       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > -     if (sas_device)
> > -             list_del(&sas_device->list);
> > +     if (sas_device) {
> > +             if (ioc->discovered_device_addition_on &&
> > +                 atomic_read(&sas_device->is_on_sas_device_init_list)) {
> > +                     spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +                     return;
> > +             } else
> > +                     list_del(&sas_device->list);
> > +     }
> >       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >       if (sas_device)
> >               _scsih_remove_device(ioc, sas_device);
> > @@ -5391,8 +5404,14 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> >       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> >       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> >           sas_address);
> > -     if (sas_device)
> > -             list_del(&sas_device->list);
> > +     if (sas_device) {
> > +             if (ioc->discovered_device_addition_on &&
> > +                 atomic_read(&sas_device->is_on_sas_device_init_list)) {
> > +                     spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +                     return;
> > +             } else
> > +                     list_del(&sas_device->list);
> > +     }
> >       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >       if (sas_device)
> >               _scsih_remove_device(ioc, sas_device);
> > @@ -7978,32 +7997,36 @@ _scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
> >       struct _sas_device *sas_device, *next;
> >       unsigned long flags;
> >
> > +     ioc->discovered_device_addition_on = 1;
> >       /* SAS Device List */
> >       list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
> >           list) {
> >
> >               if (ioc->hide_drives)
> >                       continue;
> > -
> > +
> >               if (!mpt2sas_transport_port_add(ioc, sas_device->handle,
> >                   sas_device->sas_address_parent)) {
> > -                     list_del(&sas_device->list);
> > -                     kfree(sas_device);
> > +                     mpt2sas_transport_port_remove(ioc,
> > +                                     sas_device->sas_address,
> > +                                     sas_device->sas_address_parent);
> > +                     _scsih_sas_device_remove(ioc, sas_device);
> >                       continue;
> >               } else if (!sas_device->starget) {
> >                       if (!ioc->is_driver_loading) {
> >                               mpt2sas_transport_port_remove(ioc,
> >                                       sas_device->sas_address,
> >                                       sas_device->sas_address_parent);
> > -                             list_del(&sas_device->list);
> > -                             kfree(sas_device);
> > +                             _scsih_sas_device_remove(ioc, sas_device);
> >                               continue;
> >                       }
> >               }
> >               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> >               list_move_tail(&sas_device->list, &ioc->sas_device_list);
> > +             atomic_dec(&sas_device->is_on_sas_device_init_list);
>
> Why not 'atomic_set(&sas_device->is_on_sas_device_init_list, 0);' ?
> There is no place where you set the value of is_on_sas_device_init_list
> higher than '1'.
>
> Cheers,
> Tomas
>
> >               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >       }
> > +     ioc->discovered_device_addition_on = 0;
> >  }
> >
> >  /**
> > diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
> > index afa8816..6188490 100644
> > --- a/drivers/scsi/mpt3sas/mpt3sas_base.h
> > +++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
> > @@ -315,6 +315,7 @@ struct _sas_device {
> >       u8      responding;
> >       u8      fast_path;
> >       u8      pfa_led_on;
> > +     atomic_t is_on_sas_device_init_list;
> >  };
> >
> >  /**
> > @@ -766,6 +767,7 @@ struct MPT3SAS_ADAPTER {
> >       u8              broadcast_aen_busy;
> >       u16             broadcast_aen_pending;
> >       u8              shost_recovery;
> > +     u8              discovered_device_addition_on;
> >
> >       struct mutex    reset_in_progress_mutex;
> >       spinlock_t      ioc_reset_in_progress_lock;
> > diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> > index 5a97e32..53cc9ea 100644
> > --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> > +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> > @@ -582,13 +582,20 @@ _scsih_sas_device_remove(struct MPT3SAS_ADAPTER *ioc,
> >       struct _sas_device *sas_device)
> >  {
> >       unsigned long flags;
> > +     struct _sas_device *same_sas_device;
> >
> >       if (!sas_device)
> >               return;
> >
> >       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -     list_del(&sas_device->list);
> > -     kfree(sas_device);
> > +     same_sas_device = _scsih_sas_device_find_by_handle(ioc,
> > +                                             sas_device->handle);
> > +     if (same_sas_device) {
> > +             list_del(&same_sas_device->list);
> > +             if (atomic_read(&sas_device->is_on_sas_device_init_list))
> > +                     atomic_set(&sas_device->is_on_sas_device_init_list, 0);
> > +             kfree(same_sas_device);
> > +     }
> >       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >  }
> >
> > @@ -610,8 +616,14 @@ _scsih_device_remove_by_handle(struct MPT3SAS_ADAPTER *ioc, u16 handle)
> >
> >       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> >       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > -     if (sas_device)
> > -             list_del(&sas_device->list);
> > +     if (sas_device) {
> > +             if (ioc->discovered_device_addition_on &&
> > +                 atomic_read(&sas_device->is_on_sas_device_init_list)) {
> > +                     spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +                     return;
> > +             } else
> > +                     list_del(&sas_device->list);
> > +     }
> >       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >       if (sas_device)
> >               _scsih_remove_device(ioc, sas_device);
> > @@ -637,8 +649,14 @@ mpt3sas_device_remove_by_sas_address(struct MPT3SAS_ADAPTER *ioc,
> >       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> >       sas_device = mpt3sas_scsih_sas_device_find_by_sas_address(ioc,
> >           sas_address);
> > -     if (sas_device)
> > -             list_del(&sas_device->list);
> > +     if (sas_device) {
> > +             if (ioc->discovered_device_addition_on &&
> > +                 atomic_read(&sas_device->is_on_sas_device_init_list)) {
> > +                     spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +                     return;
> > +             } else
> > +                     list_del(&sas_device->list);
> > +     }
> >       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >       if (sas_device)
> >               _scsih_remove_device(ioc, sas_device);
> > @@ -663,6 +681,7 @@ _scsih_sas_device_add(struct MPT3SAS_ADAPTER *ioc,
> >               ioc->name, __func__, sas_device->handle,
> >               (unsigned long long)sas_device->sas_address));
> >
> > +     atomic_set(&sas_device->is_on_sas_device_init_list, 1);
> >       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> >       list_add_tail(&sas_device->list, &ioc->sas_device_list);
> >       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > @@ -7610,14 +7629,17 @@ _scsih_probe_sas(struct MPT3SAS_ADAPTER *ioc)
> >       struct _sas_device *sas_device, *next;
> >       unsigned long flags;
> >
> > +     ioc->discovered_device_addition_on = 1;
> >       /* SAS Device List */
> >       list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
> >           list) {
> >
> >               if (!mpt3sas_transport_port_add(ioc, sas_device->handle,
> >                   sas_device->sas_address_parent)) {
> > -                     list_del(&sas_device->list);
> > -                     kfree(sas_device);
> > +                     mpt3sas_transport_port_remove(ioc,
> > +                                     sas_device->sas_address,
> > +                                     sas_device->sas_address_parent);
> > +                     _scsih_sas_device_remove(ioc, sas_device);
> >                       continue;
> >               } else if (!sas_device->starget) {
> >                       /*
> > @@ -7630,16 +7652,17 @@ _scsih_probe_sas(struct MPT3SAS_ADAPTER *ioc)
> >                               mpt3sas_transport_port_remove(ioc,
> >                                   sas_device->sas_address,
> >                                   sas_device->sas_address_parent);
> > -                             list_del(&sas_device->list);
> > -                             kfree(sas_device);
> > +                             _scsih_sas_device_remove(ioc, sas_device);
> >                               continue;
> >                       }
> >               }
> >
> >               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> >               list_move_tail(&sas_device->list, &ioc->sas_device_list);
> > +             atomic_dec(&sas_device->is_on_sas_device_init_list);
> >               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >       }
> > +     ioc->discovered_device_addition_on = 0;
> >  }
> >
> >  /**
>


* [PATCH 0/6] Fixes for memory corruption in mpt2sas
  2015-05-06 18:48 ` Calvin Owens
@ 2015-05-15  3:41   ` Calvin Owens
  2015-05-15  3:41     ` [PATCH 1/6] Add refcount to sas_device struct Calvin Owens
                       ` (7 more replies)
  0 siblings, 8 replies; 52+ messages in thread
From: Calvin Owens @ 2015-05-15  3:41 UTC (permalink / raw)
  To: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan
  Cc: MPT-FusionLinux.pdl, linux-scsi, linux-kernel, calvinowens, kernel-team

Hello all,

This patchset attempts to address problems we've been having with
panics due to memory corruption from the mpt2sas driver.

I will provide a similar set of fixes for mpt3sas, since we see
similar issues there as well. "Porting" this to mpt3sas will be
trivial since the part of the driver I'm touching is nearly identical
between the two, so I thought it would be simpler to review a patch
against mpt2sas alone at first.

I've tested this for a few days on a big storage box that seemed to be
very susceptible to the panics, and so far it seems to have eliminated
them.

Thanks,
Calvin


Total diffstat:

 drivers/scsi/mpt2sas/mpt2sas_base.h      |  20 +-
 drivers/scsi/mpt2sas/mpt2sas_scsih.c     | 482 +++++++++++++++++++++----------
 drivers/scsi/mpt2sas/mpt2sas_transport.c |  12 +-
 3 files changed, 359 insertions(+), 155 deletions(-)

Patches:

* [PATCH 1/6] Add refcount to sas_device struct
* [PATCH 2/6] Refactor code to use new sas_device refcount
* [PATCH 3/6] Fix unsafe sas_device_list usage
* [PATCH 4/6] Add refcount to fw_event_work struct
* [PATCH 5/6] Refactor code to use new fw_event refcount
* [PATCH 6/6] Fix unsafe fw_event_list usage


* [PATCH 1/6] Add refcount to sas_device struct
  2015-05-15  3:41   ` [PATCH 0/6] Fixes for memory corruption in mpt2sas Calvin Owens
@ 2015-05-15  3:41     ` Calvin Owens
  2015-05-15  3:41     ` [PATCH 2/6] Refactor code to use new sas_device refcount Calvin Owens
                       ` (6 subsequent siblings)
  7 siblings, 0 replies; 52+ messages in thread
From: Calvin Owens @ 2015-05-15  3:41 UTC (permalink / raw)
  To: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan
  Cc: MPT-FusionLinux.pdl, linux-scsi, linux-kernel, calvinowens, kernel-team

These objects can be referenced concurrently throughout the driver, so
we need a way to make sure threads can't delete them out from under
each other.

Signed-off-by: Calvin Owens <calvinowens@fb.com>
---
 drivers/scsi/mpt2sas/mpt2sas_base.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)
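
For readers less familiar with the get/put pattern, here is a minimal
user-space sketch of the idea behind the helpers in this patch. It is
not kernel code: demo_device, demo_device_alloc and the atomic_int
counter are hypothetical stand-ins for struct _sas_device and struct
kref, and the initial count of one is an assumption of the sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in for struct _sas_device. */
struct demo_device {
	unsigned short handle;
	atomic_int refcount;	/* plays the role of struct kref */
};

/* Counts releases so a caller can observe when an object is freed. */
static int demo_devices_freed;

static struct demo_device *demo_device_alloc(unsigned short handle)
{
	struct demo_device *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;
	d->handle = handle;
	atomic_init(&d->refcount, 1);	/* the creator holds one reference */
	return d;
}

static void demo_device_get(struct demo_device *d)
{
	atomic_fetch_add(&d->refcount, 1);
}

/* Returns 1 if this call dropped the last reference and freed the object. */
static int demo_device_put(struct demo_device *d)
{
	int old = atomic_fetch_sub(&d->refcount, 1);

	assert(old > 0);	/* a put without a matching get is a bug */
	if (old == 1) {
		free(d);
		demo_devices_freed++;
		return 1;
	}
	return 0;
}
```

Each user takes a reference before touching the object and drops it
when done, so the memory is released only by whichever put happens
last, regardless of which thread issues it.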

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
index caff8d1..2e7dc33 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.h
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
@@ -376,8 +376,24 @@ struct _sas_device {
 	u8	phy;
 	u8	responding;
 	u8	pfa_led_on;
+	struct kref refcount;
 };
 
+static inline void sas_device_get(struct _sas_device *s)
+{
+	kref_get(&s->refcount);
+}
+
+static inline void sas_device_free(struct kref *r)
+{
+	kfree(container_of(r, struct _sas_device, refcount));
+}
+
+static inline void sas_device_put(struct _sas_device *s)
+{
+	kref_put(&s->refcount, sas_device_free);
+}
+
 /**
  * struct _raid_device - raid volume link list
  * @list: sas device list
-- 
1.8.1



* [PATCH 2/6] Refactor code to use new sas_device refcount
  2015-05-15  3:41   ` [PATCH 0/6] Fixes for memory corruption in mpt2sas Calvin Owens
  2015-05-15  3:41     ` [PATCH 1/6] Add refcount to sas_device struct Calvin Owens
@ 2015-05-15  3:41     ` Calvin Owens
  2015-05-15  3:41     ` [PATCH 3/6] Fix unsafe sas_device_list usage Calvin Owens
                       ` (5 subsequent siblings)
  7 siblings, 0 replies; 52+ messages in thread
From: Calvin Owens @ 2015-05-15  3:41 UTC (permalink / raw)
  To: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan
  Cc: MPT-FusionLinux.pdl, linux-scsi, linux-kernel, calvinowens, kernel-team

This patch refactors the code in the driver to use the new reference
count on the sas_device struct.

Signed-off-by: Calvin Owens <calvinowens@fb.com>
---
 drivers/scsi/mpt2sas/mpt2sas_base.h      |   4 +-
 drivers/scsi/mpt2sas/mpt2sas_scsih.c     | 329 ++++++++++++++++++++-----------
 drivers/scsi/mpt2sas/mpt2sas_transport.c |  12 +-
 3 files changed, 220 insertions(+), 125 deletions(-)
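
The lookup helpers in this patch come in pairs: a _nolock variant that
expects the caller to hold the lock, and a wrapper that takes the lock
itself. Both return the object with an extra reference. A rough
user-space sketch of that shape follows; the demo_* names, the toy
atomic_flag spinlock and the plain int counter are all hypothetical
and only illustrate the structure.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in the driver. */
struct demo_dev {
	unsigned long long sas_address;
	int refcount;		/* only changed with list_lock held here */
	struct demo_dev *next;
};

struct demo_ioc {
	atomic_flag list_lock;	/* toy spinlock, stands in for sas_device_lock */
	struct demo_dev *dev_list;
};

static void demo_ioc_init(struct demo_ioc *ioc)
{
	atomic_flag_clear(&ioc->list_lock);
	ioc->dev_list = NULL;
}

static void demo_lock(struct demo_ioc *ioc)
{
	while (atomic_flag_test_and_set_explicit(&ioc->list_lock,
						 memory_order_acquire))
		;	/* spin until the flag is released */
}

static void demo_unlock(struct demo_ioc *ioc)
{
	atomic_flag_clear_explicit(&ioc->list_lock, memory_order_release);
}

/* Caller must hold list_lock. A match is returned with one extra reference. */
static struct demo_dev *
demo_get_by_addr_nolock(struct demo_ioc *ioc, unsigned long long addr)
{
	struct demo_dev *d;

	for (d = ioc->dev_list; d; d = d->next) {
		if (d->sas_address == addr) {
			d->refcount++;
			return d;
		}
	}
	return NULL;
}

/* Locked wrapper: the reference is taken before the lock is dropped. */
static struct demo_dev *
demo_get_by_addr(struct demo_ioc *ioc, unsigned long long addr)
{
	struct demo_dev *d;

	demo_lock(ioc);
	d = demo_get_by_addr_nolock(ioc, addr);
	demo_unlock(ioc);
	return d;
}
```

Taking the reference while the lock is still held is the point of the
pattern: once the lock is released another thread may unlink the
entry, but the caller's reference keeps the memory valid until the
caller drops it.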

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
index 2e7dc33..dac0e8a 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.h
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
@@ -1111,7 +1111,9 @@ struct _sas_node *mpt2sas_scsih_expander_find_by_handle(struct MPT2SAS_ADAPTER *
     u16 handle);
 struct _sas_node *mpt2sas_scsih_expander_find_by_sas_address(struct MPT2SAS_ADAPTER
     *ioc, u64 sas_address);
-struct _sas_device *mpt2sas_scsih_sas_device_find_by_sas_address(
+struct _sas_device *mpt2sas_scsih_sas_device_get_by_sas_address(
+    struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
+struct _sas_device *mpt2sas_scsih_sas_device_get_by_sas_address_nolock(
     struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
 
 void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc);
diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 3f26147..ad6ceb7e 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -526,8 +526,31 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
 	}
 }
 
+struct _sas_device *
+mpt2sas_scsih_sas_device_get_by_sas_address_nolock(struct MPT2SAS_ADAPTER *ioc,
+    u64 sas_address)
+{
+	struct _sas_device *sas_device;
+
+	BUG_ON(!spin_is_locked(&ioc->sas_device_lock));
+
+	list_for_each_entry(sas_device, &ioc->sas_device_list, list)
+		if (sas_device->sas_address == sas_address)
+			goto found_device;
+
+	list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
+		if (sas_device->sas_address == sas_address)
+			goto found_device;
+
+	return NULL;
+
+found_device:
+	sas_device_get(sas_device);
+	return sas_device;
+}
+
 /**
- * mpt2sas_scsih_sas_device_find_by_sas_address - sas device search
+ * mpt2sas_scsih_sas_device_get_by_sas_address - sas device search
  * @ioc: per adapter object
  * @sas_address: sas address
  * Context: Calling function should acquire ioc->sas_device_lock
@@ -536,24 +559,44 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
  * object.
  */
 struct _sas_device *
-mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
+mpt2sas_scsih_sas_device_get_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
     u64 sas_address)
 {
 	struct _sas_device *sas_device;
+	unsigned long flags;
+
+	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
+			sas_address);
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+	return sas_device;
+}
+
+static struct _sas_device *
+_scsih_sas_device_get_by_handle_nolock(struct MPT2SAS_ADAPTER *ioc, u16 handle)
+{
+	struct _sas_device *sas_device;
+
+	BUG_ON(!spin_is_locked(&ioc->sas_device_lock));
 
 	list_for_each_entry(sas_device, &ioc->sas_device_list, list)
-		if (sas_device->sas_address == sas_address)
-			return sas_device;
+		if (sas_device->handle == handle)
+			goto found_device;
 
 	list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
-		if (sas_device->sas_address == sas_address)
-			return sas_device;
+		if (sas_device->handle == handle)
+			goto found_device;
 
 	return NULL;
+
+found_device:
+	sas_device_get(sas_device);
+	return sas_device;
 }
 
 /**
- * _scsih_sas_device_find_by_handle - sas device search
+ * _scsih_sas_device_get_by_handle - sas device search
  * @ioc: per adapter object
  * @handle: sas device handle (assigned by firmware)
  * Context: Calling function should acquire ioc->sas_device_lock
@@ -562,19 +605,16 @@ mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
  * object.
  */
 static struct _sas_device *
-_scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
+_scsih_sas_device_get_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 {
 	struct _sas_device *sas_device;
+	unsigned long flags;
 
-	list_for_each_entry(sas_device, &ioc->sas_device_list, list)
-		if (sas_device->handle == handle)
-			return sas_device;
-
-	list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
-		if (sas_device->handle == handle)
-			return sas_device;
+	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	sas_device = _scsih_sas_device_get_by_handle_nolock(ioc, handle);
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
-	return NULL;
+	return sas_device;
 }
 
 /**
@@ -583,7 +623,7 @@ _scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
  * @sas_device: the sas_device object
  * Context: This function will acquire ioc->sas_device_lock.
  *
- * Removing object and freeing associated memory from the ioc->sas_device_list.
+ * If sas_device is on the list, remove it and decrement its reference count.
  */
 static void
 _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
@@ -594,9 +634,15 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
 	if (!sas_device)
 		return;
 
+	/*
+	 * The lock serializes access to the list, but we still need to verify
+	 * that nobody removed the entry while we were waiting on the lock.
+	 */
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	list_del(&sas_device->list);
-	kfree(sas_device);
+	if (!list_empty(&sas_device->list)) {
+		list_del_init(&sas_device->list);
+		sas_device_put(sas_device);
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 }
 
@@ -620,6 +666,7 @@ _scsih_sas_device_add(struct MPT2SAS_ADAPTER *ioc,
 	    sas_device->handle, (unsigned long long)sas_device->sas_address));
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	sas_device_get(sas_device);
 	list_add_tail(&sas_device->list, &ioc->sas_device_list);
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
@@ -659,6 +706,7 @@ _scsih_sas_device_init_add(struct MPT2SAS_ADAPTER *ioc,
 	    sas_device->handle, (unsigned long long)sas_device->sas_address));
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	sas_device_get(sas_device);
 	list_add_tail(&sas_device->list, &ioc->sas_device_init_list);
 	_scsih_determine_boot_device(ioc, sas_device, 0);
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
@@ -1208,12 +1256,15 @@ _scsih_change_queue_depth(struct scsi_device *sdev, int qdepth)
 		goto not_sata;
 	if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))
 		goto not_sata;
+
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
 	   sas_device_priv_data->sas_target->sas_address);
-	if (sas_device && sas_device->device_info &
-	    MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
+	if (sas_device && sas_device->device_info
+			& MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
 		max_depth = MPT2SAS_SATA_QUEUE_DEPTH;
+	if (sas_device)
+		sas_device_put(sas_device);
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
  not_sata:
@@ -1271,7 +1322,7 @@ _scsih_target_alloc(struct scsi_target *starget)
 	/* sas/sata devices */
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	rphy = dev_to_rphy(starget->dev.parent);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
 	   rphy->identify.sas_address);
 
 	if (sas_device) {
@@ -1283,6 +1334,8 @@ _scsih_target_alloc(struct scsi_target *starget)
 		if (test_bit(sas_device->handle, ioc->pd_handles))
 			sas_target_priv_data->flags |=
 			    MPT_TARGET_FLAGS_RAID_COMPONENT;
+
+		sas_device_put(sas_device);
 	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
@@ -1324,13 +1377,15 @@ _scsih_target_destroy(struct scsi_target *starget)
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	rphy = dev_to_rphy(starget->dev.parent);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
 	   rphy->identify.sas_address);
 	if (sas_device && (sas_device->starget == starget) &&
 	    (sas_device->id == starget->id) &&
 	    (sas_device->channel == starget->channel))
 		sas_device->starget = NULL;
 
+	if (sas_device)
+		sas_device_put(sas_device);
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
  out:
@@ -1386,7 +1441,7 @@ _scsih_slave_alloc(struct scsi_device *sdev)
 
 	if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
 		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+		sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
 				sas_target_priv_data->sas_address);
 		if (sas_device && (sas_device->starget == NULL)) {
 			sdev_printk(KERN_INFO, sdev,
@@ -1394,6 +1449,10 @@ _scsih_slave_alloc(struct scsi_device *sdev)
 			     __func__, __LINE__);
 			sas_device->starget = starget;
 		}
+
+		if (sas_device)
+			sas_device_put(sas_device);
+
 		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	}
 
@@ -1428,10 +1487,13 @@ _scsih_slave_destroy(struct scsi_device *sdev)
 
 	if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
 		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+		sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
 		   sas_target_priv_data->sas_address);
 		if (sas_device && !sas_target_priv_data->num_luns)
 			sas_device->starget = NULL;
+
+		if (sas_device)
+			sas_device_put(sas_device);
 		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	}
 
@@ -2078,7 +2140,7 @@ _scsih_slave_configure(struct scsi_device *sdev)
 	}
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
 	   sas_device_priv_data->sas_target->sas_address);
 	if (!sas_device) {
 		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
@@ -2116,13 +2178,14 @@ _scsih_slave_configure(struct scsi_device *sdev)
 	if (!ssp_target)
 		_scsih_display_sata_capabilities(ioc, handle, sdev);
 
-
 	_scsih_change_queue_depth(sdev, qdepth);
 
 	if (ssp_target) {
 		sas_read_port_mode_page(sdev);
 		_scsih_enable_tlr(ioc, sdev);
 	}
+
+	sas_device_put(sas_device);
 	return 0;
 }
 
@@ -2509,7 +2572,7 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
 		    device_str, (unsigned long long)priv_target->sas_address);
 	} else {
 		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+		sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
 		    priv_target->sas_address);
 		if (sas_device) {
 			if (priv_target->flags &
@@ -2529,6 +2592,8 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
 			    "enclosure_logical_id(0x%016llx), slot(%d)\n",
 			   (unsigned long long)sas_device->enclosure_logical_id,
 			    sas_device->slot);
+
+			sas_device_put(sas_device);
 		}
 		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	}
@@ -2604,8 +2669,7 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
 {
 	struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
 	struct MPT2SAS_DEVICE *sas_device_priv_data;
-	struct _sas_device *sas_device;
-	unsigned long flags;
+	struct _sas_device *sas_device = NULL;
 	u16	handle;
 	int r;
 
@@ -2629,12 +2693,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
 	handle = 0;
 	if (sas_device_priv_data->sas_target->flags &
 	    MPT_TARGET_FLAGS_RAID_COMPONENT) {
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = _scsih_sas_device_find_by_handle(ioc,
+		sas_device = _scsih_sas_device_get_by_handle(ioc,
 		   sas_device_priv_data->sas_target->handle);
 		if (sas_device)
 			handle = sas_device->volume_handle;
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	} else
 		handle = sas_device_priv_data->sas_target->handle;
 
@@ -2651,6 +2713,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
  out:
 	sdev_printk(KERN_INFO, scmd->device, "device reset: %s scmd(%p)\n",
 	    ((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
+
+	if (sas_device)
+		sas_device_put(sas_device);
+
 	return r;
 }
 
@@ -2665,8 +2731,7 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
 {
 	struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
 	struct MPT2SAS_DEVICE *sas_device_priv_data;
-	struct _sas_device *sas_device;
-	unsigned long flags;
+	struct _sas_device *sas_device = NULL;
 	u16	handle;
 	int r;
 	struct scsi_target *starget = scmd->device->sdev_target;
@@ -2689,12 +2754,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
 	handle = 0;
 	if (sas_device_priv_data->sas_target->flags &
 	    MPT_TARGET_FLAGS_RAID_COMPONENT) {
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = _scsih_sas_device_find_by_handle(ioc,
+		sas_device = _scsih_sas_device_get_by_handle(ioc,
 		   sas_device_priv_data->sas_target->handle);
 		if (sas_device)
 			handle = sas_device->volume_handle;
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	} else
 		handle = sas_device_priv_data->sas_target->handle;
 
@@ -2711,6 +2774,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
  out:
 	starget_printk(KERN_INFO, starget, "target reset: %s scmd(%p)\n",
 	    ((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
+
+	if (sas_device)
+		sas_device_put(sas_device);
+
 	return r;
 }
 
@@ -3002,15 +3069,15 @@ _scsih_block_io_to_children_attached_to_ex(struct MPT2SAS_ADAPTER *ioc,
 
 	list_for_each_entry(mpt2sas_port,
 	   &sas_expander->sas_port_list, port_list) {
-		if (mpt2sas_port->remote_identify.device_type ==
-		    SAS_END_DEVICE) {
+		if (mpt2sas_port->remote_identify.device_type == SAS_END_DEVICE) {
 			spin_lock_irqsave(&ioc->sas_device_lock, flags);
-			sas_device =
-			    mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
-			   mpt2sas_port->remote_identify.sas_address);
-			if (sas_device)
+			sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
+					mpt2sas_port->remote_identify.sas_address);
+			if (sas_device) {
 				set_bit(sas_device->handle,
-				    ioc->blocking_handles);
+						ioc->blocking_handles);
+				sas_device_put(sas_device);
+			}
 			spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 		}
 	}
@@ -3080,7 +3147,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 {
 	Mpi2SCSITaskManagementRequest_t *mpi_request;
 	u16 smid;
-	struct _sas_device *sas_device;
+	struct _sas_device *sas_device = NULL;
 	struct MPT2SAS_TARGET *sas_target_priv_data = NULL;
 	u64 sas_address = 0;
 	unsigned long flags;
@@ -3110,7 +3177,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 		return;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+	sas_device = _scsih_sas_device_get_by_handle_nolock(ioc, handle);
 	if (sas_device && sas_device->starget &&
 	     sas_device->starget->hostdata) {
 		sas_target_priv_data = sas_device->starget->hostdata;
@@ -3131,14 +3198,14 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	if (!smid) {
 		delayed_tr = kzalloc(sizeof(*delayed_tr), GFP_ATOMIC);
 		if (!delayed_tr)
-			return;
+			goto out;
 		INIT_LIST_HEAD(&delayed_tr->list);
 		delayed_tr->handle = handle;
 		list_add_tail(&delayed_tr->list, &ioc->delayed_tr_list);
 		dewtprintk(ioc, printk(MPT2SAS_INFO_FMT
 		    "DELAYED:tr:handle(0x%04x), (open)\n",
 		    ioc->name, handle));
-		return;
+		goto out;
 	}
 
 	dewtprintk(ioc, printk(MPT2SAS_INFO_FMT "tr_send:handle(0x%04x), "
@@ -3150,6 +3217,9 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	mpi_request->DevHandle = cpu_to_le16(handle);
 	mpi_request->TaskType = MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET;
 	mpt2sas_base_put_smid_hi_priority(ioc, smid);
+out:
+	if (sas_device)
+		sas_device_put(sas_device);
 }
 
 
@@ -4068,7 +4138,6 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
 	char *desc_scsi_state = ioc->tmp_string;
 	u32 log_info = le32_to_cpu(mpi_reply->IOCLogInfo);
 	struct _sas_device *sas_device = NULL;
-	unsigned long flags;
 	struct scsi_target *starget = scmd->device->sdev_target;
 	struct MPT2SAS_TARGET *priv_target = starget->hostdata;
 	char *device_str = NULL;
@@ -4200,8 +4269,7 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
 		printk(MPT2SAS_WARN_FMT "\t%s wwid(0x%016llx)\n", ioc->name,
 		    device_str, (unsigned long long)priv_target->sas_address);
 	} else {
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+		sas_device = mpt2sas_scsih_sas_device_get_by_sas_address(ioc,
 		    priv_target->sas_address);
 		if (sas_device) {
 			printk(MPT2SAS_WARN_FMT "\tsas_address(0x%016llx), "
@@ -4211,8 +4279,9 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
 			    "\tenclosure_logical_id(0x%016llx), slot(%d)\n",
 			    ioc->name, sas_device->enclosure_logical_id,
 			    sas_device->slot);
+
+			sas_device_put(sas_device);
 		}
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	}
 
 	printk(MPT2SAS_WARN_FMT "\thandle(0x%04x), ioc_status(%s)(0x%04x), "
@@ -4259,7 +4328,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	Mpi2SepRequest_t mpi_request;
 	struct _sas_device *sas_device;
 
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+	sas_device = _scsih_sas_device_get_by_handle(ioc, handle);
 	if (!sas_device)
 		return;
 
@@ -4274,7 +4343,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	    &mpi_request)) != 0) {
 		printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n", ioc->name,
 		__FILE__, __LINE__, __func__);
-		return;
+		goto out;
 	}
 	sas_device->pfa_led_on = 1;
 
@@ -4284,8 +4353,10 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 		 "enclosure_processor: ioc_status (0x%04x), loginfo(0x%08x)\n",
 		 ioc->name, le16_to_cpu(mpi_reply.IOCStatus),
 		 le32_to_cpu(mpi_reply.IOCLogInfo)));
-		return;
+		goto out;
 	}
+out:
+	sas_device_put(sas_device);
 }
 
 /**
@@ -4370,19 +4441,17 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 
 	/* only handle non-raid devices */
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+	sas_device = _scsih_sas_device_get_by_handle_nolock(ioc, handle);
 	if (!sas_device) {
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
+		goto out_unlock;
 	}
 	starget = sas_device->starget;
 	sas_target_priv_data = starget->hostdata;
 
 	if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_RAID_COMPONENT) ||
-	   ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))) {
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
-	}
+	   ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)))
+		goto out_unlock;
+
 	starget_printk(KERN_WARNING, starget, "predicted fault\n");
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
@@ -4396,7 +4465,7 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	if (!event_reply) {
 		printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
 		    ioc->name, __FILE__, __LINE__, __func__);
-		return;
+		goto out;
 	}
 
 	event_reply->Function = MPI2_FUNCTION_EVENT_NOTIFICATION;
@@ -4413,6 +4482,14 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	event_data->SASAddress = cpu_to_le64(sas_target_priv_data->sas_address);
 	mpt2sas_ctl_add_to_event_log(ioc, event_reply);
 	kfree(event_reply);
+out:
+	if (sas_device)
+		sas_device_put(sas_device);
+	return;
+
+out_unlock:
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+	goto out;
 }
 
 /**
@@ -5148,14 +5225,13 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	sas_address = le64_to_cpu(sas_device_pg0.SASAddress);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
 	    sas_address);
 
 	if (!sas_device) {
 		printk(MPT2SAS_ERR_FMT "device is not present "
 		    "handle(0x%04x), no sas_device!!!\n", ioc->name, handle);
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
+		goto out_unlock;
 	}
 
 	if (unlikely(sas_device->handle != handle)) {
@@ -5172,19 +5248,22 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	    MPI2_SAS_DEVICE0_FLAGS_DEVICE_PRESENT)) {
 		printk(MPT2SAS_ERR_FMT "device is not present "
 		    "handle(0x%04x), flags!!!\n", ioc->name, handle);
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
+		goto out_unlock;
 	}
 
 	/* check if there were any issues with discovery */
 	if (_scsih_check_access_status(ioc, sas_address, handle,
-	    sas_device_pg0.AccessStatus)) {
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
-	}
+	    sas_device_pg0.AccessStatus))
+		goto out_unlock;
+
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	_scsih_ublock_io_device(ioc, sas_address);
+	return;
 
+out_unlock:
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+	if (sas_device)
+		sas_device_put(sas_device);
 }
 
 /**
@@ -5208,7 +5287,6 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
 	u32 ioc_status;
 	__le64 sas_address;
 	u32 device_info;
-	unsigned long flags;
 
 	if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
 	    MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
@@ -5250,14 +5328,13 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
 		return -1;
 	}
 
-
-	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = mpt2sas_scsih_sas_device_get_by_sas_address(ioc,
 	    sas_address);
-	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
-	if (sas_device)
+	if (sas_device) {
+		sas_device_put(sas_device);
 		return 0;
+	}
 
 	sas_device = kzalloc(sizeof(struct _sas_device),
 	    GFP_KERNEL);
@@ -5267,6 +5344,7 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
 		return -1;
 	}
 
+	kref_init(&sas_device->refcount);
 	sas_device->handle = handle;
 	if (_scsih_get_sas_address(ioc, le16_to_cpu
 		(sas_device_pg0.ParentDevHandle),
@@ -5344,7 +5422,6 @@ _scsih_remove_device(struct MPT2SAS_ADAPTER *ioc,
 	    "handle(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
 	    sas_device->handle, (unsigned long long)
 	    sas_device->sas_address));
-	kfree(sas_device);
 }
 /**
  * _scsih_device_remove_by_handle - removing device object by handle
@@ -5363,12 +5440,17 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 		return;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
-	if (sas_device)
-		list_del(&sas_device->list);
+	sas_device = _scsih_sas_device_get_by_handle_nolock(ioc, handle);
+	if (sas_device) {
+		list_del_init(&sas_device->list);
+		sas_device_put(sas_device);
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-	if (sas_device)
+
+	if (sas_device) {
 		_scsih_remove_device(ioc, sas_device);
+		sas_device_put(sas_device);
+	}
 }
 
 /**
@@ -5389,13 +5471,17 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
 		return;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
-	    sas_address);
-	if (sas_device)
-		list_del(&sas_device->list);
+	sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc, sas_address);
+	if (sas_device) {
+		list_del_init(&sas_device->list);
+		sas_device_put(sas_device);
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-	if (sas_device)
+
+	if (sas_device) {
 		_scsih_remove_device(ioc, sas_device);
+		sas_device_put(sas_device);
+	}
 }
 #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
 /**
@@ -5716,26 +5802,28 @@ _scsih_sas_device_status_change_event(struct MPT2SAS_ADAPTER *ioc,
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	sas_address = le64_to_cpu(event_data->SASAddress);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
 	    sas_address);
 
-	if (!sas_device || !sas_device->starget) {
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
-	}
+	if (!sas_device || !sas_device->starget)
+		goto out;
 
 	target_priv_data = sas_device->starget->hostdata;
-	if (!target_priv_data) {
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
-	}
+	if (!target_priv_data)
+		goto out;
 
 	if (event_data->ReasonCode ==
 	    MPI2_EVENT_SAS_DEV_STAT_RC_INTERNAL_DEVICE_RESET)
 		target_priv_data->tm_busy = 1;
 	else
 		target_priv_data->tm_busy = 0;
+
+out:
+	if (sas_device)
+		sas_device_put(sas_device);
+
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
 }
 
 #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
@@ -6123,7 +6211,7 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
 	u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+	sas_device = _scsih_sas_device_get_by_handle_nolock(ioc, handle);
 	if (sas_device) {
 		sas_device->volume_handle = 0;
 		sas_device->volume_wwid = 0;
@@ -6142,6 +6230,8 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
 	/* exposing raid component */
 	if (starget)
 		starget_for_each_device(starget, NULL, _scsih_reprobe_lun);
+
+	sas_device_put(sas_device);
 }
 
 /**
@@ -6170,7 +6260,7 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
 		    &volume_wwid);
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+	sas_device = _scsih_sas_device_get_by_handle_nolock(ioc, handle);
 	if (sas_device) {
 		set_bit(handle, ioc->pd_handles);
 		if (sas_device->starget && sas_device->starget->hostdata) {
@@ -6189,6 +6279,8 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
 	/* hiding raid component */
 	if (starget)
 		starget_for_each_device(starget, (void *)1, _scsih_reprobe_lun);
+
+	sas_device_put(sas_device);
 }
 
 /**
@@ -6221,7 +6313,6 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
     Mpi2EventIrConfigElement_t *element)
 {
 	struct _sas_device *sas_device;
-	unsigned long flags;
 	u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
 	Mpi2ConfigReply_t mpi_reply;
 	Mpi2SasDevicePage0_t sas_device_pg0;
@@ -6231,11 +6322,11 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
 
 	set_bit(handle, ioc->pd_handles);
 
-	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
-	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-	if (sas_device)
+	sas_device = _scsih_sas_device_get_by_handle(ioc, handle);
+	if (sas_device) {
+		sas_device_put(sas_device);
 		return;
+	}
 
 	if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
 	    MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
@@ -6509,7 +6600,6 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
 	u16 handle, parent_handle;
 	u32 state;
 	struct _sas_device *sas_device;
-	unsigned long flags;
 	Mpi2ConfigReply_t mpi_reply;
 	Mpi2SasDevicePage0_t sas_device_pg0;
 	u32 ioc_status;
@@ -6542,12 +6632,11 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
 		if (!ioc->is_warpdrive)
 			set_bit(handle, ioc->pd_handles);
 
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-
-		if (sas_device)
+		sas_device = _scsih_sas_device_get_by_handle(ioc, handle);
+		if (sas_device) {
+			sas_device_put(sas_device);
 			return;
+		}
 
 		if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
 		    &sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
@@ -7179,11 +7268,11 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
 		}
 		phys_disk_num = pd_pg0.PhysDiskNum;
 		handle = le16_to_cpu(pd_pg0.DevHandle);
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		if (sas_device)
+		sas_device = _scsih_sas_device_get_by_handle(ioc, handle);
+		if (sas_device) {
+			sas_device_put(sas_device);
 			continue;
+		}
 		if (mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
 		    &sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
 		    handle) != 0)
@@ -7302,12 +7391,12 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
 		if (!(_scsih_is_end_device(
 		    le32_to_cpu(sas_device_pg0.DeviceInfo))))
 			continue;
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+		sas_device = mpt2sas_scsih_sas_device_get_by_sas_address(ioc,
 		    le64_to_cpu(sas_device_pg0.SASAddress));
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		if (sas_device)
+		if (sas_device) {
+			sas_device_put(sas_device);
 			continue;
+		}
 		parent_handle = le16_to_cpu(sas_device_pg0.ParentDevHandle);
 		if (!_scsih_get_sas_address(ioc, parent_handle, &sas_address)) {
 			printk(MPT2SAS_INFO_FMT "\tBEFORE adding end device: "
diff --git a/drivers/scsi/mpt2sas/mpt2sas_transport.c b/drivers/scsi/mpt2sas/mpt2sas_transport.c
index ff2500a..ebfc827 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_transport.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_transport.c
@@ -1323,15 +1323,17 @@ _transport_get_enclosure_identifier(struct sas_rphy *rphy, u64 *identifier)
 	int rc;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
 	    rphy->identify.sas_address);
 	if (sas_device) {
 		*identifier = sas_device->enclosure_logical_id;
 		rc = 0;
+		sas_device_put(sas_device);
 	} else {
 		*identifier = 0;
 		rc = -ENXIO;
 	}
+
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	return rc;
 }
@@ -1351,12 +1353,14 @@ _transport_get_bay_identifier(struct sas_rphy *rphy)
 	int rc;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
 	    rphy->identify.sas_address);
-	if (sas_device)
+	if (sas_device) {
 		rc = sas_device->slot;
-	else
+		sas_device_put(sas_device);
+	} else {
 		rc = -ENXIO;
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	return rc;
 }
-- 
1.8.1



* [PATCH 3/6] Fix unsafe sas_device_list usage
  2015-05-15  3:41   ` [PATCH 0/6] Fixes for memory corruption in mpt2sas Calvin Owens
  2015-05-15  3:41     ` [PATCH 1/6] Add refcount to sas_device struct Calvin Owens
  2015-05-15  3:41     ` [PATCH 2/6] Refactor code to use new sas_device refcount Calvin Owens
@ 2015-05-15  3:41     ` Calvin Owens
  2015-05-15  3:42     ` [PATCH 4/6] Add refcount to fw_event_work struct Calvin Owens
                       ` (4 subsequent siblings)
  7 siblings, 0 replies; 52+ messages in thread
From: Calvin Owens @ 2015-05-15  3:41 UTC (permalink / raw)
  To: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan
  Cc: MPT-FusionLinux.pdl, linux-scsi, linux-kernel, calvinowens, kernel-team

We cannot iterate over the list without holding a lock for the entire
duration, or we risk corrupting random memory if items are added or
deleted as we iterate.

Refactor the code so that sas_device_lock is always held while iterating
over or otherwise accessing sas_device_list.

Signed-off-by: Calvin Owens <calvinowens@fb.com>
---
 drivers/scsi/mpt2sas/mpt2sas_scsih.c | 83 +++++++++++++++++++++++++++---------
 1 file changed, 62 insertions(+), 21 deletions(-)
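
The pruning approach described above (hold the lock for the whole walk, move the unwanted entries onto a private list, and do the slow teardown only after dropping the lock) can be sketched in user-space C. This is a minimal illustration under stated assumptions, not the driver code: struct node, prune_unresponding() and teardown() are hypothetical names, a pthread mutex stands in for sas_device_lock, and a singly linked list stands in for the kernel's list_head.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a list entry such as struct _sas_device. */
struct node {
	struct node *next;
	int id;
	bool responding;
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Walk the shared list with the lock held for the entire iteration.
 * Non-responding nodes are unlinked and appended to a private list that
 * is returned to the caller; survivors get their flag reset.
 */
static struct node *prune_unresponding(struct node **head)
{
	struct node *pruned = NULL, **tail = &pruned;
	struct node **pp;

	pthread_mutex_lock(&list_lock);
	pp = head;
	while (*pp) {
		struct node *n = *pp;

		if (!n->responding) {
			*pp = n->next;		/* unlink from the shared list */
			n->next = NULL;
			*tail = n;		/* append to the private list */
			tail = &n->next;
		} else {
			n->responding = false;
			pp = &n->next;
		}
	}
	pthread_mutex_unlock(&list_lock);

	return pruned;
}

/*
 * Runs without the lock: the pruned nodes are no longer reachable from
 * the shared list, so nobody else can add or delete around them.
 * Returns how many nodes were torn down.
 */
static int teardown(struct node *pruned)
{
	int count = 0;

	while (pruned) {
		struct node *n = pruned;

		pruned = n->next;
		n->next = NULL;
		count++;		/* stand-in for the real teardown work */
	}
	return count;
}
```

The point of the split is that the teardown step may sleep or take other locks, which is not possible while a spinlock is held, yet the shared list is never walked unlocked.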

diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index ad6ceb7e..9645055 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -7104,6 +7104,7 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
 	struct _raid_device *raid_device, *raid_device_next;
 	struct list_head tmp_list;
 	unsigned long flags;
+	LIST_HEAD(head);
 
 	printk(MPT2SAS_INFO_FMT "removing unresponding devices: start\n",
 	    ioc->name);
@@ -7111,14 +7112,29 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
 	/* removing unresponding end devices */
 	printk(MPT2SAS_INFO_FMT "removing unresponding devices: end-devices\n",
 	    ioc->name);
+
+	/*
+	 * Iterate, pulling off devices marked as non-responding. We become the
+	 * owner for the reference the list had on any object we prune.
+	 */
+	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	list_for_each_entry_safe(sas_device, sas_device_next,
-	    &ioc->sas_device_list, list) {
+			&ioc->sas_device_list, list) {
 		if (!sas_device->responding)
-			mpt2sas_device_remove_by_sas_address(ioc,
-				sas_device->sas_address);
+			list_move_tail(&sas_device->list, &head);
 		else
 			sas_device->responding = 0;
 	}
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+	/*
+	 * Now, uninitialize and remove the unresponding devices we pruned.
+	 */
+	list_for_each_entry_safe(sas_device, sas_device_next, &head, list) {
+		_scsih_remove_device(ioc, sas_device);
+		list_del_init(&sas_device->list);
+		sas_device_put(sas_device);
+	}
 
 	/* removing unresponding volumes */
 	if (ioc->ir_firmware) {
@@ -8055,6 +8071,37 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
 	}
 }
 
+static struct _sas_device *dequeue_next_sas_device(struct MPT2SAS_ADAPTER *ioc)
+{
+	struct _sas_device *sas_device = NULL;
+	unsigned long flags;
+
+	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	if (!list_empty(&ioc->sas_device_init_list)) {
+		sas_device = list_first_entry(&ioc->sas_device_init_list,
+				struct _sas_device, list);
+		list_del_init(&sas_device->list);
+	}
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+	/*
+	 * If an item was dequeued, the caller now owns the reference that was
+	 * previously owned by the list
+	 */
+	return sas_device;
+}
+
+static void sas_device_make_active(struct MPT2SAS_ADAPTER *ioc,
+		struct _sas_device *sas_device)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	sas_device_get(sas_device);
+	list_add_tail(&sas_device->list, &ioc->sas_device_list);
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+}
+
 /**
  * _scsih_probe_sas - reporting sas devices to sas transport
  * @ioc: per adapter object
@@ -8064,34 +8111,28 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
 static void
 _scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
 {
-	struct _sas_device *sas_device, *next;
-	unsigned long flags;
-
-	/* SAS Device List */
-	list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
-	    list) {
+	struct _sas_device *sas_device;
 
-		if (ioc->hide_drives)
-			continue;
+	if (ioc->hide_drives)
+		return;
 
+	while ((sas_device = dequeue_next_sas_device(ioc))) {
 		if (!mpt2sas_transport_port_add(ioc, sas_device->handle,
-		    sas_device->sas_address_parent)) {
-			list_del(&sas_device->list);
-			kfree(sas_device);
+				sas_device->sas_address_parent)) {
+			sas_device_put(sas_device);
 			continue;
 		} else if (!sas_device->starget) {
 			if (!ioc->is_driver_loading) {
 				mpt2sas_transport_port_remove(ioc,
-					sas_device->sas_address,
-					sas_device->sas_address_parent);
-				list_del(&sas_device->list);
-				kfree(sas_device);
+						sas_device->sas_address,
+						sas_device->sas_address_parent);
+				sas_device_put(sas_device);
 				continue;
 			}
 		}
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		list_move_tail(&sas_device->list, &ioc->sas_device_list);
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+		sas_device_make_active(ioc, sas_device);
+		sas_device_put(sas_device);
 	}
 }
 
-- 
1.8.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 4/6] Add refcount to fw_event_work struct
  2015-05-15  3:41   ` [PATCH 0/6] Fixes for memory corruption in mpt2sas Calvin Owens
                       ` (2 preceding siblings ...)
  2015-05-15  3:41     ` [PATCH 3/6] Fix unsafe sas_device_list usage Calvin Owens
@ 2015-05-15  3:42     ` Calvin Owens
  2015-05-15  3:42     ` [PATCH 5/6] Refactor code to use new fw_event refcount Calvin Owens
                       ` (3 subsequent siblings)
  7 siblings, 0 replies; 52+ messages in thread
From: Calvin Owens @ 2015-05-15  3:42 UTC (permalink / raw)
  To: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan
  Cc: MPT-FusionLinux.pdl, linux-scsi, linux-kernel, calvinowens, kernel-team

The fw_event_work struct is concurrently referenced at shutdown, so
add a refcount to protect it.

Signed-off-by: Calvin Owens <calvinowens@fb.com>
---
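For reviewers less familiar with kref, the ownership rule this patch relies
on can be sketched in plain userspace C. This is only an illustration under
stated assumptions, not driver code: struct ref, struct event, event_alloc(),
event_get(), event_put() and events_freed are hypothetical stand-ins for
struct kref, struct fw_event_work and the helpers added below, and C11
atomics stand in for the kernel's kref implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the kernel's struct kref. */
struct ref {
	atomic_int count;
};

/* Hypothetical stand-in for struct fw_event_work. */
struct event {
	struct ref refcount;
	int id;
};

static int events_freed;	/* bumped by the release path, for illustration */

static struct event *event_alloc(int id)
{
	struct event *ev = calloc(1, sizeof(*ev));

	if (!ev)
		return NULL;
	atomic_init(&ev->refcount.count, 1);	/* the creator owns one reference */
	ev->id = id;
	return ev;
}

static void event_get(struct event *ev)
{
	atomic_fetch_add(&ev->refcount.count, 1);
}

/* Drop one reference; returns 1 if this call freed the object. */
static int event_put(struct event *ev)
{
	int old = atomic_fetch_sub(&ev->refcount.count, 1);

	assert(old > 0);	/* a put without a matching get is a bug */
	if (old == 1) {
		free(ev);
		events_freed++;
		return 1;
	}
	return 0;
}
```

The point of the sketch is that the object is freed only by whichever holder
drops the last reference, so no holder needs to know whether others still
exist.
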
 drivers/scsi/mpt2sas/mpt2sas_scsih.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 9645055..611b34d 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -176,9 +176,37 @@ struct fw_event_work {
 	u8			VP_ID;
 	u8			ignore;
 	u16			event;
+	struct kref		refcount;
 	char			event_data[0] __aligned(4);
 };
 
+static void fw_event_work_free(struct kref *r)
+{
+	kfree(container_of(r, struct fw_event_work, refcount));
+}
+
+static void fw_event_work_get(struct fw_event_work *fw_work)
+{
+	kref_get(&fw_work->refcount);
+}
+
+static void fw_event_work_put(struct fw_event_work *fw_work)
+{
+	kref_put(&fw_work->refcount, fw_event_work_free);
+}
+
+static struct fw_event_work *alloc_fw_event_work(int len)
+{
+	struct fw_event_work *fw_event;
+
+	fw_event = kzalloc(sizeof(*fw_event) + len, GFP_ATOMIC);
+	if (!fw_event)
+		return NULL;
+
+	kref_init(&fw_event->refcount);
+	return fw_event;
+}
+
 /* raid transport support */
 static struct raid_template *mpt2sas_raid_template;
 
-- 
1.8.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 5/6] Refactor code to use new fw_event refcount
  2015-05-15  3:41   ` [PATCH 0/6] Fixes for memory corruption in mpt2sas Calvin Owens
                       ` (3 preceding siblings ...)
  2015-05-15  3:42     ` [PATCH 4/6] Add refcount to fw_event_work struct Calvin Owens
@ 2015-05-15  3:42     ` Calvin Owens
  2015-05-15  3:42     ` [PATCH 6/6] Fix unsafe fw_event_list usage Calvin Owens
                       ` (2 subsequent siblings)
  7 siblings, 0 replies; 52+ messages in thread
From: Calvin Owens @ 2015-05-15  3:42 UTC (permalink / raw)
  To: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan
  Cc: MPT-FusionLinux.pdl, linux-scsi, linux-kernel, calvinowens, kernel-team

This refactors the fw_event code to use the new refcount.

Signed-off-by: Calvin Owens <calvinowens@fb.com>
---
 drivers/scsi/mpt2sas/mpt2sas_scsih.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 611b34d..8d8c814 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -2863,6 +2863,7 @@ _scsih_fw_event_add(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work *fw_event)
 		return;
 
 	spin_lock_irqsave(&ioc->fw_event_lock, flags);
+	fw_event_work_get(fw_event);
 	list_add_tail(&fw_event->list, &ioc->fw_event_list);
 	INIT_DELAYED_WORK(&fw_event->delayed_work, _firmware_event_work);
 	queue_delayed_work(ioc->firmware_event_thread,
@@ -2887,12 +2888,13 @@ _scsih_fw_event_free(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work
 	unsigned long flags;
 
 	spin_lock_irqsave(&ioc->fw_event_lock, flags);
-	list_del(&fw_event->list);
-	kfree(fw_event);
+	if (!list_empty(&fw_event->list))
+		list_del_init(&fw_event->list);
+
+	fw_event_work_put(fw_event);
 	spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
 }
 
-
 /**
  * _scsih_error_recovery_delete_devices - remove devices not responding
  * @ioc: per adapter object
@@ -2907,13 +2909,14 @@ _scsih_error_recovery_delete_devices(struct MPT2SAS_ADAPTER *ioc)
 	if (ioc->is_driver_loading)
 		return;
 
-	fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+	fw_event = alloc_fw_event_work(0);
 	if (!fw_event)
 		return;
 
 	fw_event->event = MPT2SAS_REMOVE_UNRESPONDING_DEVICES;
 	fw_event->ioc = ioc;
 	_scsih_fw_event_add(ioc, fw_event);
+	fw_event_work_put(fw_event);
 }
 
 /**
@@ -2927,12 +2930,13 @@ mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc)
 {
 	struct fw_event_work *fw_event;
 
-	fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+	fw_event = alloc_fw_event_work(0);
 	if (!fw_event)
 		return;
 	fw_event->event = MPT2SAS_PORT_ENABLE_COMPLETE;
 	fw_event->ioc = ioc;
 	_scsih_fw_event_add(ioc, fw_event);
+	fw_event_work_put(fw_event);
 }
 
 /**
@@ -4439,13 +4443,14 @@ _scsih_send_event_to_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 {
 	struct fw_event_work *fw_event;
 
-	fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+	fw_event = alloc_fw_event_work(0);
 	if (!fw_event)
 		return;
 	fw_event->event = MPT2SAS_TURN_ON_PFA_LED;
 	fw_event->device_handle = handle;
 	fw_event->ioc = ioc;
 	_scsih_fw_event_add(ioc, fw_event);
+	fw_event_work_put(fw_event);
 }
 
 /**
@@ -7740,7 +7745,7 @@ mpt2sas_scsih_event_callback(struct MPT2SAS_ADAPTER *ioc, u8 msix_index,
 	}
 
 	sz = le16_to_cpu(mpi_reply->EventDataLength) * 4;
-	fw_event = kzalloc(sizeof(*fw_event) + sz, GFP_ATOMIC);
+	fw_event = alloc_fw_event_work(sz);
 	if (!fw_event) {
 		printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
 		    ioc->name, __FILE__, __LINE__, __func__);
@@ -7753,6 +7758,7 @@ mpt2sas_scsih_event_callback(struct MPT2SAS_ADAPTER *ioc, u8 msix_index,
 	fw_event->VP_ID = mpi_reply->VP_ID;
 	fw_event->event = event;
 	_scsih_fw_event_add(ioc, fw_event);
+	fw_event_work_put(fw_event);
 	return;
 }
 
-- 
1.8.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 6/6] Fix unsafe fw_event_list usage
  2015-05-15  3:41   ` [PATCH 0/6] Fixes for memory corruption in mpt2sas Calvin Owens
                       ` (4 preceding siblings ...)
  2015-05-15  3:42     ` [PATCH 5/6] Refactor code to use new fw_event refcount Calvin Owens
@ 2015-05-15  3:42     ` Calvin Owens
  2015-06-09  3:50     ` [RESEND][PATCH 0/6] Fixes for memory corruption in mpt2sas Calvin Owens
  2015-07-02 19:22     ` [PATCH 0/6] " Jens Axboe
  7 siblings, 0 replies; 52+ messages in thread
From: Calvin Owens @ 2015-05-15  3:42 UTC (permalink / raw)
  To: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan
  Cc: MPT-FusionLinux.pdl, linux-scsi, linux-kernel, calvinowens, kernel-team

Since the fw_event deletes itself from the list,
_scsih_fw_event_cleanup_queue() can walk onto garbage pointers or walk
off into freed memory.

This refactors _scsih_fw_event_cleanup_queue() so that it no longer
iterates over the fw_event_list without holding a lock.

Signed-off-by: Calvin Owens <calvinowens@fb.com>
---
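The dequeue-under-lock pattern used here can be sketched in userspace C as
follows. This is a rough illustration, not the driver's code: struct item,
struct queue and the queue_*() helpers are hypothetical names, a singly
linked list replaces list_head, and a C11 atomic_flag spin loop stands in
for ioc->fw_event_lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued fw_event_work item. */
struct item {
	struct item *next;
	int id;
};

/* Hypothetical stand-in for ioc->fw_event_list plus ioc->fw_event_lock. */
struct queue {
	atomic_flag lock;
	struct item *head;
	struct item *tail;
};

static void queue_init(struct queue *q)
{
	atomic_flag_clear(&q->lock);
	q->head = NULL;
	q->tail = NULL;
}

static void queue_lock(struct queue *q)
{
	while (atomic_flag_test_and_set(&q->lock))
		;	/* spin until the holder clears the flag */
}

static void queue_unlock(struct queue *q)
{
	atomic_flag_clear(&q->lock);
}

static void queue_add_tail(struct queue *q, struct item *it)
{
	assert(it != NULL);
	queue_lock(q);
	it->next = NULL;
	if (q->tail)
		q->tail->next = it;
	else
		q->head = it;
	q->tail = it;
	queue_unlock(q);
}

/*
 * Detach and return the first item, or NULL if the queue is empty. The
 * list is only touched while the lock is held, and the caller never keeps
 * a pointer to a neighbouring entry.
 */
static struct item *queue_dequeue(struct queue *q)
{
	struct item *it;

	queue_lock(q);
	it = q->head;
	if (it) {
		q->head = it->next;
		if (!q->head)
			q->tail = NULL;
		it->next = NULL;
	}
	queue_unlock(q);
	return it;
}

/* Drain the queue one item at a time; returns how many items were handled. */
static int queue_drain(struct queue *q)
{
	struct item *it;
	int handled = 0;

	while ((it = queue_dequeue(q))) {
		/* Slow or sleeping work on 'it' would go here, lock not held. */
		handled++;
	}
	return handled;
}
```

Compared with list_for_each_entry_safe(), the loop never caches a "next"
pointer across a point where another thread could have removed that entry.
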
 drivers/scsi/mpt2sas/mpt2sas_scsih.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 8d8c814..f504e28 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -2939,6 +2939,23 @@ mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc)
 	fw_event_work_put(fw_event);
 }
 
+static struct fw_event_work *dequeue_next_fw_event(struct MPT2SAS_ADAPTER *ioc)
+{
+	unsigned long flags;
+	struct fw_event_work *fw_event = NULL;
+
+	spin_lock_irqsave(&ioc->fw_event_lock, flags);
+	if (!list_empty(&ioc->fw_event_list)) {
+		fw_event = list_first_entry(&ioc->fw_event_list,
+				struct fw_event_work, list);
+		list_del_init(&fw_event->list);
+		fw_event_work_get(fw_event);
+	}
+	spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
+
+	return fw_event;
+}
+
 /**
  * _scsih_fw_event_cleanup_queue - cleanup event queue
  * @ioc: per adapter object
@@ -2951,17 +2968,18 @@ mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc)
 static void
 _scsih_fw_event_cleanup_queue(struct MPT2SAS_ADAPTER *ioc)
 {
-	struct fw_event_work *fw_event, *next;
+	struct fw_event_work *fw_event;
 
 	if (list_empty(&ioc->fw_event_list) ||
 	     !ioc->firmware_event_thread || in_interrupt())
 		return;
 
-	list_for_each_entry_safe(fw_event, next, &ioc->fw_event_list, list) {
+	while ((fw_event = dequeue_next_fw_event(ioc))) {
 		if (cancel_delayed_work_sync(&fw_event->delayed_work)) {
 			_scsih_fw_event_free(ioc, fw_event);
 			continue;
 		}
+		fw_event_work_put(fw_event);
 	}
 }
 
-- 
1.8.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [RESEND][PATCH 0/6] Fixes for memory corruption in mpt2sas
  2015-05-15  3:41   ` [PATCH 0/6] Fixes for memory corruption in mpt2sas Calvin Owens
                       ` (5 preceding siblings ...)
  2015-05-15  3:42     ` [PATCH 6/6] Fix unsafe fw_event_list usage Calvin Owens
@ 2015-06-09  3:50     ` Calvin Owens
  2015-06-09  3:50       ` [PATCH 1/6] Add refcount to sas_device struct Calvin Owens
                         ` (7 more replies)
  2015-07-02 19:22     ` [PATCH 0/6] " Jens Axboe
  7 siblings, 8 replies; 52+ messages in thread
From: Calvin Owens @ 2015-06-09  3:50 UTC (permalink / raw)
  To: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan
  Cc: MPT-FusionLinux.pdl, linux-scsi, linux-kernel, calvinowens, kernel-team

Hello all,

This patchset attempts to address problems we've been having with
panics due to memory corruption from the mpt2sas driver.

I will provide a similar set of fixes for mpt3sas, since we see
similar issues there as well. "Porting" this to mpt3sas will be
trivial since the part of the driver I'm touching is nearly identical
between the two, so I thought it would be simpler to review a patch
against mpt2sas alone at first.

I've tested this on a handful of large storage boxes over the past few
weeks; so far it seems to have completely eliminated the memory
corruption panics.

Thanks,
Calvin


Total diffstat:

 drivers/scsi/mpt2sas/mpt2sas_base.h      |  20 +-
 drivers/scsi/mpt2sas/mpt2sas_scsih.c     | 482 +++++++++++++++++++++----------
 drivers/scsi/mpt2sas/mpt2sas_transport.c |  12 +-
 3 files changed, 359 insertions(+), 155 deletions(-)

Patches:

* [PATCH 1/6] Add refcount to sas_device struct
* [PATCH 2/6] Refactor code to use new sas_device refcount
* [PATCH 3/6] Fix unsafe sas_device_list usage
* [PATCH 4/6] Add refcount to fw_event_work struct
* [PATCH 5/6] Refactor code to use new fw_event refcount
* [PATCH 6/6] Fix unsafe fw_event_list usage

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH 1/6] Add refcount to sas_device struct
  2015-06-09  3:50     ` [RESEND][PATCH 0/6] Fixes for memory corruption in mpt2sas Calvin Owens
@ 2015-06-09  3:50       ` Calvin Owens
  2015-07-03 15:24         ` Christoph Hellwig
  2015-06-09  3:50       ` [PATCH 2/6] Refactor code to use new sas_device refcount Calvin Owens
                         ` (6 subsequent siblings)
  7 siblings, 1 reply; 52+ messages in thread
From: Calvin Owens @ 2015-06-09  3:50 UTC (permalink / raw)
  To: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan
  Cc: MPT-FusionLinux.pdl, linux-scsi, linux-kernel, calvinowens, kernel-team

These objects can be referenced concurrently throughout the driver, so we
need a way to make sure threads can't delete them out from under each
other.

Signed-off-by: Calvin Owens <calvinowens@fb.com>
---
 drivers/scsi/mpt2sas/mpt2sas_base.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
index caff8d1..2e7dc33 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.h
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
@@ -376,8 +376,24 @@ struct _sas_device {
 	u8	phy;
 	u8	responding;
 	u8	pfa_led_on;
+	struct kref refcount;
 };
 
+static inline void sas_device_get(struct _sas_device *s)
+{
+	kref_get(&s->refcount);
+}
+
+static inline void sas_device_free(struct kref *r)
+{
+	kfree(container_of(r, struct _sas_device, refcount));
+}
+
+static inline void sas_device_put(struct _sas_device *s)
+{
+	kref_put(&s->refcount, sas_device_free);
+}
+
 /**
  * struct _raid_device - raid volume link list
  * @list: sas device list
-- 
1.8.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 2/6] Refactor code to use new sas_device refcount
  2015-06-09  3:50     ` [RESEND][PATCH 0/6] Fixes for memory corruption in mpt2sas Calvin Owens
  2015-06-09  3:50       ` [PATCH 1/6] Add refcount to sas_device struct Calvin Owens
@ 2015-06-09  3:50       ` Calvin Owens
  2015-07-03 15:38         ` Christoph Hellwig
  2015-06-09  3:50       ` [PATCH 3/6] Fix unsafe sas_device_list usage Calvin Owens
                         ` (5 subsequent siblings)
  7 siblings, 1 reply; 52+ messages in thread
From: Calvin Owens @ 2015-06-09  3:50 UTC (permalink / raw)
  To: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan
  Cc: MPT-FusionLinux.pdl, linux-scsi, linux-kernel, calvinowens, kernel-team

This patch refactors the code in the driver to use the new reference
count on the sas_device struct.

Signed-off-by: Calvin Owens <calvinowens@fb.com>
---
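The calling convention this refactor moves to (a lookup returns a counted
reference, and every caller drops it when done) can be sketched in userspace
C. Everything named here is an assumption made for illustration: struct dev,
struct registry, dev_alloc(), dev_put() and the reg_*() helpers are
hypothetical, a fixed array replaces the driver's lists, and an atomic_flag
spin loop stands in for ioc->sas_device_lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_DEVS 4

/* Hypothetical stand-in for struct _sas_device. */
struct dev {
	atomic_int refcount;
	uint64_t address;
};

/* Hypothetical stand-in for ioc->sas_device_list plus ioc->sas_device_lock. */
struct registry {
	atomic_flag lock;
	struct dev *slots[MAX_DEVS];
};

static void reg_init(struct registry *r)
{
	int i;

	atomic_flag_clear(&r->lock);
	for (i = 0; i < MAX_DEVS; i++)
		r->slots[i] = NULL;
}

static void reg_lock(struct registry *r)
{
	while (atomic_flag_test_and_set(&r->lock))
		;	/* spin until the holder clears the flag */
}

static void reg_unlock(struct registry *r)
{
	atomic_flag_clear(&r->lock);
}

static struct dev *dev_alloc(uint64_t address)
{
	struct dev *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;
	atomic_init(&d->refcount, 1);	/* the creator's reference */
	d->address = address;
	return d;
}

/* Drop one reference; returns 1 if this call freed the object. */
static int dev_put(struct dev *d)
{
	int old = atomic_fetch_sub(&d->refcount, 1);

	assert(old > 0);
	if (old == 1) {
		free(d);
		return 1;
	}
	return 0;
}

/* The registry takes its own reference; returns 0 if there is no free slot. */
static int reg_add(struct registry *r, struct dev *d)
{
	int i, added = 0;

	reg_lock(r);
	for (i = 0; i < MAX_DEVS && !added; i++) {
		if (!r->slots[i]) {
			atomic_fetch_add(&d->refcount, 1);
			r->slots[i] = d;
			added = 1;
		}
	}
	reg_unlock(r);
	return added;
}

/* Look up by address and take a reference before the lock is dropped. */
static struct dev *reg_get_by_address(struct registry *r, uint64_t address)
{
	struct dev *found = NULL;
	int i;

	reg_lock(r);
	for (i = 0; i < MAX_DEVS && !found; i++) {
		if (r->slots[i] && r->slots[i]->address == address) {
			found = r->slots[i];
			atomic_fetch_add(&found->refcount, 1);
		}
	}
	reg_unlock(r);
	return found;
}

/* Drop the registry's reference, but only if the entry is still registered. */
static int reg_remove(struct registry *r, struct dev *d)
{
	int i, removed = 0;

	reg_lock(r);
	for (i = 0; i < MAX_DEVS; i++) {
		if (r->slots[i] == d) {
			r->slots[i] = NULL;
			removed = 1;
		}
	}
	reg_unlock(r);
	if (removed)
		dev_put(d);
	return removed;
}
```

A caller that gets a non-NULL result from reg_get_by_address() can keep using
the object after a concurrent reg_remove(), and a second reg_remove() of the
same object is a harmless no-op. That is the property the
list_empty()/list_del_init() check in _scsih_sas_device_remove() is after.
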
 drivers/scsi/mpt2sas/mpt2sas_base.h      |   4 +-
 drivers/scsi/mpt2sas/mpt2sas_scsih.c     | 329 ++++++++++++++++++++-----------
 drivers/scsi/mpt2sas/mpt2sas_transport.c |  12 +-
 3 files changed, 220 insertions(+), 125 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
index 2e7dc33..dac0e8a 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.h
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
@@ -1111,7 +1111,9 @@ struct _sas_node *mpt2sas_scsih_expander_find_by_handle(struct MPT2SAS_ADAPTER *
     u16 handle);
 struct _sas_node *mpt2sas_scsih_expander_find_by_sas_address(struct MPT2SAS_ADAPTER
     *ioc, u64 sas_address);
-struct _sas_device *mpt2sas_scsih_sas_device_find_by_sas_address(
+struct _sas_device *mpt2sas_scsih_sas_device_get_by_sas_address(
+    struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
+struct _sas_device *mpt2sas_scsih_sas_device_get_by_sas_address_nolock(
     struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
 
 void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc);
diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 3f26147..ad6ceb7e 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -526,8 +526,31 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
 	}
 }
 
+struct _sas_device *
+mpt2sas_scsih_sas_device_get_by_sas_address_nolock(struct MPT2SAS_ADAPTER *ioc,
+    u64 sas_address)
+{
+	struct _sas_device *sas_device;
+
+	assert_spin_locked(&ioc->sas_device_lock);
+
+	list_for_each_entry(sas_device, &ioc->sas_device_list, list)
+		if (sas_device->sas_address == sas_address)
+			goto found_device;
+
+	list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
+		if (sas_device->sas_address == sas_address)
+			goto found_device;
+
+	return NULL;
+
+found_device:
+	sas_device_get(sas_device);
+	return sas_device;
+}
+
 /**
- * mpt2sas_scsih_sas_device_find_by_sas_address - sas device search
+ * mpt2sas_scsih_sas_device_get_by_sas_address - sas device search
  * @ioc: per adapter object
  * @sas_address: sas address
  * Context: Calling function should acquire ioc->sas_device_lock
@@ -536,24 +559,44 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
  * object.
  */
 struct _sas_device *
-mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
+mpt2sas_scsih_sas_device_get_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
     u64 sas_address)
 {
 	struct _sas_device *sas_device;
+	unsigned long flags;
+
+	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
+			sas_address);
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+	return sas_device;
+}
+
+static struct _sas_device *
+_scsih_sas_device_get_by_handle_nolock(struct MPT2SAS_ADAPTER *ioc, u16 handle)
+{
+	struct _sas_device *sas_device;
+
+	assert_spin_locked(&ioc->sas_device_lock);
 
 	list_for_each_entry(sas_device, &ioc->sas_device_list, list)
-		if (sas_device->sas_address == sas_address)
-			return sas_device;
+		if (sas_device->handle == handle)
+			goto found_device;
 
 	list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
-		if (sas_device->sas_address == sas_address)
-			return sas_device;
+		if (sas_device->handle == handle)
+			goto found_device;
 
 	return NULL;
+
+found_device:
+	sas_device_get(sas_device);
+	return sas_device;
 }
 
 /**
- * _scsih_sas_device_find_by_handle - sas device search
+ * _scsih_sas_device_get_by_handle - sas device search
  * @ioc: per adapter object
  * @handle: sas device handle (assigned by firmware)
  * Context: Calling function should acquire ioc->sas_device_lock
@@ -562,19 +605,16 @@ mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
  * object.
  */
 static struct _sas_device *
-_scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
+_scsih_sas_device_get_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 {
 	struct _sas_device *sas_device;
+	unsigned long flags;
 
-	list_for_each_entry(sas_device, &ioc->sas_device_list, list)
-		if (sas_device->handle == handle)
-			return sas_device;
-
-	list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
-		if (sas_device->handle == handle)
-			return sas_device;
+	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	sas_device = _scsih_sas_device_get_by_handle_nolock(ioc, handle);
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
-	return NULL;
+	return sas_device;
 }
 
 /**
@@ -583,7 +623,7 @@ _scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
  * @sas_device: the sas_device object
  * Context: This function will acquire ioc->sas_device_lock.
  *
- * Removing object and freeing associated memory from the ioc->sas_device_list.
+ * If sas_device is on the list, remove it and decrement its reference count.
  */
 static void
 _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
@@ -594,9 +634,15 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
 	if (!sas_device)
 		return;
 
+	/*
+	 * The lock serializes access to the list, but we still need to verify
+	 * that nobody removed the entry while we were waiting on the lock.
+	 */
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	list_del(&sas_device->list);
-	kfree(sas_device);
+	if (!list_empty(&sas_device->list)) {
+		list_del_init(&sas_device->list);
+		sas_device_put(sas_device);
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 }
 
@@ -620,6 +666,7 @@ _scsih_sas_device_add(struct MPT2SAS_ADAPTER *ioc,
 	    sas_device->handle, (unsigned long long)sas_device->sas_address));
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	sas_device_get(sas_device);
 	list_add_tail(&sas_device->list, &ioc->sas_device_list);
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
@@ -659,6 +706,7 @@ _scsih_sas_device_init_add(struct MPT2SAS_ADAPTER *ioc,
 	    sas_device->handle, (unsigned long long)sas_device->sas_address));
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	sas_device_get(sas_device);
 	list_add_tail(&sas_device->list, &ioc->sas_device_init_list);
 	_scsih_determine_boot_device(ioc, sas_device, 0);
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
@@ -1208,12 +1256,15 @@ _scsih_change_queue_depth(struct scsi_device *sdev, int qdepth)
 		goto not_sata;
 	if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))
 		goto not_sata;
+
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
 	   sas_device_priv_data->sas_target->sas_address);
-	if (sas_device && sas_device->device_info &
-	    MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
-		max_depth = MPT2SAS_SATA_QUEUE_DEPTH;
+	if (sas_device) {
+		if (sas_device->device_info & MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
+			max_depth = MPT2SAS_SATA_QUEUE_DEPTH;
+		sas_device_put(sas_device);
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
  not_sata:
@@ -1271,7 +1322,7 @@ _scsih_target_alloc(struct scsi_target *starget)
 	/* sas/sata devices */
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	rphy = dev_to_rphy(starget->dev.parent);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
 	   rphy->identify.sas_address);
 
 	if (sas_device) {
@@ -1283,6 +1334,8 @@ _scsih_target_alloc(struct scsi_target *starget)
 		if (test_bit(sas_device->handle, ioc->pd_handles))
 			sas_target_priv_data->flags |=
 			    MPT_TARGET_FLAGS_RAID_COMPONENT;
+
+		sas_device_put(sas_device);
 	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
@@ -1324,13 +1377,15 @@ _scsih_target_destroy(struct scsi_target *starget)
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	rphy = dev_to_rphy(starget->dev.parent);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
 	   rphy->identify.sas_address);
 	if (sas_device && (sas_device->starget == starget) &&
 	    (sas_device->id == starget->id) &&
 	    (sas_device->channel == starget->channel))
 		sas_device->starget = NULL;
 
+	if (sas_device)
+		sas_device_put(sas_device);
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
  out:
@@ -1386,7 +1441,7 @@ _scsih_slave_alloc(struct scsi_device *sdev)
 
 	if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
 		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+		sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
 				sas_target_priv_data->sas_address);
 		if (sas_device && (sas_device->starget == NULL)) {
 			sdev_printk(KERN_INFO, sdev,
@@ -1394,6 +1449,10 @@ _scsih_slave_alloc(struct scsi_device *sdev)
 			     __func__, __LINE__);
 			sas_device->starget = starget;
 		}
+
+		if (sas_device)
+			sas_device_put(sas_device);
+
 		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	}
 
@@ -1428,10 +1487,13 @@ _scsih_slave_destroy(struct scsi_device *sdev)
 
 	if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
 		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+		sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
 		   sas_target_priv_data->sas_address);
 		if (sas_device && !sas_target_priv_data->num_luns)
 			sas_device->starget = NULL;
+
+		if (sas_device)
+			sas_device_put(sas_device);
 		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	}
 
@@ -2078,7 +2140,7 @@ _scsih_slave_configure(struct scsi_device *sdev)
 	}
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
 	   sas_device_priv_data->sas_target->sas_address);
 	if (!sas_device) {
 		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
@@ -2116,13 +2178,14 @@ _scsih_slave_configure(struct scsi_device *sdev)
 	if (!ssp_target)
 		_scsih_display_sata_capabilities(ioc, handle, sdev);
 
-
 	_scsih_change_queue_depth(sdev, qdepth);
 
 	if (ssp_target) {
 		sas_read_port_mode_page(sdev);
 		_scsih_enable_tlr(ioc, sdev);
 	}
+
+	sas_device_put(sas_device);
 	return 0;
 }
 
@@ -2509,7 +2572,7 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
 		    device_str, (unsigned long long)priv_target->sas_address);
 	} else {
 		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+		sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
 		    priv_target->sas_address);
 		if (sas_device) {
 			if (priv_target->flags &
@@ -2529,6 +2592,8 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
 			    "enclosure_logical_id(0x%016llx), slot(%d)\n",
 			   (unsigned long long)sas_device->enclosure_logical_id,
 			    sas_device->slot);
+
+			sas_device_put(sas_device);
 		}
 		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	}
@@ -2604,8 +2669,7 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
 {
 	struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
 	struct MPT2SAS_DEVICE *sas_device_priv_data;
-	struct _sas_device *sas_device;
-	unsigned long flags;
+	struct _sas_device *sas_device = NULL;
 	u16	handle;
 	int r;
 
@@ -2629,12 +2693,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
 	handle = 0;
 	if (sas_device_priv_data->sas_target->flags &
 	    MPT_TARGET_FLAGS_RAID_COMPONENT) {
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = _scsih_sas_device_find_by_handle(ioc,
+		sas_device = _scsih_sas_device_get_by_handle(ioc,
 		   sas_device_priv_data->sas_target->handle);
 		if (sas_device)
 			handle = sas_device->volume_handle;
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	} else
 		handle = sas_device_priv_data->sas_target->handle;
 
@@ -2651,6 +2713,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
  out:
 	sdev_printk(KERN_INFO, scmd->device, "device reset: %s scmd(%p)\n",
 	    ((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
+
+	if (sas_device)
+		sas_device_put(sas_device);
+
 	return r;
 }
 
@@ -2665,8 +2731,7 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
 {
 	struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
 	struct MPT2SAS_DEVICE *sas_device_priv_data;
-	struct _sas_device *sas_device;
-	unsigned long flags;
+	struct _sas_device *sas_device = NULL;
 	u16	handle;
 	int r;
 	struct scsi_target *starget = scmd->device->sdev_target;
@@ -2689,12 +2754,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
 	handle = 0;
 	if (sas_device_priv_data->sas_target->flags &
 	    MPT_TARGET_FLAGS_RAID_COMPONENT) {
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = _scsih_sas_device_find_by_handle(ioc,
+		sas_device = _scsih_sas_device_get_by_handle(ioc,
 		   sas_device_priv_data->sas_target->handle);
 		if (sas_device)
 			handle = sas_device->volume_handle;
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	} else
 		handle = sas_device_priv_data->sas_target->handle;
 
@@ -2711,6 +2774,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
  out:
 	starget_printk(KERN_INFO, starget, "target reset: %s scmd(%p)\n",
 	    ((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
+
+	if (sas_device)
+		sas_device_put(sas_device);
+
 	return r;
 }
 
@@ -3002,15 +3069,15 @@ _scsih_block_io_to_children_attached_to_ex(struct MPT2SAS_ADAPTER *ioc,
 
 	list_for_each_entry(mpt2sas_port,
 	   &sas_expander->sas_port_list, port_list) {
-		if (mpt2sas_port->remote_identify.device_type ==
-		    SAS_END_DEVICE) {
+		if (mpt2sas_port->remote_identify.device_type == SAS_END_DEVICE) {
 			spin_lock_irqsave(&ioc->sas_device_lock, flags);
-			sas_device =
-			    mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
-			   mpt2sas_port->remote_identify.sas_address);
-			if (sas_device)
+			sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
+					mpt2sas_port->remote_identify.sas_address);
+			if (sas_device) {
 				set_bit(sas_device->handle,
-				    ioc->blocking_handles);
+						ioc->blocking_handles);
+				sas_device_put(sas_device);
+			}
 			spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 		}
 	}
@@ -3080,7 +3147,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 {
 	Mpi2SCSITaskManagementRequest_t *mpi_request;
 	u16 smid;
-	struct _sas_device *sas_device;
+	struct _sas_device *sas_device = NULL;
 	struct MPT2SAS_TARGET *sas_target_priv_data = NULL;
 	u64 sas_address = 0;
 	unsigned long flags;
@@ -3110,7 +3177,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 		return;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+	sas_device = _scsih_sas_device_get_by_handle_nolock(ioc, handle);
 	if (sas_device && sas_device->starget &&
 	     sas_device->starget->hostdata) {
 		sas_target_priv_data = sas_device->starget->hostdata;
@@ -3131,14 +3198,14 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	if (!smid) {
 		delayed_tr = kzalloc(sizeof(*delayed_tr), GFP_ATOMIC);
 		if (!delayed_tr)
-			return;
+			goto out;
 		INIT_LIST_HEAD(&delayed_tr->list);
 		delayed_tr->handle = handle;
 		list_add_tail(&delayed_tr->list, &ioc->delayed_tr_list);
 		dewtprintk(ioc, printk(MPT2SAS_INFO_FMT
 		    "DELAYED:tr:handle(0x%04x), (open)\n",
 		    ioc->name, handle));
-		return;
+		goto out;
 	}
 
 	dewtprintk(ioc, printk(MPT2SAS_INFO_FMT "tr_send:handle(0x%04x), "
@@ -3150,6 +3217,9 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	mpi_request->DevHandle = cpu_to_le16(handle);
 	mpi_request->TaskType = MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET;
 	mpt2sas_base_put_smid_hi_priority(ioc, smid);
+out:
+	if (sas_device)
+		sas_device_put(sas_device);
 }
 
 
@@ -4068,7 +4138,6 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
 	char *desc_scsi_state = ioc->tmp_string;
 	u32 log_info = le32_to_cpu(mpi_reply->IOCLogInfo);
 	struct _sas_device *sas_device = NULL;
-	unsigned long flags;
 	struct scsi_target *starget = scmd->device->sdev_target;
 	struct MPT2SAS_TARGET *priv_target = starget->hostdata;
 	char *device_str = NULL;
@@ -4200,8 +4269,7 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
 		printk(MPT2SAS_WARN_FMT "\t%s wwid(0x%016llx)\n", ioc->name,
 		    device_str, (unsigned long long)priv_target->sas_address);
 	} else {
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+		sas_device = mpt2sas_scsih_sas_device_get_by_sas_address(ioc,
 		    priv_target->sas_address);
 		if (sas_device) {
 			printk(MPT2SAS_WARN_FMT "\tsas_address(0x%016llx), "
@@ -4211,8 +4279,9 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
 			    "\tenclosure_logical_id(0x%016llx), slot(%d)\n",
 			    ioc->name, sas_device->enclosure_logical_id,
 			    sas_device->slot);
+
+			sas_device_put(sas_device);
 		}
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	}
 
 	printk(MPT2SAS_WARN_FMT "\thandle(0x%04x), ioc_status(%s)(0x%04x), "
@@ -4259,7 +4328,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	Mpi2SepRequest_t mpi_request;
 	struct _sas_device *sas_device;
 
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+	sas_device = _scsih_sas_device_get_by_handle(ioc, handle);
 	if (!sas_device)
 		return;
 
@@ -4274,7 +4343,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	    &mpi_request)) != 0) {
 		printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n", ioc->name,
 		__FILE__, __LINE__, __func__);
-		return;
+		goto out;
 	}
 	sas_device->pfa_led_on = 1;
 
@@ -4284,8 +4353,10 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 		 "enclosure_processor: ioc_status (0x%04x), loginfo(0x%08x)\n",
 		 ioc->name, le16_to_cpu(mpi_reply.IOCStatus),
 		 le32_to_cpu(mpi_reply.IOCLogInfo)));
-		return;
+		goto out;
 	}
+out:
+	sas_device_put(sas_device);
 }
 
 /**
@@ -4370,19 +4441,17 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 
 	/* only handle non-raid devices */
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+	sas_device = _scsih_sas_device_get_by_handle_nolock(ioc, handle);
 	if (!sas_device) {
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
+		goto out_unlock;
 	}
 	starget = sas_device->starget;
 	sas_target_priv_data = starget->hostdata;
 
 	if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_RAID_COMPONENT) ||
-	   ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))) {
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
-	}
+	   ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)))
+		goto out_unlock;
+
 	starget_printk(KERN_WARNING, starget, "predicted fault\n");
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
@@ -4396,7 +4465,7 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	if (!event_reply) {
 		printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
 		    ioc->name, __FILE__, __LINE__, __func__);
-		return;
+		goto out;
 	}
 
 	event_reply->Function = MPI2_FUNCTION_EVENT_NOTIFICATION;
@@ -4413,6 +4482,14 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	event_data->SASAddress = cpu_to_le64(sas_target_priv_data->sas_address);
 	mpt2sas_ctl_add_to_event_log(ioc, event_reply);
 	kfree(event_reply);
+out:
+	if (sas_device)
+		sas_device_put(sas_device);
+	return;
+
+out_unlock:
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+	goto out;
 }
 
 /**
@@ -5148,14 +5225,13 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	sas_address = le64_to_cpu(sas_device_pg0.SASAddress);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
 	    sas_address);
 
 	if (!sas_device) {
 		printk(MPT2SAS_ERR_FMT "device is not present "
 		    "handle(0x%04x), no sas_device!!!\n", ioc->name, handle);
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
+		goto out_unlock;
 	}
 
 	if (unlikely(sas_device->handle != handle)) {
@@ -5172,19 +5248,22 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	    MPI2_SAS_DEVICE0_FLAGS_DEVICE_PRESENT)) {
 		printk(MPT2SAS_ERR_FMT "device is not present "
 		    "handle(0x%04x), flags!!!\n", ioc->name, handle);
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
+		goto out_unlock;
 	}
 
 	/* check if there were any issues with discovery */
 	if (_scsih_check_access_status(ioc, sas_address, handle,
-	    sas_device_pg0.AccessStatus)) {
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
-	}
+	    sas_device_pg0.AccessStatus))
+		goto out_unlock;
+
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	_scsih_ublock_io_device(ioc, sas_address);
+	return;
 
+out_unlock:
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+	if (sas_device)
+		sas_device_put(sas_device);
 }
 
 /**
@@ -5208,7 +5287,6 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
 	u32 ioc_status;
 	__le64 sas_address;
 	u32 device_info;
-	unsigned long flags;
 
 	if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
 	    MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
@@ -5250,14 +5328,13 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
 		return -1;
 	}
 
-
-	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = mpt2sas_scsih_sas_device_get_by_sas_address(ioc,
 	    sas_address);
-	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
-	if (sas_device)
+	if (sas_device) {
+		sas_device_put(sas_device);
 		return 0;
+	}
 
 	sas_device = kzalloc(sizeof(struct _sas_device),
 	    GFP_KERNEL);
@@ -5267,6 +5344,7 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
 		return -1;
 	}
 
+	kref_init(&sas_device->refcount);
 	sas_device->handle = handle;
 	if (_scsih_get_sas_address(ioc, le16_to_cpu
 		(sas_device_pg0.ParentDevHandle),
@@ -5344,7 +5422,6 @@ _scsih_remove_device(struct MPT2SAS_ADAPTER *ioc,
 	    "handle(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
 	    sas_device->handle, (unsigned long long)
 	    sas_device->sas_address));
-	kfree(sas_device);
 }
 /**
  * _scsih_device_remove_by_handle - removing device object by handle
@@ -5363,12 +5440,17 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 		return;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
-	if (sas_device)
-		list_del(&sas_device->list);
+	sas_device = _scsih_sas_device_get_by_handle_nolock(ioc, handle);
+	if (sas_device) {
+		list_del_init(&sas_device->list);
+		sas_device_put(sas_device);
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-	if (sas_device)
+
+	if (sas_device) {
 		_scsih_remove_device(ioc, sas_device);
+		sas_device_put(sas_device);
+	}
 }
 
 /**
@@ -5389,13 +5471,17 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
 		return;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
-	    sas_address);
-	if (sas_device)
-		list_del(&sas_device->list);
+	sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc, sas_address);
+	if (sas_device) {
+		list_del_init(&sas_device->list);
+		sas_device_put(sas_device);
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-	if (sas_device)
+
+	if (sas_device) {
 		_scsih_remove_device(ioc, sas_device);
+		sas_device_put(sas_device);
+	}
 }
 #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
 /**
@@ -5716,26 +5802,28 @@ _scsih_sas_device_status_change_event(struct MPT2SAS_ADAPTER *ioc,
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	sas_address = le64_to_cpu(event_data->SASAddress);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
 	    sas_address);
 
-	if (!sas_device || !sas_device->starget) {
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
-	}
+	if (!sas_device || !sas_device->starget)
+		goto out;
 
 	target_priv_data = sas_device->starget->hostdata;
-	if (!target_priv_data) {
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
-	}
+	if (!target_priv_data)
+		goto out;
 
 	if (event_data->ReasonCode ==
 	    MPI2_EVENT_SAS_DEV_STAT_RC_INTERNAL_DEVICE_RESET)
 		target_priv_data->tm_busy = 1;
 	else
 		target_priv_data->tm_busy = 0;
+
+out:
+	if (sas_device)
+		sas_device_put(sas_device);
+
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
 }
 
 #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
@@ -6123,7 +6211,7 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
 	u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+	sas_device = _scsih_sas_device_get_by_handle_nolock(ioc, handle);
 	if (sas_device) {
 		sas_device->volume_handle = 0;
 		sas_device->volume_wwid = 0;
@@ -6142,6 +6230,8 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
 	/* exposing raid component */
 	if (starget)
 		starget_for_each_device(starget, NULL, _scsih_reprobe_lun);
+
+	sas_device_put(sas_device);
 }
 
 /**
@@ -6170,7 +6260,7 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
 		    &volume_wwid);
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+	sas_device = _scsih_sas_device_get_by_handle_nolock(ioc, handle);
 	if (sas_device) {
 		set_bit(handle, ioc->pd_handles);
 		if (sas_device->starget && sas_device->starget->hostdata) {
@@ -6189,6 +6279,8 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
 	/* hiding raid component */
 	if (starget)
 		starget_for_each_device(starget, (void *)1, _scsih_reprobe_lun);
+
+	sas_device_put(sas_device);
 }
 
 /**
@@ -6221,7 +6313,6 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
     Mpi2EventIrConfigElement_t *element)
 {
 	struct _sas_device *sas_device;
-	unsigned long flags;
 	u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
 	Mpi2ConfigReply_t mpi_reply;
 	Mpi2SasDevicePage0_t sas_device_pg0;
@@ -6231,11 +6322,11 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
 
 	set_bit(handle, ioc->pd_handles);
 
-	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
-	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-	if (sas_device)
+	sas_device = _scsih_sas_device_get_by_handle(ioc, handle);
+	if (sas_device) {
+		sas_device_put(sas_device);
 		return;
+	}
 
 	if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
 	    MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
@@ -6509,7 +6600,6 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
 	u16 handle, parent_handle;
 	u32 state;
 	struct _sas_device *sas_device;
-	unsigned long flags;
 	Mpi2ConfigReply_t mpi_reply;
 	Mpi2SasDevicePage0_t sas_device_pg0;
 	u32 ioc_status;
@@ -6542,12 +6632,11 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
 		if (!ioc->is_warpdrive)
 			set_bit(handle, ioc->pd_handles);
 
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-
-		if (sas_device)
+		sas_device = _scsih_sas_device_get_by_handle(ioc, handle);
+		if (sas_device) {
+			sas_device_put(sas_device);
 			return;
+		}
 
 		if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
 		    &sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
@@ -7179,11 +7268,11 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
 		}
 		phys_disk_num = pd_pg0.PhysDiskNum;
 		handle = le16_to_cpu(pd_pg0.DevHandle);
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		if (sas_device)
+		sas_device = _scsih_sas_device_get_by_handle(ioc, handle);
+		if (sas_device) {
+			sas_device_put(sas_device);
 			continue;
+		}
 		if (mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
 		    &sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
 		    handle) != 0)
@@ -7302,12 +7391,12 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
 		if (!(_scsih_is_end_device(
 		    le32_to_cpu(sas_device_pg0.DeviceInfo))))
 			continue;
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+		sas_device = mpt2sas_scsih_sas_device_get_by_sas_address(ioc,
 		    le64_to_cpu(sas_device_pg0.SASAddress));
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		if (sas_device)
+		if (sas_device) {
+			sas_device_put(sas_device);
 			continue;
+		}
 		parent_handle = le16_to_cpu(sas_device_pg0.ParentDevHandle);
 		if (!_scsih_get_sas_address(ioc, parent_handle, &sas_address)) {
 			printk(MPT2SAS_INFO_FMT "\tBEFORE adding end device: "
diff --git a/drivers/scsi/mpt2sas/mpt2sas_transport.c b/drivers/scsi/mpt2sas/mpt2sas_transport.c
index ff2500a..ebfc827 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_transport.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_transport.c
@@ -1323,15 +1323,17 @@ _transport_get_enclosure_identifier(struct sas_rphy *rphy, u64 *identifier)
 	int rc;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
 	    rphy->identify.sas_address);
 	if (sas_device) {
 		*identifier = sas_device->enclosure_logical_id;
 		rc = 0;
+		sas_device_put(sas_device);
 	} else {
 		*identifier = 0;
 		rc = -ENXIO;
 	}
+
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	return rc;
 }
@@ -1351,12 +1353,14 @@ _transport_get_bay_identifier(struct sas_rphy *rphy)
 	int rc;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
 	    rphy->identify.sas_address);
-	if (sas_device)
+	if (sas_device) {
 		rc = sas_device->slot;
-	else
+		sas_device_put(sas_device);
+	} else {
 		rc = -ENXIO;
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	return rc;
 }
-- 
1.8.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 3/6] Fix unsafe sas_device_list usage
  2015-06-09  3:50     ` [RESEND][PATCH 0/6] Fixes for memory corruption in mpt2sas Calvin Owens
  2015-06-09  3:50       ` [PATCH 1/6] Add refcount to sas_device struct Calvin Owens
  2015-06-09  3:50       ` [PATCH 2/6] Refactor code to use new sas_device refcount Calvin Owens
@ 2015-06-09  3:50       ` Calvin Owens
  2015-07-03 16:03         ` Christoph Hellwig
  2015-06-09  3:50       ` [PATCH 4/6] Add refcount to fw_event_work struct Calvin Owens
                         ` (4 subsequent siblings)
  7 siblings, 1 reply; 52+ messages in thread
From: Calvin Owens @ 2015-06-09  3:50 UTC (permalink / raw)
  To: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan
  Cc: MPT-FusionLinux.pdl, linux-scsi, linux-kernel, calvinowens, kernel-team

We cannot iterate over the list without holding a lock for the entire
duration, or we risk corrupting random memory if items are added or
deleted as we iterate.

This refactors code such that it always holds the lock when iterating
on or accessing the sas_device_list.
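
For readers who want the pattern in isolation, here is a minimal userspace
sketch of "prune under the lock, tear down afterwards". It is not the driver
code: the list helpers, struct dev, dev_lock and prune_unresponding() are
hypothetical stand-ins, and a pthread mutex is assumed in place of the
spinlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct node { struct node *prev, *next; };

static void node_init(struct node *n) { n->prev = n->next = n; }

static void node_del_init(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_init(n);
}

static void node_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Hypothetical device record; "responding" mirrors the flag in the patch. */
struct dev {
	struct node link;	/* kept first so a node pointer is a dev pointer */
	int responding;
};

static pthread_mutex_t dev_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Move every non-responding device from "active" to "pruned" while holding
 * the lock, and clear the flag on the rest. Returns the number pruned.
 */
static int prune_unresponding(struct node *active, struct node *pruned)
{
	struct node *pos, *next;
	int count = 0;

	pthread_mutex_lock(&dev_lock);
	for (pos = active->next; pos != active; pos = next) {
		struct dev *d = (struct dev *)pos;

		next = pos->next;	/* saved before pos may be unlinked */
		if (!d->responding) {
			node_del_init(pos);
			node_add_tail(pos, pruned);
			count++;
		} else {
			d->responding = 0;
		}
	}
	pthread_mutex_unlock(&dev_lock);
	return count;
}
```

Once the lock is dropped, nothing else can reach the entries on the private
list through the shared one, so the slow teardown can run without the lock.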

Signed-off-by: Calvin Owens <calvinowens@fb.com>
---
 drivers/scsi/mpt2sas/mpt2sas_scsih.c | 83 +++++++++++++++++++++++++++---------
 1 file changed, 62 insertions(+), 21 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index ad6ceb7e..9645055 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -7104,6 +7104,7 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
 	struct _raid_device *raid_device, *raid_device_next;
 	struct list_head tmp_list;
 	unsigned long flags;
+	LIST_HEAD(head);
 
 	printk(MPT2SAS_INFO_FMT "removing unresponding devices: start\n",
 	    ioc->name);
@@ -7111,14 +7112,29 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
 	/* removing unresponding end devices */
 	printk(MPT2SAS_INFO_FMT "removing unresponding devices: end-devices\n",
 	    ioc->name);
+
+	/*
+	 * Iterate, pulling off devices marked as non-responding. We become the
+	 * owner for the reference the list had on any object we prune.
+	 */
+	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	list_for_each_entry_safe(sas_device, sas_device_next,
-	    &ioc->sas_device_list, list) {
+			&ioc->sas_device_list, list) {
 		if (!sas_device->responding)
-			mpt2sas_device_remove_by_sas_address(ioc,
-				sas_device->sas_address);
+			list_move_tail(&sas_device->list, &head);
 		else
 			sas_device->responding = 0;
 	}
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+	/*
+	 * Now, uninitialize and remove the unresponding devices we pruned.
+	 */
+	list_for_each_entry_safe(sas_device, sas_device_next, &head, list) {
+		_scsih_remove_device(ioc, sas_device);
+		list_del_init(&sas_device->list);
+		sas_device_put(sas_device);
+	}
 
 	/* removing unresponding volumes */
 	if (ioc->ir_firmware) {
@@ -8055,6 +8071,37 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
 	}
 }
 
+static struct _sas_device *dequeue_next_sas_device(struct MPT2SAS_ADAPTER *ioc)
+{
+	struct _sas_device *sas_device = NULL;
+	unsigned long flags;
+
+	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	if (!list_empty(&ioc->sas_device_init_list)) {
+		sas_device = list_first_entry(&ioc->sas_device_init_list,
+				struct _sas_device, list);
+		list_del_init(&sas_device->list);
+	}
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+	/*
+	 * If an item was dequeued, the caller now owns the reference that was
+	 * previously owned by the list
+	 */
+	return sas_device;
+}
+
+static void sas_device_make_active(struct MPT2SAS_ADAPTER *ioc,
+		struct _sas_device *sas_device)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	sas_device_get(sas_device);
+	list_add_tail(&sas_device->list, &ioc->sas_device_list);
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+}
+
 /**
  * _scsih_probe_sas - reporting sas devices to sas transport
  * @ioc: per adapter object
@@ -8064,34 +8111,28 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
 static void
 _scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
 {
-	struct _sas_device *sas_device, *next;
-	unsigned long flags;
-
-	/* SAS Device List */
-	list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
-	    list) {
+	struct _sas_device *sas_device;
 
-		if (ioc->hide_drives)
-			continue;
+	if (ioc->hide_drives)
+		return;
 
+	while ((sas_device = dequeue_next_sas_device(ioc))) {
 		if (!mpt2sas_transport_port_add(ioc, sas_device->handle,
-		    sas_device->sas_address_parent)) {
-			list_del(&sas_device->list);
-			kfree(sas_device);
+				sas_device->sas_address_parent)) {
+			sas_device_put(sas_device);
 			continue;
 		} else if (!sas_device->starget) {
 			if (!ioc->is_driver_loading) {
 				mpt2sas_transport_port_remove(ioc,
-					sas_device->sas_address,
-					sas_device->sas_address_parent);
-				list_del(&sas_device->list);
-				kfree(sas_device);
+						sas_device->sas_address,
+						sas_device->sas_address_parent);
+				sas_device_put(sas_device);
 				continue;
 			}
 		}
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		list_move_tail(&sas_device->list, &ioc->sas_device_list);
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+		sas_device_make_active(ioc, sas_device);
+		sas_device_put(sas_device);
 	}
 }
 
-- 
1.8.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 4/6] Add refcount to fw_event_work struct
  2015-06-09  3:50     ` [RESEND][PATCH 0/6] Fixes for memory corruption in mpt2sas Calvin Owens
                         ` (2 preceding siblings ...)
  2015-06-09  3:50       ` [PATCH 3/6] Fix unsafe sas_device_list usage Calvin Owens
@ 2015-06-09  3:50       ` Calvin Owens
  2015-07-03 15:38         ` Christoph Hellwig
  2015-06-09  3:50       ` [PATCH 5/6] Refactor code to use new fw_event refcount Calvin Owens
                         ` (3 subsequent siblings)
  7 siblings, 1 reply; 52+ messages in thread
From: Calvin Owens @ 2015-06-09  3:50 UTC (permalink / raw)
  To: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan
  Cc: MPT-FusionLinux.pdl, linux-scsi, linux-kernel, calvinowens, kernel-team

The fw_event_work struct is concurrently referenced at shutdown, so
add a refcount to protect it.
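
As a rough illustration of the get/put discipline (not the kernel's kref
implementation), the idea can be sketched in userspace with C11 atomics.
struct event, event_alloc(), event_get(), event_put() and the events_freed
counter are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical event object; only the refcount matters for this sketch. */
struct event {
	atomic_int refcount;
	int id;
};

static int events_freed;	/* bookkeeping for the example only */

/* A new object starts with one reference, owned by the caller. */
static struct event *event_alloc(int id)
{
	struct event *ev = calloc(1, sizeof(*ev));

	if (!ev)
		return NULL;
	atomic_init(&ev->refcount, 1);
	ev->id = id;
	return ev;
}

/* Take an extra reference, e.g. on behalf of a list that stores the object. */
static void event_get(struct event *ev)
{
	atomic_fetch_add(&ev->refcount, 1);
}

/* Drop one reference; whoever drops the last one frees the object. */
static void event_put(struct event *ev)
{
	if (atomic_fetch_sub(&ev->refcount, 1) == 1) {
		events_freed++;
		free(ev);
	}
}
```

The point is that no path frees the object directly: each holder drops its
own reference, and the memory goes away only after the last holder is done.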

Signed-off-by: Calvin Owens <calvinowens@fb.com>
---
 drivers/scsi/mpt2sas/mpt2sas_scsih.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 9645055..611b34d 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -176,9 +176,37 @@ struct fw_event_work {
 	u8			VP_ID;
 	u8			ignore;
 	u16			event;
+	struct kref		refcount;
 	char			event_data[0] __aligned(4);
 };
 
+static void fw_event_work_free(struct kref *r)
+{
+	kfree(container_of(r, struct fw_event_work, refcount));
+}
+
+static void fw_event_work_get(struct fw_event_work *fw_work)
+{
+	kref_get(&fw_work->refcount);
+}
+
+static void fw_event_work_put(struct fw_event_work *fw_work)
+{
+	kref_put(&fw_work->refcount, fw_event_work_free);
+}
+
+static struct fw_event_work *alloc_fw_event_work(int len)
+{
+	struct fw_event_work *fw_event;
+
+	fw_event = kzalloc(sizeof(*fw_event) + len, GFP_ATOMIC);
+	if (!fw_event)
+		return NULL;
+
+	kref_init(&fw_event->refcount);
+	return fw_event;
+}
+
 /* raid transport support */
 static struct raid_template *mpt2sas_raid_template;
 
-- 
1.8.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 5/6] Refactor code to use new fw_event refcount
  2015-06-09  3:50     ` [RESEND][PATCH 0/6] Fixes for memory corruption in mpt2sas Calvin Owens
                         ` (3 preceding siblings ...)
  2015-06-09  3:50       ` [PATCH 4/6] Add refcount to fw_event_work struct Calvin Owens
@ 2015-06-09  3:50       ` Calvin Owens
  2015-07-03 16:00         ` Christoph Hellwig
  2015-06-09  3:50       ` [PATCH 6/6] Fix unsafe fw_event_list usage Calvin Owens
                         ` (2 subsequent siblings)
  7 siblings, 1 reply; 52+ messages in thread
From: Calvin Owens @ 2015-06-09  3:50 UTC (permalink / raw)
  To: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan
  Cc: MPT-FusionLinux.pdl, linux-scsi, linux-kernel, calvinowens, kernel-team

This refactors the fw_event code to use the new refcount.

Signed-off-by: Calvin Owens <calvinowens@fb.com>
---
 drivers/scsi/mpt2sas/mpt2sas_scsih.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 611b34d..8d8c814 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -2863,6 +2863,7 @@ _scsih_fw_event_add(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work *fw_event)
 		return;
 
 	spin_lock_irqsave(&ioc->fw_event_lock, flags);
+	fw_event_work_get(fw_event);
 	list_add_tail(&fw_event->list, &ioc->fw_event_list);
 	INIT_DELAYED_WORK(&fw_event->delayed_work, _firmware_event_work);
 	queue_delayed_work(ioc->firmware_event_thread,
@@ -2887,12 +2888,13 @@ _scsih_fw_event_free(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work
 	unsigned long flags;
 
 	spin_lock_irqsave(&ioc->fw_event_lock, flags);
-	list_del(&fw_event->list);
-	kfree(fw_event);
+	if (!list_empty(&fw_event->list))
+		list_del_init(&fw_event->list);
+
+	fw_event_work_put(fw_event);
 	spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
 }
 
-
 /**
  * _scsih_error_recovery_delete_devices - remove devices not responding
  * @ioc: per adapter object
@@ -2907,13 +2909,14 @@ _scsih_error_recovery_delete_devices(struct MPT2SAS_ADAPTER *ioc)
 	if (ioc->is_driver_loading)
 		return;
 
-	fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+	fw_event = alloc_fw_event_work(0);
 	if (!fw_event)
 		return;
 
 	fw_event->event = MPT2SAS_REMOVE_UNRESPONDING_DEVICES;
 	fw_event->ioc = ioc;
 	_scsih_fw_event_add(ioc, fw_event);
+	fw_event_work_put(fw_event);
 }
 
 /**
@@ -2927,12 +2930,13 @@ mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc)
 {
 	struct fw_event_work *fw_event;
 
-	fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+	fw_event = alloc_fw_event_work(0);
 	if (!fw_event)
 		return;
 	fw_event->event = MPT2SAS_PORT_ENABLE_COMPLETE;
 	fw_event->ioc = ioc;
 	_scsih_fw_event_add(ioc, fw_event);
+	fw_event_work_put(fw_event);
 }
 
 /**
@@ -4439,13 +4443,14 @@ _scsih_send_event_to_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 {
 	struct fw_event_work *fw_event;
 
-	fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+	fw_event = alloc_fw_event_work(0);
 	if (!fw_event)
 		return;
 	fw_event->event = MPT2SAS_TURN_ON_PFA_LED;
 	fw_event->device_handle = handle;
 	fw_event->ioc = ioc;
 	_scsih_fw_event_add(ioc, fw_event);
+	fw_event_work_put(fw_event);
 }
 
 /**
@@ -7740,7 +7745,7 @@ mpt2sas_scsih_event_callback(struct MPT2SAS_ADAPTER *ioc, u8 msix_index,
 	}
 
 	sz = le16_to_cpu(mpi_reply->EventDataLength) * 4;
-	fw_event = kzalloc(sizeof(*fw_event) + sz, GFP_ATOMIC);
+	fw_event = alloc_fw_event_work(sz);
 	if (!fw_event) {
 		printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
 		    ioc->name, __FILE__, __LINE__, __func__);
@@ -7753,6 +7758,7 @@ mpt2sas_scsih_event_callback(struct MPT2SAS_ADAPTER *ioc, u8 msix_index,
 	fw_event->VP_ID = mpi_reply->VP_ID;
 	fw_event->event = event;
 	_scsih_fw_event_add(ioc, fw_event);
+	fw_event_work_put(fw_event);
 	return;
 }
 
-- 
1.8.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 6/6] Fix unsafe fw_event_list usage
  2015-06-09  3:50     ` [RESEND][PATCH 0/6] Fixes for memory corruption in mpt2sas Calvin Owens
                         ` (4 preceding siblings ...)
  2015-06-09  3:50       ` [PATCH 5/6] Refactor code to use new fw_event refcount Calvin Owens
@ 2015-06-09  3:50       ` Calvin Owens
  2015-07-03 16:02         ` Christoph Hellwig
  2015-07-02 20:15       ` [RESEND][PATCH 0/6] Fixes for memory corruption in mpt2sas Bart Van Assche
  2015-07-12  4:24       ` [PATCH 0/2 v2] " Calvin Owens
  7 siblings, 1 reply; 52+ messages in thread
From: Calvin Owens @ 2015-06-09  3:50 UTC (permalink / raw)
  To: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan
  Cc: MPT-FusionLinux.pdl, linux-scsi, linux-kernel, calvinowens, kernel-team

Since the fw_event deletes itself from the list, cleanup_queue() can
walk onto garbage pointers or walk off into freed memory.

This refactors the code in _scsih_fw_event_cleanup_queue() to not
iterate over the fw_event_list without a lock.
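
A stripped-down userspace sketch of the "dequeue one item under the lock,
then work on it unlocked" loop follows. It assumes a pthread mutex and a
singly linked queue; struct work, struct queue, dequeue_next() and drain()
are hypothetical and much simpler than the driver's structures.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical work item on a singly linked queue. */
struct work {
	struct work *next;
	int id;
};

struct queue {
	pthread_mutex_t lock;
	struct work *head;
};

/*
 * Detach and return the first item, or NULL if the queue is empty. The
 * queue is only touched with the lock held; the returned item is no longer
 * reachable from the queue, so the caller may process it without the lock.
 */
static struct work *dequeue_next(struct queue *q)
{
	struct work *w;

	pthread_mutex_lock(&q->lock);
	w = q->head;
	if (w) {
		q->head = w->next;
		w->next = NULL;
	}
	pthread_mutex_unlock(&q->lock);
	return w;
}

/* Drain the queue one item at a time; returns the sum of the ids seen. */
static int drain(struct queue *q)
{
	struct work *w;
	int sum = 0;

	while ((w = dequeue_next(q)))
		sum += w->id;	/* stands in for cancelling or freeing the item */
	return sum;
}
```

Because each pass re-reads the head under the lock, the loop never holds a
stale "next" pointer across a point where another thread could change the
queue.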

Signed-off-by: Calvin Owens <calvinowens@fb.com>
---
 drivers/scsi/mpt2sas/mpt2sas_scsih.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 8d8c814..f504e28 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -2939,6 +2939,23 @@ mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc)
 	fw_event_work_put(fw_event);
 }
 
+static struct fw_event_work *dequeue_next_fw_event(struct MPT2SAS_ADAPTER *ioc)
+{
+	unsigned long flags;
+	struct fw_event_work *fw_event = NULL;
+
+	spin_lock_irqsave(&ioc->fw_event_lock, flags);
+	if (!list_empty(&ioc->fw_event_list)) {
+		fw_event = list_first_entry(&ioc->fw_event_list,
+				struct fw_event_work, list);
+		list_del_init(&fw_event->list);
+		fw_event_work_get(fw_event);
+	}
+	spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
+
+	return fw_event;
+}
+
 /**
  * _scsih_fw_event_cleanup_queue - cleanup event queue
  * @ioc: per adapter object
@@ -2951,17 +2968,18 @@ mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc)
 static void
 _scsih_fw_event_cleanup_queue(struct MPT2SAS_ADAPTER *ioc)
 {
-	struct fw_event_work *fw_event, *next;
+	struct fw_event_work *fw_event;
 
 	if (list_empty(&ioc->fw_event_list) ||
 	     !ioc->firmware_event_thread || in_interrupt())
 		return;
 
-	list_for_each_entry_safe(fw_event, next, &ioc->fw_event_list, list) {
+	while ((fw_event = dequeue_next_fw_event(ioc))) {
 		if (cancel_delayed_work_sync(&fw_event->delayed_work)) {
 			_scsih_fw_event_free(ioc, fw_event);
 			continue;
 		}
+		fw_event_work_put(fw_event);
 	}
 }
 
-- 
1.8.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/6] Fixes for memory corruption in mpt2sas
  2015-05-15  3:41   ` [PATCH 0/6] Fixes for memory corruption in mpt2sas Calvin Owens
                       ` (6 preceding siblings ...)
  2015-06-09  3:50     ` [RESEND][PATCH 0/6] Fixes for memory corruption in mpt2sas Calvin Owens
@ 2015-07-02 19:22     ` Jens Axboe
  7 siblings, 0 replies; 52+ messages in thread
From: Jens Axboe @ 2015-07-02 19:22 UTC (permalink / raw)
  To: Calvin Owens, Nagalakshmi Nandigama, Praveen Krishnamoorthy,
	Sreekanth Reddy, Abhijit Mahajan
  Cc: MPT-FusionLinux.pdl, linux-scsi, linux-kernel, kernel-team,
	Christoph Hellwig

On 05/14/2015 09:41 PM, Calvin Owens wrote:
> Hello all,
>
> This patchset attempts to address problems we've been having with
> panics due to memory corruption from the mpt2sas driver.
>
> I will provide a similar set of fixes for mpt3sas, since we see
> similar issues there as well. "Porting" this to mpt3sas will be
> trivial since the part of the driver I'm touching is nearly identical
> between the two, so I thought it would be simpler to review a patch
> against mpt2sas alone at first.
>
> I've tested this for a few days on a big storage box that seemed to be
> very susceptible to the panics, and so far it seems to have eliminated
> them.

Guys, can someone outside of FB please review this? We're hitting random 
memory corruptions without these fixes.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RESEND][PATCH 0/6] Fixes for memory corruption in mpt2sas
  2015-06-09  3:50     ` [RESEND][PATCH 0/6] Fixes for memory corruption in mpt2sas Calvin Owens
                         ` (5 preceding siblings ...)
  2015-06-09  3:50       ` [PATCH 6/6] Fix unsafe fw_event_list usage Calvin Owens
@ 2015-07-02 20:15       ` Bart Van Assche
  2015-07-12  4:24       ` [PATCH 0/2 v2] " Calvin Owens
  7 siblings, 0 replies; 52+ messages in thread
From: Bart Van Assche @ 2015-07-02 20:15 UTC (permalink / raw)
  To: Calvin Owens, Nagalakshmi Nandigama, Praveen Krishnamoorthy,
	Sreekanth Reddy, Abhijit Mahajan
  Cc: MPT-FusionLinux.pdl, linux-scsi, linux-kernel, kernel-team, Jens Axboe

On 06/08/2015 08:50 PM, Calvin Owens wrote:
> This patchset attempts to address problems we've been having with
> panics due to memory corruption from the mpt2sas driver.
>
> I will provide a similar set of fixes for mpt3sas, since we see
> similar issues there as well. "Porting" this to mpt3sas will be
> trivial since the part of the driver I'm touching is nearly identical
> between the two, so I thought it would be simpler to review a patch
> against mpt2sas alone at first.
>
> I've tested this on a handful of large storage boxes over the past few
> weeks, so far it seems to have completely eliminated the memory
> corruption panics.

If you have to repost this series please convert 
BUG_ON(!spin_is_locked(&ioc->sas_device_lock)); into 
lockdep_is_held(...). Otherwise, for the whole series:

Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/6] Add refcount to sas_device struct
  2015-06-09  3:50       ` [PATCH 1/6] Add refcount to sas_device struct Calvin Owens
@ 2015-07-03 15:24         ` Christoph Hellwig
  0 siblings, 0 replies; 52+ messages in thread
From: Christoph Hellwig @ 2015-07-03 15:24 UTC (permalink / raw)
  To: Calvin Owens
  Cc: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan, MPT-FusionLinux.pdl, linux-scsi, linux-kernel,
	kernel-team

On Mon, Jun 08, 2015 at 08:50:51PM -0700, Calvin Owens wrote:
> These objects can be referenced concurrently throughout the driver, we
> need a way to make sure threads can't delete them out from under each
> other.
> 
> Signed-off-by: Calvin Owens <calvinowens@fb.com>

This doesn't make sense without users of the refcount, and should be
squashed into the patch actually using the refcounting.

* Re: [PATCH 2/6] Refactor code to use new sas_device refcount
  2015-06-09  3:50       ` [PATCH 2/6] Refactor code to use new sas_device refcount Calvin Owens
@ 2015-07-03 15:38         ` Christoph Hellwig
  2015-07-12  4:15           ` Calvin Owens
  0 siblings, 1 reply; 52+ messages in thread
From: Christoph Hellwig @ 2015-07-03 15:38 UTC (permalink / raw)
  To: Calvin Owens
  Cc: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan, MPT-FusionLinux.pdl, linux-scsi, linux-kernel,
	kernel-team

>  
> +struct _sas_device *
> +mpt2sas_scsih_sas_device_get_by_sas_address_nolock(struct MPT2SAS_ADAPTER *ioc,
> +    u64 sas_address)

Any chance to use a shorter name for this function? E.g.
__mpt2sas_get_sdev_by_addr ?

> +{
> +	struct _sas_device *sas_device;
> +
> +	BUG_ON(!spin_is_locked(&ioc->sas_device_lock));

This will blow up on UP builds.  Please use assert_spin_locked or
lockdep_assert_held instead.  And don't ask me which of the two,
that's a mystery I don't understand myself either.
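
The UP problem can be modelled in userspace. The sketch below is not
kernel code: toy_lock and every function in it are hypothetical names
made up for illustration. It assumes that on a uniprocessor build
without spinlock debugging the lock state is compiled out, so an
"is it locked?" query can only answer "no", which makes a
BUG_ON(!is_locked) style check fire even when the caller did take the
lock. An assert-style helper can instead compile to nothing on such a
build.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock: only the "SMP" model really tracks whether it is held. */
struct toy_lock {
	bool held;
};

static void toy_lock_acquire(struct toy_lock *l) { l->held = true; }
static void toy_lock_release(struct toy_lock *l) { l->held = false; }

/* SMP model: the query reflects the real lock state. */
static bool smp_is_locked(const struct toy_lock *l) { return l->held; }

/* UP model: there is no state to inspect, so the answer is always false. */
static bool up_is_locked(const struct toy_lock *l) { (void)l; return false; }

/* SMP model of an assert helper: checks the state. */
static void smp_assert_locked(const struct toy_lock *l) { assert(l->held); }

/* UP model of an assert helper: deliberately a no-op. */
static void up_assert_locked(const struct toy_lock *l) { (void)l; }

/* Returns true if a BUG_ON(!is_locked(l)) style check would fire. */
static bool bug_on_would_fire(bool (*is_locked)(const struct toy_lock *),
			      const struct toy_lock *l)
{
	return !is_locked(l);
}
```

With the lock held, bug_on_would_fire(smp_is_locked, ...) is false but
bug_on_would_fire(up_is_locked, ...) is true, while both assert helpers
pass. That is the argument for an assert helper over an open-coded
BUG_ON on the query.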

>  struct _sas_device *
> -mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> +mpt2sas_scsih_sas_device_get_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
>      u64 sas_address)
>  {

> +static struct _sas_device *
> +_scsih_sas_device_get_by_handle_nolock(struct MPT2SAS_ADAPTER *ioc, u16 handle)

>  static struct _sas_device *
> -_scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> +_scsih_sas_device_get_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)

Same comments about the function names as above.

> +	struct _sas_device *sas_device;
> +
> +	BUG_ON(!spin_is_locked(&ioc->sas_device_lock));

Same comment about the right assert helpers as above.

> @@ -594,9 +634,15 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
>  	if (!sas_device)
>  		return;
>  
> +	/*
> +	 * The lock serializes access to the list, but we still need to verify
> +	 * that nobody removed the entry while we were waiting on the lock.
> +	 */
>  	spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -	list_del(&sas_device->list);
> -	kfree(sas_device);
> +	if (!list_empty(&sas_device->list)) {
> +		list_del_init(&sas_device->list);
> +		sas_device_put(sas_device);
> +	}
>  	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

This looks odd to me.  Normally you'd have the lock from the list
iteration that finds the device.  From looking at the code it seems
like this is only called from probe failure paths, though.  It seems like
for this case the device simply shouldn't be added until the probe
succeeds and this function should go away?

> @@ -1208,12 +1256,15 @@ _scsih_change_queue_depth(struct scsi_device *sdev, int qdepth)
>  		goto not_sata;
>  	if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))
>  		goto not_sata;
> +
>  	spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +	sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
>  	   sas_device_priv_data->sas_target->sas_address);
> -	if (sas_device && sas_device->device_info &
> -	    MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
> +	if (sas_device && sas_device->device_info
> +			& MPI2_SAS_DEVICE_INFO_SATA_DEVICE) {
>  		max_depth = MPT2SAS_SATA_QUEUE_DEPTH;
> +		sas_device_put(sas_device);
> +	}

Please store a pointer to the sas_device in struct scsi_target ->hostdata
in _scsih_target_alloc and avoid the need for this and other runtime
lookups where we have a scsi_device or scsi_target structure available.
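
A minimal userspace sketch of that idea, assuming the cached pointer
owns a reference of its own so it cannot dangle. toy_device, toy_target
and the helpers are hypothetical names, not driver code; a plain int
stands in for struct kref and locking is left out for brevity.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_device {
	int refcount;			/* stand-in for struct kref */
	unsigned long long sas_address;
};

struct toy_target {
	struct toy_device *sdev;	/* cached pointer, owns one reference */
};

/* Allocate a device; the caller owns the initial reference. */
static struct toy_device *toy_device_alloc(unsigned long long addr)
{
	struct toy_device *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;
	d->refcount = 1;
	d->sas_address = addr;
	return d;
}

static void toy_device_get(struct toy_device *d) { d->refcount++; }

/* Drop one reference; returns 1 if this was the last one and d was freed. */
static int toy_device_put(struct toy_device *d)
{
	if (--d->refcount == 0) {
		free(d);
		return 1;
	}
	return 0;
}

/* Done once at target allocation: cache the device and pin it. */
static void toy_target_alloc(struct toy_target *t, struct toy_device *d)
{
	toy_device_get(d);
	t->sdev = d;
}

/* Done once at target teardown: drop the reference held by the cache. */
static void toy_target_destroy(struct toy_target *t)
{
	if (t->sdev) {
		toy_device_put(t->sdev);
		t->sdev = NULL;
	}
}
```

Between toy_target_alloc() and toy_target_destroy() the device can be
reached through t->sdev without walking any list, and it stays valid
even if every other holder drops its reference.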

> @@ -1324,13 +1377,15 @@ _scsih_target_destroy(struct scsi_target *starget)
>  
>  	spin_lock_irqsave(&ioc->sas_device_lock, flags);
>  	rphy = dev_to_rphy(starget->dev.parent);
> -	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +	sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
>  	   rphy->identify.sas_address);
>  	if (sas_device && (sas_device->starget == starget) &&
>  	    (sas_device->id == starget->id) &&
>  	    (sas_device->channel == starget->channel))
>  		sas_device->starget = NULL;
>  
> +	if (sas_device)
> +		sas_device_put(sas_device);
>  	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

.. like this one.

>   out:
> @@ -1386,7 +1441,7 @@ _scsih_slave_alloc(struct scsi_device *sdev)
>  
>  	if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
>  		spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -		sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +		sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
>  				sas_target_priv_data->sas_address);
>  		if (sas_device && (sas_device->starget == NULL)) {
>  			sdev_printk(KERN_INFO, sdev,

.. or this one ..

> @@ -1428,10 +1487,13 @@ _scsih_slave_destroy(struct scsi_device *sdev)
>  
>  	if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
>  		spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -		sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +		sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
>  		   sas_target_priv_data->sas_address);
>  		if (sas_device && !sas_target_priv_data->num_luns)
>  			sas_device->starget = NULL;
> +
> +		if (sas_device)
> +			sas_device_put(sas_device);
>  		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

.. and this, and many more.


* Re: [PATCH 4/6] Add refcount to fw_event_work struct
  2015-06-09  3:50       ` [PATCH 4/6] Add refcount to fw_event_work struct Calvin Owens
@ 2015-07-03 15:38         ` Christoph Hellwig
  0 siblings, 0 replies; 52+ messages in thread
From: Christoph Hellwig @ 2015-07-03 15:38 UTC (permalink / raw)
  To: Calvin Owens
  Cc: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan, MPT-FusionLinux.pdl, linux-scsi, linux-kernel,
	kernel-team

On Mon, Jun 08, 2015 at 08:50:54PM -0700, Calvin Owens wrote:
> The fw_event_work struct is concurrently referenced at shutdown, so
> add a refcount to protect it.

Same comment here - a refcount that isn't used isn't useful, please fold
into the next patch.


* Re: [PATCH 5/6] Refactor code to use new fw_event refcount
  2015-06-09  3:50       ` [PATCH 5/6] Refactor code to use new fw_event refcount Calvin Owens
@ 2015-07-03 16:00         ` Christoph Hellwig
  2015-07-12  4:13           ` Calvin Owens
  0 siblings, 1 reply; 52+ messages in thread
From: Christoph Hellwig @ 2015-07-03 16:00 UTC (permalink / raw)
  To: Calvin Owens
  Cc: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan, MPT-FusionLinux.pdl, linux-scsi, linux-kernel,
	kernel-team

On Mon, Jun 08, 2015 at 08:50:55PM -0700, Calvin Owens wrote:
> This refactors the fw_event code to use the new refcount.

I spent some time looking over this code because it's so convoluted.
In general I think code should either embed one work_struct (and it
really doesn't seem to need a delayed work here!) or if needed a list
and not both like this one.  But it's probably too much work to sort
all this out, so let's go with your version.


* Re: [PATCH 6/6] Fix unsafe fw_event_list usage
  2015-06-09  3:50       ` [PATCH 6/6] Fix unsafe fw_event_list usage Calvin Owens
@ 2015-07-03 16:02         ` Christoph Hellwig
  2015-07-12  4:20           ` Calvin Owens
  0 siblings, 1 reply; 52+ messages in thread
From: Christoph Hellwig @ 2015-07-03 16:02 UTC (permalink / raw)
  To: Calvin Owens
  Cc: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan, MPT-FusionLinux.pdl, linux-scsi, linux-kernel,
	kernel-team

On Mon, Jun 08, 2015 at 08:50:56PM -0700, Calvin Owens wrote:
> Since the fw_event deletes itself from the list, cleanup_queue() can
> walk onto garbage pointers or walk off into freed memory.
> 
> This refactors the code in _scsih_fw_event_cleanup_queue() to not
> iterate over the fw_event_list without a lock. 

I think this really should be folded into the previous one, with the
fixes in this one the other refcounting changes don't make a whole lot
of sense.

> +static struct fw_event_work *dequeue_next_fw_event(struct MPT2SAS_ADAPTER *ioc)
> +{
> +	unsigned long flags;
> +	struct fw_event_work *fw_event = NULL;
> +
> +	spin_lock_irqsave(&ioc->fw_event_lock, flags);
> +	if (!list_empty(&ioc->fw_event_list)) {
> +		fw_event = list_first_entry(&ioc->fw_event_list,
> +				struct fw_event_work, list);
> +		list_del_init(&fw_event->list);
> +		fw_event_work_get(fw_event);
> +	}
> +	spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
> +
> +	return fw_event;

Shouldn't we have a reference for each item on the list that gets
transferred to whoever removes it from the list?

Additionally _firmware_event_work should call dequeue_next_fw_event
first in the function so that item is off the list before we process
it, and can then just drop the reference once it's done.
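
The ownership rule described above can be sketched in userspace as
follows. This is an illustration under assumptions, not the driver's
code: toy_event, toy_queue and the helpers are hypothetical, a plain
int stands in for struct kref, and the lock that would protect the
queue is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_event {
	int refcount;			/* stand-in for struct kref */
	struct toy_event *next;
};

struct toy_queue {
	struct toy_event *head;
	struct toy_event *tail;
};

/* Allocate an event; the caller owns the initial reference. */
static struct toy_event *toy_event_alloc(void)
{
	struct toy_event *ev = calloc(1, sizeof(*ev));

	if (ev)
		ev->refcount = 1;
	return ev;
}

static void toy_event_get(struct toy_event *ev) { ev->refcount++; }

/* Drop one reference; returns 1 if this was the last one and ev was freed. */
static int toy_event_put(struct toy_event *ev)
{
	if (--ev->refcount == 0) {
		free(ev);
		return 1;
	}
	return 0;
}

/* Enqueue: the list takes a reference of its own. */
static void toy_enqueue(struct toy_queue *q, struct toy_event *ev)
{
	toy_event_get(ev);
	ev->next = NULL;
	if (q->tail)
		q->tail->next = ev;
	else
		q->head = ev;
	q->tail = ev;
}

/*
 * Dequeue: no extra get here. The reference the list held is handed
 * to the caller, who must put it when done with the event.
 */
static struct toy_event *toy_dequeue(struct toy_queue *q)
{
	struct toy_event *ev = q->head;

	if (!ev)
		return NULL;
	q->head = ev->next;
	if (!q->head)
		q->tail = NULL;
	ev->next = NULL;
	return ev;
}
```

A worker that calls toy_dequeue() first thing owns the event outright:
it is already off the list, so a concurrent cleanup pass cannot see it,
and a single toy_event_put() at the end of processing releases it.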


* Re: [PATCH 3/6] Fix unsafe sas_device_list usage
  2015-06-09  3:50       ` [PATCH 3/6] Fix unsafe sas_device_list usage Calvin Owens
@ 2015-07-03 16:03         ` Christoph Hellwig
  0 siblings, 0 replies; 52+ messages in thread
From: Christoph Hellwig @ 2015-07-03 16:03 UTC (permalink / raw)
  To: Calvin Owens
  Cc: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan, MPT-FusionLinux.pdl, linux-scsi, linux-kernel,
	kernel-team

On Mon, Jun 08, 2015 at 08:50:53PM -0700, Calvin Owens wrote:
> We cannot iterate over the list without holding a lock for the entire
> duration, or we risk corrupting random memory if items are added or
> deleted as we iterate.
> 
> This refactors code such that it always holds the lock when iterating
> on or accessing the sas_device_list.

This looks sensible but should probably be folded into the previous
patch.

* Re: [PATCH 5/6] Refactor code to use new fw_event refcount
  2015-07-03 16:00         ` Christoph Hellwig
@ 2015-07-12  4:13           ` Calvin Owens
  0 siblings, 0 replies; 52+ messages in thread
From: Calvin Owens @ 2015-07-12  4:13 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan, MPT-FusionLinux.pdl, linux-scsi, linux-kernel,
	kernel-team

Thanks for this, I'm sending a v2 shortly.

On Friday 07/03 at 09:00 -0700, Christoph Hellwig wrote:
> On Mon, Jun 08, 2015 at 08:50:55PM -0700, Calvin Owens wrote:
> > This refactors the fw_event code to use the new refcount.
> 
> I spent some time looking over this code because it's so convoluted.
> In general I think code should either embed one work_struct (and it
> really doesn't seem to need a delayed work here!) or if needed a list
> and not both like this one.  But it's probably too much work to sort
> all this out, so let's go with your version.

Yeah, I tried to get rid of fw_event_list altogether, since I think what
cleanup_queue() does could be simplified to calling flush_workqueue().

The problem is _scsih_check_topo_delete_events(), which looks at the
list and sometimes marks fw_events as "ignored" so they aren't executed.

* Re: [PATCH 2/6] Refactor code to use new sas_device refcount
  2015-07-03 15:38         ` Christoph Hellwig
@ 2015-07-12  4:15           ` Calvin Owens
  0 siblings, 0 replies; 52+ messages in thread
From: Calvin Owens @ 2015-07-12  4:15 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan, MPT-FusionLinux.pdl, linux-scsi, linux-kernel,
	kernel-team

On Friday 07/03 at 08:38 -0700, Christoph Hellwig wrote:
> >  
> > +struct _sas_device *
> > +mpt2sas_scsih_sas_device_get_by_sas_address_nolock(struct MPT2SAS_ADAPTER *ioc,
> > +    u64 sas_address)
> 
> Any chance to use a shorter name for this function? E.g.
> __mpt2sas_get_sdev_by_addr ?

Will do.

> > +{
> > +	struct _sas_device *sas_device;
> > +
> > +	BUG_ON(!spin_is_locked(&ioc->sas_device_lock));
> 
> This will blow up on UP builds.  Please use assert_spin_locked or
> lockdep_assert_held instead.  And don't ask me which of the two,
> that's a mystery I don't understand myself either.

Will do.

> >  struct _sas_device *
> > -mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> > +mpt2sas_scsih_sas_device_get_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> >      u64 sas_address)
> >  {
> 
> > +static struct _sas_device *
> > +_scsih_sas_device_get_by_handle_nolock(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> 
> >  static struct _sas_device *
> > -_scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > +_scsih_sas_device_get_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> 
> Same comments about the function names as above.
> 
> > +	struct _sas_device *sas_device;
> > +
> > +	BUG_ON(!spin_is_locked(&ioc->sas_device_lock));
> 
> Same comment about the right assert helpers as above.
> 
> > @@ -594,9 +634,15 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
> >  	if (!sas_device)
> >  		return;
> >  
> > +	/*
> > +	 * The lock serializes access to the list, but we still need to verify
> > +	 * that nobody removed the entry while we were waiting on the lock.
> > +	 */
> >  	spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -	list_del(&sas_device->list);
> > -	kfree(sas_device);
> > +	if (!list_empty(&sas_device->list)) {
> > +		list_del_init(&sas_device->list);
> > +		sas_device_put(sas_device);
> > +	}
> >  	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> 
> This looks odd to me.  Normally you'd have the lock from the list
> iteration that finds the device.  From looking at the code it seems
> like this is only called from probe failure paths, though.  It seems like
> for this case the device simply shouldn't be added until the probe
> succeeds and this function should go away?

There's a horrible maze of dependencies on things being on the lists
while being added that make this impossible: I spent some time trying
to get this to work, but I always end up with no drives. :(

(The path through _scsih_probe_sas() seems not to care)

I was hopeful your suggestion below about putting the sas_device
pointer in ->hostdata would eliminate the need for all the find_by_X()
lookups, but some won't go away.

> > @@ -1208,12 +1256,15 @@ _scsih_change_queue_depth(struct scsi_device *sdev, int qdepth)
> >  		goto not_sata;
> >  	if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))
> >  		goto not_sata;
> > +
> >  	spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > +	sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
> >  	   sas_device_priv_data->sas_target->sas_address);
> > -	if (sas_device && sas_device->device_info &
> > -	    MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
> > +	if (sas_device && sas_device->device_info
> > +			& MPI2_SAS_DEVICE_INFO_SATA_DEVICE) {
> >  		max_depth = MPT2SAS_SATA_QUEUE_DEPTH;
> > +		sas_device_put(sas_device);
> > +	}
> 
> Please store a pointer to the sas_device in struct scsi_target ->hostdata
> in _scsih_target_alloc and avoid the need for this and other runtime
> lookups where we have a scsi_device or scsi_target structure available.

Will do.

> > @@ -1324,13 +1377,15 @@ _scsih_target_destroy(struct scsi_target *starget)
> >  
> >  	spin_lock_irqsave(&ioc->sas_device_lock, flags);
> >  	rphy = dev_to_rphy(starget->dev.parent);
> > -	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > +	sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
> >  	   rphy->identify.sas_address);
> >  	if (sas_device && (sas_device->starget == starget) &&
> >  	    (sas_device->id == starget->id) &&
> >  	    (sas_device->channel == starget->channel))
> >  		sas_device->starget = NULL;
> >  
> > +	if (sas_device)
> > +		sas_device_put(sas_device);
> >  	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> 
> .. like this one.
> 
> >   out:
> > @@ -1386,7 +1441,7 @@ _scsih_slave_alloc(struct scsi_device *sdev)
> >  
> >  	if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
> >  		spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -		sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > +		sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
> >  				sas_target_priv_data->sas_address);
> >  		if (sas_device && (sas_device->starget == NULL)) {
> >  			sdev_printk(KERN_INFO, sdev,
> 
> .. or this one ..
> 
> > @@ -1428,10 +1487,13 @@ _scsih_slave_destroy(struct scsi_device *sdev)
> >  
> >  	if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
> >  		spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -		sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > +		sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
> >  		   sas_target_priv_data->sas_address);
> >  		if (sas_device && !sas_target_priv_data->num_luns)
> >  			sas_device->starget = NULL;
> > +
> > +		if (sas_device)
> > +			sas_device_put(sas_device);
> >  		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> 
> .. and this, and many more.
> 

* Re: [PATCH 6/6] Fix unsafe fw_event_list usage
  2015-07-03 16:02         ` Christoph Hellwig
@ 2015-07-12  4:20           ` Calvin Owens
  0 siblings, 0 replies; 52+ messages in thread
From: Calvin Owens @ 2015-07-12  4:20 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan, MPT-FusionLinux.pdl, linux-scsi, linux-kernel,
	kernel-team

On Friday 07/03 at 09:02 -0700, Christoph Hellwig wrote:
> On Mon, Jun 08, 2015 at 08:50:56PM -0700, Calvin Owens wrote:
> > Since the fw_event deletes itself from the list, cleanup_queue() can
> > walk onto garbage pointers or walk off into freed memory.
> > 
> > This refactors the code in _scsih_fw_event_cleanup_queue() to not
> > iterate over the fw_event_list without a lock. 
> 
> I think this really should be folded into the previous one, with the
> fixes in this one the other refcounting changes don't make a whole lot
> of sense.
> 
> > +static struct fw_event_work *dequeue_next_fw_event(struct MPT2SAS_ADAPTER *ioc)
> > +{
> > +	unsigned long flags;
> > +	struct fw_event_work *fw_event = NULL;
> > +
> > +	spin_lock_irqsave(&ioc->fw_event_lock, flags);
> > +	if (!list_empty(&ioc->fw_event_list)) {
> > +		fw_event = list_first_entry(&ioc->fw_event_list,
> > +				struct fw_event_work, list);
> > +		list_del_init(&fw_event->list);
> > +		fw_event_work_get(fw_event);
> > +	}
> > +	spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
> > +
> > +	return fw_event;
> 
> Shouldn't we have a reference for each item on the list that gets
> transferred to whoever removes it from the list?

Yes, this was a bit weird the way I did it. I redid this in v2, hopefully
it's clearer.

> Additionally _firmware_event_work should call dequeue_next_fw_event
> first in the function so that item is off the list before we process
> it, and can then just drop the reference once it's done.

That works: cleanup_queue() won't wait on some already-running events, but
destroy_workqueue() drains the wq, so we won't run ahead and free things
from under the fw_event when unwinding.

* [PATCH 0/2 v2] Fixes for memory corruption in mpt2sas
  2015-06-09  3:50     ` [RESEND][PATCH 0/6] Fixes for memory corruption in mpt2sas Calvin Owens
                         ` (6 preceding siblings ...)
  2015-07-02 20:15       ` [RESEND][PATCH 0/6] Fixes for memory corruption in mpt2sas Bart Van Assche
@ 2015-07-12  4:24       ` Calvin Owens
  2015-07-12  4:24         ` [PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage Calvin Owens
                           ` (2 more replies)
  7 siblings, 3 replies; 52+ messages in thread
From: Calvin Owens @ 2015-07-12  4:24 UTC (permalink / raw)
  To: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan
  Cc: MPT-FusionLinux.pdl, linux-scsi, linux-kernel, kernel-team, calvinowens

Hello all,

This patchset attempts to address problems we've been having with
panics due to memory corruption from the mpt2sas driver.

Thanks,
Calvin

Patches in this series:
[PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage
[PATCH 2/2] mpt2sas: Refcount fw_events and fix unsafe list usage

Changes since v1:
	* Squashed patches 1-3 and 4-6 into two patches
	* s/BUG_ON(!spin_is_locked/assert_spin_locked/g
	* Use more succinct function names
	* Store a pointer to the sas_device object in ->hostdata to eliminate
	  the need for several lookups on the lists.
	* Remove the fw_event from fw_event_list at the start of
	  _firmware_event_work()
	* Explicitly separate fw_event_list removal from fw_event freeing

Total diffstat:

 drivers/scsi/mpt2sas/mpt2sas_base.h      |  22 +-
 drivers/scsi/mpt2sas/mpt2sas_scsih.c     | 535 +++++++++++++++++++++----------
 drivers/scsi/mpt2sas/mpt2sas_transport.c |  12 +-
 3 files changed, 396 insertions(+), 173 deletions(-)

Diff showing changes v1 => v2:
	http://jcalvinowens.github.io/stuff/mpt2sas-patchset-v1v2.patch

* [PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage
  2015-07-12  4:24       ` [PATCH 0/2 v2] " Calvin Owens
@ 2015-07-12  4:24         ` Calvin Owens
  2015-07-13  6:52           ` Christoph Hellwig
                             ` (2 more replies)
  2015-07-12  4:24         ` [PATCH 2/2] mpt2sas: Refcount fw_events " Calvin Owens
  2015-08-01  5:02         ` [PATCH v3 0/2] Fixes for memory corruption in mpt2sas Calvin Owens
  2 siblings, 3 replies; 52+ messages in thread
From: Calvin Owens @ 2015-07-12  4:24 UTC (permalink / raw)
  To: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan
  Cc: MPT-FusionLinux.pdl, linux-scsi, linux-kernel, kernel-team,
	calvinowens, Christoph Hellwig, Bart Van Assche

These objects can be referenced concurrently throughout the driver, we
need a way to make sure threads can't delete them out from under each
other. This patch adds the refcount, and refactors the code to use it.

Additionally, we cannot iterate over the sas_device_list without
holding the lock, or we risk corrupting random memory if items are
added or deleted as we iterate. This patch refactors _scsih_probe_sas()
to use the sas_device_list in a safe way.

Cc: Christoph Hellwig <hch@infradead.org>
Cc: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Calvin Owens <calvinowens@fb.com>
---
 drivers/scsi/mpt2sas/mpt2sas_base.h      |  22 +-
 drivers/scsi/mpt2sas/mpt2sas_scsih.c     | 434 ++++++++++++++++++++-----------
 drivers/scsi/mpt2sas/mpt2sas_transport.c |  12 +-
 3 files changed, 315 insertions(+), 153 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
index caff8d1..78f41ac 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.h
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
@@ -238,6 +238,7 @@
  * @flags: MPT_TARGET_FLAGS_XXX flags
  * @deleted: target flaged for deletion
  * @tm_busy: target is busy with TM request.
+ * @sdev: The sas_device associated with this target
  */
 struct MPT2SAS_TARGET {
 	struct scsi_target *starget;
@@ -248,6 +249,7 @@ struct MPT2SAS_TARGET {
 	u32	flags;
 	u8	deleted;
 	u8	tm_busy;
+	struct _sas_device *sdev;
 };
 
 
@@ -376,8 +378,24 @@ struct _sas_device {
 	u8	phy;
 	u8	responding;
 	u8	pfa_led_on;
+	struct kref refcount;
 };
 
+static inline void sas_device_get(struct _sas_device *s)
+{
+	kref_get(&s->refcount);
+}
+
+static inline void sas_device_free(struct kref *r)
+{
+	kfree(container_of(r, struct _sas_device, refcount));
+}
+
+static inline void sas_device_put(struct _sas_device *s)
+{
+	kref_put(&s->refcount, sas_device_free);
+}
+
 /**
  * struct _raid_device - raid volume link list
  * @list: sas device list
@@ -1095,7 +1113,9 @@ struct _sas_node *mpt2sas_scsih_expander_find_by_handle(struct MPT2SAS_ADAPTER *
     u16 handle);
 struct _sas_node *mpt2sas_scsih_expander_find_by_sas_address(struct MPT2SAS_ADAPTER
     *ioc, u64 sas_address);
-struct _sas_device *mpt2sas_scsih_sas_device_find_by_sas_address(
+struct _sas_device *mpt2sas_get_sdev_by_addr(
+    struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
+struct _sas_device *__mpt2sas_get_sdev_by_addr(
     struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
 
 void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc);
diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 3f26147..fad80ce 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -526,8 +526,43 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
 	}
 }
 
+struct _sas_device *
+__mpt2sas_get_sdev_from_target(struct MPT2SAS_TARGET *tgt_priv)
+{
+	struct _sas_device *ret;
+
+	ret = tgt_priv->sdev;
+	if (ret)
+		sas_device_get(ret);
+
+	return ret;
+}
+
+struct _sas_device *
+__mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
+    u64 sas_address)
+{
+	struct _sas_device *sas_device;
+
+	assert_spin_locked(&ioc->sas_device_lock);
+
+	list_for_each_entry(sas_device, &ioc->sas_device_list, list)
+		if (sas_device->sas_address == sas_address)
+			goto found_device;
+
+	list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
+		if (sas_device->sas_address == sas_address)
+			goto found_device;
+
+	return NULL;
+
+found_device:
+	sas_device_get(sas_device);
+	return sas_device;
+}
+
 /**
- * mpt2sas_scsih_sas_device_find_by_sas_address - sas device search
+ * mpt2sas_get_sdev_by_addr - sas device search
  * @ioc: per adapter object
  * @sas_address: sas address
  * Context: Calling function should acquire ioc->sas_device_lock
@@ -536,24 +571,44 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
  * object.
  */
 struct _sas_device *
-mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
+mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
     u64 sas_address)
 {
 	struct _sas_device *sas_device;
+	unsigned long flags;
+
+	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	sas_device = __mpt2sas_get_sdev_by_addr(ioc,
+			sas_address);
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+	return sas_device;
+}
+
+static struct _sas_device *
+__mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
+{
+	struct _sas_device *sas_device;
+
+	assert_spin_locked(&ioc->sas_device_lock);
 
 	list_for_each_entry(sas_device, &ioc->sas_device_list, list)
-		if (sas_device->sas_address == sas_address)
-			return sas_device;
+		if (sas_device->handle == handle)
+			goto found_device;
 
 	list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
-		if (sas_device->sas_address == sas_address)
-			return sas_device;
+		if (sas_device->handle == handle)
+			goto found_device;
 
 	return NULL;
+
+found_device:
+	sas_device_get(sas_device);
+	return sas_device;
 }
 
 /**
- * _scsih_sas_device_find_by_handle - sas device search
+ * mpt2sas_get_sdev_by_handle - sas device search
  * @ioc: per adapter object
  * @handle: sas device handle (assigned by firmware)
  * Context: Calling function should acquire ioc->sas_device_lock
@@ -562,19 +617,16 @@ mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
  * object.
  */
 static struct _sas_device *
-_scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
+mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 {
 	struct _sas_device *sas_device;
+	unsigned long flags;
 
-	list_for_each_entry(sas_device, &ioc->sas_device_list, list)
-		if (sas_device->handle == handle)
-			return sas_device;
-
-	list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
-		if (sas_device->handle == handle)
-			return sas_device;
+	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
-	return NULL;
+	return sas_device;
 }
 
 /**
@@ -583,7 +635,7 @@ _scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
  * @sas_device: the sas_device object
  * Context: This function will acquire ioc->sas_device_lock.
  *
- * Removing object and freeing associated memory from the ioc->sas_device_list.
+ * If sas_device is on the list, remove it and decrement its reference count.
  */
 static void
 _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
@@ -594,9 +646,15 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
 	if (!sas_device)
 		return;
 
+	/*
+	 * The lock serializes access to the list, but we still need to verify
+	 * that nobody removed the entry while we were waiting on the lock.
+	 */
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	list_del(&sas_device->list);
-	kfree(sas_device);
+	if (!list_empty(&sas_device->list)) {
+		list_del_init(&sas_device->list);
+		sas_device_put(sas_device);
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 }
 
@@ -620,6 +678,7 @@ _scsih_sas_device_add(struct MPT2SAS_ADAPTER *ioc,
 	    sas_device->handle, (unsigned long long)sas_device->sas_address));
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	sas_device_get(sas_device);
 	list_add_tail(&sas_device->list, &ioc->sas_device_list);
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
@@ -659,6 +718,7 @@ _scsih_sas_device_init_add(struct MPT2SAS_ADAPTER *ioc,
 	    sas_device->handle, (unsigned long long)sas_device->sas_address));
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	sas_device_get(sas_device);
 	list_add_tail(&sas_device->list, &ioc->sas_device_init_list);
 	_scsih_determine_boot_device(ioc, sas_device, 0);
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
@@ -1208,12 +1268,14 @@ _scsih_change_queue_depth(struct scsi_device *sdev, int qdepth)
 		goto not_sata;
 	if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))
 		goto not_sata;
+
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
-	   sas_device_priv_data->sas_target->sas_address);
-	if (sas_device && sas_device->device_info &
-	    MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
+	sas_device = __mpt2sas_get_sdev_from_target(sas_target_priv_data);
+	if (sas_device) {
+		if (sas_device->device_info & MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
+			max_depth = MPT2SAS_SATA_QUEUE_DEPTH;
+		sas_device_put(sas_device);
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
  not_sata:
@@ -1271,18 +1333,21 @@ _scsih_target_alloc(struct scsi_target *starget)
 	/* sas/sata devices */
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	rphy = dev_to_rphy(starget->dev.parent);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = __mpt2sas_get_sdev_by_addr(ioc,
 	   rphy->identify.sas_address);
 
 	if (sas_device) {
 		sas_target_priv_data->handle = sas_device->handle;
 		sas_target_priv_data->sas_address = sas_device->sas_address;
+		sas_target_priv_data->sdev = sas_device;
 		sas_device->starget = starget;
 		sas_device->id = starget->id;
 		sas_device->channel = starget->channel;
 		if (test_bit(sas_device->handle, ioc->pd_handles))
 			sas_target_priv_data->flags |=
 			    MPT_TARGET_FLAGS_RAID_COMPONENT;
+
+		sas_device_put(sas_device);
 	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
@@ -1324,13 +1389,14 @@ _scsih_target_destroy(struct scsi_target *starget)
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	rphy = dev_to_rphy(starget->dev.parent);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
-	   rphy->identify.sas_address);
+	sas_device = __mpt2sas_get_sdev_from_target(sas_target_priv_data);
 	if (sas_device && (sas_device->starget == starget) &&
 	    (sas_device->id == starget->id) &&
 	    (sas_device->channel == starget->channel))
 		sas_device->starget = NULL;
 
+	if (sas_device)
+		sas_device_put(sas_device);
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
  out:
@@ -1386,7 +1452,7 @@ _scsih_slave_alloc(struct scsi_device *sdev)
 
 	if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
 		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+		sas_device = __mpt2sas_get_sdev_by_addr(ioc,
 				sas_target_priv_data->sas_address);
 		if (sas_device && (sas_device->starget == NULL)) {
 			sdev_printk(KERN_INFO, sdev,
@@ -1394,6 +1460,10 @@ _scsih_slave_alloc(struct scsi_device *sdev)
 			     __func__, __LINE__);
 			sas_device->starget = starget;
 		}
+
+		if (sas_device)
+			sas_device_put(sas_device);
+
 		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	}
 
@@ -1428,10 +1498,12 @@ _scsih_slave_destroy(struct scsi_device *sdev)
 
 	if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
 		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
-		   sas_target_priv_data->sas_address);
+		sas_device = __mpt2sas_get_sdev_from_target(sas_target_priv_data);
 		if (sas_device && !sas_target_priv_data->num_luns)
 			sas_device->starget = NULL;
+
+		if (sas_device)
+			sas_device_put(sas_device);
 		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	}
 
@@ -2078,7 +2150,7 @@ _scsih_slave_configure(struct scsi_device *sdev)
 	}
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = __mpt2sas_get_sdev_by_addr(ioc,
 	   sas_device_priv_data->sas_target->sas_address);
 	if (!sas_device) {
 		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
@@ -2116,13 +2188,14 @@ _scsih_slave_configure(struct scsi_device *sdev)
 	if (!ssp_target)
 		_scsih_display_sata_capabilities(ioc, handle, sdev);
 
-
 	_scsih_change_queue_depth(sdev, qdepth);
 
 	if (ssp_target) {
 		sas_read_port_mode_page(sdev);
 		_scsih_enable_tlr(ioc, sdev);
 	}
+
+	sas_device_put(sas_device);
 	return 0;
 }
 
@@ -2509,8 +2582,7 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
 		    device_str, (unsigned long long)priv_target->sas_address);
 	} else {
 		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
-		    priv_target->sas_address);
+		sas_device = __mpt2sas_get_sdev_from_target(priv_target);
 		if (sas_device) {
 			if (priv_target->flags &
 			    MPT_TARGET_FLAGS_RAID_COMPONENT) {
@@ -2529,6 +2601,8 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
 			    "enclosure_logical_id(0x%016llx), slot(%d)\n",
 			   (unsigned long long)sas_device->enclosure_logical_id,
 			    sas_device->slot);
+
+			sas_device_put(sas_device);
 		}
 		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	}
@@ -2604,12 +2678,12 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
 {
 	struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
 	struct MPT2SAS_DEVICE *sas_device_priv_data;
-	struct _sas_device *sas_device;
-	unsigned long flags;
+	struct _sas_device *sas_device = NULL;
 	u16	handle;
 	int r;
 
 	struct scsi_target *starget = scmd->device->sdev_target;
+	struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;
 
 	starget_printk(KERN_INFO, starget, "attempting device reset! "
 	    "scmd(%p)\n", scmd);
@@ -2629,12 +2703,9 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
 	handle = 0;
 	if (sas_device_priv_data->sas_target->flags &
 	    MPT_TARGET_FLAGS_RAID_COMPONENT) {
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = _scsih_sas_device_find_by_handle(ioc,
-		   sas_device_priv_data->sas_target->handle);
+		sas_device = __mpt2sas_get_sdev_from_target(target_priv_data);
 		if (sas_device)
 			handle = sas_device->volume_handle;
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	} else
 		handle = sas_device_priv_data->sas_target->handle;
 
@@ -2651,6 +2722,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
  out:
 	sdev_printk(KERN_INFO, scmd->device, "device reset: %s scmd(%p)\n",
 	    ((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
+
+	if (sas_device)
+		sas_device_put(sas_device);
+
 	return r;
 }
 
@@ -2665,11 +2740,11 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
 {
 	struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
 	struct MPT2SAS_DEVICE *sas_device_priv_data;
-	struct _sas_device *sas_device;
-	unsigned long flags;
+	struct _sas_device *sas_device = NULL;
 	u16	handle;
 	int r;
 	struct scsi_target *starget = scmd->device->sdev_target;
+	struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;
 
 	starget_printk(KERN_INFO, starget, "attempting target reset! "
 	    "scmd(%p)\n", scmd);
@@ -2689,12 +2764,9 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
 	handle = 0;
 	if (sas_device_priv_data->sas_target->flags &
 	    MPT_TARGET_FLAGS_RAID_COMPONENT) {
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = _scsih_sas_device_find_by_handle(ioc,
-		   sas_device_priv_data->sas_target->handle);
+		sas_device = __mpt2sas_get_sdev_from_target(target_priv_data);
 		if (sas_device)
 			handle = sas_device->volume_handle;
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	} else
 		handle = sas_device_priv_data->sas_target->handle;
 
@@ -2711,6 +2783,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
  out:
 	starget_printk(KERN_INFO, starget, "target reset: %s scmd(%p)\n",
 	    ((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
+
+	if (sas_device)
+		sas_device_put(sas_device);
+
 	return r;
 }
 
@@ -3002,15 +3078,15 @@ _scsih_block_io_to_children_attached_to_ex(struct MPT2SAS_ADAPTER *ioc,
 
 	list_for_each_entry(mpt2sas_port,
 	   &sas_expander->sas_port_list, port_list) {
-		if (mpt2sas_port->remote_identify.device_type ==
-		    SAS_END_DEVICE) {
+		if (mpt2sas_port->remote_identify.device_type == SAS_END_DEVICE) {
 			spin_lock_irqsave(&ioc->sas_device_lock, flags);
-			sas_device =
-			    mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
-			   mpt2sas_port->remote_identify.sas_address);
-			if (sas_device)
+			sas_device = __mpt2sas_get_sdev_by_addr(ioc,
+					mpt2sas_port->remote_identify.sas_address);
+			if (sas_device) {
 				set_bit(sas_device->handle,
-				    ioc->blocking_handles);
+						ioc->blocking_handles);
+				sas_device_put(sas_device);
+			}
 			spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 		}
 	}
@@ -3080,7 +3156,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 {
 	Mpi2SCSITaskManagementRequest_t *mpi_request;
 	u16 smid;
-	struct _sas_device *sas_device;
+	struct _sas_device *sas_device = NULL;
 	struct MPT2SAS_TARGET *sas_target_priv_data = NULL;
 	u64 sas_address = 0;
 	unsigned long flags;
@@ -3110,7 +3186,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 		return;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+	sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
 	if (sas_device && sas_device->starget &&
 	     sas_device->starget->hostdata) {
 		sas_target_priv_data = sas_device->starget->hostdata;
@@ -3131,14 +3207,14 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	if (!smid) {
 		delayed_tr = kzalloc(sizeof(*delayed_tr), GFP_ATOMIC);
 		if (!delayed_tr)
-			return;
+			goto out;
 		INIT_LIST_HEAD(&delayed_tr->list);
 		delayed_tr->handle = handle;
 		list_add_tail(&delayed_tr->list, &ioc->delayed_tr_list);
 		dewtprintk(ioc, printk(MPT2SAS_INFO_FMT
 		    "DELAYED:tr:handle(0x%04x), (open)\n",
 		    ioc->name, handle));
-		return;
+		goto out;
 	}
 
 	dewtprintk(ioc, printk(MPT2SAS_INFO_FMT "tr_send:handle(0x%04x), "
@@ -3150,6 +3226,9 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	mpi_request->DevHandle = cpu_to_le16(handle);
 	mpi_request->TaskType = MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET;
 	mpt2sas_base_put_smid_hi_priority(ioc, smid);
+out:
+	if (sas_device)
+		sas_device_put(sas_device);
 }
 
 
@@ -4068,7 +4147,6 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
 	char *desc_scsi_state = ioc->tmp_string;
 	u32 log_info = le32_to_cpu(mpi_reply->IOCLogInfo);
 	struct _sas_device *sas_device = NULL;
-	unsigned long flags;
 	struct scsi_target *starget = scmd->device->sdev_target;
 	struct MPT2SAS_TARGET *priv_target = starget->hostdata;
 	char *device_str = NULL;
@@ -4200,9 +4278,7 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
 		printk(MPT2SAS_WARN_FMT "\t%s wwid(0x%016llx)\n", ioc->name,
 		    device_str, (unsigned long long)priv_target->sas_address);
 	} else {
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
-		    priv_target->sas_address);
+		sas_device = __mpt2sas_get_sdev_from_target(priv_target);
 		if (sas_device) {
 			printk(MPT2SAS_WARN_FMT "\tsas_address(0x%016llx), "
 			    "phy(%d)\n", ioc->name, sas_device->sas_address,
@@ -4211,8 +4287,9 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
 			    "\tenclosure_logical_id(0x%016llx), slot(%d)\n",
 			    ioc->name, sas_device->enclosure_logical_id,
 			    sas_device->slot);
+
+			sas_device_put(sas_device);
 		}
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	}
 
 	printk(MPT2SAS_WARN_FMT "\thandle(0x%04x), ioc_status(%s)(0x%04x), "
@@ -4259,7 +4336,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	Mpi2SepRequest_t mpi_request;
 	struct _sas_device *sas_device;
 
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+	sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
 	if (!sas_device)
 		return;
 
@@ -4274,7 +4351,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	    &mpi_request)) != 0) {
 		printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n", ioc->name,
 		__FILE__, __LINE__, __func__);
-		return;
+		goto out;
 	}
 	sas_device->pfa_led_on = 1;
 
@@ -4284,8 +4361,10 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 		 "enclosure_processor: ioc_status (0x%04x), loginfo(0x%08x)\n",
 		 ioc->name, le16_to_cpu(mpi_reply.IOCStatus),
 		 le32_to_cpu(mpi_reply.IOCLogInfo)));
-		return;
+		goto out;
 	}
+out:
+	sas_device_put(sas_device);
 }
 
 /**
@@ -4370,19 +4449,17 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 
 	/* only handle non-raid devices */
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+	sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
 	if (!sas_device) {
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
+		goto out_unlock;
 	}
 	starget = sas_device->starget;
 	sas_target_priv_data = starget->hostdata;
 
 	if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_RAID_COMPONENT) ||
-	   ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))) {
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
-	}
+	   ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)))
+		goto out_unlock;
+
 	starget_printk(KERN_WARNING, starget, "predicted fault\n");
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
@@ -4396,7 +4473,7 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	if (!event_reply) {
 		printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
 		    ioc->name, __FILE__, __LINE__, __func__);
-		return;
+		goto out;
 	}
 
 	event_reply->Function = MPI2_FUNCTION_EVENT_NOTIFICATION;
@@ -4413,6 +4490,14 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	event_data->SASAddress = cpu_to_le64(sas_target_priv_data->sas_address);
 	mpt2sas_ctl_add_to_event_log(ioc, event_reply);
 	kfree(event_reply);
+out:
+	if (sas_device)
+		sas_device_put(sas_device);
+	return;
+
+out_unlock:
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+	goto out;
 }
 
 /**
@@ -5148,14 +5233,13 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	sas_address = le64_to_cpu(sas_device_pg0.SASAddress);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = __mpt2sas_get_sdev_by_addr(ioc,
 	    sas_address);
 
 	if (!sas_device) {
 		printk(MPT2SAS_ERR_FMT "device is not present "
 		    "handle(0x%04x), no sas_device!!!\n", ioc->name, handle);
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
+		goto out_unlock;
 	}
 
 	if (unlikely(sas_device->handle != handle)) {
@@ -5172,19 +5256,22 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	    MPI2_SAS_DEVICE0_FLAGS_DEVICE_PRESENT)) {
 		printk(MPT2SAS_ERR_FMT "device is not present "
 		    "handle(0x%04x), flags!!!\n", ioc->name, handle);
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
+		goto out_unlock;
 	}
 
 	/* check if there were any issues with discovery */
 	if (_scsih_check_access_status(ioc, sas_address, handle,
-	    sas_device_pg0.AccessStatus)) {
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
-	}
+	    sas_device_pg0.AccessStatus))
+		goto out_unlock;
+
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	_scsih_ublock_io_device(ioc, sas_address);
+	return;
 
+out_unlock:
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+	if (sas_device)
+		sas_device_put(sas_device);
 }
 
 /**
@@ -5208,7 +5295,6 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
 	u32 ioc_status;
 	__le64 sas_address;
 	u32 device_info;
-	unsigned long flags;
 
 	if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
 	    MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
@@ -5250,14 +5336,13 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
 		return -1;
 	}
 
-
-	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = mpt2sas_get_sdev_by_addr(ioc,
 	    sas_address);
-	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
-	if (sas_device)
+	if (sas_device) {
+		sas_device_put(sas_device);
 		return 0;
+	}
 
 	sas_device = kzalloc(sizeof(struct _sas_device),
 	    GFP_KERNEL);
@@ -5267,6 +5352,7 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
 		return -1;
 	}
 
+	kref_init(&sas_device->refcount);
 	sas_device->handle = handle;
 	if (_scsih_get_sas_address(ioc, le16_to_cpu
 		(sas_device_pg0.ParentDevHandle),
@@ -5344,7 +5430,6 @@ _scsih_remove_device(struct MPT2SAS_ADAPTER *ioc,
 	    "handle(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
 	    sas_device->handle, (unsigned long long)
 	    sas_device->sas_address));
-	kfree(sas_device);
 }
 /**
  * _scsih_device_remove_by_handle - removing device object by handle
@@ -5363,12 +5448,17 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 		return;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
-	if (sas_device)
-		list_del(&sas_device->list);
+	sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
+	if (sas_device) {
+		list_del_init(&sas_device->list);
+		sas_device_put(sas_device);
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-	if (sas_device)
+
+	if (sas_device) {
 		_scsih_remove_device(ioc, sas_device);
+		sas_device_put(sas_device);
+	}
 }
 
 /**
@@ -5389,13 +5479,17 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
 		return;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
-	    sas_address);
-	if (sas_device)
-		list_del(&sas_device->list);
+	sas_device = __mpt2sas_get_sdev_by_addr(ioc, sas_address);
+	if (sas_device) {
+		list_del_init(&sas_device->list);
+		sas_device_put(sas_device);
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-	if (sas_device)
+
+	if (sas_device) {
 		_scsih_remove_device(ioc, sas_device);
+		sas_device_put(sas_device);
+	}
 }
 #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
 /**
@@ -5716,26 +5810,28 @@ _scsih_sas_device_status_change_event(struct MPT2SAS_ADAPTER *ioc,
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	sas_address = le64_to_cpu(event_data->SASAddress);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = __mpt2sas_get_sdev_by_addr(ioc,
 	    sas_address);
 
-	if (!sas_device || !sas_device->starget) {
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
-	}
+	if (!sas_device || !sas_device->starget)
+		goto out;
 
 	target_priv_data = sas_device->starget->hostdata;
-	if (!target_priv_data) {
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
-	}
+	if (!target_priv_data)
+		goto out;
 
 	if (event_data->ReasonCode ==
 	    MPI2_EVENT_SAS_DEV_STAT_RC_INTERNAL_DEVICE_RESET)
 		target_priv_data->tm_busy = 1;
 	else
 		target_priv_data->tm_busy = 0;
+
+out:
+	if (sas_device)
+		sas_device_put(sas_device);
+
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
 }
 
 #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
@@ -6123,7 +6219,7 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
 	u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+	sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
 	if (sas_device) {
 		sas_device->volume_handle = 0;
 		sas_device->volume_wwid = 0;
@@ -6142,6 +6238,8 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
 	/* exposing raid component */
 	if (starget)
 		starget_for_each_device(starget, NULL, _scsih_reprobe_lun);
+
+	sas_device_put(sas_device);
 }
 
 /**
@@ -6170,7 +6268,7 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
 		    &volume_wwid);
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+	sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
 	if (sas_device) {
 		set_bit(handle, ioc->pd_handles);
 		if (sas_device->starget && sas_device->starget->hostdata) {
@@ -6189,6 +6287,8 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
 	/* hiding raid component */
 	if (starget)
 		starget_for_each_device(starget, (void *)1, _scsih_reprobe_lun);
+
+	sas_device_put(sas_device);
 }
 
 /**
@@ -6221,7 +6321,6 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
     Mpi2EventIrConfigElement_t *element)
 {
 	struct _sas_device *sas_device;
-	unsigned long flags;
 	u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
 	Mpi2ConfigReply_t mpi_reply;
 	Mpi2SasDevicePage0_t sas_device_pg0;
@@ -6231,11 +6330,11 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
 
 	set_bit(handle, ioc->pd_handles);
 
-	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
-	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-	if (sas_device)
+	sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
+	if (sas_device) {
+		sas_device_put(sas_device);
 		return;
+	}
 
 	if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
 	    MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
@@ -6509,7 +6608,6 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
 	u16 handle, parent_handle;
 	u32 state;
 	struct _sas_device *sas_device;
-	unsigned long flags;
 	Mpi2ConfigReply_t mpi_reply;
 	Mpi2SasDevicePage0_t sas_device_pg0;
 	u32 ioc_status;
@@ -6542,12 +6640,11 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
 		if (!ioc->is_warpdrive)
 			set_bit(handle, ioc->pd_handles);
 
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-
-		if (sas_device)
+		sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
+		if (sas_device) {
+			sas_device_put(sas_device);
 			return;
+		}
 
 		if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
 		    &sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
@@ -7015,6 +7112,7 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
 	struct _raid_device *raid_device, *raid_device_next;
 	struct list_head tmp_list;
 	unsigned long flags;
+	LIST_HEAD(head);
 
 	printk(MPT2SAS_INFO_FMT "removing unresponding devices: start\n",
 	    ioc->name);
@@ -7022,14 +7120,29 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
 	/* removing unresponding end devices */
 	printk(MPT2SAS_INFO_FMT "removing unresponding devices: end-devices\n",
 	    ioc->name);
+
+	/*
+	 * Iterate, pulling off devices marked as non-responding. We become the
+	 * owner for the reference the list had on any object we prune.
+	 */
+	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	list_for_each_entry_safe(sas_device, sas_device_next,
-	    &ioc->sas_device_list, list) {
+			&ioc->sas_device_list, list) {
 		if (!sas_device->responding)
-			mpt2sas_device_remove_by_sas_address(ioc,
-				sas_device->sas_address);
+			list_move_tail(&sas_device->list, &head);
 		else
 			sas_device->responding = 0;
 	}
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+	/*
+	 * Now, uninitialize and remove the unresponding devices we pruned.
+	 */
+	list_for_each_entry_safe(sas_device, sas_device_next, &head, list) {
+		_scsih_remove_device(ioc, sas_device);
+		list_del_init(&sas_device->list);
+		sas_device_put(sas_device);
+	}
 
 	/* removing unresponding volumes */
 	if (ioc->ir_firmware) {
@@ -7179,11 +7292,11 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
 		}
 		phys_disk_num = pd_pg0.PhysDiskNum;
 		handle = le16_to_cpu(pd_pg0.DevHandle);
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		if (sas_device)
+		sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
+		if (sas_device) {
+			sas_device_put(sas_device);
 			continue;
+		}
 		if (mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
 		    &sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
 		    handle) != 0)
@@ -7302,12 +7415,12 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
 		if (!(_scsih_is_end_device(
 		    le32_to_cpu(sas_device_pg0.DeviceInfo))))
 			continue;
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+		sas_device = mpt2sas_get_sdev_by_addr(ioc,
 		    le64_to_cpu(sas_device_pg0.SASAddress));
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		if (sas_device)
+		if (sas_device) {
+			sas_device_put(sas_device);
 			continue;
+		}
 		parent_handle = le16_to_cpu(sas_device_pg0.ParentDevHandle);
 		if (!_scsih_get_sas_address(ioc, parent_handle, &sas_address)) {
 			printk(MPT2SAS_INFO_FMT "\tBEFORE adding end device: "
@@ -7966,6 +8079,37 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
 	}
 }
 
+static struct _sas_device *dequeue_next_sas_device(struct MPT2SAS_ADAPTER *ioc)
+{
+	struct _sas_device *sas_device = NULL;
+	unsigned long flags;
+
+	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	if (!list_empty(&ioc->sas_device_init_list)) {
+		sas_device = list_first_entry(&ioc->sas_device_init_list,
+				struct _sas_device, list);
+		list_del_init(&sas_device->list);
+	}
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+	/*
+	 * If an item was dequeued, the caller now owns the reference that was
+	 * previously owned by the list
+	 */
+	return sas_device;
+}
+
+static void sas_device_make_active(struct MPT2SAS_ADAPTER *ioc,
+		struct _sas_device *sas_device)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	sas_device_get(sas_device);
+	list_add_tail(&sas_device->list, &ioc->sas_device_list);
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+}
+
 /**
  * _scsih_probe_sas - reporting sas devices to sas transport
  * @ioc: per adapter object
@@ -7975,34 +8119,28 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
 static void
 _scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
 {
-	struct _sas_device *sas_device, *next;
-	unsigned long flags;
-
-	/* SAS Device List */
-	list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
-	    list) {
+	struct _sas_device *sas_device;
 
-		if (ioc->hide_drives)
-			continue;
+	if (ioc->hide_drives)
+		return;
 
+	while ((sas_device = dequeue_next_sas_device(ioc))) {
 		if (!mpt2sas_transport_port_add(ioc, sas_device->handle,
-		    sas_device->sas_address_parent)) {
-			list_del(&sas_device->list);
-			kfree(sas_device);
+				sas_device->sas_address_parent)) {
+			sas_device_put(sas_device);
 			continue;
 		} else if (!sas_device->starget) {
 			if (!ioc->is_driver_loading) {
 				mpt2sas_transport_port_remove(ioc,
-					sas_device->sas_address,
-					sas_device->sas_address_parent);
-				list_del(&sas_device->list);
-				kfree(sas_device);
+						sas_device->sas_address,
+						sas_device->sas_address_parent);
+				sas_device_put(sas_device);
 				continue;
 			}
 		}
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		list_move_tail(&sas_device->list, &ioc->sas_device_list);
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+		sas_device_make_active(ioc, sas_device);
+		sas_device_put(sas_device);
 	}
 }
 
diff --git a/drivers/scsi/mpt2sas/mpt2sas_transport.c b/drivers/scsi/mpt2sas/mpt2sas_transport.c
index ff2500a..af86800 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_transport.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_transport.c
@@ -1323,15 +1323,17 @@ _transport_get_enclosure_identifier(struct sas_rphy *rphy, u64 *identifier)
 	int rc;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = __mpt2sas_get_sdev_by_addr(ioc,
 	    rphy->identify.sas_address);
 	if (sas_device) {
 		*identifier = sas_device->enclosure_logical_id;
 		rc = 0;
+		sas_device_put(sas_device);
 	} else {
 		*identifier = 0;
 		rc = -ENXIO;
 	}
+
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	return rc;
 }
@@ -1351,12 +1353,14 @@ _transport_get_bay_identifier(struct sas_rphy *rphy)
 	int rc;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = __mpt2sas_get_sdev_by_addr(ioc,
 	    rphy->identify.sas_address);
-	if (sas_device)
+	if (sas_device) {
 		rc = sas_device->slot;
-	else
+		sas_device_put(sas_device);
+	} else {
 		rc = -ENXIO;
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	return rc;
 }
-- 
1.8.1



* [PATCH 2/2] mpt2sas: Refcount fw_events and fix unsafe list usage
  2015-07-12  4:24       ` [PATCH 0/2 v2] " Calvin Owens
  2015-07-12  4:24         ` [PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage Calvin Owens
@ 2015-07-12  4:24         ` Calvin Owens
  2015-07-13  6:52           ` Christoph Hellwig
  2015-08-01  5:02         ` [PATCH v3 0/2] Fixes for memory corruption in mpt2sas Calvin Owens
  2 siblings, 1 reply; 52+ messages in thread
From: Calvin Owens @ 2015-07-12  4:24 UTC (permalink / raw)
  To: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan
  Cc: MPT-FusionLinux.pdl, linux-scsi, linux-kernel, kernel-team,
	calvinowens, Christoph Hellwig, Bart Van Assche

The fw_event_work struct is concurrently referenced at shutdown, so
add a refcount to protect it, and refactor the code to use it.

Additionally, refactor _scsih_fw_event_cleanup_queue() such that it
no longer iterates over the list without holding the lock, since
_firmware_event_work() concurrently deletes items from the list.
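
As a rough userspace sketch of the scheme described above (not the driver code itself): each queued event holds one reference for the list and one for the pending work, and cleanup pops entries one at a time under the lock instead of walking the list unlocked. The names struct event, struct queue, queue_add() and stub_cancel_pending() are hypothetical; the stub always reports that the work was cancelled before it ran, standing in for a cancel_delayed_work_sync() call that returned true.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct fw_event_work. */
struct event {
	struct event *next;
	atomic_int refcount;
};

/* Hypothetical stand-in for the adapter's event list and its lock. */
struct queue {
	pthread_mutex_t lock;
	struct event *head;
	struct event *tail;
};

static int events_freed;	/* demo-only counter, single-threaded use */

static struct event *event_alloc(void)
{
	struct event *ev = calloc(1, sizeof(*ev));

	if (ev)
		atomic_init(&ev->refcount, 1);	/* caller's reference */
	return ev;
}

static void event_get(struct event *ev)
{
	atomic_fetch_add(&ev->refcount, 1);
}

static void event_put(struct event *ev)
{
	if (atomic_fetch_sub(&ev->refcount, 1) == 1) {
		events_freed++;
		free(ev);
	}
}

static void queue_add(struct queue *q, struct event *ev)
{
	pthread_mutex_lock(&q->lock);
	event_get(ev);			/* reference owned by the list */
	ev->next = NULL;
	if (q->tail)
		q->tail->next = ev;
	else
		q->head = ev;
	q->tail = ev;
	event_get(ev);			/* reference owned by the pending work */
	pthread_mutex_unlock(&q->lock);
}

/* Pop the first entry under the lock; the caller inherits the list's reference. */
static struct event *dequeue_next(struct queue *q)
{
	struct event *ev;

	pthread_mutex_lock(&q->lock);
	ev = q->head;
	if (ev) {
		q->head = ev->next;
		if (!q->head)
			q->tail = NULL;
		ev->next = NULL;
	}
	pthread_mutex_unlock(&q->lock);
	return ev;
}

/* Assumption: pretend the work was still pending and got cancelled. */
static int stub_cancel_pending(struct event *ev)
{
	(void)ev;
	return 1;
}

static void queue_cleanup(struct queue *q)
{
	struct event *ev;

	while ((ev = dequeue_next(q))) {
		if (stub_cancel_pending(ev))
			event_put(ev);	/* work never ran: drop its reference */
		event_put(ev);		/* drop the reference the list held */
	}
}

/* Queues three events, cleans up, and returns how many were freed. */
int fw_event_demo(void)
{
	struct queue q = { PTHREAD_MUTEX_INITIALIZER, NULL, NULL };
	int i;

	events_freed = 0;
	for (i = 0; i < 3; i++) {
		struct event *ev = event_alloc();

		if (!ev)
			break;
		queue_add(&q, ev);
		event_put(ev);		/* creator drops its own reference */
	}
	queue_cleanup(&q);
	assert(q.head == NULL);
	return events_freed;
}
```

Because every owner drops exactly the reference it took, the free happens once regardless of whether the creator, the worker, or the cleanup path finishes last.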

Cc: Christoph Hellwig <hch@infradead.org>
Cc: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Calvin Owens <calvinowens@fb.com>
---
 drivers/scsi/mpt2sas/mpt2sas_scsih.c | 101 ++++++++++++++++++++++++++++-------
 1 file changed, 81 insertions(+), 20 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index fad80ce..8b267af 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -176,9 +176,37 @@ struct fw_event_work {
 	u8			VP_ID;
 	u8			ignore;
 	u16			event;
+	struct kref		refcount;
 	char			event_data[0] __aligned(4);
 };
 
+static void fw_event_work_free(struct kref *r)
+{
+	kfree(container_of(r, struct fw_event_work, refcount));
+}
+
+static void fw_event_work_get(struct fw_event_work *fw_work)
+{
+	kref_get(&fw_work->refcount);
+}
+
+static void fw_event_work_put(struct fw_event_work *fw_work)
+{
+	kref_put(&fw_work->refcount, fw_event_work_free);
+}
+
+static struct fw_event_work *alloc_fw_event_work(int len)
+{
+	struct fw_event_work *fw_event;
+
+	fw_event = kzalloc(sizeof(*fw_event) + len, GFP_ATOMIC);
+	if (!fw_event)
+		return NULL;
+
+	kref_init(&fw_event->refcount);
+	return fw_event;
+}
+
 /* raid transport support */
 static struct raid_template *mpt2sas_raid_template;
 
@@ -2844,36 +2872,39 @@ _scsih_fw_event_add(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work *fw_event)
 		return;
 
 	spin_lock_irqsave(&ioc->fw_event_lock, flags);
+	fw_event_work_get(fw_event);
 	list_add_tail(&fw_event->list, &ioc->fw_event_list);
 	INIT_DELAYED_WORK(&fw_event->delayed_work, _firmware_event_work);
+	fw_event_work_get(fw_event);
 	queue_delayed_work(ioc->firmware_event_thread,
 	    &fw_event->delayed_work, 0);
 	spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
 }
 
 /**
- * _scsih_fw_event_free - delete fw_event
+ * _scsih_fw_event_del_from_list - delete fw_event from the list
  * @ioc: per adapter object
  * @fw_event: object describing the event
  * Context: This function will acquire ioc->fw_event_lock.
  *
- * This removes firmware event object from link list, frees associated memory.
+ * If the fw_event is on the fw_event_list, remove it and do a put.
  *
  * Return nothing.
  */
 static void
-_scsih_fw_event_free(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work
+_scsih_fw_event_del_from_list(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work
     *fw_event)
 {
 	unsigned long flags;
 
 	spin_lock_irqsave(&ioc->fw_event_lock, flags);
-	list_del(&fw_event->list);
-	kfree(fw_event);
+	if (!list_empty(&fw_event->list)) {
+		list_del_init(&fw_event->list);
+		fw_event_work_put(fw_event);
+	}
 	spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
 }
 
-
 /**
  * _scsih_error_recovery_delete_devices - remove devices not responding
  * @ioc: per adapter object
@@ -2888,13 +2919,14 @@ _scsih_error_recovery_delete_devices(struct MPT2SAS_ADAPTER *ioc)
 	if (ioc->is_driver_loading)
 		return;
 
-	fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+	fw_event = alloc_fw_event_work(0);
 	if (!fw_event)
 		return;
 
 	fw_event->event = MPT2SAS_REMOVE_UNRESPONDING_DEVICES;
 	fw_event->ioc = ioc;
 	_scsih_fw_event_add(ioc, fw_event);
+	fw_event_work_put(fw_event);
 }
 
 /**
@@ -2908,12 +2940,29 @@ mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc)
 {
 	struct fw_event_work *fw_event;
 
-	fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+	fw_event = alloc_fw_event_work(0);
 	if (!fw_event)
 		return;
 	fw_event->event = MPT2SAS_PORT_ENABLE_COMPLETE;
 	fw_event->ioc = ioc;
 	_scsih_fw_event_add(ioc, fw_event);
+	fw_event_work_put(fw_event);
+}
+
+static struct fw_event_work *dequeue_next_fw_event(struct MPT2SAS_ADAPTER *ioc)
+{
+	unsigned long flags;
+	struct fw_event_work *fw_event = NULL;
+
+	spin_lock_irqsave(&ioc->fw_event_lock, flags);
+	if (!list_empty(&ioc->fw_event_list)) {
+		fw_event = list_first_entry(&ioc->fw_event_list,
+				struct fw_event_work, list);
+		list_del_init(&fw_event->list);
+	}
+	spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
+
+	return fw_event;
 }
 
 /**
@@ -2928,17 +2977,25 @@ mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc)
 static void
 _scsih_fw_event_cleanup_queue(struct MPT2SAS_ADAPTER *ioc)
 {
-	struct fw_event_work *fw_event, *next;
+	struct fw_event_work *fw_event;
 
 	if (list_empty(&ioc->fw_event_list) ||
 	     !ioc->firmware_event_thread || in_interrupt())
 		return;
 
-	list_for_each_entry_safe(fw_event, next, &ioc->fw_event_list, list) {
-		if (cancel_delayed_work_sync(&fw_event->delayed_work)) {
-			_scsih_fw_event_free(ioc, fw_event);
-			continue;
-		}
+	while ((fw_event = dequeue_next_fw_event(ioc))) {
+		/*
+		 * Wait on the fw_event to complete. If this returns 1, then
+		 * the event was never executed, and we need a put for the
+		 * reference the delayed_work had on the fw_event.
+		 *
+		 * If it did execute, we wait for it to finish, and the put will
+		 * happen from _firmware_event_work()
+		 */
+		if (cancel_delayed_work_sync(&fw_event->delayed_work))
+			fw_event_work_put(fw_event);
+
+		fw_event_work_put(fw_event);
 	}
 }
 
@@ -4419,13 +4476,14 @@ _scsih_send_event_to_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 {
 	struct fw_event_work *fw_event;
 
-	fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+	fw_event = alloc_fw_event_work(0);
 	if (!fw_event)
 		return;
 	fw_event->event = MPT2SAS_TURN_ON_PFA_LED;
 	fw_event->device_handle = handle;
 	fw_event->ioc = ioc;
 	_scsih_fw_event_add(ioc, fw_event);
+	fw_event_work_put(fw_event);
 }
 
 /**
@@ -7523,10 +7581,11 @@ _firmware_event_work(struct work_struct *work)
 	    struct fw_event_work, delayed_work.work);
 	struct MPT2SAS_ADAPTER *ioc = fw_event->ioc;
 
+	_scsih_fw_event_del_from_list(ioc, fw_event);
+
 	/* the queue is being flushed so ignore this event */
-	if (ioc->remove_host ||
-	    ioc->pci_error_recovery) {
-		_scsih_fw_event_free(ioc, fw_event);
+	if (ioc->remove_host || ioc->pci_error_recovery) {
+		fw_event_work_put(fw_event);
 		return;
 	}
 
@@ -7582,7 +7641,8 @@ _firmware_event_work(struct work_struct *work)
 		_scsih_sas_ir_operation_status_event(ioc, fw_event);
 		break;
 	}
-	_scsih_fw_event_free(ioc, fw_event);
+
+	fw_event_work_put(fw_event);
 }
 
 /**
@@ -7720,7 +7780,7 @@ mpt2sas_scsih_event_callback(struct MPT2SAS_ADAPTER *ioc, u8 msix_index,
 	}
 
 	sz = le16_to_cpu(mpi_reply->EventDataLength) * 4;
-	fw_event = kzalloc(sizeof(*fw_event) + sz, GFP_ATOMIC);
+	fw_event = alloc_fw_event_work(sz);
 	if (!fw_event) {
 		printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
 		    ioc->name, __FILE__, __LINE__, __func__);
@@ -7733,6 +7793,7 @@ mpt2sas_scsih_event_callback(struct MPT2SAS_ADAPTER *ioc, u8 msix_index,
 	fw_event->VP_ID = mpi_reply->VP_ID;
 	fw_event->event = event;
 	_scsih_fw_event_add(ioc, fw_event);
+	fw_event_work_put(fw_event);
 	return;
 }
 
-- 
1.8.1
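
As a reading aid, the reference flow in the patch above can be modelled in userspace. This is only a sketch under stated assumptions, not driver code: `struct event`, `event_alloc()`, `event_get()`, `event_put()`, `event_queue()`, `event_cleanup()` and `event_run()` are hypothetical names, a plain int stands in for the kref, and the `cancelled` argument stands in for the return value of cancel_delayed_work_sync(). It shows the three references the patch appears to rely on: one for the allocating caller, one for the list, and one for the delayed work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct fw_event_work. */
struct event {
	int refcount;	/* plain int; the driver uses a struct kref */
	bool on_list;	/* models membership in ioc->fw_event_list */
	bool *freed;	/* set to true when the last reference is dropped */
};

static struct event *event_alloc(bool *freed)
{
	struct event *ev = calloc(1, sizeof(*ev));

	if (!ev)
		return NULL;
	ev->refcount = 1;	/* reference owned by the allocating caller */
	ev->freed = freed;
	*freed = false;
	return ev;
}

static void event_get(struct event *ev)
{
	ev->refcount++;
}

static void event_put(struct event *ev)
{
	if (--ev->refcount == 0) {
		*ev->freed = true;
		free(ev);
	}
}

/* Models _scsih_fw_event_add(): one reference for the list, one for the work. */
static void event_queue(struct event *ev)
{
	event_get(ev);		/* list reference */
	ev->on_list = true;
	event_get(ev);		/* delayed_work reference */
}

/* Models one iteration of the loop in _scsih_fw_event_cleanup_queue(). */
static void event_cleanup(struct event *ev, bool cancelled)
{
	ev->on_list = false;	/* dequeued: the list reference is now ours */
	if (cancelled)
		event_put(ev);	/* work never ran, so drop its reference here */
	event_put(ev);		/* drop the reference taken over from the list */
}

/* Models _firmware_event_work(): unlink if still listed, then drop the work reference. */
static void event_run(struct event *ev)
{
	if (ev->on_list) {
		ev->on_list = false;
		event_put(ev);	/* list reference */
	}
	event_put(ev);		/* delayed_work reference */
}
```

In this model the object is freed exactly once whether the work runs normally (event_run) or is cancelled during cleanup (event_cleanup with cancelled set), provided the caller also drops its own reference after queueing, as the patch does with fw_event_work_put() after _scsih_fw_event_add().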



* Re: [PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage
  2015-07-12  4:24         ` [PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage Calvin Owens
@ 2015-07-13  6:52           ` Christoph Hellwig
  2015-07-21  7:06             ` Calvin Owens
  2015-07-13 15:05           ` Joe Lawrence
  2015-07-16 14:57           ` Sreekanth Reddy
  2 siblings, 1 reply; 52+ messages in thread
From: Christoph Hellwig @ 2015-07-13  6:52 UTC (permalink / raw)
  To: Calvin Owens
  Cc: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan, MPT-FusionLinux.pdl, linux-scsi, linux-kernel,
	kernel-team, Christoph Hellwig, Bart Van Assche

On Sat, Jul 11, 2015 at 09:24:55PM -0700, Calvin Owens wrote:
> These objects can be referenced concurrently throughout the driver, we
> need a way to make sure threads can't delete them out from under each
> other. This patch adds the refcount, and refactors the code to use it.
> 
> Additionally, we cannot iterate over the sas_device_list without
> holding the lock, or we risk corrupting random memory if items are
> added or deleted as we iterate. This patch refactors _scsih_probe_sas()
> to use the sas_device_list in a safe way.
> 
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Bart Van Assche <bart.vanassche@sandisk.com>
> Signed-off-by: Calvin Owens <calvinowens@fb.com>
> ---
>  drivers/scsi/mpt2sas/mpt2sas_base.h      |  22 +-
>  drivers/scsi/mpt2sas/mpt2sas_scsih.c     | 434 ++++++++++++++++++++-----------
>  drivers/scsi/mpt2sas/mpt2sas_transport.c |  12 +-
>  3 files changed, 315 insertions(+), 153 deletions(-)
> 
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
> index caff8d1..78f41ac 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_base.h
> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
> @@ -238,6 +238,7 @@
>   * @flags: MPT_TARGET_FLAGS_XXX flags
>   * @deleted: target flaged for deletion
>   * @tm_busy: target is busy with TM request.
> + * @sdev: The sas_device associated with this target
>   */
>  struct MPT2SAS_TARGET {
>  	struct scsi_target *starget;
> @@ -248,6 +249,7 @@ struct MPT2SAS_TARGET {
>  	u32	flags;
>  	u8	deleted;
>  	u8	tm_busy;
> +	struct _sas_device *sdev;
>  };
>  
>  
> @@ -376,8 +378,24 @@ struct _sas_device {
>  	u8	phy;
>  	u8	responding;
>  	u8	pfa_led_on;
> +	struct kref refcount;
>  };
>  
> +static inline void sas_device_get(struct _sas_device *s)
> +{
> +	kref_get(&s->refcount);
> +}
> +
> +static inline void sas_device_free(struct kref *r)
> +{
> +	kfree(container_of(r, struct _sas_device, refcount));
> +}
> +
> +static inline void sas_device_put(struct _sas_device *s)
> +{
> +	kref_put(&s->refcount, sas_device_free);
> +}
> +
>  /**
>   * struct _raid_device - raid volume link list
>   * @list: sas device list
> @@ -1095,7 +1113,9 @@ struct _sas_node *mpt2sas_scsih_expander_find_by_handle(struct MPT2SAS_ADAPTER *
>      u16 handle);
>  struct _sas_node *mpt2sas_scsih_expander_find_by_sas_address(struct MPT2SAS_ADAPTER
>      *ioc, u64 sas_address);
> -struct _sas_device *mpt2sas_scsih_sas_device_find_by_sas_address(
> +struct _sas_device *mpt2sas_get_sdev_by_addr(
> +    struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
> +struct _sas_device *__mpt2sas_get_sdev_by_addr(
>      struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
>  
>  void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc);
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> index 3f26147..fad80ce 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> +++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> @@ -526,8 +526,43 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
>  	}
>  }
>  
> +struct _sas_device *
> +__mpt2sas_get_sdev_from_target(struct MPT2SAS_TARGET *tgt_priv)
> +{
> +	struct _sas_device *ret;
> +

Does this need a:

	assert_spin_locked(&ioc->sas_device_lock);

?

Otherwise this looks sensible to me.
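
For context on the convention being discussed: elsewhere in the quoted patch, lookup helpers such as __mpt2sas_get_sdev_by_addr() expect the caller to already hold ioc->sas_device_lock and check that with assert_spin_locked(), while the unprefixed wrappers take the lock themselves. The userspace sketch below illustrates that split. Everything in it is hypothetical (`struct toy_lock`, `toy_lock_acquire()`, `toy_lock_release()`, `struct table`, `find_locked()`, `find()`), and the `held` flag is a single-threaded stand-in, not a real spinlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical single-threaded stand-in for a spinlock. */
struct toy_lock {
	bool held;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);	/* not recursive, like a spinlock */
	l->held = true;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

struct table {
	struct toy_lock lock;
	const int *keys;
	size_t nkeys;
};

/* Caller must hold t->lock; the assert documents and enforces that. */
static const int *find_locked(struct table *t, int key)
{
	assert(t->lock.held);	/* plays the role of assert_spin_locked() */

	for (size_t i = 0; i < t->nkeys; i++)
		if (t->keys[i] == key)
			return &t->keys[i];
	return NULL;
}

/* Locking wrapper for callers that do not already hold the lock. */
static const int *find(struct table *t, int key)
{
	const int *hit;

	toy_lock_acquire(&t->lock);
	hit = find_locked(t, key);
	toy_lock_release(&t->lock);
	return hit;
}
```

Calling find_locked() without first acquiring the lock trips the assertion immediately, which is the kind of early failure the suggested check would provide.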


* Re: [PATCH 2/2] mpt2sas: Refcount fw_events and fix unsafe list usage
  2015-07-12  4:24         ` [PATCH 2/2] mpt2sas: Refcount fw_events " Calvin Owens
@ 2015-07-13  6:52           ` Christoph Hellwig
  0 siblings, 0 replies; 52+ messages in thread
From: Christoph Hellwig @ 2015-07-13  6:52 UTC (permalink / raw)
  To: Calvin Owens
  Cc: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan, MPT-FusionLinux.pdl, linux-scsi, linux-kernel,
	kernel-team, Christoph Hellwig, Bart Van Assche

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage
  2015-07-12  4:24         ` [PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage Calvin Owens
  2015-07-13  6:52           ` Christoph Hellwig
@ 2015-07-13 15:05           ` Joe Lawrence
  2015-07-21  7:04             ` Calvin Owens
  2015-07-16 14:57           ` Sreekanth Reddy
  2 siblings, 1 reply; 52+ messages in thread
From: Joe Lawrence @ 2015-07-13 15:05 UTC (permalink / raw)
  To: Calvin Owens, Nagalakshmi Nandigama, Praveen Krishnamoorthy,
	Sreekanth Reddy, Abhijit Mahajan
  Cc: MPT-FusionLinux.pdl, linux-scsi, linux-kernel, kernel-team,
	Christoph Hellwig, Bart Van Assche

On 07/12/2015 12:24 AM, Calvin Owens wrote:
> These objects can be referenced concurrently throughout the driver, we
> need a way to make sure threads can't delete them out from under each
> other. This patch adds the refcount, and refactors the code to use it.
> 
> Additionally, we cannot iterate over the sas_device_list without
> holding the lock, or we risk corrupting random memory if items are
> added or deleted as we iterate. This patch refactors _scsih_probe_sas()
> to use the sas_device_list in a safe way.
> 
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Bart Van Assche <bart.vanassche@sandisk.com>
> Signed-off-by: Calvin Owens <calvinowens@fb.com>
> ---
>  drivers/scsi/mpt2sas/mpt2sas_base.h      |  22 +-
>  drivers/scsi/mpt2sas/mpt2sas_scsih.c     | 434 ++++++++++++++++++++-----------
>  drivers/scsi/mpt2sas/mpt2sas_transport.c |  12 +-
>  3 files changed, 315 insertions(+), 153 deletions(-)

[ ... snip ... ]

> @@ -2078,7 +2150,7 @@ _scsih_slave_configure(struct scsi_device *sdev)
>  	}
>  
>  	spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +	sas_device = __mpt2sas_get_sdev_by_addr(ioc,
>  	   sas_device_priv_data->sas_target->sas_address);
>  	if (!sas_device) {
>  		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> @@ -2116,13 +2188,14 @@ _scsih_slave_configure(struct scsi_device *sdev)
>  	if (!ssp_target)
>  		_scsih_display_sata_capabilities(ioc, handle, sdev);
>  
> -
>  	_scsih_change_queue_depth(sdev, qdepth);
>  
>  	if (ssp_target) {
>  		sas_read_port_mode_page(sdev);
>  		_scsih_enable_tlr(ioc, sdev);
>  	}
> +
> +	sas_device_put(sas_device);
>  	return 0;
>  }

Hi Calvin,

Any reason why this sas_device_put is placed outside the sas_device
lock?  Most other instances in this patch were called just before unlocking.
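
One way to frame the question, offered as an assumption and not as the patch author's answer: if the lookup takes a reference while the lock is held, the object stays valid after the lock is dropped, so the matching put does not by itself need the lock. The userspace sketch below models that property sequentially. All names (`struct dev`, `dev_create()`, `dev_lookup()`, `dev_remove()`, `dev_put()`, `live_devs`) are hypothetical, and the plain int counter is not thread-safe the way a kref is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct _sas_device. */
struct dev {
	int refcount;	/* plain int; the driver uses a struct kref */
	bool on_list;	/* models membership in ioc->sas_device_list */
	int handle;
};

static int live_devs;	/* number of objects not yet freed */

/* Create a device that starts out on the list; the list owns one reference. */
static struct dev *dev_create(int handle)
{
	struct dev *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;
	d->refcount = 1;
	d->on_list = true;
	d->handle = handle;
	live_devs++;
	return d;
}

static void dev_put(struct dev *d)
{
	if (--d->refcount == 0) {
		live_devs--;
		free(d);
	}
}

/* In the driver this step runs with the lock held; it returns a new reference. */
static struct dev *dev_lookup(struct dev *d, int handle)
{
	if (!d->on_list || d->handle != handle)
		return NULL;
	d->refcount++;
	return d;
}

/* Models _scsih_sas_device_remove(): unlink and drop the list's reference. */
static void dev_remove(struct dev *d)
{
	if (d->on_list) {
		d->on_list = false;
		dev_put(d);
	}
}
```

In this model a removal that happens between the lookup and the final put does not free the object; it is freed only when the last holder calls dev_put(). Whether the real driver has other reasons to keep the put under the lock is not something this sketch can show.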

BTW I attempted testing, but needed to port to mpt3 and ended up with a
driver that didn't boot :(   Hopefully I can retry later this week, or
find an older mpt2 box lying around.

-- Joe


* Re: [PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage
  2015-07-12  4:24         ` [PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage Calvin Owens
  2015-07-13  6:52           ` Christoph Hellwig
  2015-07-13 15:05           ` Joe Lawrence
@ 2015-07-16 14:57           ` Sreekanth Reddy
  2015-07-21  7:03             ` Calvin Owens
  2 siblings, 1 reply; 52+ messages in thread
From: Sreekanth Reddy @ 2015-07-16 14:57 UTC (permalink / raw)
  To: Calvin Owens
  Cc: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Abhijit Mahajan,
	MPT-FusionLinux.pdl, linux-scsi, linux-kernel, kernel-team,
	Christoph Hellwig, Bart Van Assche

On Sun, Jul 12, 2015 at 9:54 AM, Calvin Owens <calvinowens@fb.com> wrote:
> These objects can be referenced concurrently throughout the driver, we
> need a way to make sure threads can't delete them out from under each
> other. This patch adds the refcount, and refactors the code to use it.
>
> Additionally, we cannot iterate over the sas_device_list without
> holding the lock, or we risk corrupting random memory if items are
> added or deleted as we iterate. This patch refactors _scsih_probe_sas()
> to use the sas_device_list in a safe way.
>
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Bart Van Assche <bart.vanassche@sandisk.com>
> Signed-off-by: Calvin Owens <calvinowens@fb.com>
> ---
>  drivers/scsi/mpt2sas/mpt2sas_base.h      |  22 +-
>  drivers/scsi/mpt2sas/mpt2sas_scsih.c     | 434 ++++++++++++++++++++-----------
>  drivers/scsi/mpt2sas/mpt2sas_transport.c |  12 +-
>  3 files changed, 315 insertions(+), 153 deletions(-)
>
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
> index caff8d1..78f41ac 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_base.h
> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
> @@ -238,6 +238,7 @@
>   * @flags: MPT_TARGET_FLAGS_XXX flags
>   * @deleted: target flaged for deletion
>   * @tm_busy: target is busy with TM request.
> + * @sdev: The sas_device associated with this target
>   */
>  struct MPT2SAS_TARGET {
>         struct scsi_target *starget;
> @@ -248,6 +249,7 @@ struct MPT2SAS_TARGET {
>         u32     flags;
>         u8      deleted;
>         u8      tm_busy;
> +       struct _sas_device *sdev;
>  };
>
>
> @@ -376,8 +378,24 @@ struct _sas_device {
>         u8      phy;
>         u8      responding;
>         u8      pfa_led_on;
> +       struct kref refcount;
>  };
>
> +static inline void sas_device_get(struct _sas_device *s)
> +{
> +       kref_get(&s->refcount);
> +}
> +
> +static inline void sas_device_free(struct kref *r)
> +{
> +       kfree(container_of(r, struct _sas_device, refcount));
> +}
> +
> +static inline void sas_device_put(struct _sas_device *s)
> +{
> +       kref_put(&s->refcount, sas_device_free);
> +}
> +
>  /**
>   * struct _raid_device - raid volume link list
>   * @list: sas device list
> @@ -1095,7 +1113,9 @@ struct _sas_node *mpt2sas_scsih_expander_find_by_handle(struct MPT2SAS_ADAPTER *
>      u16 handle);
>  struct _sas_node *mpt2sas_scsih_expander_find_by_sas_address(struct MPT2SAS_ADAPTER
>      *ioc, u64 sas_address);
> -struct _sas_device *mpt2sas_scsih_sas_device_find_by_sas_address(
> +struct _sas_device *mpt2sas_get_sdev_by_addr(
> +    struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
> +struct _sas_device *__mpt2sas_get_sdev_by_addr(
>      struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
>
>  void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc);
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> index 3f26147..fad80ce 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> +++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> @@ -526,8 +526,43 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
>         }
>  }
>
> +struct _sas_device *
> +__mpt2sas_get_sdev_from_target(struct MPT2SAS_TARGET *tgt_priv)
> +{
> +       struct _sas_device *ret;
> +
> +       ret = tgt_priv->sdev;
> +       if (ret)
> +               sas_device_get(ret);
> +
> +       return ret;
> +}
> +
> +struct _sas_device *
> +__mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
> +    u64 sas_address)
> +{
> +       struct _sas_device *sas_device;
> +
> +       assert_spin_locked(&ioc->sas_device_lock);
> +
> +       list_for_each_entry(sas_device, &ioc->sas_device_list, list)
> +               if (sas_device->sas_address == sas_address)
> +                       goto found_device;
> +
> +       list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
> +               if (sas_device->sas_address == sas_address)
> +                       goto found_device;
> +
> +       return NULL;
> +
> +found_device:
> +       sas_device_get(sas_device);
> +       return sas_device;
> +}
> +
>  /**
> - * mpt2sas_scsih_sas_device_find_by_sas_address - sas device search
> + * mpt2sas_get_sdev_by_addr - sas device search
>   * @ioc: per adapter object
>   * @sas_address: sas address
>   * Context: Calling function should acquire ioc->sas_device_lock
> @@ -536,24 +571,44 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
>   * object.
>   */
>  struct _sas_device *
> -mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> +mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
>      u64 sas_address)
>  {
>         struct _sas_device *sas_device;
> +       unsigned long flags;
> +
> +       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> +                       sas_address);
> +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +
> +       return sas_device;
> +}
> +
> +static struct _sas_device *
> +__mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> +{
> +       struct _sas_device *sas_device;
> +
> +       assert_spin_locked(&ioc->sas_device_lock);
>
>         list_for_each_entry(sas_device, &ioc->sas_device_list, list)
> -               if (sas_device->sas_address == sas_address)
> -                       return sas_device;
> +               if (sas_device->handle == handle)
> +                       goto found_device;
>
>         list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
> -               if (sas_device->sas_address == sas_address)
> -                       return sas_device;
> +               if (sas_device->handle == handle)
> +                       goto found_device;
>
>         return NULL;
> +
> +found_device:
> +       sas_device_get(sas_device);
> +       return sas_device;
>  }
>
>  /**
> - * _scsih_sas_device_find_by_handle - sas device search
> + * mpt2sas_get_sdev_by_handle - sas device search
>   * @ioc: per adapter object
>   * @handle: sas device handle (assigned by firmware)
>   * Context: Calling function should acquire ioc->sas_device_lock
> @@ -562,19 +617,16 @@ mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
>   * object.
>   */
>  static struct _sas_device *
> -_scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> +mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>  {
>         struct _sas_device *sas_device;
> +       unsigned long flags;
>
> -       list_for_each_entry(sas_device, &ioc->sas_device_list, list)
> -               if (sas_device->handle == handle)
> -                       return sas_device;
> -
> -       list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
> -               if (sas_device->handle == handle)
> -                       return sas_device;
> +       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> -       return NULL;
> +       return sas_device;
>  }
>
>  /**
> @@ -583,7 +635,7 @@ _scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>   * @sas_device: the sas_device object
>   * Context: This function will acquire ioc->sas_device_lock.
>   *
> - * Removing object and freeing associated memory from the ioc->sas_device_list.
> + * If sas_device is on the list, remove it and decrement its reference count.
>   */
>  static void
>  _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
> @@ -594,9 +646,15 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
>         if (!sas_device)
>                 return;
>
> +       /*
> +        * The lock serializes access to the list, but we still need to verify
> +        * that nobody removed the entry while we were waiting on the lock.
> +        */
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       list_del(&sas_device->list);
> -       kfree(sas_device);
> +       if (!list_empty(&sas_device->list)) {
> +               list_del_init(&sas_device->list);
> +               sas_device_put(sas_device);
> +       }
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>  }
>
> @@ -620,6 +678,7 @@ _scsih_sas_device_add(struct MPT2SAS_ADAPTER *ioc,
>             sas_device->handle, (unsigned long long)sas_device->sas_address));
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> +       sas_device_get(sas_device);
>         list_add_tail(&sas_device->list, &ioc->sas_device_list);
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> @@ -659,6 +718,7 @@ _scsih_sas_device_init_add(struct MPT2SAS_ADAPTER *ioc,
>             sas_device->handle, (unsigned long long)sas_device->sas_address));
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> +       sas_device_get(sas_device);
>         list_add_tail(&sas_device->list, &ioc->sas_device_init_list);
>         _scsih_determine_boot_device(ioc, sas_device, 0);
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> @@ -1208,12 +1268,14 @@ _scsih_change_queue_depth(struct scsi_device *sdev, int qdepth)
>                 goto not_sata;
>         if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))
>                 goto not_sata;
> +
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> -          sas_device_priv_data->sas_target->sas_address);
> -       if (sas_device && sas_device->device_info &
> -           MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
> +       sas_device = __mpt2sas_get_sdev_from_target(sas_target_priv_data);
> +       if (sas_device && sas_device->device_info
> +                       & MPI2_SAS_DEVICE_INFO_SATA_DEVICE) {
>                 max_depth = MPT2SAS_SATA_QUEUE_DEPTH;
> +               sas_device_put(sas_device);
> +       }
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
>   not_sata:
> @@ -1271,18 +1333,21 @@ _scsih_target_alloc(struct scsi_target *starget)
>         /* sas/sata devices */
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
>         rphy = dev_to_rphy(starget->dev.parent);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
>            rphy->identify.sas_address);
>
>         if (sas_device) {
>                 sas_target_priv_data->handle = sas_device->handle;
>                 sas_target_priv_data->sas_address = sas_device->sas_address;
> +               sas_target_priv_data->sdev = sas_device;
>                 sas_device->starget = starget;
>                 sas_device->id = starget->id;
>                 sas_device->channel = starget->channel;
>                 if (test_bit(sas_device->handle, ioc->pd_handles))
>                         sas_target_priv_data->flags |=
>                             MPT_TARGET_FLAGS_RAID_COMPONENT;
> +
> +               sas_device_put(sas_device);
>         }
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> @@ -1324,13 +1389,14 @@ _scsih_target_destroy(struct scsi_target *starget)
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
>         rphy = dev_to_rphy(starget->dev.parent);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> -          rphy->identify.sas_address);
> +       sas_device = __mpt2sas_get_sdev_from_target(sas_target_priv_data);
>         if (sas_device && (sas_device->starget == starget) &&
>             (sas_device->id == starget->id) &&
>             (sas_device->channel == starget->channel))
>                 sas_device->starget = NULL;
>
> +       if (sas_device)
> +               sas_device_put(sas_device);
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
>   out:
> @@ -1386,7 +1452,7 @@ _scsih_slave_alloc(struct scsi_device *sdev)
>
>         if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
>                 spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +               sas_device = __mpt2sas_get_sdev_by_addr(ioc,
>                                 sas_target_priv_data->sas_address);
>                 if (sas_device && (sas_device->starget == NULL)) {
>                         sdev_printk(KERN_INFO, sdev,
> @@ -1394,6 +1460,10 @@ _scsih_slave_alloc(struct scsi_device *sdev)
>                              __func__, __LINE__);
>                         sas_device->starget = starget;
>                 }
> +
> +               if (sas_device)
> +                       sas_device_put(sas_device);
> +
>                 spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>         }
>
> @@ -1428,10 +1498,12 @@ _scsih_slave_destroy(struct scsi_device *sdev)
>
>         if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
>                 spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> -                  sas_target_priv_data->sas_address);
> +               sas_device = __mpt2sas_get_sdev_from_target(sas_target_priv_data);
>                 if (sas_device && !sas_target_priv_data->num_luns)
>                         sas_device->starget = NULL;
> +
> +               if (sas_device)
> +                       sas_device_put(sas_device);
>                 spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>         }
>
> @@ -2078,7 +2150,7 @@ _scsih_slave_configure(struct scsi_device *sdev)
>         }
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
>            sas_device_priv_data->sas_target->sas_address);
>         if (!sas_device) {
>                 spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> @@ -2116,13 +2188,14 @@ _scsih_slave_configure(struct scsi_device *sdev)
>         if (!ssp_target)
>                 _scsih_display_sata_capabilities(ioc, handle, sdev);
>
> -
>         _scsih_change_queue_depth(sdev, qdepth);
>
>         if (ssp_target) {
>                 sas_read_port_mode_page(sdev);
>                 _scsih_enable_tlr(ioc, sdev);
>         }
> +
> +       sas_device_put(sas_device);
>         return 0;
>  }
>
> @@ -2509,8 +2582,7 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
>                     device_str, (unsigned long long)priv_target->sas_address);
>         } else {
>                 spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> -                   priv_target->sas_address);
> +               sas_device = __mpt2sas_get_sdev_from_target(priv_target);
>                 if (sas_device) {
>                         if (priv_target->flags &
>                             MPT_TARGET_FLAGS_RAID_COMPONENT) {
> @@ -2529,6 +2601,8 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
>                             "enclosure_logical_id(0x%016llx), slot(%d)\n",
>                            (unsigned long long)sas_device->enclosure_logical_id,
>                             sas_device->slot);
> +
> +                       sas_device_put(sas_device);
>                 }
>                 spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>         }
> @@ -2604,12 +2678,12 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
>  {
>         struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
>         struct MPT2SAS_DEVICE *sas_device_priv_data;
> -       struct _sas_device *sas_device;
> -       unsigned long flags;
> +       struct _sas_device *sas_device = NULL;
>         u16     handle;
>         int r;
>
>         struct scsi_target *starget = scmd->device->sdev_target;
> +       struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;
>
>         starget_printk(KERN_INFO, starget, "attempting device reset! "
>             "scmd(%p)\n", scmd);
> @@ -2629,12 +2703,9 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
>         handle = 0;
>         if (sas_device_priv_data->sas_target->flags &
>             MPT_TARGET_FLAGS_RAID_COMPONENT) {
> -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               sas_device = _scsih_sas_device_find_by_handle(ioc,
> -                  sas_device_priv_data->sas_target->handle);
> +               sas_device = __mpt2sas_get_sdev_from_target(target_priv_data);
>                 if (sas_device)
>                         handle = sas_device->volume_handle;
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>         } else
>                 handle = sas_device_priv_data->sas_target->handle;
>
> @@ -2651,6 +2722,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
>   out:
>         sdev_printk(KERN_INFO, scmd->device, "device reset: %s scmd(%p)\n",
>             ((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
> +
> +       if (sas_device)
> +               sas_device_put(sas_device);
> +
>         return r;
>  }
>
> @@ -2665,11 +2740,11 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
>  {
>         struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
>         struct MPT2SAS_DEVICE *sas_device_priv_data;
> -       struct _sas_device *sas_device;
> -       unsigned long flags;
> +       struct _sas_device *sas_device = NULL;
>         u16     handle;
>         int r;
>         struct scsi_target *starget = scmd->device->sdev_target;
> +       struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;
>
>         starget_printk(KERN_INFO, starget, "attempting target reset! "
>             "scmd(%p)\n", scmd);
> @@ -2689,12 +2764,9 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
>         handle = 0;
>         if (sas_device_priv_data->sas_target->flags &
>             MPT_TARGET_FLAGS_RAID_COMPONENT) {
> -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               sas_device = _scsih_sas_device_find_by_handle(ioc,
> -                  sas_device_priv_data->sas_target->handle);
> +               sas_device = __mpt2sas_get_sdev_from_target(target_priv_data);
>                 if (sas_device)
>                         handle = sas_device->volume_handle;
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>         } else
>                 handle = sas_device_priv_data->sas_target->handle;
>
> @@ -2711,6 +2783,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
>   out:
>         starget_printk(KERN_INFO, starget, "target reset: %s scmd(%p)\n",
>             ((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
> +
> +       if (sas_device)
> +               sas_device_put(sas_device);
> +
>         return r;
>  }
>
> @@ -3002,15 +3078,15 @@ _scsih_block_io_to_children_attached_to_ex(struct MPT2SAS_ADAPTER *ioc,
>
>         list_for_each_entry(mpt2sas_port,
>            &sas_expander->sas_port_list, port_list) {
> -               if (mpt2sas_port->remote_identify.device_type ==
> -                   SAS_END_DEVICE) {
> +               if (mpt2sas_port->remote_identify.device_type == SAS_END_DEVICE) {
>                         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -                       sas_device =
> -                           mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> -                          mpt2sas_port->remote_identify.sas_address);
> -                       if (sas_device)
> +                       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> +                                       mpt2sas_port->remote_identify.sas_address);
> +                       if (sas_device) {
>                                 set_bit(sas_device->handle,
> -                                   ioc->blocking_handles);
> +                                               ioc->blocking_handles);
> +                               sas_device_put(sas_device);
> +                       }
>                         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>                 }
>         }
> @@ -3080,7 +3156,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>  {
>         Mpi2SCSITaskManagementRequest_t *mpi_request;
>         u16 smid;
> -       struct _sas_device *sas_device;
> +       struct _sas_device *sas_device = NULL;
>         struct MPT2SAS_TARGET *sas_target_priv_data = NULL;
>         u64 sas_address = 0;
>         unsigned long flags;
> @@ -3110,7 +3186,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>                 return;
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
>         if (sas_device && sas_device->starget &&
>              sas_device->starget->hostdata) {
>                 sas_target_priv_data = sas_device->starget->hostdata;
> @@ -3131,14 +3207,14 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>         if (!smid) {
>                 delayed_tr = kzalloc(sizeof(*delayed_tr), GFP_ATOMIC);
>                 if (!delayed_tr)
> -                       return;
> +                       goto out;
>                 INIT_LIST_HEAD(&delayed_tr->list);
>                 delayed_tr->handle = handle;
>                 list_add_tail(&delayed_tr->list, &ioc->delayed_tr_list);
>                 dewtprintk(ioc, printk(MPT2SAS_INFO_FMT
>                     "DELAYED:tr:handle(0x%04x), (open)\n",
>                     ioc->name, handle));
> -               return;
> +               goto out;
>         }
>
>         dewtprintk(ioc, printk(MPT2SAS_INFO_FMT "tr_send:handle(0x%04x), "
> @@ -3150,6 +3226,9 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>         mpi_request->DevHandle = cpu_to_le16(handle);
>         mpi_request->TaskType = MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET;
>         mpt2sas_base_put_smid_hi_priority(ioc, smid);
> +out:
> +       if (sas_device)
> +               sas_device_put(sas_device);
>  }
>
>
> @@ -4068,7 +4147,6 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
>         char *desc_scsi_state = ioc->tmp_string;
>         u32 log_info = le32_to_cpu(mpi_reply->IOCLogInfo);
>         struct _sas_device *sas_device = NULL;
> -       unsigned long flags;
>         struct scsi_target *starget = scmd->device->sdev_target;
>         struct MPT2SAS_TARGET *priv_target = starget->hostdata;
>         char *device_str = NULL;
> @@ -4200,9 +4278,7 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
>                 printk(MPT2SAS_WARN_FMT "\t%s wwid(0x%016llx)\n", ioc->name,
>                     device_str, (unsigned long long)priv_target->sas_address);
>         } else {
> -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> -                   priv_target->sas_address);
> +               sas_device = __mpt2sas_get_sdev_from_target(priv_target);
>                 if (sas_device) {
>                         printk(MPT2SAS_WARN_FMT "\tsas_address(0x%016llx), "
>                             "phy(%d)\n", ioc->name, sas_device->sas_address,
> @@ -4211,8 +4287,9 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
>                             "\tenclosure_logical_id(0x%016llx), slot(%d)\n",
>                             ioc->name, sas_device->enclosure_logical_id,
>                             sas_device->slot);
> +
> +                       sas_device_put(sas_device);
>                 }
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>         }
>
>         printk(MPT2SAS_WARN_FMT "\thandle(0x%04x), ioc_status(%s)(0x%04x), "
> @@ -4259,7 +4336,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>         Mpi2SepRequest_t mpi_request;
>         struct _sas_device *sas_device;
>
> -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> +       sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
>         if (!sas_device)
>                 return;
>
> @@ -4274,7 +4351,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>             &mpi_request)) != 0) {
>                 printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n", ioc->name,
>                 __FILE__, __LINE__, __func__);
> -               return;
> +               goto out;
>         }
>         sas_device->pfa_led_on = 1;
>
> @@ -4284,8 +4361,10 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>                  "enclosure_processor: ioc_status (0x%04x), loginfo(0x%08x)\n",
>                  ioc->name, le16_to_cpu(mpi_reply.IOCStatus),
>                  le32_to_cpu(mpi_reply.IOCLogInfo)));
> -               return;
> +               goto out;
>         }
> +out:
> +       sas_device_put(sas_device);
>  }
>
>  /**
> @@ -4370,19 +4449,17 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>
>         /* only handle non-raid devices */
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
>         if (!sas_device) {
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -               return;
> +               goto out_unlock;
>         }
>         starget = sas_device->starget;
>         sas_target_priv_data = starget->hostdata;
>
>         if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_RAID_COMPONENT) ||
> -          ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))) {
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -               return;
> -       }
> +          ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)))
> +               goto out_unlock;
> +
>         starget_printk(KERN_WARNING, starget, "predicted fault\n");
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> @@ -4396,7 +4473,7 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>         if (!event_reply) {
>                 printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
>                     ioc->name, __FILE__, __LINE__, __func__);
> -               return;
> +               goto out;
>         }
>
>         event_reply->Function = MPI2_FUNCTION_EVENT_NOTIFICATION;
> @@ -4413,6 +4490,14 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>         event_data->SASAddress = cpu_to_le64(sas_target_priv_data->sas_address);
>         mpt2sas_ctl_add_to_event_log(ioc, event_reply);
>         kfree(event_reply);
> +out:
> +       if (sas_device)
> +               sas_device_put(sas_device);
> +       return;
> +
> +out_unlock:
> +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +       goto out;
>  }
>
>  /**
> @@ -5148,14 +5233,13 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
>         sas_address = le64_to_cpu(sas_device_pg0.SASAddress);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
>             sas_address);
>
>         if (!sas_device) {
>                 printk(MPT2SAS_ERR_FMT "device is not present "
>                     "handle(0x%04x), no sas_device!!!\n", ioc->name, handle);
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -               return;
> +               goto out_unlock;
>         }
>
>         if (unlikely(sas_device->handle != handle)) {
> @@ -5172,19 +5256,22 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>             MPI2_SAS_DEVICE0_FLAGS_DEVICE_PRESENT)) {
>                 printk(MPT2SAS_ERR_FMT "device is not present "
>                     "handle(0x%04x), flags!!!\n", ioc->name, handle);
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -               return;
> +               goto out_unlock;
>         }
>
>         /* check if there were any issues with discovery */
>         if (_scsih_check_access_status(ioc, sas_address, handle,
> -           sas_device_pg0.AccessStatus)) {
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -               return;
> -       }
> +           sas_device_pg0.AccessStatus))
> +               goto out_unlock;
> +
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>         _scsih_ublock_io_device(ioc, sas_address);
> +       return;
>
> +out_unlock:
> +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +       if (sas_device)
> +               sas_device_put(sas_device);
>  }
>
>  /**
> @@ -5208,7 +5295,6 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
>         u32 ioc_status;
>         __le64 sas_address;
>         u32 device_info;
> -       unsigned long flags;
>
>         if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
>             MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
> @@ -5250,14 +5336,13 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
>                 return -1;
>         }
>
> -
> -       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +       sas_device = mpt2sas_get_sdev_by_addr(ioc,
>             sas_address);
> -       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> -       if (sas_device)
> +       if (sas_device) {
> +               sas_device_put(sas_device);
>                 return 0;
> +       }
>
>         sas_device = kzalloc(sizeof(struct _sas_device),
>             GFP_KERNEL);
> @@ -5267,6 +5352,7 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
>                 return -1;
>         }
>
> +       kref_init(&sas_device->refcount);
>         sas_device->handle = handle;
>         if (_scsih_get_sas_address(ioc, le16_to_cpu
>                 (sas_device_pg0.ParentDevHandle),
> @@ -5344,7 +5430,6 @@ _scsih_remove_device(struct MPT2SAS_ADAPTER *ioc,
>             "handle(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
>             sas_device->handle, (unsigned long long)
>             sas_device->sas_address));
> -       kfree(sas_device);
>  }
>  /**
>   * _scsih_device_remove_by_handle - removing device object by handle
> @@ -5363,12 +5448,17 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>                 return;
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> -       if (sas_device)
> -               list_del(&sas_device->list);
> +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> +       if (sas_device) {
> +               list_del_init(&sas_device->list);
> +               sas_device_put(sas_device);
> +       }
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -       if (sas_device)
> +
> +       if (sas_device) {
>                 _scsih_remove_device(ioc, sas_device);
> +               sas_device_put(sas_device);
> +       }
>  }
>
>  /**
> @@ -5389,13 +5479,17 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
>                 return;
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> -           sas_address);
> -       if (sas_device)
> -               list_del(&sas_device->list);
> +       sas_device = __mpt2sas_get_sdev_by_addr(ioc, sas_address);
> +       if (sas_device) {
> +               list_del_init(&sas_device->list);
> +               sas_device_put(sas_device);
> +       }
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -       if (sas_device)
> +
> +       if (sas_device) {
>                 _scsih_remove_device(ioc, sas_device);
> +               sas_device_put(sas_device);
> +       }
>  }
>  #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
>  /**
> @@ -5716,26 +5810,28 @@ _scsih_sas_device_status_change_event(struct MPT2SAS_ADAPTER *ioc,
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
>         sas_address = le64_to_cpu(event_data->SASAddress);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
>             sas_address);
>
> -       if (!sas_device || !sas_device->starget) {
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -               return;
> -       }
> +       if (!sas_device || !sas_device->starget)
> +               goto out;
>
>         target_priv_data = sas_device->starget->hostdata;
> -       if (!target_priv_data) {
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -               return;
> -       }
> +       if (!target_priv_data)
> +               goto out;
>
>         if (event_data->ReasonCode ==
>             MPI2_EVENT_SAS_DEV_STAT_RC_INTERNAL_DEVICE_RESET)
>                 target_priv_data->tm_busy = 1;
>         else
>                 target_priv_data->tm_busy = 0;
> +
> +out:
> +       if (sas_device)
> +               sas_device_put(sas_device);
> +
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +
>  }
>
>  #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
> @@ -6123,7 +6219,7 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
>         u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
>         if (sas_device) {
>                 sas_device->volume_handle = 0;
>                 sas_device->volume_wwid = 0;
> @@ -6142,6 +6238,8 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
>         /* exposing raid component */
>         if (starget)
>                 starget_for_each_device(starget, NULL, _scsih_reprobe_lun);
> +
> +       sas_device_put(sas_device);
>  }
>
>  /**
> @@ -6170,7 +6268,7 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
>                     &volume_wwid);
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
>         if (sas_device) {
>                 set_bit(handle, ioc->pd_handles);
>                 if (sas_device->starget && sas_device->starget->hostdata) {
> @@ -6189,6 +6287,8 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
>         /* hiding raid component */
>         if (starget)
>                 starget_for_each_device(starget, (void *)1, _scsih_reprobe_lun);
> +
> +       sas_device_put(sas_device);
>  }
>
>  /**
> @@ -6221,7 +6321,6 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
>      Mpi2EventIrConfigElement_t *element)
>  {
>         struct _sas_device *sas_device;
> -       unsigned long flags;
>         u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
>         Mpi2ConfigReply_t mpi_reply;
>         Mpi2SasDevicePage0_t sas_device_pg0;
> @@ -6231,11 +6330,11 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
>
>         set_bit(handle, ioc->pd_handles);
>
> -       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> -       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -       if (sas_device)
> +       sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> +       if (sas_device) {
> +               sas_device_put(sas_device);
>                 return;
> +       }
>
>         if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
>             MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
> @@ -6509,7 +6608,6 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
>         u16 handle, parent_handle;
>         u32 state;
>         struct _sas_device *sas_device;
> -       unsigned long flags;
>         Mpi2ConfigReply_t mpi_reply;
>         Mpi2SasDevicePage0_t sas_device_pg0;
>         u32 ioc_status;
> @@ -6542,12 +6640,11 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
>                 if (!ioc->is_warpdrive)
>                         set_bit(handle, ioc->pd_handles);
>
> -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -
> -               if (sas_device)
> +               sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> +               if (sas_device) {
> +                       sas_device_put(sas_device);
>                         return;
> +               }
>
>                 if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
>                     &sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
> @@ -7015,6 +7112,7 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
>         struct _raid_device *raid_device, *raid_device_next;
>         struct list_head tmp_list;
>         unsigned long flags;
> +       LIST_HEAD(head);
>
>         printk(MPT2SAS_INFO_FMT "removing unresponding devices: start\n",
>             ioc->name);
> @@ -7022,14 +7120,29 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
>         /* removing unresponding end devices */
>         printk(MPT2SAS_INFO_FMT "removing unresponding devices: end-devices\n",
>             ioc->name);
> +
> +       /*
> +        * Iterate, pulling off devices marked as non-responding. We become the
> +        * owner for the reference the list had on any object we prune.
> +        */
> +       spin_lock_irqsave(&ioc->sas_device_lock, flags);
>         list_for_each_entry_safe(sas_device, sas_device_next,
> -           &ioc->sas_device_list, list) {
> +                       &ioc->sas_device_list, list) {
>                 if (!sas_device->responding)
> -                       mpt2sas_device_remove_by_sas_address(ioc,
> -                               sas_device->sas_address);
> +                       list_move_tail(&sas_device->list, &head);
>                 else
>                         sas_device->responding = 0;
>         }
> +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +
> +       /*
> +        * Now, uninitialize and remove the unresponding devices we pruned.
> +        */
> +       list_for_each_entry_safe(sas_device, sas_device_next, &head, list) {
> +               _scsih_remove_device(ioc, sas_device);
> +               list_del_init(&sas_device->list);
> +               sas_device_put(sas_device);
> +       }
>
>         /* removing unresponding volumes */
>         if (ioc->ir_firmware) {
> @@ -7179,11 +7292,11 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
>                 }
>                 phys_disk_num = pd_pg0.PhysDiskNum;
>                 handle = le16_to_cpu(pd_pg0.DevHandle);
> -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -               if (sas_device)
> +               sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> +               if (sas_device) {
> +                       sas_device_put(sas_device);
>                         continue;
> +               }
>                 if (mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
>                     &sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
>                     handle) != 0)
> @@ -7302,12 +7415,12 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
>                 if (!(_scsih_is_end_device(
>                     le32_to_cpu(sas_device_pg0.DeviceInfo))))
>                         continue;
> -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +               sas_device = mpt2sas_get_sdev_by_addr(ioc,
>                     le64_to_cpu(sas_device_pg0.SASAddress));
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -               if (sas_device)
> +               if (sas_device) {
> +                       sas_device_put(sas_device);
>                         continue;
> +               }
>                 parent_handle = le16_to_cpu(sas_device_pg0.ParentDevHandle);
>                 if (!_scsih_get_sas_address(ioc, parent_handle, &sas_address)) {
>                         printk(MPT2SAS_INFO_FMT "\tBEFORE adding end device: "
> @@ -7966,6 +8079,37 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
>         }
>  }
>
> +static struct _sas_device *dequeue_next_sas_device(struct MPT2SAS_ADAPTER *ioc)
> +{
> +       struct _sas_device *sas_device = NULL;
> +       unsigned long flags;
> +
> +       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> +       if (!list_empty(&ioc->sas_device_init_list)) {
> +               sas_device = list_first_entry(&ioc->sas_device_init_list,
> +                               struct _sas_device, list);
> +               list_del_init(&sas_device->list);
> +       }
> +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +
> +       /*
> +        * If an item was dequeued, the caller now owns the reference that was
> +        * previously owned by the list
> +        */
> +       return sas_device;
> +}
> +
> +static void sas_device_make_active(struct MPT2SAS_ADAPTER *ioc,
> +               struct _sas_device *sas_device)
> +{
> +       unsigned long flags;
> +
> +       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> +       sas_device_get(sas_device);
> +       list_add_tail(&sas_device->list, &ioc->sas_device_list);
> +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +}
> +
>  /**
>   * _scsih_probe_sas - reporting sas devices to sas transport
>   * @ioc: per adapter object
> @@ -7975,34 +8119,28 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
>  static void
>  _scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
>  {
> -       struct _sas_device *sas_device, *next;
> -       unsigned long flags;
> -
> -       /* SAS Device List */
> -       list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
> -           list) {
> +       struct _sas_device *sas_device;
>
> -               if (ioc->hide_drives)
> -                       continue;
> +       if (ioc->hide_drives)
> +               return;
>
> +       while ((sas_device = dequeue_next_sas_device(ioc))) {

I see an issue here. The sas_device is removed from
sas_device_init_list and then added to the STL by calling sas_rphy_add(),
which in turn invokes the driver's target_alloc, slave_alloc and
slave_configure callback routines. These routines check whether the
device is present in sas_device_init_list; because it is no longer on
that list at that point, the device won't be added.
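
To illustrate the window I mean, here is a minimal user-space sketch (not
driver code). The names struct sdev, struct adapter, find_by_addr(),
dequeue_next() and make_active() are hypothetical stand-ins for the driver's
structures and helpers; locking and refcounting are left out on purpose, and
the point where the transport callbacks would run is only marked by a comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
struct sdev {
	uint64_t sas_address;
	struct sdev *next;
};

struct adapter {
	struct sdev *init_list;   /* models ioc->sas_device_init_list */
	struct sdev *active_list; /* models ioc->sas_device_list */
};

/* Walk one singly linked list looking for a matching SAS address. */
static struct sdev *find_on(struct sdev *head, uint64_t addr)
{
	for (; head; head = head->next)
		if (head->sas_address == addr)
			return head;
	return NULL;
}

/* Models the lookup issued from target_alloc: both lists are searched. */
static struct sdev *find_by_addr(struct adapter *ioc, uint64_t addr)
{
	struct sdev *d = find_on(ioc->active_list, addr);

	return d ? d : find_on(ioc->init_list, addr);
}

/* Models dequeue_next_sas_device(): detach the head of the init list. */
static struct sdev *dequeue_next(struct adapter *ioc)
{
	struct sdev *d = ioc->init_list;

	if (d) {
		ioc->init_list = d->next;
		d->next = NULL;
	}
	return d;
}

/* Models sas_device_make_active(): put the device on the active list. */
static void make_active(struct adapter *ioc, struct sdev *d)
{
	d->next = ioc->active_list;
	ioc->active_list = d;
}

/*
 * Returns 1 if a lookup issued between the dequeue and make_active()
 * finds the device, 0 otherwise.
 */
static int lookup_succeeds_during_probe(void)
{
	struct sdev dev = { .sas_address = 0x1234, .next = NULL };
	struct adapter ioc = { .init_list = &dev, .active_list = NULL };
	struct sdev *d = dequeue_next(&ioc);
	int found;

	/* In the driver, the port add and its callbacks would run here. */
	found = find_by_addr(&ioc, d->sas_address) != NULL;

	make_active(&ioc, d);
	return found;
}
```

In this model lookup_succeeds_during_probe() returns 0: while the callbacks
would be running, the device is on neither list, so the lookup comes back
empty. It only becomes visible again after make_active().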


Regards,
Sreekanth
>                 if (!mpt2sas_transport_port_add(ioc, sas_device->handle,
> -                   sas_device->sas_address_parent)) {
> -                       list_del(&sas_device->list);
> -                       kfree(sas_device);
> +                               sas_device->sas_address_parent)) {
> +                       sas_device_put(sas_device);
>                         continue;
>                 } else if (!sas_device->starget) {
>                         if (!ioc->is_driver_loading) {
>                                 mpt2sas_transport_port_remove(ioc,
> -                                       sas_device->sas_address,
> -                                       sas_device->sas_address_parent);
> -                               list_del(&sas_device->list);
> -                               kfree(sas_device);
> +                                               sas_device->sas_address,
> +                                               sas_device->sas_address_parent);
> +                               sas_device_put(sas_device);
>                                 continue;
>                         }
>                 }
> -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               list_move_tail(&sas_device->list, &ioc->sas_device_list);
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +
> +               sas_device_make_active(ioc, sas_device);
> +               sas_device_put(sas_device);
>         }
>  }
>
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_transport.c b/drivers/scsi/mpt2sas/mpt2sas_transport.c
> index ff2500a..af86800 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_transport.c
> +++ b/drivers/scsi/mpt2sas/mpt2sas_transport.c
> @@ -1323,15 +1323,17 @@ _transport_get_enclosure_identifier(struct sas_rphy *rphy, u64 *identifier)
>         int rc;
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
>             rphy->identify.sas_address);
>         if (sas_device) {
>                 *identifier = sas_device->enclosure_logical_id;
>                 rc = 0;
> +               sas_device_put(sas_device);
>         } else {
>                 *identifier = 0;
>                 rc = -ENXIO;
>         }
> +
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>         return rc;
>  }
> @@ -1351,12 +1353,14 @@ _transport_get_bay_identifier(struct sas_rphy *rphy)
>         int rc;
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
>             rphy->identify.sas_address);
> -       if (sas_device)
> +       if (sas_device) {
>                 rc = sas_device->slot;
> -       else
> +               sas_device_put(sas_device);
> +       } else {
>                 rc = -ENXIO;
> +       }
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>         return rc;
>  }
> --
> 1.8.1
>



-- 

Regards,
Sreekanth


* Re: [PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage
  2015-07-16 14:57           ` Sreekanth Reddy
@ 2015-07-21  7:03             ` Calvin Owens
  0 siblings, 0 replies; 52+ messages in thread
From: Calvin Owens @ 2015-07-21  7:03 UTC (permalink / raw)
  To: Sreekanth Reddy
  Cc: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Abhijit Mahajan,
	MPT-FusionLinux.pdl, linux-scsi, linux-kernel, kernel-team,
	Christoph Hellwig, Bart Van Assche

On Thursday 07/16 at 20:27 +0530, Sreekanth Reddy wrote:
> On Sun, Jul 12, 2015 at 9:54 AM, Calvin Owens <calvinowens@fb.com> wrote:
> > These objects can be referenced concurrently throughout the driver, we
> > need a way to make sure threads can't delete them out from under each
> > other. This patch adds the refcount, and refactors the code to use it.
> >
> > Additionally, we cannot iterate over the sas_device_list without
> > holding the lock, or we risk corrupting random memory if items are
> > added or deleted as we iterate. This patch refactors _scsih_probe_sas()
> > to use the sas_device_list in a safe way.
> >
> > Cc: Christoph Hellwig <hch@infradead.org>
> > Cc: Bart Van Assche <bart.vanassche@sandisk.com>
> > Signed-off-by: Calvin Owens <calvinowens@fb.com>
> > ---
> >  drivers/scsi/mpt2sas/mpt2sas_base.h      |  22 +-
> >  drivers/scsi/mpt2sas/mpt2sas_scsih.c     | 434 ++++++++++++++++++++-----------
> >  drivers/scsi/mpt2sas/mpt2sas_transport.c |  12 +-
> >  3 files changed, 315 insertions(+), 153 deletions(-)
> >
> > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
> > index caff8d1..78f41ac 100644
> > --- a/drivers/scsi/mpt2sas/mpt2sas_base.h
> > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
> > @@ -238,6 +238,7 @@
> >   * @flags: MPT_TARGET_FLAGS_XXX flags
> >   * @deleted: target flaged for deletion
> >   * @tm_busy: target is busy with TM request.
> > + * @sdev: The sas_device associated with this target
> >   */
> >  struct MPT2SAS_TARGET {
> >         struct scsi_target *starget;
> > @@ -248,6 +249,7 @@ struct MPT2SAS_TARGET {
> >         u32     flags;
> >         u8      deleted;
> >         u8      tm_busy;
> > +       struct _sas_device *sdev;
> >  };
> >
> >
> > @@ -376,8 +378,24 @@ struct _sas_device {
> >         u8      phy;
> >         u8      responding;
> >         u8      pfa_led_on;
> > +       struct kref refcount;
> >  };
> >
> > +static inline void sas_device_get(struct _sas_device *s)
> > +{
> > +       kref_get(&s->refcount);
> > +}
> > +
> > +static inline void sas_device_free(struct kref *r)
> > +{
> > +       kfree(container_of(r, struct _sas_device, refcount));
> > +}
> > +
> > +static inline void sas_device_put(struct _sas_device *s)
> > +{
> > +       kref_put(&s->refcount, sas_device_free);
> > +}
> > +
> >  /**
> >   * struct _raid_device - raid volume link list
> >   * @list: sas device list
> > @@ -1095,7 +1113,9 @@ struct _sas_node *mpt2sas_scsih_expander_find_by_handle(struct MPT2SAS_ADAPTER *
> >      u16 handle);
> >  struct _sas_node *mpt2sas_scsih_expander_find_by_sas_address(struct MPT2SAS_ADAPTER
> >      *ioc, u64 sas_address);
> > -struct _sas_device *mpt2sas_scsih_sas_device_find_by_sas_address(
> > +struct _sas_device *mpt2sas_get_sdev_by_addr(
> > +    struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
> > +struct _sas_device *__mpt2sas_get_sdev_by_addr(
> >      struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
> >
> >  void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc);
> > diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> > index 3f26147..fad80ce 100644
> > --- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> > +++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> > @@ -526,8 +526,43 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
> >         }
> >  }
> >
> > +struct _sas_device *
> > +__mpt2sas_get_sdev_from_target(struct MPT2SAS_TARGET *tgt_priv)
> > +{
> > +       struct _sas_device *ret;
> > +
> > +       ret = tgt_priv->sdev;
> > +       if (ret)
> > +               sas_device_get(ret);
> > +
> > +       return ret;
> > +}
> > +
> > +struct _sas_device *
> > +__mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
> > +    u64 sas_address)
> > +{
> > +       struct _sas_device *sas_device;
> > +
> > +       assert_spin_locked(&ioc->sas_device_lock);
> > +
> > +       list_for_each_entry(sas_device, &ioc->sas_device_list, list)
> > +               if (sas_device->sas_address == sas_address)
> > +                       goto found_device;
> > +
> > +       list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
> > +               if (sas_device->sas_address == sas_address)
> > +                       goto found_device;
> > +
> > +       return NULL;
> > +
> > +found_device:
> > +       sas_device_get(sas_device);
> > +       return sas_device;
> > +}
> > +
> >  /**
> > - * mpt2sas_scsih_sas_device_find_by_sas_address - sas device search
> > + * mpt2sas_get_sdev_by_addr - sas device search
> >   * @ioc: per adapter object
> >   * @sas_address: sas address
> >   * Context: Calling function should acquire ioc->sas_device_lock
> > @@ -536,24 +571,44 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
> >   * object.
> >   */
> >  struct _sas_device *
> > -mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> > +mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
> >      u64 sas_address)
> >  {
> >         struct _sas_device *sas_device;
> > +       unsigned long flags;
> > +
> > +       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> > +                       sas_address);
> > +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +
> > +       return sas_device;
> > +}
> > +
> > +static struct _sas_device *
> > +__mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > +{
> > +       struct _sas_device *sas_device;
> > +
> > +       assert_spin_locked(&ioc->sas_device_lock);
> >
> >         list_for_each_entry(sas_device, &ioc->sas_device_list, list)
> > -               if (sas_device->sas_address == sas_address)
> > -                       return sas_device;
> > +               if (sas_device->handle == handle)
> > +                       goto found_device;
> >
> >         list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
> > -               if (sas_device->sas_address == sas_address)
> > -                       return sas_device;
> > +               if (sas_device->handle == handle)
> > +                       goto found_device;
> >
> >         return NULL;
> > +
> > +found_device:
> > +       sas_device_get(sas_device);
> > +       return sas_device;
> >  }
> >
> >  /**
> > - * _scsih_sas_device_find_by_handle - sas device search
> > + * mpt2sas_get_sdev_by_handle - sas device search
> >   * @ioc: per adapter object
> >   * @handle: sas device handle (assigned by firmware)
> >   * Context: Calling function should acquire ioc->sas_device_lock
> > @@ -562,19 +617,16 @@ mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> >   * object.
> >   */
> >  static struct _sas_device *
> > -_scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > +mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >  {
> >         struct _sas_device *sas_device;
> > +       unsigned long flags;
> >
> > -       list_for_each_entry(sas_device, &ioc->sas_device_list, list)
> > -               if (sas_device->handle == handle)
> > -                       return sas_device;
> > -
> > -       list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
> > -               if (sas_device->handle == handle)
> > -                       return sas_device;
> > +       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> > +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >
> > -       return NULL;
> > +       return sas_device;
> >  }
> >
> >  /**
> > @@ -583,7 +635,7 @@ _scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >   * @sas_device: the sas_device object
> >   * Context: This function will acquire ioc->sas_device_lock.
> >   *
> > - * Removing object and freeing associated memory from the ioc->sas_device_list.
> > + * If sas_device is on the list, remove it and decrement its reference count.
> >   */
> >  static void
> >  _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
> > @@ -594,9 +646,15 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
> >         if (!sas_device)
> >                 return;
> >
> > +       /*
> > +        * The lock serializes access to the list, but we still need to verify
> > +        * that nobody removed the entry while we were waiting on the lock.
> > +        */
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -       list_del(&sas_device->list);
> > -       kfree(sas_device);
> > +       if (!list_empty(&sas_device->list)) {
> > +               list_del_init(&sas_device->list);
> > +               sas_device_put(sas_device);
> > +       }
> >         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >  }
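
The list_empty() check in the hunk above only works because the entry is unlinked with list_del_init() rather than list_del(): after list_del_init() the node points at itself, so a second remover can tell it has already been taken off the list. A minimal userspace sketch of that idea follows. The node_* and demo_* names are hypothetical re-implementations for illustration, the lock is assumed to be held by the caller, and a plain int stands in for the kref.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Tiny circular doubly linked list, modelled on the kernel's list_head. */
struct node {
	struct node *prev, *next;
};

static void node_init(struct node *n)
{
	n->prev = n->next = n;
}

/* True when the node is not linked to anything but itself. */
static bool node_unlinked(const struct node *n)
{
	return n->next == n;
}

static void node_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink and re-initialise, so node_unlinked() is true afterwards. */
static void node_del_init(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_init(n);
}

struct demo_dev {
	struct node link;
	int refcount;   /* stands in for the kref; lock assumed held */
};

/* Returns true only for the caller that actually removed the entry. */
static bool demo_remove(struct demo_dev *d)
{
	if (node_unlinked(&d->link))
		return false;      /* somebody else already removed it */
	node_del_init(&d->link);
	d->refcount--;             /* drop the reference the list held */
	return true;
}
```

Calling demo_remove() twice on the same object drops the list's reference only once, which is the property the comment in the patch is after.
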
> >
> > @@ -620,6 +678,7 @@ _scsih_sas_device_add(struct MPT2SAS_ADAPTER *ioc,
> >             sas_device->handle, (unsigned long long)sas_device->sas_address));
> >
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > +       sas_device_get(sas_device);
> >         list_add_tail(&sas_device->list, &ioc->sas_device_list);
> >         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >
> > @@ -659,6 +718,7 @@ _scsih_sas_device_init_add(struct MPT2SAS_ADAPTER *ioc,
> >             sas_device->handle, (unsigned long long)sas_device->sas_address));
> >
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > +       sas_device_get(sas_device);
> >         list_add_tail(&sas_device->list, &ioc->sas_device_init_list);
> >         _scsih_determine_boot_device(ioc, sas_device, 0);
> >         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > @@ -1208,12 +1268,14 @@ _scsih_change_queue_depth(struct scsi_device *sdev, int qdepth)
> >                 goto not_sata;
> >         if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))
> >                 goto not_sata;
> > +
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > -          sas_device_priv_data->sas_target->sas_address);
> > -       if (sas_device && sas_device->device_info &
> > -           MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
> > +       sas_device = __mpt2sas_get_sdev_from_target(sas_target_priv_data);
> > +       if (sas_device && sas_device->device_info
> > +                       & MPI2_SAS_DEVICE_INFO_SATA_DEVICE) {
> >                 max_depth = MPT2SAS_SATA_QUEUE_DEPTH;
> > +               sas_device_put(sas_device);
> > +       }
> >         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >
> >   not_sata:
> > @@ -1271,18 +1333,21 @@ _scsih_target_alloc(struct scsi_target *starget)
> >         /* sas/sata devices */
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> >         rphy = dev_to_rphy(starget->dev.parent);
> > -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> >            rphy->identify.sas_address);
> >
> >         if (sas_device) {
> >                 sas_target_priv_data->handle = sas_device->handle;
> >                 sas_target_priv_data->sas_address = sas_device->sas_address;
> > +               sas_target_priv_data->sdev = sas_device;
> >                 sas_device->starget = starget;
> >                 sas_device->id = starget->id;
> >                 sas_device->channel = starget->channel;
> >                 if (test_bit(sas_device->handle, ioc->pd_handles))
> >                         sas_target_priv_data->flags |=
> >                             MPT_TARGET_FLAGS_RAID_COMPONENT;
> > +
> > +               sas_device_put(sas_device);
> >         }
> >         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >
> > @@ -1324,13 +1389,14 @@ _scsih_target_destroy(struct scsi_target *starget)
> >
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> >         rphy = dev_to_rphy(starget->dev.parent);
> > -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > -          rphy->identify.sas_address);
> > +       sas_device = __mpt2sas_get_sdev_from_target(sas_target_priv_data);
> >         if (sas_device && (sas_device->starget == starget) &&
> >             (sas_device->id == starget->id) &&
> >             (sas_device->channel == starget->channel))
> >                 sas_device->starget = NULL;
> >
> > +       if (sas_device)
> > +               sas_device_put(sas_device);
> >         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >
> >   out:
> > @@ -1386,7 +1452,7 @@ _scsih_slave_alloc(struct scsi_device *sdev)
> >
> >         if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
> >                 spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -               sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > +               sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> >                                 sas_target_priv_data->sas_address);
> >                 if (sas_device && (sas_device->starget == NULL)) {
> >                         sdev_printk(KERN_INFO, sdev,
> > @@ -1394,6 +1460,10 @@ _scsih_slave_alloc(struct scsi_device *sdev)
> >                              __func__, __LINE__);
> >                         sas_device->starget = starget;
> >                 }
> > +
> > +               if (sas_device)
> > +                       sas_device_put(sas_device);
> > +
> >                 spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >         }
> >
> > @@ -1428,10 +1498,12 @@ _scsih_slave_destroy(struct scsi_device *sdev)
> >
> >         if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
> >                 spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -               sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > -                  sas_target_priv_data->sas_address);
> > +               sas_device = __mpt2sas_get_sdev_from_target(sas_target_priv_data);
> >                 if (sas_device && !sas_target_priv_data->num_luns)
> >                         sas_device->starget = NULL;
> > +
> > +               if (sas_device)
> > +                       sas_device_put(sas_device);
> >                 spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >         }
> >
> > @@ -2078,7 +2150,7 @@ _scsih_slave_configure(struct scsi_device *sdev)
> >         }
> >
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> >            sas_device_priv_data->sas_target->sas_address);
> >         if (!sas_device) {
> >                 spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > @@ -2116,13 +2188,14 @@ _scsih_slave_configure(struct scsi_device *sdev)
> >         if (!ssp_target)
> >                 _scsih_display_sata_capabilities(ioc, handle, sdev);
> >
> > -
> >         _scsih_change_queue_depth(sdev, qdepth);
> >
> >         if (ssp_target) {
> >                 sas_read_port_mode_page(sdev);
> >                 _scsih_enable_tlr(ioc, sdev);
> >         }
> > +
> > +       sas_device_put(sas_device);
> >         return 0;
> >  }
> >
> > @@ -2509,8 +2582,7 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
> >                     device_str, (unsigned long long)priv_target->sas_address);
> >         } else {
> >                 spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -               sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > -                   priv_target->sas_address);
> > +               sas_device = __mpt2sas_get_sdev_from_target(priv_target);
> >                 if (sas_device) {
> >                         if (priv_target->flags &
> >                             MPT_TARGET_FLAGS_RAID_COMPONENT) {
> > @@ -2529,6 +2601,8 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
> >                             "enclosure_logical_id(0x%016llx), slot(%d)\n",
> >                            (unsigned long long)sas_device->enclosure_logical_id,
> >                             sas_device->slot);
> > +
> > +                       sas_device_put(sas_device);
> >                 }
> >                 spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >         }
> > @@ -2604,12 +2678,12 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
> >  {
> >         struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
> >         struct MPT2SAS_DEVICE *sas_device_priv_data;
> > -       struct _sas_device *sas_device;
> > -       unsigned long flags;
> > +       struct _sas_device *sas_device = NULL;
> >         u16     handle;
> >         int r;
> >
> >         struct scsi_target *starget = scmd->device->sdev_target;
> > +       struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;
> >
> >         starget_printk(KERN_INFO, starget, "attempting device reset! "
> >             "scmd(%p)\n", scmd);
> > @@ -2629,12 +2703,9 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
> >         handle = 0;
> >         if (sas_device_priv_data->sas_target->flags &
> >             MPT_TARGET_FLAGS_RAID_COMPONENT) {
> > -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -               sas_device = _scsih_sas_device_find_by_handle(ioc,
> > -                  sas_device_priv_data->sas_target->handle);
> > +               sas_device = __mpt2sas_get_sdev_from_target(target_priv_data);
> >                 if (sas_device)
> >                         handle = sas_device->volume_handle;
> > -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >         } else
> >                 handle = sas_device_priv_data->sas_target->handle;
> >
> > @@ -2651,6 +2722,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
> >   out:
> >         sdev_printk(KERN_INFO, scmd->device, "device reset: %s scmd(%p)\n",
> >             ((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
> > +
> > +       if (sas_device)
> > +               sas_device_put(sas_device);
> > +
> >         return r;
> >  }
> >
> > @@ -2665,11 +2740,11 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
> >  {
> >         struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
> >         struct MPT2SAS_DEVICE *sas_device_priv_data;
> > -       struct _sas_device *sas_device;
> > -       unsigned long flags;
> > +       struct _sas_device *sas_device = NULL;
> >         u16     handle;
> >         int r;
> >         struct scsi_target *starget = scmd->device->sdev_target;
> > +       struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;
> >
> >         starget_printk(KERN_INFO, starget, "attempting target reset! "
> >             "scmd(%p)\n", scmd);
> > @@ -2689,12 +2764,9 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
> >         handle = 0;
> >         if (sas_device_priv_data->sas_target->flags &
> >             MPT_TARGET_FLAGS_RAID_COMPONENT) {
> > -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -               sas_device = _scsih_sas_device_find_by_handle(ioc,
> > -                  sas_device_priv_data->sas_target->handle);
> > +               sas_device = __mpt2sas_get_sdev_from_target(target_priv_data);
> >                 if (sas_device)
> >                         handle = sas_device->volume_handle;
> > -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >         } else
> >                 handle = sas_device_priv_data->sas_target->handle;
> >
> > @@ -2711,6 +2783,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
> >   out:
> >         starget_printk(KERN_INFO, starget, "target reset: %s scmd(%p)\n",
> >             ((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
> > +
> > +       if (sas_device)
> > +               sas_device_put(sas_device);
> > +
> >         return r;
> >  }
> >
> > @@ -3002,15 +3078,15 @@ _scsih_block_io_to_children_attached_to_ex(struct MPT2SAS_ADAPTER *ioc,
> >
> >         list_for_each_entry(mpt2sas_port,
> >            &sas_expander->sas_port_list, port_list) {
> > -               if (mpt2sas_port->remote_identify.device_type ==
> > -                   SAS_END_DEVICE) {
> > +               if (mpt2sas_port->remote_identify.device_type == SAS_END_DEVICE) {
> >                         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -                       sas_device =
> > -                           mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > -                          mpt2sas_port->remote_identify.sas_address);
> > -                       if (sas_device)
> > +                       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> > +                                       mpt2sas_port->remote_identify.sas_address);
> > +                       if (sas_device) {
> >                                 set_bit(sas_device->handle,
> > -                                   ioc->blocking_handles);
> > +                                               ioc->blocking_handles);
> > +                               sas_device_put(sas_device);
> > +                       }
> >                         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >                 }
> >         }
> > @@ -3080,7 +3156,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >  {
> >         Mpi2SCSITaskManagementRequest_t *mpi_request;
> >         u16 smid;
> > -       struct _sas_device *sas_device;
> > +       struct _sas_device *sas_device = NULL;
> >         struct MPT2SAS_TARGET *sas_target_priv_data = NULL;
> >         u64 sas_address = 0;
> >         unsigned long flags;
> > @@ -3110,7 +3186,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >                 return;
> >
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> >         if (sas_device && sas_device->starget &&
> >              sas_device->starget->hostdata) {
> >                 sas_target_priv_data = sas_device->starget->hostdata;
> > @@ -3131,14 +3207,14 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >         if (!smid) {
> >                 delayed_tr = kzalloc(sizeof(*delayed_tr), GFP_ATOMIC);
> >                 if (!delayed_tr)
> > -                       return;
> > +                       goto out;
> >                 INIT_LIST_HEAD(&delayed_tr->list);
> >                 delayed_tr->handle = handle;
> >                 list_add_tail(&delayed_tr->list, &ioc->delayed_tr_list);
> >                 dewtprintk(ioc, printk(MPT2SAS_INFO_FMT
> >                     "DELAYED:tr:handle(0x%04x), (open)\n",
> >                     ioc->name, handle));
> > -               return;
> > +               goto out;
> >         }
> >
> >         dewtprintk(ioc, printk(MPT2SAS_INFO_FMT "tr_send:handle(0x%04x), "
> > @@ -3150,6 +3226,9 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >         mpi_request->DevHandle = cpu_to_le16(handle);
> >         mpi_request->TaskType = MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET;
> >         mpt2sas_base_put_smid_hi_priority(ioc, smid);
> > +out:
> > +       if (sas_device)
> > +               sas_device_put(sas_device);
> >  }
> >
> >
> > @@ -4068,7 +4147,6 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
> >         char *desc_scsi_state = ioc->tmp_string;
> >         u32 log_info = le32_to_cpu(mpi_reply->IOCLogInfo);
> >         struct _sas_device *sas_device = NULL;
> > -       unsigned long flags;
> >         struct scsi_target *starget = scmd->device->sdev_target;
> >         struct MPT2SAS_TARGET *priv_target = starget->hostdata;
> >         char *device_str = NULL;
> > @@ -4200,9 +4278,7 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
> >                 printk(MPT2SAS_WARN_FMT "\t%s wwid(0x%016llx)\n", ioc->name,
> >                     device_str, (unsigned long long)priv_target->sas_address);
> >         } else {
> > -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -               sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > -                   priv_target->sas_address);
> > +               sas_device = __mpt2sas_get_sdev_from_target(priv_target);
> >                 if (sas_device) {
> >                         printk(MPT2SAS_WARN_FMT "\tsas_address(0x%016llx), "
> >                             "phy(%d)\n", ioc->name, sas_device->sas_address,
> > @@ -4211,8 +4287,9 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
> >                             "\tenclosure_logical_id(0x%016llx), slot(%d)\n",
> >                             ioc->name, sas_device->enclosure_logical_id,
> >                             sas_device->slot);
> > +
> > +                       sas_device_put(sas_device);
> >                 }
> > -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >         }
> >
> >         printk(MPT2SAS_WARN_FMT "\thandle(0x%04x), ioc_status(%s)(0x%04x), "
> > @@ -4259,7 +4336,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >         Mpi2SepRequest_t mpi_request;
> >         struct _sas_device *sas_device;
> >
> > -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > +       sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> >         if (!sas_device)
> >                 return;
> >
> > @@ -4274,7 +4351,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >             &mpi_request)) != 0) {
> >                 printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n", ioc->name,
> >                 __FILE__, __LINE__, __func__);
> > -               return;
> > +               goto out;
> >         }
> >         sas_device->pfa_led_on = 1;
> >
> > @@ -4284,8 +4361,10 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >                  "enclosure_processor: ioc_status (0x%04x), loginfo(0x%08x)\n",
> >                  ioc->name, le16_to_cpu(mpi_reply.IOCStatus),
> >                  le32_to_cpu(mpi_reply.IOCLogInfo)));
> > -               return;
> > +               goto out;
> >         }
> > +out:
> > +       sas_device_put(sas_device);
> >  }
> >
> >  /**
> > @@ -4370,19 +4449,17 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >
> >         /* only handle non-raid devices */
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> >         if (!sas_device) {
> > -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > -               return;
> > +               goto out_unlock;
> >         }
> >         starget = sas_device->starget;
> >         sas_target_priv_data = starget->hostdata;
> >
> >         if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_RAID_COMPONENT) ||
> > -          ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))) {
> > -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > -               return;
> > -       }
> > +          ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)))
> > +               goto out_unlock;
> > +
> >         starget_printk(KERN_WARNING, starget, "predicted fault\n");
> >         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >
> > @@ -4396,7 +4473,7 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >         if (!event_reply) {
> >                 printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
> >                     ioc->name, __FILE__, __LINE__, __func__);
> > -               return;
> > +               goto out;
> >         }
> >
> >         event_reply->Function = MPI2_FUNCTION_EVENT_NOTIFICATION;
> > @@ -4413,6 +4490,14 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >         event_data->SASAddress = cpu_to_le64(sas_target_priv_data->sas_address);
> >         mpt2sas_ctl_add_to_event_log(ioc, event_reply);
> >         kfree(event_reply);
> > +out:
> > +       if (sas_device)
> > +               sas_device_put(sas_device);
> > +       return;
> > +
> > +out_unlock:
> > +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +       goto out;
> >  }
> >
> >  /**
> > @@ -5148,14 +5233,13 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> >         sas_address = le64_to_cpu(sas_device_pg0.SASAddress);
> > -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> >             sas_address);
> >
> >         if (!sas_device) {
> >                 printk(MPT2SAS_ERR_FMT "device is not present "
> >                     "handle(0x%04x), no sas_device!!!\n", ioc->name, handle);
> > -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > -               return;
> > +               goto out_unlock;
> >         }
> >
> >         if (unlikely(sas_device->handle != handle)) {
> > @@ -5172,19 +5256,22 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >             MPI2_SAS_DEVICE0_FLAGS_DEVICE_PRESENT)) {
> >                 printk(MPT2SAS_ERR_FMT "device is not present "
> >                     "handle(0x%04x), flags!!!\n", ioc->name, handle);
> > -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > -               return;
> > +               goto out_unlock;
> >         }
> >
> >         /* check if there were any issues with discovery */
> >         if (_scsih_check_access_status(ioc, sas_address, handle,
> > -           sas_device_pg0.AccessStatus)) {
> > -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > -               return;
> > -       }
> > +           sas_device_pg0.AccessStatus))
> > +               goto out_unlock;
> > +
> >         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >         _scsih_ublock_io_device(ioc, sas_address);
> > +       return;
> >
> > +out_unlock:
> > +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +       if (sas_device)
> > +               sas_device_put(sas_device);
> >  }
> >
> >  /**
> > @@ -5208,7 +5295,6 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
> >         u32 ioc_status;
> >         __le64 sas_address;
> >         u32 device_info;
> > -       unsigned long flags;
> >
> >         if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
> >             MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
> > @@ -5250,14 +5336,13 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
> >                 return -1;
> >         }
> >
> > -
> > -       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > +       sas_device = mpt2sas_get_sdev_by_addr(ioc,
> >             sas_address);
> > -       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >
> > -       if (sas_device)
> > +       if (sas_device) {
> > +               sas_device_put(sas_device);
> >                 return 0;
> > +       }
> >
> >         sas_device = kzalloc(sizeof(struct _sas_device),
> >             GFP_KERNEL);
> > @@ -5267,6 +5352,7 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
> >                 return -1;
> >         }
> >
> > +       kref_init(&sas_device->refcount);
> >         sas_device->handle = handle;
> >         if (_scsih_get_sas_address(ioc, le16_to_cpu
> >                 (sas_device_pg0.ParentDevHandle),
> > @@ -5344,7 +5430,6 @@ _scsih_remove_device(struct MPT2SAS_ADAPTER *ioc,
> >             "handle(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
> >             sas_device->handle, (unsigned long long)
> >             sas_device->sas_address));
> > -       kfree(sas_device);
> >  }
> >  /**
> >   * _scsih_device_remove_by_handle - removing device object by handle
> > @@ -5363,12 +5448,17 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >                 return;
> >
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > -       if (sas_device)
> > -               list_del(&sas_device->list);
> > +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> > +       if (sas_device) {
> > +               list_del_init(&sas_device->list);
> > +               sas_device_put(sas_device);
> > +       }
> >         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > -       if (sas_device)
> > +
> > +       if (sas_device) {
> >                 _scsih_remove_device(ioc, sas_device);
> > +               sas_device_put(sas_device);
> > +       }
> >  }
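
The two sas_device_put() calls in the hunk above drop two different references: the first is the one the list held, the second is the one the lookup took. A small userspace sketch of that ownership hand-off is below. All demo_* names are hypothetical, locking is left out, and as a simplification the creation reference is treated as the list's reference, which may not match how the driver accounts for it.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_dev {
	int handle;
	int refcount;   /* stands in for the kref */
	int on_list;    /* stands in for list membership */
};

static int demo_freed;  /* counts frees so the lifetime is observable */

static struct demo_dev *demo_get(struct demo_dev *d)
{
	d->refcount++;
	return d;
}

static void demo_put(struct demo_dev *d)
{
	assert(d->refcount > 0);
	if (--d->refcount == 0) {
		free(d);
		demo_freed++;
	}
}

/* Create a device that is already on the list; the list owns one ref. */
static struct demo_dev *demo_new_on_list(int handle)
{
	struct demo_dev *d = calloc(1, sizeof(*d));
	if (!d)
		return NULL;
	d->handle = handle;
	d->refcount = 1;
	d->on_list = 1;
	return d;
}

static void demo_remove_by_handle(struct demo_dev *d)
{
	/* Lookup: takes a reference for this function (count 1 -> 2). */
	struct demo_dev *ref = demo_get(d);

	/* Under the lock in the patch: unlink and drop the list's ref. */
	ref->on_list = 0;
	demo_put(ref);             /* count 2 -> 1, object still valid */

	/* Outside the lock: teardown work may still use ref here. */

	demo_put(ref);             /* count 1 -> 0, object is freed */
}
```

The object stays alive between the two puts, so the teardown work done outside the lock never touches freed memory.
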
> >
> >  /**
> > @@ -5389,13 +5479,17 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> >                 return;
> >
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > -           sas_address);
> > -       if (sas_device)
> > -               list_del(&sas_device->list);
> > +       sas_device = __mpt2sas_get_sdev_by_addr(ioc, sas_address);
> > +       if (sas_device) {
> > +               list_del_init(&sas_device->list);
> > +               sas_device_put(sas_device);
> > +       }
> >         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > -       if (sas_device)
> > +
> > +       if (sas_device) {
> >                 _scsih_remove_device(ioc, sas_device);
> > +               sas_device_put(sas_device);
> > +       }
> >  }
> >  #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
> >  /**
> > @@ -5716,26 +5810,28 @@ _scsih_sas_device_status_change_event(struct MPT2SAS_ADAPTER *ioc,
> >
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> >         sas_address = le64_to_cpu(event_data->SASAddress);
> > -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> >             sas_address);
> >
> > -       if (!sas_device || !sas_device->starget) {
> > -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > -               return;
> > -       }
> > +       if (!sas_device || !sas_device->starget)
> > +               goto out;
> >
> >         target_priv_data = sas_device->starget->hostdata;
> > -       if (!target_priv_data) {
> > -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > -               return;
> > -       }
> > +       if (!target_priv_data)
> > +               goto out;
> >
> >         if (event_data->ReasonCode ==
> >             MPI2_EVENT_SAS_DEV_STAT_RC_INTERNAL_DEVICE_RESET)
> >                 target_priv_data->tm_busy = 1;
> >         else
> >                 target_priv_data->tm_busy = 0;
> > +
> > +out:
> > +       if (sas_device)
> > +               sas_device_put(sas_device);
> > +
> >         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +
> >  }
> >
> >  #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
> > @@ -6123,7 +6219,7 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
> >         u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
> >
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> >         if (sas_device) {
> >                 sas_device->volume_handle = 0;
> >                 sas_device->volume_wwid = 0;
> > @@ -6142,6 +6238,8 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
> >         /* exposing raid component */
> >         if (starget)
> >                 starget_for_each_device(starget, NULL, _scsih_reprobe_lun);
> > +
> > +       sas_device_put(sas_device);
> >  }
> >
> >  /**
> > @@ -6170,7 +6268,7 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
> >                     &volume_wwid);
> >
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> >         if (sas_device) {
> >                 set_bit(handle, ioc->pd_handles);
> >                 if (sas_device->starget && sas_device->starget->hostdata) {
> > @@ -6189,6 +6287,8 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
> >         /* hiding raid component */
> >         if (starget)
> >                 starget_for_each_device(starget, (void *)1, _scsih_reprobe_lun);
> > +
> > +       sas_device_put(sas_device);
> >  }
> >
> >  /**
> > @@ -6221,7 +6321,6 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
> >      Mpi2EventIrConfigElement_t *element)
> >  {
> >         struct _sas_device *sas_device;
> > -       unsigned long flags;
> >         u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
> >         Mpi2ConfigReply_t mpi_reply;
> >         Mpi2SasDevicePage0_t sas_device_pg0;
> > @@ -6231,11 +6330,11 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
> >
> >         set_bit(handle, ioc->pd_handles);
> >
> > -       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > -       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > -       if (sas_device)
> > +       sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> > +       if (sas_device) {
> > +               sas_device_put(sas_device);
> >                 return;
> > +       }
> >
> >         if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
> >             MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
> > @@ -6509,7 +6608,6 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
> >         u16 handle, parent_handle;
> >         u32 state;
> >         struct _sas_device *sas_device;
> > -       unsigned long flags;
> >         Mpi2ConfigReply_t mpi_reply;
> >         Mpi2SasDevicePage0_t sas_device_pg0;
> >         u32 ioc_status;
> > @@ -6542,12 +6640,11 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
> >                 if (!ioc->is_warpdrive)
> >                         set_bit(handle, ioc->pd_handles);
> >
> > -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -               sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > -
> > -               if (sas_device)
> > +               sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> > +               if (sas_device) {
> > +                       sas_device_put(sas_device);
> >                         return;
> > +               }
> >
> >                 if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
> >                     &sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
> > @@ -7015,6 +7112,7 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
> >         struct _raid_device *raid_device, *raid_device_next;
> >         struct list_head tmp_list;
> >         unsigned long flags;
> > +       LIST_HEAD(head);
> >
> >         printk(MPT2SAS_INFO_FMT "removing unresponding devices: start\n",
> >             ioc->name);
> > @@ -7022,14 +7120,29 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
> >         /* removing unresponding end devices */
> >         printk(MPT2SAS_INFO_FMT "removing unresponding devices: end-devices\n",
> >             ioc->name);
> > +
> > +       /*
> > +        * Iterate, pulling off devices marked as non-responding. We become the
> > +        * owner for the reference the list had on any object we prune.
> > +        */
> > +       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> >         list_for_each_entry_safe(sas_device, sas_device_next,
> > -           &ioc->sas_device_list, list) {
> > +                       &ioc->sas_device_list, list) {
> >                 if (!sas_device->responding)
> > -                       mpt2sas_device_remove_by_sas_address(ioc,
> > -                               sas_device->sas_address);
> > +                       list_move_tail(&sas_device->list, &head);
> >                 else
> >                         sas_device->responding = 0;
> >         }
> > +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +
> > +       /*
> > +        * Now, uninitialize and remove the unresponding devices we pruned.
> > +        */
> > +       list_for_each_entry_safe(sas_device, sas_device_next, &head, list) {
> > +               _scsih_remove_device(ioc, sas_device);
> > +               list_del_init(&sas_device->list);
> > +               sas_device_put(sas_device);
> > +       }
> >
> >         /* removing unresponding volumes */
> >         if (ioc->ir_firmware) {
> > @@ -7179,11 +7292,11 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
> >                 }
> >                 phys_disk_num = pd_pg0.PhysDiskNum;
> >                 handle = le16_to_cpu(pd_pg0.DevHandle);
> > -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -               sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > -               if (sas_device)
> > +               sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> > +               if (sas_device) {
> > +                       sas_device_put(sas_device);
> >                         continue;
> > +               }
> >                 if (mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
> >                     &sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
> >                     handle) != 0)
> > @@ -7302,12 +7415,12 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
> >                 if (!(_scsih_is_end_device(
> >                     le32_to_cpu(sas_device_pg0.DeviceInfo))))
> >                         continue;
> > -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -               sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > +               sas_device = mpt2sas_get_sdev_by_addr(ioc,
> >                     le64_to_cpu(sas_device_pg0.SASAddress));
> > -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > -               if (sas_device)
> > +               if (sas_device) {
> > +                       sas_device_put(sas_device);
> >                         continue;
> > +               }
> >                 parent_handle = le16_to_cpu(sas_device_pg0.ParentDevHandle);
> >                 if (!_scsih_get_sas_address(ioc, parent_handle, &sas_address)) {
> >                         printk(MPT2SAS_INFO_FMT "\tBEFORE adding end device: "
> > @@ -7966,6 +8079,37 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
> >         }
> >  }
> >
> > +static struct _sas_device *dequeue_next_sas_device(struct MPT2SAS_ADAPTER *ioc)
> > +{
> > +       struct _sas_device *sas_device = NULL;
> > +       unsigned long flags;
> > +
> > +       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > +       if (!list_empty(&ioc->sas_device_init_list)) {
> > +               sas_device = list_first_entry(&ioc->sas_device_init_list,
> > +                               struct _sas_device, list);
> > +               list_del_init(&sas_device->list);
> > +       }
> > +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +
> > +       /*
> > +        * If an item was dequeued, the caller now owns the reference that was
> > +        * previously owned by the list
> > +        */
> > +       return sas_device;
> > +}
> > +
> > +static void sas_device_make_active(struct MPT2SAS_ADAPTER *ioc,
> > +               struct _sas_device *sas_device)
> > +{
> > +       unsigned long flags;
> > +
> > +       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > +       sas_device_get(sas_device);
> > +       list_add_tail(&sas_device->list, &ioc->sas_device_list);
> > +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +}
> > +
> >  /**
> >   * _scsih_probe_sas - reporting sas devices to sas transport
> >   * @ioc: per adapter object
> > @@ -7975,34 +8119,28 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
> >  static void
> >  _scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
> >  {
> > -       struct _sas_device *sas_device, *next;
> > -       unsigned long flags;
> > -
> > -       /* SAS Device List */
> > -       list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
> > -           list) {
> > +       struct _sas_device *sas_device;
> >
> > -               if (ioc->hide_drives)
> > -                       continue;
> > +       if (ioc->hide_drives)
> > +               return;
> >
> > +       while ((sas_device = dequeue_next_sas_device(ioc))) {
> 
> I see an issue here. The sas_device is removed from the
> sas_device_init_list and then added to the STL by calling
> sas_rphy_add, which in turn invokes the driver's target_alloc,
> slave_alloc & slave_configure callback routines. These routines
> check whether the device is present in the
> sas_device_init_list; since it is no longer on that list,
> the device won't be added.

Thanks for looking at this.

I think I can eliminate this problem without too much churn: Since we
hold the reference, it should be fine to leave the sas_device on the
list, as long as we're careful about the state of its list_head after
reacquiring the sas_device_lock (which we can't hold here because
mpt2sas_transport_port_add() calls things that sleep).

I'll send a v3 that does this in the next day or two.
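
As an illustration of the approach described above, here is a minimal
userspace sketch, not the v3 code. It assumes a pthread mutex in place of
the sas_device_lock and a plain counter in place of the kref; every name in
it (struct node, struct dev, dev_make_active() and so on) is hypothetical.
The device stays on the pending list while the lock is dropped, and the
state of its list node is rechecked once the lock is retaken.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct list_head. */
struct node { struct node *prev, *next; };

static void node_init(struct node *n) { n->prev = n->next = n; }
static int node_linked(const struct node *n) { return n->next != n; }

static void node_del_init(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_init(n);
}

static void node_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

struct dev {
	struct node list;	/* first member, so a node pointer is a dev pointer */
	int refcount;		/* protected by lock in this sketch */
};

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static struct node init_list = { &init_list, &init_list };
static struct node active_list = { &active_list, &active_list };

/* Drop one reference; the caller must hold lock. Frees on the last put. */
static void dev_put_locked(struct dev *d)
{
	assert(d->refcount > 0);
	if (--d->refcount == 0)
		free(d);
}

/* Allocate a device and queue it; the list owns the initial reference. */
static struct dev *dev_add_pending(void)
{
	struct dev *d = malloc(sizeof(*d));

	if (!d)
		return NULL;
	d->refcount = 1;
	pthread_mutex_lock(&lock);
	node_add_tail(&d->list, &init_list);
	pthread_mutex_unlock(&lock);
	return d;
}

/* Take a reference on the first pending device but leave it on the list. */
static struct dev *dev_get_next_pending(void)
{
	struct dev *d = NULL;

	pthread_mutex_lock(&lock);
	if (node_linked(&init_list)) {
		d = (struct dev *)init_list.next;
		d->refcount++;
	}
	pthread_mutex_unlock(&lock);
	return d;
}

/* Unlink a device (if still linked) and drop the list's reference. */
static void dev_remove(struct dev *d)
{
	pthread_mutex_lock(&lock);
	if (node_linked(&d->list)) {
		node_del_init(&d->list);
		dev_put_locked(d);
	}
	pthread_mutex_unlock(&lock);
}

/*
 * Called after the sleeping work, with our reference still held: recheck the
 * list node under the lock before moving. Returns 1 if the device was moved.
 */
static int dev_make_active(struct dev *d)
{
	int moved = 0;

	pthread_mutex_lock(&lock);
	if (node_linked(&d->list)) {
		node_del_init(&d->list);
		node_add_tail(&d->list, &active_list);	/* list reference moves too */
		moved = 1;
	}
	dev_put_locked(d);	/* drop the reference from dev_get_next_pending() */
	pthread_mutex_unlock(&lock);
	return moved;
}
```

Because the caller's own reference keeps the object alive, it is safe to
inspect the list node after retaking the lock even if another thread
unlinked the device in the meantime.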

(Lest it appear that I'm being incredibly sloppy about testing here, the
devices I'm using to test these patches don't seem to encounter this
problem at all. I can provide more detail about the hardware I'm using
if that's interesting.)

Thanks very much,
Calvin

> Regards,
> Sreekanth
> >                 if (!mpt2sas_transport_port_add(ioc, sas_device->handle,
> > -                   sas_device->sas_address_parent)) {
> > -                       list_del(&sas_device->list);
> > -                       kfree(sas_device);
> > +                               sas_device->sas_address_parent)) {
> > +                       sas_device_put(sas_device);
> >                         continue;
> >                 } else if (!sas_device->starget) {
> >                         if (!ioc->is_driver_loading) {
> >                                 mpt2sas_transport_port_remove(ioc,
> > -                                       sas_device->sas_address,
> > -                                       sas_device->sas_address_parent);
> > -                               list_del(&sas_device->list);
> > -                               kfree(sas_device);
> > +                                               sas_device->sas_address,
> > +                                               sas_device->sas_address_parent);
> > +                               sas_device_put(sas_device);
> >                                 continue;
> >                         }
> >                 }
> > -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -               list_move_tail(&sas_device->list, &ioc->sas_device_list);
> > -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +
> > +               sas_device_make_active(ioc, sas_device);
> > +               sas_device_put(sas_device);
> >         }
> >  }
> >
> > diff --git a/drivers/scsi/mpt2sas/mpt2sas_transport.c b/drivers/scsi/mpt2sas/mpt2sas_transport.c
> > index ff2500a..af86800 100644
> > --- a/drivers/scsi/mpt2sas/mpt2sas_transport.c
> > +++ b/drivers/scsi/mpt2sas/mpt2sas_transport.c
> > @@ -1323,15 +1323,17 @@ _transport_get_enclosure_identifier(struct sas_rphy *rphy, u64 *identifier)
> >         int rc;
> >
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> >             rphy->identify.sas_address);
> >         if (sas_device) {
> >                 *identifier = sas_device->enclosure_logical_id;
> >                 rc = 0;
> > +               sas_device_put(sas_device);
> >         } else {
> >                 *identifier = 0;
> >                 rc = -ENXIO;
> >         }
> > +
> >         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >         return rc;
> >  }
> > @@ -1351,12 +1353,14 @@ _transport_get_bay_identifier(struct sas_rphy *rphy)
> >         int rc;
> >
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> >             rphy->identify.sas_address);
> > -       if (sas_device)
> > +       if (sas_device) {
> >                 rc = sas_device->slot;
> > -       else
> > +               sas_device_put(sas_device);
> > +       } else {
> >                 rc = -ENXIO;
> > +       }
> >         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >         return rc;
> >  }
> > --
> > 1.8.1
> >
> 
> 
> 
> -- 
> 
> Regards,
> Sreekanth

* Re: [PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage
  2015-07-13 15:05           ` Joe Lawrence
@ 2015-07-21  7:04             ` Calvin Owens
  0 siblings, 0 replies; 52+ messages in thread
From: Calvin Owens @ 2015-07-21  7:04 UTC (permalink / raw)
  To: Joe Lawrence
  Cc: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan, MPT-FusionLinux.pdl, linux-scsi, linux-kernel,
	kernel-team, Christoph Hellwig, Bart Van Assche

On Monday 07/13 at 11:05 -0400, Joe Lawrence wrote:
> On 07/12/2015 12:24 AM, Calvin Owens wrote:
> > These objects can be referenced concurrently throughout the driver, we
> > need a way to make sure threads can't delete them out from under each
> > other. This patch adds the refcount, and refactors the code to use it.
> > 
> > Additionally, we cannot iterate over the sas_device_list without
> > holding the lock, or we risk corrupting random memory if items are
> > added or deleted as we iterate. This patch refactors _scsih_probe_sas()
> > to use the sas_device_list in a safe way.
> > 
> > Cc: Christoph Hellwig <hch@infradead.org>
> > Cc: Bart Van Assche <bart.vanassche@sandisk.com>
> > Signed-off-by: Calvin Owens <calvinowens@fb.com>
> > ---
> >  drivers/scsi/mpt2sas/mpt2sas_base.h      |  22 +-
> >  drivers/scsi/mpt2sas/mpt2sas_scsih.c     | 434 ++++++++++++++++++++-----------
> >  drivers/scsi/mpt2sas/mpt2sas_transport.c |  12 +-
> >  3 files changed, 315 insertions(+), 153 deletions(-)
> 
> [ ... snip ... ]
> 
> > @@ -2078,7 +2150,7 @@ _scsih_slave_configure(struct scsi_device *sdev)
> >  	}
> >  
> >  	spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > +	sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> >  	   sas_device_priv_data->sas_target->sas_address);
> >  	if (!sas_device) {
> >  		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > @@ -2116,13 +2188,14 @@ _scsih_slave_configure(struct scsi_device *sdev)
> >  	if (!ssp_target)
> >  		_scsih_display_sata_capabilities(ioc, handle, sdev);
> >  
> > -
> >  	_scsih_change_queue_depth(sdev, qdepth);
> >  
> >  	if (ssp_target) {
> >  		sas_read_port_mode_page(sdev);
> >  		_scsih_enable_tlr(ioc, sdev);
> >  	}
> > +
> > +	sas_device_put(sas_device);
> >  	return 0;
> >  }
> 
> Hi Calvin,
> 
> Any reason why this sas_device_put is placed outside the sas_device
> lock?  Most other instances in this patch were called just before unlocking.

Thanks for looking at this.

I guess I thought that something below where we drop the sas_device_lock
referenced it, but it looks like nothing does. I'll move it up in v3.

I don't think it's strictly necessary that the put() happen under the
lock: the only way this could be the final put() is if both ->hostdata
and the sas_device_list had dropped their references, and in that case
it would be impossible to have a concurrent get(), since those are the
only two ways to lookup/get a sas_device. But absent any reason not to,
let's make it more consistent.
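
The argument can be sketched in userspace as follows. This is only an
illustration under stated assumptions: a C11 atomic counter stands in for
the kref, and struct obj, obj_get(), obj_put() and final_put_demo() are
hypothetical names, not driver code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical object with a kref-like counter. */
struct obj {
	atomic_int refcount;
};

static struct obj *obj_new(void)
{
	struct obj *o = malloc(sizeof(*o));

	if (o)
		atomic_init(&o->refcount, 1);	/* creator's reference */
	return o;
}

static void obj_get(struct obj *o)
{
	atomic_fetch_add(&o->refcount, 1);
}

/* Returns 1 if this put released the last reference and freed the object. */
static int obj_put(struct obj *o)
{
	if (atomic_fetch_sub(&o->refcount, 1) == 1) {
		free(o);
		return 1;
	}
	return 0;
}

/*
 * Walk through the case discussed above: the list and ->hostdata each hold
 * one reference and a lookup holds a temporary one. Returns 1 if the
 * temporary holder's put was the final one, or -1 on allocation failure.
 */
static int final_put_demo(void)
{
	struct obj *o = obj_new();	/* reference handed to the list */
	int freed;

	if (!o)
		return -1;
	obj_get(o);			/* reference stored in ->hostdata */
	obj_get(o);			/* temporary reference from a lookup */

	freed = obj_put(o);		/* list entry removed */
	assert(!freed);
	freed = obj_put(o);		/* ->hostdata cleared */
	assert(!freed);

	/* No path to the object remains, so nobody else can get() it now. */
	return obj_put(o);		/* temporary holder: final put */
}
```

The temporary holder's put can only be the last one after both long-lived
references are gone, at which point no lookup can reach the object, so the
put itself does not need the lock for correctness.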

I'm really glad you pointed this out, because I realized I flubbed this
in _scsih_target_alloc() and forgot to eliminate the sas_device_put()
from before the ->hostdata lookup was added. I'll fix this in v3.

> BTW I attempted testing, but needed to port to mpt3 and ended up with a
> driver that didn't boot :(   Hopefully I can retry later this week, or
> find an older mpt2 box lying around.

More testing would be fantastic if that's possible :)

Thanks very much,
Calvin

> -- Joe

* Re: [PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage
  2015-07-13  6:52           ` Christoph Hellwig
@ 2015-07-21  7:06             ` Calvin Owens
  0 siblings, 0 replies; 52+ messages in thread
From: Calvin Owens @ 2015-07-21  7:06 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan, MPT-FusionLinux.pdl, linux-scsi, linux-kernel,
	kernel-team, Bart Van Assche

On Sunday 07/12 at 23:52 -0700, Christoph Hellwig wrote:
> On Sat, Jul 11, 2015 at 09:24:55PM -0700, Calvin Owens wrote:
> > These objects can be referenced concurrently throughout the driver, we
> > need a way to make sure threads can't delete them out from under each
> > other. This patch adds the refcount, and refactors the code to use it.
> > 
> > Additionally, we cannot iterate over the sas_device_list without
> > holding the lock, or we risk corrupting random memory if items are
> > added or deleted as we iterate. This patch refactors _scsih_probe_sas()
> > to use the sas_device_list in a safe way.
> > 
> > Cc: Christoph Hellwig <hch@infradead.org>
> > Cc: Bart Van Assche <bart.vanassche@sandisk.com>
> > Signed-off-by: Calvin Owens <calvinowens@fb.com>
> > ---
> >  drivers/scsi/mpt2sas/mpt2sas_base.h      |  22 +-
> >  drivers/scsi/mpt2sas/mpt2sas_scsih.c     | 434 ++++++++++++++++++++-----------
> >  drivers/scsi/mpt2sas/mpt2sas_transport.c |  12 +-
> >  3 files changed, 315 insertions(+), 153 deletions(-)
> > 
> > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
> > index caff8d1..78f41ac 100644
> > --- a/drivers/scsi/mpt2sas/mpt2sas_base.h
> > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
> > @@ -238,6 +238,7 @@
> >   * @flags: MPT_TARGET_FLAGS_XXX flags
> >   * @deleted: target flaged for deletion
> >   * @tm_busy: target is busy with TM request.
> > + * @sdev: The sas_device associated with this target
> >   */
> >  struct MPT2SAS_TARGET {
> >  	struct scsi_target *starget;
> > @@ -248,6 +249,7 @@ struct MPT2SAS_TARGET {
> >  	u32	flags;
> >  	u8	deleted;
> >  	u8	tm_busy;
> > +	struct _sas_device *sdev;
> >  };
> >  
> >  
> > @@ -376,8 +378,24 @@ struct _sas_device {
> >  	u8	phy;
> >  	u8	responding;
> >  	u8	pfa_led_on;
> > +	struct kref refcount;
> >  };
> >  
> > +static inline void sas_device_get(struct _sas_device *s)
> > +{
> > +	kref_get(&s->refcount);
> > +}
> > +
> > +static inline void sas_device_free(struct kref *r)
> > +{
> > +	kfree(container_of(r, struct _sas_device, refcount));
> > +}
> > +
> > +static inline void sas_device_put(struct _sas_device *s)
> > +{
> > +	kref_put(&s->refcount, sas_device_free);
> > +}
> > +
> >  /**
> >   * struct _raid_device - raid volume link list
> >   * @list: sas device list
> > @@ -1095,7 +1113,9 @@ struct _sas_node *mpt2sas_scsih_expander_find_by_handle(struct MPT2SAS_ADAPTER *
> >      u16 handle);
> >  struct _sas_node *mpt2sas_scsih_expander_find_by_sas_address(struct MPT2SAS_ADAPTER
> >      *ioc, u64 sas_address);
> > -struct _sas_device *mpt2sas_scsih_sas_device_find_by_sas_address(
> > +struct _sas_device *mpt2sas_get_sdev_by_addr(
> > +    struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
> > +struct _sas_device *__mpt2sas_get_sdev_by_addr(
> >      struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
> >  
> >  void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc);
> > diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> > index 3f26147..fad80ce 100644
> > --- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> > +++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> > @@ -526,8 +526,43 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
> >  	}
> >  }
> >  
> > +struct _sas_device *
> > +__mpt2sas_get_sdev_from_target(struct MPT2SAS_TARGET *tgt_priv)
> > +{
> > +	struct _sas_device *ret;
> > +
> 
> Does this need a:
> 
> 	assert_spin_locked(&ioc->sas_device_lock);
> 
> ?

Yeah: I'll add that.
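
For reference, the convention being discussed (a lookup that asserts the
lock is held, plus a wrapper that takes the lock itself) might look roughly
like this in userspace. It is a sketch only: the lock_held flag is a crude
stand-in for assert_spin_locked() and does not check which thread holds the
lock, and struct table, table_get_locked() and table_get() are hypothetical
names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define NSLOTS 4

struct table {
	pthread_mutex_t lock;
	bool lock_held;		/* debug flag; stands in for assert_spin_locked() */
	bool present[NSLOTS];
	int refs[NSLOTS];
};

static void table_init(struct table *t)
{
	pthread_mutex_init(&t->lock, NULL);
	t->lock_held = false;
	for (int i = 0; i < NSLOTS; i++) {
		t->present[i] = false;
		t->refs[i] = 0;
	}
}

static void table_lock(struct table *t)
{
	pthread_mutex_lock(&t->lock);
	t->lock_held = true;
}

static void table_unlock(struct table *t)
{
	t->lock_held = false;
	pthread_mutex_unlock(&t->lock);
}

/* Lookup for callers that already hold the lock; takes a reference on a hit. */
static int table_get_locked(struct table *t, int slot)
{
	assert(t->lock_held);
	if (slot < 0 || slot >= NSLOTS || !t->present[slot])
		return -1;
	t->refs[slot]++;
	return slot;
}

/* Wrapper for callers that do not hold the lock. */
static int table_get(struct table *t, int slot)
{
	int ret;

	table_lock(t);
	ret = table_get_locked(t, slot);
	table_unlock(t);
	return ret;
}
```

Calling table_get_locked() without the lock trips the assertion right away,
which is the point of adding the check to the double-underscore variants.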

Thanks very much,
Calvin

> Otherwise this looks sensible to me.

* [PATCH v3 0/2] Fixes for memory corruption in mpt2sas
  2015-07-12  4:24       ` [PATCH 0/2 v2] " Calvin Owens
  2015-07-12  4:24         ` [PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage Calvin Owens
  2015-07-12  4:24         ` [PATCH 2/2] mpt2sas: Refcount fw_events " Calvin Owens
@ 2015-08-01  5:02         ` Calvin Owens
  2015-08-01  5:02           ` [PATCH v3 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage Calvin Owens
                             ` (2 more replies)
  2 siblings, 3 replies; 52+ messages in thread
From: Calvin Owens @ 2015-08-01  5:02 UTC (permalink / raw)
  To: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan
  Cc: MPT-FusionLinux.pdl, linux-scsi, linux-kernel, kernel-team,
	calvinowens, Joe Lawrence

Hello all,

This patchset attempts to address problems we've been having with
panics due to memory corruption from the mpt2sas driver.

Changes are noted in the individual patches; I realized that putting them in
the cover letter was probably a bit confusing.

Thanks,
Calvin


Patches in this series:
 [PATCH v3 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list
 [PATCH v3 2/2] mpt2sas: Refcount fw_events and fix unsafe list usage

Total diffstat:
 drivers/scsi/mpt2sas/mpt2sas_base.h      |  22 +-
 drivers/scsi/mpt2sas/mpt2sas_scsih.c     | 579 ++++++++++++++++++++++---------
 drivers/scsi/mpt2sas/mpt2sas_transport.c |  12 +-
 3 files changed, 439 insertions(+), 174 deletions(-)

Diff showing changes v2 => v3:
	http://jcalvinowens.github.io/stuff/mpt2sas-patchset-v2v3.patch

Diff showing changes v1 => v2:
	http://jcalvinowens.github.io/stuff/mpt2sas-patchset-v1v2.patch

* [PATCH v3 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage
  2015-08-01  5:02         ` [PATCH v3 0/2] Fixes for memory corruption in mpt2sas Calvin Owens
@ 2015-08-01  5:02           ` Calvin Owens
  2015-08-10 13:15             ` Sreekanth Reddy
  2015-08-01  5:02           ` [PATCH v3 2/2] mpt2sas: Refcount fw_events " Calvin Owens
  2015-08-14  1:48           ` [PATCH v4 0/2] Fixes for memory corruption in mpt2sas Calvin Owens
  2 siblings, 1 reply; 52+ messages in thread
From: Calvin Owens @ 2015-08-01  5:02 UTC (permalink / raw)
  To: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan
  Cc: MPT-FusionLinux.pdl, linux-scsi, linux-kernel, kernel-team,
	calvinowens, Joe Lawrence, Christoph Hellwig, Bart Van Assche

These objects can be referenced concurrently throughout the driver, we
need a way to make sure threads can't delete them out from under each
other. This patch adds the refcount, and refactors the code to use it.

Additionally, we cannot iterate over the sas_device_list without
holding the lock, or we risk corrupting random memory if items are
added or deleted as we iterate. This patch refactors _scsih_probe_sas()
to use the sas_device_list in a safe way.

Cc: Christoph Hellwig <hch@infradead.org>
Cc: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Joe Lawrence <joe.lawrence@stratus.com>
Signed-off-by: Calvin Owens <calvinowens@fb.com>
---
Changes in v3:
	* Drop the sas_device_lock while enabling devices, and leave the
	  sas_device object on the list, since it may need to be looked up
	  there while it is being enabled.
	* Drop put() in _scsih_add_device(), because the ->hostdata now keeps a
	  reference (this was an oversight in v2).
	* Be consistent about calling sas_device_put() while holding the
	  sas_device_lock where feasible.
	* Assert that the sas_device_lock is held, via assert_spin_locked(), in the
	  newly added __get_sdev_from_target(), and add a locking wrapper, similar
	  to the other lookups, for callers which do not explicitly take the lock.

Changes in v2:
	* Squished patches 1-3 into this one
	* s/BUG_ON(!spin_is_locked/assert_spin_locked/g
	* Store a pointer to the sas_device object in ->hostdata, to eliminate
	  the need for several lookups on the lists.

 drivers/scsi/mpt2sas/mpt2sas_base.h      |  22 +-
 drivers/scsi/mpt2sas/mpt2sas_scsih.c     | 467 +++++++++++++++++++++----------
 drivers/scsi/mpt2sas/mpt2sas_transport.c |  12 +-
 3 files changed, 348 insertions(+), 153 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
index caff8d1..78f41ac 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.h
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
@@ -238,6 +238,7 @@
  * @flags: MPT_TARGET_FLAGS_XXX flags
  * @deleted: target flaged for deletion
  * @tm_busy: target is busy with TM request.
+ * @sdev: The sas_device associated with this target
  */
 struct MPT2SAS_TARGET {
 	struct scsi_target *starget;
@@ -248,6 +249,7 @@ struct MPT2SAS_TARGET {
 	u32	flags;
 	u8	deleted;
 	u8	tm_busy;
+	struct _sas_device *sdev;
 };
 
 
@@ -376,8 +378,24 @@ struct _sas_device {
 	u8	phy;
 	u8	responding;
 	u8	pfa_led_on;
+	struct kref refcount;
 };
 
+static inline void sas_device_get(struct _sas_device *s)
+{
+	kref_get(&s->refcount);
+}
+
+static inline void sas_device_free(struct kref *r)
+{
+	kfree(container_of(r, struct _sas_device, refcount));
+}
+
+static inline void sas_device_put(struct _sas_device *s)
+{
+	kref_put(&s->refcount, sas_device_free);
+}
+
 /**
  * struct _raid_device - raid volume link list
  * @list: sas device list
@@ -1095,7 +1113,9 @@ struct _sas_node *mpt2sas_scsih_expander_find_by_handle(struct MPT2SAS_ADAPTER *
     u16 handle);
 struct _sas_node *mpt2sas_scsih_expander_find_by_sas_address(struct MPT2SAS_ADAPTER
     *ioc, u64 sas_address);
-struct _sas_device *mpt2sas_scsih_sas_device_find_by_sas_address(
+struct _sas_device *mpt2sas_get_sdev_by_addr(
+    struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
+struct _sas_device *__mpt2sas_get_sdev_by_addr(
     struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
 
 void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc);
diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 3f26147..a2af9a5 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -526,8 +526,61 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
 	}
 }
 
+static struct _sas_device *
+__mpt2sas_get_sdev_from_target(struct MPT2SAS_ADAPTER *ioc,
+		struct MPT2SAS_TARGET *tgt_priv)
+{
+	struct _sas_device *ret;
+
+	assert_spin_locked(&ioc->sas_device_lock);
+
+	ret = tgt_priv->sdev;
+	if (ret)
+		sas_device_get(ret);
+
+	return ret;
+}
+
+static struct _sas_device *
+mpt2sas_get_sdev_from_target(struct MPT2SAS_ADAPTER *ioc,
+		struct MPT2SAS_TARGET *tgt_priv)
+{
+	struct _sas_device *ret;
+	unsigned long flags;
+
+	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	ret = __mpt2sas_get_sdev_from_target(ioc, tgt_priv);
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+	return ret;
+}
+
+
+struct _sas_device *
+__mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
+    u64 sas_address)
+{
+	struct _sas_device *sas_device;
+
+	assert_spin_locked(&ioc->sas_device_lock);
+
+	list_for_each_entry(sas_device, &ioc->sas_device_list, list)
+		if (sas_device->sas_address == sas_address)
+			goto found_device;
+
+	list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
+		if (sas_device->sas_address == sas_address)
+			goto found_device;
+
+	return NULL;
+
+found_device:
+	sas_device_get(sas_device);
+	return sas_device;
+}
+
 /**
- * mpt2sas_scsih_sas_device_find_by_sas_address - sas device search
+ * mpt2sas_get_sdev_by_addr - sas device search
  * @ioc: per adapter object
  * @sas_address: sas address
  * Context: Calling function should acquire ioc->sas_device_lock
@@ -536,24 +589,44 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
  * object.
  */
 struct _sas_device *
-mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
+mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
     u64 sas_address)
 {
 	struct _sas_device *sas_device;
+	unsigned long flags;
+
+	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	sas_device = __mpt2sas_get_sdev_by_addr(ioc,
+			sas_address);
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+	return sas_device;
+}
+
+static struct _sas_device *
+__mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
+{
+	struct _sas_device *sas_device;
+
+	assert_spin_locked(&ioc->sas_device_lock);
 
 	list_for_each_entry(sas_device, &ioc->sas_device_list, list)
-		if (sas_device->sas_address == sas_address)
-			return sas_device;
+		if (sas_device->handle == handle)
+			goto found_device;
 
 	list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
-		if (sas_device->sas_address == sas_address)
-			return sas_device;
+		if (sas_device->handle == handle)
+			goto found_device;
 
 	return NULL;
+
+found_device:
+	sas_device_get(sas_device);
+	return sas_device;
 }
 
 /**
- * _scsih_sas_device_find_by_handle - sas device search
+ * mpt2sas_get_sdev_by_handle - sas device search
  * @ioc: per adapter object
  * @handle: sas device handle (assigned by firmware)
  * Context: Calling function should acquire ioc->sas_device_lock
@@ -562,19 +635,16 @@ mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
  * object.
  */
 static struct _sas_device *
-_scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
+mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 {
 	struct _sas_device *sas_device;
+	unsigned long flags;
 
-	list_for_each_entry(sas_device, &ioc->sas_device_list, list)
-		if (sas_device->handle == handle)
-			return sas_device;
-
-	list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
-		if (sas_device->handle == handle)
-			return sas_device;
+	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
-	return NULL;
+	return sas_device;
 }
 
 /**
@@ -583,7 +653,7 @@ _scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
  * @sas_device: the sas_device object
  * Context: This function will acquire ioc->sas_device_lock.
  *
- * Removing object and freeing associated memory from the ioc->sas_device_list.
+ * If sas_device is on the list, remove it and decrement its reference count.
  */
 static void
 _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
@@ -594,9 +664,15 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
 	if (!sas_device)
 		return;
 
+	/*
+	 * The lock serializes access to the list, but we still need to verify
+	 * that nobody removed the entry while we were waiting on the lock.
+	 */
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	list_del(&sas_device->list);
-	kfree(sas_device);
+	if (!list_empty(&sas_device->list)) {
+		list_del_init(&sas_device->list);
+		sas_device_put(sas_device);
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 }
 
@@ -620,6 +696,7 @@ _scsih_sas_device_add(struct MPT2SAS_ADAPTER *ioc,
 	    sas_device->handle, (unsigned long long)sas_device->sas_address));
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	sas_device_get(sas_device);
 	list_add_tail(&sas_device->list, &ioc->sas_device_list);
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
@@ -659,6 +736,7 @@ _scsih_sas_device_init_add(struct MPT2SAS_ADAPTER *ioc,
 	    sas_device->handle, (unsigned long long)sas_device->sas_address));
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	sas_device_get(sas_device);
 	list_add_tail(&sas_device->list, &ioc->sas_device_init_list);
 	_scsih_determine_boot_device(ioc, sas_device, 0);
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
@@ -1208,12 +1286,14 @@ _scsih_change_queue_depth(struct scsi_device *sdev, int qdepth)
 		goto not_sata;
 	if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))
 		goto not_sata;
+
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
-	   sas_device_priv_data->sas_target->sas_address);
-	if (sas_device && sas_device->device_info &
-	    MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
+	sas_device = __mpt2sas_get_sdev_from_target(ioc, sas_target_priv_data);
+	if (sas_device && sas_device->device_info
+			& MPI2_SAS_DEVICE_INFO_SATA_DEVICE) {
 		max_depth = MPT2SAS_SATA_QUEUE_DEPTH;
+		sas_device_put(sas_device);
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
  not_sata:
@@ -1271,18 +1351,20 @@ _scsih_target_alloc(struct scsi_target *starget)
 	/* sas/sata devices */
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	rphy = dev_to_rphy(starget->dev.parent);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = __mpt2sas_get_sdev_by_addr(ioc,
 	   rphy->identify.sas_address);
 
 	if (sas_device) {
 		sas_target_priv_data->handle = sas_device->handle;
 		sas_target_priv_data->sas_address = sas_device->sas_address;
+		sas_target_priv_data->sdev = sas_device;
 		sas_device->starget = starget;
 		sas_device->id = starget->id;
 		sas_device->channel = starget->channel;
 		if (test_bit(sas_device->handle, ioc->pd_handles))
 			sas_target_priv_data->flags |=
 			    MPT_TARGET_FLAGS_RAID_COMPONENT;
+
 	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
@@ -1324,13 +1406,14 @@ _scsih_target_destroy(struct scsi_target *starget)
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	rphy = dev_to_rphy(starget->dev.parent);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
-	   rphy->identify.sas_address);
+	sas_device = __mpt2sas_get_sdev_from_target(ioc, sas_target_priv_data);
 	if (sas_device && (sas_device->starget == starget) &&
 	    (sas_device->id == starget->id) &&
 	    (sas_device->channel == starget->channel))
 		sas_device->starget = NULL;
 
+	if (sas_device)
+		sas_device_put(sas_device);
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
  out:
@@ -1386,7 +1469,7 @@ _scsih_slave_alloc(struct scsi_device *sdev)
 
 	if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
 		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+		sas_device = __mpt2sas_get_sdev_by_addr(ioc,
 				sas_target_priv_data->sas_address);
 		if (sas_device && (sas_device->starget == NULL)) {
 			sdev_printk(KERN_INFO, sdev,
@@ -1394,6 +1477,10 @@ _scsih_slave_alloc(struct scsi_device *sdev)
 			     __func__, __LINE__);
 			sas_device->starget = starget;
 		}
+
+		if (sas_device)
+			sas_device_put(sas_device);
+
 		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	}
 
@@ -1428,10 +1515,13 @@ _scsih_slave_destroy(struct scsi_device *sdev)
 
 	if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
 		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
-		   sas_target_priv_data->sas_address);
+		sas_device = __mpt2sas_get_sdev_from_target(ioc,
+				sas_target_priv_data);
 		if (sas_device && !sas_target_priv_data->num_luns)
 			sas_device->starget = NULL;
+
+		if (sas_device)
+			sas_device_put(sas_device);
 		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	}
 
@@ -2078,7 +2168,7 @@ _scsih_slave_configure(struct scsi_device *sdev)
 	}
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = __mpt2sas_get_sdev_by_addr(ioc,
 	   sas_device_priv_data->sas_target->sas_address);
 	if (!sas_device) {
 		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
@@ -2112,17 +2202,18 @@ _scsih_slave_configure(struct scsi_device *sdev)
 	    (unsigned long long) sas_device->enclosure_logical_id,
 	    sas_device->slot);
 
+	sas_device_put(sas_device);
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	if (!ssp_target)
 		_scsih_display_sata_capabilities(ioc, handle, sdev);
 
-
 	_scsih_change_queue_depth(sdev, qdepth);
 
 	if (ssp_target) {
 		sas_read_port_mode_page(sdev);
 		_scsih_enable_tlr(ioc, sdev);
 	}
+
 	return 0;
 }
 
@@ -2509,8 +2600,7 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
 		    device_str, (unsigned long long)priv_target->sas_address);
 	} else {
 		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
-		    priv_target->sas_address);
+		sas_device = __mpt2sas_get_sdev_from_target(ioc, priv_target);
 		if (sas_device) {
 			if (priv_target->flags &
 			    MPT_TARGET_FLAGS_RAID_COMPONENT) {
@@ -2529,6 +2619,8 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
 			    "enclosure_logical_id(0x%016llx), slot(%d)\n",
 			   (unsigned long long)sas_device->enclosure_logical_id,
 			    sas_device->slot);
+
+			sas_device_put(sas_device);
 		}
 		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	}
@@ -2604,12 +2696,12 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
 {
 	struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
 	struct MPT2SAS_DEVICE *sas_device_priv_data;
-	struct _sas_device *sas_device;
-	unsigned long flags;
+	struct _sas_device *sas_device = NULL;
 	u16	handle;
 	int r;
 
 	struct scsi_target *starget = scmd->device->sdev_target;
+	struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;
 
 	starget_printk(KERN_INFO, starget, "attempting device reset! "
 	    "scmd(%p)\n", scmd);
@@ -2629,12 +2721,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
 	handle = 0;
 	if (sas_device_priv_data->sas_target->flags &
 	    MPT_TARGET_FLAGS_RAID_COMPONENT) {
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = _scsih_sas_device_find_by_handle(ioc,
-		   sas_device_priv_data->sas_target->handle);
+		sas_device = mpt2sas_get_sdev_from_target(ioc,
+				target_priv_data);
 		if (sas_device)
 			handle = sas_device->volume_handle;
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	} else
 		handle = sas_device_priv_data->sas_target->handle;
 
@@ -2651,6 +2741,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
  out:
 	sdev_printk(KERN_INFO, scmd->device, "device reset: %s scmd(%p)\n",
 	    ((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
+
+	if (sas_device)
+		sas_device_put(sas_device);
+
 	return r;
 }
 
@@ -2665,11 +2759,11 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
 {
 	struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
 	struct MPT2SAS_DEVICE *sas_device_priv_data;
-	struct _sas_device *sas_device;
-	unsigned long flags;
+	struct _sas_device *sas_device = NULL;
 	u16	handle;
 	int r;
 	struct scsi_target *starget = scmd->device->sdev_target;
+	struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;
 
 	starget_printk(KERN_INFO, starget, "attempting target reset! "
 	    "scmd(%p)\n", scmd);
@@ -2689,12 +2783,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
 	handle = 0;
 	if (sas_device_priv_data->sas_target->flags &
 	    MPT_TARGET_FLAGS_RAID_COMPONENT) {
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = _scsih_sas_device_find_by_handle(ioc,
-		   sas_device_priv_data->sas_target->handle);
+		sas_device = mpt2sas_get_sdev_from_target(ioc,
+				target_priv_data);
 		if (sas_device)
 			handle = sas_device->volume_handle;
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	} else
 		handle = sas_device_priv_data->sas_target->handle;
 
@@ -2711,6 +2803,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
  out:
 	starget_printk(KERN_INFO, starget, "target reset: %s scmd(%p)\n",
 	    ((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
+
+	if (sas_device)
+		sas_device_put(sas_device);
+
 	return r;
 }
 
@@ -3002,15 +3098,15 @@ _scsih_block_io_to_children_attached_to_ex(struct MPT2SAS_ADAPTER *ioc,
 
 	list_for_each_entry(mpt2sas_port,
 	   &sas_expander->sas_port_list, port_list) {
-		if (mpt2sas_port->remote_identify.device_type ==
-		    SAS_END_DEVICE) {
+		if (mpt2sas_port->remote_identify.device_type == SAS_END_DEVICE) {
 			spin_lock_irqsave(&ioc->sas_device_lock, flags);
-			sas_device =
-			    mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
-			   mpt2sas_port->remote_identify.sas_address);
-			if (sas_device)
+			sas_device = __mpt2sas_get_sdev_by_addr(ioc,
+					mpt2sas_port->remote_identify.sas_address);
+			if (sas_device) {
 				set_bit(sas_device->handle,
-				    ioc->blocking_handles);
+						ioc->blocking_handles);
+				sas_device_put(sas_device);
+			}
 			spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 		}
 	}
@@ -3080,7 +3176,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 {
 	Mpi2SCSITaskManagementRequest_t *mpi_request;
 	u16 smid;
-	struct _sas_device *sas_device;
+	struct _sas_device *sas_device = NULL;
 	struct MPT2SAS_TARGET *sas_target_priv_data = NULL;
 	u64 sas_address = 0;
 	unsigned long flags;
@@ -3110,7 +3206,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 		return;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+	sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
 	if (sas_device && sas_device->starget &&
 	     sas_device->starget->hostdata) {
 		sas_target_priv_data = sas_device->starget->hostdata;
@@ -3131,14 +3227,14 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	if (!smid) {
 		delayed_tr = kzalloc(sizeof(*delayed_tr), GFP_ATOMIC);
 		if (!delayed_tr)
-			return;
+			goto out;
 		INIT_LIST_HEAD(&delayed_tr->list);
 		delayed_tr->handle = handle;
 		list_add_tail(&delayed_tr->list, &ioc->delayed_tr_list);
 		dewtprintk(ioc, printk(MPT2SAS_INFO_FMT
 		    "DELAYED:tr:handle(0x%04x), (open)\n",
 		    ioc->name, handle));
-		return;
+		goto out;
 	}
 
 	dewtprintk(ioc, printk(MPT2SAS_INFO_FMT "tr_send:handle(0x%04x), "
@@ -3150,6 +3246,9 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	mpi_request->DevHandle = cpu_to_le16(handle);
 	mpi_request->TaskType = MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET;
 	mpt2sas_base_put_smid_hi_priority(ioc, smid);
+out:
+	if (sas_device)
+		sas_device_put(sas_device);
 }
 
 
@@ -4068,7 +4167,6 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
 	char *desc_scsi_state = ioc->tmp_string;
 	u32 log_info = le32_to_cpu(mpi_reply->IOCLogInfo);
 	struct _sas_device *sas_device = NULL;
-	unsigned long flags;
 	struct scsi_target *starget = scmd->device->sdev_target;
 	struct MPT2SAS_TARGET *priv_target = starget->hostdata;
 	char *device_str = NULL;
@@ -4200,9 +4298,7 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
 		printk(MPT2SAS_WARN_FMT "\t%s wwid(0x%016llx)\n", ioc->name,
 		    device_str, (unsigned long long)priv_target->sas_address);
 	} else {
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
-		    priv_target->sas_address);
+		sas_device = mpt2sas_get_sdev_from_target(ioc, priv_target);
 		if (sas_device) {
 			printk(MPT2SAS_WARN_FMT "\tsas_address(0x%016llx), "
 			    "phy(%d)\n", ioc->name, sas_device->sas_address,
@@ -4211,8 +4307,9 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
 			    "\tenclosure_logical_id(0x%016llx), slot(%d)\n",
 			    ioc->name, sas_device->enclosure_logical_id,
 			    sas_device->slot);
+
+			sas_device_put(sas_device);
 		}
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	}
 
 	printk(MPT2SAS_WARN_FMT "\thandle(0x%04x), ioc_status(%s)(0x%04x), "
@@ -4259,7 +4356,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	Mpi2SepRequest_t mpi_request;
 	struct _sas_device *sas_device;
 
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+	sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
 	if (!sas_device)
 		return;
 
@@ -4274,7 +4371,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	    &mpi_request)) != 0) {
 		printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n", ioc->name,
 		__FILE__, __LINE__, __func__);
-		return;
+		goto out;
 	}
 	sas_device->pfa_led_on = 1;
 
@@ -4284,8 +4381,10 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 		 "enclosure_processor: ioc_status (0x%04x), loginfo(0x%08x)\n",
 		 ioc->name, le16_to_cpu(mpi_reply.IOCStatus),
 		 le32_to_cpu(mpi_reply.IOCLogInfo)));
-		return;
+		goto out;
 	}
+out:
+	sas_device_put(sas_device);
 }
 
 /**
@@ -4370,19 +4469,17 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 
 	/* only handle non-raid devices */
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+	sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
 	if (!sas_device) {
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
+		goto out_unlock;
 	}
 	starget = sas_device->starget;
 	sas_target_priv_data = starget->hostdata;
 
 	if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_RAID_COMPONENT) ||
-	   ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))) {
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
-	}
+	   ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)))
+		goto out_unlock;
+
 	starget_printk(KERN_WARNING, starget, "predicted fault\n");
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
@@ -4396,7 +4493,7 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	if (!event_reply) {
 		printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
 		    ioc->name, __FILE__, __LINE__, __func__);
-		return;
+		goto out;
 	}
 
 	event_reply->Function = MPI2_FUNCTION_EVENT_NOTIFICATION;
@@ -4413,6 +4510,14 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	event_data->SASAddress = cpu_to_le64(sas_target_priv_data->sas_address);
 	mpt2sas_ctl_add_to_event_log(ioc, event_reply);
 	kfree(event_reply);
+out:
+	if (sas_device)
+		sas_device_put(sas_device);
+	return;
+
+out_unlock:
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+	goto out;
 }
 
 /**
@@ -5148,14 +5253,13 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	sas_address = le64_to_cpu(sas_device_pg0.SASAddress);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = __mpt2sas_get_sdev_by_addr(ioc,
 	    sas_address);
 
 	if (!sas_device) {
 		printk(MPT2SAS_ERR_FMT "device is not present "
 		    "handle(0x%04x), no sas_device!!!\n", ioc->name, handle);
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
+		goto out_unlock;
 	}
 
 	if (unlikely(sas_device->handle != handle)) {
@@ -5172,19 +5276,22 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	    MPI2_SAS_DEVICE0_FLAGS_DEVICE_PRESENT)) {
 		printk(MPT2SAS_ERR_FMT "device is not present "
 		    "handle(0x%04x), flags!!!\n", ioc->name, handle);
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
+		goto out_unlock;
 	}
 
 	/* check if there were any issues with discovery */
 	if (_scsih_check_access_status(ioc, sas_address, handle,
-	    sas_device_pg0.AccessStatus)) {
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
-	}
+	    sas_device_pg0.AccessStatus))
+		goto out_unlock;
+
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	_scsih_ublock_io_device(ioc, sas_address);
+	return;
 
+out_unlock:
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+	if (sas_device)
+		sas_device_put(sas_device);
 }
 
 /**
@@ -5208,7 +5315,6 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
 	u32 ioc_status;
 	__le64 sas_address;
 	u32 device_info;
-	unsigned long flags;
 
 	if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
 	    MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
@@ -5250,14 +5356,13 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
 		return -1;
 	}
 
-
-	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = mpt2sas_get_sdev_by_addr(ioc,
 	    sas_address);
-	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
-	if (sas_device)
+	if (sas_device) {
+		sas_device_put(sas_device);
 		return 0;
+	}
 
 	sas_device = kzalloc(sizeof(struct _sas_device),
 	    GFP_KERNEL);
@@ -5267,6 +5372,7 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
 		return -1;
 	}
 
+	kref_init(&sas_device->refcount);
 	sas_device->handle = handle;
 	if (_scsih_get_sas_address(ioc, le16_to_cpu
 		(sas_device_pg0.ParentDevHandle),
@@ -5344,7 +5450,6 @@ _scsih_remove_device(struct MPT2SAS_ADAPTER *ioc,
 	    "handle(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
 	    sas_device->handle, (unsigned long long)
 	    sas_device->sas_address));
-	kfree(sas_device);
 }
 /**
  * _scsih_device_remove_by_handle - removing device object by handle
@@ -5363,12 +5468,17 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 		return;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
-	if (sas_device)
-		list_del(&sas_device->list);
+	sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
+	if (sas_device) {
+		list_del_init(&sas_device->list);
+		sas_device_put(sas_device);
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-	if (sas_device)
+
+	if (sas_device) {
 		_scsih_remove_device(ioc, sas_device);
+		sas_device_put(sas_device);
+	}
 }
 
 /**
@@ -5389,13 +5499,17 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
 		return;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
-	    sas_address);
-	if (sas_device)
-		list_del(&sas_device->list);
+	sas_device = __mpt2sas_get_sdev_by_addr(ioc, sas_address);
+	if (sas_device) {
+		list_del_init(&sas_device->list);
+		sas_device_put(sas_device);
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-	if (sas_device)
+
+	if (sas_device) {
 		_scsih_remove_device(ioc, sas_device);
+		sas_device_put(sas_device);
+	}
 }
 #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
 /**
@@ -5716,26 +5830,28 @@ _scsih_sas_device_status_change_event(struct MPT2SAS_ADAPTER *ioc,
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	sas_address = le64_to_cpu(event_data->SASAddress);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = __mpt2sas_get_sdev_by_addr(ioc,
 	    sas_address);
 
-	if (!sas_device || !sas_device->starget) {
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
-	}
+	if (!sas_device || !sas_device->starget)
+		goto out;
 
 	target_priv_data = sas_device->starget->hostdata;
-	if (!target_priv_data) {
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
-	}
+	if (!target_priv_data)
+		goto out;
 
 	if (event_data->ReasonCode ==
 	    MPI2_EVENT_SAS_DEV_STAT_RC_INTERNAL_DEVICE_RESET)
 		target_priv_data->tm_busy = 1;
 	else
 		target_priv_data->tm_busy = 0;
+
+out:
+	if (sas_device)
+		sas_device_put(sas_device);
+
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
 }
 
 #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
@@ -6123,7 +6239,7 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
 	u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+	sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
 	if (sas_device) {
 		sas_device->volume_handle = 0;
 		sas_device->volume_wwid = 0;
@@ -6142,6 +6258,8 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
 	/* exposing raid component */
 	if (starget)
 		starget_for_each_device(starget, NULL, _scsih_reprobe_lun);
+
+	sas_device_put(sas_device);
 }
 
 /**
@@ -6170,7 +6288,7 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
 		    &volume_wwid);
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+	sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
 	if (sas_device) {
 		set_bit(handle, ioc->pd_handles);
 		if (sas_device->starget && sas_device->starget->hostdata) {
@@ -6189,6 +6307,8 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
 	/* hiding raid component */
 	if (starget)
 		starget_for_each_device(starget, (void *)1, _scsih_reprobe_lun);
+
+	sas_device_put(sas_device);
 }
 
 /**
@@ -6221,7 +6341,6 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
     Mpi2EventIrConfigElement_t *element)
 {
 	struct _sas_device *sas_device;
-	unsigned long flags;
 	u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
 	Mpi2ConfigReply_t mpi_reply;
 	Mpi2SasDevicePage0_t sas_device_pg0;
@@ -6231,11 +6350,11 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
 
 	set_bit(handle, ioc->pd_handles);
 
-	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
-	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-	if (sas_device)
+	sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
+	if (sas_device) {
+		sas_device_put(sas_device);
 		return;
+	}
 
 	if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
 	    MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
@@ -6509,7 +6628,6 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
 	u16 handle, parent_handle;
 	u32 state;
 	struct _sas_device *sas_device;
-	unsigned long flags;
 	Mpi2ConfigReply_t mpi_reply;
 	Mpi2SasDevicePage0_t sas_device_pg0;
 	u32 ioc_status;
@@ -6542,12 +6660,11 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
 		if (!ioc->is_warpdrive)
 			set_bit(handle, ioc->pd_handles);
 
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-
-		if (sas_device)
+		sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
+		if (sas_device) {
+			sas_device_put(sas_device);
 			return;
+		}
 
 		if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
 		    &sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
@@ -7015,6 +7132,7 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
 	struct _raid_device *raid_device, *raid_device_next;
 	struct list_head tmp_list;
 	unsigned long flags;
+	LIST_HEAD(head);
 
 	printk(MPT2SAS_INFO_FMT "removing unresponding devices: start\n",
 	    ioc->name);
@@ -7022,14 +7140,29 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
 	/* removing unresponding end devices */
 	printk(MPT2SAS_INFO_FMT "removing unresponding devices: end-devices\n",
 	    ioc->name);
+
+	/*
+	 * Iterate, pulling off devices marked as non-responding. We become the
+	 * owner for the reference the list had on any object we prune.
+	 */
+	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	list_for_each_entry_safe(sas_device, sas_device_next,
-	    &ioc->sas_device_list, list) {
+			&ioc->sas_device_list, list) {
 		if (!sas_device->responding)
-			mpt2sas_device_remove_by_sas_address(ioc,
-				sas_device->sas_address);
+			list_move_tail(&sas_device->list, &head);
 		else
 			sas_device->responding = 0;
 	}
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+	/*
+	 * Now, uninitialize and remove the unresponding devices we pruned.
+	 */
+	list_for_each_entry_safe(sas_device, sas_device_next, &head, list) {
+		_scsih_remove_device(ioc, sas_device);
+		list_del_init(&sas_device->list);
+		sas_device_put(sas_device);
+	}
 
 	/* removing unresponding volumes */
 	if (ioc->ir_firmware) {
@@ -7179,11 +7312,11 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
 		}
 		phys_disk_num = pd_pg0.PhysDiskNum;
 		handle = le16_to_cpu(pd_pg0.DevHandle);
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		if (sas_device)
+		sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
+		if (sas_device) {
+			sas_device_put(sas_device);
 			continue;
+		}
 		if (mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
 		    &sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
 		    handle) != 0)
@@ -7302,12 +7435,12 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
 		if (!(_scsih_is_end_device(
 		    le32_to_cpu(sas_device_pg0.DeviceInfo))))
 			continue;
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+		sas_device = mpt2sas_get_sdev_by_addr(ioc,
 		    le64_to_cpu(sas_device_pg0.SASAddress));
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		if (sas_device)
+		if (sas_device) {
+			sas_device_put(sas_device);
 			continue;
+		}
 		parent_handle = le16_to_cpu(sas_device_pg0.ParentDevHandle);
 		if (!_scsih_get_sas_address(ioc, parent_handle, &sas_address)) {
 			printk(MPT2SAS_INFO_FMT "\tBEFORE adding end device: "
@@ -7966,6 +8099,48 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
 	}
 }
 
+static struct _sas_device *get_next_sas_device(struct MPT2SAS_ADAPTER *ioc)
+{
+	struct _sas_device *sas_device = NULL;
+	unsigned long flags;
+
+	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	if (!list_empty(&ioc->sas_device_init_list)) {
+		sas_device = list_first_entry(&ioc->sas_device_init_list,
+				struct _sas_device, list);
+		sas_device_get(sas_device);
+	}
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+	return sas_device;
+}
+
+static void sas_device_make_active(struct MPT2SAS_ADAPTER *ioc,
+		struct _sas_device *sas_device)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+
+	/*
+	 * Since we dropped the lock during the call to port_add(), we need to
+	 * be careful here that somebody else didn't move or delete this item
+	 * while we were busy with other things.
+	 *
+	 * If it was on the list, we need a put() for the reference the list
+	 * had. Either way, we need a get() for the destination list.
+	 */
+	if (!list_empty(&sas_device->list)) {
+		list_del_init(&sas_device->list);
+		sas_device_put(sas_device);
+	}
+
+	sas_device_get(sas_device);
+	list_add_tail(&sas_device->list, &ioc->sas_device_list);
+
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+}
+
 /**
  * _scsih_probe_sas - reporting sas devices to sas transport
  * @ioc: per adapter object
@@ -7975,34 +8150,30 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
 static void
 _scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
 {
-	struct _sas_device *sas_device, *next;
-	unsigned long flags;
-
-	/* SAS Device List */
-	list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
-	    list) {
+	struct _sas_device *sas_device;
 
-		if (ioc->hide_drives)
-			continue;
+	if (ioc->hide_drives)
+		return;
 
+	while ((sas_device = get_next_sas_device(ioc))) {
 		if (!mpt2sas_transport_port_add(ioc, sas_device->handle,
-		    sas_device->sas_address_parent)) {
-			list_del(&sas_device->list);
-			kfree(sas_device);
+				sas_device->sas_address_parent)) {
+			_scsih_sas_device_remove(ioc, sas_device);
+			sas_device_put(sas_device);
 			continue;
 		} else if (!sas_device->starget) {
 			if (!ioc->is_driver_loading) {
 				mpt2sas_transport_port_remove(ioc,
-					sas_device->sas_address,
-					sas_device->sas_address_parent);
-				list_del(&sas_device->list);
-				kfree(sas_device);
+						sas_device->sas_address,
+						sas_device->sas_address_parent);
+				_scsih_sas_device_remove(ioc, sas_device);
+				sas_device_put(sas_device);
 				continue;
 			}
 		}
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		list_move_tail(&sas_device->list, &ioc->sas_device_list);
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+		sas_device_make_active(ioc, sas_device);
+		sas_device_put(sas_device);
 	}
 }
 
diff --git a/drivers/scsi/mpt2sas/mpt2sas_transport.c b/drivers/scsi/mpt2sas/mpt2sas_transport.c
index ff2500a..af86800 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_transport.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_transport.c
@@ -1323,15 +1323,17 @@ _transport_get_enclosure_identifier(struct sas_rphy *rphy, u64 *identifier)
 	int rc;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = __mpt2sas_get_sdev_by_addr(ioc,
 	    rphy->identify.sas_address);
 	if (sas_device) {
 		*identifier = sas_device->enclosure_logical_id;
 		rc = 0;
+		sas_device_put(sas_device);
 	} else {
 		*identifier = 0;
 		rc = -ENXIO;
 	}
+
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	return rc;
 }
@@ -1351,12 +1353,14 @@ _transport_get_bay_identifier(struct sas_rphy *rphy)
 	int rc;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = __mpt2sas_get_sdev_by_addr(ioc,
 	    rphy->identify.sas_address);
-	if (sas_device)
+	if (sas_device) {
 		rc = sas_device->slot;
-	else
+		sas_device_put(sas_device);
+	} else {
 		rc = -ENXIO;
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	return rc;
 }
-- 
1.8.5.6



* [PATCH v3 2/2] mpt2sas: Refcount fw_events and fix unsafe list usage
  2015-08-01  5:02         ` [PATCH v3 0/2] Fixes for memory corruption in mpt2sas Calvin Owens
  2015-08-01  5:02           ` [PATCH v3 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage Calvin Owens
@ 2015-08-01  5:02           ` Calvin Owens
  2015-08-14  1:48           ` [PATCH v4 0/2] Fixes for memory corruption in mpt2sas Calvin Owens
  2 siblings, 0 replies; 52+ messages in thread
From: Calvin Owens @ 2015-08-01  5:02 UTC (permalink / raw)
  To: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan
  Cc: MPT-FusionLinux.pdl, linux-scsi, linux-kernel, kernel-team,
	calvinowens, Joe Lawrence, Christoph Hellwig

The fw_event_work struct is concurrently referenced at shutdown, so
add a refcount to protect it, and refactor the code to use it.

Additionally, refactor _scsih_fw_event_cleanup_queue() such that it
no longer iterates over the list without holding the lock, since
_firmware_event_work() concurrently deletes items from the list.

Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Calvin Owens <calvinowens@fb.com>
---

Changes in v3:
	* Add a break condition to the REMOVE_UNRESPONDING_DEVICES fw_event
	  handler, which could otherwise keep looping around a sleep for a very
	  long time (at least 5+ minutes) when the driver is unloaded. I don't
	  think anything prevented this before, but taking the fw_event object
	  off the list at the top of _firmware_event_work() seems to have made
	  it more likely to happen.

Changes in v2:
	* Squished patches 4-6 into one patch
	* Remove the fw_event from fw_event_list at the start of
	  _firmware_event_work()
	* Explicitly separate fw_event_list removal from fw_event freeing
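
For readers following the refcounting scheme, here is a minimal userspace
sketch of the ownership rules described above: the creator holds one
reference, the list holds its own, and whoever unlinks an entry inherits the
list's reference. It is an illustration only, not driver code. Every name in
it (struct event, event_get(), queue_add(), queue_dequeue(), queue_drain())
is a hypothetical stand-in, the counter is a plain int rather than a kref,
and it models only the list's reference, not the extra one the patch takes
for the queued work item. Comments mark where the driver would hold
ioc->fw_event_lock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Tiny circular doubly linked list, modelled on the kernel's list_head. */
struct list_node { struct list_node *prev, *next; };

static void list_init(struct list_node *n) { n->prev = n->next = n; }
static int list_is_empty(const struct list_node *n) { return n->next == n; }

static void list_push_tail(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink and re-initialise, so list_is_empty(n) is true afterwards. */
static void list_unlink_init(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/* Hypothetical stand-in for struct fw_event_work. */
struct event {
	struct list_node list;
	int refcount;		/* plain int: NOT atomic, unlike struct kref */
	int id;
};

static int freed_count;		/* lets a caller observe when objects die */

static struct event *event_alloc(int id)
{
	struct event *ev = calloc(1, sizeof(*ev));

	if (!ev)
		return NULL;
	list_init(&ev->list);
	ev->refcount = 1;	/* the creator's reference */
	ev->id = id;
	return ev;
}

static void event_get(struct event *ev) { ev->refcount++; }

static void event_put(struct event *ev)
{
	if (--ev->refcount == 0) {
		freed_count++;
		free(ev);
	}
}

/* The list takes its own reference (the driver would hold the lock here). */
static void queue_add(struct list_node *queue, struct event *ev)
{
	event_get(ev);
	list_push_tail(&ev->list, queue);
}

/*
 * Pop the first entry. No put here: the caller inherits the reference the
 * list held. The driver would do this under the lock, so the list is never
 * walked while another context may be deleting from it.
 */
static struct event *queue_dequeue(struct list_node *queue)
{
	struct event *ev;

	if (list_is_empty(queue))
		return NULL;
	ev = (struct event *)((char *)queue->next -
			      offsetof(struct event, list));
	list_unlink_init(&ev->list);
	return ev;
}

/* Drain one entry at a time instead of iterating the list in place. */
static int queue_drain(struct list_node *queue)
{
	struct event *ev;
	int n = 0;

	while ((ev = queue_dequeue(queue))) {
		event_put(ev);	/* drop the reference inherited from the list */
		n++;
	}
	return n;
}
```

A caller would allocate an event, queue_add() it, and then event_put() its
own reference, much as the callers of _scsih_fw_event_add() do in this patch.
The object stays alive until the list's reference is dropped as well, which
here happens in queue_drain().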

 drivers/scsi/mpt2sas/mpt2sas_scsih.c | 112 ++++++++++++++++++++++++++++-------
 1 file changed, 91 insertions(+), 21 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index a2af9a5..cdc647d 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -176,9 +176,37 @@ struct fw_event_work {
 	u8			VP_ID;
 	u8			ignore;
 	u16			event;
+	struct kref		refcount;
 	char			event_data[0] __aligned(4);
 };
 
+static void fw_event_work_free(struct kref *r)
+{
+	kfree(container_of(r, struct fw_event_work, refcount));
+}
+
+static void fw_event_work_get(struct fw_event_work *fw_work)
+{
+	kref_get(&fw_work->refcount);
+}
+
+static void fw_event_work_put(struct fw_event_work *fw_work)
+{
+	kref_put(&fw_work->refcount, fw_event_work_free);
+}
+
+static struct fw_event_work *alloc_fw_event_work(int len)
+{
+	struct fw_event_work *fw_event;
+
+	fw_event = kzalloc(sizeof(*fw_event) + len, GFP_ATOMIC);
+	if (!fw_event)
+		return NULL;
+
+	kref_init(&fw_event->refcount);
+	return fw_event;
+}
+
 /* raid transport support */
 static struct raid_template *mpt2sas_raid_template;
 
@@ -2864,36 +2892,39 @@ _scsih_fw_event_add(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work *fw_event)
 		return;
 
 	spin_lock_irqsave(&ioc->fw_event_lock, flags);
+	fw_event_work_get(fw_event);
 	list_add_tail(&fw_event->list, &ioc->fw_event_list);
 	INIT_DELAYED_WORK(&fw_event->delayed_work, _firmware_event_work);
+	fw_event_work_get(fw_event);
 	queue_delayed_work(ioc->firmware_event_thread,
 	    &fw_event->delayed_work, 0);
 	spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
 }
 
 /**
- * _scsih_fw_event_free - delete fw_event
+ * _scsih_fw_event_del_from_list - delete fw_event from the list
  * @ioc: per adapter object
  * @fw_event: object describing the event
  * Context: This function will acquire ioc->fw_event_lock.
  *
- * This removes firmware event object from link list, frees associated memory.
+ * If the fw_event is on the fw_event_list, remove it and do a put.
  *
  * Return nothing.
  */
 static void
-_scsih_fw_event_free(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work
+_scsih_fw_event_del_from_list(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work
     *fw_event)
 {
 	unsigned long flags;
 
 	spin_lock_irqsave(&ioc->fw_event_lock, flags);
-	list_del(&fw_event->list);
-	kfree(fw_event);
+	if (!list_empty(&fw_event->list)) {
+		list_del_init(&fw_event->list);
+		fw_event_work_put(fw_event);
+	}
 	spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
 }
 
-
 /**
  * _scsih_error_recovery_delete_devices - remove devices not responding
  * @ioc: per adapter object
@@ -2908,13 +2939,14 @@ _scsih_error_recovery_delete_devices(struct MPT2SAS_ADAPTER *ioc)
 	if (ioc->is_driver_loading)
 		return;
 
-	fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+	fw_event = alloc_fw_event_work(0);
 	if (!fw_event)
 		return;
 
 	fw_event->event = MPT2SAS_REMOVE_UNRESPONDING_DEVICES;
 	fw_event->ioc = ioc;
 	_scsih_fw_event_add(ioc, fw_event);
+	fw_event_work_put(fw_event);
 }
 
 /**
@@ -2928,12 +2960,29 @@ mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc)
 {
 	struct fw_event_work *fw_event;
 
-	fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+	fw_event = alloc_fw_event_work(0);
 	if (!fw_event)
 		return;
 	fw_event->event = MPT2SAS_PORT_ENABLE_COMPLETE;
 	fw_event->ioc = ioc;
 	_scsih_fw_event_add(ioc, fw_event);
+	fw_event_work_put(fw_event);
+}
+
+static struct fw_event_work *dequeue_next_fw_event(struct MPT2SAS_ADAPTER *ioc)
+{
+	unsigned long flags;
+	struct fw_event_work *fw_event = NULL;
+
+	spin_lock_irqsave(&ioc->fw_event_lock, flags);
+	if (!list_empty(&ioc->fw_event_list)) {
+		fw_event = list_first_entry(&ioc->fw_event_list,
+				struct fw_event_work, list);
+		list_del_init(&fw_event->list);
+	}
+	spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
+
+	return fw_event;
 }
 
 /**
@@ -2948,17 +2997,25 @@ mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc)
 static void
 _scsih_fw_event_cleanup_queue(struct MPT2SAS_ADAPTER *ioc)
 {
-	struct fw_event_work *fw_event, *next;
+	struct fw_event_work *fw_event;
 
 	if (list_empty(&ioc->fw_event_list) ||
 	     !ioc->firmware_event_thread || in_interrupt())
 		return;
 
-	list_for_each_entry_safe(fw_event, next, &ioc->fw_event_list, list) {
-		if (cancel_delayed_work_sync(&fw_event->delayed_work)) {
-			_scsih_fw_event_free(ioc, fw_event);
-			continue;
-		}
+	while ((fw_event = dequeue_next_fw_event(ioc))) {
+		/*
+		 * Wait on the fw_event to complete. If this returns 1, then
+		 * the event was never executed, and we need a put for the
+		 * reference the delayed_work had on the fw_event.
+		 *
+		 * If it did execute, we wait for it to finish, and the put will
+		 * happen from _firmware_event_work()
+		 */
+		if (cancel_delayed_work_sync(&fw_event->delayed_work))
+			fw_event_work_put(fw_event);
+
+		fw_event_work_put(fw_event);
 	}
 }
 
@@ -4439,13 +4496,14 @@ _scsih_send_event_to_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 {
 	struct fw_event_work *fw_event;
 
-	fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+	fw_event = alloc_fw_event_work(0);
 	if (!fw_event)
 		return;
 	fw_event->event = MPT2SAS_TURN_ON_PFA_LED;
 	fw_event->device_handle = handle;
 	fw_event->ioc = ioc;
 	_scsih_fw_event_add(ioc, fw_event);
+	fw_event_work_put(fw_event);
 }
 
 /**
@@ -7543,17 +7601,27 @@ _firmware_event_work(struct work_struct *work)
 	    struct fw_event_work, delayed_work.work);
 	struct MPT2SAS_ADAPTER *ioc = fw_event->ioc;
 
+	_scsih_fw_event_del_from_list(ioc, fw_event);
+
 	/* the queue is being flushed so ignore this event */
-	if (ioc->remove_host ||
-	    ioc->pci_error_recovery) {
-		_scsih_fw_event_free(ioc, fw_event);
+	if (ioc->remove_host || ioc->pci_error_recovery) {
+		fw_event_work_put(fw_event);
 		return;
 	}
 
 	switch (fw_event->event) {
 	case MPT2SAS_REMOVE_UNRESPONDING_DEVICES:
-		while (scsi_host_in_recovery(ioc->shost) || ioc->shost_recovery)
+		while (scsi_host_in_recovery(ioc->shost) ||
+				ioc->shost_recovery) {
+			/*
+			 * If we're unloading, bail. Otherwise, this can become
+			 * an infinite loop.
+			 */
+			if (ioc->remove_host)
+				goto out;
+
 			ssleep(1);
+		}
 		_scsih_remove_unresponding_sas_devices(ioc);
 		_scsih_scan_for_devices_after_reset(ioc);
 		break;
@@ -7602,7 +7670,8 @@ _firmware_event_work(struct work_struct *work)
 		_scsih_sas_ir_operation_status_event(ioc, fw_event);
 		break;
 	}
-	_scsih_fw_event_free(ioc, fw_event);
+out:
+	fw_event_work_put(fw_event);
 }
 
 /**
@@ -7740,7 +7809,7 @@ mpt2sas_scsih_event_callback(struct MPT2SAS_ADAPTER *ioc, u8 msix_index,
 	}
 
 	sz = le16_to_cpu(mpi_reply->EventDataLength) * 4;
-	fw_event = kzalloc(sizeof(*fw_event) + sz, GFP_ATOMIC);
+	fw_event = alloc_fw_event_work(sz);
 	if (!fw_event) {
 		printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
 		    ioc->name, __FILE__, __LINE__, __func__);
@@ -7753,6 +7822,7 @@ mpt2sas_scsih_event_callback(struct MPT2SAS_ADAPTER *ioc, u8 msix_index,
 	fw_event->VP_ID = mpi_reply->VP_ID;
 	fw_event->event = event;
 	_scsih_fw_event_add(ioc, fw_event);
+	fw_event_work_put(fw_event);
 	return;
 }
 
-- 
1.8.5.6
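
For readers following the reference counting in the patch above, here is a
minimal userspace sketch of the fw_event lifetime. It is only a model, not
driver code: struct model_event, model_frees and every model_* function are
hypothetical names, a plain int stands in for struct kref, and there is no
locking or real workqueue. It assumes, as the patch does, one reference for
the creator, one for the list and one for the queued work.

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace stand-in for fw_event_work: a plain counter, no atomics. */
struct model_event {
	int refcount;
	int on_list;	/* models membership in ioc->fw_event_list */
	int queued;	/* models the pending delayed_work */
};

static int model_frees;	/* how many events have been freed so far */

static struct model_event *model_alloc(void)
{
	struct model_event *ev = calloc(1, sizeof(*ev));

	if (ev)
		ev->refcount = 1;	/* like kref_init(): creator's reference */
	return ev;
}

static void model_get(struct model_event *ev)
{
	ev->refcount++;
}

/* Returns 1 if this put dropped the last reference and freed ev. */
static int model_put(struct model_event *ev)
{
	assert(ev->refcount > 0);
	if (--ev->refcount == 0) {
		free(ev);
		model_frees++;
		return 1;
	}
	return 0;
}

/* Like _scsih_fw_event_add(): one reference for the list, one for the work. */
static void model_add(struct model_event *ev)
{
	model_get(ev);
	ev->on_list = 1;
	model_get(ev);
	ev->queued = 1;
}

/* Like _scsih_fw_event_del_from_list(): put only if still on the list. */
static void model_del_from_list(struct model_event *ev)
{
	if (ev->on_list) {
		ev->on_list = 0;
		model_put(ev);
	}
}

/* Like _firmware_event_work(): drop the list ref, then the work's ref. */
static int model_run_work(struct model_event *ev)
{
	ev->queued = 0;
	model_del_from_list(ev);
	return model_put(ev);
}

/* Normal path: the work runs. Returns 1 if the last put freed the event. */
int model_normal_path(void)
{
	struct model_event *ev = model_alloc();

	if (!ev)
		return -1;
	model_add(ev);			/* refcount: 3 */
	if (model_put(ev))		/* creator is done: 2 */
		return -1;
	return model_run_work(ev);	/* 2 -> 1 -> 0, freed */
}

/* Cleanup path: the work is cancelled before it ever runs. */
int model_cancel_path(void)
{
	struct model_event *ev = model_alloc();

	if (!ev)
		return -1;
	model_add(ev);			/* refcount: 3 */
	if (model_put(ev))		/* creator is done: 2 */
		return -1;
	ev->on_list = 0;	/* like dequeue_next_fw_event(): inherit list ref */
	ev->queued = 0;		/* like cancel_delayed_work_sync() returning true */
	if (model_put(ev))		/* the work's reference: 1 */
		return -1;
	return model_put(ev);		/* the inherited list reference: 0, freed */
}
```

In both paths the event is freed exactly once, by whichever holder drops the
last reference, which is the property the old list_del() + kfree() pairing
could not guarantee.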



* Re: [PATCH v3 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage
  2015-08-01  5:02           ` [PATCH v3 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage Calvin Owens
@ 2015-08-10 13:15             ` Sreekanth Reddy
  2015-08-14  1:43               ` Calvin Owens
  0 siblings, 1 reply; 52+ messages in thread
From: Sreekanth Reddy @ 2015-08-10 13:15 UTC (permalink / raw)
  To: Calvin Owens
  Cc: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Abhijit Mahajan,
	MPT-FusionLinux.pdl, linux-scsi, linux-kernel, kernel-team,
	Joe Lawrence, Christoph Hellwig, Bart Van Assche

On Sat, Aug 1, 2015 at 10:32 AM, Calvin Owens <calvinowens@fb.com> wrote:
> These objects can be referenced concurrently throughout the driver, we
> need a way to make sure threads can't delete them out from under each
> other. This patch adds the refcount, and refactors the code to use it.
>
> Additionally, we cannot iterate over the sas_device_list without
> holding the lock, or we risk corrupting random memory if items are
> added or deleted as we iterate. This patch refactors _scsih_probe_sas()
> to use the sas_device_list in a safe way.
>
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Bart Van Assche <bart.vanassche@sandisk.com>
> Cc: Joe Lawrence <joe.lawrence@stratus.com>
> Signed-off-by: Calvin Owens <calvinowens@fb.com>
> ---
> Changes in v3:
>         * Drop the sas_device_lock while enabling devices, and leave the
>           sas_device object on the list, since it may need to be looked up
>           there while it is being enabled.
>         * Drop put() in _scsih_add_device(), because the ->hostdata now keeps a
>           reference (this was an oversight in v2).
>         * Be consistent about calling sas_device_put() while holding the
>           sas_device_lock where feasible.
>         * Take and assert_spin_locked() on the sas_device_lock from the newly
>           added __get_sdev_from_target(), add wrapper similar to other lookups
>           for callers which do not explicitly take the lock.
>
> Changes in v2:
>         * Squished patches 1-3 into this one
>         * s/BUG_ON(!spin_is_locked/assert_spin_locked/g
>         * Store a pointer to the sas_device object in ->hostdata, to eliminate
>           the need for several lookups on the lists.
>
>  drivers/scsi/mpt2sas/mpt2sas_base.h      |  22 +-
>  drivers/scsi/mpt2sas/mpt2sas_scsih.c     | 467 +++++++++++++++++++++----------
>  drivers/scsi/mpt2sas/mpt2sas_transport.c |  12 +-
>  3 files changed, 348 insertions(+), 153 deletions(-)
>
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
> index caff8d1..78f41ac 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_base.h
> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
> @@ -238,6 +238,7 @@
>   * @flags: MPT_TARGET_FLAGS_XXX flags
>   * @deleted: target flaged for deletion
>   * @tm_busy: target is busy with TM request.
> + * @sdev: The sas_device associated with this target
>   */
>  struct MPT2SAS_TARGET {
>         struct scsi_target *starget;
> @@ -248,6 +249,7 @@ struct MPT2SAS_TARGET {
>         u32     flags;
>         u8      deleted;
>         u8      tm_busy;
> +       struct _sas_device *sdev;
>  };
>
>
> @@ -376,8 +378,24 @@ struct _sas_device {
>         u8      phy;
>         u8      responding;
>         u8      pfa_led_on;
> +       struct kref refcount;
>  };
>
> +static inline void sas_device_get(struct _sas_device *s)
> +{
> +       kref_get(&s->refcount);
> +}
> +
> +static inline void sas_device_free(struct kref *r)
> +{
> +       kfree(container_of(r, struct _sas_device, refcount));
> +}
> +
> +static inline void sas_device_put(struct _sas_device *s)
> +{
> +       kref_put(&s->refcount, sas_device_free);
> +}
> +
>  /**
>   * struct _raid_device - raid volume link list
>   * @list: sas device list
> @@ -1095,7 +1113,9 @@ struct _sas_node *mpt2sas_scsih_expander_find_by_handle(struct MPT2SAS_ADAPTER *
>      u16 handle);
>  struct _sas_node *mpt2sas_scsih_expander_find_by_sas_address(struct MPT2SAS_ADAPTER
>      *ioc, u64 sas_address);
> -struct _sas_device *mpt2sas_scsih_sas_device_find_by_sas_address(
> +struct _sas_device *mpt2sas_get_sdev_by_addr(
> +    struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
> +struct _sas_device *__mpt2sas_get_sdev_by_addr(
>      struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
>
>  void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc);
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> index 3f26147..a2af9a5 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> +++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> @@ -526,8 +526,61 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
>         }
>  }
>
> +static struct _sas_device *
> +__mpt2sas_get_sdev_from_target(struct MPT2SAS_ADAPTER *ioc,
> +               struct MPT2SAS_TARGET *tgt_priv)
> +{
> +       struct _sas_device *ret;
> +
> +       assert_spin_locked(&ioc->sas_device_lock);
> +
> +       ret = tgt_priv->sdev;
> +       if (ret)
> +               sas_device_get(ret);
> +
> +       return ret;
> +}
> +
> +static struct _sas_device *
> +mpt2sas_get_sdev_from_target(struct MPT2SAS_ADAPTER *ioc,
> +               struct MPT2SAS_TARGET *tgt_priv)
> +{
> +       struct _sas_device *ret;
> +       unsigned long flags;
> +
> +       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> +       ret = __mpt2sas_get_sdev_from_target(ioc, tgt_priv);
> +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +
> +       return ret;
> +}
> +
> +
> +struct _sas_device *
> +__mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
> +    u64 sas_address)
> +{
> +       struct _sas_device *sas_device;
> +
> +       assert_spin_locked(&ioc->sas_device_lock);
> +
> +       list_for_each_entry(sas_device, &ioc->sas_device_list, list)
> +               if (sas_device->sas_address == sas_address)
> +                       goto found_device;
> +
> +       list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
> +               if (sas_device->sas_address == sas_address)
> +                       goto found_device;
> +
> +       return NULL;
> +
> +found_device:
> +       sas_device_get(sas_device);
> +       return sas_device;
> +}
> +
>  /**
> - * mpt2sas_scsih_sas_device_find_by_sas_address - sas device search
> + * mpt2sas_get_sdev_by_addr - sas device search
>   * @ioc: per adapter object
>   * @sas_address: sas address
>   * Context: Calling function should acquire ioc->sas_device_lock
> @@ -536,24 +589,44 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
>   * object.
>   */
>  struct _sas_device *
> -mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> +mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
>      u64 sas_address)
>  {
>         struct _sas_device *sas_device;
> +       unsigned long flags;
> +
> +       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> +                       sas_address);
> +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +
> +       return sas_device;
> +}
> +
> +static struct _sas_device *
> +__mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> +{
> +       struct _sas_device *sas_device;
> +
> +       assert_spin_locked(&ioc->sas_device_lock);
>
>         list_for_each_entry(sas_device, &ioc->sas_device_list, list)
> -               if (sas_device->sas_address == sas_address)
> -                       return sas_device;
> +               if (sas_device->handle == handle)
> +                       goto found_device;
>
>         list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
> -               if (sas_device->sas_address == sas_address)
> -                       return sas_device;
> +               if (sas_device->handle == handle)
> +                       goto found_device;
>
>         return NULL;
> +
> +found_device:
> +       sas_device_get(sas_device);
> +       return sas_device;
>  }
>
>  /**
> - * _scsih_sas_device_find_by_handle - sas device search
> + * mpt2sas_get_sdev_by_handle - sas device search
>   * @ioc: per adapter object
>   * @handle: sas device handle (assigned by firmware)
>   * Context: Calling function should acquire ioc->sas_device_lock
> @@ -562,19 +635,16 @@ mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
>   * object.
>   */
>  static struct _sas_device *
> -_scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> +mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>  {
>         struct _sas_device *sas_device;
> +       unsigned long flags;
>
> -       list_for_each_entry(sas_device, &ioc->sas_device_list, list)
> -               if (sas_device->handle == handle)
> -                       return sas_device;
> -
> -       list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
> -               if (sas_device->handle == handle)
> -                       return sas_device;
> +       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> -       return NULL;
> +       return sas_device;
>  }
>
>  /**
> @@ -583,7 +653,7 @@ _scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>   * @sas_device: the sas_device object
>   * Context: This function will acquire ioc->sas_device_lock.
>   *
> - * Removing object and freeing associated memory from the ioc->sas_device_list.
> + * If sas_device is on the list, remove it and decrement its reference count.
>   */
>  static void
>  _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
> @@ -594,9 +664,15 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
>         if (!sas_device)
>                 return;
>
> +       /*
> +        * The lock serializes access to the list, but we still need to verify
> +        * that nobody removed the entry while we were waiting on the lock.
> +        */
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       list_del(&sas_device->list);
> -       kfree(sas_device);
> +       if (!list_empty(&sas_device->list)) {
> +               list_del_init(&sas_device->list);
> +               sas_device_put(sas_device);
> +       }
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>  }
>
> @@ -620,6 +696,7 @@ _scsih_sas_device_add(struct MPT2SAS_ADAPTER *ioc,
>             sas_device->handle, (unsigned long long)sas_device->sas_address));
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> +       sas_device_get(sas_device);

[Sreekanth] I think we are taking an unnecessary extra reference here: the
device's reference count is already initialized to one by kref_init() in
_scsih_add_device().
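
To make the two accounting schemes concrete, here is a rough userspace sketch
(hypothetical names throughout, a plain int in place of struct kref, no
locking). Scheme A models the quoted hunk, where the list takes its own
reference; scheme B models the suggestion that the list simply inherits the
initial one. Which of the two the driver needs depends on who ends up
dropping the initial reference, which this excerpt does not show.

```c
#include <assert.h>

/* Hypothetical stand-in for struct _sas_device. */
struct model_dev {
	int refcount;
	int on_list;
};

static void dev_init(struct model_dev *d)
{
	d->refcount = 1;	/* like kref_init(): the creator's reference */
	d->on_list = 0;
}

static void dev_get(struct model_dev *d)
{
	d->refcount++;
}

static void dev_put(struct model_dev *d)
{
	assert(d->refcount > 0);
	d->refcount--;
}

/* Scheme A (as in the quoted hunk): the list takes its own reference. */
static void list_add_own_ref(struct model_dev *d)
{
	dev_get(d);
	d->on_list = 1;
}

/* Scheme B (as suggested): the list inherits the initial reference. */
static void list_add_inherit(struct model_dev *d)
{
	d->on_list = 1;
}

/* Mirrors _scsih_sas_device_remove(): put only if still on the list. */
static void list_remove(struct model_dev *d)
{
	if (d->on_list) {
		d->on_list = 0;
		dev_put(d);
	}
}

int scheme_a_final_count(void)
{
	struct model_dev d;

	dev_init(&d);		/* 1: creator */
	list_add_own_ref(&d);	/* 2: creator + list */
	dev_put(&d);		/* 1: creator drops its reference */
	list_remove(&d);	/* 0: list drops its reference */
	return d.refcount;
}

int scheme_b_final_count(void)
{
	struct model_dev d;

	dev_init(&d);		/* 1: creator, handed over to the list */
	list_add_inherit(&d);	/* 1: now owned by the list */
	list_remove(&d);	/* 0: list drops it */
	return d.refcount;
}
```

Both schemes end at zero. The extra get in scheme A is only harmless if some
other path drops the initial reference; if nothing does, the object is never
freed.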

>         list_add_tail(&sas_device->list, &ioc->sas_device_list);
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> @@ -659,6 +736,7 @@ _scsih_sas_device_init_add(struct MPT2SAS_ADAPTER *ioc,
>             sas_device->handle, (unsigned long long)sas_device->sas_address));
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> +       sas_device_get(sas_device);

[Sreekanth] Same comment as above.

>         list_add_tail(&sas_device->list, &ioc->sas_device_init_list);
>         _scsih_determine_boot_device(ioc, sas_device, 0);
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> @@ -1208,12 +1286,14 @@ _scsih_change_queue_depth(struct scsi_device *sdev, int qdepth)
>                 goto not_sata;
>         if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))
>                 goto not_sata;
> +
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> -          sas_device_priv_data->sas_target->sas_address);
> -       if (sas_device && sas_device->device_info &
> -           MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
> +       sas_device = __mpt2sas_get_sdev_from_target(ioc, sas_target_priv_data);
> +       if (sas_device && sas_device->device_info
> +                       & MPI2_SAS_DEVICE_INFO_SATA_DEVICE) {
>                 max_depth = MPT2SAS_SATA_QUEUE_DEPTH;
> +               sas_device_put(sas_device);

[Sreekanth] It looks like the reference count is dropped here only for
SATA drives. What happens if the device is a SAS device?
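
A small userspace model of the concern, assuming the lookup always takes a
reference when it finds a device. All names are hypothetical, and
MODEL_INFO_SATA and MODEL_SATA_QUEUE_DEPTH are placeholder values, not the
driver's constants. The first function puts inside the combined condition, as
in the quoted hunk; the second separates the put from the SATA check.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_INFO_SATA		0x1u	/* placeholder flag value */
#define MODEL_SATA_QUEUE_DEPTH	32	/* placeholder depth */

/* Hypothetical stand-in for struct _sas_device. */
struct model_dev {
	int refcount;
	unsigned int device_info;
};

/* Models a lookup that returns the device with a reference held. */
static struct model_dev *lookup_get(struct model_dev *d)
{
	if (d)
		d->refcount++;
	return d;
}

static void dev_put(struct model_dev *d)
{
	assert(d->refcount > 0);
	d->refcount--;
}

/* As quoted: the put sits inside the "found && SATA" branch. */
int depth_put_in_branch(struct model_dev *target, int max_depth)
{
	struct model_dev *d = lookup_get(target);

	if (d && (d->device_info & MODEL_INFO_SATA)) {
		max_depth = MODEL_SATA_QUEUE_DEPTH;
		dev_put(d);
	}
	return max_depth;
}

/* Alternative: put whenever the lookup succeeded. */
int depth_put_always(struct model_dev *target, int max_depth)
{
	struct model_dev *d = lookup_get(target);

	if (d && (d->device_info & MODEL_INFO_SATA))
		max_depth = MODEL_SATA_QUEUE_DEPTH;
	if (d)
		dev_put(d);
	return max_depth;
}

/* References left over after one call on a device that started with one. */
int leftover_put_in_branch(unsigned int device_info)
{
	struct model_dev d = { 1, device_info };

	depth_put_in_branch(&d, 254);
	return d.refcount - 1;
}

int leftover_put_always(unsigned int device_info)
{
	struct model_dev d = { 1, device_info };

	depth_put_always(&d, 254);
	return d.refcount - 1;
}
```

In this model a non-SATA device keeps one extra reference on every call to
the first variant, while the second variant stays balanced for both device
types.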

> +       }
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
>   not_sata:
> @@ -1271,18 +1351,20 @@ _scsih_target_alloc(struct scsi_target *starget)
>         /* sas/sata devices */
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
>         rphy = dev_to_rphy(starget->dev.parent);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
>            rphy->identify.sas_address);
>
>         if (sas_device) {
>                 sas_target_priv_data->handle = sas_device->handle;
>                 sas_target_priv_data->sas_address = sas_device->sas_address;
> +               sas_target_priv_data->sdev = sas_device;
>                 sas_device->starget = starget;
>                 sas_device->id = starget->id;
>                 sas_device->channel = starget->channel;
>                 if (test_bit(sas_device->handle, ioc->pd_handles))
>                         sas_target_priv_data->flags |=
>                             MPT_TARGET_FLAGS_RAID_COMPONENT;
> +

[Sreekanth] I think a sas_device_put() call is missing here.
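
Whether a put is needed here depends on whether the reference from the lookup
is meant to be handed over to the new ->sdev pointer (the v3 changelog says
->hostdata now keeps a reference). The sketch below is a userspace model of
that hand-over pattern only, with hypothetical names. Releasing the stored
reference in the destroy path is an assumption of the sketch, not something
the hunks quoted so far confirm.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct _sas_device and struct MPT2SAS_TARGET. */
struct model_dev {
	int refcount;
};

struct model_target {
	struct model_dev *sdev;
};

/* Models a lookup that returns the device with a reference held. */
static struct model_dev *lookup_get(struct model_dev *d)
{
	if (d)
		d->refcount++;
	return d;
}

static void dev_put(struct model_dev *d)
{
	assert(d->refcount > 0);
	d->refcount--;
}

/* The lookup reference is handed over to t->sdev, so there is no put here. */
static void target_alloc(struct model_target *t, struct model_dev *found)
{
	struct model_dev *d = lookup_get(found);

	if (d)
		t->sdev = d;
}

/* Assumed counterpart: drop the stored reference when the target goes away. */
static void target_destroy(struct model_target *t)
{
	if (t->sdev) {
		dev_put(t->sdev);
		t->sdev = NULL;
	}
}

/* Device starts with one reference (the list's); count while target lives. */
int refs_while_target_alive(void)
{
	struct model_dev d = { 1 };
	struct model_target t = { NULL };

	target_alloc(&t, &d);
	return d.refcount;
}

/* Same setup, but the target is destroyed again before counting. */
int refs_after_target_destroy(void)
{
	struct model_dev d = { 1 };
	struct model_target t = { NULL };

	target_alloc(&t, &d);
	target_destroy(&t);
	return d.refcount;
}
```

If the stored pointer is not meant to own a reference, then the put belongs
at the end of the alloc path instead, as the comment above suggests.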


>         }
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> @@ -1324,13 +1406,14 @@ _scsih_target_destroy(struct scsi_target *starget)
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
>         rphy = dev_to_rphy(starget->dev.parent);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> -          rphy->identify.sas_address);
> +       sas_device = __mpt2sas_get_sdev_from_target(ioc, sas_target_priv_data);
>         if (sas_device && (sas_device->starget == starget) &&
>             (sas_device->id == starget->id) &&
>             (sas_device->channel == starget->channel))
>                 sas_device->starget = NULL;
>
> +       if (sas_device)
> +               sas_device_put(sas_device);
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
>   out:
> @@ -1386,7 +1469,7 @@ _scsih_slave_alloc(struct scsi_device *sdev)
>
>         if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
>                 spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +               sas_device = __mpt2sas_get_sdev_by_addr(ioc,
>                                 sas_target_priv_data->sas_address);
>                 if (sas_device && (sas_device->starget == NULL)) {
>                         sdev_printk(KERN_INFO, sdev,
> @@ -1394,6 +1477,10 @@ _scsih_slave_alloc(struct scsi_device *sdev)
>                              __func__, __LINE__);
>                         sas_device->starget = starget;
>                 }
> +
> +               if (sas_device)
> +                       sas_device_put(sas_device);
> +
>                 spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>         }
>
> @@ -1428,10 +1515,13 @@ _scsih_slave_destroy(struct scsi_device *sdev)
>
>         if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
>                 spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> -                  sas_target_priv_data->sas_address);
> +               sas_device = __mpt2sas_get_sdev_from_target(ioc,
> +                               sas_target_priv_data);
>                 if (sas_device && !sas_target_priv_data->num_luns)
>                         sas_device->starget = NULL;
> +
> +               if (sas_device)
> +                       sas_device_put(sas_device);
>                 spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>         }
>
> @@ -2078,7 +2168,7 @@ _scsih_slave_configure(struct scsi_device *sdev)
>         }
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
>            sas_device_priv_data->sas_target->sas_address);
>         if (!sas_device) {
>                 spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> @@ -2112,17 +2202,18 @@ _scsih_slave_configure(struct scsi_device *sdev)
>             (unsigned long long) sas_device->enclosure_logical_id,
>             sas_device->slot);
>
> +       sas_device_put(sas_device);
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>         if (!ssp_target)
>                 _scsih_display_sata_capabilities(ioc, handle, sdev);
>
> -
>         _scsih_change_queue_depth(sdev, qdepth);
>
>         if (ssp_target) {
>                 sas_read_port_mode_page(sdev);
>                 _scsih_enable_tlr(ioc, sdev);
>         }
> +
>         return 0;
>  }
>
> @@ -2509,8 +2600,7 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
>                     device_str, (unsigned long long)priv_target->sas_address);
>         } else {
>                 spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> -                   priv_target->sas_address);
> +               sas_device = __mpt2sas_get_sdev_from_target(ioc, priv_target);
>                 if (sas_device) {
>                         if (priv_target->flags &
>                             MPT_TARGET_FLAGS_RAID_COMPONENT) {
> @@ -2529,6 +2619,8 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
>                             "enclosure_logical_id(0x%016llx), slot(%d)\n",
>                            (unsigned long long)sas_device->enclosure_logical_id,
>                             sas_device->slot);
> +
> +                       sas_device_put(sas_device);
>                 }
>                 spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>         }
> @@ -2604,12 +2696,12 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
>  {
>         struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
>         struct MPT2SAS_DEVICE *sas_device_priv_data;
> -       struct _sas_device *sas_device;
> -       unsigned long flags;
> +       struct _sas_device *sas_device = NULL;
>         u16     handle;
>         int r;
>
>         struct scsi_target *starget = scmd->device->sdev_target;
> +       struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;
>
>         starget_printk(KERN_INFO, starget, "attempting device reset! "
>             "scmd(%p)\n", scmd);
> @@ -2629,12 +2721,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
>         handle = 0;
>         if (sas_device_priv_data->sas_target->flags &
>             MPT_TARGET_FLAGS_RAID_COMPONENT) {
> -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               sas_device = _scsih_sas_device_find_by_handle(ioc,
> -                  sas_device_priv_data->sas_target->handle);
> +               sas_device = mpt2sas_get_sdev_from_target(ioc,
> +                               target_priv_data);
>                 if (sas_device)
>                         handle = sas_device->volume_handle;
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>         } else
>                 handle = sas_device_priv_data->sas_target->handle;
>
> @@ -2651,6 +2741,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
>   out:
>         sdev_printk(KERN_INFO, scmd->device, "device reset: %s scmd(%p)\n",
>             ((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
> +
> +       if (sas_device)
> +               sas_device_put(sas_device);
> +
>         return r;
>  }
>
> @@ -2665,11 +2759,11 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
>  {
>         struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
>         struct MPT2SAS_DEVICE *sas_device_priv_data;
> -       struct _sas_device *sas_device;
> -       unsigned long flags;
> +       struct _sas_device *sas_device = NULL;
>         u16     handle;
>         int r;
>         struct scsi_target *starget = scmd->device->sdev_target;
> +       struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;
>
>         starget_printk(KERN_INFO, starget, "attempting target reset! "
>             "scmd(%p)\n", scmd);
> @@ -2689,12 +2783,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
>         handle = 0;
>         if (sas_device_priv_data->sas_target->flags &
>             MPT_TARGET_FLAGS_RAID_COMPONENT) {
> -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               sas_device = _scsih_sas_device_find_by_handle(ioc,
> -                  sas_device_priv_data->sas_target->handle);
> +               sas_device = mpt2sas_get_sdev_from_target(ioc,
> +                               target_priv_data);
>                 if (sas_device)
>                         handle = sas_device->volume_handle;
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>         } else
>                 handle = sas_device_priv_data->sas_target->handle;
>
> @@ -2711,6 +2803,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
>   out:
>         starget_printk(KERN_INFO, starget, "target reset: %s scmd(%p)\n",
>             ((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
> +
> +       if (sas_device)
> +               sas_device_put(sas_device);
> +
>         return r;
>  }
>
> @@ -3002,15 +3098,15 @@ _scsih_block_io_to_children_attached_to_ex(struct MPT2SAS_ADAPTER *ioc,
>
>         list_for_each_entry(mpt2sas_port,
>            &sas_expander->sas_port_list, port_list) {
> -               if (mpt2sas_port->remote_identify.device_type ==
> -                   SAS_END_DEVICE) {
> +               if (mpt2sas_port->remote_identify.device_type == SAS_END_DEVICE) {
>                         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -                       sas_device =
> -                           mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> -                          mpt2sas_port->remote_identify.sas_address);
> -                       if (sas_device)
> +                       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> +                                       mpt2sas_port->remote_identify.sas_address);
> +                       if (sas_device) {
>                                 set_bit(sas_device->handle,
> -                                   ioc->blocking_handles);
> +                                               ioc->blocking_handles);
> +                               sas_device_put(sas_device);
> +                       }
>                         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>                 }
>         }
> @@ -3080,7 +3176,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>  {
>         Mpi2SCSITaskManagementRequest_t *mpi_request;
>         u16 smid;
> -       struct _sas_device *sas_device;
> +       struct _sas_device *sas_device = NULL;
>         struct MPT2SAS_TARGET *sas_target_priv_data = NULL;
>         u64 sas_address = 0;
>         unsigned long flags;
> @@ -3110,7 +3206,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>                 return;
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
>         if (sas_device && sas_device->starget &&
>              sas_device->starget->hostdata) {
>                 sas_target_priv_data = sas_device->starget->hostdata;
> @@ -3131,14 +3227,14 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>         if (!smid) {
>                 delayed_tr = kzalloc(sizeof(*delayed_tr), GFP_ATOMIC);
>                 if (!delayed_tr)
> -                       return;
> +                       goto out;
>                 INIT_LIST_HEAD(&delayed_tr->list);
>                 delayed_tr->handle = handle;
>                 list_add_tail(&delayed_tr->list, &ioc->delayed_tr_list);
>                 dewtprintk(ioc, printk(MPT2SAS_INFO_FMT
>                     "DELAYED:tr:handle(0x%04x), (open)\n",
>                     ioc->name, handle));
> -               return;
> +               goto out;
>         }
>
>         dewtprintk(ioc, printk(MPT2SAS_INFO_FMT "tr_send:handle(0x%04x), "
> @@ -3150,6 +3246,9 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>         mpi_request->DevHandle = cpu_to_le16(handle);
>         mpi_request->TaskType = MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET;
>         mpt2sas_base_put_smid_hi_priority(ioc, smid);
> +out:
> +       if (sas_device)
> +               sas_device_put(sas_device);
>  }
>
>
> @@ -4068,7 +4167,6 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
>         char *desc_scsi_state = ioc->tmp_string;
>         u32 log_info = le32_to_cpu(mpi_reply->IOCLogInfo);
>         struct _sas_device *sas_device = NULL;
> -       unsigned long flags;
>         struct scsi_target *starget = scmd->device->sdev_target;
>         struct MPT2SAS_TARGET *priv_target = starget->hostdata;
>         char *device_str = NULL;
> @@ -4200,9 +4298,7 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
>                 printk(MPT2SAS_WARN_FMT "\t%s wwid(0x%016llx)\n", ioc->name,
>                     device_str, (unsigned long long)priv_target->sas_address);
>         } else {
> -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> -                   priv_target->sas_address);
> +               sas_device = mpt2sas_get_sdev_from_target(ioc, priv_target);
>                 if (sas_device) {
>                         printk(MPT2SAS_WARN_FMT "\tsas_address(0x%016llx), "
>                             "phy(%d)\n", ioc->name, sas_device->sas_address,
> @@ -4211,8 +4307,9 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
>                             "\tenclosure_logical_id(0x%016llx), slot(%d)\n",
>                             ioc->name, sas_device->enclosure_logical_id,
>                             sas_device->slot);
> +
> +                       sas_device_put(sas_device);
>                 }
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>         }
>
>         printk(MPT2SAS_WARN_FMT "\thandle(0x%04x), ioc_status(%s)(0x%04x), "
> @@ -4259,7 +4356,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>         Mpi2SepRequest_t mpi_request;
>         struct _sas_device *sas_device;
>
> -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> +       sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
>         if (!sas_device)
>                 return;
>
> @@ -4274,7 +4371,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>             &mpi_request)) != 0) {
>                 printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n", ioc->name,
>                 __FILE__, __LINE__, __func__);
> -               return;
> +               goto out;
>         }
>         sas_device->pfa_led_on = 1;
>
> @@ -4284,8 +4381,10 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>                  "enclosure_processor: ioc_status (0x%04x), loginfo(0x%08x)\n",
>                  ioc->name, le16_to_cpu(mpi_reply.IOCStatus),
>                  le32_to_cpu(mpi_reply.IOCLogInfo)));
> -               return;
> +               goto out;
>         }
> +out:
> +       sas_device_put(sas_device);
>  }
>
>  /**
> @@ -4370,19 +4469,17 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>
>         /* only handle non-raid devices */
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
>         if (!sas_device) {
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -               return;
> +               goto out_unlock;
>         }
>         starget = sas_device->starget;
>         sas_target_priv_data = starget->hostdata;
>
>         if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_RAID_COMPONENT) ||
> -          ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))) {
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -               return;
> -       }
> +          ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)))
> +               goto out_unlock;
> +
>         starget_printk(KERN_WARNING, starget, "predicted fault\n");
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> @@ -4396,7 +4493,7 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>         if (!event_reply) {
>                 printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
>                     ioc->name, __FILE__, __LINE__, __func__);
> -               return;
> +               goto out;
>         }
>
>         event_reply->Function = MPI2_FUNCTION_EVENT_NOTIFICATION;
> @@ -4413,6 +4510,14 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>         event_data->SASAddress = cpu_to_le64(sas_target_priv_data->sas_address);
>         mpt2sas_ctl_add_to_event_log(ioc, event_reply);
>         kfree(event_reply);
> +out:
> +       if (sas_device)
> +               sas_device_put(sas_device);
> +       return;
> +
> +out_unlock:
> +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +       goto out;
>  }
>
>  /**
> @@ -5148,14 +5253,13 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
>         sas_address = le64_to_cpu(sas_device_pg0.SASAddress);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
>             sas_address);
>
>         if (!sas_device) {
>                 printk(MPT2SAS_ERR_FMT "device is not present "
>                     "handle(0x%04x), no sas_device!!!\n", ioc->name, handle);
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -               return;
> +               goto out_unlock;
>         }
>
>         if (unlikely(sas_device->handle != handle)) {
> @@ -5172,19 +5276,22 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>             MPI2_SAS_DEVICE0_FLAGS_DEVICE_PRESENT)) {
>                 printk(MPT2SAS_ERR_FMT "device is not present "
>                     "handle(0x%04x), flags!!!\n", ioc->name, handle);
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -               return;
> +               goto out_unlock;
>         }
>
>         /* check if there were any issues with discovery */
>         if (_scsih_check_access_status(ioc, sas_address, handle,
> -           sas_device_pg0.AccessStatus)) {
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -               return;
> -       }
> +           sas_device_pg0.AccessStatus))
> +               goto out_unlock;
> +
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>         _scsih_ublock_io_device(ioc, sas_address);
> +       return;

[Sreekanth] I think the driver returns from this function here without
dropping the reference it took on sas_device.
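
To illustrate the pattern I have in mind, here is a minimal user-space sketch. It is not driver code: it assumes a plain integer counter in place of struct kref, and the names (struct obj, obj_get, obj_put, lookup, check_device) are hypothetical. The point is only that the success path and the error path both go through the same label, so the reference taken by the lookup is always dropped.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct _sas_device; not driver code. */
struct obj {
	int refcount;	/* plain int used here instead of struct kref */
	int present;
};

static struct obj *obj_alloc(void)
{
	struct obj *o = calloc(1, sizeof(*o));

	if (o)
		o->refcount = 1;	/* mirrors what kref_init() does */
	return o;
}

static void obj_get(struct obj *o)
{
	o->refcount++;
}

/* Returns 1 if this put freed the object, 0 otherwise. */
static int obj_put(struct obj *o)
{
	assert(o->refcount > 0);
	if (--o->refcount == 0) {
		free(o);
		return 1;
	}
	return 0;
}

/* Lookup that hands back a counted reference, or NULL. */
static struct obj *lookup(struct obj *o)
{
	if (o)
		obj_get(o);
	return o;
}

/* Success and failure both leave through "out", so the put always runs. */
static int check_device(struct obj *table_entry)
{
	struct obj *o = lookup(table_entry);
	int rc = -1;

	if (!o)
		return rc;	/* nothing was taken, nothing to drop */
	if (!o->present)
		goto out;
	rc = 0;
out:
	obj_put(o);
	return rc;
}
```

In this sketch the counter is back at its initial value after check_device() returns, whichever path was taken.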

>
> +out_unlock:
> +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +       if (sas_device)
> +               sas_device_put(sas_device);
>  }
>
>  /**
> @@ -5208,7 +5315,6 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
>         u32 ioc_status;
>         __le64 sas_address;
>         u32 device_info;
> -       unsigned long flags;
>
>         if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
>             MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
> @@ -5250,14 +5356,13 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
>                 return -1;
>         }
>
> -
> -       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +       sas_device = mpt2sas_get_sdev_by_addr(ioc,
>             sas_address);
> -       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> -       if (sas_device)
> +       if (sas_device) {
> +               sas_device_put(sas_device);
>                 return 0;
> +       }
>
>         sas_device = kzalloc(sizeof(struct _sas_device),
>             GFP_KERNEL);
> @@ -5267,6 +5372,7 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
>                 return -1;
>         }
>
> +       kref_init(&sas_device->refcount);
>         sas_device->handle = handle;
>         if (_scsih_get_sas_address(ioc, le16_to_cpu
>                 (sas_device_pg0.ParentDevHandle),
> @@ -5344,7 +5450,6 @@ _scsih_remove_device(struct MPT2SAS_ADAPTER *ioc,
>             "handle(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
>             sas_device->handle, (unsigned long long)
>             sas_device->sas_address));
> -       kfree(sas_device);
>  }
>  /**
>   * _scsih_device_remove_by_handle - removing device object by handle
> @@ -5363,12 +5468,17 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>                 return;
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> -       if (sas_device)
> -               list_del(&sas_device->list);
> +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> +       if (sas_device) {
> +               list_del_init(&sas_device->list);
> +               sas_device_put(sas_device);
> +       }
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -       if (sas_device)
> +
> +       if (sas_device) {
>                 _scsih_remove_device(ioc, sas_device);
> +               sas_device_put(sas_device);
> +       }
>  }
>
>  /**
> @@ -5389,13 +5499,17 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
>                 return;
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> -           sas_address);
> -       if (sas_device)
> -               list_del(&sas_device->list);
> +       sas_device = __mpt2sas_get_sdev_by_addr(ioc, sas_address);
> +       if (sas_device) {
> +               list_del_init(&sas_device->list);
> +               sas_device_put(sas_device);
> +       }
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -       if (sas_device)
> +
> +       if (sas_device) {
>                 _scsih_remove_device(ioc, sas_device);
> +               sas_device_put(sas_device);
> +       }
>  }
>  #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
>  /**
> @@ -5716,26 +5830,28 @@ _scsih_sas_device_status_change_event(struct MPT2SAS_ADAPTER *ioc,
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
>         sas_address = le64_to_cpu(event_data->SASAddress);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
>             sas_address);
>
> -       if (!sas_device || !sas_device->starget) {
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -               return;
> -       }
> +       if (!sas_device || !sas_device->starget)
> +               goto out;
>
>         target_priv_data = sas_device->starget->hostdata;
> -       if (!target_priv_data) {
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -               return;
> -       }
> +       if (!target_priv_data)
> +               goto out;
>
>         if (event_data->ReasonCode ==
>             MPI2_EVENT_SAS_DEV_STAT_RC_INTERNAL_DEVICE_RESET)
>                 target_priv_data->tm_busy = 1;
>         else
>                 target_priv_data->tm_busy = 0;
> +
> +out:
> +       if (sas_device)
> +               sas_device_put(sas_device);
> +
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +
>  }
>
>  #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
> @@ -6123,7 +6239,7 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
>         u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
>         if (sas_device) {
>                 sas_device->volume_handle = 0;
>                 sas_device->volume_wwid = 0;
> @@ -6142,6 +6258,8 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
>         /* exposing raid component */
>         if (starget)
>                 starget_for_each_device(starget, NULL, _scsih_reprobe_lun);
> +
> +       sas_device_put(sas_device);
>  }
>
>  /**
> @@ -6170,7 +6288,7 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
>                     &volume_wwid);
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
>         if (sas_device) {
>                 set_bit(handle, ioc->pd_handles);
>                 if (sas_device->starget && sas_device->starget->hostdata) {
> @@ -6189,6 +6307,8 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
>         /* hiding raid component */
>         if (starget)
>                 starget_for_each_device(starget, (void *)1, _scsih_reprobe_lun);
> +
> +       sas_device_put(sas_device);
>  }
>
>  /**
> @@ -6221,7 +6341,6 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
>      Mpi2EventIrConfigElement_t *element)
>  {
>         struct _sas_device *sas_device;
> -       unsigned long flags;
>         u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
>         Mpi2ConfigReply_t mpi_reply;
>         Mpi2SasDevicePage0_t sas_device_pg0;
> @@ -6231,11 +6350,11 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
>
>         set_bit(handle, ioc->pd_handles);
>
> -       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> -       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -       if (sas_device)
> +       sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> +       if (sas_device) {
> +               sas_device_put(sas_device);
>                 return;
> +       }
>
>         if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
>             MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
> @@ -6509,7 +6628,6 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
>         u16 handle, parent_handle;
>         u32 state;
>         struct _sas_device *sas_device;
> -       unsigned long flags;
>         Mpi2ConfigReply_t mpi_reply;
>         Mpi2SasDevicePage0_t sas_device_pg0;
>         u32 ioc_status;
> @@ -6542,12 +6660,11 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
>                 if (!ioc->is_warpdrive)
>                         set_bit(handle, ioc->pd_handles);
>
> -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -
> -               if (sas_device)
> +               sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> +               if (sas_device) {
> +                       sas_device_put(sas_device);
>                         return;
> +               }
>
>                 if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
>                     &sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
> @@ -7015,6 +7132,7 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
>         struct _raid_device *raid_device, *raid_device_next;
>         struct list_head tmp_list;
>         unsigned long flags;
> +       LIST_HEAD(head);
>
>         printk(MPT2SAS_INFO_FMT "removing unresponding devices: start\n",
>             ioc->name);
> @@ -7022,14 +7140,29 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
>         /* removing unresponding end devices */
>         printk(MPT2SAS_INFO_FMT "removing unresponding devices: end-devices\n",
>             ioc->name);
> +
> +       /*
> +        * Iterate, pulling off devices marked as non-responding. We become the
> +        * owner for the reference the list had on any object we prune.
> +        */
> +       spin_lock_irqsave(&ioc->sas_device_lock, flags);
>         list_for_each_entry_safe(sas_device, sas_device_next,
> -           &ioc->sas_device_list, list) {
> +                       &ioc->sas_device_list, list) {
>                 if (!sas_device->responding)
> -                       mpt2sas_device_remove_by_sas_address(ioc,
> -                               sas_device->sas_address);
> +                       list_move_tail(&sas_device->list, &head);
>                 else
>                         sas_device->responding = 0;
>         }
> +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +
> +       /*
> +        * Now, uninitialize and remove the unresponding devices we pruned.
> +        */
> +       list_for_each_entry_safe(sas_device, sas_device_next, &head, list) {
> +               _scsih_remove_device(ioc, sas_device);
> +               list_del_init(&sas_device->list);
> +               sas_device_put(sas_device);
> +       }
>
>         /* removing unresponding volumes */
>         if (ioc->ir_firmware) {
> @@ -7179,11 +7312,11 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
>                 }
>                 phys_disk_num = pd_pg0.PhysDiskNum;
>                 handle = le16_to_cpu(pd_pg0.DevHandle);
> -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -               if (sas_device)
> +               sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> +               if (sas_device) {
> +                       sas_device_put(sas_device);
>                         continue;
> +               }
>                 if (mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
>                     &sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
>                     handle) != 0)
> @@ -7302,12 +7435,12 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
>                 if (!(_scsih_is_end_device(
>                     le32_to_cpu(sas_device_pg0.DeviceInfo))))
>                         continue;
> -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +               sas_device = mpt2sas_get_sdev_by_addr(ioc,
>                     le64_to_cpu(sas_device_pg0.SASAddress));
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -               if (sas_device)
> +               if (sas_device) {
> +                       sas_device_put(sas_device);
>                         continue;
> +               }
>                 parent_handle = le16_to_cpu(sas_device_pg0.ParentDevHandle);
>                 if (!_scsih_get_sas_address(ioc, parent_handle, &sas_address)) {
>                         printk(MPT2SAS_INFO_FMT "\tBEFORE adding end device: "
> @@ -7966,6 +8099,48 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
>         }
>  }
>
> +static struct _sas_device *get_next_sas_device(struct MPT2SAS_ADAPTER *ioc)
> +{
> +       struct _sas_device *sas_device = NULL;
> +       unsigned long flags;
> +
> +       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> +       if (!list_empty(&ioc->sas_device_init_list)) {
> +               sas_device = list_first_entry(&ioc->sas_device_init_list,
> +                               struct _sas_device, list);
> +               sas_device_get(sas_device);
> +       }
> +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +
> +       return sas_device;
> +}
> +
> +static void sas_device_make_active(struct MPT2SAS_ADAPTER *ioc,
> +               struct _sas_device *sas_device)
> +{
> +       unsigned long flags;
> +
> +       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> +
> +       /*
> +        * Since we dropped the lock during the call to port_add(), we need to
> +        * be careful here that somebody else didn't move or delete this item
> +        * while we were busy with other things.
> +        *
> +        * If it was on the list, we need a put() for the reference the list
> +        * had. Either way, we need a get() for the destination list.
> +        */
> +       if (!list_empty(&sas_device->list)) {
> +               list_del_init(&sas_device->list);
> +               sas_device_put(sas_device);
> +       }
> +
> +       sas_device_get(sas_device);
> +       list_add_tail(&sas_device->list, &ioc->sas_device_list);
> +
> +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +}
> +
>  /**
>   * _scsih_probe_sas - reporting sas devices to sas transport
>   * @ioc: per adapter object
> @@ -7975,34 +8150,30 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
>  static void
>  _scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
>  {
> -       struct _sas_device *sas_device, *next;
> -       unsigned long flags;
> -
> -       /* SAS Device List */
> -       list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
> -           list) {
> +       struct _sas_device *sas_device;
>
> -               if (ioc->hide_drives)
> -                       continue;
> +       if (ioc->hide_drives)
> +               return;
>
> +       while ((sas_device = get_next_sas_device(ioc))) {
>                 if (!mpt2sas_transport_port_add(ioc, sas_device->handle,
> -                   sas_device->sas_address_parent)) {
> -                       list_del(&sas_device->list);
> -                       kfree(sas_device);
> +                               sas_device->sas_address_parent)) {
> +                       _scsih_sas_device_remove(ioc, sas_device);
> +                       sas_device_put(sas_device);
>                         continue;
>                 } else if (!sas_device->starget) {
>                         if (!ioc->is_driver_loading) {
>                                 mpt2sas_transport_port_remove(ioc,
> -                                       sas_device->sas_address,
> -                                       sas_device->sas_address_parent);
> -                               list_del(&sas_device->list);
> -                               kfree(sas_device);
> +                                               sas_device->sas_address,
> +                                               sas_device->sas_address_parent);
> +                               _scsih_sas_device_remove(ioc, sas_device);
> +                               sas_device_put(sas_device);
>                                 continue;
>                         }
>                 }
> -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               list_move_tail(&sas_device->list, &ioc->sas_device_list);
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +
> +               sas_device_make_active(ioc, sas_device);
> +               sas_device_put(sas_device);
>         }
>  }
>
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_transport.c b/drivers/scsi/mpt2sas/mpt2sas_transport.c
> index ff2500a..af86800 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_transport.c
> +++ b/drivers/scsi/mpt2sas/mpt2sas_transport.c
> @@ -1323,15 +1323,17 @@ _transport_get_enclosure_identifier(struct sas_rphy *rphy, u64 *identifier)
>         int rc;
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
>             rphy->identify.sas_address);
>         if (sas_device) {
>                 *identifier = sas_device->enclosure_logical_id;
>                 rc = 0;
> +               sas_device_put(sas_device);
>         } else {
>                 *identifier = 0;
>                 rc = -ENXIO;
>         }
> +
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>         return rc;
>  }
> @@ -1351,12 +1353,14 @@ _transport_get_bay_identifier(struct sas_rphy *rphy)
>         int rc;
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
>             rphy->identify.sas_address);
> -       if (sas_device)
> +       if (sas_device) {
>                 rc = sas_device->slot;
> -       else
> +               sas_device_put(sas_device);
> +       } else {
>                 rc = -ENXIO;
> +       }
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>         return rc;
>  }
> --
> 1.8.5.6
>



-- 

Regards,
Sreekanth


* Re: [PATCH v3 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage
  2015-08-10 13:15             ` Sreekanth Reddy
@ 2015-08-14  1:43               ` Calvin Owens
  0 siblings, 0 replies; 52+ messages in thread
From: Calvin Owens @ 2015-08-14  1:43 UTC (permalink / raw)
  To: Sreekanth Reddy
  Cc: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Abhijit Mahajan,
	MPT-FusionLinux.pdl, linux-scsi, linux-kernel, kernel-team,
	Joe Lawrence, Christoph Hellwig, Bart Van Assche

On Monday 08/10 at 18:45 +0530, Sreekanth Reddy wrote:
> On Sat, Aug 1, 2015 at 10:32 AM, Calvin Owens <calvinowens@fb.com> wrote:

Sreekanth,

Thanks for the review, responses below. I'll have a v4 out shortly.

Calvin
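
For anyone following along, here is a rough user-space sketch of the list ownership rule the quoted patch aims for: the list holds its own reference, and removal drops that reference only if the entry is still on a list. This is an illustration under assumptions, not the driver code. It uses a hand-rolled list and a plain integer counter, omits locking and freeing, and the names (struct node, struct dev, dev_add, dev_remove) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, loosely modelled on list_head. */
struct node {
	struct node *prev, *next;
};

static void node_init(struct node *n)
{
	n->prev = n->next = n;
}

static int node_empty(const struct node *n)
{
	return n->next == n;
}

static void node_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink and re-initialise, so node_empty() is true afterwards. */
static void node_del_init(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_init(n);
}

/* Hypothetical stand-in for struct _sas_device. */
struct dev {
	struct node list;
	int refcount;	/* plain int instead of struct kref */
};

static void dev_init(struct dev *d)
{
	node_init(&d->list);
	d->refcount = 1;	/* the creator's reference */
}

static void dev_add(struct dev *d, struct node *head)
{
	d->refcount++;		/* the list's own reference */
	node_add_tail(&d->list, head);
}

/* Safe to call twice: the second call sees an empty node and does nothing. */
static void dev_remove(struct dev *d)
{
	if (!node_empty(&d->list)) {
		node_del_init(&d->list);
		d->refcount--;	/* drop the list's reference */
	}
}
```

Because removal re-initialises the node, a second dev_remove() on the same object is a no-op rather than a double put, which is the property the list_empty() check in the patch relies on.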

> > These objects can be referenced concurrently throughout the driver, we
> > need a way to make sure threads can't delete them out from under each
> > other. This patch adds the refcount, and refactors the code to use it.
> >
> > Additionally, we cannot iterate over the sas_device_list without
> > holding the lock, or we risk corrupting random memory if items are
> > added or deleted as we iterate. This patch refactors _scsih_probe_sas()
> > to use the sas_device_list in a safe way.
> >
> > Cc: Christoph Hellwig <hch@infradead.org>
> > Cc: Bart Van Assche <bart.vanassche@sandisk.com>
> > Cc: Joe Lawrence <joe.lawrence@stratus.com>
> > Signed-off-by: Calvin Owens <calvinowens@fb.com>
> > ---
> > Changes in v3:
> >         * Drop the sas_device_lock while enabling devices, and leave the
> >           sas_device object on the list, since it may need to be looked up
> >           there while it is being enabled.
> >         * Drop put() in _scsih_add_device(), because the ->hostdata now keeps a
> >           reference (this was an oversight in v2).
> >         * Be consistent about calling sas_device_put() while holding the
> >           sas_device_lock where feasible.
> >         * Take and assert_spin_locked() on the sas_device_lock from the newly
> >           added __get_sdev_from_target(), add wrapper similar to other lookups
> >           for callers which do not explicitly take the lock.
> >
> > Changes in v2:
> >         * Squished patches 1-3 into this one
> >         * s/BUG_ON(!spin_is_locked/assert_spin_locked/g
> >         * Store a pointer to the sas_device object in ->hostdata, to eliminate
> >           the need for several lookups on the lists.
> >
> >  drivers/scsi/mpt2sas/mpt2sas_base.h      |  22 +-
> >  drivers/scsi/mpt2sas/mpt2sas_scsih.c     | 467 +++++++++++++++++++++----------
> >  drivers/scsi/mpt2sas/mpt2sas_transport.c |  12 +-
> >  3 files changed, 348 insertions(+), 153 deletions(-)
> >
> > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
> > index caff8d1..78f41ac 100644
> > --- a/drivers/scsi/mpt2sas/mpt2sas_base.h
> > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
> > @@ -238,6 +238,7 @@
> >   * @flags: MPT_TARGET_FLAGS_XXX flags
> >   * @deleted: target flaged for deletion
> >   * @tm_busy: target is busy with TM request.
> > + * @sdev: The sas_device associated with this target
> >   */
> >  struct MPT2SAS_TARGET {
> >         struct scsi_target *starget;
> > @@ -248,6 +249,7 @@ struct MPT2SAS_TARGET {
> >         u32     flags;
> >         u8      deleted;
> >         u8      tm_busy;
> > +       struct _sas_device *sdev;
> >  };
> >
> >
> > @@ -376,8 +378,24 @@ struct _sas_device {
> >         u8      phy;
> >         u8      responding;
> >         u8      pfa_led_on;
> > +       struct kref refcount;
> >  };
> >
> > +static inline void sas_device_get(struct _sas_device *s)
> > +{
> > +       kref_get(&s->refcount);
> > +}
> > +
> > +static inline void sas_device_free(struct kref *r)
> > +{
> > +       kfree(container_of(r, struct _sas_device, refcount));
> > +}
> > +
> > +static inline void sas_device_put(struct _sas_device *s)
> > +{
> > +       kref_put(&s->refcount, sas_device_free);
> > +}
> > +
> >  /**
> >   * struct _raid_device - raid volume link list
> >   * @list: sas device list
> > @@ -1095,7 +1113,9 @@ struct _sas_node *mpt2sas_scsih_expander_find_by_handle(struct MPT2SAS_ADAPTER *
> >      u16 handle);
> >  struct _sas_node *mpt2sas_scsih_expander_find_by_sas_address(struct MPT2SAS_ADAPTER
> >      *ioc, u64 sas_address);
> > -struct _sas_device *mpt2sas_scsih_sas_device_find_by_sas_address(
> > +struct _sas_device *mpt2sas_get_sdev_by_addr(
> > +    struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
> > +struct _sas_device *__mpt2sas_get_sdev_by_addr(
> >      struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
> >
> >  void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc);
> > diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> > index 3f26147..a2af9a5 100644
> > --- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> > +++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> > @@ -526,8 +526,61 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
> >         }
> >  }
> >
> > +static struct _sas_device *
> > +__mpt2sas_get_sdev_from_target(struct MPT2SAS_ADAPTER *ioc,
> > +               struct MPT2SAS_TARGET *tgt_priv)
> > +{
> > +       struct _sas_device *ret;
> > +
> > +       assert_spin_locked(&ioc->sas_device_lock);
> > +
> > +       ret = tgt_priv->sdev;
> > +       if (ret)
> > +               sas_device_get(ret);
> > +
> > +       return ret;
> > +}
> > +
> > +static struct _sas_device *
> > +mpt2sas_get_sdev_from_target(struct MPT2SAS_ADAPTER *ioc,
> > +               struct MPT2SAS_TARGET *tgt_priv)
> > +{
> > +       struct _sas_device *ret;
> > +       unsigned long flags;
> > +
> > +       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > +       ret = __mpt2sas_get_sdev_from_target(ioc, tgt_priv);
> > +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +
> > +       return ret;
> > +}
> > +
> > +
> > +struct _sas_device *
> > +__mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
> > +    u64 sas_address)
> > +{
> > +       struct _sas_device *sas_device;
> > +
> > +       assert_spin_locked(&ioc->sas_device_lock);
> > +
> > +       list_for_each_entry(sas_device, &ioc->sas_device_list, list)
> > +               if (sas_device->sas_address == sas_address)
> > +                       goto found_device;
> > +
> > +       list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
> > +               if (sas_device->sas_address == sas_address)
> > +                       goto found_device;
> > +
> > +       return NULL;
> > +
> > +found_device:
> > +       sas_device_get(sas_device);
> > +       return sas_device;
> > +}
> > +
> >  /**
> > - * mpt2sas_scsih_sas_device_find_by_sas_address - sas device search
> > + * mpt2sas_get_sdev_by_addr - sas device search
> >   * @ioc: per adapter object
> >   * @sas_address: sas address
> >   * Context: Calling function should acquire ioc->sas_device_lock
> > @@ -536,24 +589,44 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
> >   * object.
> >   */
> >  struct _sas_device *
> > -mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> > +mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
> >      u64 sas_address)
> >  {
> >         struct _sas_device *sas_device;
> > +       unsigned long flags;
> > +
> > +       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> > +                       sas_address);
> > +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +
> > +       return sas_device;
> > +}
> > +
> > +static struct _sas_device *
> > +__mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > +{
> > +       struct _sas_device *sas_device;
> > +
> > +       assert_spin_locked(&ioc->sas_device_lock);
> >
> >         list_for_each_entry(sas_device, &ioc->sas_device_list, list)
> > -               if (sas_device->sas_address == sas_address)
> > -                       return sas_device;
> > +               if (sas_device->handle == handle)
> > +                       goto found_device;
> >
> >         list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
> > -               if (sas_device->sas_address == sas_address)
> > -                       return sas_device;
> > +               if (sas_device->handle == handle)
> > +                       goto found_device;
> >
> >         return NULL;
> > +
> > +found_device:
> > +       sas_device_get(sas_device);
> > +       return sas_device;
> >  }
> >
> >  /**
> > - * _scsih_sas_device_find_by_handle - sas device search
> > + * mpt2sas_get_sdev_by_handle - sas device search
> >   * @ioc: per adapter object
> >   * @handle: sas device handle (assigned by firmware)
> >   * Context: Calling function should acquire ioc->sas_device_lock
> > @@ -562,19 +635,16 @@ mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> >   * object.
> >   */
> >  static struct _sas_device *
> > -_scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > +mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >  {
> >         struct _sas_device *sas_device;
> > +       unsigned long flags;
> >
> > -       list_for_each_entry(sas_device, &ioc->sas_device_list, list)
> > -               if (sas_device->handle == handle)
> > -                       return sas_device;
> > -
> > -       list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
> > -               if (sas_device->handle == handle)
> > -                       return sas_device;
> > +       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> > +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >
> > -       return NULL;
> > +       return sas_device;
> >  }
> >
> >  /**
> > @@ -583,7 +653,7 @@ _scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >   * @sas_device: the sas_device object
> >   * Context: This function will acquire ioc->sas_device_lock.
> >   *
> > - * Removing object and freeing associated memory from the ioc->sas_device_list.
> > + * If sas_device is on the list, remove it and decrement its reference count.
> >   */
> >  static void
> >  _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
> > @@ -594,9 +664,15 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
> >         if (!sas_device)
> >                 return;
> >
> > +       /*
> > +        * The lock serializes access to the list, but we still need to verify
> > +        * that nobody removed the entry while we were waiting on the lock.
> > +        */
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -       list_del(&sas_device->list);
> > -       kfree(sas_device);
> > +       if (!list_empty(&sas_device->list)) {
> > +               list_del_init(&sas_device->list);
> > +               sas_device_put(sas_device);
> > +       }
> >         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >  }
> >
> > @@ -620,6 +696,7 @@ _scsih_sas_device_add(struct MPT2SAS_ADAPTER *ioc,
> >             sas_device->handle, (unsigned long long)sas_device->sas_address));
> >
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > +       sas_device_get(sas_device);
> 
> [Sreekanth] I think we are taking an unnecessary extra reference here:
> the device's reference count is already initialized to one in
> _scsih_add_device() by kref_init().

The reference here is for the list itself. The corresponding put() is in
_scsih_sas_device_remove().
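
To illustrate the ownership rule, here is a minimal userspace sketch. It is
not the driver code: a plain integer stands in for the kref, an on_list flag
stands in for the list_head, and every name in it (struct sdev, sdev_get,
sdev_put, list_add_ref, list_remove_ref) is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct _sas_device; an int replaces the kref. */
struct sdev {
	int refcount;
	int on_list;	/* stands in for !list_empty(&sas_device->list) */
};

static int freed;	/* counts frees so the demo can check the lifetime */

static struct sdev *sdev_alloc(void)
{
	struct sdev *d = calloc(1, sizeof(*d));

	if (d)
		d->refcount = 1;	/* like kref_init(): creator's reference */
	return d;
}

static void sdev_get(struct sdev *d)
{
	d->refcount++;
}

static void sdev_put(struct sdev *d)
{
	assert(d->refcount > 0);
	if (--d->refcount == 0) {
		free(d);
		freed++;
	}
}

/* Adding to the list takes a reference that the list itself owns. */
static void list_add_ref(struct sdev *d)
{
	sdev_get(d);
	d->on_list = 1;
}

/* Removal drops the list's reference, but only if the entry is still listed. */
static void list_remove_ref(struct sdev *d)
{
	if (d->on_list) {
		d->on_list = 0;
		sdev_put(d);
	}
}

/* Returns 0 if the object is freed exactly once, after the last put. */
int demo_list_reference(void)
{
	struct sdev *d = sdev_alloc();

	if (!d)
		return -1;
	freed = 0;
	list_add_ref(d);	/* refcount == 2: creator + list */
	sdev_put(d);		/* creator is done; the list keeps it alive */
	if (freed != 0)
		return -1;
	list_remove_ref(d);	/* list reference dropped; object is freed */
	return freed == 1 ? 0 : -1;
}
```

Without the get in list_add_ref(), the creator's put would free an object
that is still reachable through the list.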

> >         list_add_tail(&sas_device->list, &ioc->sas_device_list);
> >         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >
> > @@ -659,6 +736,7 @@ _scsih_sas_device_init_add(struct MPT2SAS_ADAPTER *ioc,
> >             sas_device->handle, (unsigned long long)sas_device->sas_address));
> >
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > +       sas_device_get(sas_device);
> 
> [Sreekanth] Same comment as above.

Again, this is a reference for the list. The corresponding put() happens
in sas_device_make_active(), or in _scsih_sas_device_remove() if
mpt2sas_transport_port_add() fails in _scsih_probe_sas().

> >         list_add_tail(&sas_device->list, &ioc->sas_device_init_list);
> >         _scsih_determine_boot_device(ioc, sas_device, 0);
> >         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > @@ -1208,12 +1286,14 @@ _scsih_change_queue_depth(struct scsi_device *sdev, int qdepth)
> >                 goto not_sata;
> >         if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))
> >                 goto not_sata;
> > +
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > -          sas_device_priv_data->sas_target->sas_address);
> > -       if (sas_device && sas_device->device_info &
> > -           MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
> > +       sas_device = __mpt2sas_get_sdev_from_target(ioc, sas_target_priv_data);
> > +       if (sas_device && sas_device->device_info
> > +                       & MPI2_SAS_DEVICE_INFO_SATA_DEVICE) {
> >                 max_depth = MPT2SAS_SATA_QUEUE_DEPTH;
> > +               sas_device_put(sas_device);
> 
> [Sreekanth] It looks like the reference count is dropped only for
> SATA drives here. What if the device is a SAS device?

Yeah, you're right. Will fix.
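
One possible shape for the fix, as a userspace sketch rather than the actual
follow-up patch: the put moves out of the SATA test so that every non-NULL
lookup is balanced. The names lookup_get, sdev_put, pick_depth and INFO_SATA
are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>

#define INFO_SATA 0x1	/* hypothetical stand-in for the SATA device_info bit */

struct sdev {
	int refcount;
	unsigned int device_info;
};

/* Hypothetical lookup: returns the device with an extra reference held. */
static struct sdev *lookup_get(struct sdev *d)
{
	if (d)
		d->refcount++;
	return d;
}

static void sdev_put(struct sdev *d)
{
	assert(d->refcount > 0);
	d->refcount--;
}

/* The put sits outside the SATA test, so SAS devices drop the reference too. */
int pick_depth(struct sdev *candidate, int default_depth, int sata_depth)
{
	int depth = default_depth;
	struct sdev *d = lookup_get(candidate);

	if (d) {
		if (d->device_info & INFO_SATA)
			depth = sata_depth;
		sdev_put(d);
	}
	return depth;
}

/* Returns 0 if depths are right and both refcounts end where they started. */
int demo_balanced_put(void)
{
	struct sdev sas = { 1, 0 };
	struct sdev sata = { 1, INFO_SATA };

	if (pick_depth(&sas, 254, 32) != 254 || sas.refcount != 1)
		return -1;
	if (pick_depth(&sata, 254, 32) != 32 || sata.refcount != 1)
		return -1;
	return pick_depth(NULL, 254, 32) == 254 ? 0 : -1;
}
```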

> > +       }
> >         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >
> >   not_sata:
> > @@ -1271,18 +1351,20 @@ _scsih_target_alloc(struct scsi_target *starget)
> >         /* sas/sata devices */
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> >         rphy = dev_to_rphy(starget->dev.parent);
> > -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> >            rphy->identify.sas_address);
> >
> >         if (sas_device) {
> >                 sas_target_priv_data->handle = sas_device->handle;
> >                 sas_target_priv_data->sas_address = sas_device->sas_address;
> > +               sas_target_priv_data->sdev = sas_device;
> >                 sas_device->starget = starget;
> >                 sas_device->id = starget->id;
> >                 sas_device->channel = starget->channel;
> >                 if (test_bit(sas_device->handle, ioc->pd_handles))
> >                         sas_target_priv_data->flags |=
> >                             MPT_TARGET_FLAGS_RAID_COMPONENT;
> > +
> 
> [Sreekanth] I think a sas_device_put() call is missing here.

The reference here is held by the pointer to the sas_device stored in
->hostdata.

However, the corresponding put() is missing in _scsih_target_destroy(),
so it's definitely confusing. I'll fix that.
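
Roughly, the pairing should look like the userspace sketch below. It assumes
an integer in place of the kref and hypothetical names throughout (struct
target_priv, target_alloc, target_destroy); it is not the driver code.

```c
#include <assert.h>
#include <stddef.h>

struct sdev {
	int refcount;
};

/* Hypothetical stand-in for the per-target private data in ->hostdata. */
struct target_priv {
	struct sdev *sdev;
};

static void sdev_get(struct sdev *d)
{
	d->refcount++;
}

static void sdev_put(struct sdev *d)
{
	assert(d->refcount > 0);
	d->refcount--;
}

/* Alloc side: the lookup's reference is kept on behalf of priv->sdev. */
static void target_alloc(struct target_priv *priv, struct sdev *found)
{
	if (found) {
		sdev_get(found);	/* what a "get" style lookup hands back */
		priv->sdev = found;	/* the stored pointer now owns it */
	}
}

/* Destroy side: drop the reference that the stored pointer was holding. */
static void target_destroy(struct target_priv *priv)
{
	if (priv->sdev) {
		sdev_put(priv->sdev);
		priv->sdev = NULL;
	}
}

/* Returns 0 if the count goes 1 -> 2 -> 1 across alloc and destroy. */
int demo_hostdata_reference(void)
{
	struct sdev d = { 1 };		/* 1 == a reference held elsewhere */
	struct target_priv priv = { NULL };

	target_alloc(&priv, &d);
	if (d.refcount != 2)
		return -1;
	target_destroy(&priv);
	return (d.refcount == 1 && priv.sdev == NULL) ? 0 : -1;
}
```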

> >         }
> >         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >
> > @@ -1324,13 +1406,14 @@ _scsih_target_destroy(struct scsi_target *starget)
> >
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> >         rphy = dev_to_rphy(starget->dev.parent);
> > -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > -          rphy->identify.sas_address);
> > +       sas_device = __mpt2sas_get_sdev_from_target(ioc, sas_target_priv_data);
> >         if (sas_device && (sas_device->starget == starget) &&
> >             (sas_device->id == starget->id) &&
> >             (sas_device->channel == starget->channel))
> >                 sas_device->starget = NULL;
> >
> > +       if (sas_device)
> > +               sas_device_put(sas_device);
> >         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >
> >   out:
> > @@ -1386,7 +1469,7 @@ _scsih_slave_alloc(struct scsi_device *sdev)
> >
> >         if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
> >                 spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -               sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > +               sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> >                                 sas_target_priv_data->sas_address);
> >                 if (sas_device && (sas_device->starget == NULL)) {
> >                         sdev_printk(KERN_INFO, sdev,
> > @@ -1394,6 +1477,10 @@ _scsih_slave_alloc(struct scsi_device *sdev)
> >                              __func__, __LINE__);
> >                         sas_device->starget = starget;
> >                 }
> > +
> > +               if (sas_device)
> > +                       sas_device_put(sas_device);
> > +
> >                 spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >         }
> >
> > @@ -1428,10 +1515,13 @@ _scsih_slave_destroy(struct scsi_device *sdev)
> >
> >         if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
> >                 spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -               sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > -                  sas_target_priv_data->sas_address);
> > +               sas_device = __mpt2sas_get_sdev_from_target(ioc,
> > +                               sas_target_priv_data);
> >                 if (sas_device && !sas_target_priv_data->num_luns)
> >                         sas_device->starget = NULL;
> > +
> > +               if (sas_device)
> > +                       sas_device_put(sas_device);
> >                 spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >         }
> >
> > @@ -2078,7 +2168,7 @@ _scsih_slave_configure(struct scsi_device *sdev)
> >         }
> >
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> >            sas_device_priv_data->sas_target->sas_address);
> >         if (!sas_device) {
> >                 spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > @@ -2112,17 +2202,18 @@ _scsih_slave_configure(struct scsi_device *sdev)
> >             (unsigned long long) sas_device->enclosure_logical_id,
> >             sas_device->slot);
> >
> > +       sas_device_put(sas_device);
> >         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >         if (!ssp_target)
> >                 _scsih_display_sata_capabilities(ioc, handle, sdev);
> >
> > -
> >         _scsih_change_queue_depth(sdev, qdepth);
> >
> >         if (ssp_target) {
> >                 sas_read_port_mode_page(sdev);
> >                 _scsih_enable_tlr(ioc, sdev);
> >         }
> > +
> >         return 0;
> >  }
> >
> > @@ -2509,8 +2600,7 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
> >                     device_str, (unsigned long long)priv_target->sas_address);
> >         } else {
> >                 spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -               sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > -                   priv_target->sas_address);
> > +               sas_device = __mpt2sas_get_sdev_from_target(ioc, priv_target);
> >                 if (sas_device) {
> >                         if (priv_target->flags &
> >                             MPT_TARGET_FLAGS_RAID_COMPONENT) {
> > @@ -2529,6 +2619,8 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
> >                             "enclosure_logical_id(0x%016llx), slot(%d)\n",
> >                            (unsigned long long)sas_device->enclosure_logical_id,
> >                             sas_device->slot);
> > +
> > +                       sas_device_put(sas_device);
> >                 }
> >                 spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >         }
> > @@ -2604,12 +2696,12 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
> >  {
> >         struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
> >         struct MPT2SAS_DEVICE *sas_device_priv_data;
> > -       struct _sas_device *sas_device;
> > -       unsigned long flags;
> > +       struct _sas_device *sas_device = NULL;
> >         u16     handle;
> >         int r;
> >
> >         struct scsi_target *starget = scmd->device->sdev_target;
> > +       struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;
> >
> >         starget_printk(KERN_INFO, starget, "attempting device reset! "
> >             "scmd(%p)\n", scmd);
> > @@ -2629,12 +2721,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
> >         handle = 0;
> >         if (sas_device_priv_data->sas_target->flags &
> >             MPT_TARGET_FLAGS_RAID_COMPONENT) {
> > -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -               sas_device = _scsih_sas_device_find_by_handle(ioc,
> > -                  sas_device_priv_data->sas_target->handle);
> > +               sas_device = mpt2sas_get_sdev_from_target(ioc,
> > +                               target_priv_data);
> >                 if (sas_device)
> >                         handle = sas_device->volume_handle;
> > -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >         } else
> >                 handle = sas_device_priv_data->sas_target->handle;
> >
> > @@ -2651,6 +2741,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
> >   out:
> >         sdev_printk(KERN_INFO, scmd->device, "device reset: %s scmd(%p)\n",
> >             ((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
> > +
> > +       if (sas_device)
> > +               sas_device_put(sas_device);
> > +
> >         return r;
> >  }
> >
> > @@ -2665,11 +2759,11 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
> >  {
> >         struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
> >         struct MPT2SAS_DEVICE *sas_device_priv_data;
> > -       struct _sas_device *sas_device;
> > -       unsigned long flags;
> > +       struct _sas_device *sas_device = NULL;
> >         u16     handle;
> >         int r;
> >         struct scsi_target *starget = scmd->device->sdev_target;
> > +       struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;
> >
> >         starget_printk(KERN_INFO, starget, "attempting target reset! "
> >             "scmd(%p)\n", scmd);
> > @@ -2689,12 +2783,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
> >         handle = 0;
> >         if (sas_device_priv_data->sas_target->flags &
> >             MPT_TARGET_FLAGS_RAID_COMPONENT) {
> > -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -               sas_device = _scsih_sas_device_find_by_handle(ioc,
> > -                  sas_device_priv_data->sas_target->handle);
> > +               sas_device = mpt2sas_get_sdev_from_target(ioc,
> > +                               target_priv_data);
> >                 if (sas_device)
> >                         handle = sas_device->volume_handle;
> > -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >         } else
> >                 handle = sas_device_priv_data->sas_target->handle;
> >
> > @@ -2711,6 +2803,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
> >   out:
> >         starget_printk(KERN_INFO, starget, "target reset: %s scmd(%p)\n",
> >             ((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
> > +
> > +       if (sas_device)
> > +               sas_device_put(sas_device);
> > +
> >         return r;
> >  }
> >
> > @@ -3002,15 +3098,15 @@ _scsih_block_io_to_children_attached_to_ex(struct MPT2SAS_ADAPTER *ioc,
> >
> >         list_for_each_entry(mpt2sas_port,
> >            &sas_expander->sas_port_list, port_list) {
> > -               if (mpt2sas_port->remote_identify.device_type ==
> > -                   SAS_END_DEVICE) {
> > +               if (mpt2sas_port->remote_identify.device_type == SAS_END_DEVICE) {
> >                         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -                       sas_device =
> > -                           mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > -                          mpt2sas_port->remote_identify.sas_address);
> > -                       if (sas_device)
> > +                       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> > +                                       mpt2sas_port->remote_identify.sas_address);
> > +                       if (sas_device) {
> >                                 set_bit(sas_device->handle,
> > -                                   ioc->blocking_handles);
> > +                                               ioc->blocking_handles);
> > +                               sas_device_put(sas_device);
> > +                       }
> >                         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >                 }
> >         }
> > @@ -3080,7 +3176,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >  {
> >         Mpi2SCSITaskManagementRequest_t *mpi_request;
> >         u16 smid;
> > -       struct _sas_device *sas_device;
> > +       struct _sas_device *sas_device = NULL;
> >         struct MPT2SAS_TARGET *sas_target_priv_data = NULL;
> >         u64 sas_address = 0;
> >         unsigned long flags;
> > @@ -3110,7 +3206,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >                 return;
> >
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> >         if (sas_device && sas_device->starget &&
> >              sas_device->starget->hostdata) {
> >                 sas_target_priv_data = sas_device->starget->hostdata;
> > @@ -3131,14 +3227,14 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >         if (!smid) {
> >                 delayed_tr = kzalloc(sizeof(*delayed_tr), GFP_ATOMIC);
> >                 if (!delayed_tr)
> > -                       return;
> > +                       goto out;
> >                 INIT_LIST_HEAD(&delayed_tr->list);
> >                 delayed_tr->handle = handle;
> >                 list_add_tail(&delayed_tr->list, &ioc->delayed_tr_list);
> >                 dewtprintk(ioc, printk(MPT2SAS_INFO_FMT
> >                     "DELAYED:tr:handle(0x%04x), (open)\n",
> >                     ioc->name, handle));
> > -               return;
> > +               goto out;
> >         }
> >
> >         dewtprintk(ioc, printk(MPT2SAS_INFO_FMT "tr_send:handle(0x%04x), "
> > @@ -3150,6 +3246,9 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >         mpi_request->DevHandle = cpu_to_le16(handle);
> >         mpi_request->TaskType = MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET;
> >         mpt2sas_base_put_smid_hi_priority(ioc, smid);
> > +out:
> > +       if (sas_device)
> > +               sas_device_put(sas_device);
> >  }
> >
> >
> > @@ -4068,7 +4167,6 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
> >         char *desc_scsi_state = ioc->tmp_string;
> >         u32 log_info = le32_to_cpu(mpi_reply->IOCLogInfo);
> >         struct _sas_device *sas_device = NULL;
> > -       unsigned long flags;
> >         struct scsi_target *starget = scmd->device->sdev_target;
> >         struct MPT2SAS_TARGET *priv_target = starget->hostdata;
> >         char *device_str = NULL;
> > @@ -4200,9 +4298,7 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
> >                 printk(MPT2SAS_WARN_FMT "\t%s wwid(0x%016llx)\n", ioc->name,
> >                     device_str, (unsigned long long)priv_target->sas_address);
> >         } else {
> > -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -               sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > -                   priv_target->sas_address);
> > +               sas_device = mpt2sas_get_sdev_from_target(ioc, priv_target);
> >                 if (sas_device) {
> >                         printk(MPT2SAS_WARN_FMT "\tsas_address(0x%016llx), "
> >                             "phy(%d)\n", ioc->name, sas_device->sas_address,
> > @@ -4211,8 +4307,9 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
> >                             "\tenclosure_logical_id(0x%016llx), slot(%d)\n",
> >                             ioc->name, sas_device->enclosure_logical_id,
> >                             sas_device->slot);
> > +
> > +                       sas_device_put(sas_device);
> >                 }
> > -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >         }
> >
> >         printk(MPT2SAS_WARN_FMT "\thandle(0x%04x), ioc_status(%s)(0x%04x), "
> > @@ -4259,7 +4356,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >         Mpi2SepRequest_t mpi_request;
> >         struct _sas_device *sas_device;
> >
> > -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > +       sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> >         if (!sas_device)
> >                 return;
> >
> > @@ -4274,7 +4371,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >             &mpi_request)) != 0) {
> >                 printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n", ioc->name,
> >                 __FILE__, __LINE__, __func__);
> > -               return;
> > +               goto out;
> >         }
> >         sas_device->pfa_led_on = 1;
> >
> > @@ -4284,8 +4381,10 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >                  "enclosure_processor: ioc_status (0x%04x), loginfo(0x%08x)\n",
> >                  ioc->name, le16_to_cpu(mpi_reply.IOCStatus),
> >                  le32_to_cpu(mpi_reply.IOCLogInfo)));
> > -               return;
> > +               goto out;
> >         }
> > +out:
> > +       sas_device_put(sas_device);
> >  }
> >
> >  /**
> > @@ -4370,19 +4469,17 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >
> >         /* only handle non-raid devices */
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> >         if (!sas_device) {
> > -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > -               return;
> > +               goto out_unlock;
> >         }
> >         starget = sas_device->starget;
> >         sas_target_priv_data = starget->hostdata;
> >
> >         if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_RAID_COMPONENT) ||
> > -          ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))) {
> > -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > -               return;
> > -       }
> > +          ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)))
> > +               goto out_unlock;
> > +
> >         starget_printk(KERN_WARNING, starget, "predicted fault\n");
> >         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >
> > @@ -4396,7 +4493,7 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >         if (!event_reply) {
> >                 printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
> >                     ioc->name, __FILE__, __LINE__, __func__);
> > -               return;
> > +               goto out;
> >         }
> >
> >         event_reply->Function = MPI2_FUNCTION_EVENT_NOTIFICATION;
> > @@ -4413,6 +4510,14 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >         event_data->SASAddress = cpu_to_le64(sas_target_priv_data->sas_address);
> >         mpt2sas_ctl_add_to_event_log(ioc, event_reply);
> >         kfree(event_reply);
> > +out:
> > +       if (sas_device)
> > +               sas_device_put(sas_device);
> > +       return;
> > +
> > +out_unlock:
> > +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +       goto out;
> >  }
> >
> >  /**
> > @@ -5148,14 +5253,13 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> >         sas_address = le64_to_cpu(sas_device_pg0.SASAddress);
> > -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> >             sas_address);
> >
> >         if (!sas_device) {
> >                 printk(MPT2SAS_ERR_FMT "device is not present "
> >                     "handle(0x%04x), no sas_device!!!\n", ioc->name, handle);
> > -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > -               return;
> > +               goto out_unlock;
> >         }
> >
> >         if (unlikely(sas_device->handle != handle)) {
> > @@ -5172,19 +5276,22 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >             MPI2_SAS_DEVICE0_FLAGS_DEVICE_PRESENT)) {
> >                 printk(MPT2SAS_ERR_FMT "device is not present "
> >                     "handle(0x%04x), flags!!!\n", ioc->name, handle);
> > -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > -               return;
> > +               goto out_unlock;
> >         }
> >
> >         /* check if there were any issues with discovery */
> >         if (_scsih_check_access_status(ioc, sas_address, handle,
> > -           sas_device_pg0.AccessStatus)) {
> > -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > -               return;
> > -       }
> > +           sas_device_pg0.AccessStatus))
> > +               goto out_unlock;
> > +
> >         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >         _scsih_ublock_io_device(ioc, sas_address);
> > +       return;
> 
> [Sreekanth] I think the driver exits this function here without
> dropping the reference count.

Yes, it does. Will fix.
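
A sketch of one way to avoid this class of leak, assuming a single exit label
that both the success and the failure paths reach. It is a userspace
illustration with hypothetical names (lookup_get, sdev_put, check_device),
not the revised patch.

```c
#include <assert.h>
#include <stddef.h>

struct sdev {
	int refcount;
	int present;	/* stands in for the DEVICE_PRESENT flag check */
};

/* Hypothetical lookup: returns the device with an extra reference held. */
static struct sdev *lookup_get(struct sdev *d)
{
	if (d)
		d->refcount++;
	return d;
}

static void sdev_put(struct sdev *d)
{
	assert(d->refcount > 0);
	d->refcount--;
}

/* Success falls through to the same label as the errors, so the put always runs. */
int check_device(struct sdev *candidate)
{
	struct sdev *d = lookup_get(candidate);
	int rc = -1;

	if (!d)
		goto out;
	if (!d->present)
		goto out;
	rc = 0;		/* success path: no early return */
out:
	if (d)
		sdev_put(d);
	return rc;
}

/* Returns 0 if every path leaves the reference count unchanged. */
int demo_check_device(void)
{
	struct sdev ok = { 1, 1 };
	struct sdev gone = { 1, 0 };

	if (check_device(&ok) != 0 || ok.refcount != 1)
		return -1;
	if (check_device(&gone) != -1 || gone.refcount != 1)
		return -1;
	return check_device(NULL) == -1 ? 0 : -1;
}
```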

> >
> > +out_unlock:
> > +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +       if (sas_device)
> > +               sas_device_put(sas_device);
> >  }
> >
> >  /**
> > @@ -5208,7 +5315,6 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
> >         u32 ioc_status;
> >         __le64 sas_address;
> >         u32 device_info;
> > -       unsigned long flags;
> >
> >         if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
> >             MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
> > @@ -5250,14 +5356,13 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
> >                 return -1;
> >         }
> >
> > -
> > -       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > +       sas_device = mpt2sas_get_sdev_by_addr(ioc,
> >             sas_address);
> > -       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >
> > -       if (sas_device)
> > +       if (sas_device) {
> > +               sas_device_put(sas_device);
> >                 return 0;
> > +       }
> >
> >         sas_device = kzalloc(sizeof(struct _sas_device),
> >             GFP_KERNEL);
> > @@ -5267,6 +5372,7 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
> >                 return -1;
> >         }
> >
> > +       kref_init(&sas_device->refcount);
> >         sas_device->handle = handle;
> >         if (_scsih_get_sas_address(ioc, le16_to_cpu
> >                 (sas_device_pg0.ParentDevHandle),
> > @@ -5344,7 +5450,6 @@ _scsih_remove_device(struct MPT2SAS_ADAPTER *ioc,
> >             "handle(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
> >             sas_device->handle, (unsigned long long)
> >             sas_device->sas_address));
> > -       kfree(sas_device);
> >  }
> >  /**
> >   * _scsih_device_remove_by_handle - removing device object by handle
> > @@ -5363,12 +5468,17 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >                 return;
> >
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > -       if (sas_device)
> > -               list_del(&sas_device->list);
> > +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> > +       if (sas_device) {
> > +               list_del_init(&sas_device->list);
> > +               sas_device_put(sas_device);
> > +       }
> >         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > -       if (sas_device)
> > +
> > +       if (sas_device) {
> >                 _scsih_remove_device(ioc, sas_device);
> > +               sas_device_put(sas_device);
> > +       }
> >  }
> >
> >  /**
> > @@ -5389,13 +5499,17 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> >                 return;
> >
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > -           sas_address);
> > -       if (sas_device)
> > -               list_del(&sas_device->list);
> > +       sas_device = __mpt2sas_get_sdev_by_addr(ioc, sas_address);
> > +       if (sas_device) {
> > +               list_del_init(&sas_device->list);
> > +               sas_device_put(sas_device);
> > +       }
> >         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > -       if (sas_device)
> > +
> > +       if (sas_device) {
> >                 _scsih_remove_device(ioc, sas_device);
> > +               sas_device_put(sas_device);
> > +       }
> >  }
> >  #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
> >  /**
> > @@ -5716,26 +5830,28 @@ _scsih_sas_device_status_change_event(struct MPT2SAS_ADAPTER *ioc,
> >
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> >         sas_address = le64_to_cpu(event_data->SASAddress);
> > -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> >             sas_address);
> >
> > -       if (!sas_device || !sas_device->starget) {
> > -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > -               return;
> > -       }
> > +       if (!sas_device || !sas_device->starget)
> > +               goto out;
> >
> >         target_priv_data = sas_device->starget->hostdata;
> > -       if (!target_priv_data) {
> > -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > -               return;
> > -       }
> > +       if (!target_priv_data)
> > +               goto out;
> >
> >         if (event_data->ReasonCode ==
> >             MPI2_EVENT_SAS_DEV_STAT_RC_INTERNAL_DEVICE_RESET)
> >                 target_priv_data->tm_busy = 1;
> >         else
> >                 target_priv_data->tm_busy = 0;
> > +
> > +out:
> > +       if (sas_device)
> > +               sas_device_put(sas_device);
> > +
> >         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +
> >  }
> >
> >  #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
> > @@ -6123,7 +6239,7 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
> >         u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
> >
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> >         if (sas_device) {
> >                 sas_device->volume_handle = 0;
> >                 sas_device->volume_wwid = 0;
> > @@ -6142,6 +6258,8 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
> >         /* exposing raid component */
> >         if (starget)
> >                 starget_for_each_device(starget, NULL, _scsih_reprobe_lun);
> > +
> > +       sas_device_put(sas_device);
> >  }
> >
> >  /**
> > @@ -6170,7 +6288,7 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
> >                     &volume_wwid);
> >
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> >         if (sas_device) {
> >                 set_bit(handle, ioc->pd_handles);
> >                 if (sas_device->starget && sas_device->starget->hostdata) {
> > @@ -6189,6 +6307,8 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
> >         /* hiding raid component */
> >         if (starget)
> >                 starget_for_each_device(starget, (void *)1, _scsih_reprobe_lun);
> > +
> > +       sas_device_put(sas_device);
> >  }
> >
> >  /**
> > @@ -6221,7 +6341,6 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
> >      Mpi2EventIrConfigElement_t *element)
> >  {
> >         struct _sas_device *sas_device;
> > -       unsigned long flags;
> >         u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
> >         Mpi2ConfigReply_t mpi_reply;
> >         Mpi2SasDevicePage0_t sas_device_pg0;
> > @@ -6231,11 +6350,11 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
> >
> >         set_bit(handle, ioc->pd_handles);
> >
> > -       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > -       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > -       if (sas_device)
> > +       sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> > +       if (sas_device) {
> > +               sas_device_put(sas_device);
> >                 return;
> > +       }
> >
> >         if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
> >             MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
> > @@ -6509,7 +6628,6 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
> >         u16 handle, parent_handle;
> >         u32 state;
> >         struct _sas_device *sas_device;
> > -       unsigned long flags;
> >         Mpi2ConfigReply_t mpi_reply;
> >         Mpi2SasDevicePage0_t sas_device_pg0;
> >         u32 ioc_status;
> > @@ -6542,12 +6660,11 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
> >                 if (!ioc->is_warpdrive)
> >                         set_bit(handle, ioc->pd_handles);
> >
> > -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -               sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > -
> > -               if (sas_device)
> > +               sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> > +               if (sas_device) {
> > +                       sas_device_put(sas_device);
> >                         return;
> > +               }
> >
> >                 if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
> >                     &sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
> > @@ -7015,6 +7132,7 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
> >         struct _raid_device *raid_device, *raid_device_next;
> >         struct list_head tmp_list;
> >         unsigned long flags;
> > +       LIST_HEAD(head);
> >
> >         printk(MPT2SAS_INFO_FMT "removing unresponding devices: start\n",
> >             ioc->name);
> > @@ -7022,14 +7140,29 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
> >         /* removing unresponding end devices */
> >         printk(MPT2SAS_INFO_FMT "removing unresponding devices: end-devices\n",
> >             ioc->name);
> > +
> > +       /*
> > +        * Iterate, pulling off devices marked as non-responding. We become the
> > +        * owner for the reference the list had on any object we prune.
> > +        */
> > +       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> >         list_for_each_entry_safe(sas_device, sas_device_next,
> > -           &ioc->sas_device_list, list) {
> > +                       &ioc->sas_device_list, list) {
> >                 if (!sas_device->responding)
> > -                       mpt2sas_device_remove_by_sas_address(ioc,
> > -                               sas_device->sas_address);
> > +                       list_move_tail(&sas_device->list, &head);
> >                 else
> >                         sas_device->responding = 0;
> >         }
> > +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +
> > +       /*
> > +        * Now, uninitialize and remove the unresponding devices we pruned.
> > +        */
> > +       list_for_each_entry_safe(sas_device, sas_device_next, &head, list) {
> > +               _scsih_remove_device(ioc, sas_device);
> > +               list_del_init(&sas_device->list);
> > +               sas_device_put(sas_device);
> > +       }
> >
> >         /* removing unresponding volumes */
> >         if (ioc->ir_firmware) {
> > @@ -7179,11 +7312,11 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
> >                 }
> >                 phys_disk_num = pd_pg0.PhysDiskNum;
> >                 handle = le16_to_cpu(pd_pg0.DevHandle);
> > -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -               sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > -               if (sas_device)
> > +               sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> > +               if (sas_device) {
> > +                       sas_device_put(sas_device);
> >                         continue;
> > +               }
> >                 if (mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
> >                     &sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
> >                     handle) != 0)
> > @@ -7302,12 +7435,12 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
> >                 if (!(_scsih_is_end_device(
> >                     le32_to_cpu(sas_device_pg0.DeviceInfo))))
> >                         continue;
> > -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -               sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > +               sas_device = mpt2sas_get_sdev_by_addr(ioc,
> >                     le64_to_cpu(sas_device_pg0.SASAddress));
> > -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > -               if (sas_device)
> > +               if (sas_device) {
> > +                       sas_device_put(sas_device);
> >                         continue;
> > +               }
> >                 parent_handle = le16_to_cpu(sas_device_pg0.ParentDevHandle);
> >                 if (!_scsih_get_sas_address(ioc, parent_handle, &sas_address)) {
> >                         printk(MPT2SAS_INFO_FMT "\tBEFORE adding end device: "
> > @@ -7966,6 +8099,48 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
> >         }
> >  }
> >
> > +static struct _sas_device *get_next_sas_device(struct MPT2SAS_ADAPTER *ioc)
> > +{
> > +       struct _sas_device *sas_device = NULL;
> > +       unsigned long flags;
> > +
> > +       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > +       if (!list_empty(&ioc->sas_device_init_list)) {
> > +               sas_device = list_first_entry(&ioc->sas_device_init_list,
> > +                               struct _sas_device, list);
> > +               sas_device_get(sas_device);
> > +       }
> > +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +
> > +       return sas_device;
> > +}
> > +
> > +static void sas_device_make_active(struct MPT2SAS_ADAPTER *ioc,
> > +               struct _sas_device *sas_device)
> > +{
> > +       unsigned long flags;
> > +
> > +       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > +
> > +       /*
> > +        * Since we dropped the lock during the call to port_add(), we need to
> > +        * be careful here that somebody else didn't move or delete this item
> > +        * while we were busy with other things.
> > +        *
> > +        * If it was on the list, we need a put() for the reference the list
> > +        * had. Either way, we need a get() for the destination list.
> > +        */
> > +       if (!list_empty(&sas_device->list)) {
> > +               list_del_init(&sas_device->list);
> > +               sas_device_put(sas_device);
> > +       }
> > +
> > +       sas_device_get(sas_device);
> > +       list_add_tail(&sas_device->list, &ioc->sas_device_list);
> > +
> > +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +}
> > +
> >  /**
> >   * _scsih_probe_sas - reporting sas devices to sas transport
> >   * @ioc: per adapter object
> > @@ -7975,34 +8150,30 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
> >  static void
> >  _scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
> >  {
> > -       struct _sas_device *sas_device, *next;
> > -       unsigned long flags;
> > -
> > -       /* SAS Device List */
> > -       list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
> > -           list) {
> > +       struct _sas_device *sas_device;
> >
> > -               if (ioc->hide_drives)
> > -                       continue;
> > +       if (ioc->hide_drives)
> > +               return;
> >
> > +       while ((sas_device = get_next_sas_device(ioc))) {
> >                 if (!mpt2sas_transport_port_add(ioc, sas_device->handle,
> > -                   sas_device->sas_address_parent)) {
> > -                       list_del(&sas_device->list);
> > -                       kfree(sas_device);
> > +                               sas_device->sas_address_parent)) {
> > +                       _scsih_sas_device_remove(ioc, sas_device);
> > +                       sas_device_put(sas_device);
> >                         continue;
> >                 } else if (!sas_device->starget) {
> >                         if (!ioc->is_driver_loading) {
> >                                 mpt2sas_transport_port_remove(ioc,
> > -                                       sas_device->sas_address,
> > -                                       sas_device->sas_address_parent);
> > -                               list_del(&sas_device->list);
> > -                               kfree(sas_device);
> > +                                               sas_device->sas_address,
> > +                                               sas_device->sas_address_parent);
> > +                               _scsih_sas_device_remove(ioc, sas_device);
> > +                               sas_device_put(sas_device);
> >                                 continue;
> >                         }
> >                 }
> > -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -               list_move_tail(&sas_device->list, &ioc->sas_device_list);
> > -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +
> > +               sas_device_make_active(ioc, sas_device);
> > +               sas_device_put(sas_device);
> >         }
> >  }
> >
> > diff --git a/drivers/scsi/mpt2sas/mpt2sas_transport.c b/drivers/scsi/mpt2sas/mpt2sas_transport.c
> > index ff2500a..af86800 100644
> > --- a/drivers/scsi/mpt2sas/mpt2sas_transport.c
> > +++ b/drivers/scsi/mpt2sas/mpt2sas_transport.c
> > @@ -1323,15 +1323,17 @@ _transport_get_enclosure_identifier(struct sas_rphy *rphy, u64 *identifier)
> >         int rc;
> >
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> >             rphy->identify.sas_address);
> >         if (sas_device) {
> >                 *identifier = sas_device->enclosure_logical_id;
> >                 rc = 0;
> > +               sas_device_put(sas_device);
> >         } else {
> >                 *identifier = 0;
> >                 rc = -ENXIO;
> >         }
> > +
> >         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >         return rc;
> >  }
> > @@ -1351,12 +1353,14 @@ _transport_get_bay_identifier(struct sas_rphy *rphy)
> >         int rc;
> >
> >         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> >             rphy->identify.sas_address);
> > -       if (sas_device)
> > +       if (sas_device) {
> >                 rc = sas_device->slot;
> > -       else
> > +               sas_device_put(sas_device);
> > +       } else {
> >                 rc = -ENXIO;
> > +       }
> >         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >         return rc;
> >  }
> > --
> > 1.8.5.6
> >
> 
> 
> 
> -- 
> 
> Regards,
> Sreekanth

* [PATCH v4 0/2] Fixes for memory corruption in mpt2sas
  2015-08-01  5:02         ` [PATCH v3 0/2] Fixes for memory corruption in mpt2sas Calvin Owens
  2015-08-01  5:02           ` [PATCH v3 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage Calvin Owens
  2015-08-01  5:02           ` [PATCH v3 2/2] mpt2sas: Refcount fw_events " Calvin Owens
@ 2015-08-14  1:48           ` Calvin Owens
  2015-08-14  1:48             ` [PATCH v4 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage Calvin Owens
  2015-08-25 21:21             ` [PATCH v4 0/2] Fixes for memory corruption in mpt2sas Nicholas A. Bellinger
  2 siblings, 2 replies; 52+ messages in thread
From: Calvin Owens @ 2015-08-14  1:48 UTC (permalink / raw)
  To: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan
  Cc: MPT-FusionLinux.pdl, linux-scsi, linux-kernel, kernel-team,
	calvinowens, Joe Lawrence

Hello all,

This patchset attempts to address problems we've been having with
panics due to memory corruption from the mpt2sas driver.

Thanks,
Calvin


[PATCH v4 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage
[PATCH v4 2/2] mpt2sas: Refcount fw_events and fix unsafe list usage

Total diffstat:
 drivers/scsi/mpt2sas/mpt2sas_base.h      |  22 +-
 drivers/scsi/mpt2sas/mpt2sas_scsih.c     | 592 ++++++++++++++++++++++---------
 drivers/scsi/mpt2sas/mpt2sas_transport.c |  12 +-
 3 files changed, 451 insertions(+), 175 deletions(-)

Diff showing changes v3 => v4:
	http://jcalvinowens.github.io/stuff/mpt2sas-patchset-v3v4.patch

Diff showing changes v2 => v3:
	http://jcalvinowens.github.io/stuff/mpt2sas-patchset-v2v3.patch

Diff showing changes v1 => v2:
	http://jcalvinowens.github.io/stuff/mpt2sas-patchset-v1v2.patch

* [PATCH v4 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage
  2015-08-14  1:48           ` [PATCH v4 0/2] Fixes for memory corruption in mpt2sas Calvin Owens
@ 2015-08-14  1:48             ` Calvin Owens
  2015-08-14  1:48               ` [PATCH v4 2/2] mpt2sas: Refcount fw_events " Calvin Owens
                                 ` (2 more replies)
  2015-08-25 21:21             ` [PATCH v4 0/2] Fixes for memory corruption in mpt2sas Nicholas A. Bellinger
  1 sibling, 3 replies; 52+ messages in thread
From: Calvin Owens @ 2015-08-14  1:48 UTC (permalink / raw)
  To: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan
  Cc: MPT-FusionLinux.pdl, linux-scsi, linux-kernel, kernel-team,
	calvinowens, Joe Lawrence, Christoph Hellwig, Bart Van Assche

These objects can be referenced concurrently throughout the driver, so we
need a way to make sure threads can't delete them out from under each
other. This patch adds the refcount and refactors the code to use it.

Additionally, we cannot iterate over the sas_device_list without
holding the lock, or we risk corrupting random memory if items are
added or deleted as we iterate. This patch refactors _scsih_probe_sas()
to use the sas_device_list in a safe way.

Cc: Christoph Hellwig <hch@infradead.org>
Cc: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Joe Lawrence <joe.lawrence@stratus.com>
Signed-off-by: Calvin Owens <calvinowens@fb.com>
---
Changes in v4:
	* Fix lack of put() in non-SATA case in _scsih_change_queue_depth()
	* Fix lack of put() in the non-error case in _scsih_check_device()
	* Add missing put() at bottom of _scsih_add_device()
	* Add put for ->hostdata pointer in _scsih_target_destroy() for the
	  get() in _scsih_target_alloc()

Changes in v3:
	* Drop the sas_device_lock while enabling devices, and leave the
	  sas_device object on the list, since it may need to be looked up there
	  while it is being enabled.
	* Drop put() in _scsih_add_device(), because the ->hostdata now keeps a
	  reference (this was an oversight in v2).
	* Be consistent about calling sas_device_put() while holding the
	  sas_device_lock where feasible.
	* Call assert_spin_locked() on the sas_device_lock in the newly added
	  __mpt2sas_get_sdev_from_target(), and add a wrapper, similar to the
	  other lookups, for callers which do not explicitly take the lock.

Changes in v2:
	* Squashed patches 1-3 into this one
	* s/BUG_ON(!spin_is_locked/assert_spin_locked/g
	* Store a pointer to the sas_device object in ->hostdata, to eliminate
	  the need for several lookups on the lists.
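
For readers who want the ownership rules from the commit message in one
place, here is a rough userspace sketch. It is not driver code: struct dev,
dev_list_add(), dev_find_by_addr() and dev_list_remove() are hypothetical
names, a pthread mutex stands in for the sas_device_lock spinlock, and a C11
atomic counter stands in for struct kref. It only illustrates the three
rules the patch relies on: the list owns one reference, a lookup takes a
reference while the lock is held, and removal drops the list's reference
only if the object is still linked.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct _sas_device. */
struct dev {
	atomic_int refcount;       /* plays the role of struct kref */
	unsigned long long addr;   /* plays the role of sas_address */
	struct dev *next;          /* singly linked list link */
	int on_list;               /* protected by list_lock */
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct dev *list_head;      /* protected by list_lock */
static int freed_count;            /* bookkeeping for illustration only */

static struct dev *dev_alloc(unsigned long long addr)
{
	struct dev *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;
	atomic_init(&d->refcount, 1);   /* the caller's own reference */
	d->addr = addr;
	return d;
}

static void dev_get(struct dev *d)
{
	atomic_fetch_add(&d->refcount, 1);
}

static void dev_put(struct dev *d)
{
	/* fetch_sub returns the old value; 1 means this was the last ref */
	int old = atomic_fetch_sub(&d->refcount, 1);

	assert(old >= 1);
	if (old == 1) {
		freed_count++;
		free(d);
	}
}

/* The list holds its own reference to every object linked on it. */
static void dev_list_add(struct dev *d)
{
	pthread_mutex_lock(&list_lock);
	dev_get(d);
	d->next = list_head;
	list_head = d;
	d->on_list = 1;
	pthread_mutex_unlock(&list_lock);
}

/* Lookup takes a reference under the lock; the caller must dev_put(). */
static struct dev *dev_find_by_addr(unsigned long long addr)
{
	struct dev *d;

	pthread_mutex_lock(&list_lock);
	for (d = list_head; d; d = d->next) {
		if (d->addr == addr) {
			dev_get(d);
			break;
		}
	}
	pthread_mutex_unlock(&list_lock);
	return d;
}

/* Unlink only if still linked, then drop the reference the list held. */
static void dev_list_remove(struct dev *d)
{
	struct dev **pp;

	pthread_mutex_lock(&list_lock);
	if (d->on_list) {
		for (pp = &list_head; *pp != d; pp = &(*pp)->next)
			;
		*pp = d->next;
		d->next = NULL;
		d->on_list = 0;
		dev_put(d);
	}
	pthread_mutex_unlock(&list_lock);
}
```

In this sketch a caller holding a reference from dev_find_by_addr() can keep
using the object after another thread has called dev_list_remove(); the
memory is released only by the final dev_put(). Calling dev_list_remove()
twice is harmless, which mirrors the list_empty() check the patch adds to
_scsih_sas_device_remove().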

 drivers/scsi/mpt2sas/mpt2sas_base.h      |  22 +-
 drivers/scsi/mpt2sas/mpt2sas_scsih.c     | 480 +++++++++++++++++++++----------
 drivers/scsi/mpt2sas/mpt2sas_transport.c |  12 +-
 3 files changed, 360 insertions(+), 154 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
index caff8d1..78f41ac 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.h
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
@@ -238,6 +238,7 @@
  * @flags: MPT_TARGET_FLAGS_XXX flags
  * @deleted: target flaged for deletion
  * @tm_busy: target is busy with TM request.
+ * @sdev: The sas_device associated with this target
  */
 struct MPT2SAS_TARGET {
 	struct scsi_target *starget;
@@ -248,6 +249,7 @@ struct MPT2SAS_TARGET {
 	u32	flags;
 	u8	deleted;
 	u8	tm_busy;
+	struct _sas_device *sdev;
 };
 
 
@@ -376,8 +378,24 @@ struct _sas_device {
 	u8	phy;
 	u8	responding;
 	u8	pfa_led_on;
+	struct kref refcount;
 };
 
+static inline void sas_device_get(struct _sas_device *s)
+{
+	kref_get(&s->refcount);
+}
+
+static inline void sas_device_free(struct kref *r)
+{
+	kfree(container_of(r, struct _sas_device, refcount));
+}
+
+static inline void sas_device_put(struct _sas_device *s)
+{
+	kref_put(&s->refcount, sas_device_free);
+}
+
 /**
  * struct _raid_device - raid volume link list
  * @list: sas device list
@@ -1095,7 +1113,9 @@ struct _sas_node *mpt2sas_scsih_expander_find_by_handle(struct MPT2SAS_ADAPTER *
     u16 handle);
 struct _sas_node *mpt2sas_scsih_expander_find_by_sas_address(struct MPT2SAS_ADAPTER
     *ioc, u64 sas_address);
-struct _sas_device *mpt2sas_scsih_sas_device_find_by_sas_address(
+struct _sas_device *mpt2sas_get_sdev_by_addr(
+    struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
+struct _sas_device *__mpt2sas_get_sdev_by_addr(
     struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
 
 void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc);
diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 3f26147..5eca3a4 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -526,8 +526,61 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
 	}
 }
 
+static struct _sas_device *
+__mpt2sas_get_sdev_from_target(struct MPT2SAS_ADAPTER *ioc,
+		struct MPT2SAS_TARGET *tgt_priv)
+{
+	struct _sas_device *ret;
+
+	assert_spin_locked(&ioc->sas_device_lock);
+
+	ret = tgt_priv->sdev;
+	if (ret)
+		sas_device_get(ret);
+
+	return ret;
+}
+
+static struct _sas_device *
+mpt2sas_get_sdev_from_target(struct MPT2SAS_ADAPTER *ioc,
+		struct MPT2SAS_TARGET *tgt_priv)
+{
+	struct _sas_device *ret;
+	unsigned long flags;
+
+	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	ret = __mpt2sas_get_sdev_from_target(ioc, tgt_priv);
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+	return ret;
+}
+
+
+struct _sas_device *
+__mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
+    u64 sas_address)
+{
+	struct _sas_device *sas_device;
+
+	assert_spin_locked(&ioc->sas_device_lock);
+
+	list_for_each_entry(sas_device, &ioc->sas_device_list, list)
+		if (sas_device->sas_address == sas_address)
+			goto found_device;
+
+	list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
+		if (sas_device->sas_address == sas_address)
+			goto found_device;
+
+	return NULL;
+
+found_device:
+	sas_device_get(sas_device);
+	return sas_device;
+}
+
 /**
- * mpt2sas_scsih_sas_device_find_by_sas_address - sas device search
+ * mpt2sas_get_sdev_by_addr - sas device search
  * @ioc: per adapter object
  * @sas_address: sas address
  * Context: Calling function should acquire ioc->sas_device_lock
@@ -536,24 +589,44 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
  * object.
  */
 struct _sas_device *
-mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
+mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
     u64 sas_address)
 {
 	struct _sas_device *sas_device;
+	unsigned long flags;
+
+	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	sas_device = __mpt2sas_get_sdev_by_addr(ioc,
+			sas_address);
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+	return sas_device;
+}
+
+static struct _sas_device *
+__mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
+{
+	struct _sas_device *sas_device;
+
+	assert_spin_locked(&ioc->sas_device_lock);
 
 	list_for_each_entry(sas_device, &ioc->sas_device_list, list)
-		if (sas_device->sas_address == sas_address)
-			return sas_device;
+		if (sas_device->handle == handle)
+			goto found_device;
 
 	list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
-		if (sas_device->sas_address == sas_address)
-			return sas_device;
+		if (sas_device->handle == handle)
+			goto found_device;
 
 	return NULL;
+
+found_device:
+	sas_device_get(sas_device);
+	return sas_device;
 }
 
 /**
- * _scsih_sas_device_find_by_handle - sas device search
+ * mpt2sas_get_sdev_by_handle - sas device search
  * @ioc: per adapter object
  * @handle: sas device handle (assigned by firmware)
  * Context: Calling function should acquire ioc->sas_device_lock
@@ -562,19 +635,16 @@ mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
  * object.
  */
 static struct _sas_device *
-_scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
+mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 {
 	struct _sas_device *sas_device;
+	unsigned long flags;
 
-	list_for_each_entry(sas_device, &ioc->sas_device_list, list)
-		if (sas_device->handle == handle)
-			return sas_device;
-
-	list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
-		if (sas_device->handle == handle)
-			return sas_device;
+	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
-	return NULL;
+	return sas_device;
 }
 
 /**
@@ -583,7 +653,7 @@ _scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
  * @sas_device: the sas_device object
  * Context: This function will acquire ioc->sas_device_lock.
  *
- * Removing object and freeing associated memory from the ioc->sas_device_list.
+ * If sas_device is on the list, remove it and decrement its reference count.
  */
 static void
 _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
@@ -594,9 +664,15 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
 	if (!sas_device)
 		return;
 
+	/*
+	 * The lock serializes access to the list, but we still need to verify
+	 * that nobody removed the entry while we were waiting on the lock.
+	 */
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	list_del(&sas_device->list);
-	kfree(sas_device);
+	if (!list_empty(&sas_device->list)) {
+		list_del_init(&sas_device->list);
+		sas_device_put(sas_device);
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 }
 
@@ -620,6 +696,7 @@ _scsih_sas_device_add(struct MPT2SAS_ADAPTER *ioc,
 	    sas_device->handle, (unsigned long long)sas_device->sas_address));
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	sas_device_get(sas_device);
 	list_add_tail(&sas_device->list, &ioc->sas_device_list);
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
@@ -659,6 +736,7 @@ _scsih_sas_device_init_add(struct MPT2SAS_ADAPTER *ioc,
 	    sas_device->handle, (unsigned long long)sas_device->sas_address));
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	sas_device_get(sas_device);
 	list_add_tail(&sas_device->list, &ioc->sas_device_init_list);
 	_scsih_determine_boot_device(ioc, sas_device, 0);
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
@@ -1208,12 +1286,15 @@ _scsih_change_queue_depth(struct scsi_device *sdev, int qdepth)
 		goto not_sata;
 	if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))
 		goto not_sata;
+
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
-	   sas_device_priv_data->sas_target->sas_address);
-	if (sas_device && sas_device->device_info &
-	    MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
-		max_depth = MPT2SAS_SATA_QUEUE_DEPTH;
+	sas_device = __mpt2sas_get_sdev_from_target(ioc, sas_target_priv_data);
+	if (sas_device) {
+		if (sas_device->device_info & MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
+			max_depth = MPT2SAS_SATA_QUEUE_DEPTH;
+
+		sas_device_put(sas_device);
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
  not_sata:
@@ -1271,18 +1352,20 @@ _scsih_target_alloc(struct scsi_target *starget)
 	/* sas/sata devices */
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	rphy = dev_to_rphy(starget->dev.parent);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = __mpt2sas_get_sdev_by_addr(ioc,
 	   rphy->identify.sas_address);
 
 	if (sas_device) {
 		sas_target_priv_data->handle = sas_device->handle;
 		sas_target_priv_data->sas_address = sas_device->sas_address;
+		sas_target_priv_data->sdev = sas_device;
 		sas_device->starget = starget;
 		sas_device->id = starget->id;
 		sas_device->channel = starget->channel;
 		if (test_bit(sas_device->handle, ioc->pd_handles))
 			sas_target_priv_data->flags |=
 			    MPT_TARGET_FLAGS_RAID_COMPONENT;
+
 	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
@@ -1324,13 +1407,21 @@ _scsih_target_destroy(struct scsi_target *starget)
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	rphy = dev_to_rphy(starget->dev.parent);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
-	   rphy->identify.sas_address);
+	sas_device = __mpt2sas_get_sdev_from_target(ioc, sas_target_priv_data);
 	if (sas_device && (sas_device->starget == starget) &&
 	    (sas_device->id == starget->id) &&
 	    (sas_device->channel == starget->channel))
 		sas_device->starget = NULL;
 
+	if (sas_device) {
+		/*
+		 * Corresponding get() is in _scsih_target_alloc()
+		 */
+		sas_target_priv_data->sdev = NULL;
+		sas_device_put(sas_device);
+
+		sas_device_put(sas_device);
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
  out:
@@ -1386,7 +1477,7 @@ _scsih_slave_alloc(struct scsi_device *sdev)
 
 	if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
 		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+		sas_device = __mpt2sas_get_sdev_by_addr(ioc,
 				sas_target_priv_data->sas_address);
 		if (sas_device && (sas_device->starget == NULL)) {
 			sdev_printk(KERN_INFO, sdev,
@@ -1394,6 +1485,10 @@ _scsih_slave_alloc(struct scsi_device *sdev)
 			     __func__, __LINE__);
 			sas_device->starget = starget;
 		}
+
+		if (sas_device)
+			sas_device_put(sas_device);
+
 		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	}
 
@@ -1428,10 +1523,13 @@ _scsih_slave_destroy(struct scsi_device *sdev)
 
 	if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
 		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
-		   sas_target_priv_data->sas_address);
+		sas_device = __mpt2sas_get_sdev_from_target(ioc,
+				sas_target_priv_data);
 		if (sas_device && !sas_target_priv_data->num_luns)
 			sas_device->starget = NULL;
+
+		if (sas_device)
+			sas_device_put(sas_device);
 		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	}
 
@@ -2078,7 +2176,7 @@ _scsih_slave_configure(struct scsi_device *sdev)
 	}
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = __mpt2sas_get_sdev_by_addr(ioc,
 	   sas_device_priv_data->sas_target->sas_address);
 	if (!sas_device) {
 		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
@@ -2112,17 +2210,18 @@ _scsih_slave_configure(struct scsi_device *sdev)
 	    (unsigned long long) sas_device->enclosure_logical_id,
 	    sas_device->slot);
 
+	sas_device_put(sas_device);
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	if (!ssp_target)
 		_scsih_display_sata_capabilities(ioc, handle, sdev);
 
-
 	_scsih_change_queue_depth(sdev, qdepth);
 
 	if (ssp_target) {
 		sas_read_port_mode_page(sdev);
 		_scsih_enable_tlr(ioc, sdev);
 	}
+
 	return 0;
 }
 
@@ -2509,8 +2608,7 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
 		    device_str, (unsigned long long)priv_target->sas_address);
 	} else {
 		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
-		    priv_target->sas_address);
+		sas_device = __mpt2sas_get_sdev_from_target(ioc, priv_target);
 		if (sas_device) {
 			if (priv_target->flags &
 			    MPT_TARGET_FLAGS_RAID_COMPONENT) {
@@ -2529,6 +2627,8 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
 			    "enclosure_logical_id(0x%016llx), slot(%d)\n",
 			   (unsigned long long)sas_device->enclosure_logical_id,
 			    sas_device->slot);
+
+			sas_device_put(sas_device);
 		}
 		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	}
@@ -2604,12 +2704,12 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
 {
 	struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
 	struct MPT2SAS_DEVICE *sas_device_priv_data;
-	struct _sas_device *sas_device;
-	unsigned long flags;
+	struct _sas_device *sas_device = NULL;
 	u16	handle;
 	int r;
 
 	struct scsi_target *starget = scmd->device->sdev_target;
+	struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;
 
 	starget_printk(KERN_INFO, starget, "attempting device reset! "
 	    "scmd(%p)\n", scmd);
@@ -2629,12 +2729,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
 	handle = 0;
 	if (sas_device_priv_data->sas_target->flags &
 	    MPT_TARGET_FLAGS_RAID_COMPONENT) {
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = _scsih_sas_device_find_by_handle(ioc,
-		   sas_device_priv_data->sas_target->handle);
+		sas_device = mpt2sas_get_sdev_from_target(ioc,
+				target_priv_data);
 		if (sas_device)
 			handle = sas_device->volume_handle;
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	} else
 		handle = sas_device_priv_data->sas_target->handle;
 
@@ -2651,6 +2749,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
  out:
 	sdev_printk(KERN_INFO, scmd->device, "device reset: %s scmd(%p)\n",
 	    ((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
+
+	if (sas_device)
+		sas_device_put(sas_device);
+
 	return r;
 }
 
@@ -2665,11 +2767,11 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
 {
 	struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
 	struct MPT2SAS_DEVICE *sas_device_priv_data;
-	struct _sas_device *sas_device;
-	unsigned long flags;
+	struct _sas_device *sas_device = NULL;
 	u16	handle;
 	int r;
 	struct scsi_target *starget = scmd->device->sdev_target;
+	struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;
 
 	starget_printk(KERN_INFO, starget, "attempting target reset! "
 	    "scmd(%p)\n", scmd);
@@ -2689,12 +2791,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
 	handle = 0;
 	if (sas_device_priv_data->sas_target->flags &
 	    MPT_TARGET_FLAGS_RAID_COMPONENT) {
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = _scsih_sas_device_find_by_handle(ioc,
-		   sas_device_priv_data->sas_target->handle);
+		sas_device = mpt2sas_get_sdev_from_target(ioc,
+				target_priv_data);
 		if (sas_device)
 			handle = sas_device->volume_handle;
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	} else
 		handle = sas_device_priv_data->sas_target->handle;
 
@@ -2711,6 +2811,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
  out:
 	starget_printk(KERN_INFO, starget, "target reset: %s scmd(%p)\n",
 	    ((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
+
+	if (sas_device)
+		sas_device_put(sas_device);
+
 	return r;
 }
 
@@ -3002,15 +3106,15 @@ _scsih_block_io_to_children_attached_to_ex(struct MPT2SAS_ADAPTER *ioc,
 
 	list_for_each_entry(mpt2sas_port,
 	   &sas_expander->sas_port_list, port_list) {
-		if (mpt2sas_port->remote_identify.device_type ==
-		    SAS_END_DEVICE) {
+		if (mpt2sas_port->remote_identify.device_type == SAS_END_DEVICE) {
 			spin_lock_irqsave(&ioc->sas_device_lock, flags);
-			sas_device =
-			    mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
-			   mpt2sas_port->remote_identify.sas_address);
-			if (sas_device)
+			sas_device = __mpt2sas_get_sdev_by_addr(ioc,
+					mpt2sas_port->remote_identify.sas_address);
+			if (sas_device) {
 				set_bit(sas_device->handle,
-				    ioc->blocking_handles);
+						ioc->blocking_handles);
+				sas_device_put(sas_device);
+			}
 			spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 		}
 	}
@@ -3080,7 +3184,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 {
 	Mpi2SCSITaskManagementRequest_t *mpi_request;
 	u16 smid;
-	struct _sas_device *sas_device;
+	struct _sas_device *sas_device = NULL;
 	struct MPT2SAS_TARGET *sas_target_priv_data = NULL;
 	u64 sas_address = 0;
 	unsigned long flags;
@@ -3110,7 +3214,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 		return;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+	sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
 	if (sas_device && sas_device->starget &&
 	     sas_device->starget->hostdata) {
 		sas_target_priv_data = sas_device->starget->hostdata;
@@ -3131,14 +3235,14 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	if (!smid) {
 		delayed_tr = kzalloc(sizeof(*delayed_tr), GFP_ATOMIC);
 		if (!delayed_tr)
-			return;
+			goto out;
 		INIT_LIST_HEAD(&delayed_tr->list);
 		delayed_tr->handle = handle;
 		list_add_tail(&delayed_tr->list, &ioc->delayed_tr_list);
 		dewtprintk(ioc, printk(MPT2SAS_INFO_FMT
 		    "DELAYED:tr:handle(0x%04x), (open)\n",
 		    ioc->name, handle));
-		return;
+		goto out;
 	}
 
 	dewtprintk(ioc, printk(MPT2SAS_INFO_FMT "tr_send:handle(0x%04x), "
@@ -3150,6 +3254,9 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	mpi_request->DevHandle = cpu_to_le16(handle);
 	mpi_request->TaskType = MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET;
 	mpt2sas_base_put_smid_hi_priority(ioc, smid);
+out:
+	if (sas_device)
+		sas_device_put(sas_device);
 }
 
 
@@ -4068,7 +4175,6 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
 	char *desc_scsi_state = ioc->tmp_string;
 	u32 log_info = le32_to_cpu(mpi_reply->IOCLogInfo);
 	struct _sas_device *sas_device = NULL;
-	unsigned long flags;
 	struct scsi_target *starget = scmd->device->sdev_target;
 	struct MPT2SAS_TARGET *priv_target = starget->hostdata;
 	char *device_str = NULL;
@@ -4200,9 +4306,7 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
 		printk(MPT2SAS_WARN_FMT "\t%s wwid(0x%016llx)\n", ioc->name,
 		    device_str, (unsigned long long)priv_target->sas_address);
 	} else {
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
-		    priv_target->sas_address);
+		sas_device = mpt2sas_get_sdev_from_target(ioc, priv_target);
 		if (sas_device) {
 			printk(MPT2SAS_WARN_FMT "\tsas_address(0x%016llx), "
 			    "phy(%d)\n", ioc->name, sas_device->sas_address,
@@ -4211,8 +4315,9 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
 			    "\tenclosure_logical_id(0x%016llx), slot(%d)\n",
 			    ioc->name, sas_device->enclosure_logical_id,
 			    sas_device->slot);
+
+			sas_device_put(sas_device);
 		}
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	}
 
 	printk(MPT2SAS_WARN_FMT "\thandle(0x%04x), ioc_status(%s)(0x%04x), "
@@ -4259,7 +4364,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	Mpi2SepRequest_t mpi_request;
 	struct _sas_device *sas_device;
 
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+	sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
 	if (!sas_device)
 		return;
 
@@ -4274,7 +4379,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	    &mpi_request)) != 0) {
 		printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n", ioc->name,
 		__FILE__, __LINE__, __func__);
-		return;
+		goto out;
 	}
 	sas_device->pfa_led_on = 1;
 
@@ -4284,8 +4389,10 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 		 "enclosure_processor: ioc_status (0x%04x), loginfo(0x%08x)\n",
 		 ioc->name, le16_to_cpu(mpi_reply.IOCStatus),
 		 le32_to_cpu(mpi_reply.IOCLogInfo)));
-		return;
+		goto out;
 	}
+out:
+	sas_device_put(sas_device);
 }
 
 /**
@@ -4370,19 +4477,17 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 
 	/* only handle non-raid devices */
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+	sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
 	if (!sas_device) {
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
+		goto out_unlock;
 	}
 	starget = sas_device->starget;
 	sas_target_priv_data = starget->hostdata;
 
 	if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_RAID_COMPONENT) ||
-	   ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))) {
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
-	}
+	   ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)))
+		goto out_unlock;
+
 	starget_printk(KERN_WARNING, starget, "predicted fault\n");
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
@@ -4396,7 +4501,7 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	if (!event_reply) {
 		printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
 		    ioc->name, __FILE__, __LINE__, __func__);
-		return;
+		goto out;
 	}
 
 	event_reply->Function = MPI2_FUNCTION_EVENT_NOTIFICATION;
@@ -4413,6 +4518,14 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	event_data->SASAddress = cpu_to_le64(sas_target_priv_data->sas_address);
 	mpt2sas_ctl_add_to_event_log(ioc, event_reply);
 	kfree(event_reply);
+out:
+	if (sas_device)
+		sas_device_put(sas_device);
+	return;
+
+out_unlock:
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+	goto out;
 }
 
 /**
@@ -5148,14 +5261,13 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	sas_address = le64_to_cpu(sas_device_pg0.SASAddress);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = __mpt2sas_get_sdev_by_addr(ioc,
 	    sas_address);
 
 	if (!sas_device) {
 		printk(MPT2SAS_ERR_FMT "device is not present "
 		    "handle(0x%04x), no sas_device!!!\n", ioc->name, handle);
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
+		goto out_unlock;
 	}
 
 	if (unlikely(sas_device->handle != handle)) {
@@ -5172,19 +5284,24 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 	    MPI2_SAS_DEVICE0_FLAGS_DEVICE_PRESENT)) {
 		printk(MPT2SAS_ERR_FMT "device is not present "
 		    "handle(0x%04x), flags!!!\n", ioc->name, handle);
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
+		goto out_unlock;
 	}
 
 	/* check if there were any issues with discovery */
 	if (_scsih_check_access_status(ioc, sas_address, handle,
-	    sas_device_pg0.AccessStatus)) {
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
-	}
+	    sas_device_pg0.AccessStatus))
+		goto out_unlock;
+
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	_scsih_ublock_io_device(ioc, sas_address);
+	if (sas_device)
+		sas_device_put(sas_device);
+	return;
 
+out_unlock:
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+	if (sas_device)
+		sas_device_put(sas_device);
 }
 
 /**
@@ -5208,7 +5325,6 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
 	u32 ioc_status;
 	__le64 sas_address;
 	u32 device_info;
-	unsigned long flags;
 
 	if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
 	    MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
@@ -5250,14 +5366,13 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
 		return -1;
 	}
 
-
-	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = mpt2sas_get_sdev_by_addr(ioc,
 	    sas_address);
-	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 
-	if (sas_device)
+	if (sas_device) {
+		sas_device_put(sas_device);
 		return 0;
+	}
 
 	sas_device = kzalloc(sizeof(struct _sas_device),
 	    GFP_KERNEL);
@@ -5267,6 +5382,7 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
 		return -1;
 	}
 
+	kref_init(&sas_device->refcount);
 	sas_device->handle = handle;
 	if (_scsih_get_sas_address(ioc, le16_to_cpu
 		(sas_device_pg0.ParentDevHandle),
@@ -5296,6 +5412,7 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
 	else
 		_scsih_sas_device_add(ioc, sas_device);
 
+	sas_device_put(sas_device);
 	return 0;
 }
 
@@ -5344,7 +5461,6 @@ _scsih_remove_device(struct MPT2SAS_ADAPTER *ioc,
 	    "handle(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
 	    sas_device->handle, (unsigned long long)
 	    sas_device->sas_address));
-	kfree(sas_device);
 }
 /**
  * _scsih_device_remove_by_handle - removing device object by handle
@@ -5363,12 +5479,17 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 		return;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
-	if (sas_device)
-		list_del(&sas_device->list);
+	sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
+	if (sas_device) {
+		list_del_init(&sas_device->list);
+		sas_device_put(sas_device);
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-	if (sas_device)
+
+	if (sas_device) {
 		_scsih_remove_device(ioc, sas_device);
+		sas_device_put(sas_device);
+	}
 }
 
 /**
@@ -5389,13 +5510,17 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
 		return;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
-	    sas_address);
-	if (sas_device)
-		list_del(&sas_device->list);
+	sas_device = __mpt2sas_get_sdev_by_addr(ioc, sas_address);
+	if (sas_device) {
+		list_del_init(&sas_device->list);
+		sas_device_put(sas_device);
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-	if (sas_device)
+
+	if (sas_device) {
 		_scsih_remove_device(ioc, sas_device);
+		sas_device_put(sas_device);
+	}
 }
 #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
 /**
@@ -5716,26 +5841,28 @@ _scsih_sas_device_status_change_event(struct MPT2SAS_ADAPTER *ioc,
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	sas_address = le64_to_cpu(event_data->SASAddress);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = __mpt2sas_get_sdev_by_addr(ioc,
 	    sas_address);
 
-	if (!sas_device || !sas_device->starget) {
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
-	}
+	if (!sas_device || !sas_device->starget)
+		goto out;
 
 	target_priv_data = sas_device->starget->hostdata;
-	if (!target_priv_data) {
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		return;
-	}
+	if (!target_priv_data)
+		goto out;
 
 	if (event_data->ReasonCode ==
 	    MPI2_EVENT_SAS_DEV_STAT_RC_INTERNAL_DEVICE_RESET)
 		target_priv_data->tm_busy = 1;
 	else
 		target_priv_data->tm_busy = 0;
+
+out:
+	if (sas_device)
+		sas_device_put(sas_device);
+
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
 }
 
 #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
@@ -6123,7 +6250,7 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
 	u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+	sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
 	if (sas_device) {
 		sas_device->volume_handle = 0;
 		sas_device->volume_wwid = 0;
@@ -6142,6 +6269,8 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
 	/* exposing raid component */
 	if (starget)
 		starget_for_each_device(starget, NULL, _scsih_reprobe_lun);
+
+	sas_device_put(sas_device);
 }
 
 /**
@@ -6170,7 +6299,7 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
 		    &volume_wwid);
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+	sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
 	if (sas_device) {
 		set_bit(handle, ioc->pd_handles);
 		if (sas_device->starget && sas_device->starget->hostdata) {
@@ -6189,6 +6318,8 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
 	/* hiding raid component */
 	if (starget)
 		starget_for_each_device(starget, (void *)1, _scsih_reprobe_lun);
+
+	sas_device_put(sas_device);
 }
 
 /**
@@ -6221,7 +6352,6 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
     Mpi2EventIrConfigElement_t *element)
 {
 	struct _sas_device *sas_device;
-	unsigned long flags;
 	u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
 	Mpi2ConfigReply_t mpi_reply;
 	Mpi2SasDevicePage0_t sas_device_pg0;
@@ -6231,11 +6361,11 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
 
 	set_bit(handle, ioc->pd_handles);
 
-	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
-	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-	if (sas_device)
+	sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
+	if (sas_device) {
+		sas_device_put(sas_device);
 		return;
+	}
 
 	if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
 	    MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
@@ -6509,7 +6639,6 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
 	u16 handle, parent_handle;
 	u32 state;
 	struct _sas_device *sas_device;
-	unsigned long flags;
 	Mpi2ConfigReply_t mpi_reply;
 	Mpi2SasDevicePage0_t sas_device_pg0;
 	u32 ioc_status;
@@ -6542,12 +6671,11 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
 		if (!ioc->is_warpdrive)
 			set_bit(handle, ioc->pd_handles);
 
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-
-		if (sas_device)
+		sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
+		if (sas_device) {
+			sas_device_put(sas_device);
 			return;
+		}
 
 		if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
 		    &sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
@@ -7015,6 +7143,7 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
 	struct _raid_device *raid_device, *raid_device_next;
 	struct list_head tmp_list;
 	unsigned long flags;
+	LIST_HEAD(head);
 
 	printk(MPT2SAS_INFO_FMT "removing unresponding devices: start\n",
 	    ioc->name);
@@ -7022,14 +7151,29 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
 	/* removing unresponding end devices */
 	printk(MPT2SAS_INFO_FMT "removing unresponding devices: end-devices\n",
 	    ioc->name);
+
+	/*
+	 * Iterate, pulling off devices marked as non-responding. We become the
+	 * owner for the reference the list had on any object we prune.
+	 */
+	spin_lock_irqsave(&ioc->sas_device_lock, flags);
 	list_for_each_entry_safe(sas_device, sas_device_next,
-	    &ioc->sas_device_list, list) {
+			&ioc->sas_device_list, list) {
 		if (!sas_device->responding)
-			mpt2sas_device_remove_by_sas_address(ioc,
-				sas_device->sas_address);
+			list_move_tail(&sas_device->list, &head);
 		else
 			sas_device->responding = 0;
 	}
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+	/*
+	 * Now, uninitialize and remove the unresponding devices we pruned.
+	 */
+	list_for_each_entry_safe(sas_device, sas_device_next, &head, list) {
+		_scsih_remove_device(ioc, sas_device);
+		list_del_init(&sas_device->list);
+		sas_device_put(sas_device);
+	}
 
 	/* removing unresponding volumes */
 	if (ioc->ir_firmware) {
@@ -7179,11 +7323,11 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
 		}
 		phys_disk_num = pd_pg0.PhysDiskNum;
 		handle = le16_to_cpu(pd_pg0.DevHandle);
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		if (sas_device)
+		sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
+		if (sas_device) {
+			sas_device_put(sas_device);
 			continue;
+		}
 		if (mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
 		    &sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
 		    handle) != 0)
@@ -7302,12 +7446,12 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
 		if (!(_scsih_is_end_device(
 		    le32_to_cpu(sas_device_pg0.DeviceInfo))))
 			continue;
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+		sas_device = mpt2sas_get_sdev_by_addr(ioc,
 		    le64_to_cpu(sas_device_pg0.SASAddress));
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-		if (sas_device)
+		if (sas_device) {
+			sas_device_put(sas_device);
 			continue;
+		}
 		parent_handle = le16_to_cpu(sas_device_pg0.ParentDevHandle);
 		if (!_scsih_get_sas_address(ioc, parent_handle, &sas_address)) {
 			printk(MPT2SAS_INFO_FMT "\tBEFORE adding end device: "
@@ -7966,6 +8110,48 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
 	}
 }
 
+static struct _sas_device *get_next_sas_device(struct MPT2SAS_ADAPTER *ioc)
+{
+	struct _sas_device *sas_device = NULL;
+	unsigned long flags;
+
+	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+	if (!list_empty(&ioc->sas_device_init_list)) {
+		sas_device = list_first_entry(&ioc->sas_device_init_list,
+				struct _sas_device, list);
+		sas_device_get(sas_device);
+	}
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+	return sas_device;
+}
+
+static void sas_device_make_active(struct MPT2SAS_ADAPTER *ioc,
+		struct _sas_device *sas_device)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&ioc->sas_device_lock, flags);
+
+	/*
+	 * Since we dropped the lock during the call to port_add(), we need to
+	 * be careful here that somebody else didn't move or delete this item
+	 * while we were busy with other things.
+	 *
+	 * If it was on the list, we need a put() for the reference the list
+	 * had. Either way, we need a get() for the destination list.
+	 */
+	if (!list_empty(&sas_device->list)) {
+		list_del_init(&sas_device->list);
+		sas_device_put(sas_device);
+	}
+
+	sas_device_get(sas_device);
+	list_add_tail(&sas_device->list, &ioc->sas_device_list);
+
+	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+}
+
 /**
  * _scsih_probe_sas - reporting sas devices to sas transport
  * @ioc: per adapter object
@@ -7975,34 +8161,30 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
 static void
 _scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
 {
-	struct _sas_device *sas_device, *next;
-	unsigned long flags;
-
-	/* SAS Device List */
-	list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
-	    list) {
+	struct _sas_device *sas_device;
 
-		if (ioc->hide_drives)
-			continue;
+	if (ioc->hide_drives)
+		return;
 
+	while ((sas_device = get_next_sas_device(ioc))) {
 		if (!mpt2sas_transport_port_add(ioc, sas_device->handle,
-		    sas_device->sas_address_parent)) {
-			list_del(&sas_device->list);
-			kfree(sas_device);
+				sas_device->sas_address_parent)) {
+			_scsih_sas_device_remove(ioc, sas_device);
+			sas_device_put(sas_device);
 			continue;
 		} else if (!sas_device->starget) {
 			if (!ioc->is_driver_loading) {
 				mpt2sas_transport_port_remove(ioc,
-					sas_device->sas_address,
-					sas_device->sas_address_parent);
-				list_del(&sas_device->list);
-				kfree(sas_device);
+						sas_device->sas_address,
+						sas_device->sas_address_parent);
+				_scsih_sas_device_remove(ioc, sas_device);
+				sas_device_put(sas_device);
 				continue;
 			}
 		}
-		spin_lock_irqsave(&ioc->sas_device_lock, flags);
-		list_move_tail(&sas_device->list, &ioc->sas_device_list);
-		spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+		sas_device_make_active(ioc, sas_device);
+		sas_device_put(sas_device);
 	}
 }
 
diff --git a/drivers/scsi/mpt2sas/mpt2sas_transport.c b/drivers/scsi/mpt2sas/mpt2sas_transport.c
index ff2500a..af86800 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_transport.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_transport.c
@@ -1323,15 +1323,17 @@ _transport_get_enclosure_identifier(struct sas_rphy *rphy, u64 *identifier)
 	int rc;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = __mpt2sas_get_sdev_by_addr(ioc,
 	    rphy->identify.sas_address);
 	if (sas_device) {
 		*identifier = sas_device->enclosure_logical_id;
 		rc = 0;
+		sas_device_put(sas_device);
 	} else {
 		*identifier = 0;
 		rc = -ENXIO;
 	}
+
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	return rc;
 }
@@ -1351,12 +1353,14 @@ _transport_get_bay_identifier(struct sas_rphy *rphy)
 	int rc;
 
 	spin_lock_irqsave(&ioc->sas_device_lock, flags);
-	sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+	sas_device = __mpt2sas_get_sdev_by_addr(ioc,
 	    rphy->identify.sas_address);
-	if (sas_device)
+	if (sas_device) {
 		rc = sas_device->slot;
-	else
+		sas_device_put(sas_device);
+	} else {
 		rc = -ENXIO;
+	}
 	spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
 	return rc;
 }
-- 
2.5.0
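
For readers following the thread, the two-phase pruning that this patch
introduces in _scsih_remove_unresponding_sas_devices() can be sketched in
plain userspace C. This is only an illustration under stated assumptions:
struct dev, dev_push(), prune_unresponding() and release_pruned() are
hypothetical names, a bare int stands in for struct kref, and the
sas_device_lock is represented only by comments. Unlike list_move_tail(),
the private list here ends up in reverse order, which does not matter for
teardown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct _sas_device; not the driver's layout. */
struct dev {
	int refcount;		/* stands in for struct kref */
	bool responding;
	struct dev *next;
};

static int devs_freed;		/* lets a caller observe the final put */

static void dev_put(struct dev *d)
{
	if (--d->refcount == 0) {
		free(d);
		devs_freed++;
	}
}

/* Create a device whose single reference is owned by the list it joins. */
static struct dev *dev_push(struct dev **list, bool responding)
{
	struct dev *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;
	d->refcount = 1;
	d->responding = responding;
	d->next = *list;
	*list = d;
	return d;
}

/*
 * Phase 1 (the driver holds the list lock here): unlink every entry that
 * did not respond and chain it onto a private list. The reference the
 * shared list held travels with the entry, so no get/put is needed.
 */
static struct dev *prune_unresponding(struct dev **list)
{
	struct dev *pruned = NULL;
	struct dev **link = list;

	while (*link) {
		struct dev *d = *link;

		if (!d->responding) {
			*link = d->next;
			d->next = pruned;
			pruned = d;
		} else {
			d->responding = false;	/* re-arm for the next scan */
			link = &d->next;
		}
	}
	return pruned;
}

/*
 * Phase 2 (lock dropped): nobody else can reach the private list, so the
 * slow teardown can run unlocked before the inherited reference is put.
 */
static int release_pruned(struct dev *pruned)
{
	int n = 0;

	while (pruned) {
		struct dev *d = pruned;

		pruned = d->next;
		dev_put(d);
		n++;
	}
	return n;
}
```

The point of the split is that the shared list is only ever walked with
the lock held, while the work that may sleep happens on entries that are
already unreachable to other threads.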



* [PATCH v4 2/2] mpt2sas: Refcount fw_events and fix unsafe list usage
  2015-08-14  1:48             ` [PATCH v4 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage Calvin Owens
@ 2015-08-14  1:48               ` Calvin Owens
  2015-08-25 21:06                 ` Nicholas A. Bellinger
  2015-09-04 14:35                 ` Sreekanth Reddy
  2015-08-25 21:03               ` [PATCH v4 1/2] mpt2sas: Refcount sas_device objects " Nicholas A. Bellinger
  2015-09-04 14:34               ` Sreekanth Reddy
  2 siblings, 2 replies; 52+ messages in thread
From: Calvin Owens @ 2015-08-14  1:48 UTC (permalink / raw)
  To: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan
  Cc: MPT-FusionLinux.pdl, linux-scsi, linux-kernel, kernel-team,
	calvinowens, Joe Lawrence, Christoph Hellwig

The fw_event_work struct is concurrently referenced at shutdown, so
add a refcount to protect it, and refactor the code to use it.

Additionally, refactor _scsih_fw_event_cleanup_queue() such that it
no longer iterates over the list without holding the lock, since
_firmware_event_work() concurrently deletes items from the list.

Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Calvin Owens <calvinowens@fb.com>
---
Changes in v4: None

Changes in v3:
	* Add a break condition to the REMOVE_UNRESPONDING_DEVICES fw_event,
	  whose wait loop can otherwise sleep indefinitely (5+ minutes at
	  least) during unload. I don't think anything prevented this before,
	  but taking the fw_event object off the list at the top of
	  _firmware_event_work() seems to have made it more likely to happen.

Changes in v2:
	* Squished patches 4-6 into one patch
	* Remove the fw_event from fw_event_list at the start of
	  _firmware_event_work()
	* Explicitly separate fw_event_list removal from fw_event freeing
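
The ownership rule behind this patch (one reference each for the creator,
the fw_event_list, and the queued delayed_work) can be sketched in
userspace C as below. This is a minimal single-threaded model, not the
driver code: struct event, event_add(), event_cancel_work() and
cleanup_queue() are hypothetical names, a bare int stands in for struct
kref, a flag stands in for the pending delayed_work, and the
fw_event_lock is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct fw_event_work. */
struct event {
	int refcount;		/* stands in for struct kref */
	bool work_pending;	/* stands in for a queued delayed_work */
	struct event *next;
};

static struct event *queue_head;	/* stands in for ioc->fw_event_list */
static int events_freed;		/* lets a caller observe the final put */

static struct event *event_alloc(void)
{
	struct event *ev = calloc(1, sizeof(*ev));

	if (ev)
		ev->refcount = 1;	/* the creator's reference */
	return ev;
}

static void event_get(struct event *ev)
{
	ev->refcount++;
}

static void event_put(struct event *ev)
{
	if (--ev->refcount == 0) {
		free(ev);
		events_freed++;
	}
}

/* Queue an event: one reference for the list, one for the pending work. */
static void event_add(struct event *ev)
{
	struct event **link = &queue_head;

	event_get(ev);			/* reference held by the list */
	while (*link)
		link = &(*link)->next;
	*link = ev;

	event_get(ev);			/* reference held by the work item */
	ev->work_pending = true;
}

/* Pop the head; the caller inherits the reference the list was holding. */
static struct event *event_dequeue(void)
{
	struct event *ev = queue_head;

	if (ev) {
		queue_head = ev->next;
		ev->next = NULL;
	}
	return ev;
}

/* Models cancel_delayed_work_sync(): true if the work had not run yet. */
static bool event_cancel_work(struct event *ev)
{
	bool was_pending = ev->work_pending;

	ev->work_pending = false;
	return was_pending;
}

static void cleanup_queue(void)
{
	struct event *ev;

	while ((ev = event_dequeue())) {
		if (event_cancel_work(ev))
			event_put(ev);	/* the work never ran: drop its ref */
		event_put(ev);		/* drop the ref inherited from the list */
	}
}
```

A caller would allocate, call event_add(), and then drop its own
reference right away, which is what the fw_event_work_put() calls added
after each _scsih_fw_event_add() in this patch do. Whichever holder puts
last frees the object, so neither the cleanup path nor the work function
has to know who else is still using it.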

drivers/scsi/mpt2sas/mpt2sas_scsih.c | 112 ++++++++++++++++++++++++++++-------
 1 file changed, 91 insertions(+), 21 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 5eca3a4..c0ff55b 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -176,9 +176,37 @@ struct fw_event_work {
 	u8			VP_ID;
 	u8			ignore;
 	u16			event;
+	struct kref		refcount;
 	char			event_data[0] __aligned(4);
 };
 
+static void fw_event_work_free(struct kref *r)
+{
+	kfree(container_of(r, struct fw_event_work, refcount));
+}
+
+static void fw_event_work_get(struct fw_event_work *fw_work)
+{
+	kref_get(&fw_work->refcount);
+}
+
+static void fw_event_work_put(struct fw_event_work *fw_work)
+{
+	kref_put(&fw_work->refcount, fw_event_work_free);
+}
+
+static struct fw_event_work *alloc_fw_event_work(int len)
+{
+	struct fw_event_work *fw_event;
+
+	fw_event = kzalloc(sizeof(*fw_event) + len, GFP_ATOMIC);
+	if (!fw_event)
+		return NULL;
+
+	kref_init(&fw_event->refcount);
+	return fw_event;
+}
+
 /* raid transport support */
 static struct raid_template *mpt2sas_raid_template;
 
@@ -2872,36 +2900,39 @@ _scsih_fw_event_add(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work *fw_event)
 		return;
 
 	spin_lock_irqsave(&ioc->fw_event_lock, flags);
+	fw_event_work_get(fw_event);
 	list_add_tail(&fw_event->list, &ioc->fw_event_list);
 	INIT_DELAYED_WORK(&fw_event->delayed_work, _firmware_event_work);
+	fw_event_work_get(fw_event);
 	queue_delayed_work(ioc->firmware_event_thread,
 	    &fw_event->delayed_work, 0);
 	spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
 }
 
 /**
- * _scsih_fw_event_free - delete fw_event
+ * _scsih_fw_event_del_from_list - delete fw_event from the list
  * @ioc: per adapter object
  * @fw_event: object describing the event
  * Context: This function will acquire ioc->fw_event_lock.
  *
- * This removes firmware event object from link list, frees associated memory.
+ * If the fw_event is on the fw_event_list, remove it and do a put.
  *
  * Return nothing.
  */
 static void
-_scsih_fw_event_free(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work
+_scsih_fw_event_del_from_list(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work
     *fw_event)
 {
 	unsigned long flags;
 
 	spin_lock_irqsave(&ioc->fw_event_lock, flags);
-	list_del(&fw_event->list);
-	kfree(fw_event);
+	if (!list_empty(&fw_event->list)) {
+		list_del_init(&fw_event->list);
+		fw_event_work_put(fw_event);
+	}
 	spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
 }
 
-
 /**
  * _scsih_error_recovery_delete_devices - remove devices not responding
  * @ioc: per adapter object
@@ -2916,13 +2947,14 @@ _scsih_error_recovery_delete_devices(struct MPT2SAS_ADAPTER *ioc)
 	if (ioc->is_driver_loading)
 		return;
 
-	fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+	fw_event = alloc_fw_event_work(0);
 	if (!fw_event)
 		return;
 
 	fw_event->event = MPT2SAS_REMOVE_UNRESPONDING_DEVICES;
 	fw_event->ioc = ioc;
 	_scsih_fw_event_add(ioc, fw_event);
+	fw_event_work_put(fw_event);
 }
 
 /**
@@ -2936,12 +2968,29 @@ mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc)
 {
 	struct fw_event_work *fw_event;
 
-	fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+	fw_event = alloc_fw_event_work(0);
 	if (!fw_event)
 		return;
 	fw_event->event = MPT2SAS_PORT_ENABLE_COMPLETE;
 	fw_event->ioc = ioc;
 	_scsih_fw_event_add(ioc, fw_event);
+	fw_event_work_put(fw_event);
+}
+
+static struct fw_event_work *dequeue_next_fw_event(struct MPT2SAS_ADAPTER *ioc)
+{
+	unsigned long flags;
+	struct fw_event_work *fw_event = NULL;
+
+	spin_lock_irqsave(&ioc->fw_event_lock, flags);
+	if (!list_empty(&ioc->fw_event_list)) {
+		fw_event = list_first_entry(&ioc->fw_event_list,
+				struct fw_event_work, list);
+		list_del_init(&fw_event->list);
+	}
+	spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
+
+	return fw_event;
 }
 
 /**
@@ -2956,17 +3005,25 @@ mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc)
 static void
 _scsih_fw_event_cleanup_queue(struct MPT2SAS_ADAPTER *ioc)
 {
-	struct fw_event_work *fw_event, *next;
+	struct fw_event_work *fw_event;
 
 	if (list_empty(&ioc->fw_event_list) ||
 	     !ioc->firmware_event_thread || in_interrupt())
 		return;
 
-	list_for_each_entry_safe(fw_event, next, &ioc->fw_event_list, list) {
-		if (cancel_delayed_work_sync(&fw_event->delayed_work)) {
-			_scsih_fw_event_free(ioc, fw_event);
-			continue;
-		}
+	while ((fw_event = dequeue_next_fw_event(ioc))) {
+		/*
+		 * Wait on the fw_event to complete. If this returns 1, then
+		 * the event was never executed, and we need a put for the
+		 * reference the delayed_work had on the fw_event.
+		 *
+		 * If it did execute, we wait for it to finish, and the put will
+		 * happen from _firmware_event_work()
+		 */
+		if (cancel_delayed_work_sync(&fw_event->delayed_work))
+			fw_event_work_put(fw_event);
+
+		fw_event_work_put(fw_event);
 	}
 }
 
@@ -4447,13 +4504,14 @@ _scsih_send_event_to_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
 {
 	struct fw_event_work *fw_event;
 
-	fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+	fw_event = alloc_fw_event_work(0);
 	if (!fw_event)
 		return;
 	fw_event->event = MPT2SAS_TURN_ON_PFA_LED;
 	fw_event->device_handle = handle;
 	fw_event->ioc = ioc;
 	_scsih_fw_event_add(ioc, fw_event);
+	fw_event_work_put(fw_event);
 }
 
 /**
@@ -7554,17 +7612,27 @@ _firmware_event_work(struct work_struct *work)
 	    struct fw_event_work, delayed_work.work);
 	struct MPT2SAS_ADAPTER *ioc = fw_event->ioc;
 
+	_scsih_fw_event_del_from_list(ioc, fw_event);
+
 	/* the queue is being flushed so ignore this event */
-	if (ioc->remove_host ||
-	    ioc->pci_error_recovery) {
-		_scsih_fw_event_free(ioc, fw_event);
+	if (ioc->remove_host || ioc->pci_error_recovery) {
+		fw_event_work_put(fw_event);
 		return;
 	}
 
 	switch (fw_event->event) {
 	case MPT2SAS_REMOVE_UNRESPONDING_DEVICES:
-		while (scsi_host_in_recovery(ioc->shost) || ioc->shost_recovery)
+		while (scsi_host_in_recovery(ioc->shost) ||
+				ioc->shost_recovery) {
+			/*
+			 * If we're unloading, bail. Otherwise, this can become
+			 * an infinite loop.
+			 */
+			if (ioc->remove_host)
+				goto out;
+
 			ssleep(1);
+		}
 		_scsih_remove_unresponding_sas_devices(ioc);
 		_scsih_scan_for_devices_after_reset(ioc);
 		break;
@@ -7613,7 +7681,8 @@ _firmware_event_work(struct work_struct *work)
 		_scsih_sas_ir_operation_status_event(ioc, fw_event);
 		break;
 	}
-	_scsih_fw_event_free(ioc, fw_event);
+out:
+	fw_event_work_put(fw_event);
 }
 
 /**
@@ -7751,7 +7820,7 @@ mpt2sas_scsih_event_callback(struct MPT2SAS_ADAPTER *ioc, u8 msix_index,
 	}
 
 	sz = le16_to_cpu(mpi_reply->EventDataLength) * 4;
-	fw_event = kzalloc(sizeof(*fw_event) + sz, GFP_ATOMIC);
+	fw_event = alloc_fw_event_work(sz);
 	if (!fw_event) {
 		printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
 		    ioc->name, __FILE__, __LINE__, __func__);
@@ -7764,6 +7833,7 @@ mpt2sas_scsih_event_callback(struct MPT2SAS_ADAPTER *ioc, u8 msix_index,
 	fw_event->VP_ID = mpi_reply->VP_ID;
 	fw_event->event = event;
 	_scsih_fw_event_add(ioc, fw_event);
+	fw_event_work_put(fw_event);
 	return;
 }
 
-- 
2.5.0



* Re: [PATCH v4 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage
  2015-08-14  1:48             ` [PATCH v4 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage Calvin Owens
  2015-08-14  1:48               ` [PATCH v4 2/2] mpt2sas: Refcount fw_events " Calvin Owens
@ 2015-08-25 21:03               ` Nicholas A. Bellinger
  2015-09-04 14:34               ` Sreekanth Reddy
  2 siblings, 0 replies; 52+ messages in thread
From: Nicholas A. Bellinger @ 2015-08-25 21:03 UTC (permalink / raw)
  To: Calvin Owens
  Cc: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan, MPT-FusionLinux.pdl, linux-scsi, linux-kernel,
	kernel-team, Joe Lawrence, Christoph Hellwig, Bart Van Assche

Hi Calvin,

On Thu, 2015-08-13 at 18:48 -0700, Calvin Owens wrote:
> These objects can be referenced concurrently throughout the driver, we
> need a way to make sure threads can't delete them out from under each
> other. This patch adds the refcount, and refactors the code to use it.
> 
> Additionally, we cannot iterate over the sas_device_list without
> holding the lock, or we risk corrupting random memory if items are
> added or deleted as we iterate. This patch refactors _scsih_probe_sas()
> to use the sas_device_list in a safe way.
> 
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Bart Van Assche <bart.vanassche@sandisk.com>
> Cc: Joe Lawrence <joe.lawrence@stratus.com>
> Signed-off-by: Calvin Owens <calvinowens@fb.com>
> ---
> Changes in v4:
> 	* Fix lack of put() in non-SATA case in _scsih_change_queue_depth()
> 	* Fix lack of put() in the non-error case in _scsih_check_device()
> 	* Add missing put() at bottom of _scsih_add_device()
> 	* Add put for ->hostdata pointer in _scsih_target_destroy() for the
> 	  get() in _scsih_target_alloc()
> 
> Changes in v3:
> 	* Drop the sas_device_lock while enabling devices, and leave the
> 	  sas_device object on the list, since it may need to be looked up there
> 	  while it is being enabled.
> 	* Drop put() in _scsih_add_device(), because the ->hostdata now keeps a
> 	  reference (this was an oversight in v2).
> 	* Be consistent about calling sas_device_put() while holding the
> 	  sas_device_lock where feasible.
> 	* Take and assert_spin_locked() on the sas_device_lock from the newly
> 	  added __get_sdev_from_target(), add wrapper similar to other lookups
> 	  for callers which do not explicitly take the lock.
> 
> Changes in v2:
> 	* Squished patches 1-3 into this one
> 	* s/BUG_ON(!spin_is_locked/assert_spin_locked/g
> 	* Store a pointer to the sas_device object in ->hostdata, to eliminate
> 	  the need for several lookups on the lists.
> 
>  drivers/scsi/mpt2sas/mpt2sas_base.h      |  22 +-
>  drivers/scsi/mpt2sas/mpt2sas_scsih.c     | 480 +++++++++++++++++++++----------
>  drivers/scsi/mpt2sas/mpt2sas_transport.c |  12 +-
>  3 files changed, 360 insertions(+), 154 deletions(-)
> 

Looks good.

Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org>
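
For anyone skimming the thread, the get/put discipline described in the
commit message can be sketched in userspace roughly as follows. This is
only an illustration under stated assumptions, not the driver code:
demo_device, demo_device_alloc(), demo_device_get() and demo_device_put()
are made-up names, and a C11 atomic counter stands in for the kernel's
struct kref.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct _sas_device; not driver code. */
struct demo_device {
	atomic_int refcount;	/* plays the role of struct kref */
	int handle;
};

static int demo_frees;		/* counts releases, for demonstration only */

/* Allocate an object; the creator holds the initial reference. */
static struct demo_device *demo_device_alloc(int handle)
{
	struct demo_device *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;
	atomic_init(&d->refcount, 1);
	d->handle = handle;
	return d;
}

/* Take an extra reference; the caller must already hold a valid one. */
static void demo_device_get(struct demo_device *d)
{
	int old = atomic_fetch_add(&d->refcount, 1);

	assert(old > 0);	/* a get() on an already released object is a bug */
	(void)old;
}

/* Drop a reference; returns 1 if this put released the object. */
static int demo_device_put(struct demo_device *d)
{
	if (atomic_fetch_sub(&d->refcount, 1) == 1) {
		free(d);
		demo_frees++;
		return 1;
	}
	return 0;
}
```

The object is released only by the put that drops the count to zero, so
every path that takes a reference has to pair it with exactly one put.
The missing-put fixes listed in the v4 changelog are about that pairing.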


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 2/2] mpt2sas: Refcount fw_events and fix unsafe list usage
  2015-08-14  1:48               ` [PATCH v4 2/2] mpt2sas: Refcount fw_events " Calvin Owens
@ 2015-08-25 21:06                 ` Nicholas A. Bellinger
  2015-09-04 14:35                 ` Sreekanth Reddy
  1 sibling, 0 replies; 52+ messages in thread
From: Nicholas A. Bellinger @ 2015-08-25 21:06 UTC (permalink / raw)
  To: Calvin Owens
  Cc: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan, MPT-FusionLinux.pdl, linux-scsi, linux-kernel,
	kernel-team, Joe Lawrence, Christoph Hellwig

On Thu, 2015-08-13 at 18:48 -0700, Calvin Owens wrote:
> The fw_event_work struct is concurrently referenced at shutdown, so
> add a refcount to protect it, and refactor the code to use it.
> 
> Additionally, refactor _scsih_fw_event_cleanup_queue() such that it
> no longer iterates over the list without holding the lock, since
> _firmware_event_work() concurrently deletes items from the list.
> 
> Cc: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Calvin Owens <calvinowens@fb.com>
> ---
> Changes in v4: None
> 
> Changes in v3:
> 	* Add a break condition to the REMOVE_UNRESPONDING_DEVICES fw_event,
> 	  which can loop over a sleep forever (5m+ at least) at unloading. I
> 	  don't think anything prevented this before, but taking the fw_event
> 	  object off the list at the top of _firmware_event_work() seems to have
> 	  made it more likely to happen.
> 
> Changes in v2:
> 	* Squished patches 4-6 into one patch
> 	* Remove the fw_event from fw_event_list at the start of
> 	  _firmware_event_work()
> 	* Explicitly separate fw_event_list removal from fw_event freeing
> 
>  drivers/scsi/mpt2sas/mpt2sas_scsih.c | 112 ++++++++++++++++++++++++++++-------
>  1 file changed, 91 insertions(+), 21 deletions(-)
> 

Looks good.

Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org>
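
As a rough illustration of the "never walk the list unlocked" point in
the commit message, one common way to drain a shared queue is to detach
a single entry under the lock and only then work on it. The sketch below
is a userspace approximation and makes several assumptions: demo_event,
demo_queue and the demo_* helpers are hypothetical names, and a C11
atomic_flag spinlock stands in for the driver's fw_event_lock. It is not
claimed to match what the patch itself does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for fw_event_work and its list; not driver code. */
struct demo_event {
	struct demo_event *next;
	int id;
};

struct demo_queue {
	atomic_flag lock;	/* minimal spinlock, assumed for this sketch */
	struct demo_event *head;
};

static void demo_lock(struct demo_queue *q)
{
	while (atomic_flag_test_and_set_explicit(&q->lock,
						 memory_order_acquire))
		;	/* spin until the flag was clear */
}

static void demo_unlock(struct demo_queue *q)
{
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

static void demo_queue_init(struct demo_queue *q)
{
	atomic_flag_clear(&q->lock);
	q->head = NULL;
}

/* Add an event at the head, under the lock. Returns 0, or -1 on OOM. */
static int demo_queue_push(struct demo_queue *q, int id)
{
	struct demo_event *ev = malloc(sizeof(*ev));

	if (!ev)
		return -1;
	ev->id = id;
	demo_lock(q);
	ev->next = q->head;
	q->head = ev;
	demo_unlock(q);
	return 0;
}

/* Detach the first event while holding the lock; NULL if the queue is empty. */
static struct demo_event *demo_queue_pop(struct demo_queue *q)
{
	struct demo_event *ev;

	demo_lock(q);
	ev = q->head;
	if (ev)
		q->head = ev->next;
	demo_unlock(q);
	return ev;
}

/* Drain the queue without iterating it unlocked. Returns events freed. */
static int demo_queue_cleanup(struct demo_queue *q)
{
	struct demo_event *ev;
	int n = 0;

	while ((ev = demo_queue_pop(q)) != NULL) {
		/* ev is off the list now, so no other thread can reach it */
		free(ev);
		n++;
	}
	return n;
}
```

Because each entry is unlinked before the lock is dropped, a concurrent
worker that removes entries can no longer invalidate a pointer the
cleanup loop is still holding.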


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 0/2] Fixes for memory corruption in mpt2sas
  2015-08-14  1:48           ` [PATCH v4 0/2] Fixes for memory corruption in mpt2sas Calvin Owens
  2015-08-14  1:48             ` [PATCH v4 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage Calvin Owens
@ 2015-08-25 21:21             ` Nicholas A. Bellinger
  1 sibling, 0 replies; 52+ messages in thread
From: Nicholas A. Bellinger @ 2015-08-25 21:21 UTC (permalink / raw)
  To: Calvin Owens
  Cc: Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan, MPT-FusionLinux.pdl, linux-scsi, linux-kernel,
	kernel-team, Joe Lawrence

On Thu, 2015-08-13 at 18:48 -0700, Calvin Owens wrote:
> Hello all,
> 
> This patchset attempts to address problems we've been having with
> panics due to memory corruption from the mpt2sas driver.
> 
> Thanks,
> Calvin
> 
> 
> [PATCH v4 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list
> [PATCH v4 2/2] mpt2sas: Refcount fw_events and fix unsafe list usage
> 
> Total diffstat:
>  drivers/scsi/mpt2sas/mpt2sas_base.h      |  22 +-
>  drivers/scsi/mpt2sas/mpt2sas_scsih.c     | 592 ++++++++++++++++++++++---------
>  drivers/scsi/mpt2sas/mpt2sas_transport.c |  12 +-
>  3 files changed, 451 insertions(+), 175 deletions(-)
> 
> Diff showing changes v3 => v4:
> 	http://jcalvinowens.github.io/stuff/mpt2sas-patchset-v3v4.patch
> 
> Diff showing changes v2 => v3:
> 	http://jcalvinowens.github.io/stuff/mpt2sas-patchset-v2v3.patch
> 
> Diff showing changes v1 => v2:
> 	http://jcalvinowens.github.io/stuff/mpt2sas-patchset-v1v2.patch
> --

(Adding JEJB to CC)

James, please consider picking this up for v4.3-rc1.

Btw, I'm seeing the same type of issues on mpt3sas, and unless someone
at Avago is already working on a similar patch series, I'll end up
forward porting these to mpt3sas code.

Thank you,

--nab


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage
  2015-08-14  1:48             ` [PATCH v4 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage Calvin Owens
  2015-08-14  1:48               ` [PATCH v4 2/2] mpt2sas: Refcount fw_events " Calvin Owens
  2015-08-25 21:03               ` [PATCH v4 1/2] mpt2sas: Refcount sas_device objects " Nicholas A. Bellinger
@ 2015-09-04 14:34               ` Sreekanth Reddy
  2 siblings, 0 replies; 52+ messages in thread
From: Sreekanth Reddy @ 2015-09-04 14:34 UTC (permalink / raw)
  To: Calvin Owens
  Cc: MPT-FusionLinux.pdl, linux-scsi, linux-kernel, kernel-team,
	Joe Lawrence, Christoph Hellwig, Bart Van Assche,
	Krishnaraddi Mankani, Sathya Prakash, Chaitra Basappa

On Fri, Aug 14, 2015 at 7:18 AM, Calvin Owens <calvinowens@fb.com> wrote:
> These objects can be referenced concurrently throughout the driver, we
> need a way to make sure threads can't delete them out from under each
> other. This patch adds the refcount, and refactors the code to use it.
>
> Additionally, we cannot iterate over the sas_device_list without
> holding the lock, or we risk corrupting random memory if items are
> added or deleted as we iterate. This patch refactors _scsih_probe_sas()
> to use the sas_device_list in a safe way.
>
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Bart Van Assche <bart.vanassche@sandisk.com>
> Cc: Joe Lawrence <joe.lawrence@stratus.com>
> Signed-off-by: Calvin Owens <calvinowens@fb.com>

Tested-by: Chaitra Basappa <chaitra.basappa@avagotech.com>
Acked-by: Sreekanth Reddy <sreekanth.reddy@avagotech.com>
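
For reference while reading the _scsih_sas_device_remove() hunk quoted
below: the list_empty() check before list_del_init() makes removal
idempotent, so the list's reference is dropped at most once even if two
paths race to remove the same entry. A minimal userspace sketch of that
idea follows. It assumes a hand-rolled list modelled on the kernel's
list_head, all demo_* names are hypothetical, and the plain int refcount
assumes the caller already holds the list lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct demo_list {
	struct demo_list *prev, *next;
};

static void demo_list_init(struct demo_list *n)
{
	n->prev = n;
	n->next = n;
}

static bool demo_list_empty(const struct demo_list *n)
{
	return n->next == n;
}

static void demo_list_add_tail(struct demo_list *n, struct demo_list *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink the node and point it back at itself, like list_del_init(). */
static void demo_list_del_init(struct demo_list *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	demo_list_init(n);
}

struct demo_device {
	struct demo_list list;
	int refcount;	/* plain int: caller is assumed to hold the list lock */
};

/*
 * Unlink the device and drop the reference held by the list.
 * Returns true only for the call that actually unlinked it.
 */
static bool demo_device_remove(struct demo_device *d)
{
	if (demo_list_empty(&d->list))
		return false;	/* already removed by another path */
	assert(d->refcount > 0);
	demo_list_del_init(&d->list);
	d->refcount--;
	return true;
}
```

A second call finds the node pointing at itself and does nothing, which
is why the sketch relies on the del_init variant rather than a plain
unlink that leaves stale pointers behind.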

> ---
> Changes in v4:
>         * Fix lack of put() in non-SATA case in _scsih_change_queue_depth()
>         * Fix lack of put() in the non-error case in _scsih_check_device()
>         * Add missing put() at bottom of _scsih_add_device()
>         * Add put for ->hostdata pointer in _scsih_target_destroy() for the
>           get() in _scsih_target_alloc()
>
> Changes in v3:
>         * Drop the sas_device_lock while enabling devices, and leave the
>           sas_device object on the list, since it may need to be looked up there
>           while it is being enabled.
>         * Drop put() in _scsih_add_device(), because the ->hostdata now keeps a
>           reference (this was an oversight in v2).
>         * Be consistent about calling sas_device_put() while holding the
>           sas_device_lock where feasible.
>         * Take and assert_spin_locked() on the sas_device_lock from the newly
>           added __get_sdev_from_target(), add wrapper similar to other lookups
>           for callers which do not explicitly take the lock.
>
> Changes in v2:
>         * Squished patches 1-3 into this one
>         * s/BUG_ON(!spin_is_locked/assert_spin_locked/g
>         * Store a pointer to the sas_device object in ->hostdata, to eliminate
>           the need for several lookups on the lists.
>
>  drivers/scsi/mpt2sas/mpt2sas_base.h      |  22 +-
>  drivers/scsi/mpt2sas/mpt2sas_scsih.c     | 480 +++++++++++++++++++++----------
>  drivers/scsi/mpt2sas/mpt2sas_transport.c |  12 +-
>  3 files changed, 360 insertions(+), 154 deletions(-)
>
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
> index caff8d1..78f41ac 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_base.h
> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
> @@ -238,6 +238,7 @@
>   * @flags: MPT_TARGET_FLAGS_XXX flags
>   * @deleted: target flaged for deletion
>   * @tm_busy: target is busy with TM request.
> + * @sdev: The sas_device associated with this target
>   */
>  struct MPT2SAS_TARGET {
>         struct scsi_target *starget;
> @@ -248,6 +249,7 @@ struct MPT2SAS_TARGET {
>         u32     flags;
>         u8      deleted;
>         u8      tm_busy;
> +       struct _sas_device *sdev;
>  };
>
>
> @@ -376,8 +378,24 @@ struct _sas_device {
>         u8      phy;
>         u8      responding;
>         u8      pfa_led_on;
> +       struct kref refcount;
>  };
>
> +static inline void sas_device_get(struct _sas_device *s)
> +{
> +       kref_get(&s->refcount);
> +}
> +
> +static inline void sas_device_free(struct kref *r)
> +{
> +       kfree(container_of(r, struct _sas_device, refcount));
> +}
> +
> +static inline void sas_device_put(struct _sas_device *s)
> +{
> +       kref_put(&s->refcount, sas_device_free);
> +}
> +
>  /**
>   * struct _raid_device - raid volume link list
>   * @list: sas device list
> @@ -1095,7 +1113,9 @@ struct _sas_node *mpt2sas_scsih_expander_find_by_handle(struct MPT2SAS_ADAPTER *
>      u16 handle);
>  struct _sas_node *mpt2sas_scsih_expander_find_by_sas_address(struct MPT2SAS_ADAPTER
>      *ioc, u64 sas_address);
> -struct _sas_device *mpt2sas_scsih_sas_device_find_by_sas_address(
> +struct _sas_device *mpt2sas_get_sdev_by_addr(
> +    struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
> +struct _sas_device *__mpt2sas_get_sdev_by_addr(
>      struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
>
>  void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc);
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> index 3f26147..5eca3a4 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> +++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> @@ -526,8 +526,61 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
>         }
>  }
>
> +static struct _sas_device *
> +__mpt2sas_get_sdev_from_target(struct MPT2SAS_ADAPTER *ioc,
> +               struct MPT2SAS_TARGET *tgt_priv)
> +{
> +       struct _sas_device *ret;
> +
> +       assert_spin_locked(&ioc->sas_device_lock);
> +
> +       ret = tgt_priv->sdev;
> +       if (ret)
> +               sas_device_get(ret);
> +
> +       return ret;
> +}
> +
> +static struct _sas_device *
> +mpt2sas_get_sdev_from_target(struct MPT2SAS_ADAPTER *ioc,
> +               struct MPT2SAS_TARGET *tgt_priv)
> +{
> +       struct _sas_device *ret;
> +       unsigned long flags;
> +
> +       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> +       ret = __mpt2sas_get_sdev_from_target(ioc, tgt_priv);
> +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +
> +       return ret;
> +}
> +
> +
> +struct _sas_device *
> +__mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
> +    u64 sas_address)
> +{
> +       struct _sas_device *sas_device;
> +
> +       assert_spin_locked(&ioc->sas_device_lock);
> +
> +       list_for_each_entry(sas_device, &ioc->sas_device_list, list)
> +               if (sas_device->sas_address == sas_address)
> +                       goto found_device;
> +
> +       list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
> +               if (sas_device->sas_address == sas_address)
> +                       goto found_device;
> +
> +       return NULL;
> +
> +found_device:
> +       sas_device_get(sas_device);
> +       return sas_device;
> +}
> +
>  /**
> - * mpt2sas_scsih_sas_device_find_by_sas_address - sas device search
> + * mpt2sas_get_sdev_by_addr - sas device search
>   * @ioc: per adapter object
>   * @sas_address: sas address
>   * Context: Calling function should acquire ioc->sas_device_lock
> @@ -536,24 +589,44 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
>   * object.
>   */
>  struct _sas_device *
> -mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> +mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
>      u64 sas_address)
>  {
>         struct _sas_device *sas_device;
> +       unsigned long flags;
> +
> +       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> +                       sas_address);
> +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +
> +       return sas_device;
> +}
> +
> +static struct _sas_device *
> +__mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> +{
> +       struct _sas_device *sas_device;
> +
> +       assert_spin_locked(&ioc->sas_device_lock);
>
>         list_for_each_entry(sas_device, &ioc->sas_device_list, list)
> -               if (sas_device->sas_address == sas_address)
> -                       return sas_device;
> +               if (sas_device->handle == handle)
> +                       goto found_device;
>
>         list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
> -               if (sas_device->sas_address == sas_address)
> -                       return sas_device;
> +               if (sas_device->handle == handle)
> +                       goto found_device;
>
>         return NULL;
> +
> +found_device:
> +       sas_device_get(sas_device);
> +       return sas_device;
>  }
>
>  /**
> - * _scsih_sas_device_find_by_handle - sas device search
> + * mpt2sas_get_sdev_by_handle - sas device search
>   * @ioc: per adapter object
>   * @handle: sas device handle (assigned by firmware)
>   * Context: Calling function should acquire ioc->sas_device_lock
> @@ -562,19 +635,16 @@ mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
>   * object.
>   */
>  static struct _sas_device *
> -_scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> +mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>  {
>         struct _sas_device *sas_device;
> +       unsigned long flags;
>
> -       list_for_each_entry(sas_device, &ioc->sas_device_list, list)
> -               if (sas_device->handle == handle)
> -                       return sas_device;
> -
> -       list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
> -               if (sas_device->handle == handle)
> -                       return sas_device;
> +       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> -       return NULL;
> +       return sas_device;
>  }
>
>  /**
> @@ -583,7 +653,7 @@ _scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>   * @sas_device: the sas_device object
>   * Context: This function will acquire ioc->sas_device_lock.
>   *
> - * Removing object and freeing associated memory from the ioc->sas_device_list.
> + * If sas_device is on the list, remove it and decrement its reference count.
>   */
>  static void
>  _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
> @@ -594,9 +664,15 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
>         if (!sas_device)
>                 return;
>
> +       /*
> +        * The lock serializes access to the list, but we still need to verify
> +        * that nobody removed the entry while we were waiting on the lock.
> +        */
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       list_del(&sas_device->list);
> -       kfree(sas_device);
> +       if (!list_empty(&sas_device->list)) {
> +               list_del_init(&sas_device->list);
> +               sas_device_put(sas_device);
> +       }
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>  }
>
> @@ -620,6 +696,7 @@ _scsih_sas_device_add(struct MPT2SAS_ADAPTER *ioc,
>             sas_device->handle, (unsigned long long)sas_device->sas_address));
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> +       sas_device_get(sas_device);
>         list_add_tail(&sas_device->list, &ioc->sas_device_list);
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> @@ -659,6 +736,7 @@ _scsih_sas_device_init_add(struct MPT2SAS_ADAPTER *ioc,
>             sas_device->handle, (unsigned long long)sas_device->sas_address));
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> +       sas_device_get(sas_device);
>         list_add_tail(&sas_device->list, &ioc->sas_device_init_list);
>         _scsih_determine_boot_device(ioc, sas_device, 0);
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> @@ -1208,12 +1286,15 @@ _scsih_change_queue_depth(struct scsi_device *sdev, int qdepth)
>                 goto not_sata;
>         if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))
>                 goto not_sata;
> +
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> -          sas_device_priv_data->sas_target->sas_address);
> -       if (sas_device && sas_device->device_info &
> -           MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
> -               max_depth = MPT2SAS_SATA_QUEUE_DEPTH;
> +       sas_device = __mpt2sas_get_sdev_from_target(ioc, sas_target_priv_data);
> +       if (sas_device) {
> +               if (sas_device->device_info & MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
> +                       max_depth = MPT2SAS_SATA_QUEUE_DEPTH;
> +
> +               sas_device_put(sas_device);
> +       }
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
>   not_sata:
> @@ -1271,18 +1352,20 @@ _scsih_target_alloc(struct scsi_target *starget)
>         /* sas/sata devices */
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
>         rphy = dev_to_rphy(starget->dev.parent);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
>            rphy->identify.sas_address);
>
>         if (sas_device) {
>                 sas_target_priv_data->handle = sas_device->handle;
>                 sas_target_priv_data->sas_address = sas_device->sas_address;
> +               sas_target_priv_data->sdev = sas_device;
>                 sas_device->starget = starget;
>                 sas_device->id = starget->id;
>                 sas_device->channel = starget->channel;
>                 if (test_bit(sas_device->handle, ioc->pd_handles))
>                         sas_target_priv_data->flags |=
>                             MPT_TARGET_FLAGS_RAID_COMPONENT;
> +
>         }
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> @@ -1324,13 +1407,21 @@ _scsih_target_destroy(struct scsi_target *starget)
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
>         rphy = dev_to_rphy(starget->dev.parent);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> -          rphy->identify.sas_address);
> +       sas_device = __mpt2sas_get_sdev_from_target(ioc, sas_target_priv_data);
>         if (sas_device && (sas_device->starget == starget) &&
>             (sas_device->id == starget->id) &&
>             (sas_device->channel == starget->channel))
>                 sas_device->starget = NULL;
>
> +       if (sas_device) {
> +               /*
> +                * Corresponding get() is in _scsih_target_alloc()
> +                */
> +               sas_target_priv_data->sdev = NULL;
> +               sas_device_put(sas_device);
> +
> +               sas_device_put(sas_device);
> +       }
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
>   out:
> @@ -1386,7 +1477,7 @@ _scsih_slave_alloc(struct scsi_device *sdev)
>
>         if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
>                 spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +               sas_device = __mpt2sas_get_sdev_by_addr(ioc,
>                                 sas_target_priv_data->sas_address);
>                 if (sas_device && (sas_device->starget == NULL)) {
>                         sdev_printk(KERN_INFO, sdev,
> @@ -1394,6 +1485,10 @@ _scsih_slave_alloc(struct scsi_device *sdev)
>                              __func__, __LINE__);
>                         sas_device->starget = starget;
>                 }
> +
> +               if (sas_device)
> +                       sas_device_put(sas_device);
> +
>                 spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>         }
>
> @@ -1428,10 +1523,13 @@ _scsih_slave_destroy(struct scsi_device *sdev)
>
>         if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
>                 spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> -                  sas_target_priv_data->sas_address);
> +               sas_device = __mpt2sas_get_sdev_from_target(ioc,
> +                               sas_target_priv_data);
>                 if (sas_device && !sas_target_priv_data->num_luns)
>                         sas_device->starget = NULL;
> +
> +               if (sas_device)
> +                       sas_device_put(sas_device);
>                 spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>         }
>
> @@ -2078,7 +2176,7 @@ _scsih_slave_configure(struct scsi_device *sdev)
>         }
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
>            sas_device_priv_data->sas_target->sas_address);
>         if (!sas_device) {
>                 spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> @@ -2112,17 +2210,18 @@ _scsih_slave_configure(struct scsi_device *sdev)
>             (unsigned long long) sas_device->enclosure_logical_id,
>             sas_device->slot);
>
> +       sas_device_put(sas_device);
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>         if (!ssp_target)
>                 _scsih_display_sata_capabilities(ioc, handle, sdev);
>
> -
>         _scsih_change_queue_depth(sdev, qdepth);
>
>         if (ssp_target) {
>                 sas_read_port_mode_page(sdev);
>                 _scsih_enable_tlr(ioc, sdev);
>         }
> +
>         return 0;
>  }
>
> @@ -2509,8 +2608,7 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
>                     device_str, (unsigned long long)priv_target->sas_address);
>         } else {
>                 spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> -                   priv_target->sas_address);
> +               sas_device = __mpt2sas_get_sdev_from_target(ioc, priv_target);
>                 if (sas_device) {
>                         if (priv_target->flags &
>                             MPT_TARGET_FLAGS_RAID_COMPONENT) {
> @@ -2529,6 +2627,8 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
>                             "enclosure_logical_id(0x%016llx), slot(%d)\n",
>                            (unsigned long long)sas_device->enclosure_logical_id,
>                             sas_device->slot);
> +
> +                       sas_device_put(sas_device);
>                 }
>                 spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>         }
> @@ -2604,12 +2704,12 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
>  {
>         struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
>         struct MPT2SAS_DEVICE *sas_device_priv_data;
> -       struct _sas_device *sas_device;
> -       unsigned long flags;
> +       struct _sas_device *sas_device = NULL;
>         u16     handle;
>         int r;
>
>         struct scsi_target *starget = scmd->device->sdev_target;
> +       struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;
>
>         starget_printk(KERN_INFO, starget, "attempting device reset! "
>             "scmd(%p)\n", scmd);
> @@ -2629,12 +2729,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
>         handle = 0;
>         if (sas_device_priv_data->sas_target->flags &
>             MPT_TARGET_FLAGS_RAID_COMPONENT) {
> -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               sas_device = _scsih_sas_device_find_by_handle(ioc,
> -                  sas_device_priv_data->sas_target->handle);
> +               sas_device = mpt2sas_get_sdev_from_target(ioc,
> +                               target_priv_data);
>                 if (sas_device)
>                         handle = sas_device->volume_handle;
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>         } else
>                 handle = sas_device_priv_data->sas_target->handle;
>
> @@ -2651,6 +2749,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
>   out:
>         sdev_printk(KERN_INFO, scmd->device, "device reset: %s scmd(%p)\n",
>             ((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
> +
> +       if (sas_device)
> +               sas_device_put(sas_device);
> +
>         return r;
>  }
>
> @@ -2665,11 +2767,11 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
>  {
>         struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
>         struct MPT2SAS_DEVICE *sas_device_priv_data;
> -       struct _sas_device *sas_device;
> -       unsigned long flags;
> +       struct _sas_device *sas_device = NULL;
>         u16     handle;
>         int r;
>         struct scsi_target *starget = scmd->device->sdev_target;
> +       struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;
>
>         starget_printk(KERN_INFO, starget, "attempting target reset! "
>             "scmd(%p)\n", scmd);
> @@ -2689,12 +2791,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
>         handle = 0;
>         if (sas_device_priv_data->sas_target->flags &
>             MPT_TARGET_FLAGS_RAID_COMPONENT) {
> -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               sas_device = _scsih_sas_device_find_by_handle(ioc,
> -                  sas_device_priv_data->sas_target->handle);
> +               sas_device = mpt2sas_get_sdev_from_target(ioc,
> +                               target_priv_data);
>                 if (sas_device)
>                         handle = sas_device->volume_handle;
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>         } else
>                 handle = sas_device_priv_data->sas_target->handle;
>
> @@ -2711,6 +2811,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
>   out:
>         starget_printk(KERN_INFO, starget, "target reset: %s scmd(%p)\n",
>             ((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
> +
> +       if (sas_device)
> +               sas_device_put(sas_device);
> +
>         return r;
>  }
>
> @@ -3002,15 +3106,15 @@ _scsih_block_io_to_children_attached_to_ex(struct MPT2SAS_ADAPTER *ioc,
>
>         list_for_each_entry(mpt2sas_port,
>            &sas_expander->sas_port_list, port_list) {
> -               if (mpt2sas_port->remote_identify.device_type ==
> -                   SAS_END_DEVICE) {
> +               if (mpt2sas_port->remote_identify.device_type == SAS_END_DEVICE) {
>                         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -                       sas_device =
> -                           mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> -                          mpt2sas_port->remote_identify.sas_address);
> -                       if (sas_device)
> +                       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> +                                       mpt2sas_port->remote_identify.sas_address);
> +                       if (sas_device) {
>                                 set_bit(sas_device->handle,
> -                                   ioc->blocking_handles);
> +                                               ioc->blocking_handles);
> +                               sas_device_put(sas_device);
> +                       }
>                         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>                 }
>         }
> @@ -3080,7 +3184,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>  {
>         Mpi2SCSITaskManagementRequest_t *mpi_request;
>         u16 smid;
> -       struct _sas_device *sas_device;
> +       struct _sas_device *sas_device = NULL;
>         struct MPT2SAS_TARGET *sas_target_priv_data = NULL;
>         u64 sas_address = 0;
>         unsigned long flags;
> @@ -3110,7 +3214,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>                 return;
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
>         if (sas_device && sas_device->starget &&
>              sas_device->starget->hostdata) {
>                 sas_target_priv_data = sas_device->starget->hostdata;
> @@ -3131,14 +3235,14 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>         if (!smid) {
>                 delayed_tr = kzalloc(sizeof(*delayed_tr), GFP_ATOMIC);
>                 if (!delayed_tr)
> -                       return;
> +                       goto out;
>                 INIT_LIST_HEAD(&delayed_tr->list);
>                 delayed_tr->handle = handle;
>                 list_add_tail(&delayed_tr->list, &ioc->delayed_tr_list);
>                 dewtprintk(ioc, printk(MPT2SAS_INFO_FMT
>                     "DELAYED:tr:handle(0x%04x), (open)\n",
>                     ioc->name, handle));
> -               return;
> +               goto out;
>         }
>
>         dewtprintk(ioc, printk(MPT2SAS_INFO_FMT "tr_send:handle(0x%04x), "
> @@ -3150,6 +3254,9 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>         mpi_request->DevHandle = cpu_to_le16(handle);
>         mpi_request->TaskType = MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET;
>         mpt2sas_base_put_smid_hi_priority(ioc, smid);
> +out:
> +       if (sas_device)
> +               sas_device_put(sas_device);
>  }
>
>
> @@ -4068,7 +4175,6 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
>         char *desc_scsi_state = ioc->tmp_string;
>         u32 log_info = le32_to_cpu(mpi_reply->IOCLogInfo);
>         struct _sas_device *sas_device = NULL;
> -       unsigned long flags;
>         struct scsi_target *starget = scmd->device->sdev_target;
>         struct MPT2SAS_TARGET *priv_target = starget->hostdata;
>         char *device_str = NULL;
> @@ -4200,9 +4306,7 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
>                 printk(MPT2SAS_WARN_FMT "\t%s wwid(0x%016llx)\n", ioc->name,
>                     device_str, (unsigned long long)priv_target->sas_address);
>         } else {
> -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> -                   priv_target->sas_address);
> +               sas_device = mpt2sas_get_sdev_from_target(ioc, priv_target);
>                 if (sas_device) {
>                         printk(MPT2SAS_WARN_FMT "\tsas_address(0x%016llx), "
>                             "phy(%d)\n", ioc->name, sas_device->sas_address,
> @@ -4211,8 +4315,9 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
>                             "\tenclosure_logical_id(0x%016llx), slot(%d)\n",
>                             ioc->name, sas_device->enclosure_logical_id,
>                             sas_device->slot);
> +
> +                       sas_device_put(sas_device);
>                 }
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>         }
>
>         printk(MPT2SAS_WARN_FMT "\thandle(0x%04x), ioc_status(%s)(0x%04x), "
> @@ -4259,7 +4364,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>         Mpi2SepRequest_t mpi_request;
>         struct _sas_device *sas_device;
>
> -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> +       sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
>         if (!sas_device)
>                 return;
>
> @@ -4274,7 +4379,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>             &mpi_request)) != 0) {
>                 printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n", ioc->name,
>                 __FILE__, __LINE__, __func__);
> -               return;
> +               goto out;
>         }
>         sas_device->pfa_led_on = 1;
>
> @@ -4284,8 +4389,10 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>                  "enclosure_processor: ioc_status (0x%04x), loginfo(0x%08x)\n",
>                  ioc->name, le16_to_cpu(mpi_reply.IOCStatus),
>                  le32_to_cpu(mpi_reply.IOCLogInfo)));
> -               return;
> +               goto out;
>         }
> +out:
> +       sas_device_put(sas_device);
>  }
>
>  /**
> @@ -4370,19 +4477,17 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>
>         /* only handle non-raid devices */
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
>         if (!sas_device) {
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -               return;
> +               goto out_unlock;
>         }
>         starget = sas_device->starget;
>         sas_target_priv_data = starget->hostdata;
>
>         if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_RAID_COMPONENT) ||
> -          ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))) {
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -               return;
> -       }
> +          ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)))
> +               goto out_unlock;
> +
>         starget_printk(KERN_WARNING, starget, "predicted fault\n");
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> @@ -4396,7 +4501,7 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>         if (!event_reply) {
>                 printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
>                     ioc->name, __FILE__, __LINE__, __func__);
> -               return;
> +               goto out;
>         }
>
>         event_reply->Function = MPI2_FUNCTION_EVENT_NOTIFICATION;
> @@ -4413,6 +4518,14 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>         event_data->SASAddress = cpu_to_le64(sas_target_priv_data->sas_address);
>         mpt2sas_ctl_add_to_event_log(ioc, event_reply);
>         kfree(event_reply);
> +out:
> +       if (sas_device)
> +               sas_device_put(sas_device);
> +       return;
> +
> +out_unlock:
> +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +       goto out;
>  }
>
>  /**
> @@ -5148,14 +5261,13 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
>         sas_address = le64_to_cpu(sas_device_pg0.SASAddress);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
>             sas_address);
>
>         if (!sas_device) {
>                 printk(MPT2SAS_ERR_FMT "device is not present "
>                     "handle(0x%04x), no sas_device!!!\n", ioc->name, handle);
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -               return;
> +               goto out_unlock;
>         }
>
>         if (unlikely(sas_device->handle != handle)) {
> @@ -5172,19 +5284,24 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>             MPI2_SAS_DEVICE0_FLAGS_DEVICE_PRESENT)) {
>                 printk(MPT2SAS_ERR_FMT "device is not present "
>                     "handle(0x%04x), flags!!!\n", ioc->name, handle);
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -               return;
> +               goto out_unlock;
>         }
>
>         /* check if there were any issues with discovery */
>         if (_scsih_check_access_status(ioc, sas_address, handle,
> -           sas_device_pg0.AccessStatus)) {
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -               return;
> -       }
> +           sas_device_pg0.AccessStatus))
> +               goto out_unlock;
> +
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>         _scsih_ublock_io_device(ioc, sas_address);
> +       if (sas_device)
> +               sas_device_put(sas_device);
> +       return;
>
> +out_unlock:
> +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +       if (sas_device)
> +               sas_device_put(sas_device);
>  }
>
>  /**
> @@ -5208,7 +5325,6 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
>         u32 ioc_status;
>         __le64 sas_address;
>         u32 device_info;
> -       unsigned long flags;
>
>         if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
>             MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
> @@ -5250,14 +5366,13 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
>                 return -1;
>         }
>
> -
> -       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +       sas_device = mpt2sas_get_sdev_by_addr(ioc,
>             sas_address);
> -       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> -       if (sas_device)
> +       if (sas_device) {
> +               sas_device_put(sas_device);
>                 return 0;
> +       }
>
>         sas_device = kzalloc(sizeof(struct _sas_device),
>             GFP_KERNEL);
> @@ -5267,6 +5382,7 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
>                 return -1;
>         }
>
> +       kref_init(&sas_device->refcount);
>         sas_device->handle = handle;
>         if (_scsih_get_sas_address(ioc, le16_to_cpu
>                 (sas_device_pg0.ParentDevHandle),
> @@ -5296,6 +5412,7 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
>         else
>                 _scsih_sas_device_add(ioc, sas_device);
>
> +       sas_device_put(sas_device);
>         return 0;
>  }
>
> @@ -5344,7 +5461,6 @@ _scsih_remove_device(struct MPT2SAS_ADAPTER *ioc,
>             "handle(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
>             sas_device->handle, (unsigned long long)
>             sas_device->sas_address));
> -       kfree(sas_device);
>  }
>  /**
>   * _scsih_device_remove_by_handle - removing device object by handle
> @@ -5363,12 +5479,17 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>                 return;
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> -       if (sas_device)
> -               list_del(&sas_device->list);
> +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> +       if (sas_device) {
> +               list_del_init(&sas_device->list);
> +               sas_device_put(sas_device);
> +       }
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -       if (sas_device)
> +
> +       if (sas_device) {
>                 _scsih_remove_device(ioc, sas_device);
> +               sas_device_put(sas_device);
> +       }
>  }
>
>  /**
> @@ -5389,13 +5510,17 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
>                 return;
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> -           sas_address);
> -       if (sas_device)
> -               list_del(&sas_device->list);
> +       sas_device = __mpt2sas_get_sdev_by_addr(ioc, sas_address);
> +       if (sas_device) {
> +               list_del_init(&sas_device->list);
> +               sas_device_put(sas_device);
> +       }
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -       if (sas_device)
> +
> +       if (sas_device) {
>                 _scsih_remove_device(ioc, sas_device);
> +               sas_device_put(sas_device);
> +       }
>  }
>  #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
>  /**
> @@ -5716,26 +5841,28 @@ _scsih_sas_device_status_change_event(struct MPT2SAS_ADAPTER *ioc,
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
>         sas_address = le64_to_cpu(event_data->SASAddress);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
>             sas_address);
>
> -       if (!sas_device || !sas_device->starget) {
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -               return;
> -       }
> +       if (!sas_device || !sas_device->starget)
> +               goto out;
>
>         target_priv_data = sas_device->starget->hostdata;
> -       if (!target_priv_data) {
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -               return;
> -       }
> +       if (!target_priv_data)
> +               goto out;
>
>         if (event_data->ReasonCode ==
>             MPI2_EVENT_SAS_DEV_STAT_RC_INTERNAL_DEVICE_RESET)
>                 target_priv_data->tm_busy = 1;
>         else
>                 target_priv_data->tm_busy = 0;
> +
> +out:
> +       if (sas_device)
> +               sas_device_put(sas_device);
> +
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +
>  }
>
>  #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
> @@ -6123,7 +6250,7 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
>         u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
>         if (sas_device) {
>                 sas_device->volume_handle = 0;
>                 sas_device->volume_wwid = 0;
> @@ -6142,6 +6269,8 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
>         /* exposing raid component */
>         if (starget)
>                 starget_for_each_device(starget, NULL, _scsih_reprobe_lun);
> +
> +       sas_device_put(sas_device);
>  }
>
>  /**
> @@ -6170,7 +6299,7 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
>                     &volume_wwid);
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> +       sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
>         if (sas_device) {
>                 set_bit(handle, ioc->pd_handles);
>                 if (sas_device->starget && sas_device->starget->hostdata) {
> @@ -6189,6 +6318,8 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
>         /* hiding raid component */
>         if (starget)
>                 starget_for_each_device(starget, (void *)1, _scsih_reprobe_lun);
> +
> +       sas_device_put(sas_device);
>  }
>
>  /**
> @@ -6221,7 +6352,6 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
>      Mpi2EventIrConfigElement_t *element)
>  {
>         struct _sas_device *sas_device;
> -       unsigned long flags;
>         u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
>         Mpi2ConfigReply_t mpi_reply;
>         Mpi2SasDevicePage0_t sas_device_pg0;
> @@ -6231,11 +6361,11 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
>
>         set_bit(handle, ioc->pd_handles);
>
> -       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> -       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -       if (sas_device)
> +       sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> +       if (sas_device) {
> +               sas_device_put(sas_device);
>                 return;
> +       }
>
>         if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
>             MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
> @@ -6509,7 +6639,6 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
>         u16 handle, parent_handle;
>         u32 state;
>         struct _sas_device *sas_device;
> -       unsigned long flags;
>         Mpi2ConfigReply_t mpi_reply;
>         Mpi2SasDevicePage0_t sas_device_pg0;
>         u32 ioc_status;
> @@ -6542,12 +6671,11 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
>                 if (!ioc->is_warpdrive)
>                         set_bit(handle, ioc->pd_handles);
>
> -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -
> -               if (sas_device)
> +               sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> +               if (sas_device) {
> +                       sas_device_put(sas_device);
>                         return;
> +               }
>
>                 if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
>                     &sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
> @@ -7015,6 +7143,7 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
>         struct _raid_device *raid_device, *raid_device_next;
>         struct list_head tmp_list;
>         unsigned long flags;
> +       LIST_HEAD(head);
>
>         printk(MPT2SAS_INFO_FMT "removing unresponding devices: start\n",
>             ioc->name);
> @@ -7022,14 +7151,29 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
>         /* removing unresponding end devices */
>         printk(MPT2SAS_INFO_FMT "removing unresponding devices: end-devices\n",
>             ioc->name);
> +
> +       /*
> +        * Iterate, pulling off devices marked as non-responding. We become the
> +        * owner for the reference the list had on any object we prune.
> +        */
> +       spin_lock_irqsave(&ioc->sas_device_lock, flags);
>         list_for_each_entry_safe(sas_device, sas_device_next,
> -           &ioc->sas_device_list, list) {
> +                       &ioc->sas_device_list, list) {
>                 if (!sas_device->responding)
> -                       mpt2sas_device_remove_by_sas_address(ioc,
> -                               sas_device->sas_address);
> +                       list_move_tail(&sas_device->list, &head);
>                 else
>                         sas_device->responding = 0;
>         }
> +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +
> +       /*
> +        * Now, uninitialize and remove the unresponding devices we pruned.
> +        */
> +       list_for_each_entry_safe(sas_device, sas_device_next, &head, list) {
> +               _scsih_remove_device(ioc, sas_device);
> +               list_del_init(&sas_device->list);
> +               sas_device_put(sas_device);
> +       }
>
>         /* removing unresponding volumes */
>         if (ioc->ir_firmware) {
> @@ -7179,11 +7323,11 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
>                 }
>                 phys_disk_num = pd_pg0.PhysDiskNum;
>                 handle = le16_to_cpu(pd_pg0.DevHandle);
> -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -               if (sas_device)
> +               sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> +               if (sas_device) {
> +                       sas_device_put(sas_device);
>                         continue;
> +               }
>                 if (mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
>                     &sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
>                     handle) != 0)
> @@ -7302,12 +7446,12 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
>                 if (!(_scsih_is_end_device(
>                     le32_to_cpu(sas_device_pg0.DeviceInfo))))
>                         continue;
> -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +               sas_device = mpt2sas_get_sdev_by_addr(ioc,
>                     le64_to_cpu(sas_device_pg0.SASAddress));
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -               if (sas_device)
> +               if (sas_device) {
> +                       sas_device_put(sas_device);
>                         continue;
> +               }
>                 parent_handle = le16_to_cpu(sas_device_pg0.ParentDevHandle);
>                 if (!_scsih_get_sas_address(ioc, parent_handle, &sas_address)) {
>                         printk(MPT2SAS_INFO_FMT "\tBEFORE adding end device: "
> @@ -7966,6 +8110,48 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
>         }
>  }
>
> +static struct _sas_device *get_next_sas_device(struct MPT2SAS_ADAPTER *ioc)
> +{
> +       struct _sas_device *sas_device = NULL;
> +       unsigned long flags;
> +
> +       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> +       if (!list_empty(&ioc->sas_device_init_list)) {
> +               sas_device = list_first_entry(&ioc->sas_device_init_list,
> +                               struct _sas_device, list);
> +               sas_device_get(sas_device);
> +       }
> +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +
> +       return sas_device;
> +}
> +
> +static void sas_device_make_active(struct MPT2SAS_ADAPTER *ioc,
> +               struct _sas_device *sas_device)
> +{
> +       unsigned long flags;
> +
> +       spin_lock_irqsave(&ioc->sas_device_lock, flags);
> +
> +       /*
> +        * Since we dropped the lock during the call to port_add(), we need to
> +        * be careful here that somebody else didn't move or delete this item
> +        * while we were busy with other things.
> +        *
> +        * If it was on the list, we need a put() for the reference the list
> +        * had. Either way, we need a get() for the destination list.
> +        */
> +       if (!list_empty(&sas_device->list)) {
> +               list_del_init(&sas_device->list);
> +               sas_device_put(sas_device);
> +       }
> +
> +       sas_device_get(sas_device);
> +       list_add_tail(&sas_device->list, &ioc->sas_device_list);
> +
> +       spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +}
> +
>  /**
>   * _scsih_probe_sas - reporting sas devices to sas transport
>   * @ioc: per adapter object
> @@ -7975,34 +8161,30 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
>  static void
>  _scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
>  {
> -       struct _sas_device *sas_device, *next;
> -       unsigned long flags;
> -
> -       /* SAS Device List */
> -       list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
> -           list) {
> +       struct _sas_device *sas_device;
>
> -               if (ioc->hide_drives)
> -                       continue;
> +       if (ioc->hide_drives)
> +               return;
>
> +       while ((sas_device = get_next_sas_device(ioc))) {
>                 if (!mpt2sas_transport_port_add(ioc, sas_device->handle,
> -                   sas_device->sas_address_parent)) {
> -                       list_del(&sas_device->list);
> -                       kfree(sas_device);
> +                               sas_device->sas_address_parent)) {
> +                       _scsih_sas_device_remove(ioc, sas_device);
> +                       sas_device_put(sas_device);
>                         continue;
>                 } else if (!sas_device->starget) {
>                         if (!ioc->is_driver_loading) {
>                                 mpt2sas_transport_port_remove(ioc,
> -                                       sas_device->sas_address,
> -                                       sas_device->sas_address_parent);
> -                               list_del(&sas_device->list);
> -                               kfree(sas_device);
> +                                               sas_device->sas_address,
> +                                               sas_device->sas_address_parent);
> +                               _scsih_sas_device_remove(ioc, sas_device);
> +                               sas_device_put(sas_device);
>                                 continue;
>                         }
>                 }
> -               spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -               list_move_tail(&sas_device->list, &ioc->sas_device_list);
> -               spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +
> +               sas_device_make_active(ioc, sas_device);
> +               sas_device_put(sas_device);
>         }
>  }
>
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_transport.c b/drivers/scsi/mpt2sas/mpt2sas_transport.c
> index ff2500a..af86800 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_transport.c
> +++ b/drivers/scsi/mpt2sas/mpt2sas_transport.c
> @@ -1323,15 +1323,17 @@ _transport_get_enclosure_identifier(struct sas_rphy *rphy, u64 *identifier)
>         int rc;
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
>             rphy->identify.sas_address);
>         if (sas_device) {
>                 *identifier = sas_device->enclosure_logical_id;
>                 rc = 0;
> +               sas_device_put(sas_device);
>         } else {
>                 *identifier = 0;
>                 rc = -ENXIO;
>         }
> +
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>         return rc;
>  }
> @@ -1351,12 +1353,14 @@ _transport_get_bay_identifier(struct sas_rphy *rphy)
>         int rc;
>
>         spin_lock_irqsave(&ioc->sas_device_lock, flags);
> -       sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> +       sas_device = __mpt2sas_get_sdev_by_addr(ioc,
>             rphy->identify.sas_address);
> -       if (sas_device)
> +       if (sas_device) {
>                 rc = sas_device->slot;
> -       else
> +               sas_device_put(sas_device);
> +       } else {
>                 rc = -ENXIO;
> +       }
>         spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>         return rc;
>  }
> --
> 2.5.0
>
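
For readers following the _scsih_remove_unresponding_sas_devices() hunk above, the idea is: unlink the stale entries while holding the lock, then do the slow teardown on a private list with the lock dropped. Below is a minimal userspace sketch of that pattern. It is not driver code: struct dev, fake_lock()/fake_unlock(), remove_device() and the singly linked list are all hypothetical stand-ins for struct _sas_device, ioc->sas_device_lock, _scsih_remove_device() and the kernel list_head API, and reference counting is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct _sas_device. */
struct dev {
	int id;
	int responding;
	struct dev *next;
};

/* Fake lock: only tracks whether it is held, so misuse trips an assert. */
static int lock_held;
static void fake_lock(void)   { assert(!lock_held); lock_held = 1; }
static void fake_unlock(void) { assert(lock_held);  lock_held = 0; }

static int removed_count;

/* Stand-in for a teardown that may sleep, so it must run unlocked. */
static void remove_device(struct dev *d)
{
	assert(!lock_held);
	(void)d;
	removed_count++;
}

/* Returns how many devices were pruned from the shared list. */
int prune_unresponding(struct dev **shared)
{
	struct dev *pruned = NULL, **tail = &pruned, **pp = shared;

	/* Phase 1: under the lock, move stale entries to a private list. */
	fake_lock();
	while (*pp) {
		struct dev *d = *pp;

		if (!d->responding) {
			*pp = d->next;      /* unlink from the shared list */
			d->next = NULL;
			*tail = d;          /* append to the private list */
			tail = &d->next;
		} else {
			d->responding = 0;  /* re-arm for the next scan */
			pp = &d->next;
		}
	}
	fake_unlock();

	/* Phase 2: nobody else can see the private list, so no lock needed. */
	removed_count = 0;
	while (pruned) {
		struct dev *d = pruned;

		pruned = d->next;
		d->next = NULL;
		remove_device(d);
	}
	return removed_count;
}

int demo_prune(void)
{
	struct dev c = { 3, 0, NULL };
	struct dev b = { 2, 1, &c };
	struct dev a = { 1, 0, &b };
	struct dev *shared = &a;
	int n = prune_unresponding(&shared);

	/* Only b survives, and its responding flag has been reset. */
	assert(shared == &b && b.next == NULL && b.responding == 0);
	return n;
}
```

Calling demo_prune() walks a three-entry list in which only the middle device is responding. In the patch itself, the private list additionally inherits the reference the shared list held on each object, which is why the second loop there ends with sas_device_put().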


* Re: [PATCH v4 2/2] mpt2sas: Refcount fw_events and fix unsafe list usage
  2015-08-14  1:48               ` [PATCH v4 2/2] mpt2sas: Refcount fw_events " Calvin Owens
  2015-08-25 21:06                 ` Nicholas A. Bellinger
@ 2015-09-04 14:35                 ` Sreekanth Reddy
  1 sibling, 0 replies; 52+ messages in thread
From: Sreekanth Reddy @ 2015-09-04 14:35 UTC (permalink / raw)
  To: Calvin Owens
  Cc: MPT-FusionLinux.pdl, linux-scsi, linux-kernel, kernel-team,
	Joe Lawrence, Christoph Hellwig, Krishnaraddi Mankani,
	Sathya Prakash, Chaitra Basappa

On Fri, Aug 14, 2015 at 7:18 AM, Calvin Owens <calvinowens@fb.com> wrote:
> The fw_event_work struct is concurrently referenced at shutdown, so
> add a refcount to protect it, and refactor the code to use it.
>
> Additionally, refactor _scsih_fw_event_cleanup_queue() such that it
> no longer iterates over the list without holding the lock, since
> _firmware_event_work() concurrently deletes items from the list.
>
> Cc: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Calvin Owens <calvinowens@fb.com>

Tested-by: Chaitra Basappa <chaitra.basappa@avagotech.com>
Acked-by: Sreekanth Reddy <sreekanth.reddy@avagotech.com>
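
The ownership rule the patch relies on can be summarized as: the allocator holds one reference, the list holds one, the queued work holds one, and each owner does exactly one put. Here is a small userspace sketch of that lifecycle, assuming a plain int counter in place of struct kref and an on_list flag in place of the real list linkage; struct event and all function names below are hypothetical, chosen only to mirror the roles of alloc_fw_event_work(), _scsih_fw_event_add() and _firmware_event_work() in the quoted diff.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct fw_event_work; not the driver's type. */
struct event {
	int refcount;   /* plain int here; the driver uses struct kref */
	int on_list;    /* 1 while the (imaginary) list references us */
};

static int events_freed;  /* counts frees so the demo can be checked */

static struct event *event_alloc(void)
{
	struct event *ev = calloc(1, sizeof(*ev));

	if (ev)
		ev->refcount = 1;  /* the creator's reference */
	return ev;
}

static void event_get(struct event *ev)
{
	ev->refcount++;
}

static void event_put(struct event *ev)
{
	if (--ev->refcount == 0) {
		free(ev);
		events_freed++;
	}
}

/* Same role as _scsih_fw_event_add(): one get per new owner. */
static void event_add(struct event *ev)
{
	event_get(ev);      /* reference owned by the list */
	ev->on_list = 1;
	event_get(ev);      /* reference owned by the queued work item */
}

/* Same role as _firmware_event_work(): unlink, handle, drop work ref. */
static void event_work(struct event *ev)
{
	if (ev->on_list) {
		ev->on_list = 0;
		event_put(ev);  /* reference the list held */
	}
	/* ... event handling would go here ... */
	event_put(ev);          /* reference the work item held */
}

/* Returns the number of objects freed after one full lifecycle. */
int demo_event_lifecycle(void)
{
	struct event *ev = event_alloc();

	if (!ev)
		return -1;
	events_freed = 0;
	event_add(ev);             /* refcount: 3 */
	event_put(ev);             /* creator is done, refcount: 2 */
	assert(ev->refcount == 2);
	event_work(ev);            /* drops to 0 and frees */
	return events_freed;
}
```

The sketch is single-threaded, so it shows only the counting, not the locking. In the driver the get/put pairs around the list are done under ioc->fw_event_lock, and kref provides the atomic counter.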

> ---
> Changes in v4: None
>
> Changes in v3:
>         * Add a break condition to the REMOVE_UNRESPONDING_DEVICES fw_event,
>           which can loop over a sleep forever (5m+ at least) at unloading. I
>           don't think anything prevented this before, but taking the fw_event
>           object off the list at the top of _firmware_event_work() seems to have
>           made it more likely to happen.
>
> Changes in v2:
>         * Squished patches 4-6 into one patch
>         * Remove the fw_event from fw_event_list at the start of
>           _firmware_event_work()
>         * Explicitly separate fw_event_list removal from fw_event freeing
>
> drivers/scsi/mpt2sas/mpt2sas_scsih.c | 112 ++++++++++++++++++++++++++++-------
>  1 file changed, 91 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> index 5eca3a4..c0ff55b 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> +++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> @@ -176,9 +176,37 @@ struct fw_event_work {
>         u8                      VP_ID;
>         u8                      ignore;
>         u16                     event;
> +       struct kref             refcount;
>         char                    event_data[0] __aligned(4);
>  };
>
> +static void fw_event_work_free(struct kref *r)
> +{
> +       kfree(container_of(r, struct fw_event_work, refcount));
> +}
> +
> +static void fw_event_work_get(struct fw_event_work *fw_work)
> +{
> +       kref_get(&fw_work->refcount);
> +}
> +
> +static void fw_event_work_put(struct fw_event_work *fw_work)
> +{
> +       kref_put(&fw_work->refcount, fw_event_work_free);
> +}
> +
> +static struct fw_event_work *alloc_fw_event_work(int len)
> +{
> +       struct fw_event_work *fw_event;
> +
> +       fw_event = kzalloc(sizeof(*fw_event) + len, GFP_ATOMIC);
> +       if (!fw_event)
> +               return NULL;
> +
> +       kref_init(&fw_event->refcount);
> +       return fw_event;
> +}
> +
>  /* raid transport support */
>  static struct raid_template *mpt2sas_raid_template;
>
> @@ -2872,36 +2900,39 @@ _scsih_fw_event_add(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work *fw_event)
>                 return;
>
>         spin_lock_irqsave(&ioc->fw_event_lock, flags);
> +       fw_event_work_get(fw_event);
>         list_add_tail(&fw_event->list, &ioc->fw_event_list);
>         INIT_DELAYED_WORK(&fw_event->delayed_work, _firmware_event_work);
> +       fw_event_work_get(fw_event);
>         queue_delayed_work(ioc->firmware_event_thread,
>             &fw_event->delayed_work, 0);
>         spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
>  }
>
>  /**
> - * _scsih_fw_event_free - delete fw_event
> + * _scsih_fw_event_del_from_list - delete fw_event from the list
>   * @ioc: per adapter object
>   * @fw_event: object describing the event
>   * Context: This function will acquire ioc->fw_event_lock.
>   *
> - * This removes firmware event object from link list, frees associated memory.
> + * If the fw_event is on the fw_event_list, remove it and do a put.
>   *
>   * Return nothing.
>   */
>  static void
> -_scsih_fw_event_free(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work
> +_scsih_fw_event_del_from_list(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work
>      *fw_event)
>  {
>         unsigned long flags;
>
>         spin_lock_irqsave(&ioc->fw_event_lock, flags);
> -       list_del(&fw_event->list);
> -       kfree(fw_event);
> +       if (!list_empty(&fw_event->list)) {
> +               list_del_init(&fw_event->list);
> +               fw_event_work_put(fw_event);
> +       }
>         spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
>  }
>
> -
>  /**
>   * _scsih_error_recovery_delete_devices - remove devices not responding
>   * @ioc: per adapter object
> @@ -2916,13 +2947,14 @@ _scsih_error_recovery_delete_devices(struct MPT2SAS_ADAPTER *ioc)
>         if (ioc->is_driver_loading)
>                 return;
>
> -       fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
> +       fw_event = alloc_fw_event_work(0);
>         if (!fw_event)
>                 return;
>
>         fw_event->event = MPT2SAS_REMOVE_UNRESPONDING_DEVICES;
>         fw_event->ioc = ioc;
>         _scsih_fw_event_add(ioc, fw_event);
> +       fw_event_work_put(fw_event);
>  }
>
>  /**
> @@ -2936,12 +2968,29 @@ mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc)
>  {
>         struct fw_event_work *fw_event;
>
> -       fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
> +       fw_event = alloc_fw_event_work(0);
>         if (!fw_event)
>                 return;
>         fw_event->event = MPT2SAS_PORT_ENABLE_COMPLETE;
>         fw_event->ioc = ioc;
>         _scsih_fw_event_add(ioc, fw_event);
> +       fw_event_work_put(fw_event);
> +}
> +
> +static struct fw_event_work *dequeue_next_fw_event(struct MPT2SAS_ADAPTER *ioc)
> +{
> +       unsigned long flags;
> +       struct fw_event_work *fw_event = NULL;
> +
> +       spin_lock_irqsave(&ioc->fw_event_lock, flags);
> +       if (!list_empty(&ioc->fw_event_list)) {
> +               fw_event = list_first_entry(&ioc->fw_event_list,
> +                               struct fw_event_work, list);
> +               list_del_init(&fw_event->list);
> +       }
> +       spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
> +
> +       return fw_event;
>  }
>
>  /**
> @@ -2956,17 +3005,25 @@ mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc)
>  static void
>  _scsih_fw_event_cleanup_queue(struct MPT2SAS_ADAPTER *ioc)
>  {
> -       struct fw_event_work *fw_event, *next;
> +       struct fw_event_work *fw_event;
>
>         if (list_empty(&ioc->fw_event_list) ||
>              !ioc->firmware_event_thread || in_interrupt())
>                 return;
>
> -       list_for_each_entry_safe(fw_event, next, &ioc->fw_event_list, list) {
> -               if (cancel_delayed_work_sync(&fw_event->delayed_work)) {
> -                       _scsih_fw_event_free(ioc, fw_event);
> -                       continue;
> -               }
> +       while ((fw_event = dequeue_next_fw_event(ioc))) {
> +               /*
> +                * Wait on the fw_event to complete. If this returns 1, then
> +                * the event was never executed, and we need a put for the
> +                * reference the delayed_work had on the fw_event.
> +                *
> +                * If it did execute, we wait for it to finish, and the put will
> +                * happen from _firmware_event_work()
> +                */
> +               if (cancel_delayed_work_sync(&fw_event->delayed_work))
> +                       fw_event_work_put(fw_event);
> +
> +               fw_event_work_put(fw_event);
>         }
>  }
>
> @@ -4447,13 +4504,14 @@ _scsih_send_event_to_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>  {
>         struct fw_event_work *fw_event;
>
> -       fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
> +       fw_event = alloc_fw_event_work(0);
>         if (!fw_event)
>                 return;
>         fw_event->event = MPT2SAS_TURN_ON_PFA_LED;
>         fw_event->device_handle = handle;
>         fw_event->ioc = ioc;
>         _scsih_fw_event_add(ioc, fw_event);
> +       fw_event_work_put(fw_event);
>  }
>
>  /**
> @@ -7554,17 +7612,27 @@ _firmware_event_work(struct work_struct *work)
>             struct fw_event_work, delayed_work.work);
>         struct MPT2SAS_ADAPTER *ioc = fw_event->ioc;
>
> +       _scsih_fw_event_del_from_list(ioc, fw_event);
> +
>         /* the queue is being flushed so ignore this event */
> -       if (ioc->remove_host ||
> -           ioc->pci_error_recovery) {
> -               _scsih_fw_event_free(ioc, fw_event);
> +       if (ioc->remove_host || ioc->pci_error_recovery) {
> +               fw_event_work_put(fw_event);
>                 return;
>         }
>
>         switch (fw_event->event) {
>         case MPT2SAS_REMOVE_UNRESPONDING_DEVICES:
> -               while (scsi_host_in_recovery(ioc->shost) || ioc->shost_recovery)
> +               while (scsi_host_in_recovery(ioc->shost) ||
> +                               ioc->shost_recovery) {
> +                       /*
> +                        * If we're unloading, bail. Otherwise, this can become
> +                        * an infinite loop.
> +                        */
> +                       if (ioc->remove_host)
> +                               goto out;
> +
>                         ssleep(1);
> +               }
>                 _scsih_remove_unresponding_sas_devices(ioc);
>                 _scsih_scan_for_devices_after_reset(ioc);
>                 break;
> @@ -7613,7 +7681,8 @@ _firmware_event_work(struct work_struct *work)
>                 _scsih_sas_ir_operation_status_event(ioc, fw_event);
>                 break;
>         }
> -       _scsih_fw_event_free(ioc, fw_event);
> +out:
> +       fw_event_work_put(fw_event);
>  }
>
>  /**
> @@ -7751,7 +7820,7 @@ mpt2sas_scsih_event_callback(struct MPT2SAS_ADAPTER *ioc, u8 msix_index,
>         }
>
>         sz = le16_to_cpu(mpi_reply->EventDataLength) * 4;
> -       fw_event = kzalloc(sizeof(*fw_event) + sz, GFP_ATOMIC);
> +       fw_event = alloc_fw_event_work(sz);
>         if (!fw_event) {
>                 printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
>                     ioc->name, __FILE__, __LINE__, __func__);
> @@ -7764,6 +7833,7 @@ mpt2sas_scsih_event_callback(struct MPT2SAS_ADAPTER *ioc, u8 msix_index,
>         fw_event->VP_ID = mpi_reply->VP_ID;
>         fw_event->event = event;
>         _scsih_fw_event_add(ioc, fw_event);
> +       fw_event_work_put(fw_event);
>         return;
>  }
>
> --
> 2.5.0
>
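
The reference handling in the quoted hunks can be followed with a small userspace model. This is only a sketch under stated assumptions, not the driver code: struct event, event_alloc(), event_add(), event_run() and event_cancel() are hypothetical stand-ins, a plain int replaces the kernel's kref, and it assumes (the hunks above do not show it) that queueing an event takes one reference for the list and one for the pending work. Under that assumption the allocating caller drops its own reference right after queueing, the worker drops the remaining ones when it runs, and the cleanup loop needs the extra put only when the cancel reports that the work never ran.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct fw_event_work. */
struct event {
	int refcount;       /* plain int here; the kernel would use a kref */
	bool on_list;       /* models membership in the fw_event list */
	bool work_pending;  /* models a queued, not yet executed delayed_work */
};

static int live_events;     /* events allocated and not yet freed */

static struct event *event_alloc(void)
{
	struct event *ev = calloc(1, sizeof(*ev));

	if (!ev)
		return NULL;
	ev->refcount = 1;   /* reference owned by the allocating caller */
	live_events++;
	return ev;
}

static void event_get(struct event *ev)
{
	ev->refcount++;
}

static void event_put(struct event *ev)
{
	assert(ev->refcount > 0);
	if (--ev->refcount == 0) {
		live_events--;
		free(ev);
	}
}

/* Assumed behaviour of queueing: one reference for the list, one for the work. */
static void event_add(struct event *ev)
{
	event_get(ev);
	ev->on_list = true;
	event_get(ev);
	ev->work_pending = true;
}

/* Worker path: unlink (drops the list reference), handle, drop the work reference. */
static void event_run(struct event *ev)
{
	ev->work_pending = false;
	if (ev->on_list) {
		ev->on_list = false;
		event_put(ev);
	}
	/* the event would be handled here */
	event_put(ev);
}

/* Cleanup path: the dequeue hands the list reference to the caller. */
static void event_cancel(struct event *ev)
{
	ev->on_list = false;
	if (ev->work_pending) {
		/* Cancelled before it ran: drop the reference the work held. */
		ev->work_pending = false;
		event_put(ev);
	}
	event_put(ev);      /* reference taken over from the list */
}

/* Queue an event, then either run it or cancel it; returns the number of leaked events. */
static int demo(bool cancel)
{
	struct event *ev = event_alloc();

	if (!ev)
		return -1;
	event_add(ev);
	event_put(ev);      /* caller is done, as in the quoted callers */
	if (cancel)
		event_cancel(ev);
	else
		event_run(ev);
	return live_events;
}
```

Both demo(false) and demo(true) should leave live_events at zero, i.e. every get is matched by exactly one put whichever path consumes the event.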


end of thread (last message: 2015-09-04 14:35 UTC)

Thread overview: 52+ messages
2015-05-04 15:05 [PATCH] mpt2sas: mpt3sas: Fix memory corruption during initialization Sreekanth Reddy
2015-05-05 15:35 ` Tomas Henzl
2015-05-12  9:38   ` Sreekanth Reddy
2015-05-06 18:48 ` Calvin Owens
2015-05-15  3:41   ` [PATCH 0/6] Fixes for memory corruption in mpt2sas Calvin Owens
2015-05-15  3:41     ` [PATCH 1/6] Add refcount to sas_device struct Calvin Owens
2015-05-15  3:41     ` [PATCH 2/6] Refactor code to use new sas_device refcount Calvin Owens
2015-05-15  3:41     ` [PATCH 3/6] Fix unsafe sas_device_list usage Calvin Owens
2015-05-15  3:42     ` [PATCH 4/6] Add refcount to fw_event_work struct Calvin Owens
2015-05-15  3:42     ` [PATCH 5/6] Refactor code to use new fw_event refcount Calvin Owens
2015-05-15  3:42     ` [PATCH 6/6] Fix unsafe fw_event_list usage Calvin Owens
2015-06-09  3:50     ` [RESEND][PATCH 0/6] Fixes for memory corruption in mpt2sas Calvin Owens
2015-06-09  3:50       ` [PATCH 1/6] Add refcount to sas_device struct Calvin Owens
2015-07-03 15:24         ` Christoph Hellwig
2015-06-09  3:50       ` [PATCH 2/6] Refactor code to use new sas_device refcount Calvin Owens
2015-07-03 15:38         ` Christoph Hellwig
2015-07-12  4:15           ` Calvin Owens
2015-06-09  3:50       ` [PATCH 3/6] Fix unsafe sas_device_list usage Calvin Owens
2015-07-03 16:03         ` Christoph Hellwig
2015-06-09  3:50       ` [PATCH 4/6] Add refcount to fw_event_work struct Calvin Owens
2015-07-03 15:38         ` Christoph Hellwig
2015-06-09  3:50       ` [PATCH 5/6] Refactor code to use new fw_event refcount Calvin Owens
2015-07-03 16:00         ` Christoph Hellwig
2015-07-12  4:13           ` Calvin Owens
2015-06-09  3:50       ` [PATCH 6/6] Fix unsafe fw_event_list usage Calvin Owens
2015-07-03 16:02         ` Christoph Hellwig
2015-07-12  4:20           ` Calvin Owens
2015-07-02 20:15       ` [RESEND][PATCH 0/6] Fixes for memory corruption in mpt2sas Bart Van Assche
2015-07-12  4:24       ` [PATCH 0/2 v2] " Calvin Owens
2015-07-12  4:24         ` [PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage Calvin Owens
2015-07-13  6:52           ` Christoph Hellwig
2015-07-21  7:06             ` Calvin Owens
2015-07-13 15:05           ` Joe Lawrence
2015-07-21  7:04             ` Calvin Owens
2015-07-16 14:57           ` Sreekanth Reddy
2015-07-21  7:03             ` Calvin Owens
2015-07-12  4:24         ` [PATCH 2/2] mpt2sas: Refcount fw_events " Calvin Owens
2015-07-13  6:52           ` Christoph Hellwig
2015-08-01  5:02         ` [PATCH v3 0/2] Fixes for memory corruption in mpt2sas Calvin Owens
2015-08-01  5:02           ` [PATCH v3 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage Calvin Owens
2015-08-10 13:15             ` Sreekanth Reddy
2015-08-14  1:43               ` Calvin Owens
2015-08-01  5:02           ` [PATCH v3 2/2] mpt2sas: Refcount fw_events " Calvin Owens
2015-08-14  1:48           ` [PATCH v4 0/2] Fixes for memory corruption in mpt2sas Calvin Owens
2015-08-14  1:48             ` [PATCH v4 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage Calvin Owens
2015-08-14  1:48               ` [PATCH v4 2/2] mpt2sas: Refcount fw_events " Calvin Owens
2015-08-25 21:06                 ` Nicholas A. Bellinger
2015-09-04 14:35                 ` Sreekanth Reddy
2015-08-25 21:03               ` [PATCH v4 1/2] mpt2sas: Refcount sas_device objects " Nicholas A. Bellinger
2015-09-04 14:34               ` Sreekanth Reddy
2015-08-25 21:21             ` [PATCH v4 0/2] Fixes for memory corruption in mpt2sas Nicholas A. Bellinger
2015-07-02 19:22     ` [PATCH 0/6] " Jens Axboe
