* [PATCH v3] lpfc: Fix panic on BFS configuration.
@ 2017-04-27 22:08 ` jsmart2021
  0 siblings, 0 replies; 9+ messages in thread
From: jsmart2021 @ 2017-04-27 22:08 UTC (permalink / raw)
  To: linux-scsi, linux-nvme, sagi, stable
  Cc: loberman, jthumshirn, emilne, James Smart, Dick Kennedy, James Smart

From: James Smart <jsmart2021@gmail.com>

To select the appropriate shost template, the driver is issuing
a mailbox command to retrieve the wwn. Turns out the sending of
the command precedes the reset of the function.  On SLI-4 adapters,
this is inconsequential as the mailbox command location is specified
by dma via the BMBX register. However, on SLI-3 adapters, the
location of the mailbox command submission area changes. When the
function is first powered on or reset, the cmd is submitted via PCI
bar memory. Later the driver changes the function config to use
host memory and DMA. The request to start a mailbox command is the
same, a simple doorbell write, regardless of submission area.
So, if no boot driver has been run against the adapter,
the mailbox command works because the defaults are ok. But if the boot
driver has configured the card, and no platform PCI
function/slot reset occurs as the OS starts, the mailbox command
will fail: the SLI-3 device will use the stale boot driver DMA
location. This can cause PCI EEH errors.

Fix is to reset the sli-3 function before sending the
mailbox command, thus synchronizing the function/driver on mailbox
location.

Note: The fix uses routines that are typically invoked later in the
call flow to reset the sli-3 device. The issue in using those routines is
that the normal (non-fix) flow does additional initialization, namely the
allocation of the pport structure. So, rather than significantly reworking
the initialization flow so that the pport is alloc'd first, pointer checks
are added to work around it. Checks are limited to the routines invoked
by a sli-3 adapter (s3 routines) as this fix/early call is only invoked
on a sli3 adapter. Nothing changes post the fix. Subsequent initialization,
and another adapter reset, still occur - both on sli-3 and sli-4 adapters.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Fixes: 96418b5e2c88 ("scsi: lpfc: Fix eh_deadline setting for sli3 adapters.")
Cc: stable@vger.kernel.org
---
The issue was introduced by one of the patches in the lpfc 11.2.0.10
patch set: http://www.spinics.net/lists/linux-scsi/msg105908.html
As the set contained a number of NVME fixes, it was picked up only in
the linux-block tree. The scsi trees are at ~11.2.0.7.

This patch was cut against the nvme-4.12 tree, which is backed by
linux-block. The patch must go in via the block tree.

As the patch is scsi-specific, it was cherry-picked for stable trees.
Thus the fix needs to be picked up by them.

Warning: the patch applies to pre-11.2.0.10 versions of lpfc (in the
scsi trees) reporting only fuzz warnings, but creates something
completely erroneous.
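
Illustration only, not part of the patch: a minimal user-space sketch of
the failure mode described in the commit message, assuming a toy device
that remembers where it expects mailbox commands. All names here
(toy_device, toy_reset, toy_config_host_dma, toy_issue_mbox, toy_demo)
are hypothetical and do not exist in lpfc; the real driver works through
SLIM/BAR memory, host DMA buffers and doorbell registers.

```c
#include <assert.h>
#include <stdbool.h>

/* Where the (hypothetical) SLI-3 style device looks for mailbox commands. */
enum mbox_area { MBOX_IN_BAR, MBOX_IN_HOST_DMA };

struct toy_device {
	enum mbox_area area;	/* device-side state, outlives a driver handoff */
};

/* Power-on or function reset: the device falls back to PCI BAR memory. */
static void toy_reset(struct toy_device *dev)
{
	dev->area = MBOX_IN_BAR;
}

/* A driver (boot driver or OS driver) switches the device to host memory. */
static void toy_config_host_dma(struct toy_device *dev)
{
	dev->area = MBOX_IN_HOST_DMA;
}

/*
 * Doorbell write. The doorbell is the same for both areas, so the command
 * is only seen if it was written where the device is currently looking.
 */
static bool toy_issue_mbox(const struct toy_device *dev,
			   enum mbox_area written_to)
{
	return dev->area == written_to;
}

/* Walks through the three cases from the commit message. */
static void toy_demo(void)
{
	struct toy_device dev;

	toy_reset(&dev);				/* no boot driver ran */
	assert(toy_issue_mbox(&dev, MBOX_IN_BAR));	/* defaults are ok */

	toy_config_host_dma(&dev);			/* boot driver configured the card */
	assert(!toy_issue_mbox(&dev, MBOX_IN_BAR));	/* early mailbox is missed */

	toy_reset(&dev);				/* the fix: reset before the mailbox */
	assert(toy_issue_mbox(&dev, MBOX_IN_BAR));
}
```

Calling toy_demo() exercises all three cases. The sketch only models the
location mismatch; it does not model the stale DMA access that can
trigger the EEH errors.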
---
 drivers/scsi/lpfc/lpfc_crtn.h |  1 +
 drivers/scsi/lpfc/lpfc_init.c |  7 +++++++
 drivers/scsi/lpfc/lpfc_sli.c  | 19 ++++++++++++-------
 3 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_crtn.h b/drivers/scsi/lpfc/lpfc_crtn.h
index 944b32c..1c55408 100644
--- a/drivers/scsi/lpfc/lpfc_crtn.h
+++ b/drivers/scsi/lpfc/lpfc_crtn.h
@@ -294,6 +294,7 @@ int lpfc_selective_reset(struct lpfc_hba *);
 void lpfc_reset_barrier(struct lpfc_hba *);
 int lpfc_sli_brdready(struct lpfc_hba *, uint32_t);
 int lpfc_sli_brdkill(struct lpfc_hba *);
+int lpfc_sli_chipset_init(struct lpfc_hba *phba);
 int lpfc_sli_brdreset(struct lpfc_hba *);
 int lpfc_sli_brdrestart(struct lpfc_hba *);
 int lpfc_sli_hba_setup(struct lpfc_hba *);
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 90ae354..e85f273 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -3602,6 +3602,13 @@ lpfc_get_wwpn(struct lpfc_hba *phba)
 	LPFC_MBOXQ_t *mboxq;
 	MAILBOX_t *mb;
 
+	if (phba->sli_rev < LPFC_SLI_REV4) {
+		/* Reset the port first */
+		lpfc_sli_brdrestart(phba);
+		rc = lpfc_sli_chipset_init(phba);
+		if (rc)
+			return (uint64_t)-1;
+	}
 
 	mboxq = (LPFC_MBOXQ_t *) mempool_alloc(phba->mbox_mem_pool,
 						GFP_KERNEL);
diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
index cf19f49..2a4fc00 100644
--- a/drivers/scsi/lpfc/lpfc_sli.c
+++ b/drivers/scsi/lpfc/lpfc_sli.c
@@ -4204,13 +4204,16 @@ lpfc_sli_brdreset(struct lpfc_hba *phba)
 	/* Reset HBA */
 	lpfc_printf_log(phba, KERN_INFO, LOG_SLI,
 			"0325 Reset HBA Data: x%x x%x\n",
-			phba->pport->port_state, psli->sli_flag);
+			(phba->pport) ? phba->pport->port_state : 0,
+			psli->sli_flag);
 
 	/* perform board reset */
 	phba->fc_eventTag = 0;
 	phba->link_events = 0;
-	phba->pport->fc_myDID = 0;
-	phba->pport->fc_prevDID = 0;
+	if (phba->pport) {
+		phba->pport->fc_myDID = 0;
+		phba->pport->fc_prevDID = 0;
+	}
 
 	/* Turn off parity checking and serr during the physical reset */
 	pci_read_config_word(phba->pcidev, PCI_COMMAND, &cfg_value);
@@ -4336,7 +4339,8 @@ lpfc_sli_brdrestart_s3(struct lpfc_hba *phba)
 	/* Restart HBA */
 	lpfc_printf_log(phba, KERN_INFO, LOG_SLI,
 			"0337 Restart HBA Data: x%x x%x\n",
-			phba->pport->port_state, psli->sli_flag);
+			(phba->pport) ? phba->pport->port_state : 0,
+			psli->sli_flag);
 
 	word0 = 0;
 	mb = (MAILBOX_t *) &word0;
@@ -4350,7 +4354,7 @@ lpfc_sli_brdrestart_s3(struct lpfc_hba *phba)
 	readl(to_slim); /* flush */
 
 	/* Only skip post after fc_ffinit is completed */
-	if (phba->pport->port_state)
+	if (phba->pport && phba->pport->port_state)
 		word0 = 1;	/* This is really setting up word1 */
 	else
 		word0 = 0;	/* This is really setting up word1 */
@@ -4359,7 +4363,8 @@ lpfc_sli_brdrestart_s3(struct lpfc_hba *phba)
 	readl(to_slim); /* flush */
 
 	lpfc_sli_brdreset(phba);
-	phba->pport->stopped = 0;
+	if (phba->pport)
+		phba->pport->stopped = 0;
 	phba->link_state = LPFC_INIT_START;
 	phba->hba_flag = 0;
 	spin_unlock_irq(&phba->hbalock);
@@ -4446,7 +4451,7 @@ lpfc_sli_brdrestart(struct lpfc_hba *phba)
  * iteration, the function will restart the HBA again. The function returns
  * zero if HBA successfully restarted else returns negative error code.
  **/
-static int
+int
 lpfc_sli_chipset_init(struct lpfc_hba *phba)
 {
 	uint32_t status, i = 0;
-- 
2.9.3
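
Illustration only, not part of the patch: the pointer checks added to the
s3 reset routines follow a common pattern, sketched below in plain C under
the assumption of simplified stand-in types. toy_hba, toy_vport,
toy_logged_port_state and toy_brdreset are hypothetical names, not lpfc
structures or functions; only the shape of the NULL guards mirrors the
diff above.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the per-port and per-adapter structures. */
struct toy_vport {
	uint32_t port_state;
	uint32_t my_did;
	int stopped;
};

struct toy_hba {
	struct toy_vport *pport;	/* NULL until the port is allocated */
	uint32_t link_events;
};

/* Value to log for port_state; 0 when no port structure exists yet. */
static uint32_t toy_logged_port_state(const struct toy_hba *hba)
{
	return hba->pport ? hba->pport->port_state : 0;
}

/* Reset bookkeeping that tolerates being called before pport exists. */
static void toy_brdreset(struct toy_hba *hba)
{
	hba->link_events = 0;
	if (hba->pport) {
		hba->pport->my_did = 0;
		hba->pport->stopped = 0;
	}
}
```

With pport set to NULL, toy_brdreset() only clears the adapter-level
field; once a port is attached, the same call also clears the per-port
fields. This is the behavior the early reset path relies on.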

* Re: [PATCH v3] lpfc: Fix panic on BFS configuration.
  2017-04-27 22:08 ` jsmart2021
@ 2017-04-28 20:36   ` Ewan D. Milne
  -1 siblings, 0 replies; 9+ messages in thread
From: Ewan D. Milne @ 2017-04-28 20:36 UTC (permalink / raw)
  To: jsmart2021
  Cc: linux-scsi, linux-nvme, sagi, stable, loberman, jthumshirn,
	Dick Kennedy, James Smart

On Thu, 2017-04-27 at 15:08 -0700, jsmart2021@gmail.com wrote:
> From: James Smart <jsmart2021@gmail.com>
> 
> To select the appropriate shost template, the driver is issuing
> a mailbox command to retrieve the wwn. Turns out the sending of
> the command precedes the reset of the function.  On SLI-4 adapters,
> this is inconsequential as the mailbox command location is specified
> by dma via the BMBX register. However, on SLI-3 adapters, the
> location of the mailbox command submission area changes. When the
> function is first powered on or reset, the cmd is submitted via PCI
> bar memory. Later the driver changes the function config to use
> host memory and DMA. The request to start a mailbox command is the
> same, a simple doorbell write, regardless of submission area.
> So.. if there has not been a boot driver run against the adapter,
> the mailbox command works as defaults are ok. But, if the boot
> driver has configured the card and, and if no platform pci
> function/slot reset occurs as the os starts, the mailbox command
> will fail. The SLI-3 device will use the stale boot driver dma
> location. This can cause PCI eeh errors.
> 
> Fix is to reset the sli-3 function before sending the
> mailbox command, thus synchronizing the function/driver on mailbox
> location.
> 
> Note: The fix uses routines that are typically invoked later in the
> call flow to reset the sli-3 device. The issue in using those routines is
> that the normal (non-fix) flow does additional initialization, namely the
> allocation of the pport structure. So, rather than significantly reworking
> the initialization flow so that the pport is alloc'd first, pointer checks
> are added to work around it. Checks are limited to the routines invoked
> by a sli-3 adapter (s3 routines) as this fix/early call is only invoked
> on a sli3 adapter. Nothing changes post the fix. Subsequent initialization,
> and another adapter reset, still occur - both on sli-3 and sli-4 adapters.
> 
> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
> Signed-off-by: James Smart <james.smart@broadcom.com>
> Fixes: 96418b5e2c88 ("scsi: lpfc: Fix eh_deadline setting for sli3 adapters.")
> Cc: stable@vger.kernel.org
> ---
> The issue was introduced by one of the patches in the lpfc 11.2.0.10
> patch set: http://www.spinics.net/lists/linux-scsi/msg105908.html
> As the set contained a number of NVME fixes, it was picked up only in
> the linux-block tree. The scsi trees are at ~11.2.0.7.
> 
> This patch was cut against the nvme-4.12 tree, which is backed by
> linux-block. The patch must go in via the block tree.
> 
> As the patch is scsi-specific, it was cherry-picked for stable trees.
> Thus the fix needs to be picked up by them.
> 
> Warning: the patch applies to pre-11.2.0.10 versions of lpfc (in the
> scsi trees) reporting only fuzz warnings, but creates something
> completely erroneous.
> ---
>  drivers/scsi/lpfc/lpfc_crtn.h |  1 +
>  drivers/scsi/lpfc/lpfc_init.c |  7 +++++++
>  drivers/scsi/lpfc/lpfc_sli.c  | 19 ++++++++++++-------
>  3 files changed, 20 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/scsi/lpfc/lpfc_crtn.h b/drivers/scsi/lpfc/lpfc_crtn.h
> index 944b32c..1c55408 100644
> --- a/drivers/scsi/lpfc/lpfc_crtn.h
> +++ b/drivers/scsi/lpfc/lpfc_crtn.h
> @@ -294,6 +294,7 @@ int lpfc_selective_reset(struct lpfc_hba *);
>  void lpfc_reset_barrier(struct lpfc_hba *);
>  int lpfc_sli_brdready(struct lpfc_hba *, uint32_t);
>  int lpfc_sli_brdkill(struct lpfc_hba *);
> +int lpfc_sli_chipset_init(struct lpfc_hba *phba);
>  int lpfc_sli_brdreset(struct lpfc_hba *);
>  int lpfc_sli_brdrestart(struct lpfc_hba *);
>  int lpfc_sli_hba_setup(struct lpfc_hba *);
> diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
> index 90ae354..e85f273 100644
> --- a/drivers/scsi/lpfc/lpfc_init.c
> +++ b/drivers/scsi/lpfc/lpfc_init.c
> @@ -3602,6 +3602,13 @@ lpfc_get_wwpn(struct lpfc_hba *phba)
>  	LPFC_MBOXQ_t *mboxq;
>  	MAILBOX_t *mb;
>  
> +	if (phba->sli_rev < LPFC_SLI_REV4) {
> +		/* Reset the port first */
> +		lpfc_sli_brdrestart(phba);
> +		rc = lpfc_sli_chipset_init(phba);
> +		if (rc)
> +			return (uint64_t)-1;
> +	}
>  
>  	mboxq = (LPFC_MBOXQ_t *) mempool_alloc(phba->mbox_mem_pool,
>  						GFP_KERNEL);
> diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
> index cf19f49..2a4fc00 100644
> --- a/drivers/scsi/lpfc/lpfc_sli.c
> +++ b/drivers/scsi/lpfc/lpfc_sli.c
> @@ -4204,13 +4204,16 @@ lpfc_sli_brdreset(struct lpfc_hba *phba)
>  	/* Reset HBA */
>  	lpfc_printf_log(phba, KERN_INFO, LOG_SLI,
>  			"0325 Reset HBA Data: x%x x%x\n",
> -			phba->pport->port_state, psli->sli_flag);
> +			(phba->pport) ? phba->pport->port_state : 0,
> +			psli->sli_flag);
>  
>  	/* perform board reset */
>  	phba->fc_eventTag = 0;
>  	phba->link_events = 0;
> -	phba->pport->fc_myDID = 0;
> -	phba->pport->fc_prevDID = 0;
> +	if (phba->pport) {
> +		phba->pport->fc_myDID = 0;
> +		phba->pport->fc_prevDID = 0;
> +	}
>  
>  	/* Turn off parity checking and serr during the physical reset */
>  	pci_read_config_word(phba->pcidev, PCI_COMMAND, &cfg_value);
> @@ -4336,7 +4339,8 @@ lpfc_sli_brdrestart_s3(struct lpfc_hba *phba)
>  	/* Restart HBA */
>  	lpfc_printf_log(phba, KERN_INFO, LOG_SLI,
>  			"0337 Restart HBA Data: x%x x%x\n",
> -			phba->pport->port_state, psli->sli_flag);
> +			(phba->pport) ? phba->pport->port_state : 0,
> +			psli->sli_flag);
>  
>  	word0 = 0;
>  	mb = (MAILBOX_t *) &word0;
> @@ -4350,7 +4354,7 @@ lpfc_sli_brdrestart_s3(struct lpfc_hba *phba)
>  	readl(to_slim); /* flush */
>  
>  	/* Only skip post after fc_ffinit is completed */
> -	if (phba->pport->port_state)
> +	if (phba->pport && phba->pport->port_state)
>  		word0 = 1;	/* This is really setting up word1 */
>  	else
>  		word0 = 0;	/* This is really setting up word1 */
> @@ -4359,7 +4363,8 @@ lpfc_sli_brdrestart_s3(struct lpfc_hba *phba)
>  	readl(to_slim); /* flush */
>  
>  	lpfc_sli_brdreset(phba);
> -	phba->pport->stopped = 0;
> +	if (phba->pport)
> +		phba->pport->stopped = 0;
>  	phba->link_state = LPFC_INIT_START;
>  	phba->hba_flag = 0;
>  	spin_unlock_irq(&phba->hbalock);
> @@ -4446,7 +4451,7 @@ lpfc_sli_brdrestart(struct lpfc_hba *phba)
>   * iteration, the function will restart the HBA again. The function returns
>   * zero if HBA successfully restarted else returns negative error code.
>   **/
> -static int
> +int
>  lpfc_sli_chipset_init(struct lpfc_hba *phba)
>  {
>  	uint32_t status, i = 0;

Reviewed-by: Ewan D. Milne <emilne@redhat.com>

* Re: [PATCH v3] lpfc: Fix panic on BFS configuration.
  2017-04-27 22:08 ` jsmart2021
  (?)
@ 2017-05-08  7:10   ` Johannes Thumshirn
  -1 siblings, 0 replies; 9+ messages in thread
From: Johannes Thumshirn @ 2017-05-08  7:10 UTC (permalink / raw)
  To: jsmart2021
  Cc: linux-scsi, linux-nvme, sagi, stable, loberman, emilne,
	Dick Kennedy, James Smart

On Thu, Apr 27, 2017 at 03:08:26PM -0700, jsmart2021@gmail.com wrote:
> From: James Smart <jsmart2021@gmail.com>
> 
> To select the appropriate shost template, the driver is issuing
> a mailbox command to retrieve the wwn. Turns out the sending of
> the command precedes the reset of the function.  On SLI-4 adapters,
> this is inconsequential as the mailbox command location is specified
> by dma via the BMBX register. However, on SLI-3 adapters, the
> location of the mailbox command submission area changes. When the
> function is first powered on or reset, the cmd is submitted via PCI
> bar memory. Later the driver changes the function config to use
> host memory and DMA. The request to start a mailbox command is the
> same, a simple doorbell write, regardless of submission area.
> So.. if there has not been a boot driver run against the adapter,
> the mailbox command works as defaults are ok. But, if the boot
> driver has configured the card and, and if no platform pci
> function/slot reset occurs as the os starts, the mailbox command
> will fail. The SLI-3 device will use the stale boot driver dma
> location. This can cause PCI eeh errors.
> 
> Fix is to reset the sli-3 function before sending the
> mailbox command, thus synchronizing the function/driver on mailbox
> location.
> 
> Note: The fix uses routines that are typically invoked later in the
> call flow to reset the sli-3 device. The issue in using those routines is
> that the normal (non-fix) flow does additional initialization, namely the
> allocation of the pport structure. So, rather than significantly reworking
> the initialization flow so that the pport is alloc'd first, pointer checks
> are added to work around it. Checks are limited to the routines invoked
> by a sli-3 adapter (s3 routines) as this fix/early call is only invoked
> on a sli3 adapter. Nothing changes post the fix. Subsequent initialization,
> and another adapter reset, still occur - both on sli-3 and sli-4 adapters.
> 
> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
> Signed-off-by: James Smart <james.smart@broadcom.com>
> Fixes: 96418b5e2c88 ("scsi: lpfc: Fix eh_deadline setting for sli3 adapters.")
> Cc: stable@vger.kernel.org
> ---

Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

* Re: [PATCH v3] lpfc: Fix panic on BFS configuration.
@ 2017-05-08  7:10   ` Johannes Thumshirn
  0 siblings, 0 replies; 9+ messages in thread
From: Johannes Thumshirn @ 2017-05-08  7:10 UTC (permalink / raw)
  To: jsmart2021
  Cc: linux-scsi, linux-nvme, sagi, stable, loberman, emilne,
	Dick Kennedy, James Smart

On Thu, Apr 27, 2017 at 03:08:26PM -0700, jsmart2021@gmail.com wrote:
> From: James Smart <jsmart2021@gmail.com>
> 
> To select the appropriate shost template, the driver is issuing
> a mailbox command to retrieve the wwn. Turns out the sending of
> the command precedes the reset of the function.  On SLI-4 adapters,
> this is inconsequential as the mailbox command location is specified
> by dma via the BMBX register. However, on SLI-3 adapters, the
> location of the mailbox command submission area changes. When the
> function is first powered on or reset, the cmd is submitted via PCI
> bar memory. Later the driver changes the function config to use
> host memory and DMA. The request to start a mailbox command is the
> same, a simple doorbell write, regardless of submission area.
> So.. if there has not been a boot driver run against the adapter,
> the mailbox command works as defaults are ok. But, if the boot
> driver has configured the card and, and if no platform pci
> function/slot reset occurs as the os starts, the mailbox command
> will fail. The SLI-3 device will use the stale boot driver dma
> location. This can cause PCI eeh errors.
> 
> Fix is to reset the sli-3 function before sending the
> mailbox command, thus synchronizing the function/driver on mailbox
> location.
> 
> Note: The fix uses routines that are typically invoked later in the
> call flow to reset the sli-3 device. The issue in using those routines is
> that the normal (non-fix) flow does additional initialization, namely the
> allocation of the pport structure. So, rather than significantly reworking
> the initialization flow so that the pport is alloc'd first, pointer checks
> are added to work around it. Checks are limited to the routines invoked
> by a sli-3 adapter (s3 routines) as this fix/early call is only invoked
> on a sli3 adapter. Nothing changes post the fix. Subsequent initialization,
> and another adapter reset, still occur - both on sli-3 and sli-4 adapters.
> 
> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
> Signed-off-by: James Smart <james.smart@broadcom.com>
> Fixes: 96418b5e2c88 ("scsi: lpfc: Fix eh_deadline setting for sli3 adapters.")
> Cc: stable@vger.kernel.org
> ---

Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850


* Re: [PATCH v3] lpfc: Fix panic on BFS configuration.
  2017-04-27 22:08 ` jsmart2021
@ 2017-05-09  1:26   ` Martin K. Petersen
  -1 siblings, 0 replies; 9+ messages in thread
From: Martin K. Petersen @ 2017-05-09  1:26 UTC (permalink / raw)
  To: jsmart2021
  Cc: linux-scsi, linux-nvme, sagi, stable, loberman, jthumshirn,
	emilne, Dick Kennedy, James Smart


James,

> Fix is to reset the sli-3 function before sending the mailbox command,
> thus synchronizing the function/driver on mailbox location.

Applied to 4.12/scsi-fixes.

-- 
Martin K. Petersen	Oracle Linux Engineering


end of thread, other threads:[~2017-05-09  1:26 UTC | newest]

Thread overview: 9+ messages
2017-04-27 22:08 [PATCH v3] lpfc: Fix panic on BFS configuration jsmart2021
2017-04-27 22:08 ` jsmart2021
2017-04-28 20:36 ` Ewan D. Milne
2017-04-28 20:36   ` Ewan D. Milne
2017-05-08  7:10 ` Johannes Thumshirn
2017-05-08  7:10   ` Johannes Thumshirn
2017-05-08  7:10   ` Johannes Thumshirn
2017-05-09  1:26 ` Martin K. Petersen
2017-05-09  1:26   ` Martin K. Petersen
