* [V1 0/6] mpt3sas: Adding MPI Endpoint device support.
From: Suganath Prabu S @ 2018-02-07 10:51 UTC (permalink / raw)
  To: linux-scsi, linux-nvme
  Cc: Sathya.Prakash, sreekanth.reddy, chaitra.basappa, Suganath Prabu S

V1 Change info:

* Fixed a few sparse warnings from the initial patch set.
* On 32-bit architectures the _base_writeq function is identical to
  _base_mpi_ep_writeq, so the duplicate code was removed as suggested
  by Martin.

Andromeda is a PCIe switch with a dedicated management CPU (mCPU),
nonvolatile flash memory, RAM, etc. The Linux kernel runs on the
mCPU. The MPI Endpoint driver is the management driver for Andromeda.

The PLX Management driver running on the mCPU synthesizes a
virtual/synthetic MPI Endpoint for the host. The synthetic MPI
Endpoint is emulated IT firmware running on the Linux operating
system, which interfaces with the PLX Management driver.

The PLX Management driver integrates the IOC FW in the same driver
binary. At the end of the Plx_Mgr driver load, it initializes the IOC
FW as well. The current implementation has a single instance of the
IOC FW (as it supports only one host).

The PLX Management driver provides the required resources and
infrastructure for the synthetic MPI Endpoint.

The existing PLX Management driver reserves a virtual slot for the
MPI Endpoint. Currently, virtual slot number 29 is reserved for it.

The synthetic device in the management driver is marked with the new
type "PLX_DEV_TYPE_SYNTH_MPI_EP". The PLX Management driver interfaces
with the synthetic MPI Endpoint for any communication from the host
on a PLX_DEV_TYPE_SYNTH_MPI_EP device.

The link between the host and the PLX C2 is shown in the diagram below.

                                 _______________
 _______________                |               |
|               |               |               |
| PLX C2        |===============|    HOST       |
| PCI -         |===============|   MACHINE	|
|  SWITCH       |             	|               |
|_______________|               |               |
        ||                      |_______________|
	||
	||
 _______||______
|               |
|  MCPU         |
|               |
|_______________|



After the MPI Endpoint implementation, the host sees a dedicated
virtual slot as the MPI Endpoint. In the diagram below, the single
line is the logical channel for the MPI Endpoint.
                                 _______________
 _______________                |               |
|               |               |               |
| PLX C2        |===============|   HOST        |
| PCI -         |===============|   MACHINE     |
|  SWITCH       |               |               |
|               |               |  -----------	|
|_______________|---------------| | IT DRIVER | |
        ||  |			|  -----------  |
        ||  |			|_______________|
        ||  |
        ||  |
 _______||__|___________
|       ||  |           |
|      MCPU |           |
|        ___|____   	|
|	| PLX MGR|	|
|	| DRIVER |  	|
|	|________|  	|
|           |	 	|
|        ___|_____	|
|	|	  |	|
|	|IOC FW   |	|
|       |_________|	|
|_______________________|

The PLX Management driver creates the MPI Endpoint based on the
device table definition. It also populates the synthetic device tree
for each host based on the device table.

From the host it is seen as an IT HBA (a simplified version of
SAS2/MPI2): a PCI device in which the emulated IT FW runs on the mCPU
behind the synthetic endpoint of the PCI switch. The host treats it as
an actual physical device.

The PLX Management driver provides an interface to DMA from the mCPU
to the host using the "MCPU Response Data Buffer" method. DMA from the
host to the mCPU using the "MCPU Response Data Buffer" is not possible.

Why is DMA from the host to the mCPU not possible using the response
buffer? The MCPU Response Buffer is not really meant for reading from
the host. Reading will work, but the answer TLP will not come back to
the CSR FIFO; it will go to the mCPU root complex instead, which could
be an unexpected read completion.

The existing host driver (mpt2sas) will not work for the MPI Endpoint,
because the interface to DMA from the host to the mCPU is not present
for the mCPU/MPI Endpoint device. To overcome this, the driver should
double-copy the data buffers and SGLs directly to the mCPU memory
region via the BAR0 region.
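
The double copy described above can be pictured with a small user-space
sketch. It is only an illustration under stated assumptions: `bar0`
is a plain byte array standing in for the ioremap()ed BAR0 window (a
kernel driver would use memcpy_toio() rather than memcpy()), the name
ep_clone_to_bar0 is hypothetical, and the zero-based `slot` index is
an assumption, since this cover letter does not say how an smid maps
to a host buffer slot. The start offset and slot size come from the
BAR0 layout in this cover letter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EP_HOST_BUF_START 17152u         /* start of the host buffer region */
#define EP_HOST_BUF_SIZE  (64u * 1024u)  /* 64K max I/O per slot */

/*
 * Copy a host data buffer into the BAR0 slot reserved for one request.
 * Returns the BAR0 offset the payload was written to, or 0 if the
 * payload does not fit (0 is never a valid host buffer offset here).
 */
static size_t ep_clone_to_bar0(uint8_t *bar0, size_t bar0_len,
			       unsigned int slot,
			       const void *data, size_t len)
{
	size_t off = EP_HOST_BUF_START + (size_t)slot * EP_HOST_BUF_SIZE;

	assert(bar0 != NULL && data != NULL);

	/* Reject payloads larger than a slot or past the mapped window. */
	if (len > EP_HOST_BUF_SIZE || off + len > bar0_len)
		return 0;

	/* Stand-in for memcpy_toio() on a real BAR mapping. */
	memcpy(bar0 + off, data, len);
	return off;
}
```

The returned offset is what the request frame would have to carry so
that the firmware side can locate the payload inside its own memory.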

The host BAR0 region is divided into different groups to serve
host-assisted DMA.

 0     - 255     System registers (Doorbell, Host Interrupt, etc.)
 256   - 4352    MPI frames (based on maxCredit 32)
 4352  - 4864    Reply_free pool (512 bytes reserved considering
                 maxCredit 32; the reply needs extra room, so for the
                 mCPU case it is kept at four times maxCredit)
 4864  - 17152   SGE chain elements
                 (32 commands * 3 chains of 128 bytes each = 12288)
 17152 - x       Host buffers mapped by smid
                 (each smid can have 64K max I/O)
 BAR0 + last 1K  MSI-X address and data

Total size in use is 2113664 bytes of the 4MB BAR0. The MPI Endpoint
module of the PLX Management driver must be aware of the regions above.

SGE and host buffer details will be available in the MPI frame.
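
As a cross-check of the region boundaries listed above, the offsets
can be recomputed from the sizes the table gives. This is a sketch,
not driver code: all names are hypothetical, the 128-byte frame size
is inferred from 4096 bytes of MPI frames for maxCredit 32, and the
zero-based slot index for host buffers is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Sizes taken from the BAR0 layout table; names are illustrative. */
enum {
	EP_SYS_REG_SIZE    = 256,       /* doorbell, host interrupt, ... */
	EP_MAX_CREDIT      = 32,        /* maxCredit assumed by the table */
	EP_FRAME_SIZE      = 128,       /* bytes per MPI frame or chain */
	EP_REPLY_FREE_SIZE = 512,       /* reserved reply_free pool */
	EP_CHAINS_PER_CMD  = 3,         /* chain elements per command */
	EP_HOST_BUF_SIZE   = 64 * 1024  /* max I/O per smid */
};

struct ep_bar0_layout {
	size_t mpi_frame;   /* start of MPI frames */
	size_t reply_free;  /* start of reply_free pool */
	size_t sge_chain;   /* start of SGE chain elements */
	size_t host_buf;    /* start of per-smid host buffers */
};

/* Derive each region start by accumulating the preceding region sizes. */
static struct ep_bar0_layout ep_bar0_layout(void)
{
	struct ep_bar0_layout l;

	l.mpi_frame  = EP_SYS_REG_SIZE;
	l.reply_free = l.mpi_frame + (size_t)EP_MAX_CREDIT * EP_FRAME_SIZE;
	l.sge_chain  = l.reply_free + EP_REPLY_FREE_SIZE;
	l.host_buf   = l.sge_chain +
		(size_t)EP_MAX_CREDIT * EP_CHAINS_PER_CMD * EP_FRAME_SIZE;
	return l;
}

/* Offset of the host buffer for a zero-based slot (assumed mapping). */
static size_t ep_host_buf_offset(unsigned int slot)
{
	assert(slot < EP_MAX_CREDIT);
	return ep_bar0_layout().host_buf + (size_t)slot * EP_HOST_BUF_SIZE;
}
```

With these sizes the computed starts are 256, 4352, 4864 and 17152,
which agrees with the table.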

Each PCI packet coming from the host on the MPI Endpoint ends up in
the mCPU PLX Management driver, which can be considered the front end
for the IOC FW. The PLX Management driver calls the IOC front-end API,
which is the entry point into the IOC FW module. Once the PLX
Management driver calls the relevant callback from the IOC FW, the
rest of the processing is handled within the IOC FW. The IOC FW
should release the TLP packet as soon as possible to avoid any TLP
timeout.

Suganath Prabu S (6):
  mpt3sas: Add PCI device ID for Andromeda.
  mpt3sas: Configure reply post queue depth, DMA and sgl tablesize.
  mpt3sas: Introduce API's to get BAR0 mapped buffer address.
  mpt3sas: Introduce Base function for cloning.
  mpt3sas: Introduce function to clone mpi request.
  mpt3sas: Introduce function to clone mpi reply.

 drivers/scsi/mpt3sas/mpi/mpi2_cnfg.h             |    1 +
 drivers/scsi/mpt3sas/mpt3sas_base.c              |  528 ++++++++-
 drivers/scsi/mpt3sas/mpt3sas_base.h              |    6 +
 drivers/scsi/mpt3sas/mpt3sas_config.c            |    1 +
 drivers/scsi/mpt3sas/mpt3sas_scsih.c             |   54 +-
 16 files changed, 540 insertions(+), 9291 deletions(-)

Thanks,
Suganath Prabu S
-- 
2.5.5

* [V1 1/6] mpt3sas: Add PCI device ID for Andromeda.
From: Suganath Prabu S @ 2018-02-07 10:51 UTC (permalink / raw)
  To: linux-scsi, linux-nvme
  Cc: Sathya.Prakash, sreekanth.reddy, chaitra.basappa, Suganath Prabu S

Add device ID and flag for Andromeda/MPI Endpoint.

Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
---
 drivers/scsi/mpt3sas/mpi/mpi2_cnfg.h |  1 +
 drivers/scsi/mpt3sas/mpt3sas_base.h  |  1 +
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 14 ++++++++++++--
 3 files changed, 14 insertions(+), 2 deletions(-)
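
As a reading aid for the _scsih_probe() hunk in this patch, the
per-device flag selection can be sketched in user space. Only 0x02B0
comes from this patch; the SSS6200 device ID and the
EXPOSE_ALL_DISKS value below are illustrative assumptions, and
struct ep_flags / ep_set_flags are hypothetical names standing in
for the fields of struct MPT3SAS_ADAPTER.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEVID_SAS2308_MPI_EP 0x02B0  /* from this patch */
#define DEVID_SSS6200        0x007E  /* illustrative value (assumption) */
#define EXPOSE_ALL_DISKS     0x01    /* illustrative value (assumption) */

struct ep_flags {
	uint8_t is_warpdrive;
	uint8_t is_mcpu_endpoint;
	uint8_t hide_ir_msg;
	uint8_t mfg_pg10_hide_flag;
};

/* Mirror of the switch introduced in the probe path for MPI2 devices. */
static void ep_set_flags(struct ep_flags *f, uint16_t device)
{
	assert(f != NULL);
	memset(f, 0, sizeof(*f));

	switch (device) {
	case DEVID_SSS6200:
		f->is_warpdrive = 1;
		f->hide_ir_msg = 1;
		break;
	case DEVID_SAS2308_MPI_EP:
		f->is_mcpu_endpoint = 1;
		break;
	default:
		f->mfg_pg10_hide_flag = EXPOSE_ALL_DISKS;
		break;
	}
}
```

Note that the endpoint case sets only is_mcpu_endpoint, so unlike the
default case it leaves mfg_pg10_hide_flag untouched.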

diff --git a/drivers/scsi/mpt3sas/mpi/mpi2_cnfg.h b/drivers/scsi/mpt3sas/mpi/mpi2_cnfg.h
index ee11710..0ad88de 100644
--- a/drivers/scsi/mpt3sas/mpi/mpi2_cnfg.h
+++ b/drivers/scsi/mpt3sas/mpi/mpi2_cnfg.h
@@ -524,6 +524,7 @@ typedef struct _MPI2_CONFIG_REPLY {
 #define MPI2_MFGPAGE_DEVID_SAS2308_1                (0x0086)
 #define MPI2_MFGPAGE_DEVID_SAS2308_2                (0x0087)
 #define MPI2_MFGPAGE_DEVID_SAS2308_3                (0x006E)
+#define MPI2_MFGPAGE_DEVID_SAS2308_MPI_EP           (0x02B0)
 
 /*MPI v2.5 SAS products */
 #define MPI25_MFGPAGE_DEVID_SAS3004                 (0x0096)
diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
index 789bc42..897394d 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
@@ -1336,6 +1336,7 @@ struct MPT3SAS_ADAPTER {
 	u32		ring_buffer_offset;
 	u32		ring_buffer_sz;
 	u8		is_warpdrive;
+	u8		is_mcpu_endpoint;
 	u8		hide_ir_msg;
 	u8		mfg_pg10_hide_flag;
 	u8		hide_drives;
diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 74fca18..bde3c6f 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -10335,6 +10335,7 @@ _scsih_determine_hba_mpi_version(struct pci_dev *pdev)
 	case MPI2_MFGPAGE_DEVID_SAS2308_1:
 	case MPI2_MFGPAGE_DEVID_SAS2308_2:
 	case MPI2_MFGPAGE_DEVID_SAS2308_3:
+	case MPI2_MFGPAGE_DEVID_SAS2308_MPI_EP:
 		return MPI2_VERSION;
 	case MPI25_MFGPAGE_DEVID_SAS3004:
 	case MPI25_MFGPAGE_DEVID_SAS3008:
@@ -10412,11 +10413,18 @@ _scsih_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 		ioc->hba_mpi_version_belonged = hba_mpi_version;
 		ioc->id = mpt2_ids++;
 		sprintf(ioc->driver_name, "%s", MPT2SAS_DRIVER_NAME);
-		if (pdev->device == MPI2_MFGPAGE_DEVID_SSS6200) {
+		switch (pdev->device) {
+		case MPI2_MFGPAGE_DEVID_SSS6200:
 			ioc->is_warpdrive = 1;
 			ioc->hide_ir_msg = 1;
-		} else
+			break;
+		case MPI2_MFGPAGE_DEVID_SAS2308_MPI_EP:
+			ioc->is_mcpu_endpoint = 1;
+			break;
+		default:
 			ioc->mfg_pg10_hide_flag = MFG_PAGE10_EXPOSE_ALL_DISKS;
+			break;
+		}
 		break;
 	case MPI25_VERSION:
 	case MPI26_VERSION:
@@ -10845,6 +10853,8 @@ static const struct pci_device_id mpt3sas_pci_table[] = {
 		PCI_ANY_ID, PCI_ANY_ID },
 	{ MPI2_MFGPAGE_VENDORID_LSI, MPI2_MFGPAGE_DEVID_SAS2308_3,
 		PCI_ANY_ID, PCI_ANY_ID },
+	{ MPI2_MFGPAGE_VENDORID_LSI, MPI2_MFGPAGE_DEVID_SAS2308_MPI_EP,
+		PCI_ANY_ID, PCI_ANY_ID },
 	/* SSS6200 */
 	{ MPI2_MFGPAGE_VENDORID_LSI, MPI2_MFGPAGE_DEVID_SSS6200,
 		PCI_ANY_ID, PCI_ANY_ID },
-- 
2.5.5

* [V1 2/6] mpt3sas: Configure reply post queue depth, DMA and sgl tablesize.
From: Suganath Prabu S @ 2018-02-07 10:51 UTC (permalink / raw)
  To: linux-scsi, linux-nvme
  Cc: Sathya.Prakash, sreekanth.reddy, chaitra.basappa, Suganath Prabu S

For the MPI Endpoint, this configures the shost max sectors to 128,
a single reply descriptor post queue, an sgl table size of 16, and
32-bit DMA. The MPI Endpoint supports 64K as its max I/O.

Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
---
 drivers/scsi/mpt3sas/mpt3sas_base.c  | 47 +++++++++++++++++++++++-------------
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 40 ++++++++++++++++++------------
 2 files changed, 54 insertions(+), 33 deletions(-)
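
The queue-depth and max I/O arithmetic in this patch can be checked
with a short user-space sketch. The function names are hypothetical,
the 512-byte sector size is an assumption, and the sketch leaves out
the later clamp against MaxReplyDescriptorPostQueueDepth that the
driver still applies.

```c
#include <assert.h>

#define FW_EVENT_REPLIES 64u   /* extra reply_free entries for FW events */
#define SECTOR_SIZE      512u  /* assumption: 512-byte sectors */

/* Round v up to the next multiple of align. */
static unsigned int round_up_to(unsigned int v, unsigned int align)
{
	unsigned int rem;

	assert(align != 0);
	rem = v % align;
	return rem ? v + (align - rem) : v;
}

/* Reply post queue depth as computed by the hunk in this patch. */
static unsigned int reply_post_queue_depth(unsigned int hba_queue_depth,
					   int is_mcpu_endpoint)
{
	unsigned int reply_free = hba_queue_depth + FW_EVENT_REPLIES;

	/* The mCPU endpoint reuses the reply_free depth unchanged. */
	if (is_mcpu_endpoint)
		return reply_free;

	/* Other HBAs: sum of both depths plus one, aligned to 16. */
	return round_up_to(hba_queue_depth + reply_free + 1, 16);
}

/* Largest I/O in bytes for a given max_sectors setting. */
static unsigned int max_io_bytes(unsigned int max_sectors)
{
	return max_sectors * SECTOR_SIZE;
}
```

For a queue depth of 32 this gives 96 for the endpoint and 144 for a
regular HBA, and max_sectors = 128 corresponds to 65536 bytes, which
is the 64K limit named in the commit message.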

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 13d6e4e..f45da9a 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -2214,6 +2214,9 @@ _base_config_dma_addressing(struct MPT3SAS_ADAPTER *ioc, struct pci_dev *pdev)
 	struct sysinfo s;
 	u64 consistent_dma_mask;
 
+	if (ioc->is_mcpu_endpoint)
+		goto try_32bit;
+
 	if (ioc->dma_mask)
 		consistent_dma_mask = DMA_BIT_MASK(64);
 	else
@@ -2232,6 +2235,7 @@ _base_config_dma_addressing(struct MPT3SAS_ADAPTER *ioc, struct pci_dev *pdev)
 		}
 	}
 
+ try_32bit:
 	if (!pci_set_dma_mask(pdev, DMA_BIT_MASK(32))
 	    && !pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32))) {
 		ioc->base_add_sg_single = &_base_add_sg_single_32;
@@ -3887,17 +3891,21 @@ _base_allocate_memory_pools(struct MPT3SAS_ADAPTER *ioc)
 		sg_tablesize = min_t(unsigned short, sg_tablesize,
 		   MPT_KDUMP_MIN_PHYS_SEGMENTS);
 
-	if (sg_tablesize < MPT_MIN_PHYS_SEGMENTS)
-		sg_tablesize = MPT_MIN_PHYS_SEGMENTS;
-	else if (sg_tablesize > MPT_MAX_PHYS_SEGMENTS) {
-		sg_tablesize = min_t(unsigned short, sg_tablesize,
-				      SG_MAX_SEGMENTS);
-		pr_warn(MPT3SAS_FMT
-		 "sg_tablesize(%u) is bigger than kernel"
-		 " defined SG_CHUNK_SIZE(%u)\n", ioc->name,
-		 sg_tablesize, MPT_MAX_PHYS_SEGMENTS);
+	if (ioc->is_mcpu_endpoint)
+		ioc->shost->sg_tablesize = MPT_MIN_PHYS_SEGMENTS;
+	else {
+		if (sg_tablesize < MPT_MIN_PHYS_SEGMENTS)
+			sg_tablesize = MPT_MIN_PHYS_SEGMENTS;
+		else if (sg_tablesize > MPT_MAX_PHYS_SEGMENTS) {
+			sg_tablesize = min_t(unsigned short, sg_tablesize,
+					SG_MAX_SEGMENTS);
+			pr_warn(MPT3SAS_FMT
+				"sg_tablesize(%u) is bigger than kernel "
+				"defined SG_CHUNK_SIZE(%u)\n", ioc->name,
+				sg_tablesize, MPT_MAX_PHYS_SEGMENTS);
+		}
+		ioc->shost->sg_tablesize = sg_tablesize;
 	}
-	ioc->shost->sg_tablesize = sg_tablesize;
 
 	ioc->internal_depth = min_t(int, (facts->HighPriorityCredit + (5)),
 		(facts->RequestCredit / 4));
@@ -3982,13 +3990,18 @@ _base_allocate_memory_pools(struct MPT3SAS_ADAPTER *ioc)
 	/* reply free queue sizing - taking into account for 64 FW events */
 	ioc->reply_free_queue_depth = ioc->hba_queue_depth + 64;
 
-	/* calculate reply descriptor post queue depth */
-	ioc->reply_post_queue_depth = ioc->hba_queue_depth +
-				ioc->reply_free_queue_depth +  1 ;
-	/* align the reply post queue on the next 16 count boundary */
-	if (ioc->reply_post_queue_depth % 16)
-		ioc->reply_post_queue_depth += 16 -
-		(ioc->reply_post_queue_depth % 16);
+	/* mCPU manage single counters for simplicity */
+	if (ioc->is_mcpu_endpoint)
+		ioc->reply_post_queue_depth = ioc->reply_free_queue_depth;
+	else {
+		/* calculate reply descriptor post queue depth */
+		ioc->reply_post_queue_depth = ioc->hba_queue_depth +
+			ioc->reply_free_queue_depth +  1;
+		/* align the reply post queue on the next 16 count boundary */
+		if (ioc->reply_post_queue_depth % 16)
+			ioc->reply_post_queue_depth += 16 -
+				(ioc->reply_post_queue_depth % 16);
+	}
 
 	if (ioc->reply_post_queue_depth >
 	    facts->MaxReplyDescriptorPostQueueDepth) {
diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index bde3c6f..5e52679 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -10521,26 +10521,34 @@ _scsih_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	shost->transportt = mpt3sas_transport_template;
 	shost->unique_id = ioc->id;
 
-	if (max_sectors != 0xFFFF) {
-		if (max_sectors < 64) {
-			shost->max_sectors = 64;
-			pr_warn(MPT3SAS_FMT "Invalid value %d passed " \
-			    "for max_sectors, range is 64 to 32767. Assigning "
-			    "value of 64.\n", ioc->name, max_sectors);
-		} else if (max_sectors > 32767) {
-			shost->max_sectors = 32767;
-			pr_warn(MPT3SAS_FMT "Invalid value %d passed " \
-			    "for max_sectors, range is 64 to 32767. Assigning "
-			    "default value of 32767.\n", ioc->name,
-			    max_sectors);
-		} else {
-			shost->max_sectors = max_sectors & 0xFFFE;
-			pr_info(MPT3SAS_FMT
+	if (ioc->is_mcpu_endpoint) {
+		/* mCPU MPI support 64K max IO */
+		shost->max_sectors = 128;
+		pr_info(MPT3SAS_FMT
 				"The max_sectors value is set to %d\n",
 				ioc->name, shost->max_sectors);
+	} else {
+		if (max_sectors != 0xFFFF) {
+			if (max_sectors < 64) {
+				shost->max_sectors = 64;
+				pr_warn(MPT3SAS_FMT "Invalid value %d passed " \
+				    "for max_sectors, range is 64 to 32767. " \
+				    "Assigning value of 64.\n", \
+				    ioc->name, max_sectors);
+			} else if (max_sectors > 32767) {
+				shost->max_sectors = 32767;
+				pr_warn(MPT3SAS_FMT "Invalid value %d passed " \
+				    "for max_sectors, range is 64 to 32767. " \
+				    "Assigning default value of 32767.\n", \
+				    ioc->name, max_sectors);
+			} else {
+				shost->max_sectors = max_sectors & 0xFFFE;
+				pr_info(MPT3SAS_FMT
+					"The max_sectors value is set to %d\n",
+					ioc->name, shost->max_sectors);
+			}
 		}
 	}
-
 	/* register EEDP capabilities with SCSI layer */
 	if (prot_mask > 0)
 		scsi_host_set_prot(shost, prot_mask);
-- 
2.5.5

* [V1 3/6] mpt3sas: Introduce API's to get BAR0 mapped buffer address.
From: Suganath Prabu S @ 2018-02-07 10:51 UTC (permalink / raw)
  To: linux-scsi, linux-nvme
  Cc: Sathya.Prakash, sreekanth.reddy, chaitra.basappa, Suganath Prabu S

For the MPI Endpoint/mCPU, the driver should double-buffer data
buffers and SGLs. These are normally copied from the host to the
internal memory of the IOC by the DMA engine of the PCI device. Since
the interface to DMA from the host to the mCPU is not present for the
mCPU/MPI Endpoint device, the driver double-copies those buffers
directly to the mCPU memory region via the BAR0 region.

Introduce APIs to calculate and return the BAR0-mapped host buffer's
physical and virtual addresses for the provided smid.

Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
---
 drivers/scsi/mpt3sas/mpt3sas_base.c | 93 +++++++++++++++++++++++++++++++++++++
 drivers/scsi/mpt3sas/mpt3sas_base.h |  2 +
 2 files changed, 95 insertions(+)
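
The offset arithmetic in the helpers below may be easier to check in
isolation. What follows is a minimal user-space sketch, not driver
code: `struct ex_bar0_layout` and both function names are hypothetical,
plain integer offsets stand in for the `__iomem` pointers, and the
sample values in the note (RequestCredit 32, request_sz 128,
MaxChainDepth 3) are assumptions taken from the layout comment in
patch 4/6.

```c
#include <assert.h>
#include <stdint.h>

#define MPI_FRAME_START_OFFSET	256u
#define REPLY_FREE_POOL_SIZE	512u
#define EX_HOST_BUF_SIZE	(64u * 1024u)	/* per-smid host buffer window */

/* Hypothetical stand-in for the ioc fields the helpers read. */
struct ex_bar0_layout {
	uint32_t request_credit;	/* ioc->facts.RequestCredit */
	uint32_t request_sz;		/* ioc->request_sz */
	uint32_t max_chain_depth;	/* ioc->facts.MaxChainDepth */
};

/* Byte offset from the start of BAR0, mirroring _base_get_chain(). */
static uint32_t ex_chain_offset(const struct ex_bar0_layout *l,
				uint32_t smid, uint32_t sge_chain_count)
{
	uint32_t cmd_credit = l->request_credit + 1;
	uint32_t base = MPI_FRAME_START_OFFSET +
			cmd_credit * l->request_sz + REPLY_FREE_POOL_SIZE;

	/* A count equal to max_chain_depth addresses the end of the slot. */
	assert(sge_chain_count <= l->max_chain_depth);
	return base + smid * l->max_chain_depth * l->request_sz +
	       sge_chain_count * l->request_sz;
}

/* Byte offset from the start of BAR0, mirroring _base_get_buffer_bar0(). */
static uint32_t ex_host_buffer_offset(const struct ex_bar0_layout *l,
				      uint32_t smid)
{
	uint32_t cmd_credit = l->request_credit + 1;
	uint32_t chain_end = ex_chain_offset(l, cmd_credit + 1,
					     l->max_chain_depth);

	return chain_end + smid * EX_HOST_BUF_SIZE;
}
```

With the assumed values, `ex_chain_offset(&l, 0, 0)` evaluates to 4992
and `ex_host_buffer_offset(&l, 0)` to 18432. These figures follow only
from the formulas in this patch under those assumptions; they are not
confirmed against hardware.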

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
index f45da9a..36f1242 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -126,6 +126,99 @@ module_param_call(mpt3sas_fwfault_debug, _scsih_set_fwfault_debug,
 	param_get_int, &mpt3sas_fwfault_debug, 0644);
 
 /**
+ * _base_get_chain - Calculates and Returns virtual chain address
+ *			 for the provided smid in BAR0 space.
+ *
+ * @ioc: per adapter object
+ * @smid: system request message index
+ * @sge_chain_count: Scatter gather chain count.
+ *
+ * @Return: chain address.
+ */
+static inline void __iomem*
+_base_get_chain(struct MPT3SAS_ADAPTER *ioc, u16 smid,
+		u8 sge_chain_count)
+{
+	void __iomem *base_chain, *chain_virt;
+	u16 cmd_credit = ioc->facts.RequestCredit + 1;
+
+	base_chain  = (void __iomem *)ioc->chip + MPI_FRAME_START_OFFSET +
+		(cmd_credit * ioc->request_sz) +
+		REPLY_FREE_POOL_SIZE;
+	chain_virt = base_chain + (smid * ioc->facts.MaxChainDepth *
+			ioc->request_sz) + (sge_chain_count * ioc->request_sz);
+	return chain_virt;
+}
+
+/**
+ * _base_get_chain_phys - Calculates and Returns physical address
+ *			in BAR0 for scatter gather chains, for
+ *			the provided smid.
+ *
+ * @ioc: per adapter object
+ * @smid: system request message index
+ * @sge_chain_count: Scatter gather chain count.
+ *
+ * @Return - Physical chain address.
+ */
+static inline void *
+_base_get_chain_phys(struct MPT3SAS_ADAPTER *ioc, u16 smid,
+		u8 sge_chain_count)
+{
+	void *base_chain_phys, *chain_phys;
+	u16 cmd_credit = ioc->facts.RequestCredit + 1;
+
+	base_chain_phys  = (void *)ioc->chip_phys + MPI_FRAME_START_OFFSET +
+		(cmd_credit * ioc->request_sz) +
+		REPLY_FREE_POOL_SIZE;
+	chain_phys = base_chain_phys + (smid * ioc->facts.MaxChainDepth *
+			ioc->request_sz) + (sge_chain_count * ioc->request_sz);
+	return chain_phys;
+}
+
+/**
+ * _base_get_buffer_bar0 - Calculates and Returns BAR0 mapped Host
+ *			buffer address for the provided smid.
+ *			(Each smid can have up to 64K, starting from 17024)
+ *
+ * @ioc: per adapter object
+ * @smid: system request message index
+ *
+ * @Returns - Pointer to buffer location in BAR0.
+ */
+
+static void __iomem *
+_base_get_buffer_bar0(struct MPT3SAS_ADAPTER *ioc, u16 smid)
+{
+	u16 cmd_credit = ioc->facts.RequestCredit + 1;
+	/* Added extra 1 to reach end of chain. */
+	void __iomem *chain_end = _base_get_chain(ioc,
+			cmd_credit + 1,
+			ioc->facts.MaxChainDepth);
+	return chain_end + (smid * 64 * 1024);
+}
+
+/**
+ * _base_get_buffer_phys_bar0 - Calculates and Returns BAR0 mapped
+ *		Host buffer Physical address for the provided smid.
+ *		(Each smid can have up to 64K, starting from 17024)
+ *
+ * @ioc: per adapter object
+ * @smid: system request message index
+ *
+ * @Returns - Pointer to buffer location in BAR0.
+ */
+static void *
+_base_get_buffer_phys_bar0(struct MPT3SAS_ADAPTER *ioc, u16 smid)
+{
+	u16 cmd_credit = ioc->facts.RequestCredit + 1;
+	void *chain_end_phys = _base_get_chain_phys(ioc,
+			cmd_credit + 1,
+			ioc->facts.MaxChainDepth);
+	return chain_end_phys + (smid * 64 * 1024);
+}
+
+/**
  *  mpt3sas_remove_dead_ioc_func - kthread context to remove dead ioc
  * @arg: input argument, used to derive ioc
  *
diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
index 897394d..2529d25 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
@@ -120,6 +120,8 @@
 #define MPT3SAS_NVME_QUEUE_DEPTH	128
 #define MPT_NAME_LENGTH			32	/* generic length of strings */
 #define MPT_STRING_LENGTH		64
+#define MPI_FRAME_START_OFFSET		256
+#define REPLY_FREE_POOL_SIZE		512 /*(32 maxcredit *4)*(4 times)*/
 
 #define MPT_MAX_CALLBACKS		32
 
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [V1 4/6] mpt3sas: Introduce Base function for cloning.
  2018-02-07 10:51 ` Suganath Prabu S
@ 2018-02-07 10:51   ` Suganath Prabu S
  -1 siblings, 0 replies; 16+ messages in thread
From: Suganath Prabu S @ 2018-02-07 10:51 UTC (permalink / raw)
  To: linux-scsi, linux-nvme
  Cc: Sathya.Prakash, sreekanth.reddy, chaitra.basappa, Suganath Prabu S

For all SCSI IO and config requests, the data buffers and SGLs
are cloned to system memory (the BAR0 region) in _clone_sg_entries
before the request is submitted to firmware.

Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
---
 drivers/scsi/mpt3sas/mpt3sas_base.c   | 215 +++++++++++++++++++++++++++++++++-
 drivers/scsi/mpt3sas/mpt3sas_base.h   |   3 +
 drivers/scsi/mpt3sas/mpt3sas_config.c |   1 +
 3 files changed, 218 insertions(+), 1 deletion(-)
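
_clone_sg_entries relies on the packing of the 32-bit FlagsLength
field: flags in the top 8 bits, byte length in the low 24 bits. The
sketch below shows that decoding on its own. All `EX_`/`ex_` names are
hypothetical, and the numeric flag values are assumptions that I
believe match mpi2.h but that should be checked against the header
before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, believed to mirror the MPI2_SGE_* macros in mpi2.h. */
#define EX_SGE_FLAGS_SHIFT	24
#define EX_SGE_LENGTH_MASK	0x00ffffffu
#define EX_SGE_ELEMENT_MASK	0x30u
#define EX_SGE_SIMPLE_ELEMENT	0x10u
#define EX_SGE_CHAIN_ELEMENT	0x30u
#define EX_SGE_HOST_TO_IOC	0x04u
#define EX_SGE_END_OF_BUFFER	0x40u
#define EX_HOST_BUF_SIZE	(64u * 1024u)	/* per-smid BAR0 window */

struct ex_sge {
	uint32_t element_type;	/* simple or chain element */
	int is_write;		/* data flows from host to IOC */
	int end_of_buffer;	/* last element describing the buffer */
	uint32_t length;	/* byte length of this element */
};

/* Split one FlagsLength word into the fields the clone loop tests. */
static struct ex_sge ex_decode_flags_length(uint32_t flags_length)
{
	uint32_t flags = flags_length >> EX_SGE_FLAGS_SHIFT;
	struct ex_sge d;

	d.element_type = flags & EX_SGE_ELEMENT_MASK;
	d.is_write = (flags & EX_SGE_HOST_TO_IOC) != 0;
	d.end_of_buffer = (flags & EX_SGE_END_OF_BUFFER) != 0;
	d.length = flags_length & EX_SGE_LENGTH_MASK;
	return d;
}

/* Advance the running offset inside one smid's window, with a bound check. */
static uint32_t ex_advance(uint32_t offset, uint32_t flags_length)
{
	uint32_t next = offset + (flags_length & EX_SGE_LENGTH_MASK);

	assert(next <= EX_HOST_BUF_SIZE);
	return next;
}
```

The bound check in `ex_advance` is an addition of this sketch; the
patch advances buff_ptr by the element length without one.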

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 36f1242..c41c65b 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -126,6 +126,24 @@ module_param_call(mpt3sas_fwfault_debug, _scsih_set_fwfault_debug,
 	param_get_int, &mpt3sas_fwfault_debug, 0644);
 
 /**
+ * _base_clone_to_sys_mem - Writes/copies data to system/BAR0 region
+ *
+ * @dst_iomem: Pointer to the destination location in BAR0 space.
+ * @src: Pointer to the Source data.
+ * @size: Size of data to be copied.
+ */
+static void
+_base_clone_to_sys_mem(void __iomem *dst_iomem, void *src, u32 size)
+{
+	int i;
+	u32 *src_virt_mem = (u32 *)(src);
+
+	for (i = 0; i < size/4; i++)
+		writel((u32)src_virt_mem[i],
+			(void __iomem *)dst_iomem + (i * 4));
+}
+
+/**
  * _base_get_chain - Calculates and Returns virtual chain address
  *			 for the provided smid in BAR0 space.
  *
@@ -219,6 +237,201 @@ _base_get_buffer_phys_bar0(struct MPT3SAS_ADAPTER *ioc, u16 smid)
 }
 
 /**
+ * _base_get_chain_buffer_dma_to_chain_buffer - Iterates chain
+ *			lookup list and Provides chain_buffer
+ *			address for the matching dma address.
+ *			(Each smid can have 64K starts from 17024)
+ *
+ * @ioc: per adapter object
+ * @chain_buffer_dma: Chain buffer dma address.
+ *
+ * @Returns - Pointer to chain buffer. Or Null on Failure.
+ */
+static void *
+_base_get_chain_buffer_dma_to_chain_buffer(struct MPT3SAS_ADAPTER *ioc,
+		dma_addr_t chain_buffer_dma)
+{
+	u16 index;
+
+	for (index = 0; index < ioc->chain_depth; index++) {
+		if (ioc->chain_lookup[index].chain_buffer_dma ==
+				chain_buffer_dma)
+			return ioc->chain_lookup[index].chain_buffer;
+	}
+	pr_info(MPT3SAS_FMT
+	    "Provided chain_buffer_dma address is not in the lookup list\n",
+	    ioc->name);
+	return NULL;
+}
+
+/**
+ * _clone_sg_entries -	MPI EP's scsiio and config requests
+ *			are handled here. Base function for
+ *			double buffering, before submitting
+ *			the requests.
+ *
+ * @ioc: per adapter object.
+ * @mpi_request: mf request pointer.
+ * @smid: system request message index.
+ *
+ * @Returns: Nothing.
+ */
+static void _clone_sg_entries(struct MPT3SAS_ADAPTER *ioc,
+		void *mpi_request, u16 smid)
+{
+	Mpi2SGESimple32_t *sgel, *sgel_next;
+	u32  sgl_flags, sge_chain_count = 0;
+	bool is_write = 0;
+	u16 i = 0;
+	void __iomem *buffer_iomem;
+	void  *buffer_iomem_phys;
+	void __iomem *buff_ptr;
+	void *buff_ptr_phys;
+	void __iomem *dst_chain_addr[MCPU_MAX_CHAINS_PER_IO];
+	void *src_chain_addr[MCPU_MAX_CHAINS_PER_IO], *dst_addr_phys;
+	MPI2RequestHeader_t *request_hdr;
+	struct scsi_cmnd *scmd;
+	struct scatterlist *sg_scmd = NULL;
+	int is_scsiio_req = 0;
+
+	request_hdr = (MPI2RequestHeader_t *) mpi_request;
+
+	if (request_hdr->Function == MPI2_FUNCTION_SCSI_IO_REQUEST) {
+		Mpi25SCSIIORequest_t *scsiio_request =
+			(Mpi25SCSIIORequest_t *)mpi_request;
+		sgel = (Mpi2SGESimple32_t *) &scsiio_request->SGL;
+		is_scsiio_req = 1;
+	} else if (request_hdr->Function == MPI2_FUNCTION_CONFIG) {
+		Mpi2ConfigRequest_t  *config_req =
+			(Mpi2ConfigRequest_t *)mpi_request;
+		sgel = (Mpi2SGESimple32_t *) &config_req->PageBufferSGE;
+	} else
+		return;
+
+	/* From smid we can get scsi_cmd, once we have sg_scmd,
+	 * we just need to get sg_virt and sg_next to get virtual
+	 * address associated with sgel->Address.
+	 */
+
+	if (is_scsiio_req) {
+		/* Get scsi_cmd using smid */
+		scmd = mpt3sas_scsih_scsi_lookup_get(ioc, smid);
+		if (scmd == NULL) {
+			pr_err(MPT3SAS_FMT "scmd is NULL\n", ioc->name);
+			return;
+		}
+
+		/* Get sg_scmd from scmd provided */
+		sg_scmd = scsi_sglist(scmd);
+	}
+
+	/*
+	 * 0 - 255	System register
+	 * 256 - 4352	MPI Frame. (This is based on maxCredit 32)
+	 * 4352 - 4864	Reply_free pool (512 byte is reserved
+	 *		considering maxCredit 32. Reply need extra
+	 *		room, for mCPU case kept four times of
+	 *		maxCredit).
+	 * 4864 - 17152	SGE chain element. (32cmd * 3 chain of
+	 *		128 byte size = 12288)
+	 * 17152 - x	Host buffer mapped with smid.
+	 *		(Each smid can have 64K Max IO.)
+	 * BAR0+Last 1K MSIX Addr and Data
+	 * Total size in use 2113664 bytes of 4MB BAR0
+	 */
+
+	buffer_iomem = _base_get_buffer_bar0(ioc, smid);
+	buffer_iomem_phys = _base_get_buffer_phys_bar0(ioc, smid);
+
+	buff_ptr = buffer_iomem;
+	buff_ptr_phys = buffer_iomem_phys;
+
+	if (sgel->FlagsLength &
+			(MPI2_SGE_FLAGS_HOST_TO_IOC << MPI2_SGE_FLAGS_SHIFT))
+		is_write = 1;
+
+	for (i = 0; i < MPT_MIN_PHYS_SEGMENTS + ioc->facts.MaxChainDepth; i++) {
+
+		sgl_flags = (sgel->FlagsLength >> MPI2_SGE_FLAGS_SHIFT);
+
+		switch (sgl_flags & MPI2_SGE_FLAGS_ELEMENT_MASK) {
+		case MPI2_SGE_FLAGS_CHAIN_ELEMENT:
+			/*
+			 * Helper function which on passing
+			 * chain_buffer_dma returns chain_buffer. Get
+			 * the virtual address for sgel->Address
+			 */
+			sgel_next =
+				_base_get_chain_buffer_dma_to_chain_buffer(ioc,
+						sgel->Address);
+			if (sgel_next == NULL)
+				return;
+			/*
+			 * This is copying 128 byte chain
+			 * frame (not a host buffer)
+			 */
+			dst_chain_addr[sge_chain_count] =
+				_base_get_chain(ioc,
+					smid, sge_chain_count);
+			src_chain_addr[sge_chain_count] =
+						(void *) sgel_next;
+			dst_addr_phys =
+				_base_get_chain_phys(ioc,
+						smid, sge_chain_count);
+			sgel->Address = (dma_addr_t)dst_addr_phys;
+			sgel = sgel_next;
+			sge_chain_count++;
+			break;
+		case MPI2_SGE_FLAGS_SIMPLE_ELEMENT:
+			if (is_write) {
+				if (is_scsiio_req) {
+					_base_clone_to_sys_mem(buff_ptr,
+					    sg_virt(sg_scmd),
+					    (sgel->FlagsLength & 0x00ffffff));
+					sgel->Address =
+						(dma_addr_t)buff_ptr_phys;
+				} else {
+					_base_clone_to_sys_mem(buff_ptr,
+					    ioc->config_vaddr,
+					    (sgel->FlagsLength & 0x00ffffff));
+					sgel->Address =
+					    (dma_addr_t)buff_ptr_phys;
+				}
+			}
+			buff_ptr += (sgel->FlagsLength & 0x00ffffff);
+			buff_ptr_phys += (sgel->FlagsLength & 0x00ffffff);
+			if ((sgel->FlagsLength &
+			    (MPI2_SGE_FLAGS_END_OF_BUFFER
+					<< MPI2_SGE_FLAGS_SHIFT)))
+				goto eob_clone_chain;
+			else {
+				/*
+				 * Every single element in MPT has an
+				 * associated sg_next. Sanity-check that
+				 * sg_next is not NULL; a NULL here would
+				 * be a bug.
+				 */
+				if (is_scsiio_req) {
+					sg_scmd = sg_next(sg_scmd);
+					if (sg_scmd)
+						sgel++;
+					else
+						goto eob_clone_chain;
+				}
+			}
+			break;
+		}
+	}
+
+eob_clone_chain:
+	for (i = 0; i < sge_chain_count; i++) {
+		if (is_scsiio_req)
+			_base_clone_to_sys_mem(dst_chain_addr[i],
+				src_chain_addr[i], ioc->request_sz);
+	}
+}
+
+/**
  *  mpt3sas_remove_dead_ioc_func - kthread context to remove dead ioc
  * @arg: input argument, used to derive ioc
  *
@@ -3295,7 +3508,7 @@ _base_put_smid_nvme_encap_atomic(struct MPT3SAS_ADAPTER *ioc, u16 smid)
 
 /**
  * _base_put_smid_default - Default, primarily used for config pages
- * use Atomic Request Descriptor
+ *				use Atomic Request Descriptor
  * @ioc: per adapter object
  * @smid: system request message index
  *
diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
index 2529d25..4fd582b 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
@@ -95,6 +95,8 @@
 #define MPT_MIN_PHYS_SEGMENTS	16
 #define MPT_KDUMP_MIN_PHYS_SEGMENTS	32
 
+#define MCPU_MAX_CHAINS_PER_IO	3
+
 #ifdef CONFIG_SCSI_MPT3SAS_MAX_SGE
 #define MPT3SAS_SG_DEPTH		CONFIG_SCSI_MPT3SAS_MAX_SGE
 #else
@@ -1238,6 +1240,7 @@ struct MPT3SAS_ADAPTER {
 	u16		config_page_sz;
 	void		*config_page;
 	dma_addr_t	config_page_dma;
+	void		*config_vaddr;
 
 	/* scsiio request */
 	u16		hba_queue_depth;
diff --git a/drivers/scsi/mpt3sas/mpt3sas_config.c b/drivers/scsi/mpt3sas/mpt3sas_config.c
index 1c747cf..0dba3c4 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_config.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_config.c
@@ -219,6 +219,7 @@ _config_alloc_config_dma_memory(struct MPT3SAS_ADAPTER *ioc,
 		mem->page = ioc->config_page;
 		mem->page_dma = ioc->config_page_dma;
 	}
+	ioc->config_vaddr = mem->page;
 	return r;
 }
 
-- 
2.5.5
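
For reference, the copy loop in _base_clone_to_sys_mem can be exercised
in user space. This is a sketch under stated assumptions: `ex_writel`
is a hypothetical stand-in that stores to ordinary memory instead of
issuing an MMIO write, and `ex_clone_words` is not a driver function.
It shows that the `size / 4` loop bound copies whole 32-bit words only,
so any trailing `size % 4` bytes are left uncopied.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for writel(): a plain store instead of an MMIO access. */
static void ex_writel(uint32_t val, volatile uint32_t *addr)
{
	*addr = val;
}

/*
 * Copy size / 4 words from src to dst, one 32-bit store at a time.
 * Returns the number of bytes actually copied.
 */
static uint32_t ex_clone_words(volatile uint32_t *dst, const void *src,
			       uint32_t size)
{
	uint32_t i, words = size / 4;

	assert(dst != NULL && src != NULL);
	for (i = 0; i < words; i++) {
		uint32_t w;

		/* memcpy avoids alignment assumptions about src. */
		memcpy(&w, (const unsigned char *)src + i * 4, sizeof(w));
		ex_writel(w, dst + i);
	}
	return words * 4;
}
```

Called with a 10-byte source, `ex_clone_words` returns 8 and leaves the
third destination word untouched. Whether a caller can ever pass a
length that is not a multiple of four is not something this thread
states.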

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [V1 5/6] mpt3sas: Introduce function to clone mpi request.
  2018-02-07 10:51 ` Suganath Prabu S
@ 2018-02-07 10:51   ` Suganath Prabu S
  -1 siblings, 0 replies; 16+ messages in thread
From: Suganath Prabu S @ 2018-02-07 10:51 UTC (permalink / raw)
  To: linux-scsi, linux-nvme
  Cc: Sathya.Prakash, sreekanth.reddy, chaitra.basappa, Suganath Prabu S

1) Add function _base_clone_mpi_to_sys_mem to clone the
MPI request into the BAR0-mapped system region.

2) Separate MPI Endpoint IO submission into the function
_base_put_smid_mpi_ep_scsi_io.

3) Submit MPI EP requests as two 32-bit MMIO writes
from _base_mpi_ep_writeq.

On 32-bit architectures, _base_writeq is identical to
_base_mpi_ep_writeq, so the duplicate code is removed as suggested.

Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
---
 drivers/scsi/mpt3sas/mpt3sas_base.c | 140 ++++++++++++++++++++++++++++++++----
 1 file changed, 125 insertions(+), 15 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
index c41c65b..52effd1 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -126,6 +126,25 @@ module_param_call(mpt3sas_fwfault_debug, _scsih_set_fwfault_debug,
 	param_get_int, &mpt3sas_fwfault_debug, 0644);
 
 /**
+ * _base_clone_mpi_to_sys_mem - Writes/copies MPI frames
+ *				to system/BAR0 region.
+ *
+ * @dst_iomem: Pointer to the destination location in BAR0 space.
+ * @src: Pointer to the Source data.
+ * @size: Size of data to be copied.
+ */
+static void
+_base_clone_mpi_to_sys_mem(void *dst_iomem, void *src, u32 size)
+{
+	int i;
+	u32 *src_virt_mem = (u32 *)src;
+
+	for (i = 0; i < size/4; i++)
+		writel((u32)src_virt_mem[i],
+				(void __iomem *)dst_iomem + (i * 4));
+}
+
+/**
  * _base_clone_to_sys_mem - Writes/copies data to system/BAR0 region
  *
  * @dst_iomem: Pointer to the destination location in BAR0 space.
@@ -3268,6 +3287,29 @@ mpt3sas_base_free_smid(struct MPT3SAS_ADAPTER *ioc, u16 smid)
 }
 
 /**
+ * _base_mpi_ep_writeq - 32 bit write to MMIO
+ * @b: data payload
+ * @addr: address in MMIO space
+ * @writeq_lock: spin lock
+ *
+ * This is special handling for MPI EP to take care of 32 bit
+ * environments, where it is not guaranteed that the entire word
+ * is sent in one transfer.
+ */
+static inline void
+_base_mpi_ep_writeq(__u64 b, volatile void __iomem *addr,
+					spinlock_t *writeq_lock)
+{
+	unsigned long flags;
+	__u64 data_out = cpu_to_le64(b);
+
+	spin_lock_irqsave(writeq_lock, flags);
+	writel((u32)(data_out), addr);
+	writel((u32)(data_out >> 32), (addr + 4));
+	spin_unlock_irqrestore(writeq_lock, flags);
+}
+
+/**
  * _base_writeq - 64 bit write to MMIO
  * @ioc: per adapter object
  * @b: data payload
@@ -3288,17 +3330,41 @@ _base_writeq(__u64 b, volatile void __iomem *addr, spinlock_t *writeq_lock)
 static inline void
 _base_writeq(__u64 b, volatile void __iomem *addr, spinlock_t *writeq_lock)
 {
-	unsigned long flags;
-	__u64 data_out = cpu_to_le64(b);
-
-	spin_lock_irqsave(writeq_lock, flags);
-	writel((u32)(data_out), addr);
-	writel((u32)(data_out >> 32), (addr + 4));
-	spin_unlock_irqrestore(writeq_lock, flags);
+	_base_mpi_ep_writeq(b, addr, writeq_lock);
 }
 #endif
 
 /**
+ * _base_put_smid_mpi_ep_scsi_io - send SCSI_IO request to firmware
+ * @ioc: per adapter object
+ * @smid: system request message index
+ * @handle: device handle
+ *
+ * Return nothing.
+ */
+static void
+_base_put_smid_mpi_ep_scsi_io(struct MPT3SAS_ADAPTER *ioc, u16 smid, u16 handle)
+{
+	Mpi2RequestDescriptorUnion_t descriptor;
+	u64 *request = (u64 *)&descriptor;
+	void *mpi_req_iomem;
+	__le32 *mfp = (__le32 *)mpt3sas_base_get_msg_frame(ioc, smid);
+
+	_clone_sg_entries(ioc, (void *) mfp, smid);
+	mpi_req_iomem = (void *)ioc->chip +
+			MPI_FRAME_START_OFFSET + (smid * ioc->request_sz);
+	_base_clone_mpi_to_sys_mem(mpi_req_iomem, (void *)mfp,
+					ioc->request_sz);
+	descriptor.SCSIIO.RequestFlags = MPI2_REQ_DESCRIPT_FLAGS_SCSI_IO;
+	descriptor.SCSIIO.MSIxIndex =  _base_get_msix_index(ioc);
+	descriptor.SCSIIO.SMID = cpu_to_le16(smid);
+	descriptor.SCSIIO.DevHandle = cpu_to_le16(handle);
+	descriptor.SCSIIO.LMID = 0;
+	_base_mpi_ep_writeq(*request, &ioc->chip->RequestDescriptorPostLow,
+	    &ioc->scsi_lookup_lock);
+}
+
+/**
  * _base_put_smid_scsi_io - send SCSI_IO request to firmware
  * @ioc: per adapter object
  * @smid: system request message index
@@ -3359,7 +3425,23 @@ _base_put_smid_hi_priority(struct MPT3SAS_ADAPTER *ioc, u16 smid,
 	u16 msix_task)
 {
 	Mpi2RequestDescriptorUnion_t descriptor;
-	u64 *request = (u64 *)&descriptor;
+	void *mpi_req_iomem;
+	u64 *request;
+
+	if (ioc->is_mcpu_endpoint) {
+		MPI2RequestHeader_t *request_hdr;
+
+		__le32 *mfp = (__le32 *)mpt3sas_base_get_msg_frame(ioc, smid);
+
+		request_hdr = (MPI2RequestHeader_t *)mfp;
+		/* TBD 256 is offset within sys register. */
+		mpi_req_iomem = (void *)ioc->chip + MPI_FRAME_START_OFFSET
+					+ (smid * ioc->request_sz);
+		_base_clone_mpi_to_sys_mem(mpi_req_iomem, (void *)mfp,
+							ioc->request_sz);
+	}
+
+	request = (u64 *)&descriptor;
 
 	descriptor.HighPriority.RequestFlags =
 	    MPI2_REQ_DESCRIPT_FLAGS_HIGH_PRIORITY;
@@ -3367,8 +3449,13 @@ _base_put_smid_hi_priority(struct MPT3SAS_ADAPTER *ioc, u16 smid,
 	descriptor.HighPriority.SMID = cpu_to_le16(smid);
 	descriptor.HighPriority.LMID = 0;
 	descriptor.HighPriority.Reserved1 = 0;
-	_base_writeq(*request, &ioc->chip->RequestDescriptorPostLow,
-	    &ioc->scsi_lookup_lock);
+	if (ioc->is_mcpu_endpoint)
+		_base_mpi_ep_writeq(*request,
+				&ioc->chip->RequestDescriptorPostLow,
+				&ioc->scsi_lookup_lock);
+	else
+		_base_writeq(*request, &ioc->chip->RequestDescriptorPostLow,
+		    &ioc->scsi_lookup_lock);
 }
 
 /**
@@ -3406,15 +3493,35 @@ static void
 _base_put_smid_default(struct MPT3SAS_ADAPTER *ioc, u16 smid)
 {
 	Mpi2RequestDescriptorUnion_t descriptor;
-	u64 *request = (u64 *)&descriptor;
+	void *mpi_req_iomem;
+	u64 *request;
+	MPI2RequestHeader_t *request_hdr;
+
+	if (ioc->is_mcpu_endpoint) {
+		__le32 *mfp = (__le32 *)mpt3sas_base_get_msg_frame(ioc, smid);
+
+		request_hdr = (MPI2RequestHeader_t *)mfp;
 
+		_clone_sg_entries(ioc, (void *) mfp, smid);
+		/* TBD 256 is offset within sys register */
+		mpi_req_iomem = (void *)ioc->chip +
+			MPI_FRAME_START_OFFSET + (smid * ioc->request_sz);
+		_base_clone_mpi_to_sys_mem(mpi_req_iomem, (void *)mfp,
+							ioc->request_sz);
+	}
+	request = (u64 *)&descriptor;
 	descriptor.Default.RequestFlags = MPI2_REQ_DESCRIPT_FLAGS_DEFAULT_TYPE;
 	descriptor.Default.MSIxIndex =  _base_get_msix_index(ioc);
 	descriptor.Default.SMID = cpu_to_le16(smid);
 	descriptor.Default.LMID = 0;
 	descriptor.Default.DescriptorTypeDependent = 0;
-	_base_writeq(*request, &ioc->chip->RequestDescriptorPostLow,
-	    &ioc->scsi_lookup_lock);
+	if (ioc->is_mcpu_endpoint)
+		_base_mpi_ep_writeq(*request,
+				&ioc->chip->RequestDescriptorPostLow,
+				&ioc->scsi_lookup_lock);
+	else
+		_base_writeq(*request, &ioc->chip->RequestDescriptorPostLow,
+				&ioc->scsi_lookup_lock);
 }
 
 /**
@@ -3508,7 +3615,7 @@ _base_put_smid_nvme_encap_atomic(struct MPT3SAS_ADAPTER *ioc, u16 smid)
 
 /**
  * _base_put_smid_default - Default, primarily used for config pages
- *				use Atomic Request Descriptor
+ * use Atomic Request Descriptor
  * @ioc: per adapter object
  * @smid: system request message index
  *
@@ -6333,7 +6440,10 @@ mpt3sas_base_attach(struct MPT3SAS_ADAPTER *ioc)
 		ioc->put_smid_nvme_encap = &_base_put_smid_nvme_encap_atomic;
 	} else {
 		ioc->put_smid_default = &_base_put_smid_default;
-		ioc->put_smid_scsi_io = &_base_put_smid_scsi_io;
+		if (ioc->is_mcpu_endpoint)
+			ioc->put_smid_scsi_io = &_base_put_smid_mpi_ep_scsi_io;
+		else
+			ioc->put_smid_scsi_io = &_base_put_smid_scsi_io;
 		ioc->put_smid_fast_path = &_base_put_smid_fast_path;
 		ioc->put_smid_hi_priority = &_base_put_smid_hi_priority;
 		ioc->put_smid_nvme_encap = &_base_put_smid_nvme_encap;
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 16+ messages in thread
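
For readers following the request path in patch 5/6: the request descriptor is posted as two 32-bit MMIO writes, low dword first, rather than one 64-bit write. Below is a minimal userspace sketch of that split, not the driver code. The names fake_post_reg, post_descriptor_split and read_back are hypothetical, plain stores stand in for writel(), and a little-endian host is assumed so that the cpu_to_le64() step in the patch is a no-op.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the 64-bit request descriptor post register. */
struct fake_post_reg {
	volatile uint32_t low;   /* offset +0, written first */
	volatile uint32_t high;  /* offset +4, written second */
};

/* Post a 64-bit descriptor as two 32-bit stores, low dword first. */
static void post_descriptor_split(uint64_t desc, struct fake_post_reg *reg)
{
	assert(reg);
	reg->low = (uint32_t)desc;
	reg->high = (uint32_t)(desc >> 32);
}

/* Reassemble the value the way the receiving side would see it. */
static uint64_t read_back(const struct fake_post_reg *reg)
{
	return ((uint64_t)reg->high << 32) | reg->low;
}
```

In the patch, both writes are issued while holding the spinlock passed as writeq_lock, which keeps two CPUs from interleaving their halves on the same register. The sketch leaves that locking out.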

* [V1 6/6] mpt3sas: Introduce function to clone mpi reply.
  2018-02-07 10:51 ` Suganath Prabu S
@ 2018-02-07 10:51   ` Suganath Prabu S
  -1 siblings, 0 replies; 16+ messages in thread
From: Suganath Prabu S @ 2018-02-07 10:51 UTC (permalink / raw)
  To: linux-scsi, linux-nvme
  Cc: Sathya.Prakash, sreekanth.reddy, chaitra.basappa, Suganath Prabu S

If the posted request has an error of any type, the IOC writes
a Reply message into a host-based system reply message frame.
The new function _base_clone_reply_to_sys_mem() clones the reply
frame address (lower 32 bits) into the reply free area of the
BAR0 mapped region.

Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
---
 drivers/scsi/mpt3sas/mpt3sas_base.c | 37 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 52effd1..1c29286 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -126,6 +126,33 @@ module_param_call(mpt3sas_fwfault_debug, _scsih_set_fwfault_debug,
 	param_get_int, &mpt3sas_fwfault_debug, 0644);
 
 /**
+ * _base_clone_reply_to_sys_mem - copies reply to reply free iomem
+ *				  in BAR0 space.
+ *
+ * @ioc: per adapter object
+ * @reply: reply message frame (lower 32-bit address)
+ * @index: index into the reply free queue
+ *
+ * Return: nothing.
+ */
+static void
+_base_clone_reply_to_sys_mem(struct MPT3SAS_ADAPTER *ioc, u32 reply,
+		u32 index)
+{
+	/*
+	 * The first 256 bytes of BAR0 hold the system registers.
+	 * MPI frames start at offset 256. At most 32 MPI frames are
+	 * supported: 32 * 128 = 4K. The reply free clone for mCPU follows.
+	 */
+	u16 cmd_credit = ioc->facts.RequestCredit + 1;
+	void __iomem *reply_free_iomem = (void __iomem *)ioc->chip +
+			MPI_FRAME_START_OFFSET +
+			(cmd_credit * ioc->request_sz) + (index * sizeof(u32));
+
+	writel(reply, reply_free_iomem);
+}
+
+/**
  * _base_clone_mpi_to_sys_mem - Writes/copies MPI frames
  *				to system/BAR0 region.
  *
@@ -1400,6 +1427,10 @@ _base_interrupt(int irq, void *bus_id)
 				    0 : ioc->reply_free_host_index + 1;
 				ioc->reply_free[ioc->reply_free_host_index] =
 				    cpu_to_le32(reply);
+				if (ioc->is_mcpu_endpoint)
+					_base_clone_reply_to_sys_mem(ioc,
+						cpu_to_le32(reply),
+						ioc->reply_free_host_index);
 				writel(ioc->reply_free_host_index,
 				    &ioc->chip->ReplyFreeHostIndex);
 			}
@@ -6242,8 +6273,12 @@ _base_make_ioc_operational(struct MPT3SAS_ADAPTER *ioc)
 	/* initialize Reply Free Queue */
 	for (i = 0, reply_address = (u32)ioc->reply_dma ;
 	    i < ioc->reply_free_queue_depth ; i++, reply_address +=
-	    ioc->reply_sz)
+	    ioc->reply_sz) {
 		ioc->reply_free[i] = cpu_to_le32(reply_address);
+		if (ioc->is_mcpu_endpoint)
+			_base_clone_reply_to_sys_mem(ioc,
+					(__le32)reply_address, i);
+	}
 
 	/* initialize reply queues */
 	if (ioc->is_driver_loading)
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 16+ messages in thread
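
The offset arithmetic in _base_clone_reply_to_sys_mem() from patch 6/6 can be checked with a small standalone sketch. This is an illustration only: FRAME_START_OFFSET is assumed to be 256 based on the comment in the patch, and reply_free_clone_offset is a hypothetical helper name that returns a byte offset into BAR0 instead of touching any MMIO.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value, taken from the "256 offset MPI frame starts" comment. */
#define FRAME_START_OFFSET 256u

/*
 * Byte offset within BAR0 of the cloned reply free entry for a given
 * index. The request frames for (request_credit + 1) commands sit
 * between the system registers and the reply free clone area.
 */
static uint32_t reply_free_clone_offset(uint16_t request_credit,
					uint16_t request_sz, uint32_t index)
{
	uint32_t cmd_credit = (uint32_t)request_credit + 1;

	assert(request_sz != 0);
	return FRAME_START_OFFSET + cmd_credit * request_sz +
	       index * (uint32_t)sizeof(uint32_t);
}
```

With a RequestCredit of 31 and 128-byte request frames, entry 0 lands at 256 + 32 * 128 = 4352, which matches the start of the Reply_free pool in the cover letter's BAR0 layout.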

* Re: [V1 0/6] mpt3sas: Adding MPI Endpoint device support.
  2018-02-07 10:51 ` Suganath Prabu S
@ 2018-02-15  8:41   ` Suganath Prabu Subramani
  -1 siblings, 0 replies; 16+ messages in thread
From: Suganath Prabu Subramani @ 2018-02-15  8:41 UTC (permalink / raw)
  To: linux-scsi, linux-nvme
  Cc: Sathya Prakash, Sreekanth Reddy, Chaitra Basappa, Suganath Prabu S

Gentle reminder: any update on this?

Thanks,
Suganath Prabu S

On Wed, Feb 7, 2018 at 4:21 PM, Suganath Prabu S
<suganath-prabu.subramani@broadcom.com> wrote:
> V1 Change info:
>
> * A few sparse warning fixes over the initial patch set.
> * On 32-bit architectures, _base_writeq is identical to
>   _base_mpi_ep_writeq, so the duplicate code was removed as
>   suggested by Martin.
>
> Andromeda is a PCIe switch with a dedicated management CPU (mCPU),
> nonvolatile flash memory, RAM, etc. The Linux kernel runs on the
> mCPU. The MPI Endpoint driver is the management driver for Andromeda.
>
> The PLX Manager driver running on the mCPU synthesizes a
> virtual/synthetic MPI endpoint to the host. The synthetic MPI
> endpoint is emulated IT firmware running on the Linux operating
> system, which interfaces with the PLX management driver.
>
> The PLX management driver integrates the IOC FW in the same driver
> binary. At the end of the Plx_Mgr driver load, it initializes the
> IOC FW as well. The current implementation is a single instance of
> the IOC FW (as it supports only one host).
>
> The PLX management driver will provide the required resources and
> infrastructure for the synthetic MPI endpoint.
>
> The existing PLX management driver will reserve a virtual slot for
> the MPI endpoint. Currently, virtual slot number 29 is reserved for
> the MPI endpoint.
>
> The synthetic device in the management driver will be marked with
> the new type “PLX_DEV_TYPE_SYNTH_MPI_EP”. The PLX management driver
> will interface with the synthetic MPI endpoint for any communication
> from the host on the PLX_DEV_TYPE_SYNTH_MPI_EP device type.
>
> The link between the host and the PLX C2 is shown in the diagram below.
>
>                                  _______________
>  _______________                |               |
> |               |               |               |
> | PLX C2        |===============|    HOST       |
> | PCI -         |===============|   MACHINE     |
> |  SWITCH       |               |               |
> |_______________|               |               |
>         ||                      |_______________|
>         ||
>         ||
>  _______||______
> |               |
> |  MCPU         |
> |               |
> |_______________|
>
>
>
> After the MPI endpoint implementation
> (the host sees a dedicated virtual slot as the MPI endpoint).
> In the diagram below, the single line is the logical channel for
> the MPI endpoint.
>                                  _______________
>  _______________                |               |
> |               |               |               |
> | PLX C2        |===============|   HOST        |
> | PCI -         |===============|   MACHINE     |
> |  SWITCH       |               |               |
> |               |               |  -----------  |
> |_______________|---------------| | IT DRIVER | |
>         ||  |                   |  -----------  |
>         ||  |                   |_______________|
>         ||  |
>         ||  |
>  _______||__|___________
> |       ||  |           |
> |      MCPU |           |
> |        ___|____       |
> |       | PLX MGR|      |
> |       | DRIVER |      |
> |       |________|      |
> |           |           |
> |        ___|_____      |
> |       |         |     |
> |       |IOC FW   |     |
> |       |_________|     |
> |_______________________|
>
> The PLX management driver will create the MPI endpoint based on the
> device table definition. It will also populate the synthetic device
> tree based on the device table for each host.
>
> From the host it will be seen as an IT HBA (a simplified version of
> SAS2/MPI2): a PCI device in which emulated IT FW runs on the mCPU
> behind the synthetic endpoint of the PCI switch. The host treats it
> as an actual physical device.
>
> The PLX management driver provides an interface to DMA from the mCPU
> to the host using the “MCPU Response Data Buffer” method. DMA from
> the host to the mCPU using the “MCPU Response Data Buffer” is not
> possible.
>
> Why is DMA from the host to the mCPU not possible using the response
> buffer? The MCPU response buffer is not really meant for reads from
> the host. The read will work, but the answer TLP will not come back
> to the CSR FIFO; it will go to the MCPU root complex, where it could
> be an unexpected read completion.
>
> The existing host driver (mpt2sas) will not work for the MPI
> endpoint, because there is no interface to DMA from the host to the
> mCPU for the mCPU/MPI endpoint device. To overcome this, the driver
> should do a double copy of those buffers directly to the mCPU memory
> region via the BAR0 region.
>
> The host BAR0 region is divided into different groups to serve
> host-assisted DMA.
>
>  0     - 255     System registers (Doorbell, Host Interrupt, etc.)
>  256   - 4352    MPI frames. (This is based on maxCredit 32.)
>  4352  - 4864    Reply_free pool (512 bytes are reserved considering
>                  maxCredit 32. Replies need extra room; for the mCPU
>                  case, four times maxCredit is kept.)
>  4864  - 17152   SGE chain elements.
>                  (32 commands * 3 chains of 128 bytes each = 12288)
>  17152 - x       Host buffer mapped with smid.
>                  (Each smid can have 64K max IO.)
>  BAR0 + last 1K  MSIX address and data.
>
> Total size in use is 2113664 bytes of the 4MB BAR0. The MPI endpoint
> module of the PLX management driver must be aware of the regions above.
>
> SGE and host buffer details will be available in the MPI frame.
>
> Each PCI packet coming from the host on the MPI endpoint will end up
> in the mCPU PLX management driver, which can be considered the front
> end for the IOC FW. The PLX management driver will call the IOC
> front-end API, which is the entry point into the IOC FW module. Once
> the PLX management driver calls the relevant callback from the IOC
> FW, the rest of the processing is handled within the IOC FW. The IOC
> FW should release the TLP packet as soon as possible to avoid any
> TLP timeout.
>
> Suganath Prabu S (6):
>   mpt3sas: Add PCI device ID for Andromeda.
>   mpt3sas: Configure reply post queue depth, DMA and sgl tablesize.
>   mpt3sas: Introduce API's to get BAR0 mapped buffer address.
>   mpt3sas: Introduce Base function for cloning.
>   mpt3sas: Introduce function to clone mpi request.
>   mpt3sas: Introduce function to clone mpi reply.
>
>  drivers/scsi/mpt3sas/mpi/mpi2_cnfg.h             |    1 +
>  drivers/scsi/mpt3sas/mpt3sas_base.c              |  528 ++++++++-
>  drivers/scsi/mpt3sas/mpt3sas_base.h              |    6 +
>  drivers/scsi/mpt3sas/mpt3sas_config.c            |    1 +
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c             |   54 +-
>  16 files changed, 540 insertions(+), 9291 deletions(-)
>
> Thanks,
> Suganath Prabu S
> --
> 2.5.5
>

^ permalink raw reply	[flat|nested] 16+ messages in thread
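
The region boundaries in the BAR0 layout quoted above follow from a few sizes stated in the cover letter. The sketch below recomputes them as a sanity check. It is not driver code: every name in it is hypothetical, and the 128-byte MPI frame size is inferred from the "32 * 128 = 4K" comment in patch 6/6.

```c
#include <assert.h>
#include <stdint.h>

/* Sizes taken from the cover letter; all names here are hypothetical. */
enum {
	SYS_REG_BYTES    = 256, /* Doorbell, Host Interrupt, etc. */
	MAX_CREDIT       = 32,  /* maxCredit assumed by the layout */
	MPI_FRAME_BYTES  = 128, /* one MPI request frame */
	REPLY_FREE_BYTES = 512, /* reserved reply_free pool */
	CHAINS_PER_CMD   = 3,   /* SGE chain elements per command */
	CHAIN_BYTES      = 128, /* one SGE chain element */
};

struct bar0_layout {
	uint32_t mpi_frame_start;
	uint32_t reply_free_start;
	uint32_t sge_chain_start;
	uint32_t host_buffer_start;
};

/* Lay the regions out back to back, in the order the cover letter lists. */
static struct bar0_layout bar0_layout_compute(void)
{
	struct bar0_layout l;

	l.mpi_frame_start = SYS_REG_BYTES;
	l.reply_free_start = l.mpi_frame_start + MAX_CREDIT * MPI_FRAME_BYTES;
	l.sge_chain_start = l.reply_free_start + REPLY_FREE_BYTES;
	l.host_buffer_start = l.sge_chain_start +
			      MAX_CREDIT * CHAINS_PER_CMD * CHAIN_BYTES;
	assert(l.host_buffer_start > l.sge_chain_start);
	return l;
}
```

The computed starts are 256, 4352, 4864 and 17152, matching the table. The total-in-use figure and the MSIX area in the last 1K are not derived here, since the cover letter does not give enough detail to recompute them.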
