All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V2 0/3] scsi: mptxsas: updates for ARM64
@ 2015-11-09  1:57 ` Sinan Kaya
  0 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-09  1:57 UTC (permalink / raw)
  To: linux-scsi, timur, cov, jcm
  Cc: agross, linux-arm-msm, linux-arm-kernel, Sinan Kaya

Changes from V1: (https://lkml.org/lkml/2015/11/4/856)
Changes from V1: (https://lkml.org/lkml/2015/11/4/859)
* merge two patches together

Changes from V1: (https://lkml.org/lkml/2015/11/4/857)
* Use do_div to handle the linker errors on i386

Changes from V1: https://lkml.org/lkml/2015/11/4/858
* None

Sinan Kaya (3):
  scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  scsi: fix compiler warning for sg
  scsi: mptxsas: offload IRQ execution

 drivers/scsi/mpt2sas/mpt2sas_base.c | 33 ++++++++++++++++++++++++++++-----
 drivers/scsi/mpt3sas/mpt3sas_base.c | 35 ++++++++++++++++++++++++++++++-----
 drivers/scsi/sg.c                   | 20 +++++++++++++-------
 3 files changed, 71 insertions(+), 17 deletions(-)

-- 
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 0/3] scsi: mptxsas: updates for ARM64
@ 2015-11-09  1:57 ` Sinan Kaya
  0 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-09  1:57 UTC (permalink / raw)
  To: linux-arm-kernel

Changes from V1: (https://lkml.org/lkml/2015/11/4/856)
Changes from V1: (https://lkml.org/lkml/2015/11/4/859)
* merge two patches together

Changes from V1: (https://lkml.org/lkml/2015/11/4/857)
* Use do_div to handle the linker errors on i386

Changes from V1: https://lkml.org/lkml/2015/11/4/858
* None

Sinan Kaya (3):
  scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  scsi: fix compiler warning for sg
  scsi: mptxsas: offload IRQ execution

 drivers/scsi/mpt2sas/mpt2sas_base.c | 33 ++++++++++++++++++++++++++++-----
 drivers/scsi/mpt3sas/mpt3sas_base.c | 35 ++++++++++++++++++++++++++++++-----
 drivers/scsi/sg.c                   | 20 +++++++++++++-------
 3 files changed, 71 insertions(+), 17 deletions(-)

-- 
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  2015-11-09  1:57 ` Sinan Kaya
@ 2015-11-09  1:57   ` Sinan Kaya
  -1 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-09  1:57 UTC (permalink / raw)
  To: linux-scsi, timur, cov, jcm
  Cc: agross, linux-arm-msm, linux-arm-kernel, Sinan Kaya,
	Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan, James E.J. Bottomley, MPT-FusionLinux.pdl,
	linux-kernel

Current code gives up when 32 bit DMA is not supported.
This problem has been observed on systems without any
memory below 4 gig.

This patch tests 64 bit support before bailing out to find
a working combination.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/scsi/mpt2sas/mpt2sas_base.c | 21 ++++++++++++++++++++-
 drivers/scsi/mpt3sas/mpt3sas_base.c | 22 +++++++++++++++++++++-
 2 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c
index c167911..c61c82a 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
@@ -1217,8 +1217,27 @@ _base_config_dma_addressing(struct MPT2SAS_ADAPTER *ioc, struct pci_dev *pdev)
 		ioc->base_add_sg_single = &_base_add_sg_single_32;
 		ioc->sge_size = sizeof(Mpi2SGESimple32_t);
 		ioc->dma_mask = 32;
-	} else
+	} else {
+		/* Try 64 bit, 32 bit failed */
+		consistent_dma_mask = DMA_BIT_MASK(64);
+
+		if (sizeof(dma_addr_t) > 4) {
+			const uint64_t required_mask =
+				dma_get_required_mask(&pdev->dev);
+			if ((required_mask > DMA_BIT_MASK(32)) &&
+				!pci_set_dma_mask(pdev, DMA_BIT_MASK(64)) &&
+				!pci_set_consistent_dma_mask(pdev,
+							consistent_dma_mask)) {
+				ioc->base_add_sg_single =
+					&_base_add_sg_single_64;
+				ioc->sge_size = sizeof(Mpi2SGESimple64_t);
+				ioc->dma_mask = 64;
+				goto out;
+			}
+		}
+
 		return -ENODEV;
+	}
 
  out:
 	si_meminfo(&s);
diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
index d4f1dcd..6dc391c 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -1535,9 +1535,29 @@ _base_config_dma_addressing(struct MPT3SAS_ADAPTER *ioc, struct pci_dev *pdev)
 		ioc->base_add_sg_single = &_base_add_sg_single_32;
 		ioc->sge_size = sizeof(Mpi2SGESimple32_t);
 		ioc->dma_mask = 32;
-	} else
+	} else {
+		/* Try 64 bit, 32 bit failed */
+		consistent_dma_mask = DMA_BIT_MASK(64);
+		if (sizeof(dma_addr_t) > 4) {
+			const uint64_t required_mask =
+				dma_get_required_mask(&pdev->dev);
+			int consistent_mask =
+				pci_set_consistent_dma_mask(pdev,
+							consistent_dma_mask);
+
+			if ((required_mask > DMA_BIT_MASK(32)) &&
+				!pci_set_dma_mask(pdev, DMA_BIT_MASK(64)) &&
+				!consistent_mask) {
+				ioc->base_add_sg_single =
+					&_base_add_sg_single_64;
+				ioc->sge_size = sizeof(Mpi2SGESimple64_t);
+				ioc->dma_mask = 64;
+				goto out;
+			}
+		}
 		return -ENODEV;
 
+	}
  out:
 	si_meminfo(&s);
 	pr_info(MPT3SAS_FMT
-- 
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project

^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
@ 2015-11-09  1:57   ` Sinan Kaya
  0 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-09  1:57 UTC (permalink / raw)
  To: linux-arm-kernel

Current code gives up when 32 bit DMA is not supported.
This problem has been observed on systems without any
memory below 4 gig.

This patch tests 64 bit support before bailing out to find
a working combination.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/scsi/mpt2sas/mpt2sas_base.c | 21 ++++++++++++++++++++-
 drivers/scsi/mpt3sas/mpt3sas_base.c | 22 +++++++++++++++++++++-
 2 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c
index c167911..c61c82a 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
@@ -1217,8 +1217,27 @@ _base_config_dma_addressing(struct MPT2SAS_ADAPTER *ioc, struct pci_dev *pdev)
 		ioc->base_add_sg_single = &_base_add_sg_single_32;
 		ioc->sge_size = sizeof(Mpi2SGESimple32_t);
 		ioc->dma_mask = 32;
-	} else
+	} else {
+		/* Try 64 bit, 32 bit failed */
+		consistent_dma_mask = DMA_BIT_MASK(64);
+
+		if (sizeof(dma_addr_t) > 4) {
+			const uint64_t required_mask =
+				dma_get_required_mask(&pdev->dev);
+			if ((required_mask > DMA_BIT_MASK(32)) &&
+				!pci_set_dma_mask(pdev, DMA_BIT_MASK(64)) &&
+				!pci_set_consistent_dma_mask(pdev,
+							consistent_dma_mask)) {
+				ioc->base_add_sg_single =
+					&_base_add_sg_single_64;
+				ioc->sge_size = sizeof(Mpi2SGESimple64_t);
+				ioc->dma_mask = 64;
+				goto out;
+			}
+		}
+
 		return -ENODEV;
+	}
 
  out:
 	si_meminfo(&s);
diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
index d4f1dcd..6dc391c 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -1535,9 +1535,29 @@ _base_config_dma_addressing(struct MPT3SAS_ADAPTER *ioc, struct pci_dev *pdev)
 		ioc->base_add_sg_single = &_base_add_sg_single_32;
 		ioc->sge_size = sizeof(Mpi2SGESimple32_t);
 		ioc->dma_mask = 32;
-	} else
+	} else {
+		/* Try 64 bit, 32 bit failed */
+		consistent_dma_mask = DMA_BIT_MASK(64);
+		if (sizeof(dma_addr_t) > 4) {
+			const uint64_t required_mask =
+				dma_get_required_mask(&pdev->dev);
+			int consistent_mask =
+				pci_set_consistent_dma_mask(pdev,
+							consistent_dma_mask);
+
+			if ((required_mask > DMA_BIT_MASK(32)) &&
+				!pci_set_dma_mask(pdev, DMA_BIT_MASK(64)) &&
+				!consistent_mask) {
+				ioc->base_add_sg_single =
+					&_base_add_sg_single_64;
+				ioc->sge_size = sizeof(Mpi2SGESimple64_t);
+				ioc->dma_mask = 64;
+				goto out;
+			}
+		}
 		return -ENODEV;
 
+	}
  out:
 	si_meminfo(&s);
 	pr_info(MPT3SAS_FMT
-- 
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project

^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [PATCH V2 2/3] scsi: fix compiler warning for sg
  2015-11-09  1:57 ` Sinan Kaya
@ 2015-11-09  1:57   ` Sinan Kaya
  -1 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-09  1:57 UTC (permalink / raw)
  To: linux-scsi, timur, cov, jcm
  Cc: agross, linux-arm-msm, linux-arm-kernel, Sinan Kaya,
	Doug Gilbert, James E.J. Bottomley, linux-kernel

The MULDIV macro has been designed for small numbers.
Compiler emits an overflow warning on 64 bit systems.
This patch uses 64 bit numbers in order to suppress
warning.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/scsi/sg.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 9d7b7db..112d8974 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -51,6 +51,7 @@ static int sg_version_num = 30536;	/* 2 digits for each component */
 #include <linux/atomic.h>
 #include <linux/ratelimit.h>
 #include <linux/uio.h>
+#include <asm/div64.h>
 
 #include "scsi.h"
 #include <scsi/scsi_dbg.h>
@@ -85,12 +86,17 @@ static void sg_proc_cleanup(void);
  * Replacing muldiv(x) by muldiv(x)=((x % d) * m) / d + int(x / d) * m
  * calculates the same, but prevents the overflow when both m and d
  * are "small" numbers (like HZ and USER_HZ).
- * Of course an overflow is inavoidable if the result of muldiv doesn't fit
- * in 32 bits.
  */
-#define MULDIV(X,MUL,DIV) ((((X % DIV) * MUL) / DIV) + ((X / DIV) * MUL))
+static inline u64 mult_frac64(u64 x, u32 numer, u32 denom)
+{
+	u64 r1 = do_div(x, denom);
+	u64 r2 = r1 * numer;
+
+	do_div(r2, denom);
+	return (x * numer) + r2;
+}
 
-#define SG_DEFAULT_TIMEOUT MULDIV(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ)
+#define SG_DEFAULT_TIMEOUT mult_frac64(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ)
 
 int sg_big_buff = SG_DEF_RESERVED_SIZE;
 /* N.B. This variable is readable and writeable via
@@ -877,10 +883,10 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
 			return result;
 		if (val < 0)
 			return -EIO;
-		if (val >= MULDIV (INT_MAX, USER_HZ, HZ))
-		    val = MULDIV (INT_MAX, USER_HZ, HZ);
+		if (val >= mult_frac64(INT_MAX, USER_HZ, HZ))
+			val = mult_frac64(INT_MAX, USER_HZ, HZ);
 		sfp->timeout_user = val;
-		sfp->timeout = MULDIV (val, HZ, USER_HZ);
+		sfp->timeout = mult_frac64(val, HZ, USER_HZ);
 
 		return 0;
 	case SG_GET_TIMEOUT:	/* N.B. User receives timeout as return value */
-- 
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project

^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [PATCH V2 2/3] scsi: fix compiler warning for sg
@ 2015-11-09  1:57   ` Sinan Kaya
  0 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-09  1:57 UTC (permalink / raw)
  To: linux-arm-kernel

The MULDIV macro has been designed for small numbers.
Compiler emits an overflow warning on 64 bit systems.
This patch uses 64 bit numbers in order to suppress
warning.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/scsi/sg.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 9d7b7db..112d8974 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -51,6 +51,7 @@ static int sg_version_num = 30536;	/* 2 digits for each component */
 #include <linux/atomic.h>
 #include <linux/ratelimit.h>
 #include <linux/uio.h>
+#include <asm/div64.h>
 
 #include "scsi.h"
 #include <scsi/scsi_dbg.h>
@@ -85,12 +86,17 @@ static void sg_proc_cleanup(void);
  * Replacing muldiv(x) by muldiv(x)=((x % d) * m) / d + int(x / d) * m
  * calculates the same, but prevents the overflow when both m and d
  * are "small" numbers (like HZ and USER_HZ).
- * Of course an overflow is inavoidable if the result of muldiv doesn't fit
- * in 32 bits.
  */
-#define MULDIV(X,MUL,DIV) ((((X % DIV) * MUL) / DIV) + ((X / DIV) * MUL))
+static inline u64 mult_frac64(u64 x, u32 numer, u32 denom)
+{
+	u64 r1 = do_div(x, denom);
+	u64 r2 = r1 * numer;
+
+	do_div(r2, denom);
+	return (x * numer) + r2;
+}
 
-#define SG_DEFAULT_TIMEOUT MULDIV(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ)
+#define SG_DEFAULT_TIMEOUT mult_frac64(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ)
 
 int sg_big_buff = SG_DEF_RESERVED_SIZE;
 /* N.B. This variable is readable and writeable via
@@ -877,10 +883,10 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
 			return result;
 		if (val < 0)
 			return -EIO;
-		if (val >= MULDIV (INT_MAX, USER_HZ, HZ))
-		    val = MULDIV (INT_MAX, USER_HZ, HZ);
+		if (val >= mult_frac64(INT_MAX, USER_HZ, HZ))
+			val = mult_frac64(INT_MAX, USER_HZ, HZ);
 		sfp->timeout_user = val;
-		sfp->timeout = MULDIV (val, HZ, USER_HZ);
+		sfp->timeout = mult_frac64(val, HZ, USER_HZ);
 
 		return 0;
 	case SG_GET_TIMEOUT:	/* N.B. User receives timeout as return value */
-- 
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project

^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [PATCH V2 3/3] scsi: mptxsas: offload IRQ execution
  2015-11-09  1:57 ` Sinan Kaya
@ 2015-11-09  1:57   ` Sinan Kaya
  -1 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-09  1:57 UTC (permalink / raw)
  To: linux-scsi, timur, cov, jcm
  Cc: agross, linux-arm-msm, linux-arm-kernel, Sinan Kaya,
	Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan, James E.J. Bottomley, MPT-FusionLinux.pdl,
	linux-kernel

The mpt2sas and mpt3sas drivers are spinning forever in
their IRQ handlers if there are a lot of jobs queued up
by the PCIe card. This handler is causing spikes for
the rest of the system and sluggish behavior.

Marking all MSI interrupts as non-shared and moving the
MSI interrupts to thread context. This relexes the rest
of the system execution.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/scsi/mpt2sas/mpt2sas_base.c | 12 ++++++++----
 drivers/scsi/mpt3sas/mpt3sas_base.c | 13 +++++++++----
 2 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c
index c61c82a..b619a0e 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
@@ -1359,14 +1359,18 @@ _base_request_irq(struct MPT2SAS_ADAPTER *ioc, u8 index, u32 vector)
 	cpumask_clear(reply_q->affinity_hint);
 
 	atomic_set(&reply_q->busy, 0);
-	if (ioc->msix_enable)
+	if (ioc->msix_enable) {
 		snprintf(reply_q->name, MPT_NAME_LENGTH, "%s%d-msix%d",
 		    MPT2SAS_DRIVER_NAME, ioc->id, index);
-	else
+		r = request_threaded_irq(vector, NULL, _base_interrupt,
+					 IRQF_TRIGGER_RISING | IRQF_ONESHOT,
+					 reply_q->name, reply_q);
+	} else {
 		snprintf(reply_q->name, MPT_NAME_LENGTH, "%s%d",
 		    MPT2SAS_DRIVER_NAME, ioc->id);
-	r = request_irq(vector, _base_interrupt, IRQF_SHARED, reply_q->name,
-	    reply_q);
+		r = request_irq(vector, _base_interrupt, IRQF_SHARED,
+				reply_q->name, reply_q);
+	}
 	if (r) {
 		printk(MPT2SAS_ERR_FMT "unable to allocate interrupt %d!\n",
 		    reply_q->name, vector);
diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 6dc391c..62bee43 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -1661,14 +1661,19 @@ _base_request_irq(struct MPT3SAS_ADAPTER *ioc, u8 index, u32 vector)
 	cpumask_clear(reply_q->affinity_hint);
 
 	atomic_set(&reply_q->busy, 0);
-	if (ioc->msix_enable)
+	if (ioc->msix_enable) {
 		snprintf(reply_q->name, MPT_NAME_LENGTH, "%s%d-msix%d",
 		    MPT3SAS_DRIVER_NAME, ioc->id, index);
-	else
+
+		r = request_threaded_irq(vector, NULL, _base_interrupt,
+					 IRQF_TRIGGER_RISING | IRQF_ONESHOT,
+					 reply_q->name, reply_q);
+	} else {
 		snprintf(reply_q->name, MPT_NAME_LENGTH, "%s%d",
 		    MPT3SAS_DRIVER_NAME, ioc->id);
-	r = request_irq(vector, _base_interrupt, IRQF_SHARED, reply_q->name,
-	    reply_q);
+		r = request_irq(vector, _base_interrupt, IRQF_SHARED,
+				reply_q->name, reply_q);
+	}
 	if (r) {
 		pr_err(MPT3SAS_FMT "unable to allocate interrupt %d!\n",
 		    reply_q->name, vector);
-- 
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project


^ permalink raw reply related	[flat|nested] 86+ messages in thread

* [PATCH V2 3/3] scsi: mptxsas: offload IRQ execution
@ 2015-11-09  1:57   ` Sinan Kaya
  0 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-09  1:57 UTC (permalink / raw)
  To: linux-arm-kernel

The mpt2sas and mpt3sas drivers are spinning forever in
their IRQ handlers if there are a lot of jobs queued up
by the PCIe card. This handler is causing spikes for
the rest of the system and sluggish behavior.

Marking all MSI interrupts as non-shared and moving the
MSI interrupts to thread context. This relexes the rest
of the system execution.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 drivers/scsi/mpt2sas/mpt2sas_base.c | 12 ++++++++----
 drivers/scsi/mpt3sas/mpt3sas_base.c | 13 +++++++++----
 2 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c
index c61c82a..b619a0e 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
@@ -1359,14 +1359,18 @@ _base_request_irq(struct MPT2SAS_ADAPTER *ioc, u8 index, u32 vector)
 	cpumask_clear(reply_q->affinity_hint);
 
 	atomic_set(&reply_q->busy, 0);
-	if (ioc->msix_enable)
+	if (ioc->msix_enable) {
 		snprintf(reply_q->name, MPT_NAME_LENGTH, "%s%d-msix%d",
 		    MPT2SAS_DRIVER_NAME, ioc->id, index);
-	else
+		r = request_threaded_irq(vector, NULL, _base_interrupt,
+					 IRQF_TRIGGER_RISING | IRQF_ONESHOT,
+					 reply_q->name, reply_q);
+	} else {
 		snprintf(reply_q->name, MPT_NAME_LENGTH, "%s%d",
 		    MPT2SAS_DRIVER_NAME, ioc->id);
-	r = request_irq(vector, _base_interrupt, IRQF_SHARED, reply_q->name,
-	    reply_q);
+		r = request_irq(vector, _base_interrupt, IRQF_SHARED,
+				reply_q->name, reply_q);
+	}
 	if (r) {
 		printk(MPT2SAS_ERR_FMT "unable to allocate interrupt %d!\n",
 		    reply_q->name, vector);
diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 6dc391c..62bee43 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -1661,14 +1661,19 @@ _base_request_irq(struct MPT3SAS_ADAPTER *ioc, u8 index, u32 vector)
 	cpumask_clear(reply_q->affinity_hint);
 
 	atomic_set(&reply_q->busy, 0);
-	if (ioc->msix_enable)
+	if (ioc->msix_enable) {
 		snprintf(reply_q->name, MPT_NAME_LENGTH, "%s%d-msix%d",
 		    MPT3SAS_DRIVER_NAME, ioc->id, index);
-	else
+
+		r = request_threaded_irq(vector, NULL, _base_interrupt,
+					 IRQF_TRIGGER_RISING | IRQF_ONESHOT,
+					 reply_q->name, reply_q);
+	} else {
 		snprintf(reply_q->name, MPT_NAME_LENGTH, "%s%d",
 		    MPT3SAS_DRIVER_NAME, ioc->id);
-	r = request_irq(vector, _base_interrupt, IRQF_SHARED, reply_q->name,
-	    reply_q);
+		r = request_irq(vector, _base_interrupt, IRQF_SHARED,
+				reply_q->name, reply_q);
+	}
 	if (r) {
 		pr_err(MPT3SAS_FMT "unable to allocate interrupt %d!\n",
 		    reply_q->name, vector);
-- 
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project

^ permalink raw reply related	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  2015-11-09  1:57   ` Sinan Kaya
@ 2015-11-09  7:09     ` Hannes Reinecke
  -1 siblings, 0 replies; 86+ messages in thread
From: Hannes Reinecke @ 2015-11-09  7:09 UTC (permalink / raw)
  To: Sinan Kaya, linux-scsi, timur, cov, jcm
  Cc: agross, linux-arm-msm, linux-arm-kernel, Nagalakshmi Nandigama,
	Praveen Krishnamoorthy, Sreekanth Reddy, Abhijit Mahajan,
	James E.J. Bottomley, MPT-FusionLinux.pdl, linux-kernel

On 11/09/2015 02:57 AM, Sinan Kaya wrote:
> Current code gives up when 32 bit DMA is not supported.
> This problem has been observed on systems without any
> memory below 4 gig.
> 
> This patch tests 64 bit support before bailing out to find
> a working combination.
> 
That feels decidedly odd.

Why do you probe for 64bit if 32bit fails?
Typically it's the other way round, on the grounds that 64bit DMA
should be preferred over 32bit.
Can you explain why it needs to be done the other way round here?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		               zSeries & Storage
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
@ 2015-11-09  7:09     ` Hannes Reinecke
  0 siblings, 0 replies; 86+ messages in thread
From: Hannes Reinecke @ 2015-11-09  7:09 UTC (permalink / raw)
  To: linux-arm-kernel

On 11/09/2015 02:57 AM, Sinan Kaya wrote:
> Current code gives up when 32 bit DMA is not supported.
> This problem has been observed on systems without any
> memory below 4 gig.
> 
> This patch tests 64 bit support before bailing out to find
> a working combination.
> 
That feels decidedly odd.

Why do you probe for 64bit if 32bit fails?
Typically it's the other way round, on the grounds that 64bit DMA
should be preferred over 32bit.
Can you explain why it needs to be done the other way round here?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		               zSeries & Storage
hare at suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: F. Imend?rffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG N?rnberg)

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 3/3] scsi: mptxsas: offload IRQ execution
  2015-11-09  1:57   ` Sinan Kaya
@ 2015-11-09  7:15     ` Hannes Reinecke
  -1 siblings, 0 replies; 86+ messages in thread
From: Hannes Reinecke @ 2015-11-09  7:15 UTC (permalink / raw)
  To: Sinan Kaya, linux-scsi, timur, cov, jcm
  Cc: agross, linux-arm-msm, linux-arm-kernel, Nagalakshmi Nandigama,
	Praveen Krishnamoorthy, Sreekanth Reddy, Abhijit Mahajan,
	James E.J. Bottomley, MPT-FusionLinux.pdl, linux-kernel

On 11/09/2015 02:57 AM, Sinan Kaya wrote:
> The mpt2sas and mpt3sas drivers are spinning forever in
> their IRQ handlers if there are a lot of jobs queued up
> by the PCIe card. This handler is causing spikes for
> the rest of the system and sluggish behavior.
> 
> Marking all MSI interrupts as non-shared and moving the
> MSI interrupts to thread context. This relexes the rest
> of the system execution.
> 
NACK.

If there is a scalability issue when handling interrupts
it should be fixed in the driver directly.

Looking at the driver is should be possible to implement
a worker thread handling the reply descriptor, and having the
interrupt only to fetch the reply descriptor.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		               zSeries & Storage
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 3/3] scsi: mptxsas: offload IRQ execution
@ 2015-11-09  7:15     ` Hannes Reinecke
  0 siblings, 0 replies; 86+ messages in thread
From: Hannes Reinecke @ 2015-11-09  7:15 UTC (permalink / raw)
  To: linux-arm-kernel

On 11/09/2015 02:57 AM, Sinan Kaya wrote:
> The mpt2sas and mpt3sas drivers are spinning forever in
> their IRQ handlers if there are a lot of jobs queued up
> by the PCIe card. This handler is causing spikes for
> the rest of the system and sluggish behavior.
> 
> Marking all MSI interrupts as non-shared and moving the
> MSI interrupts to thread context. This relexes the rest
> of the system execution.
> 
NACK.

If there is a scalability issue when handling interrupts
it should be fixed in the driver directly.

Looking at the driver is should be possible to implement
a worker thread handling the reply descriptor, and having the
interrupt only to fetch the reply descriptor.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		               zSeries & Storage
hare at suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: F. Imend?rffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG N?rnberg)

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  2015-11-09  7:09     ` Hannes Reinecke
@ 2015-11-09  8:59       ` Arnd Bergmann
  -1 siblings, 0 replies; 86+ messages in thread
From: Arnd Bergmann @ 2015-11-09  8:59 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Hannes Reinecke, Sinan Kaya, linux-scsi, timur, cov, jcm,
	Abhijit Mahajan, Nagalakshmi Nandigama, linux-arm-msm,
	James E.J. Bottomley, linux-kernel, Sreekanth Reddy,
	Praveen Krishnamoorthy, agross, MPT-FusionLinux.pdl

On Monday 09 November 2015 08:09:39 Hannes Reinecke wrote:
> On 11/09/2015 02:57 AM, Sinan Kaya wrote:
> > Current code gives up when 32 bit DMA is not supported.
> > This problem has been observed on systems without any
> > memory below 4 gig.
> > 
> > This patch tests 64 bit support before bailing out to find
> > a working combination.
> > 
> That feels decidedly odd.
> 
> Why do you probe for 64bit if 32bit fails?

32-bit DMA mask on PCI cannot fail, we rely on that in all sorts
of places. If the machine has all of its RAM visible above 4GB
PCI bus addresses, it must use an IOMMU.

> Typically it's the other way round, on the grounds that 64bit DMA
> should be preferred over 32bit.
> Can you explain why it needs to be done the other way round here?

Something else is odd here, the driver already checks for
dma_get_required_mask(), which will return the smallest mask
that fits all of RAM. If the machine has any memory above 4GB,
it already uses the 64-bit mask, and only falls back to
the 32-bit mask if that fails or if all memory fits within the
first 4GB.

So both the description and the patch are wrong. :(

	Arnd

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
@ 2015-11-09  8:59       ` Arnd Bergmann
  0 siblings, 0 replies; 86+ messages in thread
From: Arnd Bergmann @ 2015-11-09  8:59 UTC (permalink / raw)
  To: linux-arm-kernel

On Monday 09 November 2015 08:09:39 Hannes Reinecke wrote:
> On 11/09/2015 02:57 AM, Sinan Kaya wrote:
> > Current code gives up when 32 bit DMA is not supported.
> > This problem has been observed on systems without any
> > memory below 4 gig.
> > 
> > This patch tests 64 bit support before bailing out to find
> > a working combination.
> > 
> That feels decidedly odd.
> 
> Why do you probe for 64bit if 32bit fails?

32-bit DMA mask on PCI cannot fail, we rely on that in all sorts
of places. If the machine has all of its RAM visible above 4GB
PCI bus addresses, it must use an IOMMU.

> Typically it's the other way round, on the grounds that 64bit DMA
> should be preferred over 32bit.
> Can you explain why it needs to be done the other way round here?

Something else is odd here, the driver already checks for
dma_get_required_mask(), which will return the smallest mask
that fits all of RAM. If the machine has any memory above 4GB,
it already uses the 64-bit mask, and only falls back to
the 32-bit mask if that fails or if all memory fits within the
first 4GB.

So both the description and the patch are wrong. :(

	Arnd

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  2015-11-09  7:09     ` Hannes Reinecke
@ 2015-11-09 14:00       ` Sinan Kaya
  -1 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-09 14:00 UTC (permalink / raw)
  To: Hannes Reinecke, linux-scsi, timur, cov, jcm
  Cc: agross, linux-arm-msm, linux-arm-kernel, Nagalakshmi Nandigama,
	Praveen Krishnamoorthy, Sreekanth Reddy, Abhijit Mahajan,
	James E.J. Bottomley, MPT-FusionLinux.pdl, linux-kernel



On 11/9/2015 2:09 AM, Hannes Reinecke wrote:
> On 11/09/2015 02:57 AM, Sinan Kaya wrote:
>> Current code gives up when 32 bit DMA is not supported.
>> This problem has been observed on systems without any
>> memory below 4 gig.
>>
>> This patch tests 64 bit support before bailing out to find
>> a working combination.
>>
> That feels decidedly odd.
>
> Why do you probe for 64bit if 32bit fails?
> Typically it's the other way round, on the grounds that 64bit DMA
> should be preferred over 32bit.
> Can you explain why it needs to be done the other way round here?
>
> Cheers,
>
> Hannes
>

The platform does not have any memory below 4G. So, 32 bit DMA is not 
possible. I'm trying to use 64 bit DMA instead since both the platform 
and hardware supports it. Current code will not try 64 bit DMA if 32 bit 
DMA is not working.

-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
@ 2015-11-09 14:00       ` Sinan Kaya
  0 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-09 14:00 UTC (permalink / raw)
  To: linux-arm-kernel



On 11/9/2015 2:09 AM, Hannes Reinecke wrote:
> On 11/09/2015 02:57 AM, Sinan Kaya wrote:
>> Current code gives up when 32 bit DMA is not supported.
>> This problem has been observed on systems without any
>> memory below 4 gig.
>>
>> This patch tests 64 bit support before bailing out to find
>> a working combination.
>>
> That feels decidedly odd.
>
> Why do you probe for 64bit if 32bit fails?
> Typically it's the other way round, on the grounds that 64bit DMA
> should be preferred over 32bit.
> Can you explain why it needs to be done the other way round here?
>
> Cheers,
>
> Hannes
>

The platform does not have any memory below 4G. So, 32 bit DMA is not 
possible. I'm trying to use 64 bit DMA instead since both the platform 
and hardware supports it. Current code will not try 64 bit DMA if 32 bit 
DMA is not working.

-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 3/3] scsi: mptxsas: offload IRQ execution
  2015-11-09  7:15     ` Hannes Reinecke
@ 2015-11-09 14:01       ` Sinan Kaya
  -1 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-09 14:01 UTC (permalink / raw)
  To: Hannes Reinecke, linux-scsi, timur, cov, jcm
  Cc: agross, linux-arm-msm, linux-arm-kernel, Nagalakshmi Nandigama,
	Praveen Krishnamoorthy, Sreekanth Reddy, Abhijit Mahajan,
	James E.J. Bottomley, MPT-FusionLinux.pdl, linux-kernel



On 11/9/2015 2:15 AM, Hannes Reinecke wrote:
> On 11/09/2015 02:57 AM, Sinan Kaya wrote:
>> The mpt2sas and mpt3sas drivers are spinning forever in
>> their IRQ handlers if there are a lot of jobs queued up
>> by the PCIe card. This handler is causing spikes for
>> the rest of the system and sluggish behavior.
>>
>> Marking all MSI interrupts as non-shared and moving the
>> MSI interrupts to thread context. This relexes the rest
>> of the system execution.
>>
> NACK.
>
> If there is a scalability issue when handling interrupts
> it should be fixed in the driver directly.
>
> Looking at the driver is should be possible to implement
> a worker thread handling the reply descriptor, and having the
> interrupt only to fetch the reply descriptor.

I'll take a look.

>
> Cheers,
>
> Hannes
>

-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 3/3] scsi: mptxsas: offload IRQ execution
@ 2015-11-09 14:01       ` Sinan Kaya
  0 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-09 14:01 UTC (permalink / raw)
  To: linux-arm-kernel



On 11/9/2015 2:15 AM, Hannes Reinecke wrote:
> On 11/09/2015 02:57 AM, Sinan Kaya wrote:
>> The mpt2sas and mpt3sas drivers are spinning forever in
>> their IRQ handlers if there are a lot of jobs queued up
>> by the PCIe card. This handler is causing spikes for
>> the rest of the system and sluggish behavior.
>>
>> Marking all MSI interrupts as non-shared and moving the
>> MSI interrupts to thread context. This relexes the rest
>> of the system execution.
>>
> NACK.
>
> If there is a scalability issue when handling interrupts
> it should be fixed in the driver directly.
>
> Looking at the driver is should be possible to implement
> a worker thread handling the reply descriptor, and having the
> interrupt only to fetch the reply descriptor.

I'll take a look.

>
> Cheers,
>
> Hannes
>

-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  2015-11-09  8:59       ` Arnd Bergmann
@ 2015-11-09 14:07         ` Sinan Kaya
  -1 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-09 14:07 UTC (permalink / raw)
  To: Arnd Bergmann, linux-arm-kernel
  Cc: Hannes Reinecke, linux-scsi, timur, cov, jcm, Abhijit Mahajan,
	Nagalakshmi Nandigama, linux-arm-msm, James E.J. Bottomley,
	linux-kernel, Sreekanth Reddy, Praveen Krishnamoorthy, agross,
	MPT-FusionLinux.pdl



On 11/9/2015 3:59 AM, Arnd Bergmann wrote:
> On Monday 09 November 2015 08:09:39 Hannes Reinecke wrote:
>> On 11/09/2015 02:57 AM, Sinan Kaya wrote:
>>> Current code gives up when 32 bit DMA is not supported.
>>> This problem has been observed on systems without any
>>> memory below 4 gig.
>>>
>>> This patch tests 64 bit support before bailing out to find
>>> a working combination.
>>>
>> That feels decidedly odd.
>>
>> Why do you probe for 64bit if 32bit fails?
>
> 32-bit DMA mask on PCI cannot fail, we rely on that in all sorts
> of places. If the machine has all of its RAM visible above 4GB
> PCI bus addresses, it must use an IOMMU.

Can you be specific? PCIe does not have this limitation. It supports 32 
bit and 64 bit TLPs.

I have not seen any limitation so far in the OS either.

Using IOMMU is fine but not required if the endpoint is a true 64 bit 
supporting endpoint. This endpoint supports 64bit too.

>
>> Typically it's the other way round, on the grounds that 64bit DMA
>> should be preferred over 32bit.
>> Can you explain why it needs to be done the other way round here?
>
> Something else is odd here, the driver already checks for
> dma_get_required_mask(), which will return the smallest mask
> that fits all of RAM. If the machine has any memory above 4GB,
> it already uses the 64-bit mask, and only falls back to
> the 32-bit mask if that fails or if all memory fits within the
> first 4GB.
>

I'll add some prints in the code to get to the bottom of it. I think the 
code is checking more than just the required mask and failing in one of 
the other conditions. At least that DMA comparison code was more than 
what I have ever seen.


> So both the description and the patch are wrong. :(
>
> 	Arnd
>

-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
@ 2015-11-09 14:07         ` Sinan Kaya
  0 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-09 14:07 UTC (permalink / raw)
  To: linux-arm-kernel



On 11/9/2015 3:59 AM, Arnd Bergmann wrote:
> On Monday 09 November 2015 08:09:39 Hannes Reinecke wrote:
>> On 11/09/2015 02:57 AM, Sinan Kaya wrote:
>>> Current code gives up when 32 bit DMA is not supported.
>>> This problem has been observed on systems without any
>>> memory below 4 gig.
>>>
>>> This patch tests 64 bit support before bailing out to find
>>> a working combination.
>>>
>> That feels decidedly odd.
>>
>> Why do you probe for 64bit if 32bit fails?
>
> 32-bit DMA mask on PCI cannot fail, we rely on that in all sorts
> of places. If the machine has all of its RAM visible above 4GB
> PCI bus addresses, it must use an IOMMU.

Can you be specific? PCIe does not have this limitation. It supports 32 
bit and 64 bit TLPs.

I have not seen any limitation so far in the OS either.

Using IOMMU is fine but not required if the endpoint is a true 64 bit 
supporting endpoint. This endpoint supports 64bit too.

>
>> Typically it's the other way round, on the grounds that 64bit DMA
>> should be preferred over 32bit.
>> Can you explain why it needs to be done the other way round here?
>
> Something else is odd here, the driver already checks for
> dma_get_required_mask(), which will return the smallest mask
> that fits all of RAM. If the machine has any memory above 4GB,
> it already uses the 64-bit mask, and only falls back to
> the 32-bit mask if that fails or if all memory fits within the
> first 4GB.
>

I'll add some prints in the code to get to the bottom of it. I think the 
code is checking more than just the required mask and failing in one of 
the other conditions. At least that DMA comparison code was more than 
what I have ever seen.


> So both the description and the patch are wrong. :(
>
> 	Arnd
>

-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 2/3] scsi: fix compiler warning for sg
  2015-11-09  1:57   ` Sinan Kaya
@ 2015-11-09 14:14     ` Andy Shevchenko
  -1 siblings, 0 replies; 86+ messages in thread
From: Andy Shevchenko @ 2015-11-09 14:14 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: linux-scsi, timur, cov, jcm, Andy Gross, linux-arm-msm,
	linux-arm Mailing List, Doug Gilbert, James E.J. Bottomley,
	linux-kernel

On Mon, Nov 9, 2015 at 3:57 AM, Sinan Kaya <okaya@codeaurora.org> wrote:
> The MULDIV macro has been designed for small numbers.
> Compiler emits an overflow warning on 64 bit systems.
> This patch uses 64 bit numbers in order to suppress
> warning.
>
> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>


> ---
>  drivers/scsi/sg.c | 20 +++++++++++++-------
>  1 file changed, 13 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
> index 9d7b7db..112d8974 100644
> --- a/drivers/scsi/sg.c
> +++ b/drivers/scsi/sg.c
> @@ -51,6 +51,7 @@ static int sg_version_num = 30536;    /* 2 digits for each component */
>  #include <linux/atomic.h>
>  #include <linux/ratelimit.h>
>  #include <linux/uio.h>
> +#include <asm/div64.h>
>
>  #include "scsi.h"
>  #include <scsi/scsi_dbg.h>
> @@ -85,12 +86,17 @@ static void sg_proc_cleanup(void);
>   * Replacing muldiv(x) by muldiv(x)=((x % d) * m) / d + int(x / d) * m
>   * calculates the same, but prevents the overflow when both m and d
>   * are "small" numbers (like HZ and USER_HZ).
> - * Of course an overflow is inavoidable if the result of muldiv doesn't fit
> - * in 32 bits.
>   */
> -#define MULDIV(X,MUL,DIV) ((((X % DIV) * MUL) / DIV) + ((X / DIV) * MUL))
> +static inline u64 mult_frac64(u64 x, u32 numer, u32 denom)
> +{
> +       u64 r1 = do_div(x, denom);
> +       u64 r2 = r1 * numer;
> +
> +       do_div(r2, denom);

> +       return (x * numer) + r2;

Parens are useless, noticed later, sorry.

Isn't mult_frac() enough here?

Btw, can you mention explicitly what is the warning you get
(copy'n'paste of the line would be okay)?

> +}
>
> -#define SG_DEFAULT_TIMEOUT MULDIV(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ)
> +#define SG_DEFAULT_TIMEOUT mult_frac64(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ)
>
>  int sg_big_buff = SG_DEF_RESERVED_SIZE;
>  /* N.B. This variable is readable and writeable via
> @@ -877,10 +883,10 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
>                         return result;
>                 if (val < 0)
>                         return -EIO;
> -               if (val >= MULDIV (INT_MAX, USER_HZ, HZ))
> -                   val = MULDIV (INT_MAX, USER_HZ, HZ);
> +               if (val >= mult_frac64(INT_MAX, USER_HZ, HZ))
> +                       val = mult_frac64(INT_MAX, USER_HZ, HZ);


>                 sfp->timeout_user = val;
> -               sfp->timeout = MULDIV (val, HZ, USER_HZ);
> +               sfp->timeout = mult_frac64(val, HZ, USER_HZ);
>
>                 return 0;
>         case SG_GET_TIMEOUT:    /* N.B. User receives timeout as return value */
> --
> Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 2/3] scsi: fix compiler warning for sg
@ 2015-11-09 14:14     ` Andy Shevchenko
  0 siblings, 0 replies; 86+ messages in thread
From: Andy Shevchenko @ 2015-11-09 14:14 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Nov 9, 2015 at 3:57 AM, Sinan Kaya <okaya@codeaurora.org> wrote:
> The MULDIV macro has been designed for small numbers.
> Compiler emits an overflow warning on 64 bit systems.
> This patch uses 64 bit numbers in order to suppress
> warning.
>
> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>


> ---
>  drivers/scsi/sg.c | 20 +++++++++++++-------
>  1 file changed, 13 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
> index 9d7b7db..112d8974 100644
> --- a/drivers/scsi/sg.c
> +++ b/drivers/scsi/sg.c
> @@ -51,6 +51,7 @@ static int sg_version_num = 30536;    /* 2 digits for each component */
>  #include <linux/atomic.h>
>  #include <linux/ratelimit.h>
>  #include <linux/uio.h>
> +#include <asm/div64.h>
>
>  #include "scsi.h"
>  #include <scsi/scsi_dbg.h>
> @@ -85,12 +86,17 @@ static void sg_proc_cleanup(void);
>   * Replacing muldiv(x) by muldiv(x)=((x % d) * m) / d + int(x / d) * m
>   * calculates the same, but prevents the overflow when both m and d
>   * are "small" numbers (like HZ and USER_HZ).
> - * Of course an overflow is inavoidable if the result of muldiv doesn't fit
> - * in 32 bits.
>   */
> -#define MULDIV(X,MUL,DIV) ((((X % DIV) * MUL) / DIV) + ((X / DIV) * MUL))
> +static inline u64 mult_frac64(u64 x, u32 numer, u32 denom)
> +{
> +       u64 r1 = do_div(x, denom);
> +       u64 r2 = r1 * numer;
> +
> +       do_div(r2, denom);

> +       return (x * numer) + r2;

Parens are useless, noticed later, sorry.

Isn't mult_frac() enough here?

Btw, can you mention explicitly what is the warning you get
(copy'n'paste of the line would be okay)?

> +}
>
> -#define SG_DEFAULT_TIMEOUT MULDIV(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ)
> +#define SG_DEFAULT_TIMEOUT mult_frac64(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ)
>
>  int sg_big_buff = SG_DEF_RESERVED_SIZE;
>  /* N.B. This variable is readable and writeable via
> @@ -877,10 +883,10 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
>                         return result;
>                 if (val < 0)
>                         return -EIO;
> -               if (val >= MULDIV (INT_MAX, USER_HZ, HZ))
> -                   val = MULDIV (INT_MAX, USER_HZ, HZ);
> +               if (val >= mult_frac64(INT_MAX, USER_HZ, HZ))
> +                       val = mult_frac64(INT_MAX, USER_HZ, HZ);


>                 sfp->timeout_user = val;
> -               sfp->timeout = MULDIV (val, HZ, USER_HZ);
> +               sfp->timeout = mult_frac64(val, HZ, USER_HZ);
>
>                 return 0;
>         case SG_GET_TIMEOUT:    /* N.B. User receives timeout as return value */
> --
> Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  2015-11-09 14:07         ` Sinan Kaya
@ 2015-11-09 14:33           ` Arnd Bergmann
  -1 siblings, 0 replies; 86+ messages in thread
From: Arnd Bergmann @ 2015-11-09 14:33 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: linux-arm-kernel, Hannes Reinecke, linux-scsi, timur, cov, jcm,
	Abhijit Mahajan, Nagalakshmi Nandigama, linux-arm-msm,
	James E.J. Bottomley, linux-kernel, Sreekanth Reddy,
	Praveen Krishnamoorthy, agross, MPT-FusionLinux.pdl

On Monday 09 November 2015 09:07:36 Sinan Kaya wrote:
> 
> On 11/9/2015 3:59 AM, Arnd Bergmann wrote:
> > On Monday 09 November 2015 08:09:39 Hannes Reinecke wrote:
> >> On 11/09/2015 02:57 AM, Sinan Kaya wrote:
> >>> Current code gives up when 32 bit DMA is not supported.
> >>> This problem has been observed on systems without any
> >>> memory below 4 gig.
> >>>
> >>> This patch tests 64 bit support before bailing out to find
> >>> a working combination.
> >>>
> >> That feels decidedly odd.
> >>
> >> Why do you probe for 64bit if 32bit fails?
> >
> > 32-bit DMA mask on PCI cannot fail, we rely on that in all sorts
> > of places. If the machine has all of its RAM visible above 4GB
> > PCI bus addresses, it must use an IOMMU.
> 
> Can you be specific? PCIe does not have this limitation. It supports 32 
> bit and 64 bit TLPs.
> 
> I have not seen any limitation so far in the OS either.

See Documentation/DMA-API-HOWTO.txt

All PCI devices start out with a 32-bit DMA mask, and Linux assumes it
can always fall back to a 32-bit mask if a smaller mask (needed for
some devices that only support DMA on a subset of the 4GB) or larger
mask (needed to address high memory but can fail when the PCI host
does not support it) cannot be set.

> Using IOMMU is fine but not required if the endpoint is a true 64 bit 
> supporting endpoint. This endpoint supports 64bit too.

There are two aspects here:

a) setting a 32-bit mask should not have failed. Any device that actually
   needs 32-bit DMA will make the same call and the platform must
   guarantee that this works. If you have a broken platform that can't
   do this, complain to your board vendor so they wire up the RAM
   properly, with at least the first 2GB visible to low PCI bus
   addresses.

b) If the platform has any memory above 4GB and the supports high DMA,
   it should never have asked for the 32-bit mask before trying the
   64-bit mask first. This is only a performance optimization, but the
   driver seems to do the right thing, so I assume there is something
   wrong with the platform code.

> >> Typically it's the other way round, on the grounds that 64bit DMA
> >> should be preferred over 32bit.
> >> Can you explain why it needs to be done the other way round here?
> >
> > Something else is odd here, the driver already checks for
> > dma_get_required_mask(), which will return the smallest mask
> > that fits all of RAM. If the machine has any memory above 4GB,
> > it already uses the 64-bit mask, and only falls back to
> > the 32-bit mask if that fails or if all memory fits within the
> > first 4GB.
> >
> 
> I'll add some prints in the code to get to the bottom of it. I think the 
> code is checking more than just the required mask and failing in one of 
> the other conditions. At least that DMA comparison code was more than 
> what I have ever seen.

Ok. That should take care of b) above, but we still have a bug with a)
that will come back to bite you if you don't address it right.

	Arnd

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
@ 2015-11-09 14:33           ` Arnd Bergmann
  0 siblings, 0 replies; 86+ messages in thread
From: Arnd Bergmann @ 2015-11-09 14:33 UTC (permalink / raw)
  To: linux-arm-kernel

On Monday 09 November 2015 09:07:36 Sinan Kaya wrote:
> 
> On 11/9/2015 3:59 AM, Arnd Bergmann wrote:
> > On Monday 09 November 2015 08:09:39 Hannes Reinecke wrote:
> >> On 11/09/2015 02:57 AM, Sinan Kaya wrote:
> >>> Current code gives up when 32 bit DMA is not supported.
> >>> This problem has been observed on systems without any
> >>> memory below 4 gig.
> >>>
> >>> This patch tests 64 bit support before bailing out to find
> >>> a working combination.
> >>>
> >> That feels decidedly odd.
> >>
> >> Why do you probe for 64bit if 32bit fails?
> >
> > 32-bit DMA mask on PCI cannot fail, we rely on that in all sorts
> > of places. If the machine has all of its RAM visible above 4GB
> > PCI bus addresses, it must use an IOMMU.
> 
> Can you be specific? PCIe does not have this limitation. It supports 32 
> bit and 64 bit TLPs.
> 
> I have not seen any limitation so far in the OS either.

See Documentation/DMA-API-HOWTO.txt

All PCI devices start out with a 32-bit DMA mask, and Linux assumes it
can always fall back to a 32-bit mask if a smaller mask (needed for
some devices that only support DMA on a subset of the 4GB) or larger
mask (needed to address high memory but can fail when the PCI host
does not support it) cannot be set.

> Using IOMMU is fine but not required if the endpoint is a true 64 bit 
> supporting endpoint. This endpoint supports 64bit too.

There are two aspects here:

a) setting a 32-bit mask should not have failed. Any device that actually
   needs 32-bit DMA will make the same call and the platform must
   guarantee that this works. If you have a broken platform that can't
   do this, complain to your board vendor so they wire up the RAM
   properly, with at least the first 2GB visible to low PCI bus
   addresses.

b) If the platform has any memory above 4GB and the supports high DMA,
   it should never have asked for the 32-bit mask before trying the
   64-bit mask first. This is only a performance optimization, but the
   driver seems to do the right thing, so I assume there is something
   wrong with the platform code.

> >> Typically it's the other way round, on the grounds that 64bit DMA
> >> should be preferred over 32bit.
> >> Can you explain why it needs to be done the other way round here?
> >
> > Something else is odd here, the driver already checks for
> > dma_get_required_mask(), which will return the smallest mask
> > that fits all of RAM. If the machine has any memory above 4GB,
> > it already uses the 64-bit mask, and only falls back to
> > the 32-bit mask if that fails or if all memory fits within the
> > first 4GB.
> >
> 
> I'll add some prints in the code to get to the bottom of it. I think the 
> code is checking more than just the required mask and failing in one of 
> the other conditions. At least that DMA comparison code was more than 
> what I have ever seen.

Ok. That should take care of b) above, but we still have a bug with a)
that will come back to bite you if you don't address it right.

	Arnd

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  2015-11-09 14:33           ` Arnd Bergmann
@ 2015-11-09 23:22             ` Sinan Kaya
  -1 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-09 23:22 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-arm-kernel, Hannes Reinecke, linux-scsi, timur, cov, jcm,
	Abhijit Mahajan, Nagalakshmi Nandigama, linux-arm-msm,
	James E.J. Bottomley, linux-kernel, Sreekanth Reddy,
	Praveen Krishnamoorthy, agross, MPT-FusionLinux.pdl



On 11/9/2015 9:33 AM, Arnd Bergmann wrote:
> On Monday 09 November 2015 09:07:36 Sinan Kaya wrote:
>>
>> On 11/9/2015 3:59 AM, Arnd Bergmann wrote:
>>> On Monday 09 November 2015 08:09:39 Hannes Reinecke wrote:
>>>> On 11/09/2015 02:57 AM, Sinan Kaya wrote:
>>>>> Current code gives up when 32 bit DMA is not supported.
>>>>> This problem has been observed on systems without any
>>>>> memory below 4 gig.
>>>>>
>>>>> This patch tests 64 bit support before bailing out to find
>>>>> a working combination.
>>>>>
>>>> That feels decidedly odd.
>>>>
>>>> Why do you probe for 64bit if 32bit fails?
>>>
>>> 32-bit DMA mask on PCI cannot fail, we rely on that in all sorts
>>> of places. If the machine has all of its RAM visible above 4GB
>>> PCI bus addresses, it must use an IOMMU.
>>
>> Can you be specific? PCIe does not have this limitation. It supports 32
>> bit and 64 bit TLPs.
>>
>> I have not seen any limitation so far in the OS either.
>
> See Documentation/DMA-API-HOWTO.txt
>
> All PCI devices start out with a 32-bit DMA mask, and Linux assumes it
> can always fall back to a 32-bit mask if a smaller mask (needed for
> some devices that only support DMA on a subset of the 4GB) or larger
> mask (needed to address high memory but can fail when the PCI host
> does not support it) cannot be set.
>
>> Using IOMMU is fine but not required if the endpoint is a true 64 bit
>> supporting endpoint. This endpoint supports 64bit too.
>
> There are two aspects here:
>
> a) setting a 32-bit mask should not have failed. Any device that actually
>     needs 32-bit DMA will make the same call and the platform must
>     guarantee that this works. If you have a broken platform that can't
>     do this, complain to your board vendor so they wire up the RAM
>     properly, with at least the first 2GB visible to low PCI bus
>     addresses.
>
> b) If the platform has any memory above 4GB and the supports high DMA,
>     it should never have asked for the 32-bit mask before trying the
>     64-bit mask first. This is only a performance optimization, but the
>     driver seems to do the right thing, so I assume there is something
>     wrong with the platform code.
>
>>>> Typically it's the other way round, on the grounds that 64bit DMA
>>>> should be preferred over 32bit.
>>>> Can you explain why it needs to be done the other way round here?
>>>
>>> Something else is odd here, the driver already checks for
>>> dma_get_required_mask(), which will return the smallest mask
>>> that fits all of RAM. If the machine has any memory above 4GB,
>>> it already uses the 64-bit mask, and only falls back to
>>> the 32-bit mask if that fails or if all memory fits within the
>>> first 4GB.
>>>
>>
>> I'll add some prints in the code to get to the bottom of it. I think the
>> code is checking more than just the required mask and failing in one of
>> the other conditions. At least that DMA comparison code was more than
>> what I have ever seen.
>
> Ok. That should take care of b) above, but we still have a bug with a)
> that will come back to bite you if you don't address it right.
>
> 	Arnd
>

B should have worked and it doesn't. B works for other drivers but not 
for this particular one.

As you said,
"it should never have asked for the 32-bit mask before trying the
64-bit mask first."

this is the problem.

This HW doesn't have support for a. If I need a, I am required to use 
IOMMU. Here is what I found.

mpt2sas version 20.100.00.00 loaded
sizeof(dma_addr_t):8
ioc->dma_mask :0
dma_get_required_mask(&pdev->dev) :7fffffffff
mpt2sas0: no suitable DMA mask for 0002:01:00.0
mpt2sas0: failure at 
drivers/scsi/mpt2sas/mpt2sas_scsih.c:8498/_scsih_probe()!


Here is the code.

if (ioc->dma_mask)
	consistent_dma_mask = DMA_BIT_MASK(64);
else
	consistent_dma_mask = DMA_BIT_MASK(32); 	<-- why here?

if (sizeof(dma_addr_t) > 4) {			<-- true
	const uint64_t required_mask =
	    dma_get_required_mask(&pdev->dev);
	if ((required_mask > DMA_BIT_MASK(32)) &&	<-- true
	    !pci_set_dma_mask(pdev, DMA_BIT_MASK(64)) && <-- true
	    !pci_set_consistent_dma_mask(pdev, consistent_dma_mask)) { <-- false
		ioc->base_add_sg_single = &_base_add_sg_single_64;
		ioc->sge_size = sizeof(Mpi2SGESimple64_t);
		ioc->dma_mask = 64;
		goto out;
	}
}

ioc->dma_mask is 0 and the driver is trying to use 32 bit even though 64 
bit supported by the platform.

I think the proper fix is to pass the required_mask back to 
consistent_dma_mask rather than using ioc->dma_mask and guessing.

pci_set_consistent_dma_mask(pdev, required_mask)

My code was just a band aid for broken code.

The driver worked after that changing the above line only.

mpt2sas_version_20.100.00.00_loaded
sizeof(dma_addr_t) :8
ioc->dma_mask :0
dma_get_required_mask(&pdev->dev) :7fffffffff
mpt2sas0: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (16411112 kB)
mpt2sas0: MSI-X vectors supported: 1, no of cores: 24, max_msix_vectors: 8
mpt2sas0: IO-APIC enabled: IRQ 105
mpt2sas0: iomem(0x00000a01005c0000), mapped(0xffff0000008d8000), size(16384)
mpt2sas0: ioport(0x0000000000000000), size(0)
mpt2sas0: Allocated physical memory: size(3392 kB)
mpt2sas0: Current Controller Queue Depth(1483), Max Controller Queue 
Depth(1720)
mpt2sas0: Scatter Gather Elements per IO(128)
mpt2sas0: LSISAS2004: FWVersion(17.00.01.00), ChipRevision(0x03), 
BiosVersion(07.33.0
mpt2sas0: Protocol=(Initiator), Capabilities=(Raid,TLR,EEDP,Snapshot 
Buffer,Diag Trac
scsi host0: Fusion MPT SAS Host
mpt2sas0: sending port enable !!
mpt2sas0: host_add: handle(0x0001), sas_addr(0x500062b0002de4f0), phys(8)
mpt2sas0: port enable: SUCCESS
scsi 0:0:0:0: Direct-Access     ATA      Samsung SSD 845D 3X3Q PQ: 0 ANSI: 6
scsi 0:0:0:0: SATA: handle(0x0009), sas_addr(0x4433221101000000), 
phy(1), device_name



-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
@ 2015-11-09 23:22             ` Sinan Kaya
  0 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-09 23:22 UTC (permalink / raw)
  To: linux-arm-kernel



On 11/9/2015 9:33 AM, Arnd Bergmann wrote:
> On Monday 09 November 2015 09:07:36 Sinan Kaya wrote:
>>
>> On 11/9/2015 3:59 AM, Arnd Bergmann wrote:
>>> On Monday 09 November 2015 08:09:39 Hannes Reinecke wrote:
>>>> On 11/09/2015 02:57 AM, Sinan Kaya wrote:
>>>>> Current code gives up when 32 bit DMA is not supported.
>>>>> This problem has been observed on systems without any
>>>>> memory below 4 gig.
>>>>>
>>>>> This patch tests 64 bit support before bailing out to find
>>>>> a working combination.
>>>>>
>>>> That feels decidedly odd.
>>>>
>>>> Why do you probe for 64bit if 32bit fails?
>>>
>>> 32-bit DMA mask on PCI cannot fail, we rely on that in all sorts
>>> of places. If the machine has all of its RAM visible above 4GB
>>> PCI bus addresses, it must use an IOMMU.
>>
>> Can you be specific? PCIe does not have this limitation. It supports 32
>> bit and 64 bit TLPs.
>>
>> I have not seen any limitation so far in the OS either.
>
> See Documentation/DMA-API-HOWTO.txt
>
> All PCI devices start out with a 32-bit DMA mask, and Linux assumes it
> can always fall back to a 32-bit mask if a smaller mask (needed for
> some devices that only support DMA on a subset of the 4GB) or larger
> mask (needed to address high memory but can fail when the PCI host
> does not support it) cannot be set.
>
>> Using IOMMU is fine but not required if the endpoint is a true 64 bit
>> supporting endpoint. This endpoint supports 64bit too.
>
> There are two aspects here:
>
> a) setting a 32-bit mask should not have failed. Any device that actually
>     needs 32-bit DMA will make the same call and the platform must
>     guarantee that this works. If you have a broken platform that can't
>     do this, complain to your board vendor so they wire up the RAM
>     properly, with at least the first 2GB visible to low PCI bus
>     addresses.
>
> b) If the platform has any memory above 4GB and the supports high DMA,
>     it should never have asked for the 32-bit mask before trying the
>     64-bit mask first. This is only a performance optimization, but the
>     driver seems to do the right thing, so I assume there is something
>     wrong with the platform code.
>
>>>> Typically it's the other way round, on the grounds that 64bit DMA
>>>> should be preferred over 32bit.
>>>> Can you explain why it needs to be done the other way round here?
>>>
>>> Something else is odd here, the driver already checks for
>>> dma_get_required_mask(), which will return the smallest mask
>>> that fits all of RAM. If the machine has any memory above 4GB,
>>> it already uses the 64-bit mask, and only falls back to
>>> the 32-bit mask if that fails or if all memory fits within the
>>> first 4GB.
>>>
>>
>> I'll add some prints in the code to get to the bottom of it. I think the
>> code is checking more than just the required mask and failing in one of
>> the other conditions. At least that DMA comparison code was more than
>> what I have ever seen.
>
> Ok. That should take care of b) above, but we still have a bug with a)
> that will come back to bite you if you don't address it right.
>
> 	Arnd
>

B should have worked and it doesn't. B works for other drivers but not 
for this particular one.

As you said,
"it should never have asked for the 32-bit mask before trying the
64-bit mask first."

this is the problem.

This HW doesn't have support for a. If I need a, I am required to use 
IOMMU. Here is what I found.

mpt2sas version 20.100.00.00 loaded
sizeof(dma_addr_t):8
ioc->dma_mask :0
dma_get_required_mask(&pdev->dev) :7fffffffff
mpt2sas0: no suitable DMA mask for 0002:01:00.0
mpt2sas0: failure at 
drivers/scsi/mpt2sas/mpt2sas_scsih.c:8498/_scsih_probe()!


Here is the code.

if (ioc->dma_mask)
	consistent_dma_mask = DMA_BIT_MASK(64);
else
	consistent_dma_mask = DMA_BIT_MASK(32); 	<-- why here?

if (sizeof(dma_addr_t) > 4) {			<-- true
	const uint64_t required_mask =
	    dma_get_required_mask(&pdev->dev);
	if ((required_mask > DMA_BIT_MASK(32)) &&	<-- true
	    !pci_set_dma_mask(pdev, DMA_BIT_MASK(64)) && <-- true
	    !pci_set_consistent_dma_mask(pdev, consistent_dma_mask)) { <-- false
		ioc->base_add_sg_single = &_base_add_sg_single_64;
		ioc->sge_size = sizeof(Mpi2SGESimple64_t);
		ioc->dma_mask = 64;
		goto out;
	}
}

ioc->dma_mask is 0 and the driver is trying to use 32 bit even though 64 
bit supported by the platform.

I think the proper fix is to pass the required_mask back to 
consistent_dma_mask rather than using ioc->dma_mask and guessing.

pci_set_consistent_dma_mask(pdev, required_mask)

My code was just a band aid for broken code.

The driver worked after that changing the above line only.

mpt2sas_version_20.100.00.00_loaded
sizeof(dma_addr_t) :8
ioc->dma_mask :0
dma_get_required_mask(&pdev->dev) :7fffffffff
mpt2sas0: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (16411112 kB)
mpt2sas0: MSI-X vectors supported: 1, no of cores: 24, max_msix_vectors: 8
mpt2sas0: IO-APIC enabled: IRQ 105
mpt2sas0: iomem(0x00000a01005c0000), mapped(0xffff0000008d8000), size(16384)
mpt2sas0: ioport(0x0000000000000000), size(0)
mpt2sas0: Allocated physical memory: size(3392 kB)
mpt2sas0: Current Controller Queue Depth(1483), Max Controller Queue 
Depth(1720)
mpt2sas0: Scatter Gather Elements per IO(128)
mpt2sas0: LSISAS2004: FWVersion(17.00.01.00), ChipRevision(0x03), 
BiosVersion(07.33.0
mpt2sas0: Protocol=(Initiator), Capabilities=(Raid,TLR,EEDP,Snapshot 
Buffer,Diag Trac
scsi host0: Fusion MPT SAS Host
mpt2sas0: sending port enable !!
mpt2sas0: host_add: handle(0x0001), sas_addr(0x500062b0002de4f0), phys(8)
mpt2sas0: port enable: SUCCESS
scsi 0:0:0:0: Direct-Access     ATA      Samsung SSD 845D 3X3Q PQ: 0 ANSI: 6
scsi 0:0:0:0: SATA: handle(0x0009), sas_addr(0x4433221101000000), 
phy(1), device_name



-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  2015-11-09 23:22             ` Sinan Kaya
@ 2015-11-09 23:29               ` Timur Tabi
  -1 siblings, 0 replies; 86+ messages in thread
From: Timur Tabi @ 2015-11-09 23:29 UTC (permalink / raw)
  To: Sinan Kaya, Arnd Bergmann
  Cc: linux-arm-kernel, Hannes Reinecke, linux-scsi, cov, jcm,
	Abhijit Mahajan, Nagalakshmi Nandigama, linux-arm-msm,
	James E.J. Bottomley, linux-kernel, Sreekanth Reddy,
	Praveen Krishnamoorthy, agross, MPT-FusionLinux.pdl

On 11/09/2015 05:22 PM, Sinan Kaya wrote:
>
> if (ioc->dma_mask)
>      consistent_dma_mask = DMA_BIT_MASK(64);
> else
>      consistent_dma_mask = DMA_BIT_MASK(32);     <-- why here?

So this change is from this patch:

	http://permalink.gmane.org/gmane.linux.kernel/1759343

Note that this was discussed back in February:

	https://lkml.org/lkml/2015/2/20/2


-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
@ 2015-11-09 23:29               ` Timur Tabi
  0 siblings, 0 replies; 86+ messages in thread
From: Timur Tabi @ 2015-11-09 23:29 UTC (permalink / raw)
  To: linux-arm-kernel

On 11/09/2015 05:22 PM, Sinan Kaya wrote:
>
> if (ioc->dma_mask)
>      consistent_dma_mask = DMA_BIT_MASK(64);
> else
>      consistent_dma_mask = DMA_BIT_MASK(32);     <-- why here?

So this change is from this patch:

	http://permalink.gmane.org/gmane.linux.kernel/1759343

Note that this was discussed back in February:

	https://lkml.org/lkml/2015/2/20/2


-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 2/3] scsi: fix compiler warning for sg
  2015-11-09 14:14     ` Andy Shevchenko
  (?)
@ 2015-11-10  3:21       ` Sinan Kaya
  -1 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-10  3:21 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Doug Gilbert, James E.J. Bottomley, linux-scsi, linux-arm-msm,
	timur, Andy Gross, linux-kernel, cov, jcm,
	linux-arm Mailing List



On 11/9/2015 9:14 AM, Andy Shevchenko wrote:
> Parens are useless, noticed later, sorry.
>
> Isn't mult_frac() enough here?
>
> Btw, can you mention explicitly what is the warning you get
> (copy'n'paste of the line would be okay)?

I created this patch back in March with an older version of the compiler 
and older kernel (3.19). I'm no longer able to reproduce this with this 
compiler and linux-next.

Thread model: posix
gcc version 4.8.3 20140401 (prerelease) (crosstool-NG 
linaro-1.13.1-4.8-2014.04 - Linaro GCC 4.8-2014.04)

I'll drop this patch.

-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 2/3] scsi: fix compiler warning for sg
@ 2015-11-10  3:21       ` Sinan Kaya
  0 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-10  3:21 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: linux-scsi, timur, cov, jcm, Andy Gross, linux-arm-msm,
	linux-arm Mailing List, Doug Gilbert, James E.J. Bottomley,
	linux-kernel



On 11/9/2015 9:14 AM, Andy Shevchenko wrote:
> Parens are useless, noticed later, sorry.
>
> Isn't mult_frac() enough here?
>
> Btw, can you mention explicitly what is the warning you get
> (copy'n'paste of the line would be okay)?

I created this patch back in March with an older version of the compiler 
and older kernel (3.19). I'm no longer able to reproduce this with this 
compiler and linux-next.

Thread model: posix
gcc version 4.8.3 20140401 (prerelease) (crosstool-NG 
linaro-1.13.1-4.8-2014.04 - Linaro GCC 4.8-2014.04)

I'll drop this patch.

-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 2/3] scsi: fix compiler warning for sg
@ 2015-11-10  3:21       ` Sinan Kaya
  0 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-10  3:21 UTC (permalink / raw)
  To: linux-arm-kernel



On 11/9/2015 9:14 AM, Andy Shevchenko wrote:
> Parens are useless, noticed later, sorry.
>
> Isn't mult_frac() enough here?
>
> Btw, can you mention explicitly what is the warning you get
> (copy'n'paste of the line would be okay)?

I created this patch back in March with an older version of the compiler 
and older kernel (3.19). I'm no longer able to reproduce this with this 
compiler and linux-next.

Thread model: posix
gcc version 4.8.3 20140401 (prerelease) (crosstool-NG 
linaro-1.13.1-4.8-2014.04 - Linaro GCC 4.8-2014.04)

I'll drop this patch.

-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 2/3] scsi: fix compiler warning for sg
  2015-11-10  3:21       ` Sinan Kaya
@ 2015-11-10  3:26         ` Timur Tabi
  -1 siblings, 0 replies; 86+ messages in thread
From: Timur Tabi @ 2015-11-10  3:26 UTC (permalink / raw)
  To: Sinan Kaya, Andy Shevchenko
  Cc: linux-scsi, cov, jcm, Andy Gross, linux-arm-msm,
	linux-arm Mailing List, Doug Gilbert, James E.J. Bottomley,
	linux-kernel

Sinan Kaya wrote:
>
> I created this patch back in March with an older version of the compiler
> and older kernel (3.19). I'm no longer able to reproduce this with this
> compiler and linux-next.
>
> Thread model: posix
> gcc version 4.8.3 20140401 (prerelease) (crosstool-NG
> linaro-1.13.1-4.8-2014.04 - Linaro GCC 4.8-2014.04)
>
> I'll drop this patch.

Are you sure the compiler handles the old macro correctly?  Maybe it's 
just quiescing the error message, but it's still broken?

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, hosted by The Linux Foundation.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 2/3] scsi: fix compiler warning for sg
@ 2015-11-10  3:26         ` Timur Tabi
  0 siblings, 0 replies; 86+ messages in thread
From: Timur Tabi @ 2015-11-10  3:26 UTC (permalink / raw)
  To: linux-arm-kernel

Sinan Kaya wrote:
>
> I created this patch back in March with an older version of the compiler
> and older kernel (3.19). I'm no longer able to reproduce this with this
> compiler and linux-next.
>
> Thread model: posix
> gcc version 4.8.3 20140401 (prerelease) (crosstool-NG
> linaro-1.13.1-4.8-2014.04 - Linaro GCC 4.8-2014.04)
>
> I'll drop this patch.

Are you sure the compiler handles the old macro correctly?  Maybe it's 
just quiescing the error message, but it's still broken?

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, hosted by The Linux Foundation.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 2/3] scsi: fix compiler warning for sg
  2015-11-10  3:26         ` Timur Tabi
@ 2015-11-10  4:51           ` Sinan Kaya
  -1 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-10  4:51 UTC (permalink / raw)
  To: Timur Tabi, Andy Shevchenko
  Cc: linux-scsi, cov, jcm, Andy Gross, linux-arm-msm,
	linux-arm Mailing List, Doug Gilbert, James E.J. Bottomley,
	linux-kernel



On 11/9/2015 10:26 PM, Timur Tabi wrote:
> Sinan Kaya wrote:
>>
>> I created this patch back in March with an older version of the compiler
>> and older kernel (3.19). I'm no longer able to reproduce this with this
>> compiler and linux-next.
>>
>> Thread model: posix
>> gcc version 4.8.3 20140401 (prerelease) (crosstool-NG
>> linaro-1.13.1-4.8-2014.04 - Linaro GCC 4.8-2014.04)
>>
>> I'll drop this patch.
>
> Are you sure the compiler handles the old macro correctly?  Maybe it's
> just quiescing the error message, but it's still broken?
>

The code says it is using these macros for small integers only which 
can't overflow. I was trying to get rid of compiler warning and it seems 
to have disappeared.

-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 2/3] scsi: fix compiler warning for sg
@ 2015-11-10  4:51           ` Sinan Kaya
  0 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-10  4:51 UTC (permalink / raw)
  To: linux-arm-kernel



On 11/9/2015 10:26 PM, Timur Tabi wrote:
> Sinan Kaya wrote:
>>
>> I created this patch back in March with an older version of the compiler
>> and older kernel (3.19). I'm no longer able to reproduce this with this
>> compiler and linux-next.
>>
>> Thread model: posix
>> gcc version 4.8.3 20140401 (prerelease) (crosstool-NG
>> linaro-1.13.1-4.8-2014.04 - Linaro GCC 4.8-2014.04)
>>
>> I'll drop this patch.
>
> Are you sure the compiler handles the old macro correctly?  Maybe it's
> just quiescing the error message, but it's still broken?
>

The code says it is using these macros for small integers only which 
can't overflow. I was trying to get rid of compiler warning and it seems 
to have disappeared.

-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 2/3] scsi: fix compiler warning for sg
  2015-11-10  4:51           ` Sinan Kaya
@ 2015-11-10  4:53             ` Timur Tabi
  -1 siblings, 0 replies; 86+ messages in thread
From: Timur Tabi @ 2015-11-10  4:53 UTC (permalink / raw)
  To: Sinan Kaya, Andy Shevchenko
  Cc: linux-scsi, cov, jcm, Andy Gross, linux-arm-msm,
	linux-arm Mailing List, Doug Gilbert, James E.J. Bottomley,
	linux-kernel

Sinan Kaya wrote:
>
> The code says it is using these macros for small integers only which
> can't overflow. I was trying to get rid of compiler warning and it seems
> to have disappeared.

I would double-check the assembly code, if I were you.  I don't like it 
when warnings just go away like that.

Besides, we *should* be using do_div() for 64-bit division.

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, hosted by The Linux Foundation.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 2/3] scsi: fix compiler warning for sg
@ 2015-11-10  4:53             ` Timur Tabi
  0 siblings, 0 replies; 86+ messages in thread
From: Timur Tabi @ 2015-11-10  4:53 UTC (permalink / raw)
  To: linux-arm-kernel

Sinan Kaya wrote:
>
> The code says it is using these macros for small integers only which
> can't overflow. I was trying to get rid of compiler warning and it seems
> to have disappeared.

I would double-check the assembly code, if I were you.  I don't like it 
when warnings just go away like that.

Besides, we *should* be using do_div() for 64-bit division.

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, hosted by The Linux Foundation.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 3/3] scsi: mptxsas: offload IRQ execution
  2015-11-09  7:15     ` Hannes Reinecke
  (?)
@ 2015-11-10  5:59       ` Sinan Kaya
  -1 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-10  5:59 UTC (permalink / raw)
  To: Hannes Reinecke, linux-scsi, timur, cov, jcm
  Cc: Abhijit Mahajan, Nagalakshmi Nandigama, linux-arm-msm,
	James E.J. Bottomley, linux-kernel, Sreekanth Reddy,
	Praveen Krishnamoorthy, agross, MPT-FusionLinux.pdl,
	linux-arm-kernel



On 11/9/2015 2:15 AM, Hannes Reinecke wrote:
> On 11/09/2015 02:57 AM, Sinan Kaya wrote:
>> The mpt2sas and mpt3sas drivers are spinning forever in
>> their IRQ handlers if there are a lot of jobs queued up
>> by the PCIe card. This handler is causing spikes for
>> the rest of the system and sluggish behavior.
>>
>> Marking all MSI interrupts as non-shared and moving the
>> MSI interrupts to thread context. This relexes the rest
>> of the system execution.
>>
> NACK.
>
> If there is a scalability issue when handling interrupts
> it should be fixed in the driver directly.
>
> Looking at the driver is should be possible to implement
> a worker thread handling the reply descriptor, and having the
> interrupt only to fetch the reply descriptor.
>

Can you go into the detail about which part of the _base_interrupt 
function needs to be executed in ISR context and which part can be 
queued up to worker thread?

I'm not familiar with the hardware or the code. That's why, I moved the 
entire ISR into the thread context.



> Cheers,
>
> Hannes
>

-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 3/3] scsi: mptxsas: offload IRQ execution
@ 2015-11-10  5:59       ` Sinan Kaya
  0 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-10  5:59 UTC (permalink / raw)
  To: Hannes Reinecke, linux-scsi, timur, cov, jcm
  Cc: agross, linux-arm-msm, linux-arm-kernel, Nagalakshmi Nandigama,
	Praveen Krishnamoorthy, Sreekanth Reddy, Abhijit Mahajan,
	James E.J. Bottomley, MPT-FusionLinux.pdl, linux-kernel



On 11/9/2015 2:15 AM, Hannes Reinecke wrote:
> On 11/09/2015 02:57 AM, Sinan Kaya wrote:
>> The mpt2sas and mpt3sas drivers are spinning forever in
>> their IRQ handlers if there are a lot of jobs queued up
>> by the PCIe card. This handler is causing spikes for
>> the rest of the system and sluggish behavior.
>>
>> Marking all MSI interrupts as non-shared and moving the
>> MSI interrupts to thread context. This relexes the rest
>> of the system execution.
>>
> NACK.
>
> If there is a scalability issue when handling interrupts
> it should be fixed in the driver directly.
>
> Looking at the driver is should be possible to implement
> a worker thread handling the reply descriptor, and having the
> interrupt only to fetch the reply descriptor.
>

Can you go into the detail about which part of the _base_interrupt 
function needs to be executed in ISR context and which part can be 
queued up to worker thread?

I'm not familiar with the hardware or the code. That's why, I moved the 
entire ISR into the thread context.



> Cheers,
>
> Hannes
>

-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 3/3] scsi: mptxsas: offload IRQ execution
@ 2015-11-10  5:59       ` Sinan Kaya
  0 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-10  5:59 UTC (permalink / raw)
  To: linux-arm-kernel



On 11/9/2015 2:15 AM, Hannes Reinecke wrote:
> On 11/09/2015 02:57 AM, Sinan Kaya wrote:
>> The mpt2sas and mpt3sas drivers are spinning forever in
>> their IRQ handlers if there are a lot of jobs queued up
>> by the PCIe card. This handler is causing spikes for
>> the rest of the system and sluggish behavior.
>>
>> Marking all MSI interrupts as non-shared and moving the
>> MSI interrupts to thread context. This relexes the rest
>> of the system execution.
>>
> NACK.
>
> If there is a scalability issue when handling interrupts
> it should be fixed in the driver directly.
>
> Looking at the driver is should be possible to implement
> a worker thread handling the reply descriptor, and having the
> interrupt only to fetch the reply descriptor.
>

Can you go into the detail about which part of the _base_interrupt 
function needs to be executed in ISR context and which part can be 
queued up to worker thread?

I'm not familiar with the hardware or the code. That's why, I moved the 
entire ISR into the thread context.



> Cheers,
>
> Hannes
>

-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  2015-11-09 23:22             ` Sinan Kaya
@ 2015-11-10  8:38               ` Arnd Bergmann
  -1 siblings, 0 replies; 86+ messages in thread
From: Arnd Bergmann @ 2015-11-10  8:38 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Sinan Kaya, Abhijit Mahajan, Nagalakshmi Nandigama, linux-scsi,
	jcm, timur, James E.J. Bottomley, linux-kernel, Sreekanth Reddy,
	Praveen Krishnamoorthy, cov, linux-arm-msm, agross,
	MPT-FusionLinux.pdl, Hannes Reinecke

On Monday 09 November 2015 18:22:22 Sinan Kaya wrote:
> On 11/9/2015 9:33 AM, Arnd Bergmann wrote:
> > On Monday 09 November 2015 09:07:36 Sinan Kaya wrote:
> >> On 11/9/2015 3:59 AM, Arnd Bergmann wrote:
> 
> ioc->dma_mask is 0 and the driver is trying to use 32 bit even though 64 
> bit supported by the platform.

Ok, makes sense.

> I think the proper fix is to pass the required_mask back to 
> consistent_dma_mask rather than using ioc->dma_mask and guessing.
> 
> pci_set_consistent_dma_mask(pdev, required_mask)
> 
> My code was just a band aid for broken code.
> 

No, as Timur found, the driver is correct and it intentionally
sets the 32-bit mask, and that is guaranteed to work on all sane
hardware. Don't change the driver but find a better platform for
your workload, or talk to the people that are responsible for
the platform and get them to fix it.

If the platform also doesn't have an IOMMU, you can probably work
around it by setting up the dma-ranges property of the PCI host
to map the low PCI addresses to the start of RAM. This will also
require changes in the bootloader to set up the PCI outbound translation,
and it will require implementing the DMA offset on ARM64, which I was
hoping to avoid.

	Arnd

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
@ 2015-11-10  8:38               ` Arnd Bergmann
  0 siblings, 0 replies; 86+ messages in thread
From: Arnd Bergmann @ 2015-11-10  8:38 UTC (permalink / raw)
  To: linux-arm-kernel

On Monday 09 November 2015 18:22:22 Sinan Kaya wrote:
> On 11/9/2015 9:33 AM, Arnd Bergmann wrote:
> > On Monday 09 November 2015 09:07:36 Sinan Kaya wrote:
> >> On 11/9/2015 3:59 AM, Arnd Bergmann wrote:
> 
> ioc->dma_mask is 0 and the driver is trying to use 32 bit even though 64 
> bit supported by the platform.

Ok, makes sense.

> I think the proper fix is to pass the required_mask back to 
> consistent_dma_mask rather than using ioc->dma_mask and guessing.
> 
> pci_set_consistent_dma_mask(pdev, required_mask)
> 
> My code was just a band aid for broken code.
> 

No, as Timur found, the driver is correct and it intentionally
sets the 32-bit mask, and that is guaranteed to work on all sane
hardware. Don't change the driver but find a better platform for
your workload, or talk to the people that are responsible for
the platform and get them to fix it.

If the platform also doesn't have an IOMMU, you can probably work
around it by setting up the dma-ranges property of the PCI host
to map the low PCI addresses to the start of RAM. This will also
require changes in the bootloader to set up the PCI outbound translation,
and it will require implementing the DMA offset on ARM64, which I was
hoping to avoid.

	Arnd

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 2/3] scsi: fix compiler warning for sg
  2015-11-10  4:53             ` Timur Tabi
@ 2015-11-10  9:23               ` Andy Shevchenko
  -1 siblings, 0 replies; 86+ messages in thread
From: Andy Shevchenko @ 2015-11-10  9:23 UTC (permalink / raw)
  To: Timur Tabi
  Cc: Sinan Kaya, linux-scsi, cov, jcm, Andy Gross, linux-arm-msm,
	linux-arm Mailing List, Doug Gilbert, James E.J. Bottomley,
	linux-kernel

On Tue, Nov 10, 2015 at 6:53 AM, Timur Tabi <timur@codeaurora.org> wrote:
> Sinan Kaya wrote:
>>
>>
>> The code says it is using these macros for small integers only which
>> can't overflow. I was trying to get rid of compiler warning and it seems
>> to have disappeared.
>
>
> I would double-check the assembly code, if I were you.  I don't like it when
> warnings just go away like that.

+1 to that.

>
> Besides, we *should* be using do_div() for 64-bit division.

But here looks like all numbers are guaranteed to be less than or
equal to INT_MAX.
Thus, the matter is only to replace MULDIV() by mult_frac() which is
already in kernel.

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 2/3] scsi: fix compiler warning for sg
@ 2015-11-10  9:23               ` Andy Shevchenko
  0 siblings, 0 replies; 86+ messages in thread
From: Andy Shevchenko @ 2015-11-10  9:23 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Nov 10, 2015 at 6:53 AM, Timur Tabi <timur@codeaurora.org> wrote:
> Sinan Kaya wrote:
>>
>>
>> The code says it is using these macros for small integers only which
>> can't overflow. I was trying to get rid of compiler warning and it seems
>> to have disappeared.
>
>
> I would double-check the assembly code, if I were you.  I don't like it when
> warnings just go away like that.

+1 to that.

>
> Besides, we *should* be using do_div() for 64-bit division.

But here looks like all numbers are guaranteed to be less than or
equal to INT_MAX.
Thus, the matter is only to replace MULDIV() by mult_frac() which is
already in kernel.

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 2/3] scsi: fix compiler warning for sg
  2015-11-10  4:53             ` Timur Tabi
@ 2015-11-10 10:09               ` Arnd Bergmann
  -1 siblings, 0 replies; 86+ messages in thread
From: Arnd Bergmann @ 2015-11-10 10:09 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Timur Tabi, Sinan Kaya, Andy Shevchenko, Doug Gilbert,
	James E.J. Bottomley, linux-scsi, jcm, Andy Gross, linux-kernel,
	cov, linux-arm-msm

On Monday 09 November 2015 22:53:17 Timur Tabi wrote:
> Sinan Kaya wrote:
> >
> > The code says it is using these macros for small integers only which
> > can't overflow. I was trying to get rid of compiler warning and it seems
> > to have disappeared.
> 
> I would double-check the assembly code, if I were you.  I don't like it 
> when warnings just go away like that.
> 
> Besides, we *should* be using do_div() for 64-bit division.

I stared at this code for some time and couldn't figure out whether it
is actually safe or not. The point here is that it doesn't actually do
a 64-bit division here:

	MULDIV(INT_MAX, USER_HZ, HZ)

where all arguments are 32bit and it tries to figure out whether the
ioctl argument is too big to fit into a 32-bit number

but it does a 'long' division that happens to be 64-bit long on
architectures with the respective register size when it then does

	sfp->timeout = MULDIV (val, HZ, USER_HZ);

to scale up the argument from USER_HZ to the possibly larger in-kernel
HZ value. So I think it's safe as is, but I'm still not entirely sure.

	Arnd

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 2/3] scsi: fix compiler warning for sg
@ 2015-11-10 10:09               ` Arnd Bergmann
  0 siblings, 0 replies; 86+ messages in thread
From: Arnd Bergmann @ 2015-11-10 10:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Monday 09 November 2015 22:53:17 Timur Tabi wrote:
> Sinan Kaya wrote:
> >
> > The code says it is using these macros for small integers only which
> > can't overflow. I was trying to get rid of compiler warning and it seems
> > to have disappeared.
> 
> I would double-check the assembly code, if I were you.  I don't like it 
> when warnings just go away like that.
> 
> Besides, we *should* be using do_div() for 64-bit division.

I stared at this code for some time and couldn't figure out whether it
is actually safe or not. The point here is that it doesn't actually do
a 64-bit division here:

	MULDIV(INT_MAX, USER_HZ, HZ)

where all arguments are 32bit and it tries to figure out whether the
ioctl argument is too big to fit into a 32-bit number

but it does a 'long' division that happens to be 64-bit long on
architectures with the respective register size when it then does

	sfp->timeout = MULDIV (val, HZ, USER_HZ);

to scale up the argument from USER_HZ to the possibly larger in-kernel
HZ value. So I think it's safe as is, but I'm still not entirely sure.

	Arnd

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  2015-11-10  8:38               ` Arnd Bergmann
@ 2015-11-10 16:06                 ` Sinan Kaya
  -1 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-10 16:06 UTC (permalink / raw)
  To: Arnd Bergmann, linux-arm-kernel
  Cc: Abhijit Mahajan, Nagalakshmi Nandigama, linux-scsi, jcm, timur,
	James E.J. Bottomley, linux-kernel, Sreekanth Reddy,
	Praveen Krishnamoorthy, cov, linux-arm-msm, agross,
	MPT-FusionLinux.pdl, Hannes Reinecke



On 11/10/2015 3:38 AM, Arnd Bergmann wrote:
 > No, as Timur found, the driver is correct and it intentionally
> sets the 32-bit mask, and that is guaranteed to work on all sane
> hardware. Don't change the driver but find a better platform for
> your workload, or talk to the people that are responsible for
> the platform and get them to fix it.

Platform does have an IOMMU. No issues there. I am trying to clean out 
the patch pipe I have in order to get this card working with and without 
IOMMU.

>
> If the platform also doesn't have an IOMMU, you can probably work
> around it by setting up the dma-ranges property of the PCI host
> to map the low PCI addresses to the start of RAM. This will also
> require changes in the bootloader to set up the PCI outbound translation,
> and it will require implementing the DMA offset on ARM64, which I was
> hoping to avoid.

 From the email thread, it looks like this was introduced to support 
some legacy card that has 64 bit addressing limitations and is being 
carried around ("rotted") since then.

I'm the second guy after the powerpc architecture complaining about the 
very same issue. Any red flags?

I can't change the address map for PCIe. SBSA requires all inbound PCIe 
addresses to be non-translated.

I'll just have to stick with IOMMU for this card.

-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
@ 2015-11-10 16:06                 ` Sinan Kaya
  0 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-10 16:06 UTC (permalink / raw)
  To: linux-arm-kernel



On 11/10/2015 3:38 AM, Arnd Bergmann wrote:
 > No, as Timur found, the driver is correct and it intentionally
> sets the 32-bit mask, and that is guaranteed to work on all sane
> hardware. Don't change the driver but find a better platform for
> your workload, or talk to the people that are responsible for
> the platform and get them to fix it.

Platform does have an IOMMU. No issues there. I am trying to clean out 
the patch pipe I have in order to get this card working with and without 
IOMMU.

>
> If the platform also doesn't have an IOMMU, you can probably work
> around it by setting up the dma-ranges property of the PCI host
> to map the low PCI addresses to the start of RAM. This will also
> require changes in the bootloader to set up the PCI outbound translation,
> and it will require implementing the DMA offset on ARM64, which I was
> hoping to avoid.

 From the email thread, it looks like this was introduced to support 
some legacy card that has 64 bit addressing limitations and is being 
carried around ("rotted") since then.

I'm the second guy after the powerpc architecture complaining about the 
very same issue. Any red flags?

I can't change the address map for PCIe. SBSA requires all inbound PCIe 
addresses to be non-translated.

I'll just have to stick with IOMMU for this card.

-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  2015-11-10 16:06                 ` Sinan Kaya
@ 2015-11-10 16:47                   ` Arnd Bergmann
  -1 siblings, 0 replies; 86+ messages in thread
From: Arnd Bergmann @ 2015-11-10 16:47 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: linux-arm-kernel, Abhijit Mahajan, Nagalakshmi Nandigama,
	linux-scsi, jcm, timur, James E.J. Bottomley, linux-kernel,
	Sreekanth Reddy, Praveen Krishnamoorthy, cov, linux-arm-msm,
	agross, MPT-FusionLinux.pdl, Hannes Reinecke

On Tuesday 10 November 2015 11:06:40 Sinan Kaya wrote:
> On 11/10/2015 3:38 AM, Arnd Bergmann wrote:
>  > No, as Timur found, the driver is correct and it intentionally
> > sets the 32-bit mask, and that is guaranteed to work on all sane
> > hardware. Don't change the driver but find a better platform for
> > your workload, or talk to the people that are responsible for
> > the platform and get them to fix it.
> 
> Platform does have an IOMMU. No issues there. I am trying to clean out 
> the patch pipe I have in order to get this card working with and without 
> IOMMU.

On PowerPC, I think we automatically enable the IOMMU whenever a DMA
mask is set that doesn't cover all of the RAM. We could think about
doing the same thing on ARM64 to make all devices work out of the box.

> > If the platform also doesn't have an IOMMU, you can probably work
> > around it by setting up the dma-ranges property of the PCI host
> > to map the low PCI addresses to the start of RAM. This will also
> > require changes in the bootloader to set up the PCI outbound translation,
> > and it will require implementing the DMA offset on ARM64, which I was
> > hoping to avoid.
> 
>  From the email thread, it looks like this was introduced to support 
> some legacy card that has 64 bit addressing limitations and is being 
> carried around ("rotted") since then.
> 
> I'm the second guy after the powerpc architecture complaining about the 
> very same issue. Any red flags?

What BenH was worried about here is that the driver sets different masks
for streaming and coherent mappings, which is indeed a worry that
could hit us on ARM as well, but I suppose we'll have to deal with
that in platform code.

Setting both masks to 32-bit is something that a lot of drivers do,
and without IOMMU enabled, you'd hit the same bug on all of them.
 
> I can't change the address map for PCIe. SBSA requires all inbound PCIe 
> addresses to be non-translated.

What about changing the memory map? I suspect there will be more
problems for you in the future when all of your RAM is at high
addresses.  Is this something you could fix in the bootloader by
moving the first 2GB to a different CPU physical address?

> I'll just have to stick with IOMMU for this card.

Ok. But how do you currently decide whether to use the IOMMU or not?

	Arnd

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
@ 2015-11-10 16:47                   ` Arnd Bergmann
  0 siblings, 0 replies; 86+ messages in thread
From: Arnd Bergmann @ 2015-11-10 16:47 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 10 November 2015 11:06:40 Sinan Kaya wrote:
> On 11/10/2015 3:38 AM, Arnd Bergmann wrote:
>  > No, as Timur found, the driver is correct and it intentionally
> > sets the 32-bit mask, and that is guaranteed to work on all sane
> > hardware. Don't change the driver but find a better platform for
> > your workload, or talk to the people that are responsible for
> > the platform and get them to fix it.
> 
> Platform does have an IOMMU. No issues there. I am trying to clean out 
> the patch pipe I have in order to get this card working with and without 
> IOMMU.

On PowerPC, I think we automatically enable the IOMMU whenever a DMA
mask is set that doesn't cover all of the RAM. We could think about
doing the same thing on ARM64 to make all devices work out of the box.

> > If the platform also doesn't have an IOMMU, you can probably work
> > around it by setting up the dma-ranges property of the PCI host
> > to map the low PCI addresses to the start of RAM. This will also
> > require changes in the bootloader to set up the PCI outbound translation,
> > and it will require implementing the DMA offset on ARM64, which I was
> > hoping to avoid.
> 
>  From the email thread, it looks like this was introduced to support 
> some legacy card that has 64 bit addressing limitations and is being 
> carried around ("rotted") since then.
> 
> I'm the second guy after the powerpc architecture complaining about the 
> very same issue. Any red flags?

What BenH was worried about here is that the driver sets different masks
for streaming and coherent mappings, which is indeed a worry that
could hit us on ARM as well, but I suppose we'll have to deal with
that in platform code.

Setting both masks to 32-bit is something that a lot of drivers do,
and without IOMMU enabled, you'd hit the same bug on all of them.
 
> I can't change the address map for PCIe. SBSA requires all inbound PCIe 
> addresses to be non-translated.

What about changing the memory map? I suspect there will be more
problems for you in the future when all of your RAM is at high
addresses.  Is this something you could fix in the bootloader by
moving the first 2GB to a different CPU physical address?

> I'll just have to stick with IOMMU for this card.

Ok. But how do you currently decide whether to use the IOMMU or not?

	Arnd

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  2015-11-10 16:47                   ` Arnd Bergmann
@ 2015-11-10 17:00                     ` Timur Tabi
  -1 siblings, 0 replies; 86+ messages in thread
From: Timur Tabi @ 2015-11-10 17:00 UTC (permalink / raw)
  To: Arnd Bergmann, Sinan Kaya
  Cc: linux-arm-kernel, Abhijit Mahajan, Nagalakshmi Nandigama,
	linux-scsi, jcm, James E.J. Bottomley, linux-kernel,
	Sreekanth Reddy, Praveen Krishnamoorthy, cov, linux-arm-msm,
	agross, MPT-FusionLinux.pdl, Hannes Reinecke

On 11/10/2015 10:47 AM, Arnd Bergmann wrote:
> What BenH was worried about here is that the driver sets different masks
> for streaming and coherent mappings, which is indeed a worry that
> could hit us on ARM as well, but I suppose we'll have to deal with
> that in platform code.
>
> Setting both masks to 32-bit is something that a lot of drivers do,
> and without IOMMU enabled, you'd hit the same bug on all of them.

Also note that I think that on PowerPC, the mask is set to 32 by default 
for all devices.  I don't think we do that on ARM64.  So on PowerPC, 
some drivers get away with not explicitly setting the mask.

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
@ 2015-11-10 17:00                     ` Timur Tabi
  0 siblings, 0 replies; 86+ messages in thread
From: Timur Tabi @ 2015-11-10 17:00 UTC (permalink / raw)
  To: linux-arm-kernel

On 11/10/2015 10:47 AM, Arnd Bergmann wrote:
> What BenH was worried about here is that the driver sets different masks
> for streaming and coherent mappings, which is indeed a worry that
> could hit us on ARM as well, but I suppose we'll have to deal with
> that in platform code.
>
> Setting both masks to 32-bit is something that a lot of drivers do,
> and without IOMMU enabled, you'd hit the same bug on all of them.

Also note that I think that on PowerPC, the mask is set to 32 by default 
for all devices.  I don't think we do that on ARM64.  So on PowerPC, 
some drivers get away with not explicitly setting the mask.

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  2015-11-10 16:47                   ` Arnd Bergmann
@ 2015-11-10 17:19                     ` Sinan Kaya
  -1 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-10 17:19 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-arm-kernel, Abhijit Mahajan, Nagalakshmi Nandigama,
	linux-scsi, jcm, timur, James E.J. Bottomley, linux-kernel,
	Sreekanth Reddy, Praveen Krishnamoorthy, cov, linux-arm-msm,
	agross, MPT-FusionLinux.pdl, Hannes Reinecke



On 11/10/2015 11:47 AM, Arnd Bergmann wrote:
> On Tuesday 10 November 2015 11:06:40 Sinan Kaya wrote:
>> On 11/10/2015 3:38 AM, Arnd Bergmann wrote:
>>   > No, as Timur found, the driver is correct and it intentionally
>>> sets the 32-bit mask, and that is guaranteed to work on all sane
>>> hardware. Don't change the driver but find a better platform for
>>> your workload, or talk to the people that are responsible for
>>> the platform and get them to fix it.
>>
>> Platform does have an IOMMU. No issues there. I am trying to clean out
>> the patch pipe I have in order to get this card working with and without
>> IOMMU.
>
> On PowerPC, I think we automatically enable the IOMMU whenever a DMA
> mask is set that doesn't cover all of the RAM. We could think about
> doing the same thing on ARM64 to make all devices work out of the box.
>

The ACPI IORT table declares whether you enable IOMMU for a particular 
device or not. The placement of IOMMU HW is system specific. The IORT 
table gives the IOMMU HW topology to the operating system.

>>> If the platform also doesn't have an IOMMU, you can probably work
>>> around it by setting up the dma-ranges property of the PCI host
>>> to map the low PCI addresses to the start of RAM. This will also
>>> require changes in the bootloader to set up the PCI outbound translation,
>>> and it will require implementing the DMA offset on ARM64, which I was
>>> hoping to avoid.
>>
>>   From the email thread, it looks like this was introduced to support
>> some legacy card that has 64 bit addressing limitations and is being
>> carried around ("rotted") since then.
>>
>> I'm the second guy after the powerpc architecture complaining about the
>> very same issue. Any red flags?
>
> What BenH was worried about here is that the driver sets different masks
> for streaming and coherent mappings, which is indeed a worry that
> could hit us on ARM as well, but I suppose we'll have to deal with
> that in platform code.
>
> Setting both masks to 32-bit is something that a lot of drivers do,
> and without IOMMU enabled, you'd hit the same bug on all of them.
>

Maybe, maybe not. This is the only card that I had problems with.

>> I can't change the address map for PCIe. SBSA requires all inbound PCIe
>> addresses to be non-translated.
>
> What about changing the memory map? I suspect there will be more
> problems for you in the future when all of your RAM is at high
> addresses.  Is this something you could fix in the bootloader by
> moving the first 2GB to a different CPU physical address?

I'm thinking about this.

>
>> I'll just have to stick with IOMMU for this card.
>
> Ok. But how do you currently decide whether to use the IOMMU or not?
>

ACPI table. I wanted to get this fix in so that all operating systems 
whether they have IOMMU driver enabled or not would work.

> 	Arnd
>

-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
@ 2015-11-10 17:19                     ` Sinan Kaya
  0 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-10 17:19 UTC (permalink / raw)
  To: linux-arm-kernel



On 11/10/2015 11:47 AM, Arnd Bergmann wrote:
> On Tuesday 10 November 2015 11:06:40 Sinan Kaya wrote:
>> On 11/10/2015 3:38 AM, Arnd Bergmann wrote:
>>   > No, as Timur found, the driver is correct and it intentionally
>>> sets the 32-bit mask, and that is guaranteed to work on all sane
>>> hardware. Don't change the driver but find a better platform for
>>> your workload, or talk to the people that are responsible for
>>> the platform and get them to fix it.
>>
>> Platform does have an IOMMU. No issues there. I am trying to clean out
>> the patch pipe I have in order to get this card working with and without
>> IOMMU.
>
> On PowerPC, I think we automatically enable the IOMMU whenever a DMA
> mask is set that doesn't cover all of the RAM. We could think about
> doing the same thing on ARM64 to make all devices work out of the box.
>

The ACPI IORT table declares whether you enable IOMMU for a particular 
device or not. The placement of IOMMU HW is system specific. The IORT 
table gives the IOMMU HW topology to the operating system.

>>> If the platform also doesn't have an IOMMU, you can probably work
>>> around it by setting up the dma-ranges property of the PCI host
>>> to map the low PCI addresses to the start of RAM. This will also
>>> require changes in the bootloader to set up the PCI outbound translation,
>>> and it will require implementing the DMA offset on ARM64, which I was
>>> hoping to avoid.
>>
>>   From the email thread, it looks like this was introduced to support
>> some legacy card that has 64 bit addressing limitations and is being
>> carried around ("rotted") since then.
>>
>> I'm the second guy after the powerpc architecture complaining about the
>> very same issue. Any red flags?
>
> What BenH was worried about here is that the driver sets different masks
> for streaming and coherent mappings, which is indeed a worry that
> could hit us on ARM as well, but I suppose we'll have to deal with
> that in platform code.
>
> Setting both masks to 32-bit is something that a lot of drivers do,
> and without IOMMU enabled, you'd hit the same bug on all of them.
>

Maybe, maybe not. This is the only card that I had problems with.

>> I can't change the address map for PCIe. SBSA requires all inbound PCIe
>> addresses to be non-translated.
>
> What about changing the memory map? I suspect there will be more
> problems for you in the future when all of your RAM is at high
> addresses.  Is this something you could fix in the bootloader by
> moving the first 2GB to a different CPU physical address?

I'm thinking about this.

>
>> I'll just have to stick with IOMMU for this card.
>
> Ok. But how do you currently decide whether to use the IOMMU or not?
>

ACPI table. I wanted to get this fix in so that all operating systems 
whether they have IOMMU driver enabled or not would work.

> 	Arnd
>

-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  2015-11-10 17:19                     ` Sinan Kaya
@ 2015-11-10 18:27                       ` James Bottomley
  -1 siblings, 0 replies; 86+ messages in thread
From: James Bottomley @ 2015-11-10 18:27 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: Arnd Bergmann, linux-arm-kernel, Abhijit Mahajan,
	Nagalakshmi Nandigama, linux-scsi, jcm, timur, linux-kernel,
	Sreekanth Reddy, Praveen Krishnamoorthy, cov, linux-arm-msm,
	agross, MPT-FusionLinux.pdl, Hannes Reinecke

On Tue, 2015-11-10 at 12:19 -0500, Sinan Kaya wrote:
> On 11/10/2015 11:47 AM, Arnd Bergmann wrote:
> > On Tuesday 10 November 2015 11:06:40 Sinan Kaya wrote:
> >> On 11/10/2015 3:38 AM, Arnd Bergmann wrote:
> >>   From the email thread, it looks like this was introduced to support
> >> some legacy card that has 64 bit addressing limitations and is being
> >> carried around ("rotted") since then.
> >>
> >> I'm the second guy after the powerpc architecture complaining about the
> >> very same issue. Any red flags?
> >
> > What BenH was worried about here is that the driver sets different masks
> > for streaming and coherent mappings, which is indeed a worry that
> > could hit us on ARM as well, but I suppose we'll have to deal with
> > that in platform code.
> >
> > Setting both masks to 32-bit is something that a lot of drivers do,
> > and without IOMMU enabled, you'd hit the same bug on all of them.
> >
> 
> Maybe, maybe not. This is the only card that I had problems with.

Your characterisation of "some legacy card" isn't entirely correct.
Just to clarify how this happens, most I/O cards today are intelligent
offload engines which means they have some type of embedded CPU (it can
even be a specially designed asic).  This CPU is driven by firmware
which is mostly (but not always) in the machine language of the CPU.
DMA transfers are sometimes run by this CPU, but mostly handed off to a
separate offload engine.  When the board gets revised, it's often easier
to update the offload engine to 64 bits and keep the CPU at 32 (or even
16) bits.  This means that all the internal addresses in the firmware
are 32 bit only.  As I read the comments in the original thread, it
looks like the mpt people tried to mitigate this by using segment
registers for external addresses firmware uses ... that's why they say
that they don't have to have all the addresses in DMA32 ... they just
need the upper 32 bits to be constant so they can correctly program the
segment register.  Unfortunately, we have no way to parametrise this to
the DMA allocation code.

You'll find the same thing with Adaptec SPI cards.  Their route to 64
bits was via an initial 39 bit extension that had them layering the
additional 7 bits into the unused lower region of the page descriptors
for the firmware (keeping the actual pointers to DMA at 32 bits because
they're always parametrised as address, offset, length and the address
is always a 4k page).

Eventually, everything will rev to 64 bits and this problem will go
away, but, as I suspect you know, it takes time for the embedded world
to get to where everyone else already is.

As Arnd said, if you failed to allow for this in your platform, then
oops, just don't use the card.  I think this solution would be better
than trying to get the driver to work out which cards can support 64 bit
firmware descriptors and only failing on your platform for those that
can't.

James

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
@ 2015-11-10 18:27                       ` James Bottomley
  0 siblings, 0 replies; 86+ messages in thread
From: James Bottomley @ 2015-11-10 18:27 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 2015-11-10 at 12:19 -0500, Sinan Kaya wrote:
> On 11/10/2015 11:47 AM, Arnd Bergmann wrote:
> > On Tuesday 10 November 2015 11:06:40 Sinan Kaya wrote:
> >> On 11/10/2015 3:38 AM, Arnd Bergmann wrote:
> >>   From the email thread, it looks like this was introduced to support
> >> some legacy card that has 64 bit addressing limitations and is being
> >> carried around ("rotted") since then.
> >>
> >> I'm the second guy after the powerpc architecture complaining about the
> >> very same issue. Any red flags?
> >
> > What BenH was worried about here is that the driver sets different masks
> > for streaming and coherent mappings, which is indeed a worry that
> > could hit us on ARM as well, but I suppose we'll have to deal with
> > that in platform code.
> >
> > Setting both masks to 32-bit is something that a lot of drivers do,
> > and without IOMMU enabled, you'd hit the same bug on all of them.
> >
> 
> Maybe, maybe not. This is the only card that I had problems with.

Your characterisation of "some legacy card" isn't entirely correct.
Just to clarify how this happens, most I/O cards today are intelligent
offload engines which means they have some type of embedded CPU (it can
even be a specially designed asic).  This CPU is driven by firmware
which is mostly (but not always) in the machine language of the CPU.
DMA transfers are sometimes run by this CPU, but mostly handed off to a
separate offload engine.  When the board gets revised, it's often easier
to update the offload engine to 64 bits and keep the CPU at 32 (or even
16) bits.  This means that all the internal addresses in the firmware
are 32 bit only.  As I read the comments in the original thread, it
looks like the mpt people tried to mitigate this by using segment
registers for external addresses firmware uses ... that's why they say
that they don't have to have all the addresses in DMA32 ... they just
need the upper 32 bits to be constant so they can correctly program the
segment register.  Unfortunately, we have no way to parametrise this to
the DMA allocation code.

You'll find the same thing with Adaptec SPI cards.  Their route to 64
bits was via an initial 39 bit extension that had them layering the
additional 7 bits into the unused lower region of the page descriptors
for the firmware (keeping the actual pointers to DMA at 32 bits because
they're always parametrised as address, offset, length and the address
is always a 4k page).

Eventually, everything will rev to 64 bits and this problem will go
away, but, as I suspect you know, it takes time for the embedded world
to get to where everyone else already is.

As Arnd said, if you failed to allow for this in your platform, then
oops, just don't use the card.  I think this solution would be better
than trying to get the driver to work out which cards can support 64 bit
firmware descriptors and only failing on your platform for those that
can't.

James

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  2015-11-10 17:00                     ` Timur Tabi
@ 2015-11-10 19:13                       ` Arnd Bergmann
  -1 siblings, 0 replies; 86+ messages in thread
From: Arnd Bergmann @ 2015-11-10 19:13 UTC (permalink / raw)
  To: Timur Tabi
  Cc: Sinan Kaya, linux-arm-kernel, Abhijit Mahajan,
	Nagalakshmi Nandigama, linux-scsi, jcm, James E.J. Bottomley,
	linux-kernel, Sreekanth Reddy, Praveen Krishnamoorthy, cov,
	linux-arm-msm, agross, MPT-FusionLinux.pdl, Hannes Reinecke

On Tuesday 10 November 2015 11:00:59 Timur Tabi wrote:
> On 11/10/2015 10:47 AM, Arnd Bergmann wrote:
> > What BenH was worried about here is that the driver sets different masks
> > for streaming and coherent mappings, which is indeed a worry that
> > could hit us on ARM as well, but I suppose we'll have to deal with
> > that in platform code.
> >
> > Setting both masks to 32-bit is something that a lot of drivers do,
> > and without IOMMU enabled, you'd hit the same bug on all of them.
> 
> Also note that I think that on PowerPC, the mask is set to 32 by default 
> for all devices.  I don't think we do that on ARM64.  So on PowerPC, 
> some drivers get away with not explicitly setting the mask.
> 

If the mask is 64-bit by default on ARM64, that is a bug that we need
to fix urgently. Can you verify this?

A lot of PCI devices can only do 32-bit DMA, and we have plenty
of drivers that don't bother setting a mask at all because the 32-bit
mask is the default on all other architectures.

	Arnd

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
@ 2015-11-10 19:13                       ` Arnd Bergmann
  0 siblings, 0 replies; 86+ messages in thread
From: Arnd Bergmann @ 2015-11-10 19:13 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 10 November 2015 11:00:59 Timur Tabi wrote:
> On 11/10/2015 10:47 AM, Arnd Bergmann wrote:
> > What BenH was worried about here is that the driver sets different masks
> > for streaming and coherent mappings, which is indeed a worry that
> > could hit us on ARM as well, but I suppose we'll have to deal with
> > that in platform code.
> >
> > Setting both masks to 32-bit is something that a lot of drivers do,
> > and without IOMMU enabled, you'd hit the same bug on all of them.
> 
> Also note that I think that on PowerPC, the mask is set to 32 by default 
> for all devices.  I don't think we do that on ARM64.  So on PowerPC, 
> some drivers get away with not explicitly setting the mask.
> 

If the mask is 64-bit by default on ARM64, that is a bug that we need
to fix urgently. Can you verify this?

A lot of PCI devices can only do 32-bit DMA, and we have plenty
of drivers that don't bother setting a mask at all because the 32-bit
mask is the default on all other architectures.

	Arnd

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  2015-11-10 18:27                       ` James Bottomley
@ 2015-11-10 19:14                         ` Sinan Kaya
  -1 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-10 19:14 UTC (permalink / raw)
  To: James Bottomley
  Cc: Arnd Bergmann, linux-arm-kernel, Abhijit Mahajan,
	Nagalakshmi Nandigama, linux-scsi, jcm, timur, linux-kernel,
	Sreekanth Reddy, Praveen Krishnamoorthy, cov, linux-arm-msm,
	agross, MPT-FusionLinux.pdl, Hannes Reinecke



On 11/10/2015 1:27 PM, James Bottomley wrote:
> On Tue, 2015-11-10 at 12:19 -0500, Sinan Kaya wrote:
>> On 11/10/2015 11:47 AM, Arnd Bergmann wrote:
>>> On Tuesday 10 November 2015 11:06:40 Sinan Kaya wrote:
>>>> On 11/10/2015 3:38 AM, Arnd Bergmann wrote:
>>>>    From the email thread, it looks like this was introduced to support
>>>> some legacy card that has 64 bit addressing limitations and is being
>>>> carried around ("rotted") since then.
>>>>
>>>> I'm the second guy after the powerpc architecture complaining about the
>>>> very same issue. Any red flags?
>>>
>>> What BenH was worried about here is that the driver sets different masks
>>> for streaming and coherent mappings, which is indeed a worry that
>>> could hit us on ARM as well, but I suppose we'll have to deal with
>>> that in platform code.
>>>
>>> Setting both masks to 32-bit is something that a lot of drivers do,
>>> and without IOMMU enabled, you'd hit the same bug on all of them.
>>>
>>
>> Maybe, maybe not. This is the only card that I had problems with.
>
> Your characterisation of "some legacy card" isn't entirely correct.
> Just to clarify how this happens, most I/O cards today are intelligent
> offload engines which means they have some type of embedded CPU (it can
> even be a specially designed asic).  This CPU is driven by firmware
> which is mostly (but not always) in the machine language of the CPU.
> DMA transfers are sometimes run by this CPU, but mostly handed off to a
> separate offload engine.  When the board gets revised, it's often easier
> to update the offload engine to 64 bits and keep the CPU at 32 (or even
> 16) bits.  This means that all the internal addresses in the firmware
> are 32 bit only.  As I read the comments in the original thread, it
> looks like the mpt people tried to mitigate this by using segment
> registers for external addresses firmware uses ... that's why they say
> that they don't have to have all the addresses in DMA32 ... they just
> need the upper 32 bits to be constant so they can correctly program the
> segment register.  Unfortunately, we have no way to parametrise this to
> the DMA allocation code.
>
> You'll find the same thing with Adaptec SPI cards.  Their route to 64
> bits was via an initial 39 bit extension that had them layering the
> additional 7 bits into the unused lower region of the page descriptors
> for the firmware (keeping the actual pointers to DMA at 32 bits because
> they're always parametrised as address, offset, length and the address
> is always a 4k page).
>
> Eventually, everything will rev to 64 bits and this problem will go
> away, but, as I suspect you know, it takes time for the embedded world
> to get to where everyone else already is.
>
> As Arnd said, if you failed to allow for this in your platform, then
> oops, just don't use the card.  I think this solution would be better
> than trying to get the driver to work out which cards can support 64 bit
> firmware descriptors and only failing on your platform for those that
> can't.
>
> James
>
>

James,
I was referring to this conversation here.

https://lkml.org/lkml/2015/2/20/31

"The aic79xx hardware problem was that the DMA engine could address the 
whole of memory (it had two address modes, a 39 bit one and a 64 bit 
one) but the script engine that runs the mailboxes only had a 32 bit 
activation register (the activating write points at the physical address 
of the script to begin executing)."

The fact that LSI SAS 92118i is working with 64 bit addresses suggests 
me that this problem is already solved.  I have not hit any kind of 
regressions with 93xx and 92xx families under load in a true 64 bit 
environment. I am only mentioning this based on my testing exposure.

Another comment here from you.
https://lkml.org/lkml/2015/4/2/28

"Well, it was originally a hack for altix, because they had no regions
below 4GB and had to specifically manufacture them.  As you know, in
Linux, if Intel doesn't need it, no-one cares and the implementation
bitrots."

Maybe, it is time to fix the code for more recent (even decent) hardware?

Sinan

-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
@ 2015-11-10 19:14                         ` Sinan Kaya
  0 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-10 19:14 UTC (permalink / raw)
  To: linux-arm-kernel



On 11/10/2015 1:27 PM, James Bottomley wrote:
> On Tue, 2015-11-10 at 12:19 -0500, Sinan Kaya wrote:
>> On 11/10/2015 11:47 AM, Arnd Bergmann wrote:
>>> On Tuesday 10 November 2015 11:06:40 Sinan Kaya wrote:
>>>> On 11/10/2015 3:38 AM, Arnd Bergmann wrote:
>>>>    From the email thread, it looks like this was introduced to support
>>>> some legacy card that has 64 bit addressing limitations and is being
>>>> carried around ("rotted") since then.
>>>>
>>>> I'm the second guy after the powerpc architecture complaining about the
>>>> very same issue. Any red flags?
>>>
>>> What BenH was worried about here is that the driver sets different masks
>>> for streaming and coherent mappings, which is indeed a worry that
>>> could hit us on ARM as well, but I suppose we'll have to deal with
>>> that in platform code.
>>>
>>> Setting both masks to 32-bit is something that a lot of drivers do,
>>> and without IOMMU enabled, you'd hit the same bug on all of them.
>>>
>>
>> Maybe, maybe not. This is the only card that I had problems with.
>
> Your characterisation of "some legacy card" isn't entirely correct.
> Just to clarify how this happens, most I/O cards today are intelligent
> offload engines which means they have some type of embedded CPU (it can
> even be a specially designed asic).  This CPU is driven by firmware
> which is mostly (but not always) in the machine language of the CPU.
> DMA transfers are sometimes run by this CPU, but mostly handed off to a
> separate offload engine.  When the board gets revised, it's often easier
> to update the offload engine to 64 bits and keep the CPU at 32 (or even
> 16) bits.  This means that all the internal addresses in the firmware
> are 32 bit only.  As I read the comments in the original thread, it
> looks like the mpt people tried to mitigate this by using segment
> registers for external addresses firmware uses ... that's why they say
> that they don't have to have all the addresses in DMA32 ... they just
> need the upper 32 bits to be constant so they can correctly program the
> segment register.  Unfortunately, we have no way to parametrise this to
> the DMA allocation code.
>
> You'll find the same thing with Adaptec SPI cards.  Their route to 64
> bits was via an initial 39 bit extension that had them layering the
> additional 7 bits into the unused lower region of the page descriptors
> for the firmware (keeping the actual pointers to DMA at 32 bits because
> they're always parametrised as address, offset, length and the address
> is always a 4k page).
>
> Eventually, everything will rev to 64 bits and this problem will go
> away, but, as I suspect you know, it takes time for the embedded world
> to get to where everyone else already is.
>
> As Arnd said, if you failed to allow for this in your platform, then
> oops, just don't use the card.  I think this solution would be better
> than trying to get the driver to work out which cards can support 64 bit
> firmware descriptors and only failing on your platform for those that
> can't.
>
> James
>
>

James,
I was referring to this conversation here.

https://lkml.org/lkml/2015/2/20/31

"The aic79xx hardware problem was that the DMA engine could address the 
whole of memory (it had two address modes, a 39 bit one and a 64 bit 
one) but the script engine that runs the mailboxes only had a 32 bit 
activation register (the activating write points at the physical address 
of the script to begin executing)."

The fact that LSI SAS 92118i is working with 64 bit addresses suggests 
me that this problem is already solved.  I have not hit any kind of 
regressions with 93xx and 92xx families under load in a true 64 bit 
environment. I am only mentioning this based on my testing exposure.

Another comment here from you.
https://lkml.org/lkml/2015/4/2/28

"Well, it was originally a hack for altix, because they had no regions
below 4GB and had to specifically manufacture them.  As you know, in
Linux, if Intel doesn't need it, no-one cares and the implementation
bitrots."

Maybe, it is time to fix the code for more recent (even decent) hardware?

Sinan

-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  2015-11-10 19:14                         ` Sinan Kaya
@ 2015-11-10 19:43                           ` James Bottomley
  -1 siblings, 0 replies; 86+ messages in thread
From: James Bottomley @ 2015-11-10 19:43 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: Arnd Bergmann, linux-arm-kernel, Abhijit Mahajan,
	Nagalakshmi Nandigama, linux-scsi, jcm, timur, linux-kernel,
	Sreekanth Reddy, Praveen Krishnamoorthy, cov, linux-arm-msm,
	agross, MPT-FusionLinux.pdl, Hannes Reinecke

On Tue, 2015-11-10 at 14:14 -0500, Sinan Kaya wrote:
> 
> On 11/10/2015 1:27 PM, James Bottomley wrote:
> > On Tue, 2015-11-10 at 12:19 -0500, Sinan Kaya wrote:
> >> On 11/10/2015 11:47 AM, Arnd Bergmann wrote:
> >>> On Tuesday 10 November 2015 11:06:40 Sinan Kaya wrote:
> >>>> On 11/10/2015 3:38 AM, Arnd Bergmann wrote:
> >>>>    From the email thread, it looks like this was introduced to support
> >>>> some legacy card that has 64 bit addressing limitations and is being
> >>>> carried around ("rotted") since then.
> >>>>
> >>>> I'm the second guy after the powerpc architecture complaining about the
> >>>> very same issue. Any red flags?
> >>>
> >>> What BenH was worried about here is that the driver sets different masks
> >>> for streaming and coherent mappings, which is indeed a worry that
> >>> could hit us on ARM as well, but I suppose we'll have to deal with
> >>> that in platform code.
> >>>
> >>> Setting both masks to 32-bit is something that a lot of drivers do,
> >>> and without IOMMU enabled, you'd hit the same bug on all of them.
> >>>
> >>
> >> Maybe, maybe not. This is the only card that I had problems with.
> >
> > Your characterisation of "some legacy card" isn't entirely correct.
> > Just to clarify how this happens, most I/O cards today are intelligent
> > offload engines which means they have some type of embedded CPU (it can
> > even be a specially designed asic).  This CPU is driven by firmware
> > which is mostly (but not always) in the machine language of the CPU.
> > DMA transfers are sometimes run by this CPU, but mostly handed off to a
> > separate offload engine.  When the board gets revised, it's often easier
> > to update the offload engine to 64 bits and keep the CPU at 32 (or even
> > 16) bits.  This means that all the internal addresses in the firmware
> > are 32 bit only.  As I read the comments in the original thread, it
> > looks like the mpt people tried to mitigate this by using segment
> > registers for external addresses firmware uses ... that's why they say
> > that they don't have to have all the addresses in DMA32 ... they just
> > need the upper 32 bits to be constant so they can correctly program the
> > segment register.  Unfortunately, we have no way to parametrise this to
> > the DMA allocation code.
> >
> > You'll find the same thing with Adaptec SPI cards.  Their route to 64
> > bits was via an initial 39 bit extension that had them layering the
> > additional 7 bits into the unused lower region of the page descriptors
> > for the firmware (keeping the actual pointers to DMA at 32 bits because
> > they're always parametrised as address, offset, length and the address
> > is always a 4k page).
> >
> > Eventually, everything will rev to 64 bits and this problem will go
> > away, but, as I suspect you know, it takes time for the embedded world
> > to get to where everyone else already is.
> >
> > As Arnd said, if you failed to allow for this in your platform, then
> > oops, just don't use the card.  I think this solution would be better
> > than trying to get the driver to work out which cards can support 64 bit
> > firmware descriptors and only failing on your platform for those that
> > can't.
> >
> > James
> >
> >
> 
> James,
> I was referring to this conversation here.
> 
> https://lkml.org/lkml/2015/2/20/31
> 
> "The aic79xx hardware problem was that the DMA engine could address the 
> whole of memory (it had two address modes, a 39 bit one and a 64 bit 
> one) but the script engine that runs the mailboxes only had a 32 bit 
> activation register (the activating write points at the physical address 
> of the script to begin executing)."
> 
> The fact that LSI SAS 92118i is working with 64 bit addresses suggests 
> me that this problem is already solved.  I have not hit any kind of 
> regressions with 93xx and 92xx families under load in a true 64 bit 
> environment. I am only mentioning this based on my testing exposure.

The Issue, as stated by LSI is

        Initially set the consistent DMA mask to 32 bit and then change
        it
        to 64 bit mask after allocating RDPQ pools by calling the
        function
        _base_change_consistent_dma_mask. This is to ensure that all the
        upper 32 bits of RDPQ entries's base address to be same.

If you set a 64 bit coherent mask before this point, you're benefiting
from being lucky that all the upper 32 bits of the allocations are the
same ... we can't code a driver to rely on luck.  Particularly not when
the failure mode looks like it would be silent and deadly.

> Another comment here from you.
> https://lkml.org/lkml/2015/4/2/28
> 
> "Well, it was originally a hack for altix, because they had no regions
> below 4GB and had to specifically manufacture them.  As you know, in
> Linux, if Intel doesn't need it, no-one cares and the implementation
> bitrots."
> 
> Maybe, it is time to fix the code for more recent (even decent) hardware?

What do you mean "fix the code"?  The code isn't broken, it's
parametrising issues with particular hardware.  There's no software work
around (except allocating memory with the correct characteristics).

James

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
@ 2015-11-10 19:43                           ` James Bottomley
  0 siblings, 0 replies; 86+ messages in thread
From: James Bottomley @ 2015-11-10 19:43 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 2015-11-10 at 14:14 -0500, Sinan Kaya wrote:
> 
> On 11/10/2015 1:27 PM, James Bottomley wrote:
> > On Tue, 2015-11-10 at 12:19 -0500, Sinan Kaya wrote:
> >> On 11/10/2015 11:47 AM, Arnd Bergmann wrote:
> >>> On Tuesday 10 November 2015 11:06:40 Sinan Kaya wrote:
> >>>> On 11/10/2015 3:38 AM, Arnd Bergmann wrote:
> >>>>    From the email thread, it looks like this was introduced to support
> >>>> some legacy card that has 64 bit addressing limitations and is being
> >>>> carried around ("rotted") since then.
> >>>>
> >>>> I'm the second guy after the powerpc architecture complaining about the
> >>>> very same issue. Any red flags?
> >>>
> >>> What BenH was worried about here is that the driver sets different masks
> >>> for streaming and coherent mappings, which is indeed a worry that
> >>> could hit us on ARM as well, but I suppose we'll have to deal with
> >>> that in platform code.
> >>>
> >>> Setting both masks to 32-bit is something that a lot of drivers do,
> >>> and without IOMMU enabled, you'd hit the same bug on all of them.
> >>>
> >>
> >> Maybe, maybe not. This is the only card that I had problems with.
> >
> > Your characterisation of "some legacy card" isn't entirely correct.
> > Just to clarify how this happens, most I/O cards today are intelligent
> > offload engines which means they have some type of embedded CPU (it can
> > even be a specially designed asic).  This CPU is driven by firmware
> > which is mostly (but not always) in the machine language of the CPU.
> > DMA transfers are sometimes run by this CPU, but mostly handed off to a
> > separate offload engine.  When the board gets revised, it's often easier
> > to update the offload engine to 64 bits and keep the CPU at 32 (or even
> > 16) bits.  This means that all the internal addresses in the firmware
> > are 32 bit only.  As I read the comments in the original thread, it
> > looks like the mpt people tried to mitigate this by using segment
> > registers for external addresses firmware uses ... that's why they say
> > that they don't have to have all the addresses in DMA32 ... they just
> > need the upper 32 bits to be constant so they can correctly program the
> > segment register.  Unfortunately, we have no way to parametrise this to
> > the DMA allocation code.
> >
> > You'll find the same thing with Adaptec SPI cards.  Their route to 64
> > bits was via an initial 39 bit extension that had them layering the
> > additional 7 bits into the unused lower region of the page descriptors
> > for the firmware (keeping the actual pointers to DMA at 32 bits because
> > they're always parametrised as address, offset, length and the address
> > is always a 4k page).
> >
> > Eventually, everything will rev to 64 bits and this problem will go
> > away, but, as I suspect you know, it takes time for the embedded world
> > to get to where everyone else already is.
> >
> > As Arnd said, if you failed to allow for this in your platform, then
> > oops, just don't use the card.  I think this solution would be better
> > than trying to get the driver to work out which cards can support 64 bit
> > firmware descriptors and only failing on your platform for those that
> > can't.
> >
> > James
> >
> >
> 
> James,
> I was referring to this conversation here.
> 
> https://lkml.org/lkml/2015/2/20/31
> 
> "The aic79xx hardware problem was that the DMA engine could address the 
> whole of memory (it had two address modes, a 39 bit one and a 64 bit 
> one) but the script engine that runs the mailboxes only had a 32 bit 
> activation register (the activating write points at the physical address 
> of the script to begin executing)."
> 
> The fact that LSI SAS 92118i is working with 64 bit addresses suggests 
> me that this problem is already solved.  I have not hit any kind of 
> regressions with 93xx and 92xx families under load in a true 64 bit 
> environment. I am only mentioning this based on my testing exposure.

The Issue, as stated by LSI is

        Initially set the consistent DMA mask to 32 bit and then change
        it
        to 64 bit mask after allocating RDPQ pools by calling the
        function
        _base_change_consistent_dma_mask. This is to ensure that all the
        upper 32 bits of RDPQ entries's base address to be same.

If you set a 64 bit coherent mask before this point, you're benefiting
from being lucky that all the upper 32 bits of the allocations are the
same ... we can't code a driver to rely on luck.  Particularly not when
the failure mode looks like it would be silent and deadly.

> Another comment here from you.
> https://lkml.org/lkml/2015/4/2/28
> 
> "Well, it was originally a hack for altix, because they had no regions
> below 4GB and had to specifically manufacture them.  As you know, in
> Linux, if Intel doesn't need it, no-one cares and the implementation
> bitrots."
> 
> Maybe, it is time to fix the code for more recent (even decent) hardware?

What do you mean "fix the code"?  The code isn't broken, it's
parametrising issues with particular hardware.  There's no software work
around (except allocating memory with the correct characteristics).

James

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  2015-11-10 19:43                           ` James Bottomley
@ 2015-11-10 19:56                             ` Sinan Kaya
  -1 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-10 19:56 UTC (permalink / raw)
  To: James Bottomley
  Cc: Arnd Bergmann, linux-arm-kernel, Abhijit Mahajan,
	Nagalakshmi Nandigama, linux-scsi, jcm, timur, linux-kernel,
	Sreekanth Reddy, Praveen Krishnamoorthy, cov, linux-arm-msm,
	agross, MPT-FusionLinux.pdl, Hannes Reinecke



On 11/10/2015 2:43 PM, James Bottomley wrote:
> The Issue, as stated by LSI is
>
>          Initially set the consistent DMA mask to 32 bit and then change
>          it
>          to 64 bit mask after allocating RDPQ pools by calling the
>          function
>          _base_change_consistent_dma_mask. This is to ensure that all the
>          upper 32 bits of RDPQ entries's base address to be same.
>

Need somebody from mpt to confirm that this behavior is still valid for 
the recent cards besides altix.

> If you set a 64 bit coherent mask before this point, you're benefiting
> from being lucky that all the upper 32 bits of the allocations are the
> same ... we can't code a driver to rely on luck.  Particularly not when
> the failure mode looks like it would be silent and deadly.

Of course nobody wants unreliable code.

I'm wondering if I was just lucky during my testing or the 92xx and 93xx 
hardware supports full 64 bit range. I don't have any insight into what 
the endpoint does or what it is capable of.

>
>> >Another comment here from you.
>> >https://lkml.org/lkml/2015/4/2/28
>> >
>> >"Well, it was originally a hack for altix, because they had no regions
>> >below 4GB and had to specifically manufacture them.  As you know, in
>> >Linux, if Intel doesn't need it, no-one cares and the implementation
>> >bitrots."
>> >
>> >Maybe, it is time to fix the code for more recent (even decent) hardware?
> What do you mean "fix the code"?  The code isn't broken, it's
> parametrising issues with particular hardware.  There's no software work
> around (except allocating memory with the correct characteristics).

Need confirmation. I'm questioning if we are stuck with this behavior 
because of altix or something else. If the latter case, the code could 
have used PCI ID to do something special for it.

-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
@ 2015-11-10 19:56                             ` Sinan Kaya
  0 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-10 19:56 UTC (permalink / raw)
  To: linux-arm-kernel



On 11/10/2015 2:43 PM, James Bottomley wrote:
> The Issue, as stated by LSI is
>
>          Initially set the consistent DMA mask to 32 bit and then change
>          it
>          to 64 bit mask after allocating RDPQ pools by calling the
>          function
>          _base_change_consistent_dma_mask. This is to ensure that all the
>          upper 32 bits of RDPQ entries's base address to be same.
>

Need somebody from mpt to confirm that this behavior is still valid for 
the recent cards besides altix.

> If you set a 64 bit coherent mask before this point, you're benefiting
> from being lucky that all the upper 32 bits of the allocations are the
> same ... we can't code a driver to rely on luck.  Particularly not when
> the failure mode looks like it would be silent and deadly.

Of course nobody wants unreliable code.

I'm wondering if I was just lucky during my testing or the 92xx and 93xx 
hardware supports full 64 bit range. I don't have any insight into what 
the endpoint does or what it is capable of.

>
>> >Another comment here from you.
>> >https://lkml.org/lkml/2015/4/2/28
>> >
>> >"Well, it was originally a hack for altix, because they had no regions
>> >below 4GB and had to specifically manufacture them.  As you know, in
>> >Linux, if Intel doesn't need it, no-one cares and the implementation
>> >bitrots."
>> >
>> >Maybe, it is time to fix the code for more recent (even decent) hardware?
> What do you mean "fix the code"?  The code isn't broken, it's
> parametrising issues with particular hardware.  There's no software work
> around (except allocating memory with the correct characteristics).

Need confirmation. I'm questioning if we are stuck with this behavior 
because of altix or something else. If the latter case, the code could 
have used PCI ID to do something special for it.

-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  2015-11-10 17:19                     ` Sinan Kaya
@ 2015-11-10 19:56                       ` Arnd Bergmann
  -1 siblings, 0 replies; 86+ messages in thread
From: Arnd Bergmann @ 2015-11-10 19:56 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Sinan Kaya, Abhijit Mahajan, linux-scsi, Nagalakshmi Nandigama,
	jcm, timur, James E.J. Bottomley, linux-kernel, Sreekanth Reddy,
	Hannes Reinecke, Praveen Krishnamoorthy, cov, linux-arm-msm,
	MPT-FusionLinux.pdl, agross

On Tuesday 10 November 2015 12:19:33 Sinan Kaya wrote:
> On 11/10/2015 11:47 AM, Arnd Bergmann wrote:
> > On Tuesday 10 November 2015 11:06:40 Sinan Kaya wrote:
> >> On 11/10/2015 3:38 AM, Arnd Bergmann wrote:
> >>   > No, as Timur found, the driver is correct and it intentionally
> >>> sets the 32-bit mask, and that is guaranteed to work on all sane
> >>> hardware. Don't change the driver but find a better platform for
> >>> your workload, or talk to the people that are responsible for
> >>> the platform and get them to fix it.
> >>
> >> Platform does have an IOMMU. No issues there. I am trying to clean out
> >> the patch pipe I have in order to get this card working with and without
> >> IOMMU.
> >
> > On PowerPC, I think we automatically enable the IOMMU whenever a DMA
> > mask is set that doesn't cover all of the RAM. We could think about
> > doing the same thing on ARM64 to make all devices work out of the box.
> >
> 
> The ACPI IORT table declares whether you enable IOMMU for a particular 
> device or not. The placement of IOMMU HW is system specific. The IORT 
> table gives the IOMMU HW topology to the operating system.

This sounds odd. Clearly you need to specify the IOMMU settings for each
possible PCI device independent of whether the OS actually uses the IOMMU
or not. In a lot of cases, we want to turn it off to get better performance
when the driver has set a DMA mask that covers all of RAM, but you
also want to enable the IOMMU for debugging purposes or for device
assignment if you run virtual machines. The bootloader doesn't know how
the device is going to be used, so it cannot define the policy here.

	Arnd

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
@ 2015-11-10 19:56                       ` Arnd Bergmann
  0 siblings, 0 replies; 86+ messages in thread
From: Arnd Bergmann @ 2015-11-10 19:56 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 10 November 2015 12:19:33 Sinan Kaya wrote:
> On 11/10/2015 11:47 AM, Arnd Bergmann wrote:
> > On Tuesday 10 November 2015 11:06:40 Sinan Kaya wrote:
> >> On 11/10/2015 3:38 AM, Arnd Bergmann wrote:
> >>   > No, as Timur found, the driver is correct and it intentionally
> >>> sets the 32-bit mask, and that is guaranteed to work on all sane
> >>> hardware. Don't change the driver but find a better platform for
> >>> your workload, or talk to the people that are responsible for
> >>> the platform and get them to fix it.
> >>
> >> Platform does have an IOMMU. No issues there. I am trying to clean out
> >> the patch pipe I have in order to get this card working with and without
> >> IOMMU.
> >
> > On PowerPC, I think we automatically enable the IOMMU whenever a DMA
> > mask is set that doesn't cover all of the RAM. We could think about
> > doing the same thing on ARM64 to make all devices work out of the box.
> >
> 
> The ACPI IORT table declares whether you enable IOMMU for a particular 
> device or not. The placement of IOMMU HW is system specific. The IORT 
> table gives the IOMMU HW topology to the operating system.

This sounds odd. Clearly you need to specify the IOMMU settings for each
possible PCI device independent of whether the OS actually uses the IOMMU
or not. In a lot of cases, we want to turn it off to get better performance
when the driver has set a DMA mask that covers all of RAM, but you
also want to enable the IOMMU for debugging purposes or for device
assignment if you run virtual machines. The bootloader doesn't know how
the device is going to be used, so it cannot define the policy here.

	Arnd

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  2015-11-10 19:56                             ` Sinan Kaya
@ 2015-11-10 20:05                               ` James Bottomley
  -1 siblings, 0 replies; 86+ messages in thread
From: James Bottomley @ 2015-11-10 20:05 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: Arnd Bergmann, linux-arm-kernel, Abhijit Mahajan,
	Nagalakshmi Nandigama, linux-scsi, jcm, timur, linux-kernel,
	Sreekanth Reddy, Praveen Krishnamoorthy, cov, linux-arm-msm,
	agross, MPT-FusionLinux.pdl, Hannes Reinecke

On Tue, 2015-11-10 at 14:56 -0500, Sinan Kaya wrote:
> 
> On 11/10/2015 2:43 PM, James Bottomley wrote:
> > The Issue, as stated by LSI is
> >
> >          Initially set the consistent DMA mask to 32 bit and then change
> >          it
> >          to 64 bit mask after allocating RDPQ pools by calling the
> >          function
> >          _base_change_consistent_dma_mask. This is to ensure that all the
> >          upper 32 bits of RDPQ entries's base address to be same.
> >
> 
> Need somebody from mpt to confirm that this behavior is still valid for 
> the recent cards besides altix.

OK, you don't seem to be understanding the problem: the Altix isn't a
LSI card, it was a SGI platform.  It was the platform where we first
discovered the issue that a lot of storage cards didn't work because it
by default had no memory below 4GB.  The reason coherent masks were
introduced was initially so the Altix could manufacture and manage a
region of memory in the lower 4GB region and we would guarantee to make
allocations from it so the storage cards would then work on that
platform.

I thought the Altix was a historical relic because after they
disappeared, there was no other platform with this issue ...  until you
came along.

James



^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
@ 2015-11-10 20:05                               ` James Bottomley
  0 siblings, 0 replies; 86+ messages in thread
From: James Bottomley @ 2015-11-10 20:05 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 2015-11-10 at 14:56 -0500, Sinan Kaya wrote:
> 
> On 11/10/2015 2:43 PM, James Bottomley wrote:
> > The Issue, as stated by LSI is
> >
> >          Initially set the consistent DMA mask to 32 bit and then change
> >          it
> >          to 64 bit mask after allocating RDPQ pools by calling the
> >          function
> >          _base_change_consistent_dma_mask. This is to ensure that all the
> >          upper 32 bits of RDPQ entries's base address to be same.
> >
> 
> Need somebody from mpt to confirm that this behavior is still valid for 
> the recent cards besides altix.

OK, you don't seem to be understanding the problem: the Altix isn't a
LSI card, it was a SGI platform.  It was the platform where we first
discovered the issue that a lot of storage cards didn't work because it
by default had no memory below 4GB.  The reason coherent masks were
introduced was initially so the Altix could manufacture and manage a
region of memory in the lower 4GB region and we would guarantee to make
allocations from it so the storage cards would then work on that
platform.

I thought the Altix was a historical relic because after they
disappeared, there was no other platform with this issue ...  until you
came along.

James

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  2015-11-10 20:05                               ` James Bottomley
@ 2015-11-10 20:26                                 ` Sinan Kaya
  -1 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-10 20:26 UTC (permalink / raw)
  To: James Bottomley
  Cc: Arnd Bergmann, linux-arm-kernel, Abhijit Mahajan,
	Nagalakshmi Nandigama, linux-scsi, jcm, timur, linux-kernel,
	Sreekanth Reddy, Praveen Krishnamoorthy, cov, linux-arm-msm,
	agross, MPT-FusionLinux.pdl, Hannes Reinecke



On 11/10/2015 3:05 PM, James Bottomley wrote:
> OK, you don't seem to be understanding the problem: the Altix isn't a
> LSI card, it was a SGI platform.

Got it.

> It was the platform where we first
> discovered the issue that a lot of storage cards didn't work because it
> by default had no memory below 4GB.  The reason coherent masks were
> introduced was initially so the Altix could manufacture and manage a
> region of memory in the lower 4GB region and we would guarantee to make
> allocations from it so the storage cards would then work on that
> platform.

I can't fix the issue if the card cannot do 64 bit DMA when IOMMU is not 
there. I need IOMMU enabled all the time for this card.

-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
@ 2015-11-10 20:26                                 ` Sinan Kaya
  0 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-10 20:26 UTC (permalink / raw)
  To: linux-arm-kernel



On 11/10/2015 3:05 PM, James Bottomley wrote:
> OK, you don't seem to be understanding the problem: the Altix isn't a
> LSI card, it was a SGI platform.

Got it.

> It was the platform where we first
> discovered the issue that a lot of storage cards didn't work because it
> by default had no memory below 4GB.  The reason coherent masks were
> introduced was initially so the Altix could manufacture and manage a
> region of memory in the lower 4GB region and we would guarantee to make
> allocations from it so the storage cards would then work on that
> platform.

I can't fix the issue if the card cannot do 64 bit DMA when IOMMU is not 
there. I need IOMMU enabled all the time for this card.

-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  2015-11-10 20:26                                 ` Sinan Kaya
@ 2015-11-10 20:35                                   ` James Bottomley
  -1 siblings, 0 replies; 86+ messages in thread
From: James Bottomley @ 2015-11-10 20:35 UTC (permalink / raw)
  To: Sinan Kaya
  Cc: Arnd Bergmann, linux-arm-kernel, Abhijit Mahajan,
	Nagalakshmi Nandigama, linux-scsi, jcm, timur, linux-kernel,
	Sreekanth Reddy, Praveen Krishnamoorthy, cov, linux-arm-msm,
	agross, MPT-FusionLinux.pdl, Hannes Reinecke

On Tue, 2015-11-10 at 15:26 -0500, Sinan Kaya wrote:
> 
> On 11/10/2015 3:05 PM, James Bottomley wrote:
> > OK, you don't seem to be understanding the problem: the Altix isn't a
> > LSI card, it was a SGI platform.
> 
> Got it.
> 
> > It was the platform where we first
> > discovered the issue that a lot of storage cards didn't work because it
> > by default had no memory below 4GB.  The reason coherent masks were
> > introduced was initially so the Altix could manufacture and manage a
> > region of memory in the lower 4GB region and we would guarantee to make
> > allocations from it so the storage cards would then work on that
> > platform.
> 
> I can't fix the issue if the card cannot do 64 bit DMA when IOMMU is not 
> there. I need IOMMU enabled all the time for this card.

That depends on the limitations of your platform.  The Altix only used
an iommu to manufacture the coherent memory for the descriptors, but the
card itself mostly operated in bypass mode (using 64 bit physical
addresses rather than iommu remapped ones), so all accesses except for
the few firmware descriptor ones didn't use an iommu.  Apparently this
was for performance reasons.

So, to recap, the card itself *can* do 64 bit DMA.  The limitation is
that the RDPQ descriptors need to be all in the same region of 4G memory
and the way the driver ensures this is to set the 64 bit coherent mask
after using the 32 bit one to allocate the RDPQ pools.

James



^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
@ 2015-11-10 20:35                                   ` James Bottomley
  0 siblings, 0 replies; 86+ messages in thread
From: James Bottomley @ 2015-11-10 20:35 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 2015-11-10 at 15:26 -0500, Sinan Kaya wrote:
> 
> On 11/10/2015 3:05 PM, James Bottomley wrote:
> > OK, you don't seem to be understanding the problem: the Altix isn't a
> > LSI card, it was a SGI platform.
> 
> Got it.
> 
> > It was the platform where we first
> > discovered the issue that a lot of storage cards didn't work because it
> > by default had no memory below 4GB.  The reason coherent masks were
> > introduced was initially so the Altix could manufacture and manage a
> > region of memory in the lower 4GB region and we would guarantee to make
> > allocations from it so the storage cards would then work on that
> > platform.
> 
> I can't fix the issue if the card cannot do 64 bit DMA when IOMMU is not 
> there. I need IOMMU enabled all the time for this card.

That depends on the limitations of your platform.  The Altix only used
an iommu to manufacture the coherent memory for the descriptors, but the
card itself mostly operated in bypass mode (using 64 bit physical
addresses rather than iommu remapped ones), so all accesses except for
the few firmware descriptor ones didn't use an iommu.  Apparently this
was for performance reasons.

So, to recap, the card itself *can* do 64 bit DMA.  The limitation is
that the RDPQ descriptors need to be all in the same region of 4G memory
and the way the driver ensures this is to set the 64 bit coherent mask
after using the 32 bit one to allocate the RDPQ pools.

James

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  2015-11-10 19:56                       ` Arnd Bergmann
@ 2015-11-10 20:58                         ` Sinan Kaya
  -1 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-10 20:58 UTC (permalink / raw)
  To: Arnd Bergmann, linux-arm-kernel
  Cc: Abhijit Mahajan, linux-scsi, Nagalakshmi Nandigama, jcm, timur,
	James E.J. Bottomley, linux-kernel, Sreekanth Reddy,
	Hannes Reinecke, Praveen Krishnamoorthy, cov, linux-arm-msm,
	MPT-FusionLinux.pdl, agross



On 11/10/2015 2:56 PM, Arnd Bergmann wrote:
>> The ACPI IORT table declares whether you enable IOMMU for a particular
>> >device or not. The placement of IOMMU HW is system specific. The IORT
>> >table gives the IOMMU HW topology to the operating system.
> This sounds odd. Clearly you need to specify the IOMMU settings for each
> possible PCI device independent of whether the OS actually uses the IOMMU
> or not.

There are provisions to have DMA mask in the PCIe host bridge not at the 
PCIe device level inside IORT table. This setting is specific for each 
PCIe bus. It is not per PCIe device.

It is assumed that the endpoint device driver knows the hardware for 
PCIe devices. The driver can also query the supported DMA bits by this 
platform via DMA APIs and will request the correct DMA mask from the DMA 
subsystem (surprise!).

>In a lot of cases, we want to turn it off to get better performance
> when the driver has set a DMA mask that covers all of RAM, but you
> also want to enable the IOMMU for debugging purposes or for device
> assignment if you run virtual machines. The bootloader doesn't know how
> the device is going to be used, so it cannot define the policy here.

I think we'll end up adding a virtualization option to the UEFI BIOS 
similar to how Intel platforms work. Based on this switch, we'll end up 
patching the ACPI table.

If I remove the IORT entry, then the device is in coherent mode with 
device accessing the full RAM range.

If I have the IORT table, the device is in IOMMU translation mode.

Details are in the IORT spec.


-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
@ 2015-11-10 20:58                         ` Sinan Kaya
  0 siblings, 0 replies; 86+ messages in thread
From: Sinan Kaya @ 2015-11-10 20:58 UTC (permalink / raw)
  To: linux-arm-kernel



On 11/10/2015 2:56 PM, Arnd Bergmann wrote:
>> The ACPI IORT table declares whether you enable IOMMU for a particular
>> >device or not. The placement of IOMMU HW is system specific. The IORT
>> >table gives the IOMMU HW topology to the operating system.
> This sounds odd. Clearly you need to specify the IOMMU settings for each
> possible PCI device independent of whether the OS actually uses the IOMMU
> or not.

There are provisions to have DMA mask in the PCIe host bridge not at the 
PCIe device level inside IORT table. This setting is specific for each 
PCIe bus. It is not per PCIe device.

It is assumed that the endpoint device driver knows the hardware for 
PCIe devices. The driver can also query the supported DMA bits by this 
platform via DMA APIs and will request the correct DMA mask from the DMA 
subsystem (surprise!).

>In a lot of cases, we want to turn it off to get better performance
> when the driver has set a DMA mask that covers all of RAM, but you
> also want to enable the IOMMU for debugging purposes or for device
> assignment if you run virtual machines. The bootloader doesn't know how
> the device is going to be used, so it cannot define the policy here.

I think we'll end up adding a virtualization option to the UEFI BIOS 
similar to how Intel platforms work. Based on this switch, we'll end up 
patching the ACPI table.

If I remove the IORT entry, then the device is in coherent mode with 
device accessing the full RAM range.

If I have the IORT table, the device is in IOMMU translation mode.

Details are in the IORT spec.


-- 
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  2015-11-10 19:13                       ` Arnd Bergmann
@ 2015-11-10 21:03                         ` Timur Tabi
  -1 siblings, 0 replies; 86+ messages in thread
From: Timur Tabi @ 2015-11-10 21:03 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Sinan Kaya, linux-arm-kernel, Abhijit Mahajan,
	Nagalakshmi Nandigama, linux-scsi, jcm, James E.J. Bottomley,
	linux-kernel, Sreekanth Reddy, Praveen Krishnamoorthy, cov,
	linux-arm-msm, agross, MPT-FusionLinux.pdl, Hannes Reinecke

On 11/10/2015 01:13 PM, Arnd Bergmann wrote:
> If the mask is 64-bit by default on ARM64, that is a bug that we need
> to fix urgently. Can you verify this?

I think the mask is 0 by default, because there's no code in ARM64 that 
actually sets the mask.

Take a look at arch_setup_pdev_archdata() in 
arch/powerpc/kernel/setup-common.c.

void arch_setup_pdev_archdata(struct platform_device *pdev)
{
         pdev->archdata.dma_mask = DMA_BIT_MASK(32);
         pdev->dev.dma_mask = &pdev->archdata.dma_mask;
         set_dma_ops(&pdev->dev, &dma_direct_ops);
}

I don't see anything equivalent in arch/arm64

> A lot of PCI devices can only do 32-bit DMA, and we have plenty
> of drivers that don't bother setting a mask at all because the 32-bit
> mask is the default on all other architectures.

In our drivers for 32-bit devices, we have to explicitly set the DMA 
mask to 32-bits in order to get any DMA working.

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
@ 2015-11-10 21:03                         ` Timur Tabi
  0 siblings, 0 replies; 86+ messages in thread
From: Timur Tabi @ 2015-11-10 21:03 UTC (permalink / raw)
  To: linux-arm-kernel

On 11/10/2015 01:13 PM, Arnd Bergmann wrote:
> If the mask is 64-bit by default on ARM64, that is a bug that we need
> to fix urgently. Can you verify this?

I think the mask is 0 by default, because there's no code in ARM64 that 
actually sets the mask.

Take a look at arch_setup_pdev_archdata() in 
arch/powerpc/kernel/setup-common.c.

void arch_setup_pdev_archdata(struct platform_device *pdev)
{
         pdev->archdata.dma_mask = DMA_BIT_MASK(32);
         pdev->dev.dma_mask = &pdev->archdata.dma_mask;
         set_dma_ops(&pdev->dev, &dma_direct_ops);
}

I don't see anything equivalent in arch/arm64

> A lot of PCI devices can only do 32-bit DMA, and we have plenty
> of drivers that don't bother setting a mask at all because the 32-bit
> mask is the default on all other architectures.

In our drivers for 32-bit devices, we have to explicitly set the DMA 
mask to 32-bits in order to get any DMA working.

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  2015-11-10 21:03                         ` Timur Tabi
@ 2015-11-10 21:54                           ` Arnd Bergmann
  -1 siblings, 0 replies; 86+ messages in thread
From: Arnd Bergmann @ 2015-11-10 21:54 UTC (permalink / raw)
  To: Timur Tabi
  Cc: Sinan Kaya, linux-arm-kernel, Abhijit Mahajan,
	Nagalakshmi Nandigama, linux-scsi, jcm, James E.J. Bottomley,
	linux-kernel, Sreekanth Reddy, Praveen Krishnamoorthy, cov,
	linux-arm-msm, agross, MPT-FusionLinux.pdl, Hannes Reinecke

On Tuesday 10 November 2015 15:03:59 Timur Tabi wrote:
> On 11/10/2015 01:13 PM, Arnd Bergmann wrote:
> > If the mask is 64-bit by default on ARM64, that is a bug that we need
> > to fix urgently. Can you verify this?
> 
> I think the mask is 0 by default, because there's no code in ARM64 that 
> actually sets the mask.
> 
> Take a look at arch_setup_pdev_archdata() in 
> arch/powerpc/kernel/setup-common.c.
> 
> void arch_setup_pdev_archdata(struct platform_device *pdev)
> {
>          pdev->archdata.dma_mask = DMA_BIT_MASK(32);
>          pdev->dev.dma_mask = &pdev->archdata.dma_mask;
>          set_dma_ops(&pdev->dev, &dma_direct_ops);
> }
> 
> I don't see anything equivalent in arch/arm64

of_dma_configure() sets up an initial DMA mask for 32 bits. The
same thing happens for pci_setup_device() in architecture-independent
code?

> > A lot of PCI devices can only do 32-bit DMA, and we have plenty
> > of drivers that don't bother setting a mask at all because the 32-bit
> > mask is the default on all other architectures.
> 
> In our drivers for 32-bit devices, we have to explicitly set the DMA 
> mask to 32-bits in order to get any DMA working.

Do you mean PCI devices or platform devices?

Maybe the parent bus is lacking a dma-ranges property?

	Arnd

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
@ 2015-11-10 21:54                           ` Arnd Bergmann
  0 siblings, 0 replies; 86+ messages in thread
From: Arnd Bergmann @ 2015-11-10 21:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 10 November 2015 15:03:59 Timur Tabi wrote:
> On 11/10/2015 01:13 PM, Arnd Bergmann wrote:
> > If the mask is 64-bit by default on ARM64, that is a bug that we need
> > to fix urgently. Can you verify this?
> 
> I think the mask is 0 by default, because there's no code in ARM64 that 
> actually sets the mask.
> 
> Take a look at arch_setup_pdev_archdata() in 
> arch/powerpc/kernel/setup-common.c.
> 
> void arch_setup_pdev_archdata(struct platform_device *pdev)
> {
>          pdev->archdata.dma_mask = DMA_BIT_MASK(32);
>          pdev->dev.dma_mask = &pdev->archdata.dma_mask;
>          set_dma_ops(&pdev->dev, &dma_direct_ops);
> }
> 
> I don't see anything equivalent in arch/arm64

of_dma_configure() sets up an initial DMA mask for 32 bits. The
same thing happens for pci_setup_device() in architecture-independent
code?

> > A lot of PCI devices can only do 32-bit DMA, and we have plenty
> > of drivers that don't bother setting a mask at all because the 32-bit
> > mask is the default on all other architectures.
> 
> In our drivers for 32-bit devices, we have to explicitly set the DMA 
> mask to 32-bits in order to get any DMA working.

Do you mean PCI devices or platform devices?

Maybe the parent bus is lacking a dma-ranges property?

	Arnd

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  2015-11-10 21:54                           ` Arnd Bergmann
@ 2015-11-10 21:59                             ` Timur Tabi
  -1 siblings, 0 replies; 86+ messages in thread
From: Timur Tabi @ 2015-11-10 21:59 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Sinan Kaya, linux-arm-kernel, Abhijit Mahajan,
	Nagalakshmi Nandigama, linux-scsi, jcm, James E.J. Bottomley,
	linux-kernel, Sreekanth Reddy, Praveen Krishnamoorthy, cov,
	linux-arm-msm, agross, MPT-FusionLinux.pdl, Hannes Reinecke

On 11/10/2015 03:54 PM, Arnd Bergmann wrote:

>> In our drivers for 32-bit devices, we have to explicitly set the DMA
>> mask to 32-bits in order to get any DMA working.
>
> Do you mean PCI devices or platform devices?

Platform.

> Maybe the parent bus is lacking a dma-ranges property?

All of this applies only on device-tree platforms.  Sinan and I are 
working on an ACPI server platform.  So we never call 
of_dma_configure(), and we don't have a dma-ranges property.

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
@ 2015-11-10 21:59                             ` Timur Tabi
  0 siblings, 0 replies; 86+ messages in thread
From: Timur Tabi @ 2015-11-10 21:59 UTC (permalink / raw)
  To: linux-arm-kernel

On 11/10/2015 03:54 PM, Arnd Bergmann wrote:

>> In our drivers for 32-bit devices, we have to explicitly set the DMA
>> mask to 32-bits in order to get any DMA working.
>
> Do you mean PCI devices or platform devices?

Platform.

> Maybe the parent bus is lacking a dma-ranges property?

All of this applies only on device-tree platforms.  Sinan and I are 
working on an ACPI server platform.  So we never call 
of_dma_configure(), and we don't have a dma-ranges property.

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  2015-11-10 20:58                         ` Sinan Kaya
@ 2015-11-10 22:06                           ` Arnd Bergmann
  -1 siblings, 0 replies; 86+ messages in thread
From: Arnd Bergmann @ 2015-11-10 22:06 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Sinan Kaya, Abhijit Mahajan, linux-scsi, Nagalakshmi Nandigama,
	jcm, timur, James E.J. Bottomley, linux-kernel, Sreekanth Reddy,
	Praveen Krishnamoorthy, Hannes Reinecke, linux-arm-msm, agross,
	MPT-FusionLinux.pdl, cov

On Tuesday 10 November 2015 15:58:19 Sinan Kaya wrote:
> 
> On 11/10/2015 2:56 PM, Arnd Bergmann wrote:
> >> The ACPI IORT table declares whether you enable IOMMU for a particular
> >> >device or not. The placement of IOMMU HW is system specific. The IORT
> >> >table gives the IOMMU HW topology to the operating system.
> > This sounds odd. Clearly you need to specify the IOMMU settings for each
> > possible PCI device independent of whether the OS actually uses the IOMMU
> > or not.
> 
> There are provisions to have DMA mask in the PCIe host bridge not at the 
> PCIe device level inside IORT table. This setting is specific for each 
> PCIe bus. It is not per PCIe device.

Same thing, I meant the bootloader must provide all the information that
is needed to use the IOMMU on all PCI devices. I don't care where the IOMMU
driver gets that information. Some IOMMUs require programming a
bus/device/function specific number into the I/O page tables, and they
might not always have the same algorithm to map from the PCI numbers
into their own number space.

> It is assumed that the endpoint device driver knows the hardware for 
> PCIe devices. The driver can also query the supported DMA bits by this 
> platform via DMA APIs and will request the correct DMA mask from the DMA 
> subsystem (surprise!).

I know how the negotiation works. Note that dma_get_required_mask()
will only tell you what mask the device needs to access all of memory,
while both the device and bus may have additional limitations, and
there is not always a solution.

> >In a lot of cases, we want to turn it off to get better performance
> > when the driver has set a DMA mask that covers all of RAM, but you
> > also want to enable the IOMMU for debugging purposes or for device
> > assignment if you run virtual machines. The bootloader doesn't know how
> > the device is going to be used, so it cannot define the policy here.
> 
> I think we'll end up adding a virtualization option to the UEFI BIOS 
> similar to how Intel platforms work. Based on this switch, we'll end up 
> patching the ACPI table.
> 
> If I remove the IORT entry, then the device is in coherent mode with 
> device accessing the full RAM range.
> 
> If I have the IORT table, the device is in IOMMU translation mode.
> 
> Details are in the IORT spec.

I think that would suck a lot more than being slightly out of spec
regarding SBSA if you make the low PCI addresses map to the start
of RAM. Asking users to select a 'virtualization' option based on
what kind of PCI device and kernel version they have is a major
hassle.

	Arnd

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
@ 2015-11-10 22:06                           ` Arnd Bergmann
  0 siblings, 0 replies; 86+ messages in thread
From: Arnd Bergmann @ 2015-11-10 22:06 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 10 November 2015 15:58:19 Sinan Kaya wrote:
> 
> On 11/10/2015 2:56 PM, Arnd Bergmann wrote:
> >> The ACPI IORT table declares whether you enable IOMMU for a particular
> >> >device or not. The placement of IOMMU HW is system specific. The IORT
> >> >table gives the IOMMU HW topology to the operating system.
> > This sounds odd. Clearly you need to specify the IOMMU settings for each
> > possible PCI device independent of whether the OS actually uses the IOMMU
> > or not.
> 
> There are provisions to have DMA mask in the PCIe host bridge not at the 
> PCIe device level inside IORT table. This setting is specific for each 
> PCIe bus. It is not per PCIe device.

Same thing, I meant the bootloader must provide all the information that
is needed to use the IOMMU on all PCI devices. I don't care where the IOMMU
driver gets that information. Some IOMMUs require programming a
bus/device/function specific number into the I/O page tables, and they
might not always have the same algorithm to map from the PCI numbers
into their own number space.

> It is assumed that the endpoint device driver knows the hardware for 
> PCIe devices. The driver can also query the supported DMA bits by this 
> platform via DMA APIs and will request the correct DMA mask from the DMA 
> subsystem (surprise!).

I know how the negotiation works. Note that dma_get_required_mask()
will only tell you what mask the device needs to access all of memory,
while both the device and bus may have additional limitations, and
there is not always a solution.

> >In a lot of cases, we want to turn it off to get better performance
> > when the driver has set a DMA mask that covers all of RAM, but you
> > also want to enable the IOMMU for debugging purposes or for device
> > assignment if you run virtual machines. The bootloader doesn't know how
> > the device is going to be used, so it cannot define the policy here.
> 
> I think we'll end up adding a virtualization option to the UEFI BIOS 
> similar to how Intel platforms work. Based on this switch, we'll end up 
> patching the ACPI table.
> 
> If I remove the IORT entry, then the device is in coherent mode with 
> device accessing the full RAM range.
> 
> If I have the IORT table, the device is in IOMMU translation mode.
> 
> Details are in the IORT spec.

I think that would suck a lot more than being slightly out of spec
regarding SBSA if you make the low PCI addresses map to the start
of RAM. Asking users to select a 'virtualization' option based on
what kind of PCI device and kernel version they have is a major
hassle.

	Arnd

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
  2015-11-10 21:59                             ` Timur Tabi
@ 2015-11-10 22:08                               ` Arnd Bergmann
  -1 siblings, 0 replies; 86+ messages in thread
From: Arnd Bergmann @ 2015-11-10 22:08 UTC (permalink / raw)
  To: Timur Tabi
  Cc: Sinan Kaya, linux-arm-kernel, Abhijit Mahajan,
	Nagalakshmi Nandigama, linux-scsi, jcm, James E.J. Bottomley,
	linux-kernel, Sreekanth Reddy, Praveen Krishnamoorthy, cov,
	linux-arm-msm, agross, MPT-FusionLinux.pdl, Hannes Reinecke

On Tuesday 10 November 2015 15:59:18 Timur Tabi wrote:
> On 11/10/2015 03:54 PM, Arnd Bergmann wrote:
> 
> >> In our drivers for 32-bit devices, we have to explicitly set the DMA
> >> mask to 32-bits in order to get any DMA working.
> >
> > Do you mean PCI devices or platform devices?
> 
> Platform.
> 
> > Maybe the parent bus is lacking a dma-ranges property?
> 
> All of this applies only on device-tree platforms.  Sinan and I are 
> working on an ACPI server platform.  So we never call 
> of_dma_configure(), and we don't have a dma-ranges property.

ACPI must have something else to mark DMA master devices and their
capabilities, right?

The platform should initialize the dma_mask pointer to a per-device
mask and set both the dma_mask and dma_coherent_mask to 32 bits
for any DMA master device, and the device driver should override
that to be an appropriate mask based on its needs later.

	Arnd

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
@ 2015-11-10 22:08                               ` Arnd Bergmann
  0 siblings, 0 replies; 86+ messages in thread
From: Arnd Bergmann @ 2015-11-10 22:08 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 10 November 2015 15:59:18 Timur Tabi wrote:
> On 11/10/2015 03:54 PM, Arnd Bergmann wrote:
> 
> >> In our drivers for 32-bit devices, we have to explicitly set the DMA
> >> mask to 32-bits in order to get any DMA working.
> >
> > Do you mean PCI devices or platform devices?
> 
> Platform.
> 
> > Maybe the parent bus is lacking a dma-ranges property?
> 
> All of this applies only on device-tree platforms.  Sinan and I are 
> working on an ACPI server platform.  So we never call 
> of_dma_configure(), and we don't have a dma-ranges property.

ACPI must have something else to mark DMA master devices and their
capabilities, right?

The platform should initialize the dma_mask pointer to a per-device
mask and set both the dma_mask and dma_coherent_mask to 32 bits
for any DMA master device, and the device driver should override
that to be an appropriate mask based on its needs later.

	Arnd

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: [PATCH V2 3/3] scsi: mptxsas: offload IRQ execution
  2015-11-10  5:59       ` Sinan Kaya
@ 2016-03-16 15:31         ` Christopher Covington
  -1 siblings, 0 replies; 86+ messages in thread
From: Christopher Covington @ 2016-03-16 15:31 UTC (permalink / raw)
  To: Sinan Kaya, Hannes Reinecke, linux-scsi, timur, jcm
  Cc: agross, linux-arm-msm, linux-arm-kernel, Nagalakshmi Nandigama,
	Praveen Krishnamoorthy, Sreekanth Reddy, Abhijit Mahajan,
	James E.J. Bottomley, MPT-FusionLinux.pdl, linux-kernel

On 11/10/2015 12:59 AM, Sinan Kaya wrote:
> 
> On 11/9/2015 2:15 AM, Hannes Reinecke wrote:
>> On 11/09/2015 02:57 AM, Sinan Kaya wrote:
>>> The mpt2sas and mpt3sas drivers are spinning forever in
>>> their IRQ handlers if there are a lot of jobs queued up
>>> by the PCIe card. This handler is causing spikes for
>>> the rest of the system and sluggish behavior.
>>>
>>> Marking all MSI interrupts as non-shared and moving the
>>> MSI interrupts to thread context. This relexes the rest
>>> of the system execution.
>>>
>> NACK.
>>
>> If there is a scalability issue when handling interrupts
>> it should be fixed in the driver directly.
>>
>> Looking at the driver is should be possible to implement
>> a worker thread handling the reply descriptor, and having the
>> interrupt only to fetch the reply descriptor.
>>
> 
> Can you go into the detail about which part of the _base_interrupt
> function needs to be executed in ISR context and which part can be
> queued up to worker thread?
> 
> I'm not familiar with the hardware or the code. That's why, I moved the
> entire ISR into the thread context.

Hannes or others, any suggestions?

Thanks,
Christopher Covington

-- 
Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

* [PATCH V2 3/3] scsi: mptxsas: offload IRQ execution
@ 2016-03-16 15:31         ` Christopher Covington
  0 siblings, 0 replies; 86+ messages in thread
From: Christopher Covington @ 2016-03-16 15:31 UTC (permalink / raw)
  To: linux-arm-kernel

On 11/10/2015 12:59 AM, Sinan Kaya wrote:
> 
> On 11/9/2015 2:15 AM, Hannes Reinecke wrote:
>> On 11/09/2015 02:57 AM, Sinan Kaya wrote:
>>> The mpt2sas and mpt3sas drivers are spinning forever in
>>> their IRQ handlers if there are a lot of jobs queued up
>>> by the PCIe card. This handler is causing spikes for
>>> the rest of the system and sluggish behavior.
>>>
>>> Marking all MSI interrupts as non-shared and moving the
>>> MSI interrupts to thread context. This relexes the rest
>>> of the system execution.
>>>
>> NACK.
>>
>> If there is a scalability issue when handling interrupts
>> it should be fixed in the driver directly.
>>
>> Looking at the driver is should be possible to implement
>> a worker thread handling the reply descriptor, and having the
>> interrupt only to fetch the reply descriptor.
>>
> 
> Can you go into the detail about which part of the _base_interrupt
> function needs to be executed in ISR context and which part can be
> queued up to worker thread?
> 
> I'm not familiar with the hardware or the code. That's why, I moved the
> entire ISR into the thread context.

Hannes or others, any suggestions?

Thanks,
Christopher Covington

-- 
Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 86+ messages in thread

end of thread, other threads:[~2016-03-16 15:31 UTC | newest]

Thread overview: 86+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-09  1:57 [PATCH V2 0/3] scsi: mptxsas: updates for ARM64 Sinan Kaya
2015-11-09  1:57 ` Sinan Kaya
2015-11-09  1:57 ` [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails Sinan Kaya
2015-11-09  1:57   ` Sinan Kaya
2015-11-09  7:09   ` Hannes Reinecke
2015-11-09  7:09     ` Hannes Reinecke
2015-11-09  8:59     ` Arnd Bergmann
2015-11-09  8:59       ` Arnd Bergmann
2015-11-09 14:07       ` Sinan Kaya
2015-11-09 14:07         ` Sinan Kaya
2015-11-09 14:33         ` Arnd Bergmann
2015-11-09 14:33           ` Arnd Bergmann
2015-11-09 23:22           ` Sinan Kaya
2015-11-09 23:22             ` Sinan Kaya
2015-11-09 23:29             ` Timur Tabi
2015-11-09 23:29               ` Timur Tabi
2015-11-10  8:38             ` Arnd Bergmann
2015-11-10  8:38               ` Arnd Bergmann
2015-11-10 16:06               ` Sinan Kaya
2015-11-10 16:06                 ` Sinan Kaya
2015-11-10 16:47                 ` Arnd Bergmann
2015-11-10 16:47                   ` Arnd Bergmann
2015-11-10 17:00                   ` Timur Tabi
2015-11-10 17:00                     ` Timur Tabi
2015-11-10 19:13                     ` Arnd Bergmann
2015-11-10 19:13                       ` Arnd Bergmann
2015-11-10 21:03                       ` Timur Tabi
2015-11-10 21:03                         ` Timur Tabi
2015-11-10 21:54                         ` Arnd Bergmann
2015-11-10 21:54                           ` Arnd Bergmann
2015-11-10 21:59                           ` Timur Tabi
2015-11-10 21:59                             ` Timur Tabi
2015-11-10 22:08                             ` Arnd Bergmann
2015-11-10 22:08                               ` Arnd Bergmann
2015-11-10 17:19                   ` Sinan Kaya
2015-11-10 17:19                     ` Sinan Kaya
2015-11-10 18:27                     ` James Bottomley
2015-11-10 18:27                       ` James Bottomley
2015-11-10 19:14                       ` Sinan Kaya
2015-11-10 19:14                         ` Sinan Kaya
2015-11-10 19:43                         ` James Bottomley
2015-11-10 19:43                           ` James Bottomley
2015-11-10 19:56                           ` Sinan Kaya
2015-11-10 19:56                             ` Sinan Kaya
2015-11-10 20:05                             ` James Bottomley
2015-11-10 20:05                               ` James Bottomley
2015-11-10 20:26                               ` Sinan Kaya
2015-11-10 20:26                                 ` Sinan Kaya
2015-11-10 20:35                                 ` James Bottomley
2015-11-10 20:35                                   ` James Bottomley
2015-11-10 19:56                     ` Arnd Bergmann
2015-11-10 19:56                       ` Arnd Bergmann
2015-11-10 20:58                       ` Sinan Kaya
2015-11-10 20:58                         ` Sinan Kaya
2015-11-10 22:06                         ` Arnd Bergmann
2015-11-10 22:06                           ` Arnd Bergmann
2015-11-09 14:00     ` Sinan Kaya
2015-11-09 14:00       ` Sinan Kaya
2015-11-09  1:57 ` [PATCH V2 2/3] scsi: fix compiler warning for sg Sinan Kaya
2015-11-09  1:57   ` Sinan Kaya
2015-11-09 14:14   ` Andy Shevchenko
2015-11-09 14:14     ` Andy Shevchenko
2015-11-10  3:21     ` Sinan Kaya
2015-11-10  3:21       ` Sinan Kaya
2015-11-10  3:21       ` Sinan Kaya
2015-11-10  3:26       ` Timur Tabi
2015-11-10  3:26         ` Timur Tabi
2015-11-10  4:51         ` Sinan Kaya
2015-11-10  4:51           ` Sinan Kaya
2015-11-10  4:53           ` Timur Tabi
2015-11-10  4:53             ` Timur Tabi
2015-11-10  9:23             ` Andy Shevchenko
2015-11-10  9:23               ` Andy Shevchenko
2015-11-10 10:09             ` Arnd Bergmann
2015-11-10 10:09               ` Arnd Bergmann
2015-11-09  1:57 ` [PATCH V2 3/3] scsi: mptxsas: offload IRQ execution Sinan Kaya
2015-11-09  1:57   ` Sinan Kaya
2015-11-09  7:15   ` Hannes Reinecke
2015-11-09  7:15     ` Hannes Reinecke
2015-11-09 14:01     ` Sinan Kaya
2015-11-09 14:01       ` Sinan Kaya
2015-11-10  5:59     ` Sinan Kaya
2015-11-10  5:59       ` Sinan Kaya
2015-11-10  5:59       ` Sinan Kaya
2016-03-16 15:31       ` Christopher Covington
2016-03-16 15:31         ` Christopher Covington

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.