linux-kernel.vger.kernel.org archive mirror
* [RESEND][PATCH 0/7] Avoid overflow at boundary_size
@ 2020-08-31 20:38 Nicolin Chen
  2020-08-31 20:38 ` [RESEND][PATCH 1/7] powerpc/iommu: " Nicolin Chen
                   ` (7 more replies)
  0 siblings, 8 replies; 13+ messages in thread
From: Nicolin Chen @ 2020-08-31 20:38 UTC (permalink / raw)
  To: mpe, benh, paulus, rth, ink, mattst88, tony.luck, fenghua.yu,
	schnelle, gerald.schaefer, hca, gor, borntraeger, davem, tglx,
	mingo, bp, x86, hpa, James.Bottomley, deller
  Cc: sfr, hch, linuxppc-dev, linux-kernel, linux-alpha, linux-ia64,
	linux-s390, sparclinux, linux-parisc

==== For this resend ====
None of the patches in the original series has been acked yet, so
I am resending them, as suggested by Niklas.

==== Coverletter ====
We are extending the default DMA segment boundary to its possible
maximum value (ULONG_MAX) to indicate that a device doesn't specify
a boundary limit. So all dma_get_seg_boundary callers should take
precautions with the returned value, since it could easily overflow.

I scanned the entire kernel tree for the existing callers and found
that most of them can overflow in one of two ways: either by doing
"+ 1" on the value, or by passing it to ALIGN(), which does "+ mask".

According to kernel defines:
    #define ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask))
    #define ALIGN(x, a)	ALIGN_MASK(x, (typeof(x))(a) - 1)

We can simplify the logic here:
  ALIGN(boundary + 1, 1 << shift) >> shift
= ALIGN_MASK(b + 1, (1 << s) - 1) >> s
= {[b + 1 + (1 << s) - 1] & ~[(1 << s) - 1]} >> s
= [b + 1 + (1 << s) - 1] >> s
= [b + (1 << s)] >> s
= (b >> s) + 1

So this series of patches fixes the potential overflows with this
overflow-free shortcut.

As I don't have access to these platforms, testing and comments are
welcome.

Thanks
Nic

Nicolin Chen (7):
  powerpc/iommu: Avoid overflow at boundary_size
  alpha: Avoid overflow at boundary_size
  ia64/sba_iommu: Avoid overflow at boundary_size
  s390/pci_dma: Avoid overflow at boundary_size
  sparc: Avoid overflow at boundary_size
  x86/amd_gart: Avoid overflow at boundary_size
  parisc: Avoid overflow at boundary_size

 arch/alpha/kernel/pci_iommu.c    | 10 ++++------
 arch/ia64/hp/common/sba_iommu.c  |  4 ++--
 arch/powerpc/kernel/iommu.c      | 11 +++++------
 arch/s390/pci/pci_dma.c          |  4 ++--
 arch/sparc/kernel/iommu-common.c |  9 +++------
 arch/sparc/kernel/iommu.c        |  4 ++--
 arch/sparc/kernel/pci_sun4v.c    |  4 ++--
 arch/x86/kernel/amd_gart_64.c    |  4 ++--
 drivers/parisc/ccio-dma.c        |  4 ++--
 drivers/parisc/sba_iommu.c       |  4 ++--
 10 files changed, 26 insertions(+), 32 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RESEND][PATCH 1/7] powerpc/iommu: Avoid overflow at boundary_size
  2020-08-31 20:38 [RESEND][PATCH 0/7] Avoid overflow at boundary_size Nicolin Chen
@ 2020-08-31 20:38 ` Nicolin Chen
  2020-09-01 13:27   ` Michael Ellerman
  2020-08-31 20:38 ` [RESEND][PATCH 2/7] alpha: " Nicolin Chen
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 13+ messages in thread
From: Nicolin Chen @ 2020-08-31 20:38 UTC (permalink / raw)
  To: mpe, benh, paulus, rth, ink, mattst88, tony.luck, fenghua.yu,
	schnelle, gerald.schaefer, hca, gor, borntraeger, davem, tglx,
	mingo, bp, x86, hpa, James.Bottomley, deller
  Cc: sfr, hch, linuxppc-dev, linux-kernel, linux-alpha, linux-ia64,
	linux-s390, sparclinux, linux-parisc

The boundary_size might be as large as ULONG_MAX, which means that
the device has no specific boundary limit. So either doing a "+ 1"
on it or passing it to ALIGN() could overflow.

According to kernel defines:
    #define ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask))
    #define ALIGN(x, a)	ALIGN_MASK(x, (typeof(x))(a) - 1)

We can simplify the logic here:
  ALIGN(boundary + 1, 1 << shift) >> shift
= ALIGN_MASK(b + 1, (1 << s) - 1) >> s
= {[b + 1 + (1 << s) - 1] & ~[(1 << s) - 1]} >> s
= [b + 1 + (1 << s) - 1] >> s
= [b + (1 << s)] >> s
= (b >> s) + 1

So fix the potential overflow with this safer, overflow-free shortcut.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
---
 arch/powerpc/kernel/iommu.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index 9704f3f76e63..c01ccbf8afdd 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -236,15 +236,14 @@ static unsigned long iommu_range_alloc(struct device *dev,
 		}
 	}
 
-	if (dev)
-		boundary_size = ALIGN(dma_get_seg_boundary(dev) + 1,
-				      1 << tbl->it_page_shift);
-	else
-		boundary_size = ALIGN(1UL << 32, 1 << tbl->it_page_shift);
 	/* 4GB boundary for iseries_hv_alloc and iseries_hv_map */
+	boundary_size = dev ? dma_get_seg_boundary(dev) : U32_MAX;
+
+	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
+	boundary_size = (boundary_size >> tbl->it_page_shift) + 1;
 
 	n = iommu_area_alloc(tbl->it_map, limit, start, npages, tbl->it_offset,
-			     boundary_size >> tbl->it_page_shift, align_mask);
+			     boundary_size, align_mask);
 	if (n == -1) {
 		if (likely(pass == 0)) {
 			/* First try the pool from the start */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [RESEND][PATCH 2/7] alpha: Avoid overflow at boundary_size
  2020-08-31 20:38 [RESEND][PATCH 0/7] Avoid overflow at boundary_size Nicolin Chen
  2020-08-31 20:38 ` [RESEND][PATCH 1/7] powerpc/iommu: " Nicolin Chen
@ 2020-08-31 20:38 ` Nicolin Chen
  2020-08-31 20:38 ` [RESEND][PATCH 3/7] ia64/sba_iommu: " Nicolin Chen
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Nicolin Chen @ 2020-08-31 20:38 UTC (permalink / raw)
  To: mpe, benh, paulus, rth, ink, mattst88, tony.luck, fenghua.yu,
	schnelle, gerald.schaefer, hca, gor, borntraeger, davem, tglx,
	mingo, bp, x86, hpa, James.Bottomley, deller
  Cc: sfr, hch, linuxppc-dev, linux-kernel, linux-alpha, linux-ia64,
	linux-s390, sparclinux, linux-parisc

The boundary_size might be as large as ULONG_MAX, which means that
the device has no specific boundary limit. So doing a "+ 1" on it
could overflow.

Also, following other places in the kernel, boundary_size should be
aligned to PAGE_SIZE before being right-shifted by PAGE_SHIFT.
However, passing it to ALIGN() could overflow too.

According to kernel defines:
    #define ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask))
    #define ALIGN(x, a)	ALIGN_MASK(x, (typeof(x))(a) - 1)

We can simplify the logic here:
  ALIGN(boundary + 1, 1 << shift) >> shift
= ALIGN_MASK(b + 1, (1 << s) - 1) >> s
= {[b + 1 + (1 << s) - 1] & ~[(1 << s) - 1]} >> s
= [b + 1 + (1 << s) - 1] >> s
= [b + (1 << s)] >> s
= (b >> s) + 1

So fix the potential overflow with this safer, overflow-free shortcut.

Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
---
 arch/alpha/kernel/pci_iommu.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c
index 81037907268d..1ef2c647bd3e 100644
--- a/arch/alpha/kernel/pci_iommu.c
+++ b/arch/alpha/kernel/pci_iommu.c
@@ -141,12 +141,10 @@ iommu_arena_find_pages(struct device *dev, struct pci_iommu_arena *arena,
 	unsigned long boundary_size;
 
 	base = arena->dma_base >> PAGE_SHIFT;
-	if (dev) {
-		boundary_size = dma_get_seg_boundary(dev) + 1;
-		boundary_size >>= PAGE_SHIFT;
-	} else {
-		boundary_size = 1UL << (32 - PAGE_SHIFT);
-	}
+
+	boundary_size = dev ? dma_get_seg_boundary(dev) : U32_MAX;
+	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
+	boundary_size = (boundary_size >> PAGE_SHIFT) + 1;
 
 	/* Search forward for the first mask-aligned sequence of N free ptes */
 	ptes = arena->ptes;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [RESEND][PATCH 3/7] ia64/sba_iommu: Avoid overflow at boundary_size
  2020-08-31 20:38 [RESEND][PATCH 0/7] Avoid overflow at boundary_size Nicolin Chen
  2020-08-31 20:38 ` [RESEND][PATCH 1/7] powerpc/iommu: " Nicolin Chen
  2020-08-31 20:38 ` [RESEND][PATCH 2/7] alpha: " Nicolin Chen
@ 2020-08-31 20:38 ` Nicolin Chen
  2020-08-31 20:38 ` [RESEND][PATCH 4/7] s390/pci_dma: " Nicolin Chen
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Nicolin Chen @ 2020-08-31 20:38 UTC (permalink / raw)
  To: mpe, benh, paulus, rth, ink, mattst88, tony.luck, fenghua.yu,
	schnelle, gerald.schaefer, hca, gor, borntraeger, davem, tglx,
	mingo, bp, x86, hpa, James.Bottomley, deller
  Cc: sfr, hch, linuxppc-dev, linux-kernel, linux-alpha, linux-ia64,
	linux-s390, sparclinux, linux-parisc

The boundary_size might be as large as ULONG_MAX, which means that
the device has no specific boundary limit. So either doing a "+ 1"
on it or passing it to ALIGN() could overflow.

According to kernel defines:
    #define ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask))
    #define ALIGN(x, a)	ALIGN_MASK(x, (typeof(x))(a) - 1)

We can simplify the logic here:
  ALIGN(boundary + 1, 1 << shift) >> shift
= ALIGN_MASK(b + 1, (1 << s) - 1) >> s
= {[b + 1 + (1 << s) - 1] & ~[(1 << s) - 1]} >> s
= [b + 1 + (1 << s) - 1] >> s
= [b + (1 << s)] >> s
= (b >> s) + 1

So fix the potential overflow with this safer, overflow-free shortcut.

Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
---
 arch/ia64/hp/common/sba_iommu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/ia64/hp/common/sba_iommu.c b/arch/ia64/hp/common/sba_iommu.c
index 656a4888c300..945954903bb0 100644
--- a/arch/ia64/hp/common/sba_iommu.c
+++ b/arch/ia64/hp/common/sba_iommu.c
@@ -485,8 +485,8 @@ sba_search_bitmap(struct ioc *ioc, struct device *dev,
 	ASSERT(((unsigned long) ioc->res_hint & (sizeof(unsigned long) - 1UL)) == 0);
 	ASSERT(res_ptr < res_end);
 
-	boundary_size = (unsigned long long)dma_get_seg_boundary(dev) + 1;
-	boundary_size = ALIGN(boundary_size, 1ULL << iovp_shift) >> iovp_shift;
+	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
+	boundary_size = (dma_get_seg_boundary(dev) >> iovp_shift) + 1;
 
 	BUG_ON(ioc->ibase & ~iovp_mask);
 	shift = ioc->ibase >> iovp_shift;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [RESEND][PATCH 4/7] s390/pci_dma: Avoid overflow at boundary_size
  2020-08-31 20:38 [RESEND][PATCH 0/7] Avoid overflow at boundary_size Nicolin Chen
                   ` (2 preceding siblings ...)
  2020-08-31 20:38 ` [RESEND][PATCH 3/7] ia64/sba_iommu: " Nicolin Chen
@ 2020-08-31 20:38 ` Nicolin Chen
  2020-08-31 20:38 ` [RESEND][PATCH 5/7] sparc: " Nicolin Chen
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Nicolin Chen @ 2020-08-31 20:38 UTC (permalink / raw)
  To: mpe, benh, paulus, rth, ink, mattst88, tony.luck, fenghua.yu,
	schnelle, gerald.schaefer, hca, gor, borntraeger, davem, tglx,
	mingo, bp, x86, hpa, James.Bottomley, deller
  Cc: sfr, hch, linuxppc-dev, linux-kernel, linux-alpha, linux-ia64,
	linux-s390, sparclinux, linux-parisc

The boundary_size might be as large as ULONG_MAX, which means that
the device has no specific boundary limit. So either doing a "+ 1"
on it or passing it to ALIGN() could overflow.

According to kernel defines:
    #define ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask))
    #define ALIGN(x, a)	ALIGN_MASK(x, (typeof(x))(a) - 1)

We can simplify the logic here:
  ALIGN(boundary + 1, 1 << shift) >> shift
= ALIGN_MASK(b + 1, (1 << s) - 1) >> s
= {[b + 1 + (1 << s) - 1] & ~[(1 << s) - 1]} >> s
= [b + 1 + (1 << s) - 1] >> s
= [b + (1 << s)] >> s
= (b >> s) + 1

So fix the potential overflow with this safer, overflow-free shortcut.

Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
---
 arch/s390/pci/pci_dma.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c
index 64b1399a73f0..ecb067acc6d5 100644
--- a/arch/s390/pci/pci_dma.c
+++ b/arch/s390/pci/pci_dma.c
@@ -263,8 +263,8 @@ static unsigned long __dma_alloc_iommu(struct device *dev,
 	struct zpci_dev *zdev = to_zpci(to_pci_dev(dev));
 	unsigned long boundary_size;
 
-	boundary_size = ALIGN(dma_get_seg_boundary(dev) + 1,
-			      PAGE_SIZE) >> PAGE_SHIFT;
+	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
+	boundary_size = (dma_get_seg_boundary(dev) >> PAGE_SHIFT) + 1;
 	return iommu_area_alloc(zdev->iommu_bitmap, zdev->iommu_pages,
 				start, size, zdev->start_dma >> PAGE_SHIFT,
 				boundary_size, 0);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [RESEND][PATCH 5/7] sparc: Avoid overflow at boundary_size
  2020-08-31 20:38 [RESEND][PATCH 0/7] Avoid overflow at boundary_size Nicolin Chen
                   ` (3 preceding siblings ...)
  2020-08-31 20:38 ` [RESEND][PATCH 4/7] s390/pci_dma: " Nicolin Chen
@ 2020-08-31 20:38 ` Nicolin Chen
  2020-08-31 20:38 ` [RESEND][PATCH 6/7] x86/amd_gart: " Nicolin Chen
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Nicolin Chen @ 2020-08-31 20:38 UTC (permalink / raw)
  To: mpe, benh, paulus, rth, ink, mattst88, tony.luck, fenghua.yu,
	schnelle, gerald.schaefer, hca, gor, borntraeger, davem, tglx,
	mingo, bp, x86, hpa, James.Bottomley, deller
  Cc: sfr, hch, linuxppc-dev, linux-kernel, linux-alpha, linux-ia64,
	linux-s390, sparclinux, linux-parisc

The boundary_size might be as large as ULONG_MAX, which means that
the device has no specific boundary limit. So either doing a "+ 1"
on it or passing it to ALIGN() could overflow.

According to kernel defines:
    #define ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask))
    #define ALIGN(x, a)	ALIGN_MASK(x, (typeof(x))(a) - 1)

We can simplify the logic here:
  ALIGN(boundary + 1, 1 << shift) >> shift
= ALIGN_MASK(b + 1, (1 << s) - 1) >> s
= {[b + 1 + (1 << s) - 1] & ~[(1 << s) - 1]} >> s
= [b + 1 + (1 << s) - 1] >> s
= [b + (1 << s)] >> s
= (b >> s) + 1

So fix the potential overflow with this safer, overflow-free shortcut.

Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
---
 arch/sparc/kernel/iommu-common.c | 9 +++------
 arch/sparc/kernel/iommu.c        | 4 ++--
 arch/sparc/kernel/pci_sun4v.c    | 4 ++--
 3 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/arch/sparc/kernel/iommu-common.c b/arch/sparc/kernel/iommu-common.c
index 59cb16691322..843e71894d04 100644
--- a/arch/sparc/kernel/iommu-common.c
+++ b/arch/sparc/kernel/iommu-common.c
@@ -166,13 +166,10 @@ unsigned long iommu_tbl_range_alloc(struct device *dev,
 		}
 	}
 
-	if (dev)
-		boundary_size = ALIGN(dma_get_seg_boundary(dev) + 1,
-				      1 << iommu->table_shift);
-	else
-		boundary_size = ALIGN(1ULL << 32, 1 << iommu->table_shift);
+	boundary_size = dev ? dma_get_seg_boundary(dev) : U32_MAX;
 
-	boundary_size = boundary_size >> iommu->table_shift;
+	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
+	boundary_size = (boundary_size >> iommu->table_shift) + 1;
 	/*
 	 * if the skip_span_boundary_check had been set during init, we set
 	 * things up so that iommu_is_span_boundary() merely checks if the
diff --git a/arch/sparc/kernel/iommu.c b/arch/sparc/kernel/iommu.c
index 4ae7388b1bff..d981c37305ae 100644
--- a/arch/sparc/kernel/iommu.c
+++ b/arch/sparc/kernel/iommu.c
@@ -472,8 +472,8 @@ static int dma_4u_map_sg(struct device *dev, struct scatterlist *sglist,
 	outs->dma_length = 0;
 
 	max_seg_size = dma_get_max_seg_size(dev);
-	seg_boundary_size = ALIGN(dma_get_seg_boundary(dev) + 1,
-				  IO_PAGE_SIZE) >> IO_PAGE_SHIFT;
+	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
+	seg_boundary_size = (dma_get_seg_boundary(dev) >> IO_PAGE_SHIFT) + 1;
 	base_shift = iommu->tbl.table_map_base >> IO_PAGE_SHIFT;
 	for_each_sg(sglist, s, nelems, i) {
 		unsigned long paddr, npages, entry, out_entry = 0, slen;
diff --git a/arch/sparc/kernel/pci_sun4v.c b/arch/sparc/kernel/pci_sun4v.c
index 14b93c5564e3..233fe8a20cec 100644
--- a/arch/sparc/kernel/pci_sun4v.c
+++ b/arch/sparc/kernel/pci_sun4v.c
@@ -508,8 +508,8 @@ static int dma_4v_map_sg(struct device *dev, struct scatterlist *sglist,
 	iommu_batch_start(dev, prot, ~0UL);
 
 	max_seg_size = dma_get_max_seg_size(dev);
-	seg_boundary_size = ALIGN(dma_get_seg_boundary(dev) + 1,
-				  IO_PAGE_SIZE) >> IO_PAGE_SHIFT;
+	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
+	seg_boundary_size = (dma_get_seg_boundary(dev) >> IO_PAGE_SHIFT) + 1;
 
 	mask = *dev->dma_mask;
 	if (!iommu_use_atu(iommu, mask))
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [RESEND][PATCH 6/7] x86/amd_gart: Avoid overflow at boundary_size
  2020-08-31 20:38 [RESEND][PATCH 0/7] Avoid overflow at boundary_size Nicolin Chen
                   ` (4 preceding siblings ...)
  2020-08-31 20:38 ` [RESEND][PATCH 5/7] sparc: " Nicolin Chen
@ 2020-08-31 20:38 ` Nicolin Chen
  2020-08-31 20:38 ` [RESEND][PATCH 7/7] parisc: " Nicolin Chen
  2020-09-01  7:36 ` [RESEND][PATCH 0/7] " Christoph Hellwig
  7 siblings, 0 replies; 13+ messages in thread
From: Nicolin Chen @ 2020-08-31 20:38 UTC (permalink / raw)
  To: mpe, benh, paulus, rth, ink, mattst88, tony.luck, fenghua.yu,
	schnelle, gerald.schaefer, hca, gor, borntraeger, davem, tglx,
	mingo, bp, x86, hpa, James.Bottomley, deller
  Cc: sfr, hch, linuxppc-dev, linux-kernel, linux-alpha, linux-ia64,
	linux-s390, sparclinux, linux-parisc

The boundary_size might be as large as ULONG_MAX, which means that
the device has no specific boundary limit. So either doing a "+ 1"
on it or passing it to ALIGN() could overflow.

According to kernel defines:
    #define ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask))
    #define ALIGN(x, a)	ALIGN_MASK(x, (typeof(x))(a) - 1)

We can simplify the logic here:
  ALIGN(boundary + 1, 1 << shift) >> shift
= ALIGN_MASK(b + 1, (1 << s) - 1) >> s
= {[b + 1 + (1 << s) - 1] & ~[(1 << s) - 1]} >> s
= [b + 1 + (1 << s) - 1] >> s
= [b + (1 << s)] >> s
= (b >> s) + 1

So fix the potential overflow with this safer, overflow-free shortcut.

Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
---
 arch/x86/kernel/amd_gart_64.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/amd_gart_64.c b/arch/x86/kernel/amd_gart_64.c
index e89031e9c847..7fa0bb490065 100644
--- a/arch/x86/kernel/amd_gart_64.c
+++ b/arch/x86/kernel/amd_gart_64.c
@@ -96,8 +96,8 @@ static unsigned long alloc_iommu(struct device *dev, int size,
 
 	base_index = ALIGN(iommu_bus_base & dma_get_seg_boundary(dev),
 			   PAGE_SIZE) >> PAGE_SHIFT;
-	boundary_size = ALIGN((u64)dma_get_seg_boundary(dev) + 1,
-			      PAGE_SIZE) >> PAGE_SHIFT;
+	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
+	boundary_size = (dma_get_seg_boundary(dev) >> PAGE_SHIFT) + 1;
 
 	spin_lock_irqsave(&iommu_bitmap_lock, flags);
 	offset = iommu_area_alloc(iommu_gart_bitmap, iommu_pages, next_bit,
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [RESEND][PATCH 7/7] parisc: Avoid overflow at boundary_size
  2020-08-31 20:38 [RESEND][PATCH 0/7] Avoid overflow at boundary_size Nicolin Chen
                   ` (5 preceding siblings ...)
  2020-08-31 20:38 ` [RESEND][PATCH 6/7] x86/amd_gart: " Nicolin Chen
@ 2020-08-31 20:38 ` Nicolin Chen
  2020-09-01  7:36 ` [RESEND][PATCH 0/7] " Christoph Hellwig
  7 siblings, 0 replies; 13+ messages in thread
From: Nicolin Chen @ 2020-08-31 20:38 UTC (permalink / raw)
  To: mpe, benh, paulus, rth, ink, mattst88, tony.luck, fenghua.yu,
	schnelle, gerald.schaefer, hca, gor, borntraeger, davem, tglx,
	mingo, bp, x86, hpa, James.Bottomley, deller
  Cc: sfr, hch, linuxppc-dev, linux-kernel, linux-alpha, linux-ia64,
	linux-s390, sparclinux, linux-parisc

The boundary_size might be as large as ULONG_MAX, which means that
the device has no specific boundary limit. So either doing a "+ 1"
on it or passing it to ALIGN() could overflow.

According to kernel defines:
    #define ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask))
    #define ALIGN(x, a)	ALIGN_MASK(x, (typeof(x))(a) - 1)

We can simplify the logic here:
  ALIGN(boundary + 1, 1 << shift) >> shift
= ALIGN_MASK(b + 1, (1 << s) - 1) >> s
= {[b + 1 + (1 << s) - 1] & ~[(1 << s) - 1]} >> s
= [b + 1 + (1 << s) - 1] >> s
= [b + (1 << s)] >> s
= (b >> s) + 1

So fix the potential overflow with this safer, overflow-free shortcut.

Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
---
 drivers/parisc/ccio-dma.c  | 4 ++--
 drivers/parisc/sba_iommu.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/parisc/ccio-dma.c b/drivers/parisc/ccio-dma.c
index a5507f75b524..c667d6aba764 100644
--- a/drivers/parisc/ccio-dma.c
+++ b/drivers/parisc/ccio-dma.c
@@ -356,8 +356,8 @@ ccio_alloc_range(struct ioc *ioc, struct device *dev, size_t size)
 	** ggg sacrifices another 710 to the computer gods.
 	*/
 
-	boundary_size = ALIGN((unsigned long long)dma_get_seg_boundary(dev) + 1,
-			      1ULL << IOVP_SHIFT) >> IOVP_SHIFT;
+	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
+	boundary_size = (dma_get_seg_boundary(dev) >> IOVP_SHIFT) + 1;
 
 	if (pages_needed <= 8) {
 		/*
diff --git a/drivers/parisc/sba_iommu.c b/drivers/parisc/sba_iommu.c
index d4314fba0269..96bc2c617cbd 100644
--- a/drivers/parisc/sba_iommu.c
+++ b/drivers/parisc/sba_iommu.c
@@ -342,8 +342,8 @@ sba_search_bitmap(struct ioc *ioc, struct device *dev,
 	unsigned long shift;
 	int ret;
 
-	boundary_size = ALIGN((unsigned long long)dma_get_seg_boundary(dev) + 1,
-			      1ULL << IOVP_SHIFT) >> IOVP_SHIFT;
+	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
+	boundary_size = (dma_get_seg_boundary(dev) >> IOVP_SHIFT) + 1;
 
 #if defined(ZX1_SUPPORT)
 	BUG_ON(ioc->ibase & ~IOVP_MASK);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [RESEND][PATCH 0/7] Avoid overflow at boundary_size
  2020-08-31 20:38 [RESEND][PATCH 0/7] Avoid overflow at boundary_size Nicolin Chen
                   ` (6 preceding siblings ...)
  2020-08-31 20:38 ` [RESEND][PATCH 7/7] parisc: " Nicolin Chen
@ 2020-09-01  7:36 ` Christoph Hellwig
  2020-09-01  7:54   ` Nicolin Chen
  7 siblings, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2020-09-01  7:36 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: mpe, benh, paulus, rth, ink, mattst88, tony.luck, fenghua.yu,
	schnelle, gerald.schaefer, hca, gor, borntraeger, davem, tglx,
	mingo, bp, x86, hpa, James.Bottomley, deller, sfr, hch,
	linuxppc-dev, linux-kernel, linux-alpha, linux-ia64, linux-s390,
	sparclinux, linux-parisc

I really don't like all the open coded smarts in the various drivers.
What do you think about a helper like the one in the untested patch
below (on top of your series).  Also please include the original
segment boundary patch with the next resend so that the series has
the full context.

diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c
index 1ef2c647bd3ec2..6f7de4f4e191e7 100644
--- a/arch/alpha/kernel/pci_iommu.c
+++ b/arch/alpha/kernel/pci_iommu.c
@@ -141,10 +141,7 @@ iommu_arena_find_pages(struct device *dev, struct pci_iommu_arena *arena,
 	unsigned long boundary_size;
 
 	base = arena->dma_base >> PAGE_SHIFT;
-
-	boundary_size = dev ? dma_get_seg_boundary(dev) : U32_MAX;
-	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
-	boundary_size = (boundary_size >> PAGE_SHIFT) + 1;
+	boundary_size = dma_get_seg_boundary_nr_pages(dev, PAGE_SHIFT);
 
 	/* Search forward for the first mask-aligned sequence of N free ptes */
 	ptes = arena->ptes;
diff --git a/arch/ia64/hp/common/sba_iommu.c b/arch/ia64/hp/common/sba_iommu.c
index 945954903bb0ba..b49b73a95067d2 100644
--- a/arch/ia64/hp/common/sba_iommu.c
+++ b/arch/ia64/hp/common/sba_iommu.c
@@ -485,8 +485,7 @@ sba_search_bitmap(struct ioc *ioc, struct device *dev,
 	ASSERT(((unsigned long) ioc->res_hint & (sizeof(unsigned long) - 1UL)) == 0);
 	ASSERT(res_ptr < res_end);
 
-	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
-	boundary_size = (dma_get_seg_boundary(dev) >> iovp_shift) + 1;
+	boundary_size = dma_get_seg_boundary_nr_pages(dev, iovp_shift);
 
 	BUG_ON(ioc->ibase & ~iovp_mask);
 	shift = ioc->ibase >> iovp_shift;
diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index c01ccbf8afdd42..cbc2e62db597cf 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -236,11 +236,7 @@ static unsigned long iommu_range_alloc(struct device *dev,
 		}
 	}
 
-	/* 4GB boundary for iseries_hv_alloc and iseries_hv_map */
-	boundary_size = dev ? dma_get_seg_boundary(dev) : U32_MAX;
-
-	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
-	boundary_size = (boundary_size >> tbl->it_page_shift) + 1;
+	boundary_size = dma_get_seg_boundary_nr_pages(dev, tbl->it_page_shift);
 
 	n = iommu_area_alloc(tbl->it_map, limit, start, npages, tbl->it_offset,
 			     boundary_size, align_mask);
diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c
index ecb067acc6d532..4a37d8f4de9d9d 100644
--- a/arch/s390/pci/pci_dma.c
+++ b/arch/s390/pci/pci_dma.c
@@ -261,13 +261,11 @@ static unsigned long __dma_alloc_iommu(struct device *dev,
 				       unsigned long start, int size)
 {
 	struct zpci_dev *zdev = to_zpci(to_pci_dev(dev));
-	unsigned long boundary_size;
 
-	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
-	boundary_size = (dma_get_seg_boundary(dev) >> PAGE_SHIFT) + 1;
 	return iommu_area_alloc(zdev->iommu_bitmap, zdev->iommu_pages,
 				start, size, zdev->start_dma >> PAGE_SHIFT,
-				boundary_size, 0);
+				dma_get_seg_boundary_nr_pages(dev, PAGE_SHIFT),
+				0);
 }
 
 static dma_addr_t dma_alloc_address(struct device *dev, int size)
diff --git a/arch/sparc/kernel/iommu-common.c b/arch/sparc/kernel/iommu-common.c
index 843e71894d0482..e6139c99762e11 100644
--- a/arch/sparc/kernel/iommu-common.c
+++ b/arch/sparc/kernel/iommu-common.c
@@ -166,10 +166,6 @@ unsigned long iommu_tbl_range_alloc(struct device *dev,
 		}
 	}
 
-	boundary_size = dev ? dma_get_seg_boundary(dev) : U32_MAX;
-
-	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
-	boundary_size = (boundary_size >> iommu->table_shift) + 1;
 	/*
 	 * if the skip_span_boundary_check had been set during init, we set
 	 * things up so that iommu_is_span_boundary() merely checks if the
@@ -178,7 +174,11 @@ unsigned long iommu_tbl_range_alloc(struct device *dev,
 	if ((iommu->flags & IOMMU_NO_SPAN_BOUND) != 0) {
 		shift = 0;
 		boundary_size = iommu->poolsize * iommu->nr_pools;
+	} else {
+		boundary_size = dma_get_seg_boundary_nr_pages(dev,
+					iommu->table_shift);
 	}
+
 	n = iommu_area_alloc(iommu->map, limit, start, npages, shift,
 			     boundary_size, align_mask);
 	if (n == -1) {
diff --git a/arch/sparc/kernel/iommu.c b/arch/sparc/kernel/iommu.c
index d981c37305ae31..c3e4e2df26a8b8 100644
--- a/arch/sparc/kernel/iommu.c
+++ b/arch/sparc/kernel/iommu.c
@@ -472,8 +472,7 @@ static int dma_4u_map_sg(struct device *dev, struct scatterlist *sglist,
 	outs->dma_length = 0;
 
 	max_seg_size = dma_get_max_seg_size(dev);
-	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
-	seg_boundary_size = (dma_get_seg_boundary(dev) >> IO_PAGE_SHIFT) + 1;
+	seg_boundary_size = dma_get_seg_boundary_nr_pages(dev, IO_PAGE_SHIFT);
 	base_shift = iommu->tbl.table_map_base >> IO_PAGE_SHIFT;
 	for_each_sg(sglist, s, nelems, i) {
 		unsigned long paddr, npages, entry, out_entry = 0, slen;
diff --git a/arch/sparc/kernel/pci_sun4v.c b/arch/sparc/kernel/pci_sun4v.c
index 233fe8a20cec33..6b92dd51c0026f 100644
--- a/arch/sparc/kernel/pci_sun4v.c
+++ b/arch/sparc/kernel/pci_sun4v.c
@@ -508,8 +508,7 @@ static int dma_4v_map_sg(struct device *dev, struct scatterlist *sglist,
 	iommu_batch_start(dev, prot, ~0UL);
 
 	max_seg_size = dma_get_max_seg_size(dev);
-	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
-	seg_boundary_size = (dma_get_seg_boundary(dev) >> IO_PAGE_SHIFT) + 1;
+	seg_boundary_size = dma_get_seg_boundary_nr_pages(dev, IO_PAGE_SHIFT);
 
 	mask = *dev->dma_mask;
 	if (!iommu_use_atu(iommu, mask))
diff --git a/arch/x86/kernel/amd_gart_64.c b/arch/x86/kernel/amd_gart_64.c
index 7fa0bb490065a1..bccc5357bffd6c 100644
--- a/arch/x86/kernel/amd_gart_64.c
+++ b/arch/x86/kernel/amd_gart_64.c
@@ -96,8 +96,7 @@ static unsigned long alloc_iommu(struct device *dev, int size,
 
 	base_index = ALIGN(iommu_bus_base & dma_get_seg_boundary(dev),
 			   PAGE_SIZE) >> PAGE_SHIFT;
-	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
-	boundary_size = (dma_get_seg_boundary(dev) >> PAGE_SHIFT) + 1;
+	boundary_size = dma_get_seg_boundary_nr_pages(dev, PAGE_SHIFT);
 
 	spin_lock_irqsave(&iommu_bitmap_lock, flags);
 	offset = iommu_area_alloc(iommu_gart_bitmap, iommu_pages, next_bit,
diff --git a/drivers/parisc/ccio-dma.c b/drivers/parisc/ccio-dma.c
index c667d6aba7646e..ba16b7f8f80612 100644
--- a/drivers/parisc/ccio-dma.c
+++ b/drivers/parisc/ccio-dma.c
@@ -356,8 +356,7 @@ ccio_alloc_range(struct ioc *ioc, struct device *dev, size_t size)
 	** ggg sacrifices another 710 to the computer gods.
 	*/
 
-	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
-	boundary_size = (dma_get_seg_boundary(dev) >> IOVP_SHIFT) + 1;
+	boundary_size = dma_get_seg_boundary_nr_pages(dev, IOVP_SHIFT);
 
 	if (pages_needed <= 8) {
 		/*
diff --git a/drivers/parisc/sba_iommu.c b/drivers/parisc/sba_iommu.c
index 96bc2c617cbd19..959bda193b9603 100644
--- a/drivers/parisc/sba_iommu.c
+++ b/drivers/parisc/sba_iommu.c
@@ -342,8 +342,7 @@ sba_search_bitmap(struct ioc *ioc, struct device *dev,
 	unsigned long shift;
 	int ret;
 
-	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
-	boundary_size = (dma_get_seg_boundary(dev) >> IOVP_SHIFT) + 1;
+	boundary_size = dma_get_seg_boundary_nr_pages(dev, IOVP_SHIFT);
 
 #if defined(ZX1_SUPPORT)
 	BUG_ON(ioc->ibase & ~IOVP_MASK);
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 52635e91143b25..7477a164500adb 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -632,6 +632,25 @@ static inline unsigned long dma_get_seg_boundary(struct device *dev)
 	return DMA_BIT_MASK(32);
 }
 
+/**
+ * dma_get_seg_boundary_nr_pages - return the segment boundary in "page" units
+ * @dev: device to query the boundary for
+ * @page_shift: ilog2() of the IOMMU page size
+ *
+ * Return the segment boundary in IOMMU page units (which may be different from
+ * the CPU page size) for the passed in device.
+ *
+ * If @dev is NULL a boundary of U32_MAX is assumed, this case is just for
+ * non-DMA API callers.
+ */
+static inline unsigned long dma_get_seg_boundary_nr_pages(struct device *dev,
+		unsigned int page_shift)
+{
+	if (!dev)
+		return (U32_MAX >> page_shift) + 1;
+	return (dma_get_seg_boundary(dev) >> page_shift) + 1;
+}
+
 static inline int dma_set_seg_boundary(struct device *dev, unsigned long mask)
 {
 	if (dev->dma_parms) {

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [RESEND][PATCH 0/7] Avoid overflow at boundary_size
  2020-09-01  7:36 ` [RESEND][PATCH 0/7] " Christoph Hellwig
@ 2020-09-01  7:54   ` Nicolin Chen
  2020-09-01  9:11     ` Christoph Hellwig
  0 siblings, 1 reply; 13+ messages in thread
From: Nicolin Chen @ 2020-09-01  7:54 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: mpe, benh, paulus, rth, ink, mattst88, tony.luck, fenghua.yu,
	schnelle, gerald.schaefer, hca, gor, borntraeger, davem, tglx,
	mingo, bp, x86, hpa, James.Bottomley, deller, sfr, linuxppc-dev,
	linux-kernel, linux-alpha, linux-ia64, linux-s390, sparclinux,
	linux-parisc

Hi Christoph,

On Tue, Sep 01, 2020 at 09:36:23AM +0200, Christoph Hellwig wrote:
> I really don't like all the open coded smarts in the various drivers.
> What do you think about a helper like the one in the untested patch

A helper function will indeed be better. I was thinking of
one, but wasn't very sure about the naming or where to put it.

> below (on top of your series).  Also please include the original
> segment boundary patch with the next resend so that the series has
> the full context.

I will use your change instead and resend with the ULONG_MAX
change. But in that case, should I make separate changes for
different files like this series, or just one single change
like yours?

I'm asking because I was expecting those changes to be applied
by different maintainers. But now it feels like you will merge
them into your tree all at once?

Thanks
Nic

> diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c
> index 1ef2c647bd3ec2..6f7de4f4e191e7 100644
> --- a/arch/alpha/kernel/pci_iommu.c
> +++ b/arch/alpha/kernel/pci_iommu.c
> @@ -141,10 +141,7 @@ iommu_arena_find_pages(struct device *dev, struct pci_iommu_arena *arena,
>  	unsigned long boundary_size;
>  
>  	base = arena->dma_base >> PAGE_SHIFT;
> -
> -	boundary_size = dev ? dma_get_seg_boundary(dev) : U32_MAX;
> -	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
> -	boundary_size = (boundary_size >> PAGE_SHIFT) + 1;
> +	boundary_size = dma_get_seg_boundary_nr_pages(dev, PAGE_SHIFT);
>  
>  	/* Search forward for the first mask-aligned sequence of N free ptes */
>  	ptes = arena->ptes;
> diff --git a/arch/ia64/hp/common/sba_iommu.c b/arch/ia64/hp/common/sba_iommu.c
> index 945954903bb0ba..b49b73a95067d2 100644
> --- a/arch/ia64/hp/common/sba_iommu.c
> +++ b/arch/ia64/hp/common/sba_iommu.c
> @@ -485,8 +485,7 @@ sba_search_bitmap(struct ioc *ioc, struct device *dev,
>  	ASSERT(((unsigned long) ioc->res_hint & (sizeof(unsigned long) - 1UL)) == 0);
>  	ASSERT(res_ptr < res_end);
>  
> -	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
> -	boundary_size = (dma_get_seg_boundary(dev) >> iovp_shift) + 1;
> +	boundary_size = dma_get_seg_boundary_nr_pages(dev, iovp_shift);
>  
>  	BUG_ON(ioc->ibase & ~iovp_mask);
>  	shift = ioc->ibase >> iovp_shift;
> diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
> index c01ccbf8afdd42..cbc2e62db597cf 100644
> --- a/arch/powerpc/kernel/iommu.c
> +++ b/arch/powerpc/kernel/iommu.c
> @@ -236,11 +236,7 @@ static unsigned long iommu_range_alloc(struct device *dev,
>  		}
>  	}
>  
> -	/* 4GB boundary for iseries_hv_alloc and iseries_hv_map */
> -	boundary_size = dev ? dma_get_seg_boundary(dev) : U32_MAX;
> -
> -	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
> -	boundary_size = (boundary_size >> tbl->it_page_shift) + 1;
> +	boundary_size = dma_get_seg_boundary_nr_pages(dev, tbl->it_page_shift);
>  
>  	n = iommu_area_alloc(tbl->it_map, limit, start, npages, tbl->it_offset,
>  			     boundary_size, align_mask);
> diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c
> index ecb067acc6d532..4a37d8f4de9d9d 100644
> --- a/arch/s390/pci/pci_dma.c
> +++ b/arch/s390/pci/pci_dma.c
> @@ -261,13 +261,11 @@ static unsigned long __dma_alloc_iommu(struct device *dev,
>  				       unsigned long start, int size)
>  {
>  	struct zpci_dev *zdev = to_zpci(to_pci_dev(dev));
> -	unsigned long boundary_size;
>  
> -	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
> -	boundary_size = (dma_get_seg_boundary(dev) >> PAGE_SHIFT) + 1;
>  	return iommu_area_alloc(zdev->iommu_bitmap, zdev->iommu_pages,
>  				start, size, zdev->start_dma >> PAGE_SHIFT,
> -				boundary_size, 0);
> +				dma_get_seg_boundary_nr_pages(dev, PAGE_SHIFT),
> +				0);
>  }
>  
>  static dma_addr_t dma_alloc_address(struct device *dev, int size)
> diff --git a/arch/sparc/kernel/iommu-common.c b/arch/sparc/kernel/iommu-common.c
> index 843e71894d0482..e6139c99762e11 100644
> --- a/arch/sparc/kernel/iommu-common.c
> +++ b/arch/sparc/kernel/iommu-common.c
> @@ -166,10 +166,6 @@ unsigned long iommu_tbl_range_alloc(struct device *dev,
>  		}
>  	}
>  
> -	boundary_size = dev ? dma_get_seg_boundary(dev) : U32_MAX;
> -
> -	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
> -	boundary_size = (boundary_size >> iommu->table_shift) + 1;
>  	/*
>  	 * if the skip_span_boundary_check had been set during init, we set
>  	 * things up so that iommu_is_span_boundary() merely checks if the
> @@ -178,7 +174,11 @@ unsigned long iommu_tbl_range_alloc(struct device *dev,
>  	if ((iommu->flags & IOMMU_NO_SPAN_BOUND) != 0) {
>  		shift = 0;
>  		boundary_size = iommu->poolsize * iommu->nr_pools;
> +	} else {
> +		boundary_size = dma_get_seg_boundary_nr_pages(dev,
> +					iommu->table_shift);
>  	}
> +
>  	n = iommu_area_alloc(iommu->map, limit, start, npages, shift,
>  			     boundary_size, align_mask);
>  	if (n == -1) {
> diff --git a/arch/sparc/kernel/iommu.c b/arch/sparc/kernel/iommu.c
> index d981c37305ae31..c3e4e2df26a8b8 100644
> --- a/arch/sparc/kernel/iommu.c
> +++ b/arch/sparc/kernel/iommu.c
> @@ -472,8 +472,7 @@ static int dma_4u_map_sg(struct device *dev, struct scatterlist *sglist,
>  	outs->dma_length = 0;
>  
>  	max_seg_size = dma_get_max_seg_size(dev);
> -	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
> -	seg_boundary_size = (dma_get_seg_boundary(dev) >> IO_PAGE_SHIFT) + 1;
> +	seg_boundary_size = dma_get_seg_boundary_nr_pages(dev, IO_PAGE_SHIFT);
>  	base_shift = iommu->tbl.table_map_base >> IO_PAGE_SHIFT;
>  	for_each_sg(sglist, s, nelems, i) {
>  		unsigned long paddr, npages, entry, out_entry = 0, slen;
> diff --git a/arch/sparc/kernel/pci_sun4v.c b/arch/sparc/kernel/pci_sun4v.c
> index 233fe8a20cec33..6b92dd51c0026f 100644
> --- a/arch/sparc/kernel/pci_sun4v.c
> +++ b/arch/sparc/kernel/pci_sun4v.c
> @@ -508,8 +508,7 @@ static int dma_4v_map_sg(struct device *dev, struct scatterlist *sglist,
>  	iommu_batch_start(dev, prot, ~0UL);
>  
>  	max_seg_size = dma_get_max_seg_size(dev);
> -	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
> -	seg_boundary_size = (dma_get_seg_boundary(dev) >> IO_PAGE_SHIFT) + 1;
> +	seg_boundary_size = dma_get_seg_boundary_nr_pages(dev, IO_PAGE_SHIFT);
>  
>  	mask = *dev->dma_mask;
>  	if (!iommu_use_atu(iommu, mask))
> diff --git a/arch/x86/kernel/amd_gart_64.c b/arch/x86/kernel/amd_gart_64.c
> index 7fa0bb490065a1..bccc5357bffd6c 100644
> --- a/arch/x86/kernel/amd_gart_64.c
> +++ b/arch/x86/kernel/amd_gart_64.c
> @@ -96,8 +96,7 @@ static unsigned long alloc_iommu(struct device *dev, int size,
>  
>  	base_index = ALIGN(iommu_bus_base & dma_get_seg_boundary(dev),
>  			   PAGE_SIZE) >> PAGE_SHIFT;
> -	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
> -	boundary_size = (dma_get_seg_boundary(dev) >> PAGE_SHIFT) + 1;
> +	boundary_size = dma_get_seg_boundary_nr_pages(dev, PAGE_SHIFT);
>  
>  	spin_lock_irqsave(&iommu_bitmap_lock, flags);
>  	offset = iommu_area_alloc(iommu_gart_bitmap, iommu_pages, next_bit,
> diff --git a/drivers/parisc/ccio-dma.c b/drivers/parisc/ccio-dma.c
> index c667d6aba7646e..ba16b7f8f80612 100644
> --- a/drivers/parisc/ccio-dma.c
> +++ b/drivers/parisc/ccio-dma.c
> @@ -356,8 +356,7 @@ ccio_alloc_range(struct ioc *ioc, struct device *dev, size_t size)
>  	** ggg sacrifices another 710 to the computer gods.
>  	*/
>  
> -	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
> -	boundary_size = (dma_get_seg_boundary(dev) >> IOVP_SHIFT) + 1;
> +	boundary_size = dma_get_seg_boundary_nr_pages(dev, IOVP_SHIFT);
>  
>  	if (pages_needed <= 8) {
>  		/*
> diff --git a/drivers/parisc/sba_iommu.c b/drivers/parisc/sba_iommu.c
> index 96bc2c617cbd19..959bda193b9603 100644
> --- a/drivers/parisc/sba_iommu.c
> +++ b/drivers/parisc/sba_iommu.c
> @@ -342,8 +342,7 @@ sba_search_bitmap(struct ioc *ioc, struct device *dev,
>  	unsigned long shift;
>  	int ret;
>  
> -	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
> -	boundary_size = (dma_get_seg_boundary(dev) >> IOVP_SHIFT) + 1;
> +	boundary_size = dma_get_seg_boundary_nr_pages(dev, IOVP_SHIFT);
>  
>  #if defined(ZX1_SUPPORT)
>  	BUG_ON(ioc->ibase & ~IOVP_MASK);
> diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
> index 52635e91143b25..7477a164500adb 100644
> --- a/include/linux/dma-mapping.h
> +++ b/include/linux/dma-mapping.h
> @@ -632,6 +632,25 @@ static inline unsigned long dma_get_seg_boundary(struct device *dev)
>  	return DMA_BIT_MASK(32);
>  }
>  
> +/**
> + * dma_get_seg_boundary_nr_pages - return the segment boundary in "page" units
> + * @dev: device to query the boundary for
> + * @page_shift: ilog2() of the IOMMU page size
> + *
> + * Return the segment boundary in IOMMU page units (which may be different from
> + * the CPU page size) for the passed in device.
> + *
> + * If @dev is NULL, a boundary of U32_MAX is assumed; this case exists only
> + * for non-DMA API callers.
> + */
> +static inline unsigned long dma_get_seg_boundary_nr_pages(struct device *dev,
> +		unsigned int page_shift)
> +{
> +	if (!dev)
> +		return (U32_MAX >> page_shift) + 1;
> +	return (dma_get_seg_boundary(dev) >> page_shift) + 1;
> +}
> +
>  static inline int dma_set_seg_boundary(struct device *dev, unsigned long mask)
>  {
>  	if (dev->dma_parms) {


* Re: [RESEND][PATCH 0/7] Avoid overflow at boundary_size
  2020-09-01  7:54   ` Nicolin Chen
@ 2020-09-01  9:11     ` Christoph Hellwig
  0 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2020-09-01  9:11 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: Christoph Hellwig, mpe, benh, paulus, rth, ink, mattst88,
	tony.luck, fenghua.yu, schnelle, gerald.schaefer, hca, gor,
	borntraeger, davem, tglx, mingo, bp, x86, hpa, James.Bottomley,
	deller, sfr, linuxppc-dev, linux-kernel, linux-alpha, linux-ia64,
	linux-s390, sparclinux, linux-parisc

On Tue, Sep 01, 2020 at 12:54:01AM -0700, Nicolin Chen wrote:
> Hi Christoph,
> 
> On Tue, Sep 01, 2020 at 09:36:23AM +0200, Christoph Hellwig wrote:
> > I really don't like all the open coded smarts in the various drivers.
> > What do you think about a helper like the one in the untested patch
> 
> A helper function will indeed be better. I was thinking of
> one, but wasn't very sure about the naming or where to put it.
> 
> > below (on top of your series).  Also please include the original
> > segment boundary patch with the next resend so that the series has
> > the full context.
> 
> I will use your change instead and resend with the ULONG_MAX
> change. But in that case, should I make separate changes for
> different files like this series, or just one single change
> like yours?
> 
> I'm asking because I was expecting those changes to be applied
> by different maintainers. But now it feels like you will merge
> them into your tree all at once?

I guess one patch is fine.  I can queue it up in the dma-mapping
tree as a prep patch for the default boundary change.


* Re: [RESEND][PATCH 1/7] powerpc/iommu: Avoid overflow at boundary_size
  2020-08-31 20:38 ` [RESEND][PATCH 1/7] powerpc/iommu: " Nicolin Chen
@ 2020-09-01 13:27   ` Michael Ellerman
  2020-09-01 20:53     ` Nicolin Chen
  0 siblings, 1 reply; 13+ messages in thread
From: Michael Ellerman @ 2020-09-01 13:27 UTC (permalink / raw)
  To: Nicolin Chen, benh, paulus, rth, ink, mattst88, tony.luck,
	fenghua.yu, schnelle, gerald.schaefer, hca, gor, borntraeger,
	davem, tglx, mingo, bp, x86, hpa, James.Bottomley, deller
  Cc: sfr, hch, linuxppc-dev, linux-kernel, linux-alpha, linux-ia64,
	linux-s390, sparclinux, linux-parisc

Nicolin Chen <nicoleotsuka@gmail.com> writes:
> The boundary_size might be as large as ULONG_MAX, which means
> that a device has no specific boundary limit. So either "+ 1"
> or passing it to ALIGN() would potentially overflow.
>
> According to kernel defines:
>     #define ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask))
>     #define ALIGN(x, a)	ALIGN_MASK(x, (typeof(x))(a) - 1)
>
> We can simplify the logic here:
>   ALIGN(boundary + 1, 1 << shift) >> shift
> = ALIGN_MASK(b + 1, (1 << s) - 1) >> s
> = {[b + 1 + (1 << s) - 1] & ~[(1 << s) - 1]} >> s
> = [b + 1 + (1 << s) - 1] >> s
> = [b + (1 << s)] >> s
> = (b >> s) + 1
>
> So fix the potential overflow with this safer shortcut.
>
> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
> Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
> Cc: Christoph Hellwig <hch@lst.de>
> ---
>  arch/powerpc/kernel/iommu.c | 11 +++++------
>  1 file changed, 5 insertions(+), 6 deletions(-)

Are you asking for acks, or for maintainers to merge the patches
individually?

> diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
> index 9704f3f76e63..c01ccbf8afdd 100644
> --- a/arch/powerpc/kernel/iommu.c
> +++ b/arch/powerpc/kernel/iommu.c
> @@ -236,15 +236,14 @@ static unsigned long iommu_range_alloc(struct device *dev,
>  		}
>  	}
>  
> -	if (dev)
> -		boundary_size = ALIGN(dma_get_seg_boundary(dev) + 1,
> -				      1 << tbl->it_page_shift);
> -	else
> -		boundary_size = ALIGN(1UL << 32, 1 << tbl->it_page_shift);
>  	/* 4GB boundary for iseries_hv_alloc and iseries_hv_map */
> +	boundary_size = dev ? dma_get_seg_boundary(dev) : U32_MAX;

Is there any path that passes a NULL dev anymore?

Both iseries_hv_alloc() and iseries_hv_map() were removed years ago.
See:
  8ee3e0d69623 ("powerpc: Remove the main legacy iSerie platform code")


So maybe we should do a lead-up patch that drops the NULL dev support,
which will then make this patch simpler.

cheers


> +	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
> +	boundary_size = (boundary_size >> tbl->it_page_shift) + 1;
>  
>  	n = iommu_area_alloc(tbl->it_map, limit, start, npages, tbl->it_offset,
> -			     boundary_size >> tbl->it_page_shift, align_mask);
> +			     boundary_size, align_mask);
>  	if (n == -1) {
>  		if (likely(pass == 0)) {
>  			/* First try the pool from the start */
> -- 
> 2.17.1


* Re: [RESEND][PATCH 1/7] powerpc/iommu: Avoid overflow at boundary_size
  2020-09-01 13:27   ` Michael Ellerman
@ 2020-09-01 20:53     ` Nicolin Chen
  0 siblings, 0 replies; 13+ messages in thread
From: Nicolin Chen @ 2020-09-01 20:53 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: benh, paulus, rth, ink, mattst88, tony.luck, fenghua.yu,
	schnelle, gerald.schaefer, hca, gor, borntraeger, davem, tglx,
	mingo, bp, x86, hpa, James.Bottomley, deller, sfr, hch,
	linuxppc-dev, linux-kernel, linux-alpha, linux-ia64, linux-s390,
	sparclinux, linux-parisc

On Tue, Sep 01, 2020 at 11:27:36PM +1000, Michael Ellerman wrote:
> Nicolin Chen <nicoleotsuka@gmail.com> writes:
> > The boundary_size might be as large as ULONG_MAX, which means
> > that a device has no specific boundary limit. So either "+ 1"
> > or passing it to ALIGN() would potentially overflow.
> >
> > According to kernel defines:
> >     #define ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask))
> >     #define ALIGN(x, a)	ALIGN_MASK(x, (typeof(x))(a) - 1)
> >
> > We can simplify the logic here:
> >   ALIGN(boundary + 1, 1 << shift) >> shift
> > = ALIGN_MASK(b + 1, (1 << s) - 1) >> s
> > = {[b + 1 + (1 << s) - 1] & ~[(1 << s) - 1]} >> s
> > = [b + 1 + (1 << s) - 1] >> s
> > = [b + (1 << s)] >> s
> > = (b >> s) + 1
> >
> > So fix the potential overflow with this safer shortcut.
> >
> > Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
> > Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
> > Cc: Christoph Hellwig <hch@lst.de>
> > ---
> >  arch/powerpc/kernel/iommu.c | 11 +++++------
> >  1 file changed, 5 insertions(+), 6 deletions(-)
> 
> Are you asking for acks, or for maintainers to merge the patches
> individually?

I was expecting that, but Christoph just suggested that I squash
them into one so he can merge it: https://lkml.org/lkml/2020/9/1/159

Though I feel it'd be nice to get maintainers' acks before merging.

> > diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
> > index 9704f3f76e63..c01ccbf8afdd 100644
> > --- a/arch/powerpc/kernel/iommu.c
> > +++ b/arch/powerpc/kernel/iommu.c
> > @@ -236,15 +236,14 @@ static unsigned long iommu_range_alloc(struct device *dev,
> >  		}
> >  	}
> >  
> > -	if (dev)
> > -		boundary_size = ALIGN(dma_get_seg_boundary(dev) + 1,
> > -				      1 << tbl->it_page_shift);
> > -	else
> > -		boundary_size = ALIGN(1UL << 32, 1 << tbl->it_page_shift);
> >  	/* 4GB boundary for iseries_hv_alloc and iseries_hv_map */
> > +	boundary_size = dev ? dma_get_seg_boundary(dev) : U32_MAX;
> 
> Is there any path that passes a NULL dev anymore?
> 
> Both iseries_hv_alloc() and iseries_hv_map() were removed years ago.
> See:
>   8ee3e0d69623 ("powerpc: Remove the main legacy iSerie platform code")
> 
> 
> So maybe we should do a lead-up patch that drops the NULL dev support,
> which will then make this patch simpler.

The next version of this change will follow Christoph's suggestion
by having a helper function that takes care of !dev internally.

Thanks
Nic

> 
> 
> > +	/* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
> > +	boundary_size = (boundary_size >> tbl->it_page_shift) + 1;
> >  
> >  	n = iommu_area_alloc(tbl->it_map, limit, start, npages, tbl->it_offset,
> > -			     boundary_size >> tbl->it_page_shift, align_mask);
> > +			     boundary_size, align_mask);
> >  	if (n == -1) {
> >  		if (likely(pass == 0)) {
> >  			/* First try the pool from the start */
> > -- 
> > 2.17.1


