linux-kernel.vger.kernel.org archive mirror
* [PATCH v3] dma: Fix max PFN arithmetic overflow on 32 bit systems
@ 2020-05-26 17:57 Alexander Dahl
  2020-05-27  9:06 ` Greg KH
  2020-05-28 18:24 ` [tip: x86/urgent] x86/dma: " tip-bot2 for Alexander Dahl
  0 siblings, 2 replies; 3+ messages in thread
From: Alexander Dahl @ 2020-05-26 17:57 UTC
  To: x86
  Cc: iommu, linux-kernel, Alan Jenkins, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H . Peter Anvin, Robin Murphy, Florian Wolters,
	Alexander Dahl, stable

The intermediate result of the old term (4UL * 1024 * 1024 * 1024) is
4 294 967 296 or 0x100000000 which is no problem on 64 bit systems.  The
patch does not change the later overall result of 0x100000 for
MAX_DMA32_PFN.  The new calculation yields the same result, but does not
require 64 bit arithmetic.

On 32 bit systems the old calculation suffers from an arithmetic
overflow in that intermediate term in parentheses: 4UL, i.e. unsigned
long int, is 4 bytes wide there, so 0x100000000 does not fit and the
intermediate result wraps around to zero.  The following right shift
does not alter that, so MAX_DMA32_PFN evaluates to 0 on 32 bit systems.
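
The arithmetic above can be replayed in user space.  The following is
only a sketch, not kernel code: uint32_t and uint64_t stand in for a 32
bit and a 64 bit wide unsigned long, PAGE_SHIFT is assumed to be 12
(4 KiB pages), and all function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Old formula with a 32 bit wide unsigned long, modelled as uint32_t. */
static uint32_t old_max_dma32_pfn_32bit(void)
{
	uint32_t bytes = 4;

	bytes *= 1024;	/* 0x00001000 */
	bytes *= 1024;	/* 0x00400000 */
	bytes *= 1024;	/* 0x100000000 does not fit, wraps around to 0 */
	return bytes >> PAGE_SHIFT;
}

/* Old formula with a 64 bit wide unsigned long, modelled as uint64_t. */
static uint64_t old_max_dma32_pfn_64bit(void)
{
	return ((uint64_t)4 * 1024 * 1024 * 1024) >> PAGE_SHIFT;
}

/* New formula: the value never exceeds 1 << 20, so 32 bits suffice. */
static uint32_t new_max_dma32_pfn(void)
{
	return (uint32_t)1 << (32 - PAGE_SHIFT);
}

/* Hypothetical self-check comparing the three variants. */
static void check_max_dma32_pfn(void)
{
	assert(old_max_dma32_pfn_32bit() == 0);
	assert(old_max_dma32_pfn_64bit() == 0x100000);
	assert(new_max_dma32_pfn() == 0x100000);
}
```

Calling check_max_dma32_pfn() passes all three assertions: only the old
formula evaluated in 32 bit arithmetic collapses to 0.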

That wrong value is a problem in a comparison against MAX_DMA32_PFN in
the init code for swiotlb in 'pci_swiotlb_detect_4gb()', which decides
whether swiotlb should be active.  With MAX_DMA32_PFN being 0, that
comparison yields the opposite result when compiling for 32 bit systems.
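
To illustrate how a limit of 0 flips the decision, here is a simplified
model.  It is an assumption about the shape of the check (highest page
frame number greater than the limit), not a copy of the kernel function,
and the helper names and the 256 MiB example machine are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/*
 * Hypothetical, simplified model of the decision: bounce buffers are
 * wanted if memory reaches beyond the 32 bit DMA limit.
 */
static bool swiotlb_wanted(uint32_t max_pfn, uint32_t max_dma32_pfn)
{
	return max_pfn > max_dma32_pfn;
}

/* Hypothetical self-check for a machine with 256 MiB of RAM. */
static void check_swiotlb_decision(void)
{
	uint32_t max_pfn = (256u * 1024 * 1024) >> PAGE_SHIFT; /* 0x10000 */

	/* Overflowed limit of 0: any memory at all exceeds it. */
	assert(swiotlb_wanted(max_pfn, 0));

	/* Correct limit of 0x100000: 256 MiB stays far below 4 GiB. */
	assert(!swiotlb_wanted(max_pfn, 0x100000));
}
```

In this model the overflowed limit turns the answer from "not needed"
into "needed" for every machine with less than 4 GiB of memory.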

This was not possible before 1b7e03ef7570 ("x86, NUMA: Enable emulation
on 32bit too") when that MAX_DMA32_PFN was first made visible to x86_32
(and which landed in v3.0).

In practice this wasn't a problem, unless you activated CONFIG_SWIOTLB
on x86 (32 bit).

However, for ARCH=x86 (32 bit) with CONFIG_INTEL_IOMMU set, there has
been a dependency on CONFIG_SWIOTLB since c5a5dc4cbbf4 ("iommu/vt-d:
Don't switch off swiotlb if bounce page is used"), and CONFIG_SWIOTLB
was not necessarily active before.  That landed in v5.4, where we
noticed it in the fli4l Linux distribution.  We have CONFIG_INTEL_IOMMU
active on both 32 and 64 bit kernel configs there (I could not find out
why, so let's just say historical reasons).

The effect is that 64 MiB (the default size) are now allocated for
bounce buffers at boot time, which is a noticeable amount of memory on
small systems like the pcengines ALIX 2D3 with 256 MiB memory, which
are still frequently used as home routers.

We noticed this effect when migrating from kernel v4.19 (LTS) to v5.4
(LTS) in fli4l and got kernel messages like these, for example:

  Linux version 5.4.22 (buildroot@buildroot) (gcc version 7.3.0 (Buildroot 2018.02.8)) #1 SMP Mon Nov 26 23:40:00 CET 2018
  …
  Memory: 183484K/261756K available (4594K kernel code, 393K rwdata, 1660K rodata, 536K init, 456K bss , 78272K reserved, 0K cma-reserved, 0K highmem)
  …
  PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
  software IO TLB: mapped [mem 0x0bb78000-0x0fb78000] (64MB)
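
The size of the mapped range in that log can be checked by hand.  The
sketch below assumes the printed end address is exclusive, which matches
the "(64MB)" in the message; the helper names are made up for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Size in MiB of the physical range [start, end), end exclusive. */
static uint32_t range_mib(uint32_t start, uint32_t end)
{
	return (end - start) >> 20;
}

/* Share of total memory in percent, rounded down. */
static uint32_t percent_of(uint32_t part_mib, uint32_t total_mib)
{
	return part_mib * 100 / total_mib;
}

/* Hypothetical self-check using the addresses from the log above. */
static void check_bounce_buffer_size(void)
{
	/* 0x0fb78000 - 0x0bb78000 = 0x04000000 = 64 MiB */
	assert(range_mib(0x0bb78000u, 0x0fb78000u) == 64);

	/* 64 MiB out of 256 MiB is a quarter of the machine's RAM. */
	assert(percent_of(64, 256) == 25);
}
```

So on a 256 MiB system the bounce buffers alone take a quarter of the
installed memory.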

The initial analysis and the suggested fix were done by user
'sourcejedi' on Unix & Linux Stack Exchange and explicitly marked as
GPLv2 for inclusion in the Linux kernel:

  https://unix.stackexchange.com/a/520525/50007

The new calculation, which does not suffer from that overflow, is now
the same as the one used for arch/mips, as suggested by Robin Murphy.

The fix was tested by fli4l users on roughly two dozen different
systems, including both 32 and 64 bit archs, bare metal and virtualized
machines.

Fixes: 1b7e03ef7570 ("x86, NUMA: Enable emulation on 32bit too")
Fixes: https://web.nettworks.org/bugs/browse/FFL-2560
Fixes: https://unix.stackexchange.com/q/520065/50007
Reported-by: Alan Jenkins <alan.christopher.jenkins@gmail.com>
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Alexander Dahl <post@lespocky.de>
Cc: stable@vger.kernel.org
---

Notes:
    v3:
      - rewritten commit message to better explain that arithmetic overflow
        and added Fixes tag (Greg Kroah-Hartman)
      - rebased on v5.7-rc7
    
    v2:
      - use the same calculation as with arch/mips (Robin Murphy)

 arch/x86/include/asm/dma.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/dma.h b/arch/x86/include/asm/dma.h
index 00f7cf45e699..8e95aa4b0d17 100644
--- a/arch/x86/include/asm/dma.h
+++ b/arch/x86/include/asm/dma.h
@@ -74,7 +74,7 @@
 #define MAX_DMA_PFN   ((16UL * 1024 * 1024) >> PAGE_SHIFT)
 
 /* 4GB broken PCI/AGP hardware bus master zone */
-#define MAX_DMA32_PFN ((4UL * 1024 * 1024 * 1024) >> PAGE_SHIFT)
+#define MAX_DMA32_PFN (1UL << (32 - PAGE_SHIFT))
 
 #ifdef CONFIG_X86_32
 /* The maximum address that we can perform a DMA transfer to on this platform */
-- 
2.20.1



* Re: [PATCH v3] dma: Fix max PFN arithmetic overflow on 32 bit systems
  2020-05-26 17:57 [PATCH v3] dma: Fix max PFN arithmetic overflow on 32 bit systems Alexander Dahl
@ 2020-05-27  9:06 ` Greg KH
  2020-05-28 18:24 ` [tip: x86/urgent] x86/dma: " tip-bot2 for Alexander Dahl
  1 sibling, 0 replies; 3+ messages in thread
From: Greg KH @ 2020-05-27  9:06 UTC
  To: Alexander Dahl
  Cc: x86, iommu, linux-kernel, Alan Jenkins, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H . Peter Anvin, Robin Murphy,
	Florian Wolters, stable

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


* [tip: x86/urgent] x86/dma: Fix max PFN arithmetic overflow on 32 bit systems
  2020-05-26 17:57 [PATCH v3] dma: Fix max PFN arithmetic overflow on 32 bit systems Alexander Dahl
  2020-05-27  9:06 ` Greg KH
@ 2020-05-28 18:24 ` tip-bot2 for Alexander Dahl
  1 sibling, 0 replies; 3+ messages in thread
From: tip-bot2 for Alexander Dahl @ 2020-05-28 18:24 UTC
  To: linux-tip-commits
  Cc: Alan Jenkins, Robin Murphy, Alexander Dahl, Borislav Petkov,
	Greg Kroah-Hartman, stable, x86, LKML

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     88743470668ef5eb6b7ba9e0f99888e5999bf172
Gitweb:        https://git.kernel.org/tip/88743470668ef5eb6b7ba9e0f99888e5999bf172
Author:        Alexander Dahl <post@lespocky.de>
AuthorDate:    Tue, 26 May 2020 19:57:49 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Thu, 28 May 2020 20:21:32 +02:00

x86/dma: Fix max PFN arithmetic overflow on 32 bit systems

The intermediate result of the old term (4UL * 1024 * 1024 * 1024) is
4 294 967 296 or 0x100000000 which is no problem on 64 bit systems.
The patch does not change the later overall result of 0x100000 for
MAX_DMA32_PFN (after it has been shifted by PAGE_SHIFT). The new
calculation yields the same result, but does not require 64 bit
arithmetic.

On 32 bit systems the old calculation suffers from an arithmetic
overflow in that intermediate term in parentheses: 4UL, i.e. unsigned
long int, is 4 bytes wide there, so 0x100000000 does not fit and the
intermediate result wraps around to zero.  The following right shift
does not alter that, so MAX_DMA32_PFN evaluates to 0 on 32 bit systems.

That wrong value is a problem in a comparison against MAX_DMA32_PFN in
the init code for swiotlb in pci_swiotlb_detect_4gb(), which decides
whether swiotlb should be active.  With MAX_DMA32_PFN being 0, that
comparison yields the opposite result when compiling for 32 bit systems.

This was not possible before

  1b7e03ef7570 ("x86, NUMA: Enable emulation on 32bit too")

when that MAX_DMA32_PFN was first made visible to x86_32 (and which
landed in v3.0).

In practice this wasn't a problem, unless CONFIG_SWIOTLB is active on
x86-32.

However, if one has set CONFIG_INTEL_IOMMU, since

  c5a5dc4cbbf4 ("iommu/vt-d: Don't switch off swiotlb if bounce page is used")

there's a dependency on CONFIG_SWIOTLB, which was not necessarily
active before. That landed in v5.4, where we noticed it in the fli4l
Linux distribution. We have CONFIG_INTEL_IOMMU active on both 32 and 64
bit kernel configs there (I could not find out why, so let's just say
historical reasons).

The effect is that 64 MiB (the default size) are now allocated for
bounce buffers at boot time, which is a noticeable amount of memory on
small systems like the pcengines ALIX 2D3 with 256 MiB memory, which
are still frequently used as home routers.

We noticed this effect when migrating from kernel v4.19 (LTS) to v5.4
(LTS) in fli4l and got kernel messages like these, for example:

  Linux version 5.4.22 (buildroot@buildroot) (gcc version 7.3.0 (Buildroot 2018.02.8)) #1 SMP Mon Nov 26 23:40:00 CET 2018
  …
  Memory: 183484K/261756K available (4594K kernel code, 393K rwdata, 1660K rodata, 536K init, 456K bss , 78272K reserved, 0K cma-reserved, 0K highmem)
  …
  PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
  software IO TLB: mapped [mem 0x0bb78000-0x0fb78000] (64MB)

The initial analysis and the suggested fix were done by user
'sourcejedi' on Unix & Linux Stack Exchange and explicitly marked as
GPLv2 for inclusion in the Linux kernel:

  https://unix.stackexchange.com/a/520525/50007

The new calculation, which does not suffer from that overflow, is now
the same as the one used for arch/mips, as suggested by Robin Murphy.

The fix was tested by fli4l users on roughly two dozen different
systems, including both 32 and 64 bit archs, bare metal and virtualized
machines.

 [ bp: Massage commit message. ]

Fixes: 1b7e03ef7570 ("x86, NUMA: Enable emulation on 32bit too")
Reported-by: Alan Jenkins <alan.christopher.jenkins@gmail.com>
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Alexander Dahl <post@lespocky.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable@vger.kernel.org
Link: https://unix.stackexchange.com/q/520065/50007
Link: https://web.nettworks.org/bugs/browse/FFL-2560
Link: https://lkml.kernel.org/r/20200526175749.20742-1-post@lespocky.de
---
 arch/x86/include/asm/dma.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/dma.h b/arch/x86/include/asm/dma.h
index 00f7cf4..8e95aa4 100644
--- a/arch/x86/include/asm/dma.h
+++ b/arch/x86/include/asm/dma.h
@@ -74,7 +74,7 @@
 #define MAX_DMA_PFN   ((16UL * 1024 * 1024) >> PAGE_SHIFT)
 
 /* 4GB broken PCI/AGP hardware bus master zone */
-#define MAX_DMA32_PFN ((4UL * 1024 * 1024 * 1024) >> PAGE_SHIFT)
+#define MAX_DMA32_PFN (1UL << (32 - PAGE_SHIFT))
 
 #ifdef CONFIG_X86_32
 /* The maximum address that we can perform a DMA transfer to on this platform */
