IOMMU Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH 0/4] arm64: Default to 32-bit wide ZONE_DMA
@ 2020-10-01 16:17 Nicolas Saenz Julienne
  2020-10-01 16:17 ` [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711 Nicolas Saenz Julienne
                   ` (3 more replies)
  0 siblings, 4 replies; 31+ messages in thread
From: Nicolas Saenz Julienne @ 2020-10-01 16:17 UTC (permalink / raw)
  To: robh+dt, catalin.marinas, hch, robin.murphy, devicetree, iommu, linux-mm
  Cc: will, linux-kernel, linux-arm-kernel, linux-rpi-kernel

Using two distinct DMA zones turned out to be problematic. Here's an
attempt go back to a saner default.

I tested this on both a RPi4 and QEMU.

---

Nicolas Saenz Julienne (4):
  of/fdt: Update zone_dma_bits when running in bcm2711
  dma-direct: Turn zone_dma_bits default value into a define
  arm64: Default to 32-bit ZONE_DMA
  mm: Update DMA zones description with arm64 newer behavior

 arch/arm64/mm/init.c       | 12 ++++++++----
 drivers/of/fdt.c           | 10 ++++++++++
 include/linux/dma-direct.h |  1 +
 include/linux/mmzone.h     |  5 +++--
 kernel/dma/direct.c        |  2 +-
 5 files changed, 23 insertions(+), 7 deletions(-)

-- 
2.28.0

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711
  2020-10-01 16:17 [PATCH 0/4] arm64: Default to 32-bit wide ZONE_DMA Nicolas Saenz Julienne
@ 2020-10-01 16:17 ` Nicolas Saenz Julienne
  2020-10-01 17:15   ` Catalin Marinas
  2020-10-02  9:05   ` kernel test robot
  2020-10-01 16:17 ` [PATCH 2/4] dma-direct: Turn zone_dma_bits default value into a define Nicolas Saenz Julienne
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 31+ messages in thread
From: Nicolas Saenz Julienne @ 2020-10-01 16:17 UTC (permalink / raw)
  To: robh+dt, catalin.marinas, hch, robin.murphy, devicetree, iommu,
	linux-mm, Frank Rowand
  Cc: will, linux-kernel, linux-arm-kernel, linux-rpi-kernel

arm64 wants to be able to set ZONE_DMA's size depending on the specific
platform its being run on. Ideally this could be achieved in a smart way
by parsing all dma-ranges and calculating the smaller DMA constraint in
the system. Easier said than done. We compromised on a simpler solution
as the only platform interested in using this is the Raspberry Pi 4.

So update zone_dma_bits if the machine's compatible string matches
Raspberry Pi 4's, otherwise let arm64's mm code deal with it.

Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
---
 drivers/of/fdt.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index 4602e467ca8b..cd0d115ef329 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -25,6 +25,7 @@
 #include <linux/serial_core.h>
 #include <linux/sysfs.h>
 #include <linux/random.h>
+#include <linux/dma-direct.h>	/* for zone_dma_bits */
 
 #include <asm/setup.h>  /* for COMMAND_LINE_SIZE */
 #include <asm/page.h>
@@ -1198,6 +1199,14 @@ void __init early_init_dt_scan_nodes(void)
 	of_scan_flat_dt(early_init_dt_scan_memory, NULL);
 }
 
+void __init early_init_dt_update_zone_dma_bits(void)
+{
+	unsigned long dt_root = of_get_flat_dt_root();
+
+	if (of_flat_dt_is_compatible(dt_root, "brcm,bcm2711"))
+		zone_dma_bits = 30;
+}
+
 bool __init early_init_dt_scan(void *params)
 {
 	bool status;
@@ -1207,6 +1216,7 @@ bool __init early_init_dt_scan(void *params)
 		return false;
 
 	early_init_dt_scan_nodes();
+	early_init_dt_update_zone_dma_bits();
 	return true;
 }
 
-- 
2.28.0

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 2/4] dma-direct: Turn zone_dma_bits default value into a define
  2020-10-01 16:17 [PATCH 0/4] arm64: Default to 32-bit wide ZONE_DMA Nicolas Saenz Julienne
  2020-10-01 16:17 ` [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711 Nicolas Saenz Julienne
@ 2020-10-01 16:17 ` Nicolas Saenz Julienne
  2020-10-01 16:17 ` [PATCH 3/4] arm64: Default to 32-bit ZONE_DMA Nicolas Saenz Julienne
  2020-10-01 16:17 ` [PATCH 4/4] mm: Update DMA zones description Nicolas Saenz Julienne
  3 siblings, 0 replies; 31+ messages in thread
From: Nicolas Saenz Julienne @ 2020-10-01 16:17 UTC (permalink / raw)
  To: robh+dt, catalin.marinas, hch, robin.murphy, devicetree, iommu,
	linux-mm, Marek Szyprowski
  Cc: will, linux-kernel, linux-arm-kernel, linux-rpi-kernel

Other code might need to reference it.

Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
---
 include/linux/dma-direct.h | 1 +
 kernel/dma/direct.c        | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index 38ed3b55034d..ef19ff505d09 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -12,6 +12,7 @@
 #include <linux/mem_encrypt.h>
 #include <linux/swiotlb.h>
 
+#define ZONE_DMA_BITS_DEFAULT	24
 extern unsigned int zone_dma_bits;
 
 /*
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 121a9c1969dd..4b3cfa02532b 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -20,7 +20,7 @@
  * it for entirely different regions. In that case the arch code needs to
  * override the variable below for dma-direct to work properly.
  */
-unsigned int zone_dma_bits __ro_after_init = 24;
+unsigned int zone_dma_bits __ro_after_init = ZONE_DMA_BITS_DEFAULT;
 
 static inline dma_addr_t phys_to_dma_direct(struct device *dev,
 		phys_addr_t phys)
-- 
2.28.0

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 3/4] arm64: Default to 32-bit ZONE_DMA
  2020-10-01 16:17 [PATCH 0/4] arm64: Default to 32-bit wide ZONE_DMA Nicolas Saenz Julienne
  2020-10-01 16:17 ` [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711 Nicolas Saenz Julienne
  2020-10-01 16:17 ` [PATCH 2/4] dma-direct: Turn zone_dma_bits default value into a define Nicolas Saenz Julienne
@ 2020-10-01 16:17 ` Nicolas Saenz Julienne
  2020-10-01 17:19   ` Catalin Marinas
  2020-10-01 16:17 ` [PATCH 4/4] mm: Update DMA zones description Nicolas Saenz Julienne
  3 siblings, 1 reply; 31+ messages in thread
From: Nicolas Saenz Julienne @ 2020-10-01 16:17 UTC (permalink / raw)
  To: robh+dt, catalin.marinas, hch, robin.murphy, devicetree, iommu,
	linux-mm, linux-kernel
  Cc: will, linux-rpi-kernel, linux-arm-kernel

The Raspberry Pi 4 needs two DMA zones as some of its devices can only
DMA into the 30-bit physical address space. We solved that by creating
an extra ZONE_DMA covering the 30-bit. It turns out that creating extra
zones unnecessarily broke Kdump on large systems. So default to a single
32-bit wide ZONE_DMA and only define both zones if running on RPi4.

Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
---
 arch/arm64/mm/init.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index e1a69a618832..3c3f462466eb 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -43,8 +43,6 @@
 #include <asm/tlb.h>
 #include <asm/alternative.h>
 
-#define ARM64_ZONE_DMA_BITS	30
-
 /*
  * We need to be able to catch inadvertent references to memstart_addr
  * that occur (potentially in generic code) before arm64_memblock_init()
@@ -388,8 +386,14 @@ void __init arm64_memblock_init(void)
 	early_init_fdt_scan_reserved_mem();
 
 	if (IS_ENABLED(CONFIG_ZONE_DMA)) {
-		zone_dma_bits = ARM64_ZONE_DMA_BITS;
-		arm64_dma_phys_limit = max_zone_phys(ARM64_ZONE_DMA_BITS);
+		/*
+		 * early_init_dt_scan() might alter zone_dma_bits based on the
+		 * device's DT. Otherwise, have it cover the 32-bit address
+		 * space.
+		 */
+		if (zone_dma_bits == ZONE_DMA_BITS_DEFAULT)
+			zone_dma_bits = 32;
+		arm64_dma_phys_limit = max_zone_phys(zone_dma_bits);
 	}
 
 	if (IS_ENABLED(CONFIG_ZONE_DMA32))
-- 
2.28.0

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 4/4] mm: Update DMA zones description
  2020-10-01 16:17 [PATCH 0/4] arm64: Default to 32-bit wide ZONE_DMA Nicolas Saenz Julienne
                   ` (2 preceding siblings ...)
  2020-10-01 16:17 ` [PATCH 3/4] arm64: Default to 32-bit ZONE_DMA Nicolas Saenz Julienne
@ 2020-10-01 16:17 ` Nicolas Saenz Julienne
  2020-10-01 17:19   ` Catalin Marinas
  3 siblings, 1 reply; 31+ messages in thread
From: Nicolas Saenz Julienne @ 2020-10-01 16:17 UTC (permalink / raw)
  To: robh+dt, catalin.marinas, hch, robin.murphy, devicetree, iommu,
	linux-mm, Andrew Morton
  Cc: will, linux-kernel, linux-arm-kernel, linux-rpi-kernel

The default behavior for arm64 changed, so reflect that.

Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
---
 include/linux/mmzone.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index fb3bf696c05e..d28ce77ccc2a 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -363,8 +363,9 @@ enum zone_type {
 	 *  - arm only uses ZONE_DMA, the size, up to 4G, may vary depending on
 	 *    the specific device.
 	 *
-	 *  - arm64 has a fixed 1G ZONE_DMA and ZONE_DMA32 for the rest of the
-	 *    lower 4G.
+	 *  - arm64 uses a single 4GB ZONE_DMA, except on the Raspberry Pi 4,
+	 *    in which ZONE_DMA covers the first GB and ZONE_DMA32 the rest of
+	 *    the lower 4GB.
 	 *
 	 *  - powerpc only uses ZONE_DMA, the size, up to 2G, may vary
 	 *    depending on the specific device.
-- 
2.28.0

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711
  2020-10-01 16:17 ` [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711 Nicolas Saenz Julienne
@ 2020-10-01 17:15   ` Catalin Marinas
  2020-10-01 17:23     ` Catalin Marinas
  2020-10-02  9:05   ` kernel test robot
  1 sibling, 1 reply; 31+ messages in thread
From: Catalin Marinas @ 2020-10-01 17:15 UTC (permalink / raw)
  To: Nicolas Saenz Julienne
  Cc: devicetree, will, Frank Rowand, linux-kernel, linux-mm, iommu,
	robh+dt, linux-rpi-kernel, robin.murphy, hch, linux-arm-kernel

Hi Nicolas,

Thanks for putting this together.

On Thu, Oct 01, 2020 at 06:17:37PM +0200, Nicolas Saenz Julienne wrote:
> diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> index 4602e467ca8b..cd0d115ef329 100644
> --- a/drivers/of/fdt.c
> +++ b/drivers/of/fdt.c
> @@ -25,6 +25,7 @@
>  #include <linux/serial_core.h>
>  #include <linux/sysfs.h>
>  #include <linux/random.h>
> +#include <linux/dma-direct.h>	/* for zone_dma_bits */
>  
>  #include <asm/setup.h>  /* for COMMAND_LINE_SIZE */
>  #include <asm/page.h>
> @@ -1198,6 +1199,14 @@ void __init early_init_dt_scan_nodes(void)
>  	of_scan_flat_dt(early_init_dt_scan_memory, NULL);
>  }
>  
> +void __init early_init_dt_update_zone_dma_bits(void)
> +{
> +	unsigned long dt_root = of_get_flat_dt_root();
> +
> +	if (of_flat_dt_is_compatible(dt_root, "brcm,bcm2711"))
> +		zone_dma_bits = 30;
> +}

I think we could keep this entirely in the arm64 setup_machine_fdt() and
not pollute the core code with RPi4-specific code.

-- 
Catalin
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/4] arm64: Default to 32-bit ZONE_DMA
  2020-10-01 16:17 ` [PATCH 3/4] arm64: Default to 32-bit ZONE_DMA Nicolas Saenz Julienne
@ 2020-10-01 17:19   ` Catalin Marinas
  0 siblings, 0 replies; 31+ messages in thread
From: Catalin Marinas @ 2020-10-01 17:19 UTC (permalink / raw)
  To: Nicolas Saenz Julienne
  Cc: devicetree, will, linux-kernel, linux-mm, iommu, robh+dt,
	linux-rpi-kernel, robin.murphy, hch, linux-arm-kernel

On Thu, Oct 01, 2020 at 06:17:39PM +0200, Nicolas Saenz Julienne wrote:
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index e1a69a618832..3c3f462466eb 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -43,8 +43,6 @@
>  #include <asm/tlb.h>
>  #include <asm/alternative.h>
>  
> -#define ARM64_ZONE_DMA_BITS	30
> -
>  /*
>   * We need to be able to catch inadvertent references to memstart_addr
>   * that occur (potentially in generic code) before arm64_memblock_init()
> @@ -388,8 +386,14 @@ void __init arm64_memblock_init(void)
>  	early_init_fdt_scan_reserved_mem();
>  
>  	if (IS_ENABLED(CONFIG_ZONE_DMA)) {
> -		zone_dma_bits = ARM64_ZONE_DMA_BITS;
> -		arm64_dma_phys_limit = max_zone_phys(ARM64_ZONE_DMA_BITS);
> +		/*
> +		 * early_init_dt_scan() might alter zone_dma_bits based on the
> +		 * device's DT. Otherwise, have it cover the 32-bit address
> +		 * space.
> +		 */
> +		if (zone_dma_bits == ZONE_DMA_BITS_DEFAULT)
> +			zone_dma_bits = 32;
> +		arm64_dma_phys_limit = max_zone_phys(zone_dma_bits);

So here we assume that if zone_dma_bits is 24, it wasn't initialised. I
think it may be simpler if we just set it in setup_machine_fdt() to 32
or 30 if RPi4. This way we don't have to depend on what the core kernel
sets.

-- 
Catalin
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/4] mm: Update DMA zones description
  2020-10-01 16:17 ` [PATCH 4/4] mm: Update DMA zones description Nicolas Saenz Julienne
@ 2020-10-01 17:19   ` Catalin Marinas
  0 siblings, 0 replies; 31+ messages in thread
From: Catalin Marinas @ 2020-10-01 17:19 UTC (permalink / raw)
  To: Nicolas Saenz Julienne
  Cc: devicetree, will, linux-kernel, linux-mm, iommu, robh+dt,
	linux-rpi-kernel, Andrew Morton, robin.murphy, hch,
	linux-arm-kernel

On Thu, Oct 01, 2020 at 06:17:40PM +0200, Nicolas Saenz Julienne wrote:
> The default behavior for arm64 changed, so reflect that.
> 
> Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711
  2020-10-01 17:15   ` Catalin Marinas
@ 2020-10-01 17:23     ` Catalin Marinas
  2020-10-01 17:31       ` Nicolas Saenz Julienne
  0 siblings, 1 reply; 31+ messages in thread
From: Catalin Marinas @ 2020-10-01 17:23 UTC (permalink / raw)
  To: Nicolas Saenz Julienne
  Cc: devicetree, Frank Rowand, linux-kernel, linux-mm, iommu, robh+dt,
	linux-rpi-kernel, will, hch, linux-arm-kernel, robin.murphy

On Thu, Oct 01, 2020 at 06:15:01PM +0100, Catalin Marinas wrote:
> Hi Nicolas,
> 
> Thanks for putting this together.
> 
> On Thu, Oct 01, 2020 at 06:17:37PM +0200, Nicolas Saenz Julienne wrote:
> > diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> > index 4602e467ca8b..cd0d115ef329 100644
> > --- a/drivers/of/fdt.c
> > +++ b/drivers/of/fdt.c
> > @@ -25,6 +25,7 @@
> >  #include <linux/serial_core.h>
> >  #include <linux/sysfs.h>
> >  #include <linux/random.h>
> > +#include <linux/dma-direct.h>	/* for zone_dma_bits */
> >  
> >  #include <asm/setup.h>  /* for COMMAND_LINE_SIZE */
> >  #include <asm/page.h>
> > @@ -1198,6 +1199,14 @@ void __init early_init_dt_scan_nodes(void)
> >  	of_scan_flat_dt(early_init_dt_scan_memory, NULL);
> >  }
> >  
> > +void __init early_init_dt_update_zone_dma_bits(void)
> > +{
> > +	unsigned long dt_root = of_get_flat_dt_root();
> > +
> > +	if (of_flat_dt_is_compatible(dt_root, "brcm,bcm2711"))
> > +		zone_dma_bits = 30;
> > +}
> 
> I think we could keep this entirely in the arm64 setup_machine_fdt() and
> not pollute the core code with RPi4-specific code.

Actually, even better, could we not move the check to
arm64_memblock_init() when we initialise zone_dma_bits?

-- 
Catalin
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711
  2020-10-01 17:23     ` Catalin Marinas
@ 2020-10-01 17:31       ` Nicolas Saenz Julienne
  2020-10-01 20:02         ` Rob Herring
  2020-10-02 11:55         ` Catalin Marinas
  0 siblings, 2 replies; 31+ messages in thread
From: Nicolas Saenz Julienne @ 2020-10-01 17:31 UTC (permalink / raw)
  To: Catalin Marinas, Rob Herring
  Cc: devicetree, Frank Rowand, linux-kernel, linux-mm, iommu, robh+dt,
	linux-rpi-kernel, will, hch, linux-arm-kernel, robin.murphy

[-- Attachment #1.1: Type: text/plain, Size: 1556 bytes --]

On Thu, 2020-10-01 at 18:23 +0100, Catalin Marinas wrote:
> On Thu, Oct 01, 2020 at 06:15:01PM +0100, Catalin Marinas wrote:
> > Hi Nicolas,
> > 
> > Thanks for putting this together.
> > 
> > On Thu, Oct 01, 2020 at 06:17:37PM +0200, Nicolas Saenz Julienne wrote:
> > > diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> > > index 4602e467ca8b..cd0d115ef329 100644
> > > --- a/drivers/of/fdt.c
> > > +++ b/drivers/of/fdt.c
> > > @@ -25,6 +25,7 @@
> > >  #include <linux/serial_core.h>
> > >  #include <linux/sysfs.h>
> > >  #include <linux/random.h>
> > > +#include <linux/dma-direct.h>	/* for zone_dma_bits */
> > >  
> > >  #include <asm/setup.h>  /* for COMMAND_LINE_SIZE */
> > >  #include <asm/page.h>
> > > @@ -1198,6 +1199,14 @@ void __init early_init_dt_scan_nodes(void)
> > >  	of_scan_flat_dt(early_init_dt_scan_memory, NULL);
> > >  }
> > >  
> > > +void __init early_init_dt_update_zone_dma_bits(void)
> > > +{
> > > +	unsigned long dt_root = of_get_flat_dt_root();
> > > +
> > > +	if (of_flat_dt_is_compatible(dt_root, "brcm,bcm2711"))
> > > +		zone_dma_bits = 30;
> > > +}
> > 
> > I think we could keep this entirely in the arm64 setup_machine_fdt() and
> > not pollute the core code with RPi4-specific code.
> 
> Actually, even better, could we not move the check to
> arm64_memblock_init() when we initialise zone_dma_bits?

I did it this way as I vaguely remembered Rob saying he wanted to centralise
all early boot fdt code in one place. But I'll be happy to move it there.

Regards,
Nicolas


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 156 bytes --]

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711
  2020-10-01 17:31       ` Nicolas Saenz Julienne
@ 2020-10-01 20:02         ` Rob Herring
  2020-10-02 11:55         ` Catalin Marinas
  1 sibling, 0 replies; 31+ messages in thread
From: Rob Herring @ 2020-10-01 20:02 UTC (permalink / raw)
  To: Nicolas Saenz Julienne
  Cc: devicetree, Frank Rowand, Catalin Marinas, linux-kernel,
	linux-mm, Linux IOMMU,
	moderated list:BROADCOM BCM2835 ARM ARCHITECTURE, Will Deacon,
	Christoph Hellwig, linux-arm-kernel, Robin Murphy

On Thu, Oct 1, 2020 at 12:31 PM Nicolas Saenz Julienne
<nsaenzjulienne@suse.de> wrote:
>
> On Thu, 2020-10-01 at 18:23 +0100, Catalin Marinas wrote:
> > On Thu, Oct 01, 2020 at 06:15:01PM +0100, Catalin Marinas wrote:
> > > Hi Nicolas,
> > >
> > > Thanks for putting this together.
> > >
> > > On Thu, Oct 01, 2020 at 06:17:37PM +0200, Nicolas Saenz Julienne wrote:
> > > > diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> > > > index 4602e467ca8b..cd0d115ef329 100644
> > > > --- a/drivers/of/fdt.c
> > > > +++ b/drivers/of/fdt.c
> > > > @@ -25,6 +25,7 @@
> > > >  #include <linux/serial_core.h>
> > > >  #include <linux/sysfs.h>
> > > >  #include <linux/random.h>
> > > > +#include <linux/dma-direct.h>    /* for zone_dma_bits */
> > > >
> > > >  #include <asm/setup.h>  /* for COMMAND_LINE_SIZE */
> > > >  #include <asm/page.h>
> > > > @@ -1198,6 +1199,14 @@ void __init early_init_dt_scan_nodes(void)
> > > >   of_scan_flat_dt(early_init_dt_scan_memory, NULL);
> > > >  }
> > > >
> > > > +void __init early_init_dt_update_zone_dma_bits(void)
> > > > +{
> > > > + unsigned long dt_root = of_get_flat_dt_root();
> > > > +
> > > > + if (of_flat_dt_is_compatible(dt_root, "brcm,bcm2711"))
> > > > +         zone_dma_bits = 30;
> > > > +}
> > >
> > > I think we could keep this entirely in the arm64 setup_machine_fdt() and
> > > not pollute the core code with RPi4-specific code.
> >
> > Actually, even better, could we not move the check to
> > arm64_memblock_init() when we initialise zone_dma_bits?
>
> I did it this way as I vaguely remembered Rob saying he wanted to centralise
> all early boot fdt code in one place. But I'll be happy to move it there.

Right, unless zone_dma_bits is only an arm64 thing, then this doesn't
really have anything arch specific.

Reviewed-by: Rob Herring <robh@kernel.org>

Rob
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711
  2020-10-01 16:17 ` [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711 Nicolas Saenz Julienne
  2020-10-01 17:15   ` Catalin Marinas
@ 2020-10-02  9:05   ` kernel test robot
  1 sibling, 0 replies; 31+ messages in thread
From: kernel test robot @ 2020-10-02  9:05 UTC (permalink / raw)
  To: Nicolas Saenz Julienne, robh+dt, catalin.marinas, hch,
	robin.murphy, devicetree, iommu, linux-mm, Frank Rowand
  Cc: clang-built-linux, kbuild-all, linux-kernel, will


[-- Attachment #1: Type: text/plain, Size: 2564 bytes --]

Hi Nicolas,

I love your patch! Perhaps something to improve:

[auto build test WARNING on next-20201001]
[also build test WARNING on v5.9-rc7]
[cannot apply to robh/for-next arm64/for-next/core hnaz-linux-mm/master linus/master v5.9-rc7 v5.9-rc6 v5.9-rc5]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Nicolas-Saenz-Julienne/arm64-Default-to-32-bit-wide-ZONE_DMA/20201002-002007
base:    d39294091fee6b89d9c4a683bb19441b25098330
config: arm64-randconfig-r005-20200930 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project bcd05599d0e53977a963799d6ee4f6e0bc21331b)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install arm64 cross compiling tool for clang build
        # apt-get install binutils-aarch64-linux-gnu
        # https://github.com/0day-ci/linux/commit/7d073ab6c280772b1bcf9e337528be2138d0bc85
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Nicolas-Saenz-Julienne/arm64-Default-to-32-bit-wide-ZONE_DMA/20201002-002007
        git checkout 7d073ab6c280772b1bcf9e337528be2138d0bc85
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> drivers/of/fdt.c:1202:13: warning: no previous prototype for function 'early_init_dt_update_zone_dma_bits' [-Wmissing-prototypes]
   void __init early_init_dt_update_zone_dma_bits(void)
               ^
   drivers/of/fdt.c:1202:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   void __init early_init_dt_update_zone_dma_bits(void)
   ^
   static 
   1 warning generated.

vim +/early_init_dt_update_zone_dma_bits +1202 drivers/of/fdt.c

  1201	
> 1202	void __init early_init_dt_update_zone_dma_bits(void)
  1203	{
  1204		unsigned long dt_root = of_get_flat_dt_root();
  1205	
  1206		if (of_flat_dt_is_compatible(dt_root, "brcm,bcm2711"))
  1207			zone_dma_bits = 30;
  1208	}
  1209	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 42197 bytes --]

[-- Attachment #3: Type: text/plain, Size: 156 bytes --]

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711
  2020-10-01 17:31       ` Nicolas Saenz Julienne
  2020-10-01 20:02         ` Rob Herring
@ 2020-10-02 11:55         ` Catalin Marinas
  2020-10-08 10:05           ` Nicolas Saenz Julienne
  1 sibling, 1 reply; 31+ messages in thread
From: Catalin Marinas @ 2020-10-02 11:55 UTC (permalink / raw)
  To: Nicolas Saenz Julienne
  Cc: devicetree, Frank Rowand, linux-kernel, linux-mm, iommu,
	Rob Herring, linux-rpi-kernel, will, hch, linux-arm-kernel,
	robin.murphy

On Thu, Oct 01, 2020 at 07:31:19PM +0200, Nicolas Saenz Julienne wrote:
> On Thu, 2020-10-01 at 18:23 +0100, Catalin Marinas wrote:
> > On Thu, Oct 01, 2020 at 06:15:01PM +0100, Catalin Marinas wrote:
> > > On Thu, Oct 01, 2020 at 06:17:37PM +0200, Nicolas Saenz Julienne wrote:
> > > > diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> > > > index 4602e467ca8b..cd0d115ef329 100644
> > > > --- a/drivers/of/fdt.c
> > > > +++ b/drivers/of/fdt.c
> > > > @@ -25,6 +25,7 @@
> > > >  #include <linux/serial_core.h>
> > > >  #include <linux/sysfs.h>
> > > >  #include <linux/random.h>
> > > > +#include <linux/dma-direct.h>	/* for zone_dma_bits */
> > > >  
> > > >  #include <asm/setup.h>  /* for COMMAND_LINE_SIZE */
> > > >  #include <asm/page.h>
> > > > @@ -1198,6 +1199,14 @@ void __init early_init_dt_scan_nodes(void)
> > > >  	of_scan_flat_dt(early_init_dt_scan_memory, NULL);
> > > >  }
> > > >  
> > > > +void __init early_init_dt_update_zone_dma_bits(void)
> > > > +{
> > > > +	unsigned long dt_root = of_get_flat_dt_root();
> > > > +
> > > > +	if (of_flat_dt_is_compatible(dt_root, "brcm,bcm2711"))
> > > > +		zone_dma_bits = 30;
> > > > +}
> > > 
> > > I think we could keep this entirely in the arm64 setup_machine_fdt() and
> > > not pollute the core code with RPi4-specific code.
> > 
> > Actually, even better, could we not move the check to
> > arm64_memblock_init() when we initialise zone_dma_bits?
> 
> I did it this way as I vaguely remembered Rob saying he wanted to centralise
> all early boot fdt code in one place. But I'll be happy to move it there.

I can see Rob replied and I'm fine if that's his preference. However,
what I don't particularly like is that in the arm64 code, if
zone_dma_bits == 24, we set it to 32 assuming that it wasn't touched by
the early_init_dt_update_zone_dma_bits(). What if at some point we'll
get a platform that actually needs 24 here (I truly hope not, but just
the principle of relying on magic values)?

So rather than guessing, I'd prefer if the arch code can override
ZONE_DMA_BITS_DEFAULT. Then, in arm64, we'll just set it to 32 and no
need to explicitly touch the zone_dma_bits variable.

-- 
Catalin
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711
  2020-10-02 11:55         ` Catalin Marinas
@ 2020-10-08 10:05           ` Nicolas Saenz Julienne
  2020-10-08 10:13             ` Catalin Marinas
  2020-10-09  7:10             ` Christoph Hellwig
  0 siblings, 2 replies; 31+ messages in thread
From: Nicolas Saenz Julienne @ 2020-10-08 10:05 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: devicetree, Frank Rowand, linux-kernel, linux-mm, iommu,
	Rob Herring, linux-rpi-kernel, will, hch, linux-arm-kernel,
	robin.murphy

[-- Attachment #1.1: Type: text/plain, Size: 2760 bytes --]

Hi Catalin, sorry for the late reply.

On Fri, 2020-10-02 at 12:55 +0100, Catalin Marinas wrote:
> On Thu, Oct 01, 2020 at 07:31:19PM +0200, Nicolas Saenz Julienne wrote:
> > On Thu, 2020-10-01 at 18:23 +0100, Catalin Marinas wrote:
> > > On Thu, Oct 01, 2020 at 06:15:01PM +0100, Catalin Marinas wrote:
> > > > On Thu, Oct 01, 2020 at 06:17:37PM +0200, Nicolas Saenz Julienne wrote:
> > > > > diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> > > > > index 4602e467ca8b..cd0d115ef329 100644
> > > > > --- a/drivers/of/fdt.c
> > > > > +++ b/drivers/of/fdt.c
> > > > > @@ -25,6 +25,7 @@
> > > > >  #include <linux/serial_core.h>
> > > > >  #include <linux/sysfs.h>
> > > > >  #include <linux/random.h>
> > > > > +#include <linux/dma-direct.h>	/* for zone_dma_bits */
> > > > >  
> > > > >  #include <asm/setup.h>  /* for COMMAND_LINE_SIZE */
> > > > >  #include <asm/page.h>
> > > > > @@ -1198,6 +1199,14 @@ void __init early_init_dt_scan_nodes(void)
> > > > >  	of_scan_flat_dt(early_init_dt_scan_memory, NULL);
> > > > >  }
> > > > >  
> > > > > +void __init early_init_dt_update_zone_dma_bits(void)
> > > > > +{
> > > > > +	unsigned long dt_root = of_get_flat_dt_root();
> > > > > +
> > > > > +	if (of_flat_dt_is_compatible(dt_root, "brcm,bcm2711"))
> > > > > +		zone_dma_bits = 30;
> > > > > +}
> > > > 
> > > > I think we could keep this entirely in the arm64 setup_machine_fdt() and
> > > > not pollute the core code with RPi4-specific code.
> > > 
> > > Actually, even better, could we not move the check to
> > > arm64_memblock_init() when we initialise zone_dma_bits?
> > 
> > I did it this way as I vaguely remembered Rob saying he wanted to centralise
> > all early boot fdt code in one place. But I'll be happy to move it there.
> 
> I can see Rob replied and I'm fine if that's his preference. However,
> what I don't particularly like is that in the arm64 code, if
> zone_dma_bits == 24, we set it to 32 assuming that it wasn't touched by
> the early_init_dt_update_zone_dma_bits(). What if at some point we'll
> get a platform that actually needs 24 here (I truly hope not, but just
> the principle of relying on magic values)?
> 
> So rather than guessing, I'd prefer if the arch code can override
> ZONE_DMA_BITS_DEFAULT. Then, in arm64, we'll just set it to 32 and no
> need to explicitly touch the zone_dma_bits variable.

Yes, sonds like the way to go. TBH I wasn't happy with that solution either,
but couldn't think of a nicer alternative.

Sadly I just realised that the series is incomplete, we have RPi4 users that
want to boot unsing ACPI, and this series would break things for them. I'll
have a word with them to see what we can do for their use-case.

Regards,
Nicolas


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 156 bytes --]

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711
  2020-10-08 10:05           ` Nicolas Saenz Julienne
@ 2020-10-08 10:13             ` Catalin Marinas
  2020-10-08 19:43               ` Ard Biesheuvel
  2020-10-09  7:10             ` Christoph Hellwig
  1 sibling, 1 reply; 31+ messages in thread
From: Catalin Marinas @ 2020-10-08 10:13 UTC (permalink / raw)
  To: Nicolas Saenz Julienne
  Cc: devicetree, Frank Rowand, linux-kernel, linux-mm, iommu,
	Rob Herring, linux-rpi-kernel, will, hch, linux-arm-kernel,
	robin.murphy

On Thu, Oct 08, 2020 at 12:05:25PM +0200, Nicolas Saenz Julienne wrote:
> On Fri, 2020-10-02 at 12:55 +0100, Catalin Marinas wrote:
> > On Thu, Oct 01, 2020 at 07:31:19PM +0200, Nicolas Saenz Julienne wrote:
> > > On Thu, 2020-10-01 at 18:23 +0100, Catalin Marinas wrote:
> > > > On Thu, Oct 01, 2020 at 06:15:01PM +0100, Catalin Marinas wrote:
> > > > > On Thu, Oct 01, 2020 at 06:17:37PM +0200, Nicolas Saenz Julienne wrote:
> > > > > > diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> > > > > > index 4602e467ca8b..cd0d115ef329 100644
> > > > > > --- a/drivers/of/fdt.c
> > > > > > +++ b/drivers/of/fdt.c
> > > > > > @@ -25,6 +25,7 @@
> > > > > >  #include <linux/serial_core.h>
> > > > > >  #include <linux/sysfs.h>
> > > > > >  #include <linux/random.h>
> > > > > > +#include <linux/dma-direct.h>	/* for zone_dma_bits */
> > > > > >  
> > > > > >  #include <asm/setup.h>  /* for COMMAND_LINE_SIZE */
> > > > > >  #include <asm/page.h>
> > > > > > @@ -1198,6 +1199,14 @@ void __init early_init_dt_scan_nodes(void)
> > > > > >  	of_scan_flat_dt(early_init_dt_scan_memory, NULL);
> > > > > >  }
> > > > > >  
> > > > > > +void __init early_init_dt_update_zone_dma_bits(void)
> > > > > > +{
> > > > > > +	unsigned long dt_root = of_get_flat_dt_root();
> > > > > > +
> > > > > > +	if (of_flat_dt_is_compatible(dt_root, "brcm,bcm2711"))
> > > > > > +		zone_dma_bits = 30;
> > > > > > +}
> > > > > 
> > > > > I think we could keep this entirely in the arm64 setup_machine_fdt() and
> > > > > not pollute the core code with RPi4-specific code.
> > > > 
> > > > Actually, even better, could we not move the check to
> > > > arm64_memblock_init() when we initialise zone_dma_bits?
> > > 
> > > I did it this way as I vaguely remembered Rob saying he wanted to centralise
> > > all early boot fdt code in one place. But I'll be happy to move it there.
> > 
> > I can see Rob replied and I'm fine if that's his preference. However,
> > what I don't particularly like is that in the arm64 code, if
> > zone_dma_bits == 24, we set it to 32 assuming that it wasn't touched by
> > the early_init_dt_update_zone_dma_bits(). What if at some point we'll
> > get a platform that actually needs 24 here (I truly hope not, but just
> > the principle of relying on magic values)?
> > 
> > So rather than guessing, I'd prefer if the arch code can override
> > ZONE_DMA_BITS_DEFAULT. Then, in arm64, we'll just set it to 32 and no
> > need to explicitly touch the zone_dma_bits variable.
> 
> Yes, sonds like the way to go. TBH I wasn't happy with that solution either,
> but couldn't think of a nicer alternative.
> 
> Sadly I just realised that the series is incomplete, we have RPi4 users that
> want to boot unsing ACPI, and this series would break things for them. I'll
> have a word with them to see what we can do for their use-case.

Is there a way to get some SoC information from ACPI?

-- 
Catalin
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711
  2020-10-08 10:13             ` Catalin Marinas
@ 2020-10-08 19:43               ` Ard Biesheuvel
  2020-10-09  3:59                 ` Jeremy Linton
  0 siblings, 1 reply; 31+ messages in thread
From: Ard Biesheuvel @ 2020-10-08 19:43 UTC (permalink / raw)
  To: Catalin Marinas, Lorenzo Pieralisi
  Cc: Linux ARM,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Will Deacon, Linux Kernel Mailing List,
	Linux Memory Management List, iommu, Rob Herring,
	linux-rpi-kernel, Frank Rowand, Christoph Hellwig, Robin Murphy

(+ Lorenzo)

On Thu, 8 Oct 2020 at 12:14, Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> On Thu, Oct 08, 2020 at 12:05:25PM +0200, Nicolas Saenz Julienne wrote:
> > On Fri, 2020-10-02 at 12:55 +0100, Catalin Marinas wrote:
> > > On Thu, Oct 01, 2020 at 07:31:19PM +0200, Nicolas Saenz Julienne wrote:
> > > > On Thu, 2020-10-01 at 18:23 +0100, Catalin Marinas wrote:
> > > > > On Thu, Oct 01, 2020 at 06:15:01PM +0100, Catalin Marinas wrote:
> > > > > > On Thu, Oct 01, 2020 at 06:17:37PM +0200, Nicolas Saenz Julienne wrote:
> > > > > > > diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> > > > > > > index 4602e467ca8b..cd0d115ef329 100644
> > > > > > > --- a/drivers/of/fdt.c
> > > > > > > +++ b/drivers/of/fdt.c
> > > > > > > @@ -25,6 +25,7 @@
> > > > > > >  #include <linux/serial_core.h>
> > > > > > >  #include <linux/sysfs.h>
> > > > > > >  #include <linux/random.h>
> > > > > > > +#include <linux/dma-direct.h>      /* for zone_dma_bits */
> > > > > > >
> > > > > > >  #include <asm/setup.h>  /* for COMMAND_LINE_SIZE */
> > > > > > >  #include <asm/page.h>
> > > > > > > @@ -1198,6 +1199,14 @@ void __init early_init_dt_scan_nodes(void)
> > > > > > >     of_scan_flat_dt(early_init_dt_scan_memory, NULL);
> > > > > > >  }
> > > > > > >
> > > > > > > +void __init early_init_dt_update_zone_dma_bits(void)
> > > > > > > +{
> > > > > > > +   unsigned long dt_root = of_get_flat_dt_root();
> > > > > > > +
> > > > > > > +   if (of_flat_dt_is_compatible(dt_root, "brcm,bcm2711"))
> > > > > > > +           zone_dma_bits = 30;
> > > > > > > +}
> > > > > >
> > > > > > I think we could keep this entirely in the arm64 setup_machine_fdt() and
> > > > > > not pollute the core code with RPi4-specific code.
> > > > >
> > > > > Actually, even better, could we not move the check to
> > > > > arm64_memblock_init() when we initialise zone_dma_bits?
> > > >
> > > > I did it this way as I vaguely remembered Rob saying he wanted to centralise
> > > > all early boot fdt code in one place. But I'll be happy to move it there.
> > >
> > > I can see Rob replied and I'm fine if that's his preference. However,
> > > what I don't particularly like is that in the arm64 code, if
> > > zone_dma_bits == 24, we set it to 32 assuming that it wasn't touched by
> > > the early_init_dt_update_zone_dma_bits(). What if at some point we'll
> > > get a platform that actually needs 24 here (I truly hope not, but just
> > > the principle of relying on magic values)?
> > >
> > > So rather than guessing, I'd prefer if the arch code can override
> > > ZONE_DMA_BITS_DEFAULT. Then, in arm64, we'll just set it to 32 and no
> > > need to explicitly touch the zone_dma_bits variable.
> >
> > Yes, sonds like the way to go. TBH I wasn't happy with that solution either,
> > but couldn't think of a nicer alternative.
> >
> > Sadly I just realised that the series is incomplete, we have RPi4 users that
> > want to boot unsing ACPI, and this series would break things for them. I'll
> > have a word with them to see what we can do for their use-case.
>
> Is there a way to get some SoC information from ACPI?
>

This is unfortunate. We used ACPI _DMA methods as they were designed
to communicate the DMA limit of the XHCI controller to the OS.

It shouldn't be too hard to match the OEM id field in the DSDT, and
switch to the smaller mask. But it sucks to have to add a quirk like
that.
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711
  2020-10-08 19:43               ` Ard Biesheuvel
@ 2020-10-09  3:59                 ` Jeremy Linton
  2020-10-09  8:37                   ` Nicolas Saenz Julienne
  0 siblings, 1 reply; 31+ messages in thread
From: Jeremy Linton @ 2020-10-09  3:59 UTC (permalink / raw)
  To: Ard Biesheuvel, Catalin Marinas, Lorenzo Pieralisi
  Cc: Linux ARM,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Will Deacon, Linux Kernel Mailing List,
	Linux Memory Management List, iommu, Rob Herring,
	linux-rpi-kernel, Frank Rowand, Christoph Hellwig, Robin Murphy

On 10/8/20 2:43 PM, Ard Biesheuvel wrote:
> (+ Lorenzo)
> 
> On Thu, 8 Oct 2020 at 12:14, Catalin Marinas <catalin.marinas@arm.com> wrote:
>>
>> On Thu, Oct 08, 2020 at 12:05:25PM +0200, Nicolas Saenz Julienne wrote:
>>> On Fri, 2020-10-02 at 12:55 +0100, Catalin Marinas wrote:
>>>> On Thu, Oct 01, 2020 at 07:31:19PM +0200, Nicolas Saenz Julienne wrote:
>>>>> On Thu, 2020-10-01 at 18:23 +0100, Catalin Marinas wrote:
>>>>>> On Thu, Oct 01, 2020 at 06:15:01PM +0100, Catalin Marinas wrote:
>>>>>>> On Thu, Oct 01, 2020 at 06:17:37PM +0200, Nicolas Saenz Julienne wrote:
>>>>>>>> diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
>>>>>>>> index 4602e467ca8b..cd0d115ef329 100644
>>>>>>>> --- a/drivers/of/fdt.c
>>>>>>>> +++ b/drivers/of/fdt.c
>>>>>>>> @@ -25,6 +25,7 @@
>>>>>>>>   #include <linux/serial_core.h>
>>>>>>>>   #include <linux/sysfs.h>
>>>>>>>>   #include <linux/random.h>
>>>>>>>> +#include <linux/dma-direct.h>      /* for zone_dma_bits */
>>>>>>>>
>>>>>>>>   #include <asm/setup.h>  /* for COMMAND_LINE_SIZE */
>>>>>>>>   #include <asm/page.h>
>>>>>>>> @@ -1198,6 +1199,14 @@ void __init early_init_dt_scan_nodes(void)
>>>>>>>>      of_scan_flat_dt(early_init_dt_scan_memory, NULL);
>>>>>>>>   }
>>>>>>>>
>>>>>>>> +void __init early_init_dt_update_zone_dma_bits(void)
>>>>>>>> +{
>>>>>>>> +   unsigned long dt_root = of_get_flat_dt_root();
>>>>>>>> +
>>>>>>>> +   if (of_flat_dt_is_compatible(dt_root, "brcm,bcm2711"))
>>>>>>>> +           zone_dma_bits = 30;
>>>>>>>> +}
>>>>>>>
>>>>>>> I think we could keep this entirely in the arm64 setup_machine_fdt() and
>>>>>>> not pollute the core code with RPi4-specific code.
>>>>>>
>>>>>> Actually, even better, could we not move the check to
>>>>>> arm64_memblock_init() when we initialise zone_dma_bits?
>>>>>
>>>>> I did it this way as I vaguely remembered Rob saying he wanted to centralise
>>>>> all early boot fdt code in one place. But I'll be happy to move it there.
>>>>
>>>> I can see Rob replied and I'm fine if that's his preference. However,
>>>> what I don't particularly like is that in the arm64 code, if
>>>> zone_dma_bits == 24, we set it to 32 assuming that it wasn't touched by
>>>> the early_init_dt_update_zone_dma_bits(). What if at some point we'll
>>>> get a platform that actually needs 24 here (I truly hope not, but just
>>>> the principle of relying on magic values)?
>>>>
>>>> So rather than guessing, I'd prefer if the arch code can override
>>>> ZONE_DMA_BITS_DEFAULT. Then, in arm64, we'll just set it to 32 and no
>>>> need to explicitly touch the zone_dma_bits variable.
>>>
>>> Yes, sonds like the way to go. TBH I wasn't happy with that solution either,
>>> but couldn't think of a nicer alternative.
>>>
>>> Sadly I just realised that the series is incomplete, we have RPi4 users that
>>> want to boot unsing ACPI, and this series would break things for them. I'll
>>> have a word with them to see what we can do for their use-case.
>>
>> Is there a way to get some SoC information from ACPI?
>>
> 
> This is unfortunate. We used ACPI _DMA methods as they were designed
> to communicate the DMA limit of the XHCI controller to the OS.
> 
> It shouldn't be too hard to match the OEM id field in the DSDT, and
> switch to the smaller mask. But it sucks to have to add a quirk like
> that.
> It also requires delaying setting the arm64_dma_phy_limit a bit, but 
that doesn't appear to be causing a problem. I've been boot/compiling 
with a patch like:

diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
index cada0b816c8a..9dfe776c1c75 100644
--- a/arch/arm64/kernel/acpi.c
+++ b/arch/arm64/kernel/acpi.c
@@ -14,6 +14,7 @@

  #include <linux/acpi.h>
  #include <linux/cpumask.h>
+#include <linux/dma-direct.h>
  #include <linux/efi.h>
  #include <linux/efi-bgrt.h>
  #include <linux/init.h>
@@ -168,6 +169,11 @@ static int __init acpi_fadt_sanity_check(void)
                 ret = -EINVAL;
         }

+       if (!strncmp(table->oem_id, "RPIFDN", ACPI_OEM_ID_SIZE) &&
+           !strncmp(table->oem_table_id,  "RPI4    ", 
ACPI_OEM_TABLE_ID_SIZE) &&
+           table->oem_revision <= 0x200)  {
+               zone_dma_bits = 30;
+       }
  out:
         /*
          * acpi_get_table() creates FADT table mapping that
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index cd5caca8a929..6c8aaf1570ce 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -195,6 +195,7 @@ static void __init zone_sizes_init(unsigned long 
min, unsigned long max)
         unsigned long max_zone_pfns[MAX_NR_ZONES]  = {0};

  #ifdef CONFIG_ZONE_DMA
+       arm64_dma_phys_limit = max_zone_phys(zone_dma_bits);
         max_zone_pfns[ZONE_DMA] = PFN_DOWN(arm64_dma_phys_limit);
  #endif
  #ifdef CONFIG_ZONE_DMA32
@@ -393,7 +394,6 @@ void __init arm64_memblock_init(void)
                  */
                 if (zone_dma_bits == ZONE_DMA_BITS_DEFAULT)
                         zone_dma_bits = 32;
-               arm64_dma_phys_limit = max_zone_phys(zone_dma_bits);
         }

         if (IS_ENABLED(CONFIG_ZONE_DMA32))


_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711
  2020-10-08 10:05           ` Nicolas Saenz Julienne
  2020-10-08 10:13             ` Catalin Marinas
@ 2020-10-09  7:10             ` Christoph Hellwig
  2020-10-09  7:37               ` Ard Biesheuvel
  1 sibling, 1 reply; 31+ messages in thread
From: Christoph Hellwig @ 2020-10-09  7:10 UTC (permalink / raw)
  To: Nicolas Saenz Julienne
  Cc: devicetree, Frank Rowand, Catalin Marinas, linux-kernel,
	linux-mm, iommu, Rob Herring, linux-rpi-kernel, will, hch,
	linux-arm-kernel, robin.murphy

On Thu, Oct 08, 2020 at 12:05:25PM +0200, Nicolas Saenz Julienne wrote:
> Sadly I just realised that the series is incomplete, we have RPi4 users that
> want to boot unsing ACPI, and this series would break things for them. I'll
> have a word with them to see what we can do for their use-case.

Stupid question:  why do these users insist on a totally unsuitable
interface?  And why would we as Linux developers care to support such
a aims?
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711
  2020-10-09  7:10             ` Christoph Hellwig
@ 2020-10-09  7:37               ` Ard Biesheuvel
  2020-10-09  8:36                 ` Nicolas Saenz Julienne
  0 siblings, 1 reply; 31+ messages in thread
From: Ard Biesheuvel @ 2020-10-09  7:37 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linux ARM,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Will Deacon, Catalin Marinas, Linux Kernel Mailing List,
	Linux Memory Management List, iommu, Rob Herring,
	linux-rpi-kernel, Frank Rowand, Robin Murphy

On Fri, 9 Oct 2020 at 09:11, Christoph Hellwig <hch@lst.de> wrote:
>
> On Thu, Oct 08, 2020 at 12:05:25PM +0200, Nicolas Saenz Julienne wrote:
> > Sadly I just realised that the series is incomplete, we have RPi4 users that
> > want to boot unsing ACPI, and this series would break things for them. I'll
> > have a word with them to see what we can do for their use-case.
>
> Stupid question:  why do these users insist on a totally unsuitable
> interface?  And why would we as Linux developers care to support such
> a aims?
>

The point is really whether we want to revert changes in Linux that
made both DT and ACPI boot work without quirks on RPi4. Having to
check the RPi4 compatible string or OEM id in core init code is awful,
regardless of whether you boot via ACPI or via DT.

The problem with this hardware is that it uses a DMA mask which is
narrower than 32, and the arm64 kernel is simply not set up to deal
with that at all. On DT, we have DMA ranges properties and the likes
to describe such limitations, on ACPI we have _DMA methods as well as
DMA range attributes in the IORT, both of which are now handled
correctly. So all the information is there, we just have to figure out
how to consume it early on.

Interestingly, this limitation always existed in the SoC, but it
wasn't until they started shipping it with more than 1 GB of DRAM that
it became a problem. This means issues like this could resurface in
the future with existing SoCs when they get shipped with more memory,
and so I would prefer fixing this in a generic way.

Also, I assume papering over the issue like this does not fix the
kdump issue fundamentally, it just works around it, and so we might
run into this again in the future.
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711
  2020-10-09  7:37               ` Ard Biesheuvel
@ 2020-10-09  8:36                 ` Nicolas Saenz Julienne
  2020-10-09  9:13                   ` Ard Biesheuvel
  0 siblings, 1 reply; 31+ messages in thread
From: Nicolas Saenz Julienne @ 2020-10-09  8:36 UTC (permalink / raw)
  To: Ard Biesheuvel, Christoph Hellwig
  Cc: open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Will Deacon, Catalin Marinas, Linux Kernel Mailing List,
	Linux Memory Management List, iommu, Rob Herring,
	linux-rpi-kernel, Frank Rowand, Linux ARM, Robin Murphy

[-- Attachment #1.1: Type: text/plain, Size: 2735 bytes --]

On Fri, 2020-10-09 at 09:37 +0200, Ard Biesheuvel wrote:
> On Fri, 9 Oct 2020 at 09:11, Christoph Hellwig <hch@lst.de> wrote:
> > On Thu, Oct 08, 2020 at 12:05:25PM +0200, Nicolas Saenz Julienne wrote:
> > > Sadly I just realised that the series is incomplete, we have RPi4 users that
> > > want to boot unsing ACPI, and this series would break things for them. I'll
> > > have a word with them to see what we can do for their use-case.
> > 
> > Stupid question:  why do these users insist on a totally unsuitable
> > interface? And why would we as Linux developers care to support such
> > a aims?
>
> The point is really whether we want to revert changes in Linux that
> made both DT and ACPI boot work without quirks on RPi4.

Well, and broke a big amount of devices that were otherwise fine.

> Having to check the RPi4 compatible string or OEM id in core init code is
> awful, regardless of whether you boot via ACPI or via DT.
>
> The problem with this hardware is that it uses a DMA mask which is
> narrower than 32, and the arm64 kernel is simply not set up to deal
> with that at all. On DT, we have DMA ranges properties and the likes
> to describe such limitations, on ACPI we have _DMA methods as well as
> DMA range attributes in the IORT, both of which are now handled
> correctly. So all the information is there, we just have to figure out
> how to consume it early on.

Is it worth the effort just for a single board? I don't know about ACPI but
parsing dma-ranges that early at boot time is not trivial. My intuition tells
me that it'd be even harder for ACPI, being a more complex data structure.

> Interestingly, this limitation always existed in the SoC, but it
> wasn't until they started shipping it with more than 1 GB of DRAM that
> it became a problem. This means issues like this could resurface in
> the future with existing SoCs when they get shipped with more memory,
> and so I would prefer fixing this in a generic way.

Actually what I proposed here is pretty generic. Specially from arm64's
perspective. We call early_init_dt_scan(), which sets up zone_dma_bits based on
whatever it finds in DT. Both those operations are architecture independent.
arm64 arch code doesn't care about the logic involved in ascertaining
zone_dma_bits. I get that the last step isn't generic. But it's all setup so as
to make it as such whenever it's worth the effort.

> Also, I assume papering over the issue like this does not fix the
> kdump issue fundamentally, it just works around it, and so we might
> run into this again in the future.

Any ideas? The way I understand it the kdump issue is just a shortcoming of
the memory zones design.

Regards,
Nicolas


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 156 bytes --]

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711
  2020-10-09  3:59                 ` Jeremy Linton
@ 2020-10-09  8:37                   ` Nicolas Saenz Julienne
  0 siblings, 0 replies; 31+ messages in thread
From: Nicolas Saenz Julienne @ 2020-10-09  8:37 UTC (permalink / raw)
  To: Jeremy Linton, Ard Biesheuvel, Catalin Marinas, Lorenzo Pieralisi
  Cc: open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Will Deacon, Linux Kernel Mailing List,
	Linux Memory Management List, iommu, Rob Herring,
	linux-rpi-kernel, Frank Rowand, Christoph Hellwig, Linux ARM,
	Robin Murphy

[-- Attachment #1.1: Type: text/plain, Size: 5832 bytes --]

Hi Jeremy.

On Thu, 2020-10-08 at 22:59 -0500, Jeremy Linton wrote:
> On 10/8/20 2:43 PM, Ard Biesheuvel wrote:
> > (+ Lorenzo)
> > 
> > On Thu, 8 Oct 2020 at 12:14, Catalin Marinas <catalin.marinas@arm.com> wrote:
> > > On Thu, Oct 08, 2020 at 12:05:25PM +0200, Nicolas Saenz Julienne wrote:
> > > > On Fri, 2020-10-02 at 12:55 +0100, Catalin Marinas wrote:
> > > > > On Thu, Oct 01, 2020 at 07:31:19PM +0200, Nicolas Saenz Julienne wrote:
> > > > > > On Thu, 2020-10-01 at 18:23 +0100, Catalin Marinas wrote:
> > > > > > > On Thu, Oct 01, 2020 at 06:15:01PM +0100, Catalin Marinas wrote:
> > > > > > > > On Thu, Oct 01, 2020 at 06:17:37PM +0200, Nicolas Saenz Julienne wrote:
> > > > > > > > > diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> > > > > > > > > index 4602e467ca8b..cd0d115ef329 100644
> > > > > > > > > --- a/drivers/of/fdt.c
> > > > > > > > > +++ b/drivers/of/fdt.c
> > > > > > > > > @@ -25,6 +25,7 @@
> > > > > > > > >   #include <linux/serial_core.h>
> > > > > > > > >   #include <linux/sysfs.h>
> > > > > > > > >   #include <linux/random.h>
> > > > > > > > > +#include <linux/dma-direct.h>      /* for zone_dma_bits */
> > > > > > > > > 
> > > > > > > > >   #include <asm/setup.h>  /* for COMMAND_LINE_SIZE */
> > > > > > > > >   #include <asm/page.h>
> > > > > > > > > @@ -1198,6 +1199,14 @@ void __init early_init_dt_scan_nodes(void)
> > > > > > > > >      of_scan_flat_dt(early_init_dt_scan_memory, NULL);
> > > > > > > > >   }
> > > > > > > > > 
> > > > > > > > > +void __init early_init_dt_update_zone_dma_bits(void)
> > > > > > > > > +{
> > > > > > > > > +   unsigned long dt_root = of_get_flat_dt_root();
> > > > > > > > > +
> > > > > > > > > +   if (of_flat_dt_is_compatible(dt_root, "brcm,bcm2711"))
> > > > > > > > > +           zone_dma_bits = 30;
> > > > > > > > > +}
> > > > > > > > 
> > > > > > > > I think we could keep this entirely in the arm64 setup_machine_fdt() and
> > > > > > > > not pollute the core code with RPi4-specific code.
> > > > > > > 
> > > > > > > Actually, even better, could we not move the check to
> > > > > > > arm64_memblock_init() when we initialise zone_dma_bits?
> > > > > > 
> > > > > > I did it this way as I vaguely remembered Rob saying he wanted to centralise
> > > > > > all early boot fdt code in one place. But I'll be happy to move it there.
> > > > > 
> > > > > I can see Rob replied and I'm fine if that's his preference. However,
> > > > > what I don't particularly like is that in the arm64 code, if
> > > > > zone_dma_bits == 24, we set it to 32 assuming that it wasn't touched by
> > > > > the early_init_dt_update_zone_dma_bits(). What if at some point we'll
> > > > > get a platform that actually needs 24 here (I truly hope not, but just
> > > > > the principle of relying on magic values)?
> > > > > 
> > > > > So rather than guessing, I'd prefer if the arch code can override
> > > > > ZONE_DMA_BITS_DEFAULT. Then, in arm64, we'll just set it to 32 and no
> > > > > need to explicitly touch the zone_dma_bits variable.
> > > > 
> > > > Yes, sonds like the way to go. TBH I wasn't happy with that solution either,
> > > > but couldn't think of a nicer alternative.
> > > > 
> > > > Sadly I just realised that the series is incomplete, we have RPi4 users that
> > > > want to boot unsing ACPI, and this series would break things for them. I'll
> > > > have a word with them to see what we can do for their use-case.
> > > 
> > > Is there a way to get some SoC information from ACPI?
> > > 
> > 
> > This is unfortunate. We used ACPI _DMA methods as they were designed
> > to communicate the DMA limit of the XHCI controller to the OS.
> > 
> > It shouldn't be too hard to match the OEM id field in the DSDT, and
> > switch to the smaller mask. But it sucks to have to add a quirk like
> > that.
> > It also requires delaying setting the arm64_dma_phy_limit a bit, but 
> that doesn't appear to be causing a problem. I've been boot/compiling 
> with a patch like:
> 
> diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
> index cada0b816c8a..9dfe776c1c75 100644
> --- a/arch/arm64/kernel/acpi.c
> +++ b/arch/arm64/kernel/acpi.c
> @@ -14,6 +14,7 @@
> 
>   #include <linux/acpi.h>
>   #include <linux/cpumask.h>
> +#include <linux/dma-direct.h>
>   #include <linux/efi.h>
>   #include <linux/efi-bgrt.h>
>   #include <linux/init.h>
> @@ -168,6 +169,11 @@ static int __init acpi_fadt_sanity_check(void)
>                  ret = -EINVAL;
>          }
> 
> +       if (!strncmp(table->oem_id, "RPIFDN", ACPI_OEM_ID_SIZE) &&
> +           !strncmp(table->oem_table_id,  "RPI4    ", 
> ACPI_OEM_TABLE_ID_SIZE) &&
> +           table->oem_revision <= 0x200)  {
> +               zone_dma_bits = 30;
> +       }
>   out:
>          /*
>           * acpi_get_table() creates FADT table mapping that
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index cd5caca8a929..6c8aaf1570ce 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -195,6 +195,7 @@ static void __init zone_sizes_init(unsigned long 
> min, unsigned long max)
>          unsigned long max_zone_pfns[MAX_NR_ZONES]  = {0};
> 
>   #ifdef CONFIG_ZONE_DMA
> +       arm64_dma_phys_limit = max_zone_phys(zone_dma_bits);
>          max_zone_pfns[ZONE_DMA] = PFN_DOWN(arm64_dma_phys_limit);
>   #endif
>   #ifdef CONFIG_ZONE_DMA32
> @@ -393,7 +394,6 @@ void __init arm64_memblock_init(void)
>                   */
>                  if (zone_dma_bits == ZONE_DMA_BITS_DEFAULT)
>                          zone_dma_bits = 32;
> -               arm64_dma_phys_limit = max_zone_phys(zone_dma_bits);
>          }
> 
>          if (IS_ENABLED(CONFIG_ZONE_DMA32))
> 
> 

Thanks for taking the time to look at this!

Regards,
Nicolas


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 156 bytes --]

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711
  2020-10-09  8:36                 ` Nicolas Saenz Julienne
@ 2020-10-09  9:13                   ` Ard Biesheuvel
  2020-10-09 13:33                     ` Nicolas Saenz Julienne
  2020-10-09 15:24                     ` Lorenzo Pieralisi
  0 siblings, 2 replies; 31+ messages in thread
From: Ard Biesheuvel @ 2020-10-09  9:13 UTC (permalink / raw)
  To: Nicolas Saenz Julienne
  Cc: open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Will Deacon, Catalin Marinas, Linux Kernel Mailing List,
	Linux Memory Management List, iommu, Rob Herring,
	linux-rpi-kernel, Frank Rowand, Christoph Hellwig, Linux ARM,
	Robin Murphy

On Fri, 9 Oct 2020 at 10:36, Nicolas Saenz Julienne
<nsaenzjulienne@suse.de> wrote:
>
> On Fri, 2020-10-09 at 09:37 +0200, Ard Biesheuvel wrote:
> > On Fri, 9 Oct 2020 at 09:11, Christoph Hellwig <hch@lst.de> wrote:
> > > On Thu, Oct 08, 2020 at 12:05:25PM +0200, Nicolas Saenz Julienne wrote:
> > > > Sadly I just realised that the series is incomplete, we have RPi4 users that
> > > > want to boot unsing ACPI, and this series would break things for them. I'll
> > > > have a word with them to see what we can do for their use-case.
> > >
> > > Stupid question:  why do these users insist on a totally unsuitable
> > > interface? And why would we as Linux developers care to support such
> > > a aims?
> >
> > The point is really whether we want to revert changes in Linux that
> > made both DT and ACPI boot work without quirks on RPi4.
>
> Well, and broke a big amount of devices that were otherwise fine.
>

Yeah that was unfortunate.

> > Having to check the RPi4 compatible string or OEM id in core init code is
> > awful, regardless of whether you boot via ACPI or via DT.
> >
> > The problem with this hardware is that it uses a DMA mask which is
> > narrower than 32, and the arm64 kernel is simply not set up to deal
> > with that at all. On DT, we have DMA ranges properties and the likes
> > to describe such limitations, on ACPI we have _DMA methods as well as
> > DMA range attributes in the IORT, both of which are now handled
> > correctly. So all the information is there, we just have to figure out
> > how to consume it early on.
>
> Is it worth the effort just for a single board? I don't know about ACPI but
> parsing dma-ranges that early at boot time is not trivial. My intuition tells
> me that it'd be even harder for ACPI, being a more complex data structure.
>

Yes, it will be harder, especially for the _DMA methods.

> > Interestingly, this limitation always existed in the SoC, but it
> > wasn't until they started shipping it with more than 1 GB of DRAM that
> > it became a problem. This means issues like this could resurface in
> > the future with existing SoCs when they get shipped with more memory,
> > and so I would prefer fixing this in a generic way.
>
> Actually what I proposed here is pretty generic. Specially from arm64's
> perspective. We call early_init_dt_scan(), which sets up zone_dma_bits based on
> whatever it finds in DT. Both those operations are architecture independent.
> arm64 arch code doesn't care about the logic involved in ascertaining
> zone_dma_bits. I get that the last step isn't generic. But it's all setup so as
> to make it as such whenever it's worth the effort.
>

The problem is that, while we are providing a full description of the
SoC's capabilities, we short circuit this by inserting knowledge into
the code (that is shared between all DT architectures) that
"brcm,bcm2711" is special, and needs a DMA zone override.

I think for ACPI boot, we might be able to work around this by cold
plugging the memory above 1 GB, but I have to double check whether it
won't get pulled into ZONE_DMA32 anyway (unless anyone can answer that
for me here from the top of their head)
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711
  2020-10-09  9:13                   ` Ard Biesheuvel
@ 2020-10-09 13:33                     ` Nicolas Saenz Julienne
  2020-10-09 15:24                     ` Lorenzo Pieralisi
  1 sibling, 0 replies; 31+ messages in thread
From: Nicolas Saenz Julienne @ 2020-10-09 13:33 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Will Deacon, Catalin Marinas, Linux Kernel Mailing List,
	Linux Memory Management List, iommu, Rob Herring,
	linux-rpi-kernel, Frank Rowand, Christoph Hellwig, Linux ARM,
	Robin Murphy

[-- Attachment #1.1: Type: text/plain, Size: 4373 bytes --]

On Fri, 2020-10-09 at 11:13 +0200, Ard Biesheuvel wrote:
> On Fri, 9 Oct 2020 at 10:36, Nicolas Saenz Julienne
> <nsaenzjulienne@suse.de> wrote:
> > On Fri, 2020-10-09 at 09:37 +0200, Ard Biesheuvel wrote:
> > > On Fri, 9 Oct 2020 at 09:11, Christoph Hellwig <hch@lst.de> wrote:
> > > > On Thu, Oct 08, 2020 at 12:05:25PM +0200, Nicolas Saenz Julienne wrote:
> > > > > Sadly I just realised that the series is incomplete, we have RPi4 users that
> > > > > want to boot unsing ACPI, and this series would break things for them. I'll
> > > > > have a word with them to see what we can do for their use-case.
> > > > 
> > > > Stupid question:  why do these users insist on a totally unsuitable
> > > > interface? And why would we as Linux developers care to support such
> > > > a aims?
> > > 
> > > The point is really whether we want to revert changes in Linux that
> > > made both DT and ACPI boot work without quirks on RPi4.
> > 
> > Well, and broke a big amount of devices that were otherwise fine.
> > 
> 
> Yeah that was unfortunate.
> 
> > > Having to check the RPi4 compatible string or OEM id in core init code is
> > > awful, regardless of whether you boot via ACPI or via DT.
> > > 
> > > The problem with this hardware is that it uses a DMA mask which is
> > > narrower than 32, and the arm64 kernel is simply not set up to deal
> > > with that at all. On DT, we have DMA ranges properties and the likes
> > > to describe such limitations, on ACPI we have _DMA methods as well as
> > > DMA range attributes in the IORT, both of which are now handled
> > > correctly. So all the information is there, we just have to figure out
> > > how to consume it early on.
> > 
> > Is it worth the effort just for a single board? I don't know about ACPI but
> > parsing dma-ranges that early at boot time is not trivial. My intuition tells
> > me that it'd be even harder for ACPI, being a more complex data structure.
> > 
> 
> Yes, it will be harder, especially for the _DMA methods.
> 
> > > Interestingly, this limitation always existed in the SoC, but it
> > > wasn't until they started shipping it with more than 1 GB of DRAM that
> > > it became a problem. This means issues like this could resurface in
> > > the future with existing SoCs when they get shipped with more memory,
> > > and so I would prefer fixing this in a generic way.
> > 
> > Actually what I proposed here is pretty generic. Specially from arm64's
> > perspective. We call early_init_dt_scan(), which sets up zone_dma_bits based on
> > whatever it finds in DT. Both those operations are architecture independent.
> > arm64 arch code doesn't care about the logic involved in ascertaining
> > zone_dma_bits. I get that the last step isn't generic. But it's all setup so as
> > to make it as such whenever it's worth the effort.
> > 
> 
> The problem is that, while we are providing a full description of the
> SoC's capabilities, we short circuit this by inserting knowledge into
> the code (that is shared between all DT architectures) that
> "brcm,bcm2711" is special, and needs a DMA zone override.

Yes I understand this and I sympathize with it, not the most beautiful thing
out there :). But that's only half the issue, as I said, implementing this
early at boot time is a tangible amount of work and a burden to maintain just
for one board. So this is the compromise we discussed with the DT maintainer
(RobH). The series sets things up so as to be able to implement the right
thing transparently to arm64's architecture when deemed worth the effort.

Ultimately, if you're worried about inserting knowledge into the code, aren't
we doing that, in a more extreme way, when imposing an extra unwarranted zone
to the whole arm64 ecosystem?

Note that I'm more that happy to work on alternative solutions, but let's first
settle on what would be acceptable to everyone.

> I think for ACPI boot, we might be able to work around this by cold
> plugging the memory above 1 GB, but I have to double check whether it
> won't get pulled into ZONE_DMA32 anyway (unless anyone can answer that
> for me here from the top of their head)

Don't know much about what ACPI memory cold plugging involves, but we'll still
need a proper ZONE_DMA32 (i.e. spanning the whole 32-bit address space) for
RPi4.

Regards,
Nicolas


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 156 bytes --]

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711
  2020-10-09  9:13                   ` Ard Biesheuvel
  2020-10-09 13:33                     ` Nicolas Saenz Julienne
@ 2020-10-09 15:24                     ` Lorenzo Pieralisi
  2020-10-09 16:23                       ` Ard Biesheuvel
  1 sibling, 1 reply; 31+ messages in thread
From: Lorenzo Pieralisi @ 2020-10-09 15:24 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Linux ARM,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Frank Rowand, Catalin Marinas, Linux Kernel Mailing List,
	Linux Memory Management List, iommu, Rob Herring,
	linux-rpi-kernel, Will Deacon, Christoph Hellwig, Robin Murphy

On Fri, Oct 09, 2020 at 11:13:59AM +0200, Ard Biesheuvel wrote:
> On Fri, 9 Oct 2020 at 10:36, Nicolas Saenz Julienne
> <nsaenzjulienne@suse.de> wrote:
> >
> > On Fri, 2020-10-09 at 09:37 +0200, Ard Biesheuvel wrote:
> > > On Fri, 9 Oct 2020 at 09:11, Christoph Hellwig <hch@lst.de> wrote:
> > > > On Thu, Oct 08, 2020 at 12:05:25PM +0200, Nicolas Saenz Julienne wrote:
> > > > > Sadly I just realised that the series is incomplete, we have RPi4 users that
> > > > > want to boot unsing ACPI, and this series would break things for them. I'll
> > > > > have a word with them to see what we can do for their use-case.
> > > >
> > > > Stupid question:  why do these users insist on a totally unsuitable
> > > > interface? And why would we as Linux developers care to support such
> > > > a aims?
> > >
> > > The point is really whether we want to revert changes in Linux that
> > > made both DT and ACPI boot work without quirks on RPi4.
> >
> > Well, and broke a big amount of devices that were otherwise fine.
> >
> 
> Yeah that was unfortunate.
> 
> > > Having to check the RPi4 compatible string or OEM id in core init code is
> > > awful, regardless of whether you boot via ACPI or via DT.
> > >
> > > The problem with this hardware is that it uses a DMA mask which is
> > > narrower than 32, and the arm64 kernel is simply not set up to deal
> > > with that at all. On DT, we have DMA ranges properties and the likes
> > > to describe such limitations, on ACPI we have _DMA methods as well as
> > > DMA range attributes in the IORT, both of which are now handled
> > > correctly. So all the information is there, we just have to figure out
> > > how to consume it early on.
> >
> > Is it worth the effort just for a single board? I don't know about ACPI but
> > parsing dma-ranges that early at boot time is not trivial. My intuition tells
> > me that it'd be even harder for ACPI, being a more complex data structure.
> >
> 
> Yes, it will be harder, especially for the _DMA methods.
> 
> > > Interestingly, this limitation always existed in the SoC, but it
> > > wasn't until they started shipping it with more than 1 GB of DRAM that
> > > it became a problem. This means issues like this could resurface in
> > > the future with existing SoCs when they get shipped with more memory,
> > > and so I would prefer fixing this in a generic way.
> >
> > Actually what I proposed here is pretty generic. Specially from arm64's
> > perspective. We call early_init_dt_scan(), which sets up zone_dma_bits based on
> > whatever it finds in DT. Both those operations are architecture independent.
> > arm64 arch code doesn't care about the logic involved in ascertaining
> > zone_dma_bits. I get that the last step isn't generic. But it's all setup so as
> > to make it as such whenever it's worth the effort.
> >
> 
> The problem is that, while we are providing a full description of the
> SoC's capabilities, we short circuit this by inserting knowledge into
> the code (that is shared between all DT architectures) that
> "brcm,bcm2711" is special, and needs a DMA zone override.
> 
> I think for ACPI boot, we might be able to work around this by cold
> plugging the memory above 1 GB, but I have to double check whether it
> won't get pulled into ZONE_DMA32 anyway (unless anyone can answer that
> for me here from the top of their head)

Is this information that we can infer from IORT nodes and make it
generic (ie make a DMA limit out of all IORT nodes allowed ranges) ?

We can move this check to IORT code and call it from arm64 if it
can be made to work.

Thanks,
Lorenzo
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711
  2020-10-09 15:24                     ` Lorenzo Pieralisi
@ 2020-10-09 16:23                       ` Ard Biesheuvel
  2020-10-09 17:10                         ` Catalin Marinas
  0 siblings, 1 reply; 31+ messages in thread
From: Ard Biesheuvel @ 2020-10-09 16:23 UTC (permalink / raw)
  To: Lorenzo Pieralisi
  Cc: Linux ARM,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Frank Rowand, Catalin Marinas, Linux Kernel Mailing List,
	Linux Memory Management List, iommu, Rob Herring,
	linux-rpi-kernel, Will Deacon, Christoph Hellwig, Robin Murphy

On Fri, 9 Oct 2020 at 17:24, Lorenzo Pieralisi
<lorenzo.pieralisi@arm.com> wrote:
>
> On Fri, Oct 09, 2020 at 11:13:59AM +0200, Ard Biesheuvel wrote:
> > On Fri, 9 Oct 2020 at 10:36, Nicolas Saenz Julienne
> > <nsaenzjulienne@suse.de> wrote:
> > >
> > > On Fri, 2020-10-09 at 09:37 +0200, Ard Biesheuvel wrote:
> > > > On Fri, 9 Oct 2020 at 09:11, Christoph Hellwig <hch@lst.de> wrote:
> > > > > On Thu, Oct 08, 2020 at 12:05:25PM +0200, Nicolas Saenz Julienne wrote:
> > > > > > Sadly I just realised that the series is incomplete, we have RPi4 users that
> > > > > > want to boot unsing ACPI, and this series would break things for them. I'll
> > > > > > have a word with them to see what we can do for their use-case.
> > > > >
> > > > > Stupid question:  why do these users insist on a totally unsuitable
> > > > > interface? And why would we as Linux developers care to support such
> > > > > a aims?
> > > >
> > > > The point is really whether we want to revert changes in Linux that
> > > > made both DT and ACPI boot work without quirks on RPi4.
> > >
> > > Well, and broke a big amount of devices that were otherwise fine.
> > >
> >
> > Yeah that was unfortunate.
> >
> > > > Having to check the RPi4 compatible string or OEM id in core init code is
> > > > awful, regardless of whether you boot via ACPI or via DT.
> > > >
> > > > The problem with this hardware is that it uses a DMA mask which is
> > > > narrower than 32, and the arm64 kernel is simply not set up to deal
> > > > with that at all. On DT, we have DMA ranges properties and the likes
> > > > to describe such limitations, on ACPI we have _DMA methods as well as
> > > > DMA range attributes in the IORT, both of which are now handled
> > > > correctly. So all the information is there, we just have to figure out
> > > > how to consume it early on.
> > >
> > > Is it worth the effort just for a single board? I don't know about ACPI but
> > > parsing dma-ranges that early at boot time is not trivial. My intuition tells
> > > me that it'd be even harder for ACPI, being a more complex data structure.
> > >
> >
> > Yes, it will be harder, especially for the _DMA methods.
> >
> > > > Interestingly, this limitation always existed in the SoC, but it
> > > > wasn't until they started shipping it with more than 1 GB of DRAM that
> > > > it became a problem. This means issues like this could resurface in
> > > > the future with existing SoCs when they get shipped with more memory,
> > > > and so I would prefer fixing this in a generic way.
> > >
> > > Actually what I proposed here is pretty generic. Specially from arm64's
> > > perspective. We call early_init_dt_scan(), which sets up zone_dma_bits based on
> > > whatever it finds in DT. Both those operations are architecture independent.
> > > arm64 arch code doesn't care about the logic involved in ascertaining
> > > zone_dma_bits. I get that the last step isn't generic. But it's all setup so as
> > > to make it as such whenever it's worth the effort.
> > >
> >
> > The problem is that, while we are providing a full description of the
> > SoC's capabilities, we short circuit this by inserting knowledge into
> > the code (that is shared between all DT architectures) that
> > "brcm,bcm2711" is special, and needs a DMA zone override.
> >
> > I think for ACPI boot, we might be able to work around this by cold
> > plugging the memory above 1 GB, but I have to double check whether it
> > won't get pulled into ZONE_DMA32 anyway (unless anyone can answer that
> > for me here from the top of their head)
>
> Is this information that we can infer from IORT nodes and make it
> generic (ie make a DMA limit out of all IORT nodes allowed ranges) ?
>

Yes, that seems feasible.

> We can move this check to IORT code and call it from arm64 if it
> can be made to work.
>

Finding the smallest value in the IORT, and assigning it to
zone_dma_bits if it is < 32 should be easy. But as I understand it,
having these separate DMA and DMA32 zones is what breaks kdump, no? So
how is this going to fix the underlying issue?

Nicolas, were there any other reported regressions caused by the
introduction of ZONE_DMA?
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711
  2020-10-09 16:23                       ` Ard Biesheuvel
@ 2020-10-09 17:10                         ` Catalin Marinas
  2020-10-10 10:36                           ` Ard Biesheuvel
  2020-10-12  6:47                           ` Christoph Hellwig
  0 siblings, 2 replies; 31+ messages in thread
From: Catalin Marinas @ 2020-10-09 17:10 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Linux ARM,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Frank Rowand, Linux Kernel Mailing List,
	Linux Memory Management List, iommu, Rob Herring,
	linux-rpi-kernel, Will Deacon, Christoph Hellwig, Robin Murphy

On Fri, Oct 09, 2020 at 06:23:06PM +0200, Ard Biesheuvel wrote:
> On Fri, 9 Oct 2020 at 17:24, Lorenzo Pieralisi
> <lorenzo.pieralisi@arm.com> wrote:
> > We can move this check to IORT code and call it from arm64 if it
> > can be made to work.
> 
> Finding the smallest value in the IORT, and assigning it to
> zone_dma_bits if it is < 32 should be easy. But as I understand it,
> having these separate DMA and DMA32 zones is what breaks kdump, no? So
> how is this going to fix the underlying issue?

If zone_dma_bits is 32, ZONE_DMA32 disappears into ZONE_DMA (GFP_DMA32
allocations fall back to ZONE_DMA).

kdump wants DMA-able memory and, without a 30-bit ZONE_DMA, that would
be the bottom 32-bit. With the introduction of ZONE_DMA, this suddenly
became 1GB. We could change kdump to allocate ZONE_DMA32 but this one
may also be small as it lost 1GB to ZONE_DMA. However, the kdump kernel
would need to be rebuilt without ZONE_DMA since it won't have any. IIRC
(it's been a while since I looked), the kdump allocation couldn't span
multiple zones.

In a separate thread, we try to fix kdump to use allocations above 4G as
a fallback but this only fixes platforms with enough RAM (and maybe it's
only those platforms that care about kdump).

-- 
Catalin
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711
  2020-10-09 17:10                         ` Catalin Marinas
@ 2020-10-10 10:36                           ` Ard Biesheuvel
  2020-10-10 10:53                             ` Nicolas Saenz Julienne
  2020-10-12  6:47                           ` Christoph Hellwig
  1 sibling, 1 reply; 31+ messages in thread
From: Ard Biesheuvel @ 2020-10-10 10:36 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Linux ARM,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Frank Rowand, Linux Kernel Mailing List,
	Linux Memory Management List, iommu, Rob Herring,
	linux-rpi-kernel, Will Deacon, Christoph Hellwig, Robin Murphy

On Fri, 9 Oct 2020 at 19:10, Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> On Fri, Oct 09, 2020 at 06:23:06PM +0200, Ard Biesheuvel wrote:
> > On Fri, 9 Oct 2020 at 17:24, Lorenzo Pieralisi
> > <lorenzo.pieralisi@arm.com> wrote:
> > > We can move this check to IORT code and call it from arm64 if it
> > > can be made to work.
> >
> > Finding the smallest value in the IORT, and assigning it to
> > zone_dma_bits if it is < 32 should be easy. But as I understand it,
> > having these separate DMA and DMA32 zones is what breaks kdump, no? So
> > how is this going to fix the underlying issue?
>
> If zone_dma_bits is 32, ZONE_DMA32 disappears into ZONE_DMA (GFP_DMA32
> allocations fall back to ZONE_DMA).
>
> kdump wants DMA-able memory and, without a 30-bit ZONE_DMA, that would
> be the bottom 32-bit. With the introduction of ZONE_DMA, this suddenly
> became 1GB. We could change kdump to allocate ZONE_DMA32 but this one
> may also be small as it lost 1GB to ZONE_DMA. However, the kdump kernel
> would need to be rebuilt without ZONE_DMA since it won't have any. IIRC
> (it's been a while since I looked), the kdump allocation couldn't span
> multiple zones.
>
> In a separate thread, we try to fix kdump to use allocations above 4G as
> a fallback but this only fixes platforms with enough RAM (and maybe it's
> only those platforms that care about kdump).
>

One thing that strikes me as odd is that we are applying the same
shifting logic to ZONE_DMA as we are applying to ZONE_DMA32, i.e., if
DRAM starts outside of the zone, it is shifted upwards.

On a typical ARM box, this gives me

[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000000080000000-0x00000000bfffffff]
[    0.000000]   DMA32    [mem 0x00000000c0000000-0x00000000ffffffff]
[    0.000000]   Normal   [mem 0x0000000100000000-0x0000000fffffffff]

i.e., the 30-bit addressable range has bit 31 set, which is weird.

I wonder if it wouldn't be better (and less problematic in the general
case) to drop this logic for ZONE_DMA, and simply let it remain empty
unless there is really some memory there.
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711
  2020-10-10 10:36                           ` Ard Biesheuvel
@ 2020-10-10 10:53                             ` Nicolas Saenz Julienne
  2020-10-10 12:38                               ` Catalin Marinas
  0 siblings, 1 reply; 31+ messages in thread
From: Nicolas Saenz Julienne @ 2020-10-10 10:53 UTC (permalink / raw)
  To: Ard Biesheuvel, Catalin Marinas
  Cc: open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Frank Rowand, Linux Kernel Mailing List,
	Linux Memory Management List, iommu, Rob Herring,
	linux-rpi-kernel, Will Deacon, Christoph Hellwig, Linux ARM,
	Robin Murphy

[-- Attachment #1.1: Type: text/plain, Size: 2866 bytes --]

On Sat, 2020-10-10 at 12:36 +0200, Ard Biesheuvel wrote:
> On Fri, 9 Oct 2020 at 19:10, Catalin Marinas <catalin.marinas@arm.com> wrote:
> > On Fri, Oct 09, 2020 at 06:23:06PM +0200, Ard Biesheuvel wrote:
> > > On Fri, 9 Oct 2020 at 17:24, Lorenzo Pieralisi
> > > <lorenzo.pieralisi@arm.com> wrote:
> > > > We can move this check to IORT code and call it from arm64 if it
> > > > can be made to work.
> > > 
> > > Finding the smallest value in the IORT, and assigning it to
> > > zone_dma_bits if it is < 32 should be easy. But as I understand it,
> > > having these separate DMA and DMA32 zones is what breaks kdump, no? So
> > > how is this going to fix the underlying issue?
> > 
> > If zone_dma_bits is 32, ZONE_DMA32 disappears into ZONE_DMA (GFP_DMA32
> > allocations fall back to ZONE_DMA).
> > 
> > kdump wants DMA-able memory and, without a 30-bit ZONE_DMA, that would
> > be the bottom 32-bit. With the introduction of ZONE_DMA, this suddenly
> > became 1GB. We could change kdump to allocate ZONE_DMA32 but this one
> > may also be small as it lost 1GB to ZONE_DMA. However, the kdump kernel
> > would need to be rebuilt without ZONE_DMA since it won't have any. IIRC
> > (it's been a while since I looked), the kdump allocation couldn't span
> > multiple zones.
> > 
> > In a separate thread, we try to fix kdump to use allocations above 4G as
> > a fallback but this only fixes platforms with enough RAM (and maybe it's
> > only those platforms that care about kdump).
> > 
> 
> One thing that strikes me as odd is that we are applying the same
> shifting logic to ZONE_DMA as we are applying to ZONE_DMA32, i.e., if
> DRAM starts outside of the zone, it is shifted upwards.
> 
> On a typical ARM box, this gives me
> 
> [    0.000000] Zone ranges:
> [    0.000000]   DMA      [mem 0x0000000080000000-0x00000000bfffffff]
> [    0.000000]   DMA32    [mem 0x00000000c0000000-0x00000000ffffffff]
> [    0.000000]   Normal   [mem 0x0000000100000000-0x0000000fffffffff]
> 
> i.e., the 30-bit addressable range has bit 31 set, which is weird.

Yes I agree it's weird, and IMO plain useless. I sent a series this summer to
address this[1], which ultimately triggered the discussion we're having right
now.

Although with with your latest patch and the DT counterpart, we should be OK.
It would be weird for a HW description to define DMA constraints that are
impossible to reach on that system.

> I wonder if it wouldn't be better (and less problematic in the general
> case) to drop this logic for ZONE_DMA, and simply let it remain empty
> unless there is really some memory there.

From my experience, you can't have empty ZONE_DMA when enabled. Empty
ZONE_DMA32 is OK though. Although I'm sure it's something that can be changed.

Regards,
Nicolas

[1] https://lkml.org/lkml/2020/8/19/1022



[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 156 bytes --]

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711
  2020-10-10 10:53                             ` Nicolas Saenz Julienne
@ 2020-10-10 12:38                               ` Catalin Marinas
  0 siblings, 0 replies; 31+ messages in thread
From: Catalin Marinas @ 2020-10-10 12:38 UTC (permalink / raw)
  To: Nicolas Saenz Julienne
  Cc: open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Frank Rowand, Linux Kernel Mailing List, Christoph Hellwig,
	Linux Memory Management List, iommu, Rob Herring,
	linux-rpi-kernel, Will Deacon, Ard Biesheuvel, Linux ARM,
	Robin Murphy

On Sat, Oct 10, 2020 at 12:53:19PM +0200, Nicolas Saenz Julienne wrote:
> On Sat, 2020-10-10 at 12:36 +0200, Ard Biesheuvel wrote:
> > On Fri, 9 Oct 2020 at 19:10, Catalin Marinas <catalin.marinas@arm.com> wrote:
> > > On Fri, Oct 09, 2020 at 06:23:06PM +0200, Ard Biesheuvel wrote:
> > > > On Fri, 9 Oct 2020 at 17:24, Lorenzo Pieralisi
> > > > <lorenzo.pieralisi@arm.com> wrote:
> > > > > We can move this check to IORT code and call it from arm64 if it
> > > > > can be made to work.
> > > > 
> > > > Finding the smallest value in the IORT, and assigning it to
> > > > zone_dma_bits if it is < 32 should be easy. But as I understand it,
> > > > having these separate DMA and DMA32 zones is what breaks kdump, no? So
> > > > how is this going to fix the underlying issue?
> > > 
> > > If zone_dma_bits is 32, ZONE_DMA32 disappears into ZONE_DMA (GFP_DMA32
> > > allocations fall back to ZONE_DMA).
> > > 
> > > kdump wants DMA-able memory and, without a 30-bit ZONE_DMA, that would
> > > be the bottom 32-bit. With the introduction of ZONE_DMA, this suddenly
> > > became 1GB. We could change kdump to allocate ZONE_DMA32 but this one
> > > may also be small as it lost 1GB to ZONE_DMA. However, the kdump kernel
> > > would need to be rebuilt without ZONE_DMA since it won't have any. IIRC
> > > (it's been a while since I looked), the kdump allocation couldn't span
> > > multiple zones.
> > > 
> > > In a separate thread, we try to fix kdump to use allocations above 4G as
> > > a fallback but this only fixes platforms with enough RAM (and maybe it's
> > > only those platforms that care about kdump).
> > > 
> > 
> > One thing that strikes me as odd is that we are applying the same
> > shifting logic to ZONE_DMA as we are applying to ZONE_DMA32, i.e., if
> > DRAM starts outside of the zone, it is shifted upwards.
> > 
> > On a typical ARM box, this gives me
> > 
> > [    0.000000] Zone ranges:
> > [    0.000000]   DMA      [mem 0x0000000080000000-0x00000000bfffffff]
> > [    0.000000]   DMA32    [mem 0x00000000c0000000-0x00000000ffffffff]
> > [    0.000000]   Normal   [mem 0x0000000100000000-0x0000000fffffffff]
> > 
> > i.e., the 30-bit addressable range has bit 31 set, which is weird.
> 
> Yes I agree it's weird, and IMO plain useless. I sent a series this summer to
> address this[1], which ultimately triggered the discussion we're having right
> now.

I don't mind assuming that ZONE_DMA is always from pfn 0 (i.e. no
dma_offset for such constrained devices). But if ZONE_DMA32 is squeezed
out with ZONE_DMA extended to 4GB, it should allow non-zero upper 32
bits. IIRC we do have SoCs with RAM starting above 4GB.

However, your patch didn't completely solve the problem for non-RPi4
platforms as there's hardware with RAM starting at 0 that doesn't need
the 1GB ZONE_DMA. We may end up with a combination of the two
approaches.

> Although with with your latest patch and the DT counterpart, we should be OK.
> It would be weird for a HW description to define DMA constraints that are
> impossible to reach on that system.

I don't remember the difficulties with parsing a DT early for inferring
the ZONE_DMA requirements. Could we not check the dma-ranges property in
the soc node? I can see bcm2711.dtsi defines a 30-bit address range. We
are not interested in the absolute physical/bus addresses, just the
size to check whether it's smaller than 32-bit.

> > I wonder if it wouldn't be better (and less problematic in the general
> > case) to drop this logic for ZONE_DMA, and simply let it remain empty
> > unless there is really some memory there.
> 
> From my experience, you can't have empty ZONE_DMA when enabled. Empty
> ZONE_DMA32 is OK though. Although I'm sure it's something that can be changed.

Indeed, because we still have GFP_DMA around which can't fall back to
ZONE_DMA32 (well, unless CONFIG_ZONE_DMA is disabled).

-- 
Catalin
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711
  2020-10-09 17:10                         ` Catalin Marinas
  2020-10-10 10:36                           ` Ard Biesheuvel
@ 2020-10-12  6:47                           ` Christoph Hellwig
  2020-10-12  8:47                             ` Catalin Marinas
  1 sibling, 1 reply; 31+ messages in thread
From: Christoph Hellwig @ 2020-10-12  6:47 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Linux ARM,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Frank Rowand, Linux Kernel Mailing List, Christoph Hellwig,
	Linux Memory Management List, iommu, Rob Herring,
	linux-rpi-kernel, Will Deacon, Ard Biesheuvel, Robin Murphy

On Fri, Oct 09, 2020 at 06:10:52PM +0100, Catalin Marinas wrote:
> kdump wants DMA-able memory and,

DMAable by whom?  The only way to guranteed DMAable memory is to use
the DMA memory allocator(s) and pass a specific device to them.  Everyting
else is just fundamentally broken.  Note that even when device is not
DMAable we can still use swiotlb to access it.
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711
  2020-10-12  6:47                           ` Christoph Hellwig
@ 2020-10-12  8:47                             ` Catalin Marinas
  0 siblings, 0 replies; 31+ messages in thread
From: Catalin Marinas @ 2020-10-12  8:47 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linux ARM,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Frank Rowand, Linux Kernel Mailing List,
	Linux Memory Management List, iommu, Rob Herring,
	linux-rpi-kernel, Will Deacon, Ard Biesheuvel, Robin Murphy

On Mon, Oct 12, 2020 at 08:47:15AM +0200, Christoph Hellwig wrote:
> On Fri, Oct 09, 2020 at 06:10:52PM +0100, Catalin Marinas wrote:
> > kdump wants DMA-able memory and,
> 
> DMAable by whom?  The only way to guranteed DMAable memory is to use
> the DMA memory allocator(s) and pass a specific device to them.  Everyting
> else is just fundamentally broken.  Note that even when device is not
> DMAable we can still use swiotlb to access it.

What I meant is that the new kexec'ed kernel needs some memory in the
ZONE_DMA range, currently set to the bottom 30-bit even for platforms
that can cope with the whole 32-bit range (anything other than RPi4).
The memory range available to the kdump kernels is limited to what
reserve_crashkernel() allocated, which may not fit in the lower 1GB.

There are two ongoing threads (complementary):

1. Change the arm64 reserve_crashkernel() similar to x86 which allocates
   memory above 4G with a small block in the ZONE_DMA range.

2. Allow zone_dma_bits to be 32 for arm64 platforms other than RPi4.

The second point also fixes some regressions with CMA reservations that
could no longer fit in the lower 1GB.

-- 
Catalin
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, back to index

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-01 16:17 [PATCH 0/4] arm64: Default to 32-bit wide ZONE_DMA Nicolas Saenz Julienne
2020-10-01 16:17 ` [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711 Nicolas Saenz Julienne
2020-10-01 17:15   ` Catalin Marinas
2020-10-01 17:23     ` Catalin Marinas
2020-10-01 17:31       ` Nicolas Saenz Julienne
2020-10-01 20:02         ` Rob Herring
2020-10-02 11:55         ` Catalin Marinas
2020-10-08 10:05           ` Nicolas Saenz Julienne
2020-10-08 10:13             ` Catalin Marinas
2020-10-08 19:43               ` Ard Biesheuvel
2020-10-09  3:59                 ` Jeremy Linton
2020-10-09  8:37                   ` Nicolas Saenz Julienne
2020-10-09  7:10             ` Christoph Hellwig
2020-10-09  7:37               ` Ard Biesheuvel
2020-10-09  8:36                 ` Nicolas Saenz Julienne
2020-10-09  9:13                   ` Ard Biesheuvel
2020-10-09 13:33                     ` Nicolas Saenz Julienne
2020-10-09 15:24                     ` Lorenzo Pieralisi
2020-10-09 16:23                       ` Ard Biesheuvel
2020-10-09 17:10                         ` Catalin Marinas
2020-10-10 10:36                           ` Ard Biesheuvel
2020-10-10 10:53                             ` Nicolas Saenz Julienne
2020-10-10 12:38                               ` Catalin Marinas
2020-10-12  6:47                           ` Christoph Hellwig
2020-10-12  8:47                             ` Catalin Marinas
2020-10-02  9:05   ` kernel test robot
2020-10-01 16:17 ` [PATCH 2/4] dma-direct: Turn zone_dma_bits default value into a define Nicolas Saenz Julienne
2020-10-01 16:17 ` [PATCH 3/4] arm64: Default to 32-bit ZONE_DMA Nicolas Saenz Julienne
2020-10-01 17:19   ` Catalin Marinas
2020-10-01 16:17 ` [PATCH 4/4] mm: Update DMA zones description Nicolas Saenz Julienne
2020-10-01 17:19   ` Catalin Marinas

IOMMU Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-iommu/0 linux-iommu/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-iommu linux-iommu/ https://lore.kernel.org/linux-iommu \
		iommu@lists.linux-foundation.org
	public-inbox-index linux-iommu

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.linux-foundation.lists.iommu


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git