All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] x86_64: enable SWIOTLB if system has SRAT memory regions above MAX_DMA32_PFN
@ 2015-11-30 17:35 Igor Mammedov
  2015-12-01  2:45 ` Rik van Riel
  2015-12-04  8:20 ` Ingo Molnar
  0 siblings, 2 replies; 4+ messages in thread
From: Igor Mammedov @ 2015-11-30 17:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: tglx, mingo, hpa, konrad.wilk, rjw, lenb, imammedo, linux-acpi,
	akataria, fujita.tomonori, revers

when memory hotplug enabled system is booted with less
than 4GB of RAM and then later more RAM is hotplugged
32-bit devices stop functioning with following error:

 nommu_map_single: overflow 327b4f8c0+1522 of device mask ffffffff

the reason for this is that if x86_64 system were booted
with RAM less than 4GB, it doesn't enable SWIOTLB and
when memory is hotplugged beyond MAX_DMA32_PFN, devices
that expect 32-bit addresses can't handle 64-bit addresses.

Fix it by tracking max possible PFN when parsing
memory affinity structures from SRAT ACPI table and
enable SWIOTLB if there is hotpluggable memory
regions beyond MAX_DMA32_PFN.

It fixes KVM guests when they use emulated devices
(reproduces with ata_piix, e1000 and usb devices,
 RHBZ: 1275941, 1275977, 1271527)
It also fixes the HyperV, VMWare with emulated devices
which are affected by this issue as well.

Systems that have RAM less than 4GB and do not use
memory hotplug but still have hotplug regions in SRAT
(i.e. broken BIOS that can't disable mem hotplug)
can disable memory hotplug with 'acpi_no_memhotplug = 1'
to avoid automatic SWIOTLB initialization.

Tested on QEMU/KVM and HyperV.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
 arch/x86/include/asm/acpi.h    |  5 +++++
 arch/x86/kernel/pci-swiotlb.c  |  4 ++++
 arch/x86/mm/srat.c             | 15 +++++++++++++++
 drivers/acpi/acpi_memhotplug.c |  2 +-
 4 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/acpi.h b/arch/x86/include/asm/acpi.h
index 94c18eb..53d7951 100644
--- a/arch/x86/include/asm/acpi.h
+++ b/arch/x86/include/asm/acpi.h
@@ -147,6 +147,7 @@ static inline void disable_acpi(void) { }
 #ifdef CONFIG_ACPI_NUMA
 extern int acpi_numa;
 extern int x86_acpi_numa_init(void);
+unsigned long acpi_get_max_possible_pfn(void);
 #endif /* CONFIG_ACPI_NUMA */
 
 #define acpi_unlazy_tlb(x)	leave_mm(x)
@@ -170,4 +171,8 @@ static inline pgprot_t arch_apei_get_mem_attribute(phys_addr_t addr)
 }
 #endif
 
+#ifdef CONFIG_ACPI_HOTPLUG_MEMORY
+extern bool acpi_no_memhotplug;
+#endif /* CONFIG_ACPI_HOTPLUG_MEMORY */
+
 #endif /* _ASM_X86_ACPI_H */
diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index adf0392..61d5ba5 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -88,7 +88,11 @@ int __init pci_swiotlb_detect_4gb(void)
 {
 	/* don't initialize swiotlb if iommu=off (no_iommu=1) */
 #ifdef CONFIG_X86_64
+#ifdef CONFIG_ACPI_NUMA
+	if (!no_iommu && acpi_get_max_possible_pfn() > MAX_DMA32_PFN)
+#else
 	if (!no_iommu && max_pfn > MAX_DMA32_PFN)
+#endif
 		swiotlb = 1;
 #endif
 	return swiotlb;
diff --git a/arch/x86/mm/srat.c b/arch/x86/mm/srat.c
index c2aea63..21b33f0 100644
--- a/arch/x86/mm/srat.c
+++ b/arch/x86/mm/srat.c
@@ -153,10 +153,20 @@ acpi_numa_processor_affinity_init(struct acpi_srat_cpu_affinity *pa)
 
 #ifdef CONFIG_MEMORY_HOTPLUG
 static inline int save_add_info(void) {return 1;}
+static unsigned long max_possible_pfn __initdata;
 #else
 static inline int save_add_info(void) {return 0;}
 #endif
 
+unsigned long __init acpi_get_max_possible_pfn(void)
+{
+#ifdef CONFIG_ACPI_HOTPLUG_MEMORY
+	if (!acpi_no_memhotplug)
+		return max_possible_pfn;
+#endif
+	return max_pfn;
+}
+
 /* Callback for parsing of the Proximity Domain <-> Memory Area mappings */
 int __init
 acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
@@ -203,6 +213,11 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
 		pr_warn("SRAT: Failed to mark hotplug range [mem %#010Lx-%#010Lx] in memblock\n",
 			(unsigned long long)start, (unsigned long long)end - 1);
 
+#ifdef CONFIG_MEMORY_HOTPLUG
+	if (max_possible_pfn < PFN_UP(end - 1))
+		max_possible_pfn = PFN_UP(end - 1);
+#endif
+
 	return 0;
 out_err_bad_srat:
 	bad_srat();
diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
index 6b0d3ef..ae38f57 100644
--- a/drivers/acpi/acpi_memhotplug.c
+++ b/drivers/acpi/acpi_memhotplug.c
@@ -357,7 +357,7 @@ static void acpi_memory_device_remove(struct acpi_device *device)
 	acpi_memory_device_free(mem_device);
 }
 
-static bool __initdata acpi_no_memhotplug;
+bool __initdata acpi_no_memhotplug;
 
 void __init acpi_memory_hotplug_init(void)
 {
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] x86_64: enable SWIOTLB if system has SRAT memory regions above MAX_DMA32_PFN
  2015-11-30 17:35 [PATCH] x86_64: enable SWIOTLB if system has SRAT memory regions above MAX_DMA32_PFN Igor Mammedov
@ 2015-12-01  2:45 ` Rik van Riel
  2015-12-04  8:20 ` Ingo Molnar
  1 sibling, 0 replies; 4+ messages in thread
From: Rik van Riel @ 2015-12-01  2:45 UTC (permalink / raw)
  To: Igor Mammedov, linux-kernel
  Cc: tglx, mingo, hpa, konrad.wilk, rjw, lenb, linux-acpi, akataria,
	fujita.tomonori, revers

On 11/30/2015 12:35 PM, Igor Mammedov wrote:
> when memory hotplug enabled system is booted with less
> than 4GB of RAM and then later more RAM is hotplugged
> 32-bit devices stop functioning with following error:
> 
>  nommu_map_single: overflow 327b4f8c0+1522 of device mask ffffffff
> 
> the reason for this is that if x86_64 system were booted
> with RAM less than 4GB, it doesn't enable SWIOTLB and
> when memory is hotplugged beyond MAX_DMA32_PFN, devices
> that expect 32-bit addresses can't handle 64-bit addresses.
> 
> Fix it by tracking max possible PFN when parsing
> memory affinity structures from SRAT ACPI table and
> enable SWIOTLB if there is hotpluggable memory
> regions beyond MAX_DMA32_PFN.
> 
> It fixes KVM guests when they use emulated devices
> (reproduces with ata_piix, e1000 and usb devices,
>  RHBZ: 1275941, 1275977, 1271527)
> It also fixes the HyperV, VMWare with emulated devices
> which are affected by this issue as well.
> 
> Systems that have RAM less than 4GB and do not use
> memory hotplug but still have hotplug regions in SRAT
> (i.e. broken BIOS that can't disable mem hotplug)
> can disable memory hotplug with 'acpi_no_memhotplug = 1'
> to avoid automatic SWIOTLB initialization.
> 
> Tested on QEMU/KVM and HyperV.
> 
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>

Acked-by: Rik van Riel <riel@redhat.com>




-- 
All rights reversed

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] x86_64: enable SWIOTLB if system has SRAT memory regions above MAX_DMA32_PFN
  2015-11-30 17:35 [PATCH] x86_64: enable SWIOTLB if system has SRAT memory regions above MAX_DMA32_PFN Igor Mammedov
  2015-12-01  2:45 ` Rik van Riel
@ 2015-12-04  8:20 ` Ingo Molnar
  2015-12-04 11:08   ` Igor Mammedov
  1 sibling, 1 reply; 4+ messages in thread
From: Ingo Molnar @ 2015-12-04  8:20 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: linux-kernel, tglx, mingo, hpa, konrad.wilk, rjw, lenb,
	linux-acpi, akataria, fujita.tomonori, revers


* Igor Mammedov <imammedo@redhat.com> wrote:

> diff --git a/arch/x86/include/asm/acpi.h b/arch/x86/include/asm/acpi.h
> index 94c18eb..53d7951 100644
> --- a/arch/x86/include/asm/acpi.h
> +++ b/arch/x86/include/asm/acpi.h
> @@ -147,6 +147,7 @@ static inline void disable_acpi(void) { }
>  #ifdef CONFIG_ACPI_NUMA
>  extern int acpi_numa;
>  extern int x86_acpi_numa_init(void);
> +unsigned long acpi_get_max_possible_pfn(void);
>  #endif /* CONFIG_ACPI_NUMA */
>  
>  #define acpi_unlazy_tlb(x)	leave_mm(x)
> @@ -170,4 +171,8 @@ static inline pgprot_t arch_apei_get_mem_attribute(phys_addr_t addr)
>  }
>  #endif
>  
> +#ifdef CONFIG_ACPI_HOTPLUG_MEMORY
> +extern bool acpi_no_memhotplug;
> +#endif /* CONFIG_ACPI_HOTPLUG_MEMORY */
> +
>  #endif /* _ASM_X86_ACPI_H */
> diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
> index adf0392..61d5ba5 100644
> --- a/arch/x86/kernel/pci-swiotlb.c
> +++ b/arch/x86/kernel/pci-swiotlb.c
> @@ -88,7 +88,11 @@ int __init pci_swiotlb_detect_4gb(void)
>  {
>  	/* don't initialize swiotlb if iommu=off (no_iommu=1) */
>  #ifdef CONFIG_X86_64
> +#ifdef CONFIG_ACPI_NUMA
> +	if (!no_iommu && acpi_get_max_possible_pfn() > MAX_DMA32_PFN)
> +#else
>  	if (!no_iommu && max_pfn > MAX_DMA32_PFN)
> +#endif
>  		swiotlb = 1;
>  #endif
>  	return swiotlb;
> diff --git a/arch/x86/mm/srat.c b/arch/x86/mm/srat.c
> index c2aea63..21b33f0 100644
> --- a/arch/x86/mm/srat.c
> +++ b/arch/x86/mm/srat.c
> @@ -153,10 +153,20 @@ acpi_numa_processor_affinity_init(struct acpi_srat_cpu_affinity *pa)
>  
>  #ifdef CONFIG_MEMORY_HOTPLUG
>  static inline int save_add_info(void) {return 1;}
> +static unsigned long max_possible_pfn __initdata;
>  #else
>  static inline int save_add_info(void) {return 0;}
>  #endif
>  
> +unsigned long __init acpi_get_max_possible_pfn(void)
> +{
> +#ifdef CONFIG_ACPI_HOTPLUG_MEMORY
> +	if (!acpi_no_memhotplug)
> +		return max_possible_pfn;
> +#endif
> +	return max_pfn;
> +}
> +
>  /* Callback for parsing of the Proximity Domain <-> Memory Area mappings */
>  int __init
>  acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
> @@ -203,6 +213,11 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
>  		pr_warn("SRAT: Failed to mark hotplug range [mem %#010Lx-%#010Lx] in memblock\n",
>  			(unsigned long long)start, (unsigned long long)end - 1);
>  
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +	if (max_possible_pfn < PFN_UP(end - 1))
> +		max_possible_pfn = PFN_UP(end - 1);
> +#endif
> +
>  	return 0;
>  out_err_bad_srat:
>  	bad_srat();
> diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
> index 6b0d3ef..ae38f57 100644
> --- a/drivers/acpi/acpi_memhotplug.c
> +++ b/drivers/acpi/acpi_memhotplug.c
> @@ -357,7 +357,7 @@ static void acpi_memory_device_remove(struct acpi_device *device)
>  	acpi_memory_device_free(mem_device);
>  }
>  
> -static bool __initdata acpi_no_memhotplug;
> +bool __initdata acpi_no_memhotplug;
>  
>  void __init acpi_memory_hotplug_init(void)
>  {

So I don't disagree with the fix in principle, but the implementation here is 
rather ugly - it spreads new non-obvious #ifdefs across various critical parts of 
the kernel.

For example this:

>  #ifdef CONFIG_X86_64
> +#ifdef CONFIG_ACPI_NUMA
> +	if (!no_iommu && acpi_get_max_possible_pfn() > MAX_DMA32_PFN)
> +#else
>  	if (!no_iommu && max_pfn > MAX_DMA32_PFN)
> +#endif
>  		swiotlb = 1;
>  #endif

could be cleaned up by introducing a proper max_possible_pfn variable, and setting 
it from the ACPI code - instead of exporting acpi_get_max_possible_pfn().

Another pattern is:

> +#ifdef CONFIG_ACPI_HOTPLUG_MEMORY
> +	if (!acpi_no_memhotplug)
> +		return max_possible_pfn;
> +#endif

this should be driven from the acpi_no_memhotplug knob, instead of spreading 
acpi_no_memhotplug uses to other callsites.

Furthermore, please split these various steps up into multiple steps (and first do 
the preparatory changes, then fix the bug in the end) - to make it easier to 
bisect and analyze if we regress existing functionality somewhere.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] x86_64: enable SWIOTLB if system has SRAT memory regions above MAX_DMA32_PFN
  2015-12-04  8:20 ` Ingo Molnar
@ 2015-12-04 11:08   ` Igor Mammedov
  0 siblings, 0 replies; 4+ messages in thread
From: Igor Mammedov @ 2015-12-04 11:08 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, tglx, mingo, hpa, konrad.wilk, rjw, lenb,
	linux-acpi, akataria, fujita.tomonori, revers

On Fri, 4 Dec 2015 09:20:50 +0100
Ingo Molnar <mingo@kernel.org> wrote:

> 
> * Igor Mammedov <imammedo@redhat.com> wrote:
> 
> > diff --git a/arch/x86/include/asm/acpi.h b/arch/x86/include/asm/acpi.h
> > index 94c18eb..53d7951 100644
> > --- a/arch/x86/include/asm/acpi.h
> > +++ b/arch/x86/include/asm/acpi.h
> > @@ -147,6 +147,7 @@ static inline void disable_acpi(void) { }
> >  #ifdef CONFIG_ACPI_NUMA
> >  extern int acpi_numa;
> >  extern int x86_acpi_numa_init(void);
> > +unsigned long acpi_get_max_possible_pfn(void);
> >  #endif /* CONFIG_ACPI_NUMA */
> >  
> >  #define acpi_unlazy_tlb(x)	leave_mm(x)
> > @@ -170,4 +171,8 @@ static inline pgprot_t arch_apei_get_mem_attribute(phys_addr_t addr)
> >  }
> >  #endif
> >  
> > +#ifdef CONFIG_ACPI_HOTPLUG_MEMORY
> > +extern bool acpi_no_memhotplug;
> > +#endif /* CONFIG_ACPI_HOTPLUG_MEMORY */
> > +
> >  #endif /* _ASM_X86_ACPI_H */
> > diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
> > index adf0392..61d5ba5 100644
> > --- a/arch/x86/kernel/pci-swiotlb.c
> > +++ b/arch/x86/kernel/pci-swiotlb.c
> > @@ -88,7 +88,11 @@ int __init pci_swiotlb_detect_4gb(void)
> >  {
> >  	/* don't initialize swiotlb if iommu=off (no_iommu=1) */
> >  #ifdef CONFIG_X86_64
> > +#ifdef CONFIG_ACPI_NUMA
> > +	if (!no_iommu && acpi_get_max_possible_pfn() > MAX_DMA32_PFN)
> > +#else
> >  	if (!no_iommu && max_pfn > MAX_DMA32_PFN)
> > +#endif
> >  		swiotlb = 1;
> >  #endif
> >  	return swiotlb;
> > diff --git a/arch/x86/mm/srat.c b/arch/x86/mm/srat.c
> > index c2aea63..21b33f0 100644
> > --- a/arch/x86/mm/srat.c
> > +++ b/arch/x86/mm/srat.c
> > @@ -153,10 +153,20 @@ acpi_numa_processor_affinity_init(struct acpi_srat_cpu_affinity *pa)
> >  
> >  #ifdef CONFIG_MEMORY_HOTPLUG
> >  static inline int save_add_info(void) {return 1;}
> > +static unsigned long max_possible_pfn __initdata;
> >  #else
> >  static inline int save_add_info(void) {return 0;}
> >  #endif
> >  
> > +unsigned long __init acpi_get_max_possible_pfn(void)
> > +{
> > +#ifdef CONFIG_ACPI_HOTPLUG_MEMORY
> > +	if (!acpi_no_memhotplug)
> > +		return max_possible_pfn;
> > +#endif
> > +	return max_pfn;
> > +}
> > +
> >  /* Callback for parsing of the Proximity Domain <-> Memory Area mappings */
> >  int __init
> >  acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
> > @@ -203,6 +213,11 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
> >  		pr_warn("SRAT: Failed to mark hotplug range [mem %#010Lx-%#010Lx] in memblock\n",
> >  			(unsigned long long)start, (unsigned long long)end - 1);
> >  
> > +#ifdef CONFIG_MEMORY_HOTPLUG
> > +	if (max_possible_pfn < PFN_UP(end - 1))
> > +		max_possible_pfn = PFN_UP(end - 1);
> > +#endif
> > +
> >  	return 0;
> >  out_err_bad_srat:
> >  	bad_srat();
> > diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
> > index 6b0d3ef..ae38f57 100644
> > --- a/drivers/acpi/acpi_memhotplug.c
> > +++ b/drivers/acpi/acpi_memhotplug.c
> > @@ -357,7 +357,7 @@ static void acpi_memory_device_remove(struct acpi_device *device)
> >  	acpi_memory_device_free(mem_device);
> >  }
> >  
> > -static bool __initdata acpi_no_memhotplug;
> > +bool __initdata acpi_no_memhotplug;
> >  
> >  void __init acpi_memory_hotplug_init(void)
> >  {
> 
> So I don't disagree with the fix in principle, but the implementation here is 
> rather ugly - it spreads new non-obvious #ifdefs across various critical parts of 
> the kernel.
> 
> For example this:
> 
> >  #ifdef CONFIG_X86_64
> > +#ifdef CONFIG_ACPI_NUMA
> > +	if (!no_iommu && acpi_get_max_possible_pfn() > MAX_DMA32_PFN)
> > +#else
> >  	if (!no_iommu && max_pfn > MAX_DMA32_PFN)
> > +#endif
> >  		swiotlb = 1;
> >  #endif
> 
> could be cleaned up by introducing a proper max_possible_pfn variable, and setting 
> it from the ACPI code - instead of exporting acpi_get_max_possible_pfn().
Thanks for review,
I'll try to drop #ifdefs as suggested and split it
in several patches.

> 
> Another pattern is:
> 
> > +#ifdef CONFIG_ACPI_HOTPLUG_MEMORY
> > +	if (!acpi_no_memhotplug)
on the second thought,
we don't need need this knob to force disabling
SWIOTLB initialization since there is an existing
"no_iommu" option to do it, so I'll drop this check.


> > +		return max_possible_pfn;
> > +#endif
> 
> this should be driven from the acpi_no_memhotplug knob, instead of spreading 
> acpi_no_memhotplug uses to other callsites.
> 
> Furthermore, please split these various steps up into multiple steps (and first do 
> the preparatory changes, then fix the bug in the end) - to make it easier to 
> bisect and analyze if we regress existing functionality somewhere.
> 
> Thanks,
> 
> 	Ingo


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2015-12-04 11:09 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-30 17:35 [PATCH] x86_64: enable SWIOTLB if system has SRAT memory regions above MAX_DMA32_PFN Igor Mammedov
2015-12-01  2:45 ` Rik van Riel
2015-12-04  8:20 ` Ingo Molnar
2015-12-04 11:08   ` Igor Mammedov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.