All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC][PATCH 0/4] x86: ioremap() problem in X86_32 PAE
@ 2010-06-11  9:17 Kenji Kaneshige
  2010-06-11  9:18 ` [PATCH 1/4] x86: ioremap: fix wrong address masking Kenji Kaneshige
                   ` (4 more replies)
  0 siblings, 5 replies; 25+ messages in thread
From: Kenji Kaneshige @ 2010-06-11  9:17 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, linux-kernel
  Cc: linux-pci, jbarnes

Hi,

I encountered the problem that loading ioatdma driver causes kernel
hangup or kernel panic in X86_32 PAE environment. I found that this
was caused by NOT ioatdma driver but x86's ioremap() behavior in
X86_32 PAE environment. On my environment, 64-bit MMIO region is
assigned to ioatdma PCI devices, and those high physical address are
passed to ioremap() when ioatdma driver calls pcim_iomap_regions().
Current x86's ioremap() seems to not handle those high physical
addresses properly. The major problems are

- When the physical address higher than 32-bit is passed to ioremap(),
  ioremap() maps wrong address. The ioremap() uses PAGE_MASK to align
  the specified address. This makes higher 32-bit of the physical
  address cleared unexpectedly. I think ioremap() must use
  PHYSICAL_PAGE_MASK instead.

- When too high physical address for X86_32 PAE (higher than 36-bit)
  is specified, ioremap() needs to return error (NULL). The ioremap()
  seems to check it by using phys_addr_valid(), but it doesn't work
  actually. The phys_addr_valid() checks the physical address by using
  boot_cpu_data.x86_phys_bits. The boot_cpu_data.x86_phys_bits holds
  maximum physical address range of the CPU including 64-bit mode. As
  a result, the phys_addr_valid() returns true even if the physical
  address higher than 36-bit is specified.

The following patch fixes the problem.

- [PATCH 1/4] x86: ioremap: fix wrong address masking
- [PATCH 2/4] x86: ioremap: fix physical address check
- [PATCH 3/4] x86: ioremap: remove physical address warning message
- [PATCH 4/4] x86: ioremap: fix normal ram range check

  Note: the last one (PATCH 4/4) is not related to this problem. I
        found it when I was making PATCH 1, 2, 3.

I made and tested those patches against 2.6.34, and confirmed it can
also be applied to 2.6.35-rc2.

By the way, I'm wondering some change might be needed also in PCI side.
For example, current PCI subsystem disables 64-bit BAR with address
higher than 32-bit assigned if sizeof(resource_size_t) is less than 8.
But it doesn't care the case sizeof(resource_size_t) is equal to 8 on
the system that cannot handle whole 64-bit physical address, like
X86_32 PAE. In relation to this, my system is doing the following
interesting behavior.

- On x86_32 without PAE, ioatdma works because 64-bit BAR is once
  cleared and then lower address is assigned again.

- On x86_32 with PAE, ioatdma doesn' work even with my patch set.
  Without my patch, kernel hangup or panic happens. With my patch,
  ioatdma driver fails to initialize the device because ioremap()
  returns NULL.

Anyway, I think ioremap() problem needs to be fixed first.

Thanks,
Kenji Kaneshige


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/4] x86: ioremap: fix wrong address masking
  2010-06-11  9:17 [RFC][PATCH 0/4] x86: ioremap() problem in X86_32 PAE Kenji Kaneshige
@ 2010-06-11  9:18 ` Kenji Kaneshige
  2010-06-11  9:20 ` [PATCH 2/4] x86: ioremap: fix physical address check Kenji Kaneshige
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 25+ messages in thread
From: Kenji Kaneshige @ 2010-06-11  9:18 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, linux-kernel
  Cc: linux-pci, jbarnes

Current x86 ioremap() doesn't handle physical address higher than
32-bit properly in X86_32 PAE mode. When physical address higher than
32-bit is passed to ioremap(), higher 32-bits in physical address is
cleared wrongly. Due to this bug, ioremap() can map wrong address to
linear address space.

In my case, 64-bit MMIO region was assigned to a PCI device (ioat
device) on my system. Because of the ioremap()'s bug, wrong physical
address (instead of MMIO region) was mapped to linear address space.
Because of this, loading ioatdma driver caused unexpected behavior
(kernel panic, kernel hangup, ...).

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

---
 arch/x86/mm/ioremap.c |   10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

Index: linux-2.6.34/arch/x86/mm/ioremap.c
===================================================================
--- linux-2.6.34.orig/arch/x86/mm/ioremap.c	2010-06-10 07:28:28.222386973 +0900
+++ linux-2.6.34/arch/x86/mm/ioremap.c	2010-06-10 07:28:31.966187993 +0900
@@ -62,7 +62,7 @@
 static void __iomem *__ioremap_caller(resource_size_t phys_addr,
 		unsigned long size, unsigned long prot_val, void *caller)
 {
-	unsigned long pfn, offset, vaddr;
+	unsigned long pfn, last_pfn, offset, vaddr;
 	resource_size_t last_addr;
 	const resource_size_t unaligned_phys_addr = phys_addr;
 	const unsigned long unaligned_size = size;
@@ -100,10 +100,8 @@
 	/*
 	 * Don't allow anybody to remap normal RAM that we're using..
 	 */
-	for (pfn = phys_addr >> PAGE_SHIFT;
-				(pfn << PAGE_SHIFT) < (last_addr & PAGE_MASK);
-				pfn++) {
-
+	last_pfn = last_addr >> PAGE_SHIFT;
+	for (pfn = phys_addr >> PAGE_SHIFT; pfn < last_pfn; pfn++) {
 		int is_ram = page_is_ram(pfn);
 
 		if (is_ram && pfn_valid(pfn) && !PageReserved(pfn_to_page(pfn)))
@@ -115,7 +113,7 @@
 	 * Mappings have to be page-aligned
 	 */
 	offset = phys_addr & ~PAGE_MASK;
-	phys_addr &= PAGE_MASK;
+	phys_addr &= PHYSICAL_PAGE_MASK;
 	size = PAGE_ALIGN(last_addr+1) - phys_addr;
 
 	retval = reserve_memtype(phys_addr, (u64)phys_addr + size,



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 2/4] x86: ioremap: fix physical address check
  2010-06-11  9:17 [RFC][PATCH 0/4] x86: ioremap() problem in X86_32 PAE Kenji Kaneshige
  2010-06-11  9:18 ` [PATCH 1/4] x86: ioremap: fix wrong address masking Kenji Kaneshige
@ 2010-06-11  9:20 ` Kenji Kaneshige
  2010-06-11 17:43   ` H. Peter Anvin
  2010-06-11  9:20 ` [PATCH 3/4] x86: ioremap: remove physical address warning message Kenji Kaneshige
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 25+ messages in thread
From: Kenji Kaneshige @ 2010-06-11  9:20 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, linux-kernel
  Cc: linux-pci, jbarnes

If the physical address is too high to be handled by ioremap() in
x86_32 PAE (e.g. more than 36-bit physical address), ioremap() must
return error (NULL). However, current x86 ioremap try to map this too
high physical address, and it causes unexpected behavior.

The ioremap() seems to check the specified physical address using
phys_addr_valid(). But current phys_addr_valid() returns true even if
more than 36-bit address range is specified because
boot_cpu_data.x86_phys_bits can have more than 36 even in X86_32 PAE
mode (boot_cpu_data.x86_phys_bits seems to hold maximum capability of
the processor).

To fix the problem, this patch changes phys_addr_valid() function to
return false when more than 36-bit address range is specified in PAE.
The phys_addr_valid() function is used only by ioremap() in X86_32 PAE
mode. So this change only affects ioremap() in X86_32 PAE mode.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

---
 arch/x86/mm/physaddr.h |    4 ++++
 1 file changed, 4 insertions(+)

Index: linux-2.6.34/arch/x86/mm/physaddr.h
===================================================================
--- linux-2.6.34.orig/arch/x86/mm/physaddr.h	2010-06-10 07:28:28.177229689 +0900
+++ linux-2.6.34/arch/x86/mm/physaddr.h	2010-06-10 07:28:32.587190857 +0900
@@ -3,8 +3,12 @@
 static inline int phys_addr_valid(resource_size_t addr)
 {
 #ifdef CONFIG_PHYS_ADDR_T_64BIT
+#ifdef CONFIG_X86_64
 	return !(addr >> boot_cpu_data.x86_phys_bits);
 #else
+	return !(addr >> 36);
+#endif
+#else
 	return 1;
 #endif
 }



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 3/4] x86: ioremap: remove physical address warning message
  2010-06-11  9:17 [RFC][PATCH 0/4] x86: ioremap() problem in X86_32 PAE Kenji Kaneshige
  2010-06-11  9:18 ` [PATCH 1/4] x86: ioremap: fix wrong address masking Kenji Kaneshige
  2010-06-11  9:20 ` [PATCH 2/4] x86: ioremap: fix physical address check Kenji Kaneshige
@ 2010-06-11  9:20 ` Kenji Kaneshige
  2010-06-11 17:44   ` H. Peter Anvin
  2010-06-11  9:21 ` [PATCH 4/4] x86: ioremap: fix normal ram range check Kenji Kaneshige
  2010-06-11 17:41 ` [RFC][PATCH 0/4] x86: ioremap() problem in X86_32 PAE H. Peter Anvin
  4 siblings, 1 reply; 25+ messages in thread
From: Kenji Kaneshige @ 2010-06-11  9:20 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, linux-kernel
  Cc: linux-pci, jbarnes

Current ioremap() code for x86 displays warning message if too high
address to handle is passed. But this can happen as usual cases. For
example, if 64-bit BAR is assigned to a PCI device and its device
driver calls pci_iomap(). So this patch changes the warning messages
as follows.

- Change printk message from KERN_WARNING to KERN_DEBUG
- Remove WARN_ON_ONCE()

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

---
 arch/x86/mm/ioremap.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Index: linux-2.6.34/arch/x86/mm/ioremap.c
===================================================================
--- linux-2.6.34.orig/arch/x86/mm/ioremap.c	2010-06-10 07:28:31.966187993 +0900
+++ linux-2.6.34/arch/x86/mm/ioremap.c	2010-06-10 07:28:33.146375380 +0900
@@ -78,9 +78,8 @@
 		return NULL;
 
 	if (!phys_addr_valid(phys_addr)) {
-		printk(KERN_WARNING "ioremap: invalid physical address %llx\n",
+		printk(KERN_DEBUG "ioremap: can't map physical address %llx\n",
 		       (unsigned long long)phys_addr);
-		WARN_ON_ONCE(1);
 		return NULL;
 	}
 



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 4/4] x86: ioremap: fix normal ram range check
  2010-06-11  9:17 [RFC][PATCH 0/4] x86: ioremap() problem in X86_32 PAE Kenji Kaneshige
                   ` (2 preceding siblings ...)
  2010-06-11  9:20 ` [PATCH 3/4] x86: ioremap: remove physical address warning message Kenji Kaneshige
@ 2010-06-11  9:21 ` Kenji Kaneshige
  2010-06-11 17:41 ` [RFC][PATCH 0/4] x86: ioremap() problem in X86_32 PAE H. Peter Anvin
  4 siblings, 0 replies; 25+ messages in thread
From: Kenji Kaneshige @ 2010-06-11  9:21 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, linux-kernel
  Cc: linux-pci, jbarnes

Check for norma RAM in x86 ioremap() code seems to not work for the
last page frame in the specified physical address range.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

---
 arch/x86/mm/ioremap.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.34/arch/x86/mm/ioremap.c
===================================================================
--- linux-2.6.34.orig/arch/x86/mm/ioremap.c	2010-06-10 07:28:33.146375380 +0900
+++ linux-2.6.34/arch/x86/mm/ioremap.c	2010-06-10 07:32:59.211196516 +0900
@@ -100,7 +100,7 @@
 	 * Don't allow anybody to remap normal RAM that we're using..
 	 */
 	last_pfn = last_addr >> PAGE_SHIFT;
-	for (pfn = phys_addr >> PAGE_SHIFT; pfn < last_pfn; pfn++) {
+	for (pfn = phys_addr >> PAGE_SHIFT; pfn <= last_pfn; pfn++) {
 		int is_ram = page_is_ram(pfn);
 
 		if (is_ram && pfn_valid(pfn) && !PageReserved(pfn_to_page(pfn)))



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC][PATCH 0/4] x86: ioremap() problem in X86_32 PAE
  2010-06-11  9:17 [RFC][PATCH 0/4] x86: ioremap() problem in X86_32 PAE Kenji Kaneshige
                   ` (3 preceding siblings ...)
  2010-06-11  9:21 ` [PATCH 4/4] x86: ioremap: fix normal ram range check Kenji Kaneshige
@ 2010-06-11 17:41 ` H. Peter Anvin
  4 siblings, 0 replies; 25+ messages in thread
From: H. Peter Anvin @ 2010-06-11 17:41 UTC (permalink / raw)
  To: Kenji Kaneshige
  Cc: Thomas Gleixner, Ingo Molnar, linux-kernel, linux-pci, jbarnes

On 06/11/2010 02:17 AM, Kenji Kaneshige wrote:
> 
> By the way, I'm wondering some change might be needed also in PCI side.
> For example, current PCI subsystem disables 64-bit BAR with address
> higher than 32-bit assigned if sizeof(resource_size_t) is less than 8.
> But it doesn't care the case sizeof(resource_size_t) is equal to 8 on
> the system that cannot handle whole 64-bit physical address, like
> X86_32 PAE. In relation to this, my system is doing the following
> interesting behavior.
> 
> - On x86_32 without PAE, ioatdma works because 64-bit BAR is once
>   cleared and then lower address is assigned again.
> 
> - On x86_32 with PAE, ioatdma doesn' work even with my patch set.
>   Without my patch, kernel hangup or panic happens. With my patch,
>   ioatdma driver fails to initialize the device because ioremap()
>   returns NULL.
> 
> Anyway, I think ioremap() problem needs to be fixed first.
> 

We had a patch in for a while to cap the physical address space to the
number of bits supported by the CPU.  It got removed because it caused a
regression, which turned out to be because it exposed another bug (which
has since been fixed) -- I have been meaning to put it back in.

	-hpa

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/4] x86: ioremap: fix physical address check
  2010-06-11  9:20 ` [PATCH 2/4] x86: ioremap: fix physical address check Kenji Kaneshige
@ 2010-06-11 17:43   ` H. Peter Anvin
  2010-06-14  0:18     ` KAMEZAWA Hiroyuki
  2010-06-14  1:54     ` Kenji Kaneshige
  0 siblings, 2 replies; 25+ messages in thread
From: H. Peter Anvin @ 2010-06-11 17:43 UTC (permalink / raw)
  To: Kenji Kaneshige
  Cc: Thomas Gleixner, Ingo Molnar, linux-kernel, linux-pci, jbarnes

On 06/11/2010 02:20 AM, Kenji Kaneshige wrote:
> If the physical address is too high to be handled by ioremap() in
> x86_32 PAE (e.g. more than 36-bit physical address), ioremap() must
> return error (NULL). However, current x86 ioremap try to map this too
> high physical address, and it causes unexpected behavior.

What unexpected behavior?  It is perfectly legitimately to map such a
high address in PAE mode.  We have a 36-bit kernel-imposed limit on
*RAM* in 32-bit mode (because we can't manage more than that), but there
is no reason it should apply to I/O.

	-hpa

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/4] x86: ioremap: remove physical address warning message
  2010-06-11  9:20 ` [PATCH 3/4] x86: ioremap: remove physical address warning message Kenji Kaneshige
@ 2010-06-11 17:44   ` H. Peter Anvin
  2010-06-14  2:06     ` Kenji Kaneshige
  0 siblings, 1 reply; 25+ messages in thread
From: H. Peter Anvin @ 2010-06-11 17:44 UTC (permalink / raw)
  To: Kenji Kaneshige
  Cc: Thomas Gleixner, Ingo Molnar, linux-kernel, linux-pci, jbarnes

On 06/11/2010 02:20 AM, Kenji Kaneshige wrote:
> Current ioremap() code for x86 displays warning message if too high
> address to handle is passed. But this can happen as usual cases. For
> example, if 64-bit BAR is assigned to a PCI device and its device
> driver calls pci_iomap(). So this patch changes the warning messages
> as follows.
> 
> - Change printk message from KERN_WARNING to KERN_DEBUG
> - Remove WARN_ON_ONCE()
> 
> Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
> 
> ---
>  arch/x86/mm/ioremap.c |    3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> Index: linux-2.6.34/arch/x86/mm/ioremap.c
> ===================================================================
> --- linux-2.6.34.orig/arch/x86/mm/ioremap.c	2010-06-10 07:28:31.966187993 +0900
> +++ linux-2.6.34/arch/x86/mm/ioremap.c	2010-06-10 07:28:33.146375380 +0900
> @@ -78,9 +78,8 @@
>  		return NULL;
>  
>  	if (!phys_addr_valid(phys_addr)) {
> -		printk(KERN_WARNING "ioremap: invalid physical address %llx\n",
> +		printk(KERN_DEBUG "ioremap: can't map physical address %llx\n",
>  		       (unsigned long long)phys_addr);
> -		WARN_ON_ONCE(1);
>  		return NULL;
>  	}
>  

Fair to change the message, but the priority level really seems way
insufficient.  I can see dropping the WARN_ON_ONCE() though.

	-hpa

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/4] x86: ioremap: fix physical address check
  2010-06-11 17:43   ` H. Peter Anvin
@ 2010-06-14  0:18     ` KAMEZAWA Hiroyuki
  2010-06-14  8:59       ` KAMEZAWA Hiroyuki
  2010-06-14  1:54     ` Kenji Kaneshige
  1 sibling, 1 reply; 25+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-06-14  0:18 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Kenji Kaneshige, Thomas Gleixner, Ingo Molnar, linux-kernel,
	linux-pci, jbarnes

On Fri, 11 Jun 2010 10:43:27 -0700
"H. Peter Anvin" <hpa@zytor.com> wrote:

> On 06/11/2010 02:20 AM, Kenji Kaneshige wrote:
> > If the physical address is too high to be handled by ioremap() in
> > x86_32 PAE (e.g. more than 36-bit physical address), ioremap() must
> > return error (NULL). However, current x86 ioremap try to map this too
> > high physical address, and it causes unexpected behavior.
> 
> What unexpected behavior?  It is perfectly legitimately to map such a
> high address in PAE mode.  We have a 36-bit kernel-imposed limit on
> *RAM* in 32-bit mode (because we can't manage more than that), but there
> is no reason it should apply to I/O.
> 

I'm sorry for lack of study. 
How to access it via mapped area by ioremap() ?

Thanks,
-Kame


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/4] x86: ioremap: fix physical address check
  2010-06-11 17:43   ` H. Peter Anvin
  2010-06-14  0:18     ` KAMEZAWA Hiroyuki
@ 2010-06-14  1:54     ` Kenji Kaneshige
  2010-06-14  6:38       ` Maciej W. Rozycki
  2010-06-14  8:27       ` Kenji Kaneshige
  1 sibling, 2 replies; 25+ messages in thread
From: Kenji Kaneshige @ 2010-06-14  1:54 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Thomas Gleixner, Ingo Molnar, linux-kernel, linux-pci, jbarnes

(2010/06/12 2:43), H. Peter Anvin wrote:
> On 06/11/2010 02:20 AM, Kenji Kaneshige wrote:
>> If the physical address is too high to be handled by ioremap() in
>> x86_32 PAE (e.g. more than 36-bit physical address), ioremap() must
>> return error (NULL). However, current x86 ioremap try to map this too
>> high physical address, and it causes unexpected behavior.
>
> What unexpected behavior?

My expectation:
The ioremap() function returns NULL if the specified physical address
is too high to handle.

Actual result (unexpected behavior):
Kernel hangup. This happens even with [PATCH 1/4] applied. I'm attaching
the console log messages at the bottom of this e-mail.

>                           It is perfectly legitimately to map such a
> high address in PAE mode.  We have a 36-bit kernel-imposed limit on
> *RAM* in 32-bit mode (because we can't manage more than that), but there
> is no reason it should apply to I/O.
>

Do you mean x86 linux can map physical address higher than 36-bit for I/O?
My understanding is as follows.
- Architectural limit of physical address in x86 32-bit mode is 40-bit
   (depnds on processor version).
- The maximum physical address supported by current x86 linux kernel in
   32-bit mode is 36-bit.
On my environment, physical address higher than 40-bit (ex. 0xfc00001c000)
is assigned to some PCI devices. I think there is no way to handle such
high physical address in 32-bit mode.

Thanks,
Kenji Kaneshige

======================================================================
Console Messages (2.6.34 + [PATCH 1/4] applied) after modprobe ioatdma
======================================================================
ioatdma: Intel(R) QuickData Technology Driver 4.00
ioatdma 0000:00:16.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
ioatdma 0000:00:16.0: setting latency timer to 64
modprobe:5619 freeing invalid memtype 1c000-20000
ioatdma 0000:00:16.0: PCI INT A disabled
ioatdma 0000:00:16.1: PCI INT B -> GSI 17 (level, low) -> IRQ 17
ioatdma 0000:00:16.1: setting latency timer to 64
ioatdma 0000:00:16.1: (26) exceeds max supported channels (4)
ioatdma 0000:00:16.1: channel enumeration error
ioatdma 0000:00:16.1: Intel(R) I/OAT DMA Engine init failed
modprobe:5619 freeing invalid memtype 18000-1c000
ioatdma 0000:00:16.1: PCI INT B disabled
ioatdma 0000:00:16.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18
ioatdma 0000:00:16.2: setting latency timer to 64
ioatdma 0000:00:16.2: (20) exceeds max supported channels (4)
   alloc irq_desc for 80 on node -1
   alloc kstat_irqs on node -1
ioatdma 0000:00:16.2: irq 80 for MSI/MSI-X
BUG: soft lockup - CPU#0 stuck for 61s! [modprobe:5619]
Modules linked in: ioatdma(+) autofs4 sunrpc cpufreq_ondemand acpi_cpufreq ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log dm_mod e1000e igb iTCO_wdt dca sg iTCO_vendor_support i2c_i801 i2c_core pcspkr ext4 mbcache jbd2 sd_mod crc_t10dif mptsas mptscsih lpfc scsi_transport_fc mptbase scsi_tgt scsi_transport_sas [last unloaded: microcode]
Modules linked in: ioatdma(+) autofs4 sunrpc cpufreq_ondemand acpi_cpufreq ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log dm_mod e1000e igb iTCO_wdt dca sg iTCO_vendor_support i2c_i801 i2c_core pcspkr ext4 mbcache jbd2 sd_mod crc_t10dif mptsas mptscsih lpfc scsi_transport_fc mptbase scsi_tgt scsi_transport_sas [last unloaded: microcode]

Pid: 5619, comm: modprobe Not tainted 2.6.34-kk #20 SB/PRIMEQUEST 1800E
EIP: 0060:[<c07ed5e8>] EFLAGS: 00000202 CPU: 0
EIP is at _raw_spin_lock_bh+0x18/0x20
EAX: e3608000 EBX: e305fb5c ECX: 0000007d EDX: 00000001
ESI: 0000007d EDI: e305fa94 EBP: e3609d18 ESP: e3609ccc
  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process modprobe (pid: 5619, ti=e3608000 task=e167fab0 task.ti=e3608000)
Stack:
  e305fa94 fe577cc6 205f9002 00000000 c05e5377 000fffff 01000000 00000257
<0> e3f25240 fffff000 0000ffff 00000001 00000001 e3609dae 000fffff e305fb5c
<0> ffffe000 00000000 e305fa94 e3609dbc fe578512 205f9000 00000000 e34e0c00
Call Trace:
  [<fe577cc6>] ? ioat2_alloc_and_lock+0x26/0x280 [ioatdma]
  [<c05e5377>] ? __domain_mapping+0x177/0x260
  [<fe578512>] ? ioat2_dma_prep_memcpy_lock+0x52/0x3c0 [ioatdma]
  [<c05e7339>] ? __intel_map_single+0x169/0x230
  [<c05e7456>] ? intel_map_page+0x56/0x70
  [<fe575762>] ? dma_map_single_attrs.clone.1+0x62/0x80 [ioatdma]
  [<fe57cd29>] ? ioat_dma_self_test+0xf4/0x248 [ioatdma]
  [<c049c048>] ? devm_request_threaded_irq+0x78/0xa0
  [<fe57da6c>] ? ioat3_dma_self_test+0x8/0x16 [ioatdma]
  [<fe57cb1c>] ? ioat_probe+0x2d7/0x343 [ioatdma]
  [<fe57d092>] ? ioat3_dma_probe+0x145/0x1d1 [ioatdma]
  [<fe57c7e0>] ? ioat_pci_probe+0x14b/0x181 [ioatdma]
  [<c05cd92b>] ? local_pci_probe+0xb/0x10
  [<c05ce937>] ? pci_device_probe+0xc7/0xf0
  [<c0672d17>] ? driver_probe_device+0x87/0x290
  [<c05cd9d0>] ? pci_match_device+0x10/0xb0
  [<c0672f99>] ? __driver_attach+0x79/0x80
  [<c0672f20>] ? __driver_attach+0x0/0x80
  [<c0672112>] ? bus_for_each_dev+0x52/0x80
  [<c0672b06>] ? driver_attach+0x16/0x20
  [<c0672f20>] ? __driver_attach+0x0/0x80
  [<c06724bf>] ? bus_add_driver+0x1cf/0x320
  [<c05ce810>] ? pci_device_remove+0x0/0x40
  [<c0673223>] ? driver_register+0x63/0x120
  [<fe584000>] ? ioat_init_module+0x0/0x79 [ioatdma]
  [<c05ceb5d>] ? __pci_register_driver+0x3d/0xb0
  [<fe584062>] ? ioat_init_module+0x62/0x79 [ioatdma]
  [<c040112f>] ? do_one_initcall+0x2f/0x1c0
  [<c047bb23>] ? sys_init_module+0xb3/0x220
  [<c07ee2f0>] ? do_device_not_available+0x0/0x60
  [<c07ee335>] ? do_device_not_available+0x45/0x60
  [<c0402f1f>] ? sysenter_do_call+0x12/0x28



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/4] x86: ioremap: remove physical address warning message
  2010-06-11 17:44   ` H. Peter Anvin
@ 2010-06-14  2:06     ` Kenji Kaneshige
  0 siblings, 0 replies; 25+ messages in thread
From: Kenji Kaneshige @ 2010-06-14  2:06 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Thomas Gleixner, Ingo Molnar, linux-kernel, linux-pci, jbarnes

(2010/06/12 2:44), H. Peter Anvin wrote:
> On 06/11/2010 02:20 AM, Kenji Kaneshige wrote:
>> Current ioremap() code for x86 displays warning message if too high
>> address to handle is passed. But this can happen as usual cases. For
>> example, if 64-bit BAR is assigned to a PCI device and its device
>> driver calls pci_iomap(). So this patch changes the warning messages
>> as follows.
>>
>> - Change printk message from KERN_WARNING to KERN_DEBUG
>> - Remove WARN_ON_ONCE()
>>
>> Signed-off-by: Kenji Kaneshige<kaneshige.kenji@jp.fujitsu.com>
>>
>> ---
>>   arch/x86/mm/ioremap.c |    3 +--
>>   1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> Index: linux-2.6.34/arch/x86/mm/ioremap.c
>> ===================================================================
>> --- linux-2.6.34.orig/arch/x86/mm/ioremap.c	2010-06-10 07:28:31.966187993 +0900
>> +++ linux-2.6.34/arch/x86/mm/ioremap.c	2010-06-10 07:28:33.146375380 +0900
>> @@ -78,9 +78,8 @@
>>   		return NULL;
>>
>>   	if (!phys_addr_valid(phys_addr)) {
>> -		printk(KERN_WARNING "ioremap: invalid physical address %llx\n",
>> +		printk(KERN_DEBUG "ioremap: can't map physical address %llx\n",
>>   		       (unsigned long long)phys_addr);
>> -		WARN_ON_ONCE(1);
>>   		return NULL;
>>   	}
>>
>
> Fair to change the message, but the priority level really seems way
> insufficient.  I can see dropping the WARN_ON_ONCE() though.
>

I think KERN_WARNING is a little too high because there is no action
user can take and there is no problem from the hardware/firmware point
of view. How about KERN_INFO instead?

Thanks,
Kenji Kaneshige


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/4] x86: ioremap: fix physical address check
  2010-06-14  1:54     ` Kenji Kaneshige
@ 2010-06-14  6:38       ` Maciej W. Rozycki
  2010-06-14  8:23         ` Kenji Kaneshige
  2010-06-14 15:11         ` H. Peter Anvin
  2010-06-14  8:27       ` Kenji Kaneshige
  1 sibling, 2 replies; 25+ messages in thread
From: Maciej W. Rozycki @ 2010-06-14  6:38 UTC (permalink / raw)
  To: Kenji Kaneshige
  Cc: H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	linux-pci, jbarnes

On Mon, 14 Jun 2010, Kenji Kaneshige wrote:

> - Architectural limit of physical address in x86 32-bit mode is 40-bit
>   (depnds on processor version).

 According to documentation I happen to have handy this limit is actually 
52 bits (and space is currently available in the data structures used for 
a possible future extension up to 63 bits).

  Maciej

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/4] x86: ioremap: fix physical address check
  2010-06-14  6:38       ` Maciej W. Rozycki
@ 2010-06-14  8:23         ` Kenji Kaneshige
  2010-06-14  9:02           ` Kenji Kaneshige
  2010-06-14 15:11         ` H. Peter Anvin
  1 sibling, 1 reply; 25+ messages in thread
From: Kenji Kaneshige @ 2010-06-14  8:23 UTC (permalink / raw)
  To: Maciej W. Rozycki
  Cc: H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	linux-pci, jbarnes

(2010/06/14 15:38), Maciej W. Rozycki wrote:
> On Mon, 14 Jun 2010, Kenji Kaneshige wrote:
>
>> - Architectural limit of physical address in x86 32-bit mode is 40-bit
>>    (depnds on processor version).
>
>   According to documentation I happen to have handy this limit is actually
> 52 bits (and space is currently available in the data structures used for
> a possible future extension up to 63 bits).

Thank you for pointing it out. I misunderstood that.

Now I think I need to add additional check to see if specified
physical address can be handled by x86 ioremap(), instead of
changing phys_addr_valid(). The code would be

static void __iomem *__ioremap_caller(...)
{
	...
#if defined(CONFIG_X86_32) && defined(CONFIG_X86_PAE)
	if (phys_addr is higer than 36-bit)  {
		printk(KERN_INFO "ioremap can't map physical address %llx\n",
		return NULL;
	}
#endif
	...
}

Thanks,
Kenji Kaneshige


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/4] x86: ioremap: fix physical address check
  2010-06-14  1:54     ` Kenji Kaneshige
  2010-06-14  6:38       ` Maciej W. Rozycki
@ 2010-06-14  8:27       ` Kenji Kaneshige
  2010-06-14 15:12         ` H. Peter Anvin
  1 sibling, 1 reply; 25+ messages in thread
From: Kenji Kaneshige @ 2010-06-14  8:27 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Thomas Gleixner, Ingo Molnar, linux-kernel, linux-pci, jbarnes

> Do you mean x86 linux can map physical address higher than 36-bit for I/O?
> My understanding is as follows.
> - Architectural limit of physical address in x86 32-bit mode is 40-bit
> (depnds on processor version).
> - The maximum physical address supported by current x86 linux kernel in
> 32-bit mode is 36-bit.
> On my environment, physical address higher than 40-bit (ex. 0xfc00001c000)
> is assigned to some PCI devices. I think there is no way to handle such
> high physical address in 32-bit mode.
>
> Thanks,
> Kenji Kaneshige

Please ignore above. I misunderstood the architectural limit of physical
address in x86 32-bit mode. It is not 40-bits, but 52-bits.

Thanks,
Kenji Kaneshige


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/4] x86: ioremap: fix physical address check
  2010-06-14  0:18     ` KAMEZAWA Hiroyuki
@ 2010-06-14  8:59       ` KAMEZAWA Hiroyuki
  2010-06-14  9:13         ` Kenji Kaneshige
  0 siblings, 1 reply; 25+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-06-14  8:59 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: H. Peter Anvin, Kenji Kaneshige, Thomas Gleixner, Ingo Molnar,
	linux-kernel, linux-pci, jbarnes

On Mon, 14 Jun 2010 09:18:23 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:

> On Fri, 11 Jun 2010 10:43:27 -0700
> "H. Peter Anvin" <hpa@zytor.com> wrote:
> 
> > On 06/11/2010 02:20 AM, Kenji Kaneshige wrote:
> > > If the physical address is too high to be handled by ioremap() in
> > > x86_32 PAE (e.g. more than 36-bit physical address), ioremap() must
> > > return error (NULL). However, current x86 ioremap try to map this too
> > > high physical address, and it causes unexpected behavior.
> > 
> > What unexpected behavior?  It is perfectly legitimately to map such a
> > high address in PAE mode.  We have a 36-bit kernel-imposed limit on
> > *RAM* in 32-bit mode (because we can't manage more than that), but there
> > is no reason it should apply to I/O.
> > 
> 
> I'm sorry for lack of study. 

Now, __ioremap_caller() gets 64 bit argument as physical address.
== 2.6.35-rc2 ==
static void __iomem *__ioremap_caller(resource_size_t phys_addr,
                unsigned long size, unsigned long prot_val, void *caller)
{
==

And check physical address is valid based on the value got from cpuid.
==
      if (!phys_addr_valid(phys_addr)) {
                printk(KERN_WARNING "ioremap: invalid physical address %llx\n",
                       (unsigned long long)phys_addr);
                WARN_ON_ONCE(1);
                return NULL;
        }
==


At 1st, this seems buggy.
==
        /*
         * Don't allow anybody to remap normal RAM that we're using..
         */
        for (pfn = phys_addr >> PAGE_SHIFT;
                                (pfn << PAGE_SHIFT) < (last_addr & PAGE_MASK);
                                pfn++) {

                int is_ram = page_is_ram(pfn);

                if (is_ram && pfn_valid(pfn) && !PageReserved(pfn_to_page(pfn)))
                        return NULL;
                WARN_ON_ONCE(is_ram);
        }
==
If phys_addr > 44bit, pfn should be 64bit. don't it ?
Then, page_is_ram should eat 64bit arguments.

But here, assume phys_addr < 44bit and continue.

Next,
===
      offset = phys_addr & ~PAGE_MASK;
        phys_addr &= PAGE_MASK;
        size = PAGE_ALIGN(last_addr+1) - phys_addr;

this mask ops is bad. as the patch-1 shows.

And, finally, calls get_vm_area.

==
        /*
         * Ok, go for it..
         */
        area = get_vm_area_caller(size, VM_IOREMAP, caller);
        if (!area)
                goto err_free_memtype;
        area->phys_addr = phys_addr;
        vaddr = (unsigned long) area->addr;
==
Because area->phys_addr is 32bit. This may break something if we unmap this later.

And, page table operation is this.
==
       if (ioremap_page_range(vaddr, vaddr + size, phys_addr, prot))
                goto err_free_area;
==

Let's see lib/ioremap.c
==
int ioremap_page_range(unsigned long addr,
                       unsigned long end, unsigned long phys_addr, pgprot_t prot)
{
        pgd_t *pgd;
   
==

Oh, phys_addr is unsigned long again....Then, Kenji-san, what's buggy is this check.
I think.

Thanks,
-Kame






^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/4] x86: ioremap: fix physical address check
  2010-06-14  8:23         ` Kenji Kaneshige
@ 2010-06-14  9:02           ` Kenji Kaneshige
  2010-06-14 15:40             ` H. Peter Anvin
  0 siblings, 1 reply; 25+ messages in thread
From: Kenji Kaneshige @ 2010-06-14  9:02 UTC (permalink / raw)
  To: Maciej W. Rozycki
  Cc: H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	linux-pci, jbarnes

(2010/06/14 17:23), Kenji Kaneshige wrote:
> (2010/06/14 15:38), Maciej W. Rozycki wrote:
>> On Mon, 14 Jun 2010, Kenji Kaneshige wrote:
>>
>>> - Architectural limit of physical address in x86 32-bit mode is 40-bit
>>> (depnds on processor version).
>>
>> According to documentation I happen to have handy this limit is actually
>> 52 bits (and space is currently available in the data structures used for
>> a possible future extension up to 63 bits).
>
> Thank you for pointing it out. I misunderstood that.
>
> Now I think I need to add additional check to see if specified
> physical address can be handled by x86 ioremap(), instead of
> changing phys_addr_valid(). The code would be
>
> static void __iomem *__ioremap_caller(...)
> {
> ...
> #if defined(CONFIG_X86_32) && defined(CONFIG_X86_PAE)
> if (phys_addr is higer than 36-bit) {
> printk(KERN_INFO "ioremap can't map physical address %llx\n",
> return NULL;
> }
> #endif
> ...
> }

Please ignore above again. Sorry for inconvenient.
According to the comment from H. Peter Anvin, 36-bit limit is on
RAM in 32-bit mode. So this approach is wrong.

Now I guess there is a bug that doesn't handle physical address
higher than 32-bit properly somewhere...

Thanks,
Kenji Kaneshige


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/4] x86: ioremap: fix physical address check
  2010-06-14  8:59       ` KAMEZAWA Hiroyuki
@ 2010-06-14  9:13         ` Kenji Kaneshige
  2010-06-14 11:06           ` Kenji Kaneshige
  0 siblings, 1 reply; 25+ messages in thread
From: Kenji Kaneshige @ 2010-06-14  9:13 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	linux-pci, jbarnes

Thank you Hiroyuki.

So many bugs in ioremap()...

Will try with those bugs fixed.

Thanks,
Kenji Kaneshige


(2010/06/14 17:59), KAMEZAWA Hiroyuki wrote:
> On Mon, 14 Jun 2010 09:18:23 +0900
> KAMEZAWA Hiroyuki<kamezawa.hiroyu@jp.fujitsu.com>  wrote:
>
>> On Fri, 11 Jun 2010 10:43:27 -0700
>> "H. Peter Anvin"<hpa@zytor.com>  wrote:
>>
>>> On 06/11/2010 02:20 AM, Kenji Kaneshige wrote:
>>>> If the physical address is too high to be handled by ioremap() in
>>>> x86_32 PAE (e.g. more than 36-bit physical address), ioremap() must
>>>> return error (NULL). However, current x86 ioremap try to map this too
>>>> high physical address, and it causes unexpected behavior.
>>>
>>> What unexpected behavior?  It is perfectly legitimately to map such a
>>> high address in PAE mode.  We have a 36-bit kernel-imposed limit on
>>> *RAM* in 32-bit mode (because we can't manage more than that), but there
>>> is no reason it should apply to I/O.
>>>
>>
>> I'm sorry for lack of study.
>
> Now, __ioremap_caller() gets 64 bit argument as physical address.
> == 2.6.35-rc2 ==
> static void __iomem *__ioremap_caller(resource_size_t phys_addr,
>                  unsigned long size, unsigned long prot_val, void *caller)
> {
> ==
>
> And check physical address is valid based on the value got from cpuid.
> ==
>        if (!phys_addr_valid(phys_addr)) {
>                  printk(KERN_WARNING "ioremap: invalid physical address %llx\n",
>                         (unsigned long long)phys_addr);
>                  WARN_ON_ONCE(1);
>                  return NULL;
>          }
> ==
>
>
> At 1st, this seems buggy.
> ==
>          /*
>           * Don't allow anybody to remap normal RAM that we're using..
>           */
>          for (pfn = phys_addr>>  PAGE_SHIFT;
>                                  (pfn<<  PAGE_SHIFT)<  (last_addr&  PAGE_MASK);
>                                  pfn++) {
>
>                  int is_ram = page_is_ram(pfn);
>
>                  if (is_ram&&  pfn_valid(pfn)&&  !PageReserved(pfn_to_page(pfn)))
>                          return NULL;
>                  WARN_ON_ONCE(is_ram);
>          }
> ==
> If phys_addr>  44bit, pfn should be 64bit. don't it ?
> Then, page_is_ram should eat 64bit arguments.
>
> But here, assume phys_addr<  44bit and continue.
>
> Next,
> ===
>        offset = phys_addr&  ~PAGE_MASK;
>          phys_addr&= PAGE_MASK;
>          size = PAGE_ALIGN(last_addr+1) - phys_addr;
>
> this mask ops is bad. as the patch-1 shows.
>
> And, finally, calls get_vm_area.
>
> ==
>          /*
>           * Ok, go for it..
>           */
>          area = get_vm_area_caller(size, VM_IOREMAP, caller);
>          if (!area)
>                  goto err_free_memtype;
>          area->phys_addr = phys_addr;
>          vaddr = (unsigned long) area->addr;
> ==
> Because area->phys_addr is 32bit. This may break something if we unmap this later.
>
> And, page table operation is this.
> ==
>         if (ioremap_page_range(vaddr, vaddr + size, phys_addr, prot))
>                  goto err_free_area;
> ==
>
> Let's see lib/ioremap.c
> ==
> int ioremap_page_range(unsigned long addr,
>                         unsigned long end, unsigned long phys_addr, pgprot_t prot)
> {
>          pgd_t *pgd;
>
> ==
>
> Oh, phys_addr is unsigned long again....Then, Kenji-san, what's buggy is this check.
> I think.
>
> Thanks,
> -Kame
>
>
>
>
>
>
>



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/4] x86: ioremap: fix physical address check
  2010-06-14  9:13         ` Kenji Kaneshige
@ 2010-06-14 11:06           ` Kenji Kaneshige
  2010-06-14 18:36             ` H. Peter Anvin
  2010-06-14 20:16             ` Rolf Eike Beer
  0 siblings, 2 replies; 25+ messages in thread
From: Kenji Kaneshige @ 2010-06-14 11:06 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	linux-pci, jbarnes

(2010/06/14 18:13), Kenji Kaneshige wrote:
> Thank you Hiroyuki.
> 
> So many bugs in ioremap()...
> 
> Will try with those bugs fixed.
> 
> Thanks,
> Kenji Kaneshige

The problem seems to be fixed by the following patch. This is still
under testing. I will post the patch as v2 after testing.

Thanks,
Kenji Kaneshige


Current x86 ioremap() doesn't handle physical address higher than
32-bit properly in X86_32 PAE mode. When physical address higher than
32-bit is passed to ioremap(), higher 32-bits in physical address is
cleared wrongly. Due to this bug, ioremap() can map wrong address to
linear address space.

In my case, 64-bit MMIO region was assigned to a PCI device (ioat
device) on my system. Because of the ioremap()'s bug, wrong physical
address (instead of MMIO region) was mapped to linear address space.
Because of this, loading ioatdma driver caused unexpected behavior
(kernel panic, kernel hangup, ...).

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

---
 arch/x86/mm/ioremap.c   |   11 +++++------
 include/linux/io.h      |    4 ++--
 include/linux/vmalloc.h |    2 +-
 lib/ioremap.c           |   10 +++++-----
 4 files changed, 13 insertions(+), 14 deletions(-)

Index: linux-2.6.34/arch/x86/mm/ioremap.c
===================================================================
--- linux-2.6.34.orig/arch/x86/mm/ioremap.c
+++ linux-2.6.34/arch/x86/mm/ioremap.c
@@ -62,7 +62,8 @@ int ioremap_change_attr(unsigned long va
 static void __iomem *__ioremap_caller(resource_size_t phys_addr,
 		unsigned long size, unsigned long prot_val, void *caller)
 {
-	unsigned long pfn, offset, vaddr;
+	u64 pfn, last_pfn;
+	unsigned long offset, vaddr;
 	resource_size_t last_addr;
 	const resource_size_t unaligned_phys_addr = phys_addr;
 	const unsigned long unaligned_size = size;
@@ -100,10 +101,8 @@ static void __iomem *__ioremap_caller(re
 	/*
 	 * Don't allow anybody to remap normal RAM that we're using..
 	 */
-	for (pfn = phys_addr >> PAGE_SHIFT;
-				(pfn << PAGE_SHIFT) < (last_addr & PAGE_MASK);
-				pfn++) {
-
+	last_pfn = last_addr >> PAGE_SHIFT;
+	for (pfn = phys_addr >> PAGE_SHIFT; pfn < last_pfn; pfn++) {
 		int is_ram = page_is_ram(pfn);
 
 		if (is_ram && pfn_valid(pfn) && !PageReserved(pfn_to_page(pfn)))
@@ -115,7 +114,7 @@ static void __iomem *__ioremap_caller(re
 	 * Mappings have to be page-aligned
 	 */
 	offset = phys_addr & ~PAGE_MASK;
-	phys_addr &= PAGE_MASK;
+	phys_addr &= PHYSICAL_PAGE_MASK;
 	size = PAGE_ALIGN(last_addr+1) - phys_addr;
 
 	retval = reserve_memtype(phys_addr, (u64)phys_addr + size,
Index: linux-2.6.34/include/linux/vmalloc.h
===================================================================
--- linux-2.6.34.orig/include/linux/vmalloc.h
+++ linux-2.6.34/include/linux/vmalloc.h
@@ -30,7 +30,7 @@ struct vm_struct {
 	unsigned long		flags;
 	struct page		**pages;
 	unsigned int		nr_pages;
-	unsigned long		phys_addr;
+	phys_addr_t		phys_addr;
 	void			*caller;
 };
 
Index: linux-2.6.34/lib/ioremap.c
===================================================================
--- linux-2.6.34.orig/lib/ioremap.c
+++ linux-2.6.34/lib/ioremap.c
@@ -13,10 +13,10 @@
 #include <asm/pgtable.h>
 
 static int ioremap_pte_range(pmd_t *pmd, unsigned long addr,
-		unsigned long end, unsigned long phys_addr, pgprot_t prot)
+		unsigned long end, phys_addr_t phys_addr, pgprot_t prot)
 {
 	pte_t *pte;
-	unsigned long pfn;
+	u64 pfn;
 
 	pfn = phys_addr >> PAGE_SHIFT;
 	pte = pte_alloc_kernel(pmd, addr);
@@ -31,7 +31,7 @@ static int ioremap_pte_range(pmd_t *pmd,
 }
 
 static inline int ioremap_pmd_range(pud_t *pud, unsigned long addr,
-		unsigned long end, unsigned long phys_addr, pgprot_t prot)
+		unsigned long end, phys_addr_t phys_addr, pgprot_t prot)
 {
 	pmd_t *pmd;
 	unsigned long next;
@@ -49,7 +49,7 @@ static inline int ioremap_pmd_range(pud_
 }
 
 static inline int ioremap_pud_range(pgd_t *pgd, unsigned long addr,
-		unsigned long end, unsigned long phys_addr, pgprot_t prot)
+		unsigned long end, phys_addr_t phys_addr, pgprot_t prot)
 {
 	pud_t *pud;
 	unsigned long next;
@@ -67,7 +67,7 @@ static inline int ioremap_pud_range(pgd_
 }
 
 int ioremap_page_range(unsigned long addr,
-		       unsigned long end, unsigned long phys_addr, pgprot_t prot)
+		       unsigned long end, phys_addr_t phys_addr, pgprot_t prot)
 {
 	pgd_t *pgd;
 	unsigned long start;
Index: linux-2.6.34/include/linux/io.h
===================================================================
--- linux-2.6.34.orig/include/linux/io.h
+++ linux-2.6.34/include/linux/io.h
@@ -29,10 +29,10 @@ void __iowrite64_copy(void __iomem *to, 
 
 #ifdef CONFIG_MMU
 int ioremap_page_range(unsigned long addr, unsigned long end,
-		       unsigned long phys_addr, pgprot_t prot);
+		       phys_addr_t phys_addr, pgprot_t prot);
 #else
 static inline int ioremap_page_range(unsigned long addr, unsigned long end,
-				     unsigned long phys_addr, pgprot_t prot)
+				     phys_addr_t phys_addr, pgprot_t prot)
 {
 	return 0;
 }



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/4] x86: ioremap: fix physical address check
  2010-06-14  6:38       ` Maciej W. Rozycki
  2010-06-14  8:23         ` Kenji Kaneshige
@ 2010-06-14 15:11         ` H. Peter Anvin
  1 sibling, 0 replies; 25+ messages in thread
From: H. Peter Anvin @ 2010-06-14 15:11 UTC (permalink / raw)
  To: Maciej W. Rozycki
  Cc: Kenji Kaneshige, Thomas Gleixner, Ingo Molnar, linux-kernel,
	linux-pci, jbarnes

On 06/13/2010 11:38 PM, Maciej W. Rozycki wrote:
> On Mon, 14 Jun 2010, Kenji Kaneshige wrote:
> 
>> - Architectural limit of physical address in x86 32-bit mode is 40-bit
>>   (depnds on processor version).
> 
>  According to documentation I happen to have handy this limit is actually 
> 52 bits (and space is currently available in the data structures used for 
> a possible future extension up to 63 bits).
> 

Yes.  There are, however, very likely bugs in several classes due to
incorrect bitmasks as well as 32-bit PFNs.

We have made the decision based on data structure limitations (and just
usability) to not support more than 2^36 bytes of RAM on 32 bits, but
those data structures should not affect I/O.  I;d like to track down and
fix the bugs instead of papering over the problem...

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/4] x86: ioremap: fix physical address check
  2010-06-14  8:27       ` Kenji Kaneshige
@ 2010-06-14 15:12         ` H. Peter Anvin
  0 siblings, 0 replies; 25+ messages in thread
From: H. Peter Anvin @ 2010-06-14 15:12 UTC (permalink / raw)
  To: Kenji Kaneshige
  Cc: Thomas Gleixner, Ingo Molnar, linux-kernel, linux-pci, jbarnes

On 06/14/2010 01:27 AM, Kenji Kaneshige wrote:
> 
> Please ignore above. I misunderstood the architectural limit of physical
> address in x86 32-bit mode. It is not 40-bits, but 52-bits.
> 

Well, it can be anywhere from 32 to 52 bits, depending on the hardware.

	-hpa


-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/4] x86: ioremap: fix physical address check
  2010-06-14  9:02           ` Kenji Kaneshige
@ 2010-06-14 15:40             ` H. Peter Anvin
  0 siblings, 0 replies; 25+ messages in thread
From: H. Peter Anvin @ 2010-06-14 15:40 UTC (permalink / raw)
  To: Kenji Kaneshige
  Cc: Maciej W. Rozycki, Thomas Gleixner, Ingo Molnar, linux-kernel,
	linux-pci, jbarnes

On 06/14/2010 02:02 AM, Kenji Kaneshige wrote:
> 
> Now I guess there is a bug that doesn't handle physical address
> higher than 32-bit properly somewhere...
> 

More than one bug, almost certainly.

Note that all this applies to PAE mode.  Non-PAE mode has a 32-bit
limit, period.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/4] x86: ioremap: fix physical address check
  2010-06-14 11:06           ` Kenji Kaneshige
@ 2010-06-14 18:36             ` H. Peter Anvin
  2010-06-15  2:21               ` Kenji Kaneshige
  2010-06-14 20:16             ` Rolf Eike Beer
  1 sibling, 1 reply; 25+ messages in thread
From: H. Peter Anvin @ 2010-06-14 18:36 UTC (permalink / raw)
  To: Kenji Kaneshige
  Cc: KAMEZAWA Hiroyuki, Thomas Gleixner, Ingo Molnar, linux-kernel,
	linux-pci, jbarnes

On 06/14/2010 04:06 AM, Kenji Kaneshige wrote:
> Index: linux-2.6.34/include/linux/vmalloc.h
> ===================================================================
> --- linux-2.6.34.orig/include/linux/vmalloc.h
> +++ linux-2.6.34/include/linux/vmalloc.h
> @@ -30,7 +30,7 @@ struct vm_struct {
>  	unsigned long		flags;
>  	struct page		**pages;
>  	unsigned int		nr_pages;
> -	unsigned long		phys_addr;
> +	phys_addr_t		phys_addr;
>  	void			*caller;
>  };

This really doesn't look right at all.  If this was required then it
would seem that anything using high addresses would be broken... as such
I can only assume it matters only for lowmem pages...

	-hpa

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/4] x86: ioremap: fix physical address check
  2010-06-14 11:06           ` Kenji Kaneshige
  2010-06-14 18:36             ` H. Peter Anvin
@ 2010-06-14 20:16             ` Rolf Eike Beer
  2010-06-15  2:33               ` Kenji Kaneshige
  1 sibling, 1 reply; 25+ messages in thread
From: Rolf Eike Beer @ 2010-06-14 20:16 UTC (permalink / raw)
  To: Kenji Kaneshige
  Cc: KAMEZAWA Hiroyuki, H. Peter Anvin, Thomas Gleixner, Ingo Molnar,
	linux-kernel, linux-pci, jbarnes

[-- Attachment #1: Type: Text/Plain, Size: 2126 bytes --]

Kenji Kaneshige wrote:
> (2010/06/14 18:13), Kenji Kaneshige wrote:
> > Thank you Hiroyuki.
> > 
> > So many bugs in ioremap()...
> > 
> > Will try with those bugs fixed.
> > 
> > Thanks,
> > Kenji Kaneshige
> 
> The problem seems to be fixed by the following patch. This is still
> under testing. I will post the patch as v2 after testing.
> 
> Thanks,
> Kenji Kaneshige
> 
> 
> Current x86 ioremap() doesn't handle physical address higher than
> 32-bit properly in X86_32 PAE mode. When physical address higher than
> 32-bit is passed to ioremap(), higher 32-bits in physical address is
> cleared wrongly. Due to this bug, ioremap() can map wrong address to
> linear address space.
> 
> In my case, 64-bit MMIO region was assigned to a PCI device (ioat
> device) on my system. Because of the ioremap()'s bug, wrong physical
> address (instead of MMIO region) was mapped to linear address space.
> Because of this, loading ioatdma driver caused unexpected behavior
> (kernel panic, kernel hangup, ...).
> 
> Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
> 
> ---
>  arch/x86/mm/ioremap.c   |   11 +++++------
>  include/linux/io.h      |    4 ++--
>  include/linux/vmalloc.h |    2 +-
>  lib/ioremap.c           |   10 +++++-----
>  4 files changed, 13 insertions(+), 14 deletions(-)
> 
> Index: linux-2.6.34/arch/x86/mm/ioremap.c
> ===================================================================
> --- linux-2.6.34.orig/arch/x86/mm/ioremap.c
> +++ linux-2.6.34/arch/x86/mm/ioremap.c
> @@ -62,7 +62,8 @@ int ioremap_change_attr(unsigned long va
>  static void __iomem *__ioremap_caller(resource_size_t phys_addr,
>  		unsigned long size, unsigned long prot_val, void *caller)
>  {
> -	unsigned long pfn, offset, vaddr;
> +	u64 pfn, last_pfn;
> +	unsigned long offset, vaddr;
>  	resource_size_t last_addr;
>  	const resource_size_t unaligned_phys_addr = phys_addr;
>  	const unsigned long unaligned_size = size;

Why do you use u64 and not resource_size_t for those? That way this would not 
be needlessly big for "real" 32 bit platforms.

Eike

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/4] x86: ioremap: fix physical address check
  2010-06-14 18:36             ` H. Peter Anvin
@ 2010-06-15  2:21               ` Kenji Kaneshige
  0 siblings, 0 replies; 25+ messages in thread
From: Kenji Kaneshige @ 2010-06-15  2:21 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: KAMEZAWA Hiroyuki, Thomas Gleixner, Ingo Molnar, linux-kernel,
	linux-pci, jbarnes

(2010/06/15 3:36), H. Peter Anvin wrote:
> On 06/14/2010 04:06 AM, Kenji Kaneshige wrote:
>> Index: linux-2.6.34/include/linux/vmalloc.h
>> ===================================================================
>> --- linux-2.6.34.orig/include/linux/vmalloc.h
>> +++ linux-2.6.34/include/linux/vmalloc.h
>> @@ -30,7 +30,7 @@ struct vm_struct {
>>   	unsigned long		flags;
>>   	struct page		**pages;
>>   	unsigned int		nr_pages;
>> -	unsigned long		phys_addr;
>> +	phys_addr_t		phys_addr;
>>   	void			*caller;
>>   };
>
> This really doesn't look right at all.  If this was required then it
> would seem that anything using high addresses would be broken... as such
> I can only assume it matters only for lowmem pages...
>

Without this change, /pric/vmallocinfo doesn't display physcal address info
properly (need also to change s_show() in mm/vmalloc (%lx in seq_printf()).

Thanks,
Kenji Kaneshige


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/4] x86: ioremap: fix physical address check
  2010-06-14 20:16             ` Rolf Eike Beer
@ 2010-06-15  2:33               ` Kenji Kaneshige
  0 siblings, 0 replies; 25+ messages in thread
From: Kenji Kaneshige @ 2010-06-15  2:33 UTC (permalink / raw)
  To: Rolf Eike Beer
  Cc: KAMEZAWA Hiroyuki, H. Peter Anvin, Thomas Gleixner, Ingo Molnar,
	linux-kernel, linux-pci, jbarnes

(2010/06/15 5:16), Rolf Eike Beer wrote:
> Kenji Kaneshige wrote:
>> (2010/06/14 18:13), Kenji Kaneshige wrote:
>>> Thank you Hiroyuki.
>>>
>>> So many bugs in ioremap()...
>>>
>>> Will try with those bugs fixed.
>>>
>>> Thanks,
>>> Kenji Kaneshige
>>
>> The problem seems to be fixed by the following patch. This is still
>> under testing. I will post the patch as v2 after testing.
>>
>> Thanks,
>> Kenji Kaneshige
>>
>>
>> Current x86 ioremap() doesn't handle physical address higher than
>> 32-bit properly in X86_32 PAE mode. When physical address higher than
>> 32-bit is passed to ioremap(), higher 32-bits in physical address is
>> cleared wrongly. Due to this bug, ioremap() can map wrong address to
>> linear address space.
>>
>> In my case, 64-bit MMIO region was assigned to a PCI device (ioat
>> device) on my system. Because of the ioremap()'s bug, wrong physical
>> address (instead of MMIO region) was mapped to linear address space.
>> Because of this, loading ioatdma driver caused unexpected behavior
>> (kernel panic, kernel hangup, ...).
>>
>> Signed-off-by: Kenji Kaneshige<kaneshige.kenji@jp.fujitsu.com>
>>
>> ---
>>   arch/x86/mm/ioremap.c   |   11 +++++------
>>   include/linux/io.h      |    4 ++--
>>   include/linux/vmalloc.h |    2 +-
>>   lib/ioremap.c           |   10 +++++-----
>>   4 files changed, 13 insertions(+), 14 deletions(-)
>>
>> Index: linux-2.6.34/arch/x86/mm/ioremap.c
>> ===================================================================
>> --- linux-2.6.34.orig/arch/x86/mm/ioremap.c
>> +++ linux-2.6.34/arch/x86/mm/ioremap.c
>> @@ -62,7 +62,8 @@ int ioremap_change_attr(unsigned long va
>>   static void __iomem *__ioremap_caller(resource_size_t phys_addr,
>>   		unsigned long size, unsigned long prot_val, void *caller)
>>   {
>> -	unsigned long pfn, offset, vaddr;
>> +	u64 pfn, last_pfn;
>> +	unsigned long offset, vaddr;
>>   	resource_size_t last_addr;
>>   	const resource_size_t unaligned_phys_addr = phys_addr;
>>   	const unsigned long unaligned_size = size;
>
> Why do you use u64 and not resource_size_t for those? That way this would not
> be needlessly big for "real" 32 bit platforms.

Thank you for your comment. The reason was I found other code that uses
u64 for pfn in other code. But yes, I will change that.

Thanks,
Kenji Kaneshige



^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2010-06-15  2:34 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-06-11  9:17 [RFC][PATCH 0/4] x86: ioremap() problem in X86_32 PAE Kenji Kaneshige
2010-06-11  9:18 ` [PATCH 1/4] x86: ioremap: fix wrong address masking Kenji Kaneshige
2010-06-11  9:20 ` [PATCH 2/4] x86: ioremap: fix physical address check Kenji Kaneshige
2010-06-11 17:43   ` H. Peter Anvin
2010-06-14  0:18     ` KAMEZAWA Hiroyuki
2010-06-14  8:59       ` KAMEZAWA Hiroyuki
2010-06-14  9:13         ` Kenji Kaneshige
2010-06-14 11:06           ` Kenji Kaneshige
2010-06-14 18:36             ` H. Peter Anvin
2010-06-15  2:21               ` Kenji Kaneshige
2010-06-14 20:16             ` Rolf Eike Beer
2010-06-15  2:33               ` Kenji Kaneshige
2010-06-14  1:54     ` Kenji Kaneshige
2010-06-14  6:38       ` Maciej W. Rozycki
2010-06-14  8:23         ` Kenji Kaneshige
2010-06-14  9:02           ` Kenji Kaneshige
2010-06-14 15:40             ` H. Peter Anvin
2010-06-14 15:11         ` H. Peter Anvin
2010-06-14  8:27       ` Kenji Kaneshige
2010-06-14 15:12         ` H. Peter Anvin
2010-06-11  9:20 ` [PATCH 3/4] x86: ioremap: remove physical address warning message Kenji Kaneshige
2010-06-11 17:44   ` H. Peter Anvin
2010-06-14  2:06     ` Kenji Kaneshige
2010-06-11  9:21 ` [PATCH 4/4] x86: ioremap: fix normal ram range check Kenji Kaneshige
2010-06-11 17:41 ` [RFC][PATCH 0/4] x86: ioremap() problem in X86_32 PAE H. Peter Anvin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.