linux-kernel.vger.kernel.org archive mirror
* [PATCH] x86: mm: Check if PUD is large when validating a kernel address
@ 2013-02-11 14:52 Mel Gorman
  2013-02-11 19:41 ` Rik van Riel
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Mel Gorman @ 2013-02-11 14:52 UTC (permalink / raw)
  To: Ingo Molnar, Andrew Morton; +Cc: Mel Gorman, linux-kernel, linux-mm

A user reported the following oops when a backup process read
/proc/kcore.

 BUG: unable to handle kernel paging request at ffffbb00ff33b000
 IP: [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
 PGD 0
 Oops: 0000 [#1] SMP
 CPU 6
 Modules linked in: af_packet nfs lockd fscache auth_rpcgss nfs_acl sunrpc 8021q garp stp llc cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf microcode fuse nls_iso8859_1 nls_cp437 vfat fat loop dm_mod ioatdma ipv6 ipv6_lib igb dca i7core_edac edac_core i2c_i801 i2c_core cdc_ether usbnet bnx2 mii iTCO_wdt iTCO_vendor_support shpchp rtc_cmos pci_hotplug tpm_tis sg tpm pcspkr tpm_bios serio_raw button ext3 jbd mbcache uhci_hcd ehci_hcd usbcore sd_mod crc_t10dif usb_common processor thermal_sys hwmon scsi_dh_emc scsi_dh_rdac scsi_dh_alua scsi_dh_hp_sw scsi_dh ata_generic ata_piix libata megaraid_sas scsi_mod

 Pid: 16196, comm: Hibackp Not tainted 3.0.13-0.27-default #1 IBM System x3550 M3 -[7944 K3G]-/94Y7614
 RIP: 0010:[<ffffffff8103157e>]  [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
 RSP: 0018:ffff88094165fe80  EFLAGS: 00010246
 RAX: 00003300ff33b000 RBX: ffff880100000000 RCX: 0000000000000000
 RDX: 0000000100000000 RSI: ffff880000000000 RDI: ff32b300ff33b400
 RBP: 0000000000001000 R08: 00003ffffffff000 R09: 0000000000000000
 R10: 22302e31223d6e6f R11: 0000000000000246 R12: 0000000000001000
 R13: 0000000000003000 R14: 0000000000571be0 R15: ffff88094165ff50
 FS:  00007ff152d33700(0000) GS:ffff88097f2c0000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: ffffbb00ff33b000 CR3: 00000009405a3000 CR4: 00000000000006e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process Hibackp (pid: 16196, threadinfo ffff88094165e000, task ffff8808eb9ba600)
 Stack:
  ffffffff811b8aaa 0000000000004000 ffff880943fea480 ffff8808ef2bae50
  ffff880943d32980 fffffffffffffffb ffff8808ef2bae40 ffff88094165ff50
  0000000000004000 000000000056ebe0 ffffffff811ad847 000000000056ebe0
 Call Trace:
  [<ffffffff811b8aaa>] read_kcore+0x17a/0x370
  [<ffffffff811ad847>] proc_reg_read+0x77/0xc0
  [<ffffffff81151687>] vfs_read+0xc7/0x130
  [<ffffffff811517f3>] sys_read+0x53/0xa0
  [<ffffffff81449692>] system_call_fastpath+0x16/0x1b

Investigation determined that the bug triggered when reading system RAM
at the 4G mark. On this system, that was the first address using 1G pages
for the virt->phys direct mapping so the PUD is pointing to a physical
address, not a PMD page.  The problem is that the page table walker in
kern_addr_valid() is not checking pud_large() and treats the physical
address as if it was a PMD.  If it happens to look like pmd_none then it'll
silently fail, probably returning zeros instead of real data. If the data
happens to look like a present PMD though, it will be walked resulting in
the oops above. This patch adds the necessary pud_large() check.

Unfortunately the problem was not readily reproducible, and the user is now
running the backup program without accessing /proc/kcore, so the patch has
not been validated, but I think it makes sense. If reviewers agree then it
should also be included in -stable back as far as 3.0-stable.

Cc: stable@vger.kernel.org
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
 arch/x86/include/asm/pgtable.h |    5 +++++
 arch/x86/mm/init_64.c          |    3 +++
 2 files changed, 8 insertions(+)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 5199db2..1c1a955 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -142,6 +142,11 @@ static inline unsigned long pmd_pfn(pmd_t pmd)
 	return (pmd_val(pmd) & PTE_PFN_MASK) >> PAGE_SHIFT;
 }
 
+static inline unsigned long pud_pfn(pud_t pud)
+{
+	return (pud_val(pud) & PTE_PFN_MASK) >> PAGE_SHIFT;
+}
+
 #define pte_page(pte)	pfn_to_page(pte_pfn(pte))
 
 static inline int pmd_large(pmd_t pte)
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 2ead3c8..75c9a6a 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -831,6 +831,9 @@ int kern_addr_valid(unsigned long addr)
 	if (pud_none(*pud))
 		return 0;
 
+	if (pud_large(*pud))
+		return pfn_valid(pud_pfn(*pud));
+
 	pmd = pmd_offset(pud, addr);
 	if (pmd_none(*pmd))
 		return 0;


* Re: [PATCH] x86: mm: Check if PUD is large when validating a kernel address
  2013-02-11 14:52 [PATCH] x86: mm: Check if PUD is large when validating a kernel address Mel Gorman
@ 2013-02-11 19:41 ` Rik van Riel
  2013-02-12  6:40 ` Johannes Weiner
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: Rik van Riel @ 2013-02-11 19:41 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Ingo Molnar, Andrew Morton, linux-kernel, linux-mm

On 02/11/2013 09:52 AM, Mel Gorman wrote:
> A user reported the following oops when a backup process read
> /proc/kcore.
>
>   BUG: unable to handle kernel paging request at ffffbb00ff33b000

> Investigation determined that the bug triggered when reading system RAM
> at the 4G mark. On this system, that was the first address using 1G pages
> for the virt->phys direct mapping so the PUD is pointing to a physical
> address, not a PMD page.  The problem is that the page table walker in
> kern_addr_valid() is not checking pud_large() and treats the physical
> address as if it was a PMD.  If it happens to look like pmd_none then it'll
> silently fail, probably returning zeros instead of real data. If the data
> happens to look like a present PMD though, it will be walked resulting in
> the oops above. This patch adds the necessary pud_large() check.
>
> Unfortunately the problem was not readily reproducible and now they are
> running the backup program without accessing /proc/kcore so the patch has
> not been validated but I think it makes sense. If reviewers agree then it
> should also be included in -stable back as far as 3.0-stable.
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Mel Gorman <mgorman@suse.de>

Reviewed-by: Rik van Riel <riel@redhat.com>



* Re: [PATCH] x86: mm: Check if PUD is large when validating a kernel address
  2013-02-11 14:52 [PATCH] x86: mm: Check if PUD is large when validating a kernel address Mel Gorman
  2013-02-11 19:41 ` Rik van Riel
@ 2013-02-12  6:40 ` Johannes Weiner
  2013-02-12 17:43 ` Michal Hocko
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: Johannes Weiner @ 2013-02-12  6:40 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Ingo Molnar, Andrew Morton, linux-kernel, linux-mm

On Mon, Feb 11, 2013 at 02:52:36PM +0000, Mel Gorman wrote:
> A user reported the following oops when a backup process read
> /proc/kcore.
> 
>  BUG: unable to handle kernel paging request at ffffbb00ff33b000
>  IP: [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
>  PGD 0
>  Oops: 0000 [#1] SMP
>  CPU 6
>  Modules linked in: af_packet nfs lockd fscache auth_rpcgss nfs_acl sunrpc 8021q garp stp llc cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf microcode fuse nls_iso8859_1 nls_cp437 vfat fat loop dm_mod ioatdma ipv6 ipv6_lib igb dca i7core_edac edac_core i2c_i801 i2c_core cdc_ether usbnet bnx2 mii iTCO_wdt iTCO_vendor_support shpchp rtc_cmos pci_hotplug tpm_tis sg tpm pcspkr tpm_bios serio_raw button ext3 jbd mbcache uhci_hcd ehci_hcd usbcore sd_mod crc_t10dif usb_common processor thermal_sys hwmon scsi_dh_emc scsi_dh_rdac scsi_dh_alua scsi_dh_hp_sw scsi_dh ata_generic ata_piix libata megaraid_sas scsi_mod
> 
>  Pid: 16196, comm: Hibackp Not tainted 3.0.13-0.27-default #1 IBM System x3550 M3 -[7944 K3G]-/94Y7614
>  RIP: 0010:[<ffffffff8103157e>]  [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
>  RSP: 0018:ffff88094165fe80  EFLAGS: 00010246
>  RAX: 00003300ff33b000 RBX: ffff880100000000 RCX: 0000000000000000
>  RDX: 0000000100000000 RSI: ffff880000000000 RDI: ff32b300ff33b400
>  RBP: 0000000000001000 R08: 00003ffffffff000 R09: 0000000000000000
>  R10: 22302e31223d6e6f R11: 0000000000000246 R12: 0000000000001000
>  R13: 0000000000003000 R14: 0000000000571be0 R15: ffff88094165ff50
>  FS:  00007ff152d33700(0000) GS:ffff88097f2c0000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>  CR2: ffffbb00ff33b000 CR3: 00000009405a3000 CR4: 00000000000006e0
>  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>  Process Hibackp (pid: 16196, threadinfo ffff88094165e000, task ffff8808eb9ba600)
>  Stack:
>   ffffffff811b8aaa 0000000000004000 ffff880943fea480 ffff8808ef2bae50
>   ffff880943d32980 fffffffffffffffb ffff8808ef2bae40 ffff88094165ff50
>   0000000000004000 000000000056ebe0 ffffffff811ad847 000000000056ebe0
>  Call Trace:
>   [<ffffffff811b8aaa>] read_kcore+0x17a/0x370
>   [<ffffffff811ad847>] proc_reg_read+0x77/0xc0
>   [<ffffffff81151687>] vfs_read+0xc7/0x130
>   [<ffffffff811517f3>] sys_read+0x53/0xa0
>   [<ffffffff81449692>] system_call_fastpath+0x16/0x1b
> 
> Investigation determined that the bug triggered when reading system RAM
> at the 4G mark. On this system, that was the first address using 1G pages
> for the virt->phys direct mapping so the PUD is pointing to a physical
> address, not a PMD page.  The problem is that the page table walker in
> kern_addr_valid() is not checking pud_large() and treats the physical
> address as if it was a PMD.  If it happens to look like pmd_none then it'll
> silently fail, probably returning zeros instead of real data. If the data
> happens to look like a present PMD though, it will be walked resulting in
> the oops above. This patch adds the necessary pud_large() check.
> 
> Unfortunately the problem was not readily reproducible and now they are
> running the backup program without accessing /proc/kcore so the patch has
> not been validated but I think it makes sense. If reviewers agree then it
> should also be included in -stable back as far as 3.0-stable.
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Mel Gorman <mgorman@suse.de>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

Agreed also on the backporting to -stable as far as possible.


* Re: [PATCH] x86: mm: Check if PUD is large when validating a kernel address
  2013-02-11 14:52 [PATCH] x86: mm: Check if PUD is large when validating a kernel address Mel Gorman
  2013-02-11 19:41 ` Rik van Riel
  2013-02-12  6:40 ` Johannes Weiner
@ 2013-02-12 17:43 ` Michal Hocko
  2013-02-13 11:02 ` [PATCH] x86: mm: Check if PUD is large when validating a kernel address v2 Mel Gorman
  2013-02-13 12:12 ` [tip:x86/urgent] x86/mm: Check if PUD is large when validating a kernel address tip-bot for Mel Gorman
  4 siblings, 0 replies; 13+ messages in thread
From: Michal Hocko @ 2013-02-12 17:43 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Ingo Molnar, Andrew Morton, linux-kernel, linux-mm

On Mon 11-02-13 14:52:36, Mel Gorman wrote:
> A user reported the following oops when a backup process read
> /proc/kcore.
> 
>  BUG: unable to handle kernel paging request at ffffbb00ff33b000
>  IP: [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
>  PGD 0
>  Oops: 0000 [#1] SMP
>  CPU 6
>  Modules linked in: af_packet nfs lockd fscache auth_rpcgss nfs_acl sunrpc 8021q garp stp llc cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf microcode fuse nls_iso8859_1 nls_cp437 vfat fat loop dm_mod ioatdma ipv6 ipv6_lib igb dca i7core_edac edac_core i2c_i801 i2c_core cdc_ether usbnet bnx2 mii iTCO_wdt iTCO_vendor_support shpchp rtc_cmos pci_hotplug tpm_tis sg tpm pcspkr tpm_bios serio_raw button ext3 jbd mbcache uhci_hcd ehci_hcd usbcore sd_mod crc_t10dif usb_common processor thermal_sys hwmon scsi_dh_emc scsi_dh_rdac scsi_dh_alua scsi_dh_hp_sw scsi_dh ata_generic ata_piix libata megaraid_sas scsi_mod
> 
>  Pid: 16196, comm: Hibackp Not tainted 3.0.13-0.27-default #1 IBM System x3550 M3 -[7944 K3G]-/94Y7614
>  RIP: 0010:[<ffffffff8103157e>]  [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
>  RSP: 0018:ffff88094165fe80  EFLAGS: 00010246
>  RAX: 00003300ff33b000 RBX: ffff880100000000 RCX: 0000000000000000
>  RDX: 0000000100000000 RSI: ffff880000000000 RDI: ff32b300ff33b400
>  RBP: 0000000000001000 R08: 00003ffffffff000 R09: 0000000000000000
>  R10: 22302e31223d6e6f R11: 0000000000000246 R12: 0000000000001000
>  R13: 0000000000003000 R14: 0000000000571be0 R15: ffff88094165ff50
>  FS:  00007ff152d33700(0000) GS:ffff88097f2c0000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>  CR2: ffffbb00ff33b000 CR3: 00000009405a3000 CR4: 00000000000006e0
>  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>  Process Hibackp (pid: 16196, threadinfo ffff88094165e000, task ffff8808eb9ba600)
>  Stack:
>   ffffffff811b8aaa 0000000000004000 ffff880943fea480 ffff8808ef2bae50
>   ffff880943d32980 fffffffffffffffb ffff8808ef2bae40 ffff88094165ff50
>   0000000000004000 000000000056ebe0 ffffffff811ad847 000000000056ebe0
>  Call Trace:
>   [<ffffffff811b8aaa>] read_kcore+0x17a/0x370
>   [<ffffffff811ad847>] proc_reg_read+0x77/0xc0
>   [<ffffffff81151687>] vfs_read+0xc7/0x130
>   [<ffffffff811517f3>] sys_read+0x53/0xa0
>   [<ffffffff81449692>] system_call_fastpath+0x16/0x1b
> 
> Investigation determined that the bug triggered when reading system RAM
> at the 4G mark. On this system, that was the first address using 1G pages
> for the virt->phys direct mapping so the PUD is pointing to a physical
> address, not a PMD page.  The problem is that the page table walker in
> kern_addr_valid() is not checking pud_large() and treats the physical
> address as if it was a PMD.  If it happens to look like pmd_none then it'll
> silently fail, probably returning zeros instead of real data. If the data
> happens to look like a present PMD though, it will be walked resulting in
> the oops above. This patch adds the necessary pud_large() check.
> 
> Unfortunately the problem was not readily reproducible and now they are
> running the backup program without accessing /proc/kcore so the patch has
> not been validated but I think it makes sense. If reviewers agree then it
> should also be included in -stable back as far as 3.0-stable.
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Mel Gorman <mgorman@suse.de>

Reviewed-by: Michal Hocko <mhocko@suse.cz>

> ---
>  arch/x86/include/asm/pgtable.h |    5 +++++
>  arch/x86/mm/init_64.c          |    3 +++
>  2 files changed, 8 insertions(+)
> 
> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
> index 5199db2..1c1a955 100644
> --- a/arch/x86/include/asm/pgtable.h
> +++ b/arch/x86/include/asm/pgtable.h
> @@ -142,6 +142,11 @@ static inline unsigned long pmd_pfn(pmd_t pmd)
>  	return (pmd_val(pmd) & PTE_PFN_MASK) >> PAGE_SHIFT;
>  }
>  
> +static inline unsigned long pud_pfn(pud_t pud)
> +{
> +	return (pud_val(pud) & PTE_PFN_MASK) >> PAGE_SHIFT;
> +}
> +
>  #define pte_page(pte)	pfn_to_page(pte_pfn(pte))
>  
>  static inline int pmd_large(pmd_t pte)
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 2ead3c8..75c9a6a 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -831,6 +831,9 @@ int kern_addr_valid(unsigned long addr)
>  	if (pud_none(*pud))
>  		return 0;
>  
> +	if (pud_large(*pud))
> +		return pfn_valid(pud_pfn(*pud));
> +
>  	pmd = pmd_offset(pud, addr);
>  	if (pmd_none(*pmd))
>  		return 0;
> 

-- 
Michal Hocko
SUSE Labs


* [PATCH] x86: mm: Check if PUD is large when validating a kernel address v2
  2013-02-11 14:52 [PATCH] x86: mm: Check if PUD is large when validating a kernel address Mel Gorman
                   ` (2 preceding siblings ...)
  2013-02-12 17:43 ` Michal Hocko
@ 2013-02-13 11:02 ` Mel Gorman
  2013-02-13 11:10   ` Ingo Molnar
  2013-03-01  6:43   ` Simon Jeons
  2013-02-13 12:12 ` [tip:x86/urgent] x86/mm: Check if PUD is large when validating a kernel address tip-bot for Mel Gorman
  4 siblings, 2 replies; 13+ messages in thread
From: Mel Gorman @ 2013-02-13 11:02 UTC (permalink / raw)
  To: Ingo Molnar, Andrew Morton; +Cc: linux-kernel, linux-mm, riel, mhocko, hannes

Andrew or Ingo, please pick up.

Changelog since v1
  o Add reviewed-bys and acked-bys

A user reported a bug whereby a backup process accessing /proc/kcore
caused an oops.

 BUG: unable to handle kernel paging request at ffffbb00ff33b000
 IP: [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
 PGD 0
 Oops: 0000 [#1] SMP
 CPU 6
 Modules linked in: af_packet nfs lockd fscache auth_rpcgss nfs_acl sunrpc 8021q garp stp llc cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf microcode fuse nls_iso8859_1 nls_cp437 vfat fat loop dm_mod ioatdma ipv6 ipv6_lib igb dca i7core_edac edac_core i2c_i801 i2c_core cdc_ether usbnet bnx2 mii iTCO_wdt iTCO_vendor_support shpchp rtc_cmos pci_hotplug tpm_tis sg tpm pcspkr tpm_bios serio_raw button ext3 jbd mbcache uhci_hcd ehci_hcd usbcore sd_mod crc_t10dif usb_common processor thermal_sys hwmon scsi_dh_emc scsi_dh_rdac scsi_dh_alua scsi_dh_hp_sw scsi_dh ata_generic ata_piix libata megaraid_sas scsi_mod

 Pid: 16196, comm: Hibackp Not tainted 3.0.13-0.27-default #1 IBM System x3550 M3 -[7944 K3G]-/94Y7614
 RIP: 0010:[<ffffffff8103157e>]  [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
 RSP: 0018:ffff88094165fe80  EFLAGS: 00010246
 RAX: 00003300ff33b000 RBX: ffff880100000000 RCX: 0000000000000000
 RDX: 0000000100000000 RSI: ffff880000000000 RDI: ff32b300ff33b400
 RBP: 0000000000001000 R08: 00003ffffffff000 R09: 0000000000000000
 R10: 22302e31223d6e6f R11: 0000000000000246 R12: 0000000000001000
 R13: 0000000000003000 R14: 0000000000571be0 R15: ffff88094165ff50
 FS:  00007ff152d33700(0000) GS:ffff88097f2c0000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: ffffbb00ff33b000 CR3: 00000009405a3000 CR4: 00000000000006e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process Hibackp (pid: 16196, threadinfo ffff88094165e000, task ffff8808eb9ba600)
 Stack:
  ffffffff811b8aaa 0000000000004000 ffff880943fea480 ffff8808ef2bae50
  ffff880943d32980 fffffffffffffffb ffff8808ef2bae40 ffff88094165ff50
  0000000000004000 000000000056ebe0 ffffffff811ad847 000000000056ebe0
 Call Trace:
  [<ffffffff811b8aaa>] read_kcore+0x17a/0x370
  [<ffffffff811ad847>] proc_reg_read+0x77/0xc0
  [<ffffffff81151687>] vfs_read+0xc7/0x130
  [<ffffffff811517f3>] sys_read+0x53/0xa0
  [<ffffffff81449692>] system_call_fastpath+0x16/0x1b

Investigation determined that the bug triggered when reading system RAM
at the 4G mark. On this system, that was the first address using 1G pages
for the virt->phys direct mapping so the PUD is pointing to a physical
address, not a PMD page.  The problem is that the page table walker in
kern_addr_valid() is not checking pud_large() and treats the physical
address as if it was a PMD.  If it happens to look like pmd_none then it'll
silently fail, probably returning zeros instead of real data. If the data
happens to look like a present PMD though, it will be walked resulting in
the oops above. This patch adds the necessary pud_large() check.

Cc: stable@vger.kernel.org
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
---
 arch/x86/include/asm/pgtable.h |    5 +++++
 arch/x86/mm/init_64.c          |    3 +++
 2 files changed, 8 insertions(+)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 5199db2..1c1a955 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -142,6 +142,11 @@ static inline unsigned long pmd_pfn(pmd_t pmd)
 	return (pmd_val(pmd) & PTE_PFN_MASK) >> PAGE_SHIFT;
 }
 
+static inline unsigned long pud_pfn(pud_t pud)
+{
+	return (pud_val(pud) & PTE_PFN_MASK) >> PAGE_SHIFT;
+}
+
 #define pte_page(pte)	pfn_to_page(pte_pfn(pte))
 
 static inline int pmd_large(pmd_t pte)
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 2ead3c8..75c9a6a 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -831,6 +831,9 @@ int kern_addr_valid(unsigned long addr)
 	if (pud_none(*pud))
 		return 0;
 
+	if (pud_large(*pud))
+		return pfn_valid(pud_pfn(*pud));
+
 	pmd = pmd_offset(pud, addr);
 	if (pmd_none(*pmd))
 		return 0;


* Re: [PATCH] x86: mm: Check if PUD is large when validating a kernel address v2
  2013-02-13 11:02 ` [PATCH] x86: mm: Check if PUD is large when validating a kernel address v2 Mel Gorman
@ 2013-02-13 11:10   ` Ingo Molnar
  2013-02-13 11:14     ` Mel Gorman
  2013-03-01  6:43   ` Simon Jeons
  1 sibling, 1 reply; 13+ messages in thread
From: Ingo Molnar @ 2013-02-13 11:10 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Andrew Morton, linux-kernel, linux-mm, riel, mhocko, hannes


* Mel Gorman <mgorman@suse.de> wrote:

> Andrew or Ingo, please pick up.

Already did - will push it out later today.

Thanks,

	Ingo


* Re: [PATCH] x86: mm: Check if PUD is large when validating a kernel address v2
  2013-02-13 11:10   ` Ingo Molnar
@ 2013-02-13 11:14     ` Mel Gorman
  0 siblings, 0 replies; 13+ messages in thread
From: Mel Gorman @ 2013-02-13 11:14 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Andrew Morton, linux-kernel, linux-mm, riel, mhocko, hannes

On Wed, Feb 13, 2013 at 12:10:31PM +0100, Ingo Molnar wrote:
> 
> * Mel Gorman <mgorman@suse.de> wrote:
> 
> > Andrew or Ingo, please pick up.
> 
> Already did - will push it out later today.
> 

Whoops, thanks. Sorry for the noise.

-- 
Mel Gorman
SUSE Labs


* [tip:x86/urgent] x86/mm: Check if PUD is large when validating a kernel address
  2013-02-11 14:52 [PATCH] x86: mm: Check if PUD is large when validating a kernel address Mel Gorman
                   ` (3 preceding siblings ...)
  2013-02-13 11:02 ` [PATCH] x86: mm: Check if PUD is large when validating a kernel address v2 Mel Gorman
@ 2013-02-13 12:12 ` tip-bot for Mel Gorman
  4 siblings, 0 replies; 13+ messages in thread
From: tip-bot for Mel Gorman @ 2013-02-13 12:12 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, hannes, riel, mgorman, tglx, mhocko

Commit-ID:  0ee364eb316348ddf3e0dfcd986f5f13f528f821
Gitweb:     http://git.kernel.org/tip/0ee364eb316348ddf3e0dfcd986f5f13f528f821
Author:     Mel Gorman <mgorman@suse.de>
AuthorDate: Mon, 11 Feb 2013 14:52:36 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 13 Feb 2013 10:02:55 +0100

x86/mm: Check if PUD is large when validating a kernel address

A user reported the following oops when a backup process reads
/proc/kcore:

 BUG: unable to handle kernel paging request at ffffbb00ff33b000
 IP: [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
 [...]

 Call Trace:
  [<ffffffff811b8aaa>] read_kcore+0x17a/0x370
  [<ffffffff811ad847>] proc_reg_read+0x77/0xc0
  [<ffffffff81151687>] vfs_read+0xc7/0x130
  [<ffffffff811517f3>] sys_read+0x53/0xa0
  [<ffffffff81449692>] system_call_fastpath+0x16/0x1b

Investigation determined that the bug triggered when reading
system RAM at the 4G mark. On this system, that was the first
address using 1G pages for the virt->phys direct mapping so the
PUD is pointing to a physical address, not a PMD page.

The problem is that the page table walker in kern_addr_valid() is
not checking pud_large() and treats the physical address as if
it was a PMD.  If it happens to look like pmd_none then it'll
silently fail, probably returning zeros instead of real data. If
the data happens to look like a present PMD though, it will be
walked resulting in the oops above.

This patch adds the necessary pud_large() check.

Unfortunately the problem was not readily reproducible and now
they are running the backup program without accessing
/proc/kcore so the patch has not been validated but I think it
makes sense.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: stable@vger.kernel.org
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20130211145236.GX21389@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/pgtable.h | 5 +++++
 arch/x86/mm/init_64.c          | 3 +++
 2 files changed, 8 insertions(+)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 5199db2..1c1a955 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -142,6 +142,11 @@ static inline unsigned long pmd_pfn(pmd_t pmd)
 	return (pmd_val(pmd) & PTE_PFN_MASK) >> PAGE_SHIFT;
 }
 
+static inline unsigned long pud_pfn(pud_t pud)
+{
+	return (pud_val(pud) & PTE_PFN_MASK) >> PAGE_SHIFT;
+}
+
 #define pte_page(pte)	pfn_to_page(pte_pfn(pte))
 
 static inline int pmd_large(pmd_t pte)
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 2ead3c8..75c9a6a 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -831,6 +831,9 @@ int kern_addr_valid(unsigned long addr)
 	if (pud_none(*pud))
 		return 0;
 
+	if (pud_large(*pud))
+		return pfn_valid(pud_pfn(*pud));
+
 	pmd = pmd_offset(pud, addr);
 	if (pmd_none(*pmd))
 		return 0;


* Re: [PATCH] x86: mm: Check if PUD is large when validating a kernel address v2
  2013-02-13 11:02 ` [PATCH] x86: mm: Check if PUD is large when validating a kernel address v2 Mel Gorman
  2013-02-13 11:10   ` Ingo Molnar
@ 2013-03-01  6:43   ` Simon Jeons
  2013-03-01  9:15     ` Chen Gong
  1 sibling, 1 reply; 13+ messages in thread
From: Simon Jeons @ 2013-03-01  6:43 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Ingo Molnar, Andrew Morton, linux-kernel, linux-mm, riel, mhocko, hannes

On 02/13/2013 07:02 PM, Mel Gorman wrote:
> Andrew or Ingo, please pick up.
>
> Changelog since v1
>    o Add reviewed-bys and acked-bys
>
> A user reported a bug whereby a backup process accessing /proc/kcore
> caused an oops.
>
>   BUG: unable to handle kernel paging request at ffffbb00ff33b000
>   IP: [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
>   PGD 0
>   Oops: 0000 [#1] SMP
>   CPU 6
>   Modules linked in: af_packet nfs lockd fscache auth_rpcgss nfs_acl sunrpc 8021q garp stp llc cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf microcode fuse nls_iso8859_1 nls_cp437 vfat fat loop dm_mod ioatdma ipv6 ipv6_lib igb dca i7core_edac edac_core i2c_i801 i2c_core cdc_ether usbnet bnx2 mii iTCO_wdt iTCO_vendor_support shpchp rtc_cmos pci_hotplug tpm_tis sg tpm pcspkr tpm_bios serio_raw button ext3 jbd mbcache uhci_hcd ehci_hcd usbcore sd_mod crc_t10dif usb_common processor thermal_sys hwmon scsi_dh_emc scsi_dh_rdac scsi_dh_alua scsi_dh_hp_sw scsi_dh ata_generic ata_piix libata megaraid_sas scsi_mod
>
>   Pid: 16196, comm: Hibackp Not tainted 3.0.13-0.27-default #1 IBM System x3550 M3 -[7944 K3G]-/94Y7614
>   RIP: 0010:[<ffffffff8103157e>]  [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
>   RSP: 0018:ffff88094165fe80  EFLAGS: 00010246
>   RAX: 00003300ff33b000 RBX: ffff880100000000 RCX: 0000000000000000
>   RDX: 0000000100000000 RSI: ffff880000000000 RDI: ff32b300ff33b400
>   RBP: 0000000000001000 R08: 00003ffffffff000 R09: 0000000000000000
>   R10: 22302e31223d6e6f R11: 0000000000000246 R12: 0000000000001000
>   R13: 0000000000003000 R14: 0000000000571be0 R15: ffff88094165ff50
>   FS:  00007ff152d33700(0000) GS:ffff88097f2c0000(0000) knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>   CR2: ffffbb00ff33b000 CR3: 00000009405a3000 CR4: 00000000000006e0
>   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>   DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>   Process Hibackp (pid: 16196, threadinfo ffff88094165e000, task ffff8808eb9ba600)
>   Stack:
>    ffffffff811b8aaa 0000000000004000 ffff880943fea480 ffff8808ef2bae50
>    ffff880943d32980 fffffffffffffffb ffff8808ef2bae40 ffff88094165ff50
>    0000000000004000 000000000056ebe0 ffffffff811ad847 000000000056ebe0
>   Call Trace:
>    [<ffffffff811b8aaa>] read_kcore+0x17a/0x370
>    [<ffffffff811ad847>] proc_reg_read+0x77/0xc0
>    [<ffffffff81151687>] vfs_read+0xc7/0x130
>    [<ffffffff811517f3>] sys_read+0x53/0xa0
>    [<ffffffff81449692>] system_call_fastpath+0x16/0x1b
>
> Investigation determined that the bug triggered when reading system RAM
> at the 4G mark. On this system, that was the first address using 1G pages

Do you mean there is one page which is 1G?

> for the virt->phys direct mapping so the PUD is pointing to a physical
> address, not a PMD page.  The problem is that the page table walker in
> kern_addr_valid() is not checking pud_large() and treats the physical
> address as if it was a PMD.  If it happens to look like pmd_none then it'll
> silently fail, probably returning zeros instead of real data. If the data
> happens to look like a present PMD though, it will be walked resulting in
> the oops above. This patch adds the necessary pud_large() check.
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Mel Gorman <mgorman@suse.de>
> Reviewed-by: Rik van Riel <riel@redhat.com>
> Reviewed-by: Michal Hocko <mhocko@suse.cz>
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> ---
>   arch/x86/include/asm/pgtable.h |    5 +++++
>   arch/x86/mm/init_64.c          |    3 +++
>   2 files changed, 8 insertions(+)
>
> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
> index 5199db2..1c1a955 100644
> --- a/arch/x86/include/asm/pgtable.h
> +++ b/arch/x86/include/asm/pgtable.h
> @@ -142,6 +142,11 @@ static inline unsigned long pmd_pfn(pmd_t pmd)
>   	return (pmd_val(pmd) & PTE_PFN_MASK) >> PAGE_SHIFT;
>   }
>   
> +static inline unsigned long pud_pfn(pud_t pud)
> +{
> +	return (pud_val(pud) & PTE_PFN_MASK) >> PAGE_SHIFT;
> +}
> +
>   #define pte_page(pte)	pfn_to_page(pte_pfn(pte))
>   
>   static inline int pmd_large(pmd_t pte)
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 2ead3c8..75c9a6a 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -831,6 +831,9 @@ int kern_addr_valid(unsigned long addr)
>   	if (pud_none(*pud))
>   		return 0;
>   
> +	if (pud_large(*pud))
> +		return pfn_valid(pud_pfn(*pud));
> +
>   	pmd = pmd_offset(pud, addr);
>   	if (pmd_none(*pmd))
>   		return 0;
>



* Re: [PATCH] x86: mm: Check if PUD is large when validating a kernel address v2
  2013-03-01  6:43   ` Simon Jeons
@ 2013-03-01  9:15     ` Chen Gong
  2013-03-01  9:21       ` Simon Jeons
  0 siblings, 1 reply; 13+ messages in thread
From: Chen Gong @ 2013-03-01  9:15 UTC (permalink / raw)
  To: Simon Jeons
  Cc: Mel Gorman, Ingo Molnar, Andrew Morton, linux-kernel, linux-mm,
	riel, mhocko, hannes

On Fri, Mar 01, 2013 at 02:43:53PM +0800, Simon Jeons wrote:
> On 02/13/2013 07:02 PM, Mel Gorman wrote:
> >Andrew or Ingo, please pick up.
> >
> >Changelog since v1
> >   o Add reviewed-bys and acked-bys
> >
> >A user reported a bug whereby a backup process accessing /proc/kcore
> >caused an oops.
> >
> >  BUG: unable to handle kernel paging request at ffffbb00ff33b000
> >  IP: [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
> >  PGD 0
> >  Oops: 0000 [#1] SMP
> >  CPU 6
> >  Modules linked in: af_packet nfs lockd fscache auth_rpcgss nfs_acl sunrpc 8021q garp stp llc cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf microcode fuse nls_iso8859_1 nls_cp437 vfat fat loop dm_mod ioatdma ipv6 ipv6_lib igb dca i7core_edac edac_core i2c_i801 i2c_core cdc_ether usbnet bnx2 mii iTCO_wdt iTCO_vendor_support shpchp rtc_cmos pci_hotplug tpm_tis sg tpm pcspkr tpm_bios serio_raw button ext3 jbd mbcache uhci_hcd ehci_hcd usbcore sd_mod crc_t10dif usb_common processor thermal_sys hwmon scsi_dh_emc scsi_dh_rdac scsi_dh_alua scsi_dh_hp_sw scsi_dh ata_generic ata_piix libata megaraid_sas scsi_mod
> >
> >  Pid: 16196, comm: Hibackp Not tainted 3.0.13-0.27-default #1 IBM System x3550 M3 -[7944 K3G]-/94Y7614
> >  RIP: 0010:[<ffffffff8103157e>]  [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
> >  RSP: 0018:ffff88094165fe80  EFLAGS: 00010246
> >  RAX: 00003300ff33b000 RBX: ffff880100000000 RCX: 0000000000000000
> >  RDX: 0000000100000000 RSI: ffff880000000000 RDI: ff32b300ff33b400
> >  RBP: 0000000000001000 R08: 00003ffffffff000 R09: 0000000000000000
> >  R10: 22302e31223d6e6f R11: 0000000000000246 R12: 0000000000001000
> >  R13: 0000000000003000 R14: 0000000000571be0 R15: ffff88094165ff50
> >  FS:  00007ff152d33700(0000) GS:ffff88097f2c0000(0000) knlGS:0000000000000000
> >  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> >  CR2: ffffbb00ff33b000 CR3: 00000009405a3000 CR4: 00000000000006e0
> >  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> >  Process Hibackp (pid: 16196, threadinfo ffff88094165e000, task ffff8808eb9ba600)
> >  Stack:
> >   ffffffff811b8aaa 0000000000004000 ffff880943fea480 ffff8808ef2bae50
> >   ffff880943d32980 fffffffffffffffb ffff8808ef2bae40 ffff88094165ff50
> >   0000000000004000 000000000056ebe0 ffffffff811ad847 000000000056ebe0
> >  Call Trace:
> >   [<ffffffff811b8aaa>] read_kcore+0x17a/0x370
> >   [<ffffffff811ad847>] proc_reg_read+0x77/0xc0
> >   [<ffffffff81151687>] vfs_read+0xc7/0x130
> >   [<ffffffff811517f3>] sys_read+0x53/0xa0
> >   [<ffffffff81449692>] system_call_fastpath+0x16/0x1b
> >
> >Investigation determined that the bug triggered when reading system RAM
> >at the 4G mark. On this system, that was the first address using 1G pages
> 
> Do you mean there is one page which is 1G?
> 
1GB page support in the native kernel started in 2.6.27 with these 2 commits:
39c11e6 and b4718e6. Intel CPUs support 1GB pages from Westmere onward.
BTW, the IBM System x3550 M3 is a Westmere-based system.
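
One way to see whether a given machine advertises 1GB pages is to look for
the pdpe1gb flag in the flags line of /proc/cpuinfo. The helper below is only
a sketch of that check: sketch_has_flag() is a hypothetical name, and it
matches whole space-separated words so that a longer flag such as pse36 is
not mistaken for pse.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if `flag` appears as a whole word in `line`, where `line`
 * is expected to look like the "flags : fpu vme ..." line of /proc/cpuinfo.
 */
static bool sketch_has_flag(const char *line, const char *flag)
{
	size_t len;
	const char *p;

	assert(line != NULL && flag != NULL);
	len = strlen(flag);
	if (len == 0)
		return false;

	for (p = strstr(line, flag); p != NULL; p = strstr(p + 1, flag)) {
		/* The match must start the line or follow a separator. */
		bool starts = (p == line) || p[-1] == ' ' ||
			      p[-1] == '\t' || p[-1] == ':';
		/* The match must be followed by a separator or the end. */
		char end = p[len];
		bool ends = end == '\0' || end == ' ' ||
			    end == '\t' || end == '\n';

		if (starts && ends)
			return true;
	}
	return false;
}
```

A caller would read /proc/cpuinfo line by line, pick the first line that
begins with "flags", and pass it in together with "pdpe1gb".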

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] x86: mm: Check if PUD is large when validating a kernel address v2
  2013-03-01  9:15     ` Chen Gong
@ 2013-03-01  9:21       ` Simon Jeons
  2013-03-01  9:35         ` Chen Gong
  0 siblings, 1 reply; 13+ messages in thread
From: Simon Jeons @ 2013-03-01  9:21 UTC (permalink / raw)
  To: Mel Gorman, Ingo Molnar, Andrew Morton, linux-kernel, linux-mm,
	riel, mhocko, hannes

On 03/01/2013 05:15 PM, Chen Gong wrote:
> On Fri, Mar 01, 2013 at 02:43:53PM +0800, Simon Jeons wrote:
>> On 02/13/2013 07:02 PM, Mel Gorman wrote:
>>> Andrew or Ingo, please pick up.
>>>
>>> Changelog since v1
>>>    o Add reviewed-bys and acked-bys
>>>
>>> A user reported a bug whereby a backup process accessing /proc/kcore
>>> caused an oops.
>>>
>>> Investigation determined that the bug triggered when reading system RAM
>>> at the 4G mark. On this system, that was the first address using 1G pages
>> Do you mean there is one page which is 1G?
>>
> 1GB page support in the native kernel started in 2.6.27 with these 2 commits:

Why call the kernel native? Which kind of kernel is not native?

> 39c11e6 and b4718e6. Intel CPUs support 1GB pages from Westmere onward.
> BTW, the IBM System x3550 M3 is a Westmere-based system.
Is it only used for hugetlbfs pages?



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] x86: mm: Check if PUD is large when validating a kernel address v2
  2013-03-01  9:21       ` Simon Jeons
@ 2013-03-01  9:35         ` Chen Gong
  2013-03-01  9:39           ` Simon Jeons
  0 siblings, 1 reply; 13+ messages in thread
From: Chen Gong @ 2013-03-01  9:35 UTC (permalink / raw)
  To: Simon Jeons
  Cc: Mel Gorman, Ingo Molnar, Andrew Morton, linux-kernel, linux-mm,
	riel, mhocko, hannes

On Fri, Mar 01, 2013 at 05:21:35PM +0800, Simon Jeons wrote:
> On 03/01/2013 05:15 PM, Chen Gong wrote:
> >On Fri, Mar 01, 2013 at 02:43:53PM +0800, Simon Jeons wrote:
> >>On 02/13/2013 07:02 PM, Mel Gorman wrote:
> >>>Andrew or Ingo, please pick up.
> >>>
> >>>Changelog since v1
> >>>   o Add reviewed-bys and acked-bys
> >>>
> >>>A user reported a bug whereby a backup process accessing /proc/kcore
> >>>caused an oops.
> >>>
> >>>Investigation determined that the bug triggered when reading system RAM
> >>>at the 4G mark. On this system, that was the first address using 1G pages
> >>Do you mean there is one page which is 1G?
> >>
> >1GB page support in the native kernel started in 2.6.27 with these 2 commits:
> 
> Why call the kernel native? Which kind of kernel is not native?
Native as opposed to running under a VMM such as Xen.

> 
> >39c11e6 and b4718e6. Intel CPUs support 1GB pages from Westmere onward.
> >BTW, the IBM System x3550 M3 is a Westmere-based system.
> Is it only used for hugetlbfs pages?

Yes, for now.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] x86: mm: Check if PUD is large when validating a kernel address v2
  2013-03-01  9:35         ` Chen Gong
@ 2013-03-01  9:39           ` Simon Jeons
  0 siblings, 0 replies; 13+ messages in thread
From: Simon Jeons @ 2013-03-01  9:39 UTC (permalink / raw)
  To: Mel Gorman, Ingo Molnar, Andrew Morton, linux-kernel, linux-mm,
	riel, mhocko, hannes

On 03/01/2013 05:35 PM, Chen Gong wrote:
> On Fri, Mar 01, 2013 at 05:21:35PM +0800, Simon Jeons wrote:
>> On 03/01/2013 05:15 PM, Chen Gong wrote:
>>> On Fri, Mar 01, 2013 at 02:43:53PM +0800, Simon Jeons wrote:
>>>> On 02/13/2013 07:02 PM, Mel Gorman wrote:
>>>>> Andrew or Ingo, please pick up.
>>>>>
>>>>> Changelog since v1
>>>>>    o Add reviewed-bys and acked-bys
>>>>>
>>>>> A user reported a bug whereby a backup process accessing /proc/kcore
>>>>> caused an oops.
>>>>>
>>>>> Investigation determined that the bug triggered when reading system RAM
>>>>> at the 4G mark. On this system, that was the first address using 1G pages
>>>> Do you mean there is one page which is 1G?
>>>>
>>> 1GB page support in the native kernel started in 2.6.27 with these 2 commits:
>> Why call the kernel native? Which kind of kernel is not native?
> Native as opposed to running under a VMM such as Xen.

Oh, I see. Thanks. :)

>
>>> 39c11e6 and b4718e6. Intel CPUs support 1GB pages from Westmere onward.
>>> BTW, the IBM System x3550 M3 is a Westmere-based system.
>> Is it only used for hugetlbfs pages?
> Yes, for now.


^ permalink raw reply	[flat|nested] 13+ messages in thread

Thread overview: 13+ messages
2013-02-11 14:52 [PATCH] x86: mm: Check if PUD is large when validating a kernel address Mel Gorman
2013-02-11 19:41 ` Rik van Riel
2013-02-12  6:40 ` Johannes Weiner
2013-02-12 17:43 ` Michal Hocko
2013-02-13 11:02 ` [PATCH] x86: mm: Check if PUD is large when validating a kernel address v2 Mel Gorman
2013-02-13 11:10   ` Ingo Molnar
2013-02-13 11:14     ` Mel Gorman
2013-03-01  6:43   ` Simon Jeons
2013-03-01  9:15     ` Chen Gong
2013-03-01  9:21       ` Simon Jeons
2013-03-01  9:35         ` Chen Gong
2013-03-01  9:39           ` Simon Jeons
2013-02-13 12:12 ` [tip:x86/urgent] x86/mm: Check if PUD is large when validating a kernel address tip-bot for Mel Gorman
