* [RESEND PATCH 0/4] x86: /dev/mem fixes
@ 2014-09-29 11:32 ` Frantisek Hrbata
0 siblings, 0 replies; 30+ messages in thread
From: Frantisek Hrbata @ 2014-09-29 11:32 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, tglx, mingo, hpa, x86, oleg, kamaleshb, hechjie, akpm,
dave.hansen, dvlasenk, prarit, lwoodman, hannsj_uhl, torvalds
Hi all,
this is a combination of two patch sets I sent a while ago
1) Prevent possible PTE corruption with /dev/mem mmap
2) x86: allow read/write /dev/mem to access non-system RAM above high_memory
The original thread with both patch sets can be found here
https://lkml.org/lkml/2014/8/14/229
lkml: <1408025927-16826-1-git-send-email-fhrbata@redhat.com>
1) Prevent possible PTE corruption with /dev/mem mmap
x86: add arch_pfn_possible helper
x86: add phys addr validity check for /dev/mem mmap
Many thanks goes to Dave Hansen, who helped with the "final" check. Other than
that it did not get much attention, except H. Peter Anvin's complain that having
two checks for mmap and read/write for /dev/mem access is ridiculous. I for sure
do not object to this, but AFAICT it's not that simple to unify them and it's not
"directly" related to the PTE corruption. Please note that there are other
archs(ia64, arm) using these check. But I for sure can be missing something.
What this patch set does is using the existing interface to implement x86 specific
check in the least invasive way.
Anyway I tried to remove the high_memory check with a follow-up patch set 2)
2) x86: allow read/write /dev/mem to access non-system RAM above high_memory
x86: add high_memory check to (xlate|unxlate)_dev_mem_ptr
x86: remove high_memory check from valid_phys_addr_range
This is an attempt to remove the high_memory limit for the read/write access to
/dev/mem. IMHO there is no reason for this limit on x86. It is presented in
the generic valid_phys_addr_range, which is used only by (read|write)_mem. IIUIC
it's main purpose is for the generic xlate_dev_mem_ptr, which is using only the
direct kernel mapping __va translation. Since the valid_phys_addr_range is
called as the first check in (read|write)_mem, it basically does not allow to
access anything above high_memory on x86.
The first patch adds high_memory check to x86's (xlate|unxlate)_dev_mem_ptr, so
the direct kernel mapping can be safely used for system RAM bellow high_memory.
This is IMHO the only valid reason to use high_memory check in (read|write)_mem.
The second patch removes the high_memory check from valid_phys_addr_range,
allowing read/write to access non-system RAM above high_memory. So far this
was possible only by using mmap.
I hope I haven't overlooked something.
Many thanks
Frantisek Hrbata (4):
x86: add arch_pfn_possible helper
x86: add phys addr validity check for /dev/mem mmap
x86: add high_memory check to (xlate|unxlate)_dev_mem_ptr
x86: remove high_memory check from valid_phys_addr_range
arch/x86/include/asm/io.h | 4 ++++
arch/x86/mm/ioremap.c | 9 ++++++---
arch/x86/mm/mmap.c | 12 ++++++++++++
arch/x86/mm/physaddr.h | 9 +++++++--
4 files changed, 29 insertions(+), 5 deletions(-)
--
1.9.3
^ permalink raw reply [flat|nested] 30+ messages in thread
* [RESEND PATCH 0/4] x86: /dev/mem fixes
@ 2014-09-29 11:32 ` Frantisek Hrbata
0 siblings, 0 replies; 30+ messages in thread
From: Frantisek Hrbata @ 2014-09-29 11:32 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, tglx, mingo, hpa, x86, oleg, kamaleshb, hechjie, akpm,
dave.hansen, dvlasenk, prarit, lwoodman, hannsj_uhl, torvalds
Hi all,
this is a combination of two patch sets I sent a while ago
1) Prevent possible PTE corruption with /dev/mem mmap
2) x86: allow read/write /dev/mem to access non-system RAM above high_memory
The original thread with both patch sets can be found here
https://lkml.org/lkml/2014/8/14/229
lkml: <1408025927-16826-1-git-send-email-fhrbata@redhat.com>
1) Prevent possible PTE corruption with /dev/mem mmap
x86: add arch_pfn_possible helper
x86: add phys addr validity check for /dev/mem mmap
Many thanks goes to Dave Hansen, who helped with the "final" check. Other than
that it did not get much attention, except H. Peter Anvin's complain that having
two checks for mmap and read/write for /dev/mem access is ridiculous. I for sure
do not object to this, but AFAICT it's not that simple to unify them and it's not
"directly" related to the PTE corruption. Please note that there are other
archs(ia64, arm) using these check. But I for sure can be missing something.
What this patch set does is using the existing interface to implement x86 specific
check in the least invasive way.
Anyway I tried to remove the high_memory check with a follow-up patch set 2)
2) x86: allow read/write /dev/mem to access non-system RAM above high_memory
x86: add high_memory check to (xlate|unxlate)_dev_mem_ptr
x86: remove high_memory check from valid_phys_addr_range
This is an attempt to remove the high_memory limit for the read/write access to
/dev/mem. IMHO there is no reason for this limit on x86. It is presented in
the generic valid_phys_addr_range, which is used only by (read|write)_mem. IIUIC
it's main purpose is for the generic xlate_dev_mem_ptr, which is using only the
direct kernel mapping __va translation. Since the valid_phys_addr_range is
called as the first check in (read|write)_mem, it basically does not allow to
access anything above high_memory on x86.
The first patch adds high_memory check to x86's (xlate|unxlate)_dev_mem_ptr, so
the direct kernel mapping can be safely used for system RAM bellow high_memory.
This is IMHO the only valid reason to use high_memory check in (read|write)_mem.
The second patch removes the high_memory check from valid_phys_addr_range,
allowing read/write to access non-system RAM above high_memory. So far this
was possible only by using mmap.
I hope I haven't overlooked something.
Many thanks
Frantisek Hrbata (4):
x86: add arch_pfn_possible helper
x86: add phys addr validity check for /dev/mem mmap
x86: add high_memory check to (xlate|unxlate)_dev_mem_ptr
x86: remove high_memory check from valid_phys_addr_range
arch/x86/include/asm/io.h | 4 ++++
arch/x86/mm/ioremap.c | 9 ++++++---
arch/x86/mm/mmap.c | 12 ++++++++++++
arch/x86/mm/physaddr.h | 9 +++++++--
4 files changed, 29 insertions(+), 5 deletions(-)
--
1.9.3
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 30+ messages in thread
* [RESEND PATCH 1/4] x86: add arch_pfn_possible helper
2014-09-29 11:32 ` Frantisek Hrbata
@ 2014-09-29 11:32 ` Frantisek Hrbata
-1 siblings, 0 replies; 30+ messages in thread
From: Frantisek Hrbata @ 2014-09-29 11:32 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, tglx, mingo, hpa, x86, oleg, kamaleshb, hechjie, akpm,
dave.hansen, dvlasenk, prarit, lwoodman, hannsj_uhl, torvalds
Add helper to check maximum possible pfn on x86. Also make the current
phys_addr_valid helper using it internally.
Signed-off-by: Frantisek Hrbata <fhrbata@redhat.com>
---
arch/x86/mm/physaddr.h | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/arch/x86/mm/physaddr.h b/arch/x86/mm/physaddr.h
index a3cd5a0..9df8e3a 100644
--- a/arch/x86/mm/physaddr.h
+++ b/arch/x86/mm/physaddr.h
@@ -1,10 +1,15 @@
#include <asm/processor.h>
-static inline int phys_addr_valid(resource_size_t addr)
+static inline int arch_pfn_possible(unsigned long pfn)
{
#ifdef CONFIG_PHYS_ADDR_T_64BIT
- return !(addr >> boot_cpu_data.x86_phys_bits);
+ return !(pfn >> (boot_cpu_data.x86_phys_bits - PAGE_SHIFT));
#else
return 1;
#endif
}
+
+static inline int phys_addr_valid(resource_size_t addr)
+{
+ return arch_pfn_possible(addr >> PAGE_SHIFT);
+}
--
1.9.3
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [RESEND PATCH 1/4] x86: add arch_pfn_possible helper
@ 2014-09-29 11:32 ` Frantisek Hrbata
0 siblings, 0 replies; 30+ messages in thread
From: Frantisek Hrbata @ 2014-09-29 11:32 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, tglx, mingo, hpa, x86, oleg, kamaleshb, hechjie, akpm,
dave.hansen, dvlasenk, prarit, lwoodman, hannsj_uhl, torvalds
Add helper to check maximum possible pfn on x86. Also make the current
phys_addr_valid helper using it internally.
Signed-off-by: Frantisek Hrbata <fhrbata@redhat.com>
---
arch/x86/mm/physaddr.h | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/arch/x86/mm/physaddr.h b/arch/x86/mm/physaddr.h
index a3cd5a0..9df8e3a 100644
--- a/arch/x86/mm/physaddr.h
+++ b/arch/x86/mm/physaddr.h
@@ -1,10 +1,15 @@
#include <asm/processor.h>
-static inline int phys_addr_valid(resource_size_t addr)
+static inline int arch_pfn_possible(unsigned long pfn)
{
#ifdef CONFIG_PHYS_ADDR_T_64BIT
- return !(addr >> boot_cpu_data.x86_phys_bits);
+ return !(pfn >> (boot_cpu_data.x86_phys_bits - PAGE_SHIFT));
#else
return 1;
#endif
}
+
+static inline int phys_addr_valid(resource_size_t addr)
+{
+ return arch_pfn_possible(addr >> PAGE_SHIFT);
+}
--
1.9.3
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [RESEND PATCH 2/4] x86: add phys addr validity check for /dev/mem mmap
2014-09-29 11:32 ` Frantisek Hrbata
@ 2014-09-29 11:33 ` Frantisek Hrbata
-1 siblings, 0 replies; 30+ messages in thread
From: Frantisek Hrbata @ 2014-09-29 11:33 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, tglx, mingo, hpa, x86, oleg, kamaleshb, hechjie, akpm,
dave.hansen, dvlasenk, prarit, lwoodman, hannsj_uhl, torvalds
Prevent possible PTE corruption while calling mmap on /dev/mem with large
offset.
oops info, please note the PTE value 8008000000000225.
---------------------------------8<--------------------------------------
[85739.124496] rep: Corrupted page table at address 7f63852f8000
[85739.130242] PGD ba2eb067 PUD b99c1067 PMD a2fa5067 PTE 8008000000000225
[85739.136941] Bad pagetable: 000d [#1] SMP
[85739.141002] Modules linked in: cfg80211 rfkill x86_pkg_temp_thermal coretemp
kvm_intel kvm bnx2 crct10dif_pclmul crc32_pclmul crc32c_intel
ghash_clmulni_intel microcode iTCO_wdt ipmi_si i2c_i801 iTCO_vendor_support
ipmi_msghandler dcdbas shpchp lpc_ich mfd_core nfsd auth_rpcgss nfs_acl lockd
sunrpc mgag200 i2c_algo_bit drm_kms_helper ttm drm i2c_core
[85739.172620] CPU: 3 PID: 21900 Comm: rep Not tainted 3.15.8-200.fc20.x86_64 #1
[85739.179768] Hardware name: Dell Inc. PowerEdge R210 II/09T7VV, BIOS 2.0.4 02/29/2012
[85739.187512] task: ffff8800b9b3b160 ti: ffff8800ba270000 task.ti: ffff8800ba270000
[85739.194988] RIP: 0033:[<0000000000400773>] [<0000000000400773>] 0x400773
[85739.201799] RSP: 002b:00007fffe4ca3c80 EFLAGS: 00010213
[85739.207119] RAX: 00007f63852f8000 RBX: 0000000000000000 RCX: 00007f6384e0b8ca
[85739.214249] RDX: 0000000000000001 RSI: 0000000000001000 RDI: 0000000000000000
[85739.221407] RBP: 00007fffe4ca3cc0 R08: 0000000000000003 R09: 0008000000000000
[85739.228545] R10: 0000000000000001 R11: 0000000000000206 R12: 00000000004005b0
[85739.235676] R13: 00007fffe4ca3da0 R14: 0000000000000000 R15: 0000000000000000
[85739.242835] FS: 00007f63852ea740(0000) GS:ffff88013fcc0000(0000) knlGS:0000000000000000
[85739.250925] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[85739.256669] CR2: 00007f63852f8000 CR3: 00000000b9ba0000 CR4: 00000000001407e0
---------------------------------8<--------------------------------------
According to [1] Chapter 4 Paging, some higher bits in 64bit
PTE(X86_64 || X86_32_PAE) are reserved and have to be set to zero. For example,
for IA-32e and 4KB page [1] 4.5 IA-32e Paging: Table 4-19, bits 51-M(MAXPHYADDR)
are reserved. So for a CPU with e.g. 48bit phys addr width, bits 51-48 have to
be zero. If one of the reserved bits is set, [1] 4.7 Page-Fault Exceptions,
the #PF is generated with RSVD error code.
<quote>
RSVD flag (bit 3).
This flag is 1 if there is no valid translation for the linear address because a
reserved bit was set in one of the paging-structure entries used to translate
that address. (Because reserved bits are not checked in a paging-structure entry
whose P flag is 0, bit 3 of the error code can be set only if bit 0 is also
set.)
</quote>
In mmap_mem() the first check is valid_mmap_phys_addr_range(), but it always
returns 1 for x86. So it's possible to use any pgoff we want and
to set the PTE's reserved bits in remap_pfn_range(). Meaning there is a
possibility to use mmap on /dev/mem and cause system panic. It's probably
not that serious, because access to /dev/mem is limited and the system has
to have the panic_on_oops set, but still I think we should check this and
return error.
The path for this problem is:
mmap_mem() => remap_pfn_range() => page present => touch page => tlb miss =>
walk through paging structures => reserved bit set => #pf with rsvd flag
This patch adds check for x86. With this fix mmap returns -EINVAL if the
requested phys addr is larger then the supported phys addr width.
[1] Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A
x86_64 reproducer
---------------------------------8<--------------------------------------
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <err.h>
#include <stdlib.h>
#include <sys/mman.h>
#define die(fmt, ...) err(1, fmt, ##__VA_ARGS__)
#define OFFSET 0x8000000000000LL
int main(int argc, char *argv[])
{
int fd;
long ps;
long pgoff;
char *map;
char c;
ps = sysconf(_SC_PAGE_SIZE);
if (ps == -1)
die("cannot get page size");
fd = open("/dev/mem", O_RDONLY);
if (fd == -1)
die("cannot open /dev/mem");
pgoff = (OFFSET + (ps - 1)) & ~(ps - 1);
map = mmap(NULL, ps, PROT_READ, MAP_SHARED, fd, pgoff);
if (map == MAP_FAILED)
die("cannot mmap");
c = map[0];
if (munmap(map, ps) == -1)
die("cannot munmap");
if (close(fd) == -1)
die("cannot close");
return 0;
}
---------------------------------8<--------------------------------------
x86_32_PAE reproducer
---------------------------------8<--------------------------------------
#define _GNU_SOURCE
#define _LARGEFILE64_SOURCE
#include <unistd.h>
#include <sys/syscall.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <err.h>
#include <stdlib.h>
#include <sys/mman.h>
#define die(fmt, ...) err(1, fmt, ##__VA_ARGS__)
/* 37th bit in PTE */
#define OFFSET 0x2000000
int main(int argc, char *argv[])
{
int fd;
long ps;
char *map;
char c;
ps = sysconf(_SC_PAGE_SIZE);
if (ps == -1)
die("cannot get page size");
fd = open("/dev/mem", O_RDONLY|O_LARGEFILE);
if (fd == -1)
die("cannot open /dev/mem");
map = (char *)syscall(SYS_mmap2, NULL, ps, PROT_READ, MAP_SHARED, fd, OFFSET);
if (map == MAP_FAILED)
die("cannot mmap");
c = map[0];
if (munmap(map, ps) == -1)
die("cannot munmap");
if (close(fd) == -1)
die("cannot close");
return 0;
}
---------------------------------8<--------------------------------------
V2: fix pfn check in valid_mmap_phys_addr_range, thanks to Dave Hansen
Signed-off-by: Frantisek Hrbata <fhrbata@redhat.com>
---
arch/x86/include/asm/io.h | 4 ++++
arch/x86/mm/mmap.c | 12 ++++++++++++
2 files changed, 16 insertions(+)
diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index b8237d8..55c59d5 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -243,6 +243,10 @@ static inline void flush_write_buffers(void)
#endif
}
+#define ARCH_HAS_VALID_PHYS_ADDR_RANGE
+extern int valid_phys_addr_range(phys_addr_t addr, size_t count);
+extern int valid_mmap_phys_addr_range(unsigned long pfn, size_t count);
+
#endif /* __KERNEL__ */
extern void native_io_delay(void);
diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c
index 919b912..c8acb10 100644
--- a/arch/x86/mm/mmap.c
+++ b/arch/x86/mm/mmap.c
@@ -31,6 +31,8 @@
#include <linux/sched.h>
#include <asm/elf.h>
+#include "physaddr.h"
+
struct va_alignment __read_mostly va_align = {
.flags = -1,
};
@@ -122,3 +124,13 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
mm->get_unmapped_area = arch_get_unmapped_area_topdown;
}
}
+
+int valid_phys_addr_range(phys_addr_t addr, size_t count)
+{
+ return addr + count <= __pa(high_memory);
+}
+
+int valid_mmap_phys_addr_range(unsigned long pfn, size_t count)
+{
+ return arch_pfn_possible(pfn + (count >> PAGE_SHIFT));
+}
--
1.9.3
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [RESEND PATCH 2/4] x86: add phys addr validity check for /dev/mem mmap
@ 2014-09-29 11:33 ` Frantisek Hrbata
0 siblings, 0 replies; 30+ messages in thread
From: Frantisek Hrbata @ 2014-09-29 11:33 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, tglx, mingo, hpa, x86, oleg, kamaleshb, hechjie, akpm,
dave.hansen, dvlasenk, prarit, lwoodman, hannsj_uhl, torvalds
Prevent possible PTE corruption while calling mmap on /dev/mem with large
offset.
oops info, please note the PTE value 8008000000000225.
---------------------------------8<--------------------------------------
[85739.124496] rep: Corrupted page table at address 7f63852f8000
[85739.130242] PGD ba2eb067 PUD b99c1067 PMD a2fa5067 PTE 8008000000000225
[85739.136941] Bad pagetable: 000d [#1] SMP
[85739.141002] Modules linked in: cfg80211 rfkill x86_pkg_temp_thermal coretemp
kvm_intel kvm bnx2 crct10dif_pclmul crc32_pclmul crc32c_intel
ghash_clmulni_intel microcode iTCO_wdt ipmi_si i2c_i801 iTCO_vendor_support
ipmi_msghandler dcdbas shpchp lpc_ich mfd_core nfsd auth_rpcgss nfs_acl lockd
sunrpc mgag200 i2c_algo_bit drm_kms_helper ttm drm i2c_core
[85739.172620] CPU: 3 PID: 21900 Comm: rep Not tainted 3.15.8-200.fc20.x86_64 #1
[85739.179768] Hardware name: Dell Inc. PowerEdge R210 II/09T7VV, BIOS 2.0.4 02/29/2012
[85739.187512] task: ffff8800b9b3b160 ti: ffff8800ba270000 task.ti: ffff8800ba270000
[85739.194988] RIP: 0033:[<0000000000400773>] [<0000000000400773>] 0x400773
[85739.201799] RSP: 002b:00007fffe4ca3c80 EFLAGS: 00010213
[85739.207119] RAX: 00007f63852f8000 RBX: 0000000000000000 RCX: 00007f6384e0b8ca
[85739.214249] RDX: 0000000000000001 RSI: 0000000000001000 RDI: 0000000000000000
[85739.221407] RBP: 00007fffe4ca3cc0 R08: 0000000000000003 R09: 0008000000000000
[85739.228545] R10: 0000000000000001 R11: 0000000000000206 R12: 00000000004005b0
[85739.235676] R13: 00007fffe4ca3da0 R14: 0000000000000000 R15: 0000000000000000
[85739.242835] FS: 00007f63852ea740(0000) GS:ffff88013fcc0000(0000) knlGS:0000000000000000
[85739.250925] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[85739.256669] CR2: 00007f63852f8000 CR3: 00000000b9ba0000 CR4: 00000000001407e0
---------------------------------8<--------------------------------------
According to [1] Chapter 4 Paging, some higher bits in 64bit
PTE(X86_64 || X86_32_PAE) are reserved and have to be set to zero. For example,
for IA-32e and 4KB page [1] 4.5 IA-32e Paging: Table 4-19, bits 51-M(MAXPHYADDR)
are reserved. So for a CPU with e.g. 48bit phys addr width, bits 51-48 have to
be zero. If one of the reserved bits is set, [1] 4.7 Page-Fault Exceptions,
the #PF is generated with RSVD error code.
<quote>
RSVD flag (bit 3).
This flag is 1 if there is no valid translation for the linear address because a
reserved bit was set in one of the paging-structure entries used to translate
that address. (Because reserved bits are not checked in a paging-structure entry
whose P flag is 0, bit 3 of the error code can be set only if bit 0 is also
set.)
</quote>
In mmap_mem() the first check is valid_mmap_phys_addr_range(), but it always
returns 1 for x86. So it's possible to use any pgoff we want and
to set the PTE's reserved bits in remap_pfn_range(). Meaning there is a
possibility to use mmap on /dev/mem and cause system panic. It's probably
not that serious, because access to /dev/mem is limited and the system has
to have the panic_on_oops set, but still I think we should check this and
return error.
The path for this problem is:
mmap_mem() => remap_pfn_range() => page present => touch page => tlb miss =>
walk through paging structures => reserved bit set => #pf with rsvd flag
This patch adds check for x86. With this fix mmap returns -EINVAL if the
requested phys addr is larger then the supported phys addr width.
[1] Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A
x86_64 reproducer
---------------------------------8<--------------------------------------
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <err.h>
#include <stdlib.h>
#include <sys/mman.h>
#define die(fmt, ...) err(1, fmt, ##__VA_ARGS__)
#define OFFSET 0x8000000000000LL
int main(int argc, char *argv[])
{
int fd;
long ps;
long pgoff;
char *map;
char c;
ps = sysconf(_SC_PAGE_SIZE);
if (ps == -1)
die("cannot get page size");
fd = open("/dev/mem", O_RDONLY);
if (fd == -1)
die("cannot open /dev/mem");
pgoff = (OFFSET + (ps - 1)) & ~(ps - 1);
map = mmap(NULL, ps, PROT_READ, MAP_SHARED, fd, pgoff);
if (map == MAP_FAILED)
die("cannot mmap");
c = map[0];
if (munmap(map, ps) == -1)
die("cannot munmap");
if (close(fd) == -1)
die("cannot close");
return 0;
}
---------------------------------8<--------------------------------------
x86_32_PAE reproducer
---------------------------------8<--------------------------------------
#define _GNU_SOURCE
#define _LARGEFILE64_SOURCE
#include <unistd.h>
#include <sys/syscall.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <err.h>
#include <stdlib.h>
#include <sys/mman.h>
#define die(fmt, ...) err(1, fmt, ##__VA_ARGS__)
/* 37th bit in PTE */
#define OFFSET 0x2000000
int main(int argc, char *argv[])
{
int fd;
long ps;
char *map;
char c;
ps = sysconf(_SC_PAGE_SIZE);
if (ps == -1)
die("cannot get page size");
fd = open("/dev/mem", O_RDONLY|O_LARGEFILE);
if (fd == -1)
die("cannot open /dev/mem");
map = (char *)syscall(SYS_mmap2, NULL, ps, PROT_READ, MAP_SHARED, fd, OFFSET);
if (map == MAP_FAILED)
die("cannot mmap");
c = map[0];
if (munmap(map, ps) == -1)
die("cannot munmap");
if (close(fd) == -1)
die("cannot close");
return 0;
}
---------------------------------8<--------------------------------------
V2: fix pfn check in valid_mmap_phys_addr_range, thanks to Dave Hansen
Signed-off-by: Frantisek Hrbata <fhrbata@redhat.com>
---
arch/x86/include/asm/io.h | 4 ++++
arch/x86/mm/mmap.c | 12 ++++++++++++
2 files changed, 16 insertions(+)
diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index b8237d8..55c59d5 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -243,6 +243,10 @@ static inline void flush_write_buffers(void)
#endif
}
+#define ARCH_HAS_VALID_PHYS_ADDR_RANGE
+extern int valid_phys_addr_range(phys_addr_t addr, size_t count);
+extern int valid_mmap_phys_addr_range(unsigned long pfn, size_t count);
+
#endif /* __KERNEL__ */
extern void native_io_delay(void);
diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c
index 919b912..c8acb10 100644
--- a/arch/x86/mm/mmap.c
+++ b/arch/x86/mm/mmap.c
@@ -31,6 +31,8 @@
#include <linux/sched.h>
#include <asm/elf.h>
+#include "physaddr.h"
+
struct va_alignment __read_mostly va_align = {
.flags = -1,
};
@@ -122,3 +124,13 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
mm->get_unmapped_area = arch_get_unmapped_area_topdown;
}
}
+
+int valid_phys_addr_range(phys_addr_t addr, size_t count)
+{
+ return addr + count <= __pa(high_memory);
+}
+
+int valid_mmap_phys_addr_range(unsigned long pfn, size_t count)
+{
+ return arch_pfn_possible(pfn + (count >> PAGE_SHIFT));
+}
--
1.9.3
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [RESEND PATCH 3/4] x86: add high_memory check to (xlate|unxlate)_dev_mem_ptr
2014-09-29 11:32 ` Frantisek Hrbata
@ 2014-09-29 11:33 ` Frantisek Hrbata
-1 siblings, 0 replies; 30+ messages in thread
From: Frantisek Hrbata @ 2014-09-29 11:33 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, tglx, mingo, hpa, x86, oleg, kamaleshb, hechjie, akpm,
dave.hansen, dvlasenk, prarit, lwoodman, hannsj_uhl, torvalds
So far (xlate|unxlate)_dev_mem_ptr for read/write /dev/mem relies on a generic
high_memory check in valid_phys_addr_range(), which does not allow to access any
memory above high_memory whatsoever. By adding the high_memory check to
(xlate|unxlate)_dev_mem_ptr, it still will be possible to use __va safely for
kernel mapped memory and it will also allow read/write to access non-system RAM
above high_memory once the high_memory check is removed from
valid_phys_addr_range.
Signed-off-by: Frantisek Hrbata <fhrbata@redhat.com>
---
arch/x86/mm/ioremap.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index baff1da..1ae7323 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -320,8 +320,11 @@ void *xlate_dev_mem_ptr(unsigned long phys)
void *addr;
unsigned long start = phys & PAGE_MASK;
- /* If page is RAM, we can use __va. Otherwise ioremap and unmap. */
- if (page_is_ram(start >> PAGE_SHIFT))
+ /*
+ * If page is RAM and is mapped by kernel, we can use __va.
+ * Otherwise ioremap and unmap.
+ */
+ if (page_is_ram(start >> PAGE_SHIFT) && phys <= __pa(high_memory))
return __va(phys);
addr = (void __force *)ioremap_cache(start, PAGE_SIZE);
@@ -333,7 +336,7 @@ void *xlate_dev_mem_ptr(unsigned long phys)
void unxlate_dev_mem_ptr(unsigned long phys, void *addr)
{
- if (page_is_ram(phys >> PAGE_SHIFT))
+ if (page_is_ram(phys >> PAGE_SHIFT) && phys <= __pa(high_memory))
return;
iounmap((void __iomem *)((unsigned long)addr & PAGE_MASK));
--
1.9.3
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [RESEND PATCH 3/4] x86: add high_memory check to (xlate|unxlate)_dev_mem_ptr
@ 2014-09-29 11:33 ` Frantisek Hrbata
0 siblings, 0 replies; 30+ messages in thread
From: Frantisek Hrbata @ 2014-09-29 11:33 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, tglx, mingo, hpa, x86, oleg, kamaleshb, hechjie, akpm,
dave.hansen, dvlasenk, prarit, lwoodman, hannsj_uhl, torvalds
So far (xlate|unxlate)_dev_mem_ptr for read/write /dev/mem relies on a generic
high_memory check in valid_phys_addr_range(), which does not allow to access any
memory above high_memory whatsoever. By adding the high_memory check to
(xlate|unxlate)_dev_mem_ptr, it still will be possible to use __va safely for
kernel mapped memory and it will also allow read/write to access non-system RAM
above high_memory once the high_memory check is removed from
valid_phys_addr_range.
Signed-off-by: Frantisek Hrbata <fhrbata@redhat.com>
---
arch/x86/mm/ioremap.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index baff1da..1ae7323 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -320,8 +320,11 @@ void *xlate_dev_mem_ptr(unsigned long phys)
void *addr;
unsigned long start = phys & PAGE_MASK;
- /* If page is RAM, we can use __va. Otherwise ioremap and unmap. */
- if (page_is_ram(start >> PAGE_SHIFT))
+ /*
+ * If page is RAM and is mapped by kernel, we can use __va.
+ * Otherwise ioremap and unmap.
+ */
+ if (page_is_ram(start >> PAGE_SHIFT) && phys <= __pa(high_memory))
return __va(phys);
addr = (void __force *)ioremap_cache(start, PAGE_SIZE);
@@ -333,7 +336,7 @@ void *xlate_dev_mem_ptr(unsigned long phys)
void unxlate_dev_mem_ptr(unsigned long phys, void *addr)
{
- if (page_is_ram(phys >> PAGE_SHIFT))
+ if (page_is_ram(phys >> PAGE_SHIFT) && phys <= __pa(high_memory))
return;
iounmap((void __iomem *)((unsigned long)addr & PAGE_MASK));
--
1.9.3
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [RESEND PATCH 4/4] x86: remove high_memory check from valid_phys_addr_range
2014-09-29 11:32 ` Frantisek Hrbata
@ 2014-09-29 11:33 ` Frantisek Hrbata
-1 siblings, 0 replies; 30+ messages in thread
From: Frantisek Hrbata @ 2014-09-29 11:33 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, tglx, mingo, hpa, x86, oleg, kamaleshb, hechjie, akpm,
dave.hansen, dvlasenk, prarit, lwoodman, hannsj_uhl, torvalds
There is no need to block read/write access to /dev/mem for phys. addr. above
high_memory for non-system RAM. The only limitation should be
boot_cpu_data.x86_phys_bits(max phys. addr. size).
Signed-off-by: Frantisek Hrbata <fhrbata@redhat.com>
---
arch/x86/mm/mmap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c
index c8acb10..7fa0242 100644
--- a/arch/x86/mm/mmap.c
+++ b/arch/x86/mm/mmap.c
@@ -127,7 +127,7 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
int valid_phys_addr_range(phys_addr_t addr, size_t count)
{
- return addr + count <= __pa(high_memory);
+ return arch_pfn_possible((addr + count) >> PAGE_SHIFT);
}
int valid_mmap_phys_addr_range(unsigned long pfn, size_t count)
--
1.9.3
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [RESEND PATCH 4/4] x86: remove high_memory check from valid_phys_addr_range
@ 2014-09-29 11:33 ` Frantisek Hrbata
0 siblings, 0 replies; 30+ messages in thread
From: Frantisek Hrbata @ 2014-09-29 11:33 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, tglx, mingo, hpa, x86, oleg, kamaleshb, hechjie, akpm,
dave.hansen, dvlasenk, prarit, lwoodman, hannsj_uhl, torvalds
There is no need to block read/write access to /dev/mem for phys. addr. above
high_memory for non-system RAM. The only limitation should be
boot_cpu_data.x86_phys_bits(max phys. addr. size).
Signed-off-by: Frantisek Hrbata <fhrbata@redhat.com>
---
arch/x86/mm/mmap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c
index c8acb10..7fa0242 100644
--- a/arch/x86/mm/mmap.c
+++ b/arch/x86/mm/mmap.c
@@ -127,7 +127,7 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
int valid_phys_addr_range(phys_addr_t addr, size_t count)
{
- return addr + count <= __pa(high_memory);
+ return arch_pfn_possible((addr + count) >> PAGE_SHIFT);
}
int valid_mmap_phys_addr_range(unsigned long pfn, size_t count)
--
1.9.3
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [RESEND PATCH 2/4] x86: add phys addr validity check for /dev/mem mmap
2014-09-29 11:33 ` Frantisek Hrbata
@ 2014-09-29 20:15 ` Thomas Gleixner
-1 siblings, 0 replies; 30+ messages in thread
From: Thomas Gleixner @ 2014-09-29 20:15 UTC (permalink / raw)
To: Frantisek Hrbata
Cc: linux-kernel, linux-mm, mingo, hpa, x86, oleg, kamaleshb,
hechjie, akpm, dave.hansen, dvlasenk, prarit, lwoodman,
hannsj_uhl, torvalds
On Mon, 29 Sep 2014, Frantisek Hrbata wrote:
> V2: fix pfn check in valid_mmap_phys_addr_range, thanks to Dave Hansen
AFAICT, Dave also asked you to change size_t count into something more
intuitive, i.e. nr_bytes or such.
> +int valid_mmap_phys_addr_range(unsigned long pfn, size_t count)
And right he is. I really had to look twice to see that count is
actually number of bytes and not number of pages, which is what you
expect after pfn.
> +{
> + return arch_pfn_possible(pfn + (count >> PAGE_SHIFT));
> +}
Thanks,
tglx
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RESEND PATCH 2/4] x86: add phys addr validity check for /dev/mem mmap
@ 2014-09-29 20:15 ` Thomas Gleixner
0 siblings, 0 replies; 30+ messages in thread
From: Thomas Gleixner @ 2014-09-29 20:15 UTC (permalink / raw)
To: Frantisek Hrbata
Cc: linux-kernel, linux-mm, mingo, hpa, x86, oleg, kamaleshb,
hechjie, akpm, dave.hansen, dvlasenk, prarit, lwoodman,
hannsj_uhl, torvalds
On Mon, 29 Sep 2014, Frantisek Hrbata wrote:
> V2: fix pfn check in valid_mmap_phys_addr_range, thanks to Dave Hansen
AFAICT, Dave also asked you to change size_t count into something more
intuitive, i.e. nr_bytes or such.
> +int valid_mmap_phys_addr_range(unsigned long pfn, size_t count)
And right he is. I really had to look twice to see that count is
actually number of bytes and not number of pages, which is what you
expect after pfn.
> +{
> + return arch_pfn_possible(pfn + (count >> PAGE_SHIFT));
> +}
Thanks,
tglx
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RESEND PATCH 2/4] x86: add phys addr validity check for /dev/mem mmap
2014-09-29 20:15 ` Thomas Gleixner
@ 2014-09-30 12:41 ` Frantisek Hrbata
-1 siblings, 0 replies; 30+ messages in thread
From: Frantisek Hrbata @ 2014-09-30 12:41 UTC (permalink / raw)
To: Thomas Gleixner
Cc: linux-kernel, linux-mm, mingo, hpa, x86, oleg, kamaleshb,
hechjie, akpm, dave.hansen, dvlasenk, prarit, lwoodman,
hannsj_uhl, torvalds
On Mon, Sep 29, 2014 at 10:15:28PM +0200, Thomas Gleixner wrote:
> On Mon, 29 Sep 2014, Frantisek Hrbata wrote:
> > V2: fix pfn check in valid_mmap_phys_addr_range, thanks to Dave Hansen
>
> AFAICT, Dave also asked you to change size_t count into something more
> intuitive, i.e. nr_bytes or such.
Hi,
mea culpa, I for unknown reason changed it from "size" to "count". I guess some
cut&paste mess. The correct prototype used elsewhere in kernel is
int valid_mmap_phys_addr_range(unsigned long pfn, size_t size)
Does it make sense to replace "count" with "size" so it's consistent with the
rest or do you prefer "nr_bytes" or as Dave proposed "len_bytes"?
I will fix this and I'm sorry Dave I did not change it as discussed. It totally
slipped my mind.
Many thanks Thomas.
>
> > +int valid_mmap_phys_addr_range(unsigned long pfn, size_t count)
>
> And right he is. I really had to look twice to see that count is
> actually number of bytes and not number of pages, which is what you
> expect after pfn.
>
> > +{
> > + return arch_pfn_possible(pfn + (count >> PAGE_SHIFT));
> > +}
>
> Thanks,
>
> tglx
--
Frantisek Hrbata
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RESEND PATCH 2/4] x86: add phys addr validity check for /dev/mem mmap
@ 2014-09-30 12:41 ` Frantisek Hrbata
0 siblings, 0 replies; 30+ messages in thread
From: Frantisek Hrbata @ 2014-09-30 12:41 UTC (permalink / raw)
To: Thomas Gleixner
Cc: linux-kernel, linux-mm, mingo, hpa, x86, oleg, kamaleshb,
hechjie, akpm, dave.hansen, dvlasenk, prarit, lwoodman,
hannsj_uhl, torvalds
On Mon, Sep 29, 2014 at 10:15:28PM +0200, Thomas Gleixner wrote:
> On Mon, 29 Sep 2014, Frantisek Hrbata wrote:
> > V2: fix pfn check in valid_mmap_phys_addr_range, thanks to Dave Hansen
>
> AFAICT, Dave also asked you to change size_t count into something more
> intuitive, i.e. nr_bytes or such.
Hi,
mea culpa, I for unknown reason changed it from "size" to "count". I guess some
cut&paste mess. The correct prototype used elsewhere in kernel is
int valid_mmap_phys_addr_range(unsigned long pfn, size_t size)
Does it make sense to replace "count" with "size" so it's consistent with the
rest or do you prefer "nr_bytes" or as Dave proposed "len_bytes"?
I will fix this and I'm sorry Dave I did not change it as discussed. It totally
slipped my mind.
Many thanks Thomas.
>
> > +int valid_mmap_phys_addr_range(unsigned long pfn, size_t count)
>
> And right he is. I really had to look twice to see that count is
> actually number of bytes and not number of pages, which is what you
> expect after pfn.
>
> > +{
> > + return arch_pfn_possible(pfn + (count >> PAGE_SHIFT));
> > +}
>
> Thanks,
>
> tglx
--
Frantisek Hrbata
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RESEND PATCH 2/4] x86: add phys addr validity check for /dev/mem mmap
2014-09-30 12:41 ` Frantisek Hrbata
@ 2014-09-30 14:27 ` Dave Hansen
-1 siblings, 0 replies; 30+ messages in thread
From: Dave Hansen @ 2014-09-30 14:27 UTC (permalink / raw)
To: Frantisek Hrbata, Thomas Gleixner
Cc: linux-kernel, linux-mm, mingo, hpa, x86, oleg, kamaleshb,
hechjie, akpm, dvlasenk, prarit, lwoodman, hannsj_uhl, torvalds
On 09/30/2014 05:41 AM, Frantisek Hrbata wrote:
> Does it make sense to replace "count" with "size" so it's consistent with the
> rest or do you prefer "nr_bytes" or as Dave proposed "len_bytes"?
I don't care what it is as long as it has a unit in it.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RESEND PATCH 2/4] x86: add phys addr validity check for /dev/mem mmap
@ 2014-09-30 14:27 ` Dave Hansen
0 siblings, 0 replies; 30+ messages in thread
From: Dave Hansen @ 2014-09-30 14:27 UTC (permalink / raw)
To: Frantisek Hrbata, Thomas Gleixner
Cc: linux-kernel, linux-mm, mingo, hpa, x86, oleg, kamaleshb,
hechjie, akpm, dvlasenk, prarit, lwoodman, hannsj_uhl, torvalds
On 09/30/2014 05:41 AM, Frantisek Hrbata wrote:
> Does it make sense to replace "count" with "size" so it's consistent with the
> rest or do you prefer "nr_bytes" or as Dave proposed "len_bytes"?
I don't care what it is as long as it has a unit in it.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH v2 0/4] x86: /dev/mem fixes
2014-09-29 11:32 ` Frantisek Hrbata
@ 2014-09-30 16:24 ` Frantisek Hrbata
-1 siblings, 0 replies; 30+ messages in thread
From: Frantisek Hrbata @ 2014-09-30 16:24 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, tglx, mingo, hpa, x86, oleg, kamaleshb, hechjie, akpm,
dave.hansen, dvlasenk, prarit, lwoodman, hannsj_uhl, torvalds
This is a second version of the patch set. The only change is in
2/4 x86: add phys addr validity check for /dev/mem mmap
where the "count" was replaced with "len_bytes" for better readability.
The rest is just a refresh because of this change.
The original message follows ...
Hi all,
this is a combination of two patch sets I sent a while ago
1) Prevent possible PTE corruption with /dev/mem mmap
2) x86: allow read/write /dev/mem to access non-system RAM above high_memory
The original thread with both patch sets can be found here
https://lkml.org/lkml/2014/8/14/229
lkml: <1408025927-16826-1-git-send-email-fhrbata@redhat.com>
1) Prevent possible PTE corruption with /dev/mem mmap
x86: add arch_pfn_possible helper
x86: add phys addr validity check for /dev/mem mmap
Many thanks goes to Dave Hansen, who helped with the "final" check. Other than
that it did not get much attention, except H. Peter Anvin's complain that having
two checks for mmap and read/write for /dev/mem access is ridiculous. I for sure
do not object to this, but AFAICT it's not that simple to unify them and it's not
"directly" related to the PTE corruption. Please note that there are other
archs(ia64, arm) using these check. But I for sure can be missing something.
What this patch set does is using the existing interface to implement x86 specific
check in the least invasive way.
Anyway I tried to remove the high_memory check with a follow-up patch set 2)
2) x86: allow read/write /dev/mem to access non-system RAM above high_memory
x86: add high_memory check to (xlate|unxlate)_dev_mem_ptr
x86: remove high_memory check from valid_phys_addr_range
This is an attempt to remove the high_memory limit for the read/write access to
/dev/mem. IMHO there is no reason for this limit on x86. It is presented in
the generic valid_phys_addr_range, which is used only by (read|write)_mem. IIUIC
it's main purpose is for the generic xlate_dev_mem_ptr, which is using only the
direct kernel mapping __va translation. Since the valid_phys_addr_range is
called as the first check in (read|write)_mem, it basically does not allow to
access anything above high_memory on x86.
The first patch adds high_memory check to x86's (xlate|unxlate)_dev_mem_ptr, so
the direct kernel mapping can be safely used for system RAM bellow high_memory.
This is IMHO the only valid reason to use high_memory check in (read|write)_mem.
The second patch removes the high_memory check from valid_phys_addr_range,
allowing read/write to access non-system RAM above high_memory. So far this
was possible only by using mmap.
I hope I haven't overlooked something.
Many thanks
Frantisek Hrbata (4):
x86: add arch_pfn_possible helper
x86: add phys addr validity check for /dev/mem mmap
x86: add high_memory check to (xlate|unxlate)_dev_mem_ptr
x86: remove high_memory check from valid_phys_addr_range
arch/x86/include/asm/io.h | 4 ++++
arch/x86/mm/ioremap.c | 9 ++++++---
arch/x86/mm/mmap.c | 12 ++++++++++++
arch/x86/mm/physaddr.h | 9 +++++++--
4 files changed, 29 insertions(+), 5 deletions(-)
--
1.9.3
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH v2 0/4] x86: /dev/mem fixes
@ 2014-09-30 16:24 ` Frantisek Hrbata
0 siblings, 0 replies; 30+ messages in thread
From: Frantisek Hrbata @ 2014-09-30 16:24 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, tglx, mingo, hpa, x86, oleg, kamaleshb, hechjie, akpm,
dave.hansen, dvlasenk, prarit, lwoodman, hannsj_uhl, torvalds
This is a second version of the patch set. The only change is in
2/4 x86: add phys addr validity check for /dev/mem mmap
where the "count" was replaced with "len_bytes" for better readability.
The rest is just a refresh because of this change.
The original message follows ...
Hi all,
this is a combination of two patch sets I sent a while ago
1) Prevent possible PTE corruption with /dev/mem mmap
2) x86: allow read/write /dev/mem to access non-system RAM above high_memory
The original thread with both patch sets can be found here
https://lkml.org/lkml/2014/8/14/229
lkml: <1408025927-16826-1-git-send-email-fhrbata@redhat.com>
1) Prevent possible PTE corruption with /dev/mem mmap
x86: add arch_pfn_possible helper
x86: add phys addr validity check for /dev/mem mmap
Many thanks goes to Dave Hansen, who helped with the "final" check. Other than
that it did not get much attention, except H. Peter Anvin's complain that having
two checks for mmap and read/write for /dev/mem access is ridiculous. I for sure
do not object to this, but AFAICT it's not that simple to unify them and it's not
"directly" related to the PTE corruption. Please note that there are other
archs(ia64, arm) using these check. But I for sure can be missing something.
What this patch set does is using the existing interface to implement x86 specific
check in the least invasive way.
Anyway I tried to remove the high_memory check with a follow-up patch set 2)
2) x86: allow read/write /dev/mem to access non-system RAM above high_memory
x86: add high_memory check to (xlate|unxlate)_dev_mem_ptr
x86: remove high_memory check from valid_phys_addr_range
This is an attempt to remove the high_memory limit for the read/write access to
/dev/mem. IMHO there is no reason for this limit on x86. It is presented in
the generic valid_phys_addr_range, which is used only by (read|write)_mem. IIUIC
it's main purpose is for the generic xlate_dev_mem_ptr, which is using only the
direct kernel mapping __va translation. Since the valid_phys_addr_range is
called as the first check in (read|write)_mem, it basically does not allow to
access anything above high_memory on x86.
The first patch adds high_memory check to x86's (xlate|unxlate)_dev_mem_ptr, so
the direct kernel mapping can be safely used for system RAM bellow high_memory.
This is IMHO the only valid reason to use high_memory check in (read|write)_mem.
The second patch removes the high_memory check from valid_phys_addr_range,
allowing read/write to access non-system RAM above high_memory. So far this
was possible only by using mmap.
I hope I haven't overlooked something.
Many thanks
Frantisek Hrbata (4):
x86: add arch_pfn_possible helper
x86: add phys addr validity check for /dev/mem mmap
x86: add high_memory check to (xlate|unxlate)_dev_mem_ptr
x86: remove high_memory check from valid_phys_addr_range
arch/x86/include/asm/io.h | 4 ++++
arch/x86/mm/ioremap.c | 9 ++++++---
arch/x86/mm/mmap.c | 12 ++++++++++++
arch/x86/mm/physaddr.h | 9 +++++++--
4 files changed, 29 insertions(+), 5 deletions(-)
--
1.9.3
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH v2 1/4] x86: add arch_pfn_possible helper
2014-09-30 16:24 ` Frantisek Hrbata
@ 2014-09-30 16:25 ` Frantisek Hrbata
-1 siblings, 0 replies; 30+ messages in thread
From: Frantisek Hrbata @ 2014-09-30 16:25 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, tglx, mingo, hpa, x86, oleg, kamaleshb, hechjie, akpm,
dave.hansen, dvlasenk, prarit, lwoodman, hannsj_uhl, torvalds
Add helper to check maximum possible pfn on x86. Also make the current
phys_addr_valid helper using it internally.
Signed-off-by: Frantisek Hrbata <fhrbata@redhat.com>
---
arch/x86/mm/physaddr.h | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/arch/x86/mm/physaddr.h b/arch/x86/mm/physaddr.h
index a3cd5a0..9df8e3a 100644
--- a/arch/x86/mm/physaddr.h
+++ b/arch/x86/mm/physaddr.h
@@ -1,10 +1,15 @@
#include <asm/processor.h>
-static inline int phys_addr_valid(resource_size_t addr)
+static inline int arch_pfn_possible(unsigned long pfn)
{
#ifdef CONFIG_PHYS_ADDR_T_64BIT
- return !(addr >> boot_cpu_data.x86_phys_bits);
+ return !(pfn >> (boot_cpu_data.x86_phys_bits - PAGE_SHIFT));
#else
return 1;
#endif
}
+
+static inline int phys_addr_valid(resource_size_t addr)
+{
+ return arch_pfn_possible(addr >> PAGE_SHIFT);
+}
--
1.9.3
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v2 1/4] x86: add arch_pfn_possible helper
@ 2014-09-30 16:25 ` Frantisek Hrbata
0 siblings, 0 replies; 30+ messages in thread
From: Frantisek Hrbata @ 2014-09-30 16:25 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, tglx, mingo, hpa, x86, oleg, kamaleshb, hechjie, akpm,
dave.hansen, dvlasenk, prarit, lwoodman, hannsj_uhl, torvalds
Add helper to check maximum possible pfn on x86. Also make the current
phys_addr_valid helper using it internally.
Signed-off-by: Frantisek Hrbata <fhrbata@redhat.com>
---
arch/x86/mm/physaddr.h | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/arch/x86/mm/physaddr.h b/arch/x86/mm/physaddr.h
index a3cd5a0..9df8e3a 100644
--- a/arch/x86/mm/physaddr.h
+++ b/arch/x86/mm/physaddr.h
@@ -1,10 +1,15 @@
#include <asm/processor.h>
-static inline int phys_addr_valid(resource_size_t addr)
+static inline int arch_pfn_possible(unsigned long pfn)
{
#ifdef CONFIG_PHYS_ADDR_T_64BIT
- return !(addr >> boot_cpu_data.x86_phys_bits);
+ return !(pfn >> (boot_cpu_data.x86_phys_bits - PAGE_SHIFT));
#else
return 1;
#endif
}
+
+static inline int phys_addr_valid(resource_size_t addr)
+{
+ return arch_pfn_possible(addr >> PAGE_SHIFT);
+}
--
1.9.3
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v2 2/4] x86: add phys addr validity check for /dev/mem mmap
2014-09-30 16:24 ` Frantisek Hrbata
@ 2014-09-30 16:25 ` Frantisek Hrbata
-1 siblings, 0 replies; 30+ messages in thread
From: Frantisek Hrbata @ 2014-09-30 16:25 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, tglx, mingo, hpa, x86, oleg, kamaleshb, hechjie, akpm,
dave.hansen, dvlasenk, prarit, lwoodman, hannsj_uhl, torvalds
Prevent possible PTE corruption while calling mmap on /dev/mem with large
offset.
oops info, please note the PTE value 8008000000000225.
---------------------------------8<--------------------------------------
[85739.124496] rep: Corrupted page table at address 7f63852f8000
[85739.130242] PGD ba2eb067 PUD b99c1067 PMD a2fa5067 PTE 8008000000000225
[85739.136941] Bad pagetable: 000d [#1] SMP
[85739.141002] Modules linked in: cfg80211 rfkill x86_pkg_temp_thermal coretemp
kvm_intel kvm bnx2 crct10dif_pclmul crc32_pclmul crc32c_intel
ghash_clmulni_intel microcode iTCO_wdt ipmi_si i2c_i801 iTCO_vendor_support
ipmi_msghandler dcdbas shpchp lpc_ich mfd_core nfsd auth_rpcgss nfs_acl lockd
sunrpc mgag200 i2c_algo_bit drm_kms_helper ttm drm i2c_core
[85739.172620] CPU: 3 PID: 21900 Comm: rep Not tainted 3.15.8-200.fc20.x86_64 #1
[85739.179768] Hardware name: Dell Inc. PowerEdge R210 II/09T7VV, BIOS 2.0.4 02/29/2012
[85739.187512] task: ffff8800b9b3b160 ti: ffff8800ba270000 task.ti: ffff8800ba270000
[85739.194988] RIP: 0033:[<0000000000400773>] [<0000000000400773>] 0x400773
[85739.201799] RSP: 002b:00007fffe4ca3c80 EFLAGS: 00010213
[85739.207119] RAX: 00007f63852f8000 RBX: 0000000000000000 RCX: 00007f6384e0b8ca
[85739.214249] RDX: 0000000000000001 RSI: 0000000000001000 RDI: 0000000000000000
[85739.221407] RBP: 00007fffe4ca3cc0 R08: 0000000000000003 R09: 0008000000000000
[85739.228545] R10: 0000000000000001 R11: 0000000000000206 R12: 00000000004005b0
[85739.235676] R13: 00007fffe4ca3da0 R14: 0000000000000000 R15: 0000000000000000
[85739.242835] FS: 00007f63852ea740(0000) GS:ffff88013fcc0000(0000) knlGS:0000000000000000
[85739.250925] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[85739.256669] CR2: 00007f63852f8000 CR3: 00000000b9ba0000 CR4: 00000000001407e0
---------------------------------8<--------------------------------------
According to [1] Chapter 4 Paging, some higher bits in 64bit
PTE(X86_64 || X86_32_PAE) are reserved and have to be set to zero. For example,
for IA-32e and 4KB page [1] 4.5 IA-32e Paging: Table 4-19, bits 51-M(MAXPHYADDR)
are reserved. So for a CPU with e.g. 48bit phys addr width, bits 51-48 have to
be zero. If one of the reserved bits is set, [1] 4.7 Page-Fault Exceptions,
the #PF is generated with RSVD error code.
<quote>
RSVD flag (bit 3).
This flag is 1 if there is no valid translation for the linear address because a
reserved bit was set in one of the paging-structure entries used to translate
that address. (Because reserved bits are not checked in a paging-structure entry
whose P flag is 0, bit 3 of the error code can be set only if bit 0 is also
set.)
</quote>
In mmap_mem() the first check is valid_mmap_phys_addr_range(), but it always
returns 1 for x86. So it's possible to use any pgoff we want and
to set the PTE's reserved bits in remap_pfn_range(). Meaning there is a
possibility to use mmap on /dev/mem and cause system panic. It's probably
not that serious, because access to /dev/mem is limited and the system has
to have the panic_on_oops set, but still I think we should check this and
return error.
The path for this problem is:
mmap_mem() => remap_pfn_range() => page present => touch page => tlb miss =>
walk through paging structures => reserved bit set => #pf with rsvd flag
This patch adds check for x86. With this fix mmap returns -EINVAL if the
requested phys addr is larger then the supported phys addr width.
[1] Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A
x86_64 reproducer
---------------------------------8<--------------------------------------
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <err.h>
#include <stdlib.h>
#include <sys/mman.h>
#define die(fmt, ...) err(1, fmt, ##__VA_ARGS__)
#define OFFSET 0x8000000000000LL
int main(int argc, char *argv[])
{
int fd;
long ps;
long pgoff;
char *map;
char c;
ps = sysconf(_SC_PAGE_SIZE);
if (ps == -1)
die("cannot get page size");
fd = open("/dev/mem", O_RDONLY);
if (fd == -1)
die("cannot open /dev/mem");
pgoff = (OFFSET + (ps - 1)) & ~(ps - 1);
map = mmap(NULL, ps, PROT_READ, MAP_SHARED, fd, pgoff);
if (map == MAP_FAILED)
die("cannot mmap");
c = map[0];
if (munmap(map, ps) == -1)
die("cannot munmap");
if (close(fd) == -1)
die("cannot close");
return 0;
}
---------------------------------8<--------------------------------------
x86_32_PAE reproducer
---------------------------------8<--------------------------------------
#define _GNU_SOURCE
#define _LARGEFILE64_SOURCE
#include <unistd.h>
#include <sys/syscall.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <err.h>
#include <stdlib.h>
#include <sys/mman.h>
#define die(fmt, ...) err(1, fmt, ##__VA_ARGS__)
/* 37th bit in PTE */
#define OFFSET 0x2000000
int main(int argc, char *argv[])
{
int fd;
long ps;
char *map;
char c;
ps = sysconf(_SC_PAGE_SIZE);
if (ps == -1)
die("cannot get page size");
fd = open("/dev/mem", O_RDONLY|O_LARGEFILE);
if (fd == -1)
die("cannot open /dev/mem");
map = (char *)syscall(SYS_mmap2, NULL, ps, PROT_READ, MAP_SHARED, fd, OFFSET);
if (map == MAP_FAILED)
die("cannot mmap");
c = map[0];
if (munmap(map, ps) == -1)
die("cannot munmap");
if (close(fd) == -1)
die("cannot close");
return 0;
}
---------------------------------8<--------------------------------------
V3: use len_bytes instead of count, thanks to Dave Hansen and Thomas Gleixner
V2: fix pfn check in valid_mmap_phys_addr_range, thanks to Dave Hansen
Signed-off-by: Frantisek Hrbata <fhrbata@redhat.com>
---
arch/x86/include/asm/io.h | 4 ++++
arch/x86/mm/mmap.c | 12 ++++++++++++
2 files changed, 16 insertions(+)
diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index b8237d8..49ede3c 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -243,6 +243,10 @@ static inline void flush_write_buffers(void)
#endif
}
+#define ARCH_HAS_VALID_PHYS_ADDR_RANGE
+extern int valid_phys_addr_range(phys_addr_t addr, size_t count);
+extern int valid_mmap_phys_addr_range(unsigned long pfn, size_t len_bytes);
+
#endif /* __KERNEL__ */
extern void native_io_delay(void);
diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c
index 919b912..77a13f8 100644
--- a/arch/x86/mm/mmap.c
+++ b/arch/x86/mm/mmap.c
@@ -31,6 +31,8 @@
#include <linux/sched.h>
#include <asm/elf.h>
+#include "physaddr.h"
+
struct va_alignment __read_mostly va_align = {
.flags = -1,
};
@@ -122,3 +124,13 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
mm->get_unmapped_area = arch_get_unmapped_area_topdown;
}
}
+
+int valid_phys_addr_range(phys_addr_t addr, size_t count)
+{
+ return addr + count <= __pa(high_memory);
+}
+
+int valid_mmap_phys_addr_range(unsigned long pfn, size_t len_bytes)
+{
+ return arch_pfn_possible(pfn + (len_bytes >> PAGE_SHIFT));
+}
--
1.9.3
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v2 2/4] x86: add phys addr validity check for /dev/mem mmap
@ 2014-09-30 16:25 ` Frantisek Hrbata
0 siblings, 0 replies; 30+ messages in thread
From: Frantisek Hrbata @ 2014-09-30 16:25 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, tglx, mingo, hpa, x86, oleg, kamaleshb, hechjie, akpm,
dave.hansen, dvlasenk, prarit, lwoodman, hannsj_uhl, torvalds
Prevent possible PTE corruption while calling mmap on /dev/mem with large
offset.
oops info, please note the PTE value 8008000000000225.
---------------------------------8<--------------------------------------
[85739.124496] rep: Corrupted page table at address 7f63852f8000
[85739.130242] PGD ba2eb067 PUD b99c1067 PMD a2fa5067 PTE 8008000000000225
[85739.136941] Bad pagetable: 000d [#1] SMP
[85739.141002] Modules linked in: cfg80211 rfkill x86_pkg_temp_thermal coretemp
kvm_intel kvm bnx2 crct10dif_pclmul crc32_pclmul crc32c_intel
ghash_clmulni_intel microcode iTCO_wdt ipmi_si i2c_i801 iTCO_vendor_support
ipmi_msghandler dcdbas shpchp lpc_ich mfd_core nfsd auth_rpcgss nfs_acl lockd
sunrpc mgag200 i2c_algo_bit drm_kms_helper ttm drm i2c_core
[85739.172620] CPU: 3 PID: 21900 Comm: rep Not tainted 3.15.8-200.fc20.x86_64 #1
[85739.179768] Hardware name: Dell Inc. PowerEdge R210 II/09T7VV, BIOS 2.0.4 02/29/2012
[85739.187512] task: ffff8800b9b3b160 ti: ffff8800ba270000 task.ti: ffff8800ba270000
[85739.194988] RIP: 0033:[<0000000000400773>] [<0000000000400773>] 0x400773
[85739.201799] RSP: 002b:00007fffe4ca3c80 EFLAGS: 00010213
[85739.207119] RAX: 00007f63852f8000 RBX: 0000000000000000 RCX: 00007f6384e0b8ca
[85739.214249] RDX: 0000000000000001 RSI: 0000000000001000 RDI: 0000000000000000
[85739.221407] RBP: 00007fffe4ca3cc0 R08: 0000000000000003 R09: 0008000000000000
[85739.228545] R10: 0000000000000001 R11: 0000000000000206 R12: 00000000004005b0
[85739.235676] R13: 00007fffe4ca3da0 R14: 0000000000000000 R15: 0000000000000000
[85739.242835] FS: 00007f63852ea740(0000) GS:ffff88013fcc0000(0000) knlGS:0000000000000000
[85739.250925] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[85739.256669] CR2: 00007f63852f8000 CR3: 00000000b9ba0000 CR4: 00000000001407e0
---------------------------------8<--------------------------------------
According to [1] Chapter 4 Paging, some higher bits in 64bit
PTE(X86_64 || X86_32_PAE) are reserved and have to be set to zero. For example,
for IA-32e and 4KB page [1] 4.5 IA-32e Paging: Table 4-19, bits 51-M(MAXPHYADDR)
are reserved. So for a CPU with e.g. 48bit phys addr width, bits 51-48 have to
be zero. If one of the reserved bits is set, [1] 4.7 Page-Fault Exceptions,
the #PF is generated with RSVD error code.
<quote>
RSVD flag (bit 3).
This flag is 1 if there is no valid translation for the linear address because a
reserved bit was set in one of the paging-structure entries used to translate
that address. (Because reserved bits are not checked in a paging-structure entry
whose P flag is 0, bit 3 of the error code can be set only if bit 0 is also
set.)
</quote>
In mmap_mem() the first check is valid_mmap_phys_addr_range(), but it always
returns 1 for x86. So it's possible to use any pgoff we want and
to set the PTE's reserved bits in remap_pfn_range(). Meaning there is a
possibility to use mmap on /dev/mem and cause system panic. It's probably
not that serious, because access to /dev/mem is limited and the system has
to have the panic_on_oops set, but still I think we should check this and
return error.
The path for this problem is:
mmap_mem() => remap_pfn_range() => page present => touch page => tlb miss =>
walk through paging structures => reserved bit set => #pf with rsvd flag
This patch adds check for x86. With this fix mmap returns -EINVAL if the
requested phys addr is larger then the supported phys addr width.
[1] Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A
x86_64 reproducer
---------------------------------8<--------------------------------------
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <err.h>
#include <stdlib.h>
#include <sys/mman.h>
#define die(fmt, ...) err(1, fmt, ##__VA_ARGS__)
#define OFFSET 0x8000000000000LL
int main(int argc, char *argv[])
{
int fd;
long ps;
long pgoff;
char *map;
char c;
ps = sysconf(_SC_PAGE_SIZE);
if (ps == -1)
die("cannot get page size");
fd = open("/dev/mem", O_RDONLY);
if (fd == -1)
die("cannot open /dev/mem");
pgoff = (OFFSET + (ps - 1)) & ~(ps - 1);
map = mmap(NULL, ps, PROT_READ, MAP_SHARED, fd, pgoff);
if (map == MAP_FAILED)
die("cannot mmap");
c = map[0];
if (munmap(map, ps) == -1)
die("cannot munmap");
if (close(fd) == -1)
die("cannot close");
return 0;
}
---------------------------------8<--------------------------------------
x86_32_PAE reproducer
---------------------------------8<--------------------------------------
#define _GNU_SOURCE
#define _LARGEFILE64_SOURCE
#include <unistd.h>
#include <sys/syscall.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <err.h>
#include <stdlib.h>
#include <sys/mman.h>
#define die(fmt, ...) err(1, fmt, ##__VA_ARGS__)
/* 37th bit in PTE */
#define OFFSET 0x2000000
int main(int argc, char *argv[])
{
int fd;
long ps;
char *map;
char c;
ps = sysconf(_SC_PAGE_SIZE);
if (ps == -1)
die("cannot get page size");
fd = open("/dev/mem", O_RDONLY|O_LARGEFILE);
if (fd == -1)
die("cannot open /dev/mem");
map = (char *)syscall(SYS_mmap2, NULL, ps, PROT_READ, MAP_SHARED, fd, OFFSET);
if (map == MAP_FAILED)
die("cannot mmap");
c = map[0];
if (munmap(map, ps) == -1)
die("cannot munmap");
if (close(fd) == -1)
die("cannot close");
return 0;
}
---------------------------------8<--------------------------------------
V3: use len_bytes instead of count, thanks to Dave Hansen and Thomas Gleixner
V2: fix pfn check in valid_mmap_phys_addr_range, thanks to Dave Hansen
Signed-off-by: Frantisek Hrbata <fhrbata@redhat.com>
---
arch/x86/include/asm/io.h | 4 ++++
arch/x86/mm/mmap.c | 12 ++++++++++++
2 files changed, 16 insertions(+)
diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index b8237d8..49ede3c 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -243,6 +243,10 @@ static inline void flush_write_buffers(void)
#endif
}
+#define ARCH_HAS_VALID_PHYS_ADDR_RANGE
+extern int valid_phys_addr_range(phys_addr_t addr, size_t count);
+extern int valid_mmap_phys_addr_range(unsigned long pfn, size_t len_bytes);
+
#endif /* __KERNEL__ */
extern void native_io_delay(void);
diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c
index 919b912..77a13f8 100644
--- a/arch/x86/mm/mmap.c
+++ b/arch/x86/mm/mmap.c
@@ -31,6 +31,8 @@
#include <linux/sched.h>
#include <asm/elf.h>
+#include "physaddr.h"
+
struct va_alignment __read_mostly va_align = {
.flags = -1,
};
@@ -122,3 +124,13 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
mm->get_unmapped_area = arch_get_unmapped_area_topdown;
}
}
+
+int valid_phys_addr_range(phys_addr_t addr, size_t count)
+{
+ return addr + count <= __pa(high_memory);
+}
+
+int valid_mmap_phys_addr_range(unsigned long pfn, size_t len_bytes)
+{
+ return arch_pfn_possible(pfn + (len_bytes >> PAGE_SHIFT));
+}
--
1.9.3
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v2 3/4] x86: add high_memory check to (xlate|unxlate)_dev_mem_ptr
2014-09-30 16:24 ` Frantisek Hrbata
@ 2014-09-30 16:25 ` Frantisek Hrbata
-1 siblings, 0 replies; 30+ messages in thread
From: Frantisek Hrbata @ 2014-09-30 16:25 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, tglx, mingo, hpa, x86, oleg, kamaleshb, hechjie, akpm,
dave.hansen, dvlasenk, prarit, lwoodman, hannsj_uhl, torvalds
So far (xlate|unxlate)_dev_mem_ptr for read/write /dev/mem relies on a generic
high_memory check in valid_phys_addr_range(), which does not allow to access any
memory above high_memory whatsoever. By adding the high_memory check to
(xlate|unxlate)_dev_mem_ptr, it still will be possible to use __va safely for
kernel mapped memory and it will also allow read/write to access non-system RAM
above high_memory once the high_memory check is removed from
valid_phys_addr_range.
Signed-off-by: Frantisek Hrbata <fhrbata@redhat.com>
---
arch/x86/mm/ioremap.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index baff1da..1ae7323 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -320,8 +320,11 @@ void *xlate_dev_mem_ptr(unsigned long phys)
void *addr;
unsigned long start = phys & PAGE_MASK;
- /* If page is RAM, we can use __va. Otherwise ioremap and unmap. */
- if (page_is_ram(start >> PAGE_SHIFT))
+ /*
+ * If page is RAM and is mapped by kernel, we can use __va.
+ * Otherwise ioremap and unmap.
+ */
+ if (page_is_ram(start >> PAGE_SHIFT) && phys <= __pa(high_memory))
return __va(phys);
addr = (void __force *)ioremap_cache(start, PAGE_SIZE);
@@ -333,7 +336,7 @@ void *xlate_dev_mem_ptr(unsigned long phys)
void unxlate_dev_mem_ptr(unsigned long phys, void *addr)
{
- if (page_is_ram(phys >> PAGE_SHIFT))
+ if (page_is_ram(phys >> PAGE_SHIFT) && phys <= __pa(high_memory))
return;
iounmap((void __iomem *)((unsigned long)addr & PAGE_MASK));
--
1.9.3
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v2 3/4] x86: add high_memory check to (xlate|unxlate)_dev_mem_ptr
@ 2014-09-30 16:25 ` Frantisek Hrbata
0 siblings, 0 replies; 30+ messages in thread
From: Frantisek Hrbata @ 2014-09-30 16:25 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, tglx, mingo, hpa, x86, oleg, kamaleshb, hechjie, akpm,
dave.hansen, dvlasenk, prarit, lwoodman, hannsj_uhl, torvalds
So far (xlate|unxlate)_dev_mem_ptr for read/write /dev/mem relies on a generic
high_memory check in valid_phys_addr_range(), which does not allow to access any
memory above high_memory whatsoever. By adding the high_memory check to
(xlate|unxlate)_dev_mem_ptr, it still will be possible to use __va safely for
kernel mapped memory and it will also allow read/write to access non-system RAM
above high_memory once the high_memory check is removed from
valid_phys_addr_range.
Signed-off-by: Frantisek Hrbata <fhrbata@redhat.com>
---
arch/x86/mm/ioremap.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index baff1da..1ae7323 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -320,8 +320,11 @@ void *xlate_dev_mem_ptr(unsigned long phys)
void *addr;
unsigned long start = phys & PAGE_MASK;
- /* If page is RAM, we can use __va. Otherwise ioremap and unmap. */
- if (page_is_ram(start >> PAGE_SHIFT))
+ /*
+ * If page is RAM and is mapped by kernel, we can use __va.
+ * Otherwise ioremap and unmap.
+ */
+ if (page_is_ram(start >> PAGE_SHIFT) && phys <= __pa(high_memory))
return __va(phys);
addr = (void __force *)ioremap_cache(start, PAGE_SIZE);
@@ -333,7 +336,7 @@ void *xlate_dev_mem_ptr(unsigned long phys)
void unxlate_dev_mem_ptr(unsigned long phys, void *addr)
{
- if (page_is_ram(phys >> PAGE_SHIFT))
+ if (page_is_ram(phys >> PAGE_SHIFT) && phys <= __pa(high_memory))
return;
iounmap((void __iomem *)((unsigned long)addr & PAGE_MASK));
--
1.9.3
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v2 4/4] x86: remove high_memory check from valid_phys_addr_range
2014-09-30 16:24 ` Frantisek Hrbata
@ 2014-09-30 16:25 ` Frantisek Hrbata
-1 siblings, 0 replies; 30+ messages in thread
From: Frantisek Hrbata @ 2014-09-30 16:25 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, tglx, mingo, hpa, x86, oleg, kamaleshb, hechjie, akpm,
dave.hansen, dvlasenk, prarit, lwoodman, hannsj_uhl, torvalds
There is no need to block read/write access to /dev/mem for phys. addr. above
high_memory for non-system RAM. The only limitation should be
boot_cpu_data.x86_phys_bits(max phys. addr. size).
Signed-off-by: Frantisek Hrbata <fhrbata@redhat.com>
---
arch/x86/mm/mmap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c
index 77a13f8..788f2b7 100644
--- a/arch/x86/mm/mmap.c
+++ b/arch/x86/mm/mmap.c
@@ -127,7 +127,7 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
int valid_phys_addr_range(phys_addr_t addr, size_t count)
{
- return addr + count <= __pa(high_memory);
+ return arch_pfn_possible((addr + count) >> PAGE_SHIFT);
}
int valid_mmap_phys_addr_range(unsigned long pfn, size_t len_bytes)
--
1.9.3
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v2 4/4] x86: remove high_memory check from valid_phys_addr_range
@ 2014-09-30 16:25 ` Frantisek Hrbata
0 siblings, 0 replies; 30+ messages in thread
From: Frantisek Hrbata @ 2014-09-30 16:25 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, tglx, mingo, hpa, x86, oleg, kamaleshb, hechjie, akpm,
dave.hansen, dvlasenk, prarit, lwoodman, hannsj_uhl, torvalds
There is no need to block read/write access to /dev/mem for phys. addr. above
high_memory for non-system RAM. The only limitation should be
boot_cpu_data.x86_phys_bits(max phys. addr. size).
Signed-off-by: Frantisek Hrbata <fhrbata@redhat.com>
---
arch/x86/mm/mmap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c
index 77a13f8..788f2b7 100644
--- a/arch/x86/mm/mmap.c
+++ b/arch/x86/mm/mmap.c
@@ -127,7 +127,7 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
int valid_phys_addr_range(phys_addr_t addr, size_t count)
{
- return addr + count <= __pa(high_memory);
+ return arch_pfn_possible((addr + count) >> PAGE_SHIFT);
}
int valid_mmap_phys_addr_range(unsigned long pfn, size_t len_bytes)
--
1.9.3
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [RESEND PATCH 2/4] x86: add phys addr validity check for /dev/mem mmap
2014-09-30 14:27 ` Dave Hansen
@ 2014-09-30 16:33 ` Frantisek Hrbata
-1 siblings, 0 replies; 30+ messages in thread
From: Frantisek Hrbata @ 2014-09-30 16:33 UTC (permalink / raw)
To: Dave Hansen
Cc: Thomas Gleixner, linux-kernel, linux-mm, mingo, hpa, x86, oleg,
kamaleshb, hechjie, akpm, dvlasenk, prarit, lwoodman, hannsj_uhl,
torvalds
On Tue, Sep 30, 2014 at 07:27:51AM -0700, Dave Hansen wrote:
> On 09/30/2014 05:41 AM, Frantisek Hrbata wrote:
> > Does it make sense to replace "count" with "size" so it's consistent with the
> > rest or do you prefer "nr_bytes" or as Dave proposed "len_bytes"?
>
> I don't care what it is as long as it has a unit in it.
Hi Dave/Thomas,
I sent v2 of this patch set, which uses the "len_bytes". Again, I'm sorry I
forgot to fix this the first time.
Many thanks
--
Frantisek Hrbata
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RESEND PATCH 2/4] x86: add phys addr validity check for /dev/mem mmap
@ 2014-09-30 16:33 ` Frantisek Hrbata
0 siblings, 0 replies; 30+ messages in thread
From: Frantisek Hrbata @ 2014-09-30 16:33 UTC (permalink / raw)
To: Dave Hansen
Cc: Thomas Gleixner, linux-kernel, linux-mm, mingo, hpa, x86, oleg,
kamaleshb, hechjie, akpm, dvlasenk, prarit, lwoodman, hannsj_uhl,
torvalds
On Tue, Sep 30, 2014 at 07:27:51AM -0700, Dave Hansen wrote:
> On 09/30/2014 05:41 AM, Frantisek Hrbata wrote:
> > Does it make sense to replace "count" with "size" so it's consistent with the
> > rest or do you prefer "nr_bytes" or as Dave proposed "len_bytes"?
>
> I don't care what it is as long as it has a unit in it.
Hi Dave/Thomas,
I sent v2 of this patch set, which uses the "len_bytes". Again, I'm sorry I
forgot to fix this the first time.
Many thanks
--
Frantisek Hrbata
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RESEND PATCH 2/4] x86: add phys addr validity check for /dev/mem mmap
2014-09-30 16:33 ` Frantisek Hrbata
@ 2014-09-30 21:57 ` Thomas Gleixner
-1 siblings, 0 replies; 30+ messages in thread
From: Thomas Gleixner @ 2014-09-30 21:57 UTC (permalink / raw)
To: Frantisek Hrbata
Cc: Dave Hansen, linux-kernel, linux-mm, mingo, hpa, x86, oleg,
kamaleshb, hechjie, akpm, dvlasenk, prarit, lwoodman, hannsj_uhl,
torvalds
On Tue, 30 Sep 2014, Frantisek Hrbata wrote:
> On Tue, Sep 30, 2014 at 07:27:51AM -0700, Dave Hansen wrote:
> > On 09/30/2014 05:41 AM, Frantisek Hrbata wrote:
> > > Does it make sense to replace "count" with "size" so it's consistent with the
> > > rest or do you prefer "nr_bytes" or as Dave proposed "len_bytes"?
> >
> > I don't care what it is as long as it has a unit in it.
>
> Hi Dave/Thomas,
>
> I sent v2 of this patch set, which uses the "len_bytes". Again, I'm sorry I
> forgot to fix this the first time.
No problem. Shit happens.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RESEND PATCH 2/4] x86: add phys addr validity check for /dev/mem mmap
@ 2014-09-30 21:57 ` Thomas Gleixner
0 siblings, 0 replies; 30+ messages in thread
From: Thomas Gleixner @ 2014-09-30 21:57 UTC (permalink / raw)
To: Frantisek Hrbata
Cc: Dave Hansen, linux-kernel, linux-mm, mingo, hpa, x86, oleg,
kamaleshb, hechjie, akpm, dvlasenk, prarit, lwoodman, hannsj_uhl,
torvalds
On Tue, 30 Sep 2014, Frantisek Hrbata wrote:
> On Tue, Sep 30, 2014 at 07:27:51AM -0700, Dave Hansen wrote:
> > On 09/30/2014 05:41 AM, Frantisek Hrbata wrote:
> > > Does it make sense to replace "count" with "size" so it's consistent with the
> > > rest or do you prefer "nr_bytes" or as Dave proposed "len_bytes"?
> >
> > I don't care what it is as long as it has a unit in it.
>
> Hi Dave/Thomas,
>
> I sent v2 of this patch set, which uses the "len_bytes". Again, I'm sorry I
> forgot to fix this the first time.
No problem. Shit happens.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 30+ messages in thread
end of thread, other threads:[~2014-09-30 21:57 UTC | newest]
Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-29 11:32 [RESEND PATCH 0/4] x86: /dev/mem fixes Frantisek Hrbata
2014-09-29 11:32 ` Frantisek Hrbata
2014-09-29 11:32 ` [RESEND PATCH 1/4] x86: add arch_pfn_possible helper Frantisek Hrbata
2014-09-29 11:32 ` Frantisek Hrbata
2014-09-29 11:33 ` [RESEND PATCH 2/4] x86: add phys addr validity check for /dev/mem mmap Frantisek Hrbata
2014-09-29 11:33 ` Frantisek Hrbata
2014-09-29 20:15 ` Thomas Gleixner
2014-09-29 20:15 ` Thomas Gleixner
2014-09-30 12:41 ` Frantisek Hrbata
2014-09-30 12:41 ` Frantisek Hrbata
2014-09-30 14:27 ` Dave Hansen
2014-09-30 14:27 ` Dave Hansen
2014-09-30 16:33 ` Frantisek Hrbata
2014-09-30 16:33 ` Frantisek Hrbata
2014-09-30 21:57 ` Thomas Gleixner
2014-09-30 21:57 ` Thomas Gleixner
2014-09-29 11:33 ` [RESEND PATCH 3/4] x86: add high_memory check to (xlate|unxlate)_dev_mem_ptr Frantisek Hrbata
2014-09-29 11:33 ` Frantisek Hrbata
2014-09-29 11:33 ` [RESEND PATCH 4/4] x86: remove high_memory check from valid_phys_addr_range Frantisek Hrbata
2014-09-29 11:33 ` Frantisek Hrbata
2014-09-30 16:24 ` [PATCH v2 0/4] x86: /dev/mem fixes Frantisek Hrbata
2014-09-30 16:24 ` Frantisek Hrbata
2014-09-30 16:25 ` [PATCH v2 1/4] x86: add arch_pfn_possible helper Frantisek Hrbata
2014-09-30 16:25 ` Frantisek Hrbata
2014-09-30 16:25 ` [PATCH v2 2/4] x86: add phys addr validity check for /dev/mem mmap Frantisek Hrbata
2014-09-30 16:25 ` Frantisek Hrbata
2014-09-30 16:25 ` [PATCH v2 3/4] x86: add high_memory check to (xlate|unxlate)_dev_mem_ptr Frantisek Hrbata
2014-09-30 16:25 ` Frantisek Hrbata
2014-09-30 16:25 ` [PATCH v2 4/4] x86: remove high_memory check from valid_phys_addr_range Frantisek Hrbata
2014-09-30 16:25 ` Frantisek Hrbata
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.