[PATCHv4,3/5] x86/mm: fix 32-bit mmap() for 64-bit ELF

Message ID 20170130120432.6716-4-dsafonov@virtuozzo.com
State New, archived
Series
  • Fix compatible mmap() return pointer over 4Gb

Commit Message

Dmitry Safonov Jan. 30, 2017, 12:04 p.m. UTC
Fix 32-bit compat_sys_mmap() mapping a VMA above 4GB in 64-bit binaries
and 64-bit sys_mmap() mapping a VMA only below 4GB in 32-bit binaries.
Introduce new bases for compat syscalls in mm_struct:
mmap_compat_base and mmap_compat_legacy_base, for top-down and
bottom-up allocations respectively.
Teach arch_get_unmapped_area{,_topdown}() to use the new mmap bases
in compat syscalls for the high/low limits in vm_unmapped_area().

I discovered this bug while running the ZDTM tests for compat 32-bit C/R.
A working compat sys_mmap() in 64-bit binaries is needed for that
purpose, as 32-bit applications are restored from the 64-bit CRIU binary.

Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
---
 arch/Kconfig                 |  7 +++++++
 arch/x86/Kconfig             |  1 +
 arch/x86/kernel/sys_x86_64.c | 28 ++++++++++++++++++++++++----
 arch/x86/mm/mmap.c           | 36 ++++++++++++++++++++++++------------
 include/linux/mm_types.h     |  5 +++++
 5 files changed, 61 insertions(+), 16 deletions(-)
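
The selection logic described in the commit message can be sketched in
userspace C. This is only an illustration, not the kernel code: the
struct, the function names and the two limit constants are assumptions
made for the sketch, and the MAP_32BIT branch of find_start_end() is
left out. The point it tries to show is that the limits follow the type
of the syscall being executed rather than the ELF class of the binary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative limits, assumed for this sketch; real values depend on the kernel build. */
#define SKETCH_IA32_PAGE_OFFSET UINT64_C(0xffffe000)
#define SKETCH_TASK_SIZE_MAX    UINT64_C(0x7ffffffff000)

/* Hypothetical stand-in for the four mmap bases kept in mm_struct. */
struct sketch_mm {
	uint64_t mmap_base;
	uint64_t mmap_legacy_base;
	uint64_t mmap_compat_base;
	uint64_t mmap_compat_legacy_base;
};

/* Upper limit for a top-down search: chosen by syscall type, not by ELF class. */
static uint64_t sketch_topdown_high_limit(const struct sketch_mm *mm,
					  bool compat_syscall)
{
	return compat_syscall ? mm->mmap_compat_base : mm->mmap_base;
}

/* [begin, end) window for a bottom-up search. */
static void sketch_bottomup_range(const struct sketch_mm *mm, bool compat_syscall,
				  uint64_t *begin, uint64_t *end)
{
	assert(begin && end);
	if (compat_syscall) {
		*begin = mm->mmap_compat_legacy_base;
		*end = SKETCH_IA32_PAGE_OFFSET;
	} else {
		*begin = mm->mmap_legacy_base;
		*end = SKETCH_TASK_SIZE_MAX;
	}
}
```

With both pairs of bases filled in at exec time, a 64-bit task that
issues a compat mmap() would get a window that ends below 4GB, and a
32-bit task that issues a native mmap() would get the full window.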

Comments

Thomas Gleixner Feb. 11, 2017, 7:49 p.m. UTC | #1
On Mon, 30 Jan 2017, Dmitry Safonov wrote:

> Fix 32-bit compat_sys_mmap() mapping VMA over 4Gb in 64-bit binaries
> and 64-bit sys_mmap() mapping VMA only under 4Gb in 32-bit binaries.
> Introduced new bases for compat syscalls in mm_struct:
> mmap_compat_base and mmap_compat_legacy_base for top-down and
> bottom-up allocations accordingly.
> Taught arch_get_unmapped_area{,_topdown}() to use the new mmap_bases
> in compat syscalls for high/low limits in vm_unmapped_area().
> 
> I discovered that bug on ZDTM tests for compat 32-bit C/R.
> Working compat sys_mmap() in 64-bit binaries is really needed for that
> purpose, as 32-bit applications are restored from 64-bit CRIU binary.

Again that changelog sucks.

Explain the problem/bug first. Then explain the way to fix it and do not
tell fairy tales about what you did without explaining the bug in the first
place.

Documentation....SubmittingPatches explains that very well.


> +config HAVE_ARCH_COMPAT_MMAP_BASES
> +	bool
> +	help
> +	  If this is set, one program can do native and compatible syscall
> +	  mmap() on architecture. Thus kernel has different bases to
> +	  compute high and low virtual address limits for allocation.

Sigh. How is a user supposed to decode this?

	  This allows 64bit applications to invoke syscalls in 64bit and
	  32bit mode. Required for ....

>  
> @@ -113,10 +114,19 @@ static void find_start_end(unsigned long flags, unsigned long *begin,
>  		if (current->flags & PF_RANDOMIZE) {
>  			*begin = randomize_page(*begin, 0x02000000);
>  		}
> -	} else {
> -		*begin = current->mm->mmap_legacy_base;
> -		*end = TASK_SIZE;
> +		return;
>  	}
> +
> +#ifdef CONFIG_COMPAT

Can you please find a solution which does not create that ifdef horror in
the code? Just a few accessors to those compat fields are required to do
that.
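
One possible shape for such accessors, sketched in userspace C. All
names here (SKETCH_COMPAT, sketch_mm, sketch_compat_base and so on) are
hypothetical and only stand in for CONFIG_COMPAT and the mm_struct
fields; this is not what a later revision of the patch necessarily
does. The idea is that the #ifdef lives once, next to the helpers, and
the callers stay free of conditional compilation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Models a build with compat support; remove to model a build without it. */
#define SKETCH_COMPAT 1

struct sketch_mm {
	uint64_t mmap_base;
	uint64_t mmap_legacy_base;
#ifdef SKETCH_COMPAT
	uint64_t mmap_compat_base;
	uint64_t mmap_compat_legacy_base;
#endif
};

#ifdef SKETCH_COMPAT
/* Real accessors: the compat fields exist. */
static inline bool sketch_in_compat_syscall(bool requested)
{
	return requested;
}

static inline uint64_t sketch_compat_base(const struct sketch_mm *mm)
{
	return mm->mmap_compat_base;
}
#else
/* Stubs: a compat syscall can never happen, so the fallback is never used. */
static inline bool sketch_in_compat_syscall(bool requested)
{
	(void)requested;
	return false;
}

static inline uint64_t sketch_compat_base(const struct sketch_mm *mm)
{
	return mm->mmap_base;
}
#endif

/* Caller without any #ifdef: the helpers hide the configuration. */
static uint64_t sketch_high_limit(const struct sketch_mm *mm, bool compat_requested)
{
	if (sketch_in_compat_syscall(compat_requested))
		return sketch_compat_base(mm);
	return mm->mmap_base;
}
```

In the stub variant the compiler can see that the compat branch is dead
and drop it, so the caller costs nothing extra on builds without compat
support.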

> +
> +#ifdef CONFIG_COMPAT
> +	arch_pick_mmap_base(&mm->mmap_compat_base, &mm->mmap_compat_legacy_base,
> +			arch_compat_rnd(), IA32_PAGE_OFFSET);
> +#endif

Ditto

Thanks,

	tglx
Dmitry Safonov Feb. 14, 2017, 3:24 p.m. UTC | #2
On 02/11/2017 10:49 PM, Thomas Gleixner wrote:
> On Mon, 30 Jan 2017, Dmitry Safonov wrote:
>
>> Fix 32-bit compat_sys_mmap() mapping VMA over 4Gb in 64-bit binaries
>> and 64-bit sys_mmap() mapping VMA only under 4Gb in 32-bit binaries.
>> Introduced new bases for compat syscalls in mm_struct:
>> mmap_compat_base and mmap_compat_legacy_base for top-down and
>> bottom-up allocations accordingly.
>> Taught arch_get_unmapped_area{,_topdown}() to use the new mmap_bases
>> in compat syscalls for high/low limits in vm_unmapped_area().
>>
>> I discovered that bug on ZDTM tests for compat 32-bit C/R.
>> Working compat sys_mmap() in 64-bit binaries is really needed for that
>> purpose, as 32-bit applications are restored from 64-bit CRIU binary.
>
> Again that changelog sucks.
>
> Explain the problem/bug first. Then explain the way to fix it and do not
> tell fairy tales about what you did without explaining the bug in the first
> place.
>
> Documentation....SubmittingPatches explains that very well.

Rewrote changelog.

>> +config HAVE_ARCH_COMPAT_MMAP_BASES
>> +	bool
>> +	help
>> +	  If this is set, one program can do native and compatible syscall
>> +	  mmap() on architecture. Thus kernel has different bases to
>> +	  compute high and low virtual address limits for allocation.
>
> Sigh. How is a user supposed to decode this?
>
> 	  This allows 64bit applications to invoke syscalls in 64bit and
> 	  32bit mode. Required for ....

Ok

>>
>> @@ -113,10 +114,19 @@ static void find_start_end(unsigned long flags, unsigned long *begin,
>>  		if (current->flags & PF_RANDOMIZE) {
>>  			*begin = randomize_page(*begin, 0x02000000);
>>  		}
>> -	} else {
>> -		*begin = current->mm->mmap_legacy_base;
>> -		*end = TASK_SIZE;
>> +		return;
>>  	}
>> +
>> +#ifdef CONFIG_COMPAT
>
> Can you please find a solution which does not create that ifdef horror in
> the code? Just a few accessors to those compat fields are required to do
> that.

I'll try

>> +
>> +#ifdef CONFIG_COMPAT
>> +	arch_pick_mmap_base(&mm->mmap_compat_base, &mm->mmap_compat_legacy_base,
>> +			arch_compat_rnd(), IA32_PAGE_OFFSET);
>> +#endif
>
> Ditto
>
> Thanks,
>
> 	tglx
>

Patch

diff --git a/arch/Kconfig b/arch/Kconfig
index 99839c23d453..6bdca6d86855 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -671,6 +671,13 @@  config ARCH_MMAP_RND_COMPAT_BITS
 	  This value can be changed after boot using the
 	  /proc/sys/vm/mmap_rnd_compat_bits tunable
 
+config HAVE_ARCH_COMPAT_MMAP_BASES
+	bool
+	help
+	  If this is set, one program can do native and compatible syscall
+	  mmap() on architecture. Thus kernel has different bases to
+	  compute high and low virtual address limits for allocation.
+
 config HAVE_COPY_THREAD_TLS
 	bool
 	help
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index e487493bbd47..b3acb836567a 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -102,6 +102,7 @@  config X86
 	select HAVE_ARCH_KMEMCHECK
 	select HAVE_ARCH_MMAP_RND_BITS		if MMU
 	select HAVE_ARCH_MMAP_RND_COMPAT_BITS	if MMU && COMPAT
+	select HAVE_ARCH_COMPAT_MMAP_BASES	if MMU && COMPAT
 	select HAVE_ARCH_SECCOMP_FILTER
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c
index a55ed63b9f91..90be0839441d 100644
--- a/arch/x86/kernel/sys_x86_64.c
+++ b/arch/x86/kernel/sys_x86_64.c
@@ -16,6 +16,7 @@ 
 #include <linux/uaccess.h>
 #include <linux/elf.h>
 
+#include <asm/compat.h>
 #include <asm/ia32.h>
 #include <asm/syscalls.h>
 
@@ -113,10 +114,19 @@  static void find_start_end(unsigned long flags, unsigned long *begin,
 		if (current->flags & PF_RANDOMIZE) {
 			*begin = randomize_page(*begin, 0x02000000);
 		}
-	} else {
-		*begin = current->mm->mmap_legacy_base;
-		*end = TASK_SIZE;
+		return;
 	}
+
+#ifdef CONFIG_COMPAT
+	if (in_compat_syscall()) {
+		*begin = current->mm->mmap_compat_legacy_base;
+		*end = IA32_PAGE_OFFSET;
+		return;
+	}
+#endif
+
+	*begin = current->mm->mmap_legacy_base;
+	*end = TASK_SIZE_MAX;
 }
 
 unsigned long
@@ -157,6 +167,16 @@  arch_get_unmapped_area(struct file *filp, unsigned long addr,
 	return vm_unmapped_area(&info);
 }
 
+static unsigned long find_top(void)
+{
+#ifdef CONFIG_COMPAT
+	if (in_compat_syscall())
+		return current->mm->mmap_compat_base;
+	else
+#endif
+		return current->mm->mmap_base;
+}
+
 unsigned long
 arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 			  const unsigned long len, const unsigned long pgoff,
@@ -190,7 +210,7 @@  arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 	info.flags = VM_UNMAPPED_AREA_TOPDOWN;
 	info.length = len;
 	info.low_limit = PAGE_SIZE;
-	info.high_limit = mm->mmap_base;
+	info.high_limit = find_top();
 	info.align_mask = 0;
 	info.align_offset = pgoff << PAGE_SHIFT;
 	if (filp) {
diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c
index 98be520fd270..99e6a81d9c87 100644
--- a/arch/x86/mm/mmap.c
+++ b/arch/x86/mm/mmap.c
@@ -70,6 +70,8 @@  static int mmap_is_legacy(void)
 #ifdef CONFIG_COMPAT
 static unsigned long arch_compat_rnd(void)
 {
+	if (!(current->flags & PF_RANDOMIZE))
+		return 0;
 	return (get_random_long() & ((1UL << mmap_rnd_compat_bits) - 1))
 		<< PAGE_SHIFT;
 }
@@ -77,6 +79,8 @@  static unsigned long arch_compat_rnd(void)
 
 static unsigned long arch_native_rnd(void)
 {
+	if (!(current->flags & PF_RANDOMIZE))
+		return 0;
 	return (get_random_long() & ((1UL << mmap_rnd_bits) - 1)) << PAGE_SHIFT;
 }
 
@@ -112,22 +116,30 @@  static unsigned long mmap_legacy_base(unsigned long rnd,
  * This function, called very early during the creation of a new
  * process VM image, sets up which VM layout function to use:
  */
-void arch_pick_mmap_layout(struct mm_struct *mm)
+static void arch_pick_mmap_base(unsigned long *base, unsigned long *legacy_base,
+		unsigned long random_factor, unsigned long task_size)
 {
-	unsigned long random_factor = 0UL;
-
-	if (current->flags & PF_RANDOMIZE)
-		random_factor = arch_mmap_rnd();
-
-	mm->mmap_legacy_base = mmap_legacy_base(random_factor, TASK_SIZE);
+	*legacy_base = mmap_legacy_base(random_factor, task_size);
+	if (mmap_is_legacy())
+		*base = *legacy_base;
+	else
+		*base = mmap_base(random_factor, task_size);
+}
 
-	if (mmap_is_legacy()) {
-		mm->mmap_base = mm->mmap_legacy_base;
+void arch_pick_mmap_layout(struct mm_struct *mm)
+{
+	if (mmap_is_legacy())
 		mm->get_unmapped_area = arch_get_unmapped_area;
-	} else {
-		mm->mmap_base = mmap_base(random_factor, TASK_SIZE);
+	else
 		mm->get_unmapped_area = arch_get_unmapped_area_topdown;
-	}
+
+	arch_pick_mmap_base(&mm->mmap_base, &mm->mmap_legacy_base,
+			arch_native_rnd(), TASK_SIZE_MAX);
+
+#ifdef CONFIG_COMPAT
+	arch_pick_mmap_base(&mm->mmap_compat_base, &mm->mmap_compat_legacy_base,
+			arch_compat_rnd(), IA32_PAGE_OFFSET);
+#endif
 }
 
 const char *arch_vma_name(struct vm_area_struct *vma)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 808751d7b737..48274a84cebe 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -404,6 +404,11 @@  struct mm_struct {
 #endif
 	unsigned long mmap_base;		/* base of mmap area */
 	unsigned long mmap_legacy_base;         /* base of mmap area in bottom-up allocations */
+#ifdef CONFIG_HAVE_ARCH_COMPAT_MMAP_BASES
+	/* Base addresses for compatible mmap() */
+	unsigned long mmap_compat_base;
+	unsigned long mmap_compat_legacy_base;
+#endif
 	unsigned long task_size;		/* size of task vm space */
 	unsigned long highest_vm_end;		/* highest vma end address */
 	pgd_t * pgd;
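
The arithmetic in the arch_compat_rnd() and arch_native_rnd() hunks of
arch/x86/mm/mmap.c above can be tried out in userspace. The sketch
below is an illustration under assumptions: the random value and the
PF_RANDOMIZE state are passed in as parameters instead of being read
from get_random_long() and current->flags, the function name is made
up, and a page shift of 12 (4KiB pages) is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed page shift for 4KiB pages. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Keep the low `bits` bits of random_value and scale them to whole pages.
 * Returns 0 when randomization is disabled, mirroring the early return
 * that the patch adds to both rnd helpers.
 */
static uint64_t sketch_mmap_rnd(uint64_t random_value, unsigned int bits,
				bool randomize)
{
	assert(bits < 64 - SKETCH_PAGE_SHIFT);
	if (!randomize)
		return 0;
	return (random_value & ((UINT64_C(1) << bits) - 1)) << SKETCH_PAGE_SHIFT;
}
```

For example, with 8 random bits the offset is always a multiple of the
page size and never exceeds 0xff000, whatever random value is supplied.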