linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCHv10 0/2] mremap vDSO for 32-bit
@ 2016-06-28 11:35 Dmitry Safonov
  2016-06-28 11:35 ` [PATCHv10 1/2] x86/vdso: add mremap hook to vm_special_mapping Dmitry Safonov
  2016-06-28 11:35 ` [PATCHv10 2/2] selftest/x86: add mremap vdso test Dmitry Safonov
  0 siblings, 2 replies; 7+ messages in thread
From: Dmitry Safonov @ 2016-06-28 11:35 UTC (permalink / raw)
  To: linux-kernel; +Cc: 0x7f454c46, mingo, luto, linux-mm, Dmitry Safonov

The first patch adds support of mremapping 32-bit vDSO.
One could move vDSO vma before this patch, but fast syscalls
on moved vDSO hasn't been working. It's because of code that
relies on mm->context.vdso pointer.
So all this code is just fixup for that pointer on moving.
(Also adds preventing for splitting vDSO vma).
As Andy notted, 64-bit vDSO mremap also has worked only by a chance
before this patches.
The second patch adds a test for the new functionality.

I need possibility to move vDSO in CRIU - on restore we need
to choose vma's position:
- if vDSO blob of restoring application is the same as the kernel has,
  we need to move it on the same place;
- if it differs, we need to choose place that wasn't tooken by other
  vma of restoring application and than add jump trampolines to it
  from the place of vDSO in restoring application.

CRIU code now relies on possibility on x86_64 to mremap vDSO.
Without this patch that may be broken in future.
And as I work on C/R of compatible 32-bit applications on x86_64,
I need mremap to work also for 32-bit vDSO. Which does not work,
because of context.vdso pointer mentioned above. 

Changes:
v10: run selftest after fork() and treat child segfaults for a nice
     error reports.
v9: Added cover-letter with changelog and reasons for patches
v8: Add WARN_ON_ONCE on current->mm != new_vma->vm_mm;
    run test for x86_64 too;
    removed fixed VDSO_SIZE - check EINVAL mremap return for partial remapping
v7: Build fix
v6: Moved vdso_image_32 check and fixup code into vdso_fix_landing function
    with ifdefs around
v5: As Andy suggested, add a check that new_vma->vm_mm and current->mm are
    the same, also check not only in_ia32_syscall() but image == &vdso_image_32;
    added test for mremapping vDSO
v4: Drop __maybe_unused & use image from mm->context instead vdso_image_32
v3: As Andy suggested, return EINVAL in case of splitting vdso blob on mremap;
    used is_ia32_task instead of ifdefs 
v2: Added __maybe_unused for pt_regs in vdso_mremap

Dmitry Safonov (2):
  x86/vdso: add mremap hook to vm_special_mapping
  selftest/x86: add mremap vdso test

 arch/x86/entry/vdso/vma.c                      |  47 +++++++++--
 include/linux/mm_types.h                       |   3 +
 mm/mmap.c                                      |  10 +++
 tools/testing/selftests/x86/Makefile           |   3 +-
 tools/testing/selftests/x86/test_mremap_vdso.c | 111 +++++++++++++++++++++++++
 5 files changed, 168 insertions(+), 6 deletions(-)
 create mode 100644 tools/testing/selftests/x86/test_mremap_vdso.c

-- 
2.9.0

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCHv10 1/2] x86/vdso: add mremap hook to vm_special_mapping
  2016-06-28 11:35 [PATCHv10 0/2] mremap vDSO for 32-bit Dmitry Safonov
@ 2016-06-28 11:35 ` Dmitry Safonov
  2016-07-06 14:03   ` Andy Lutomirski
  2016-07-08 13:16   ` [tip:x86/mm] x86/vdso: Add " tip-bot for Dmitry Safonov
  2016-06-28 11:35 ` [PATCHv10 2/2] selftest/x86: add mremap vdso test Dmitry Safonov
  1 sibling, 2 replies; 7+ messages in thread
From: Dmitry Safonov @ 2016-06-28 11:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: 0x7f454c46, mingo, luto, linux-mm, Dmitry Safonov,
	Thomas Gleixner, H. Peter Anvin, x86

Add possibility for userspace 32-bit applications to move
vdso mapping. Previously, when userspace app called
mremap for vdso, in return path it would land on previous
address of vdso page, resulting in segmentation violation.
Now it lands fine and returns to userspace with remapped vdso.
This will also fix context.vdso pointer for 64-bit, which does not
affect the user of vdso after mremap by now, but this may change.

As suggested by Andy, return EINVAL for mremap that splits vdso image.

Renamed and moved text_mapping structure declaration inside
map_vdso, as it used only there and now it complement
vvar_mapping variable.

There is still problem for remapping vdso in glibc applications:
linker relocates addresses for syscalls on vdso page, so
you need to relink with the new addresses. Or the next syscall
through glibc may fail:
  Program received signal SIGSEGV, Segmentation fault.
  #0  0xf7fd9b80 in __kernel_vsyscall ()
  #1  0xf7ec8238 in _exit () from /usr/lib32/libc.so.6

Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: linux-mm@kvack.org
Acked-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/entry/vdso/vma.c | 47 ++++++++++++++++++++++++++++++++++++++++++-----
 include/linux/mm_types.h  |  3 +++
 mm/mmap.c                 | 10 ++++++++++
 3 files changed, 55 insertions(+), 5 deletions(-)

diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index ab220ac9b3b9..3329844e3c43 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -12,6 +12,7 @@
 #include <linux/random.h>
 #include <linux/elf.h>
 #include <linux/cpu.h>
+#include <linux/ptrace.h>
 #include <asm/pvclock.h>
 #include <asm/vgtod.h>
 #include <asm/proto.h>
@@ -97,10 +98,40 @@ static int vdso_fault(const struct vm_special_mapping *sm,
 	return 0;
 }
 
-static const struct vm_special_mapping text_mapping = {
-	.name = "[vdso]",
-	.fault = vdso_fault,
-};
+static void vdso_fix_landing(const struct vdso_image *image,
+		struct vm_area_struct *new_vma)
+{
+#if defined CONFIG_X86_32 || defined CONFIG_IA32_EMULATION
+	if (in_ia32_syscall() && image == &vdso_image_32) {
+		struct pt_regs *regs = current_pt_regs();
+		unsigned long vdso_land = image->sym_int80_landing_pad;
+		unsigned long old_land_addr = vdso_land +
+			(unsigned long)current->mm->context.vdso;
+
+		/* Fixing userspace landing - look at do_fast_syscall_32 */
+		if (regs->ip == old_land_addr)
+			regs->ip = new_vma->vm_start + vdso_land;
+	}
+#endif
+}
+
+static int vdso_mremap(const struct vm_special_mapping *sm,
+		struct vm_area_struct *new_vma)
+{
+	unsigned long new_size = new_vma->vm_end - new_vma->vm_start;
+	const struct vdso_image *image = current->mm->context.vdso_image;
+
+	if (image->size != new_size)
+		return -EINVAL;
+
+	if (WARN_ON_ONCE(current->mm != new_vma->vm_mm))
+		return -EFAULT;
+
+	vdso_fix_landing(image, new_vma);
+	current->mm->context.vdso = (void __user *)new_vma->vm_start;
+
+	return 0;
+}
 
 static int vvar_fault(const struct vm_special_mapping *sm,
 		      struct vm_area_struct *vma, struct vm_fault *vmf)
@@ -151,6 +182,12 @@ static int map_vdso(const struct vdso_image *image, bool calculate_addr)
 	struct vm_area_struct *vma;
 	unsigned long addr, text_start;
 	int ret = 0;
+
+	static const struct vm_special_mapping vdso_mapping = {
+		.name = "[vdso]",
+		.fault = vdso_fault,
+		.mremap = vdso_mremap,
+	};
 	static const struct vm_special_mapping vvar_mapping = {
 		.name = "[vvar]",
 		.fault = vvar_fault,
@@ -185,7 +222,7 @@ static int map_vdso(const struct vdso_image *image, bool calculate_addr)
 				       image->size,
 				       VM_READ|VM_EXEC|
 				       VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC,
-				       &text_mapping);
+				       &vdso_mapping);
 
 	if (IS_ERR(vma)) {
 		ret = PTR_ERR(vma);
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index e093e1d3285b..79472b22d23f 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -595,6 +595,9 @@ struct vm_special_mapping {
 	int (*fault)(const struct vm_special_mapping *sm,
 		     struct vm_area_struct *vma,
 		     struct vm_fault *vmf);
+
+	int (*mremap)(const struct vm_special_mapping *sm,
+		     struct vm_area_struct *new_vma);
 };
 
 enum tlb_flush_reason {
diff --git a/mm/mmap.c b/mm/mmap.c
index 25c2b4e0fbdc..86b18f334f4f 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2961,9 +2961,19 @@ static const char *special_mapping_name(struct vm_area_struct *vma)
 	return ((struct vm_special_mapping *)vma->vm_private_data)->name;
 }
 
+static int special_mapping_mremap(struct vm_area_struct *new_vma)
+{
+	struct vm_special_mapping *sm = new_vma->vm_private_data;
+
+	if (sm->mremap)
+		return sm->mremap(sm, new_vma);
+	return 0;
+}
+
 static const struct vm_operations_struct special_mapping_vmops = {
 	.close = special_mapping_close,
 	.fault = special_mapping_fault,
+	.mremap = special_mapping_mremap,
 	.name = special_mapping_name,
 };
 
-- 
2.9.0

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCHv10 2/2] selftest/x86: add mremap vdso test
  2016-06-28 11:35 [PATCHv10 0/2] mremap vDSO for 32-bit Dmitry Safonov
  2016-06-28 11:35 ` [PATCHv10 1/2] x86/vdso: add mremap hook to vm_special_mapping Dmitry Safonov
@ 2016-06-28 11:35 ` Dmitry Safonov
  2016-07-08 12:17   ` Ingo Molnar
  2016-07-08 13:16   ` [tip:x86/mm] selftests/x86: Add vDSO mremap() test tip-bot for Dmitry Safonov
  1 sibling, 2 replies; 7+ messages in thread
From: Dmitry Safonov @ 2016-06-28 11:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: 0x7f454c46, mingo, luto, linux-mm, Dmitry Safonov,
	Thomas Gleixner, H. Peter Anvin, Shuah Khan, x86,
	linux-kselftest

Should print on success:
[root@localhost ~]# ./test_mremap_vdso_32
	AT_SYSINFO_EHDR is 0xf773f000
[NOTE]	Moving vDSO: [f773f000, f7740000] -> [a000000, a001000]
[OK]

Or print that mremap for vDSO is unsupported:
[root@localhost ~]# ./test_mremap_vdso_32
	AT_SYSINFO_EHDR is 0xf773c000
[NOTE]	Moving vDSO: [0xf773c000, 0xf773d000] -> [0xf7737000, 0xf7738000]
[FAIL]	mremap() of the vDSO does not work on this kernel!

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Cc: x86@kernel.org
Cc: linux-kselftest@vger.kernel.org
Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
---
 tools/testing/selftests/x86/Makefile           |   3 +-
 tools/testing/selftests/x86/test_mremap_vdso.c | 111 +++++++++++++++++++++++++
 2 files changed, 113 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/x86/test_mremap_vdso.c

diff --git a/tools/testing/selftests/x86/Makefile b/tools/testing/selftests/x86/Makefile
index abe9c35c1cb6..c701349d4920 100644
--- a/tools/testing/selftests/x86/Makefile
+++ b/tools/testing/selftests/x86/Makefile
@@ -5,7 +5,8 @@ include ../lib.mk
 .PHONY: all all_32 all_64 warn_32bit_failure clean
 
 TARGETS_C_BOTHBITS := single_step_syscall sysret_ss_attrs syscall_nt ptrace_syscall \
-			check_initial_reg_state sigreturn ldt_gdt iopl mpx-mini-test
+			check_initial_reg_state sigreturn ldt_gdt iopl mpx-mini-test \
+			test_mremap_vdso
 TARGETS_C_32BIT_ONLY := entry_from_vm86 syscall_arg_fault test_syscall_vdso unwind_vdso \
 			test_FCMOV test_FCOMI test_FISTTP \
 			vdso_restorer
diff --git a/tools/testing/selftests/x86/test_mremap_vdso.c b/tools/testing/selftests/x86/test_mremap_vdso.c
new file mode 100644
index 000000000000..a489a2410664
--- /dev/null
+++ b/tools/testing/selftests/x86/test_mremap_vdso.c
@@ -0,0 +1,111 @@
+/*
+ * 32-bit test to check vdso mremap.
+ *
+ * Copyright (c) 2016 Dmitry Safonov
+ * Suggested-by: Andrew Lutomirski
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+/*
+ * Can be built statically:
+ * gcc -Os -Wall -static -m32 test_mremap_vdso.c
+ */
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <errno.h>
+#include <unistd.h>
+#include <string.h>
+
+#include <sys/mman.h>
+#include <sys/auxv.h>
+#include <sys/syscall.h>
+#include <sys/wait.h>
+
+#define PAGE_SIZE	4096
+
+static int try_to_remap(void *vdso_addr, unsigned long size)
+{
+	void *dest_addr, *new_addr;
+
+	/* Searching for memory location where to remap */
+	dest_addr = mmap(0, size, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
+	if (dest_addr == MAP_FAILED) {
+		printf("[WARN]\tmmap failed (%d): %m\n", errno);
+		return 0;
+	}
+
+	printf("[NOTE]\tMoving vDSO: [%p, %#lx] -> [%p, %#lx]\n",
+		vdso_addr, (unsigned long)vdso_addr + size,
+		dest_addr, (unsigned long)dest_addr + size);
+	fflush(stdout);
+
+	new_addr = mremap(vdso_addr, size, size,
+			MREMAP_FIXED|MREMAP_MAYMOVE, dest_addr);
+	if ((unsigned long)new_addr == (unsigned long)-1) {
+		munmap(dest_addr, size);
+		if (errno == EINVAL) {
+			printf("[NOTE]\tvDSO partial move failed, will try with bigger size\n");
+			return -1; /* Retry with larger */
+		}
+		printf("[FAIL]\tmremap failed (%d): %m\n", errno);
+		return 1;
+	}
+
+	return 0;
+
+}
+
+int main(int argc, char **argv, char **envp)
+{
+	pid_t child;
+
+	child = fork();
+	if (child == -1) {
+		printf("[WARN]\tfailed to fork (%d): %m\n", errno);
+		return 1;
+	}
+
+	if (child == 0) {
+		unsigned long vdso_size = PAGE_SIZE;
+		unsigned long auxval;
+		int ret = -1;
+
+		auxval = getauxval(AT_SYSINFO_EHDR);
+		printf("\tAT_SYSINFO_EHDR is %#lx\n", auxval);
+		if (!auxval || auxval == -ENOENT) {
+			printf("[WARN]\tgetauxval failed\n");
+			return 0;
+		}
+
+		/* Simpler than parsing ELF header */
+		while (ret < 0) {
+			ret = try_to_remap((void *)auxval, vdso_size);
+			vdso_size += PAGE_SIZE;
+		}
+
+		/* Glibc is likely to explode now - exit with raw syscall */
+		asm volatile ("int $0x80" : : "a" (__NR_exit), "b" (!!ret));
+	} else {
+		int status;
+
+		if (waitpid(child, &status, 0) != child ||
+			!WIFEXITED(status)) {
+			printf("[FAIL]\tmremap() of the vDSO does not work on this kernel!\n");
+			return 1;
+		} else if (WEXITSTATUS(status) != 0) {
+			printf("[FAIL]\tChild failed with %d\n",
+					WEXITSTATUS(status));
+			return 1;
+		}
+		printf("[OK]\n");
+	}
+
+	return 0;
+}
-- 
2.9.0

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCHv10 1/2] x86/vdso: add mremap hook to vm_special_mapping
  2016-06-28 11:35 ` [PATCHv10 1/2] x86/vdso: add mremap hook to vm_special_mapping Dmitry Safonov
@ 2016-07-06 14:03   ` Andy Lutomirski
  2016-07-08 13:16   ` [tip:x86/mm] x86/vdso: Add " tip-bot for Dmitry Safonov
  1 sibling, 0 replies; 7+ messages in thread
From: Andy Lutomirski @ 2016-07-06 14:03 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: linux-kernel, Dmitry Safonov, Ingo Molnar, Andrew Lutomirski,
	linux-mm, Thomas Gleixner, H. Peter Anvin, X86 ML

On Tue, Jun 28, 2016 at 4:35 AM, Dmitry Safonov <dsafonov@virtuozzo.com> wrote:
> Add possibility for userspace 32-bit applications to move
> vdso mapping. Previously, when userspace app called
> mremap for vdso, in return path it would land on previous
> address of vdso page, resulting in segmentation violation.
> Now it lands fine and returns to userspace with remapped vdso.
> This will also fix context.vdso pointer for 64-bit, which does not
> affect the user of vdso after mremap by now, but this may change.
>
> As suggested by Andy, return EINVAL for mremap that splits vdso image.
>
> Renamed and moved text_mapping structure declaration inside
> map_vdso, as it used only there and now it complement
> vvar_mapping variable.
>
> There is still problem for remapping vdso in glibc applications:
> linker relocates addresses for syscalls on vdso page, so
> you need to relink with the new addresses. Or the next syscall
> through glibc may fail:
>   Program received signal SIGSEGV, Segmentation fault.
>   #0  0xf7fd9b80 in __kernel_vsyscall ()
>   #1  0xf7ec8238 in _exit () from /usr/lib32/libc.so.6

Acked-by: Andy Lutomirski <luto@kernel.org>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCHv10 2/2] selftest/x86: add mremap vdso test
  2016-06-28 11:35 ` [PATCHv10 2/2] selftest/x86: add mremap vdso test Dmitry Safonov
@ 2016-07-08 12:17   ` Ingo Molnar
  2016-07-08 13:16   ` [tip:x86/mm] selftests/x86: Add vDSO mremap() test tip-bot for Dmitry Safonov
  1 sibling, 0 replies; 7+ messages in thread
From: Ingo Molnar @ 2016-07-08 12:17 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: linux-kernel, 0x7f454c46, mingo, luto, linux-mm, Thomas Gleixner,
	H. Peter Anvin, Shuah Khan, x86, linux-kselftest


* Dmitry Safonov <dsafonov@virtuozzo.com> wrote:

> Or print that mremap for vDSO is unsupported:
> [root@localhost ~]# ./test_mremap_vdso_32
> 	AT_SYSINFO_EHDR is 0xf773c000
> [NOTE]	Moving vDSO: [0xf773c000, 0xf773d000] -> [0xf7737000, 0xf7738000]
> [FAIL]	mremap() of the vDSO does not work on this kernel!

Hm, I tried this on a 64-bit kernel and got:

triton:~/tip/tools/testing/selftests/x86> ./test_mremap_vdso_32 
        AT_SYSINFO_EHDR is 0xf7773000
[NOTE]  Moving vDSO: [0xf7773000, 0xf7774000] -> [0xf776e000, 0xf776f000]
Segmentation fault

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [tip:x86/mm] x86/vdso: Add mremap hook to vm_special_mapping
  2016-06-28 11:35 ` [PATCHv10 1/2] x86/vdso: add mremap hook to vm_special_mapping Dmitry Safonov
  2016-07-06 14:03   ` Andy Lutomirski
@ 2016-07-08 13:16   ` tip-bot for Dmitry Safonov
  1 sibling, 0 replies; 7+ messages in thread
From: tip-bot for Dmitry Safonov @ 2016-07-08 13:16 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jpoimboe, dvlasenk, hpa, torvalds, dsafonov, bp, brgerst,
	linux-kernel, luto, peterz, tglx, mingo

Commit-ID:  b059a453b1cf1c8453c2b2ed373d3147d6264ebd
Gitweb:     http://git.kernel.org/tip/b059a453b1cf1c8453c2b2ed373d3147d6264ebd
Author:     Dmitry Safonov <dsafonov@virtuozzo.com>
AuthorDate: Tue, 28 Jun 2016 14:35:38 +0300
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 8 Jul 2016 14:17:51 +0200

x86/vdso: Add mremap hook to vm_special_mapping

Add possibility for 32-bit user-space applications to move
the vDSO mapping.

Previously, when a user-space app called mremap() for the vDSO
address, in the syscall return path it would land on the previous
address of the vDSOpage, resulting in segmentation violation.

Now it lands fine and returns to userspace with a remapped vDSO.

This will also fix the context.vdso pointer for 64-bit, which does
not affect the user of vDSO after mremap() currently, but this
may change in the future.

As suggested by Andy, return -EINVAL for mremap() that would
split the vDSO image: that operation cannot possibly result in
a working system so reject it.

Renamed and moved the text_mapping structure declaration inside
map_vdso(), as it used only there and now it complements the
vvar_mapping variable.

There is still a problem for remapping the vDSO in glibc
applications: the linker relocates addresses for syscalls
on the vDSO page, so you need to relink with the new
addresses.

Without that the next syscall through glibc may fail:

  Program received signal SIGSEGV, Segmentation fault.
  #0  0xf7fd9b80 in __kernel_vsyscall ()
  #1  0xf7ec8238 in _exit () from /usr/lib32/libc.so.6

Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: 0x7f454c46@gmail.com
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20160628113539.13606-2-dsafonov@virtuozzo.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/entry/vdso/vma.c | 47 ++++++++++++++++++++++++++++++++++++++++++-----
 include/linux/mm_types.h  |  3 +++
 mm/mmap.c                 | 10 ++++++++++
 3 files changed, 55 insertions(+), 5 deletions(-)

diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index ab220ac..3329844 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -12,6 +12,7 @@
 #include <linux/random.h>
 #include <linux/elf.h>
 #include <linux/cpu.h>
+#include <linux/ptrace.h>
 #include <asm/pvclock.h>
 #include <asm/vgtod.h>
 #include <asm/proto.h>
@@ -97,10 +98,40 @@ static int vdso_fault(const struct vm_special_mapping *sm,
 	return 0;
 }
 
-static const struct vm_special_mapping text_mapping = {
-	.name = "[vdso]",
-	.fault = vdso_fault,
-};
+static void vdso_fix_landing(const struct vdso_image *image,
+		struct vm_area_struct *new_vma)
+{
+#if defined CONFIG_X86_32 || defined CONFIG_IA32_EMULATION
+	if (in_ia32_syscall() && image == &vdso_image_32) {
+		struct pt_regs *regs = current_pt_regs();
+		unsigned long vdso_land = image->sym_int80_landing_pad;
+		unsigned long old_land_addr = vdso_land +
+			(unsigned long)current->mm->context.vdso;
+
+		/* Fixing userspace landing - look at do_fast_syscall_32 */
+		if (regs->ip == old_land_addr)
+			regs->ip = new_vma->vm_start + vdso_land;
+	}
+#endif
+}
+
+static int vdso_mremap(const struct vm_special_mapping *sm,
+		struct vm_area_struct *new_vma)
+{
+	unsigned long new_size = new_vma->vm_end - new_vma->vm_start;
+	const struct vdso_image *image = current->mm->context.vdso_image;
+
+	if (image->size != new_size)
+		return -EINVAL;
+
+	if (WARN_ON_ONCE(current->mm != new_vma->vm_mm))
+		return -EFAULT;
+
+	vdso_fix_landing(image, new_vma);
+	current->mm->context.vdso = (void __user *)new_vma->vm_start;
+
+	return 0;
+}
 
 static int vvar_fault(const struct vm_special_mapping *sm,
 		      struct vm_area_struct *vma, struct vm_fault *vmf)
@@ -151,6 +182,12 @@ static int map_vdso(const struct vdso_image *image, bool calculate_addr)
 	struct vm_area_struct *vma;
 	unsigned long addr, text_start;
 	int ret = 0;
+
+	static const struct vm_special_mapping vdso_mapping = {
+		.name = "[vdso]",
+		.fault = vdso_fault,
+		.mremap = vdso_mremap,
+	};
 	static const struct vm_special_mapping vvar_mapping = {
 		.name = "[vvar]",
 		.fault = vvar_fault,
@@ -185,7 +222,7 @@ static int map_vdso(const struct vdso_image *image, bool calculate_addr)
 				       image->size,
 				       VM_READ|VM_EXEC|
 				       VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC,
-				       &text_mapping);
+				       &vdso_mapping);
 
 	if (IS_ERR(vma)) {
 		ret = PTR_ERR(vma);
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index ca3e517..917f2b6 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -594,6 +594,9 @@ struct vm_special_mapping {
 	int (*fault)(const struct vm_special_mapping *sm,
 		     struct vm_area_struct *vma,
 		     struct vm_fault *vmf);
+
+	int (*mremap)(const struct vm_special_mapping *sm,
+		     struct vm_area_struct *new_vma);
 };
 
 enum tlb_flush_reason {
diff --git a/mm/mmap.c b/mm/mmap.c
index de2c176..234edff 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2943,9 +2943,19 @@ static const char *special_mapping_name(struct vm_area_struct *vma)
 	return ((struct vm_special_mapping *)vma->vm_private_data)->name;
 }
 
+static int special_mapping_mremap(struct vm_area_struct *new_vma)
+{
+	struct vm_special_mapping *sm = new_vma->vm_private_data;
+
+	if (sm->mremap)
+		return sm->mremap(sm, new_vma);
+	return 0;
+}
+
 static const struct vm_operations_struct special_mapping_vmops = {
 	.close = special_mapping_close,
 	.fault = special_mapping_fault,
+	.mremap = special_mapping_mremap,
 	.name = special_mapping_name,
 };
 

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [tip:x86/mm] selftests/x86: Add vDSO mremap() test
  2016-06-28 11:35 ` [PATCHv10 2/2] selftest/x86: add mremap vdso test Dmitry Safonov
  2016-07-08 12:17   ` Ingo Molnar
@ 2016-07-08 13:16   ` tip-bot for Dmitry Safonov
  1 sibling, 0 replies; 7+ messages in thread
From: tip-bot for Dmitry Safonov @ 2016-07-08 13:16 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: shuahkh, hpa, dsafonov, tglx, mingo, torvalds, dvlasenk, brgerst,
	bp, peterz, luto, jpoimboe, linux-kernel

Commit-ID:  f80fd3a5fff88a9ace7e8cd11d07cf874a63ea9f
Gitweb:     http://git.kernel.org/tip/f80fd3a5fff88a9ace7e8cd11d07cf874a63ea9f
Author:     Dmitry Safonov <dsafonov@virtuozzo.com>
AuthorDate: Tue, 28 Jun 2016 14:35:39 +0300
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 8 Jul 2016 14:17:51 +0200

selftests/x86: Add vDSO mremap() test

Should print this on vDSO remapping success (on new kernels):

 [root@localhost ~]# ./test_mremap_vdso_32
	AT_SYSINFO_EHDR is 0xf773f000
 [NOTE]	Moving vDSO: [f773f000, f7740000] -> [a000000, a001000]
 [OK]

Or print that mremap() for vDSOs is unsupported:

 [root@localhost ~]# ./test_mremap_vdso_32
	AT_SYSINFO_EHDR is 0xf773c000
 [NOTE]	Moving vDSO: [0xf773c000, 0xf773d000] -> [0xf7737000, 0xf7738000]
 [FAIL]	mremap() of the vDSO does not work on this kernel!

Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: 0x7f454c46@gmail.com
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kselftest@vger.kernel.org
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20160628113539.13606-3-dsafonov@virtuozzo.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 tools/testing/selftests/x86/Makefile           |   2 +-
 tools/testing/selftests/x86/test_mremap_vdso.c | 111 +++++++++++++++++++++++++
 2 files changed, 112 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/x86/Makefile b/tools/testing/selftests/x86/Makefile
index c73425d..543a6d0 100644
--- a/tools/testing/selftests/x86/Makefile
+++ b/tools/testing/selftests/x86/Makefile
@@ -4,7 +4,7 @@ include ../lib.mk
 
 .PHONY: all all_32 all_64 warn_32bit_failure clean
 
-TARGETS_C_BOTHBITS := single_step_syscall sysret_ss_attrs syscall_nt ptrace_syscall \
+TARGETS_C_BOTHBITS := single_step_syscall sysret_ss_attrs syscall_nt ptrace_syscall test_mremap_vdso \
 			check_initial_reg_state sigreturn ldt_gdt iopl
 TARGETS_C_32BIT_ONLY := entry_from_vm86 syscall_arg_fault test_syscall_vdso unwind_vdso \
 			test_FCMOV test_FCOMI test_FISTTP \
diff --git a/tools/testing/selftests/x86/test_mremap_vdso.c b/tools/testing/selftests/x86/test_mremap_vdso.c
new file mode 100644
index 0000000..bf0d687
--- /dev/null
+++ b/tools/testing/selftests/x86/test_mremap_vdso.c
@@ -0,0 +1,111 @@
+/*
+ * 32-bit test to check vDSO mremap.
+ *
+ * Copyright (c) 2016 Dmitry Safonov
+ * Suggested-by: Andrew Lutomirski
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+/*
+ * Can be built statically:
+ * gcc -Os -Wall -static -m32 test_mremap_vdso.c
+ */
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <errno.h>
+#include <unistd.h>
+#include <string.h>
+
+#include <sys/mman.h>
+#include <sys/auxv.h>
+#include <sys/syscall.h>
+#include <sys/wait.h>
+
+#define PAGE_SIZE	4096
+
+static int try_to_remap(void *vdso_addr, unsigned long size)
+{
+	void *dest_addr, *new_addr;
+
+	/* Searching for memory location where to remap */
+	dest_addr = mmap(0, size, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
+	if (dest_addr == MAP_FAILED) {
+		printf("[WARN]\tmmap failed (%d): %m\n", errno);
+		return 0;
+	}
+
+	printf("[NOTE]\tMoving vDSO: [%p, %#lx] -> [%p, %#lx]\n",
+		vdso_addr, (unsigned long)vdso_addr + size,
+		dest_addr, (unsigned long)dest_addr + size);
+	fflush(stdout);
+
+	new_addr = mremap(vdso_addr, size, size,
+			MREMAP_FIXED|MREMAP_MAYMOVE, dest_addr);
+	if ((unsigned long)new_addr == (unsigned long)-1) {
+		munmap(dest_addr, size);
+		if (errno == EINVAL) {
+			printf("[NOTE]\tvDSO partial move failed, will try with bigger size\n");
+			return -1; /* Retry with larger */
+		}
+		printf("[FAIL]\tmremap failed (%d): %m\n", errno);
+		return 1;
+	}
+
+	return 0;
+
+}
+
+int main(int argc, char **argv, char **envp)
+{
+	pid_t child;
+
+	child = fork();
+	if (child == -1) {
+		printf("[WARN]\tfailed to fork (%d): %m\n", errno);
+		return 1;
+	}
+
+	if (child == 0) {
+		unsigned long vdso_size = PAGE_SIZE;
+		unsigned long auxval;
+		int ret = -1;
+
+		auxval = getauxval(AT_SYSINFO_EHDR);
+		printf("\tAT_SYSINFO_EHDR is %#lx\n", auxval);
+		if (!auxval || auxval == -ENOENT) {
+			printf("[WARN]\tgetauxval failed\n");
+			return 0;
+		}
+
+		/* Simpler than parsing ELF header */
+		while (ret < 0) {
+			ret = try_to_remap((void *)auxval, vdso_size);
+			vdso_size += PAGE_SIZE;
+		}
+
+		/* Glibc is likely to explode now - exit with raw syscall */
+		asm volatile ("int $0x80" : : "a" (__NR_exit), "b" (!!ret));
+	} else {
+		int status;
+
+		if (waitpid(child, &status, 0) != child ||
+			!WIFEXITED(status)) {
+			printf("[FAIL]\tmremap() of the vDSO does not work on this kernel!\n");
+			return 1;
+		} else if (WEXITSTATUS(status) != 0) {
+			printf("[FAIL]\tChild failed with %d\n",
+					WEXITSTATUS(status));
+			return 1;
+		}
+		printf("[OK]\n");
+	}
+
+	return 0;
+}

^ permalink raw reply related	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2016-07-08 13:17 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-28 11:35 [PATCHv10 0/2] mremap vDSO for 32-bit Dmitry Safonov
2016-06-28 11:35 ` [PATCHv10 1/2] x86/vdso: add mremap hook to vm_special_mapping Dmitry Safonov
2016-07-06 14:03   ` Andy Lutomirski
2016-07-08 13:16   ` [tip:x86/mm] x86/vdso: Add " tip-bot for Dmitry Safonov
2016-06-28 11:35 ` [PATCHv10 2/2] selftest/x86: add mremap vdso test Dmitry Safonov
2016-07-08 12:17   ` Ingo Molnar
2016-07-08 13:16   ` [tip:x86/mm] selftests/x86: Add vDSO mremap() test tip-bot for Dmitry Safonov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).