* [PATCHv10 0/2] mremap vDSO for 32-bit
@ 2016-06-28 11:35 Dmitry Safonov
  2016-06-28 11:35 ` [PATCHv10 1/2] x86/vdso: add mremap hook to vm_special_mapping Dmitry Safonov
  2016-06-28 11:35 ` [PATCHv10 2/2] selftest/x86: add mremap vdso test Dmitry Safonov
  0 siblings, 2 replies; 5+ messages in thread
From: Dmitry Safonov @ 2016-06-28 11:35 UTC (permalink / raw)
  To: linux-kernel; +Cc: 0x7f454c46, mingo, luto, linux-mm, Dmitry Safonov

The first patch adds support for mremapping the 32-bit vDSO.
One could move the vDSO vma before this patch, but fast syscalls
on the moved vDSO did not work, because of code that relies on
the mm->context.vdso pointer.
So all this code is just a fixup for that pointer on moving.
(It also prevents splitting the vDSO vma.)
As Andy noted, 64-bit vDSO mremap also worked only by chance
before these patches.
The second patch adds a test for the new functionality.

I need the ability to move the vDSO in CRIU: on restore we need
to choose the vma's position:
- if the vDSO blob of the application being restored is the same as the
  kernel's, we need to move it to the same place;
- if it differs, we need to choose a place that is not taken by another
  vma of the application being restored and then add jump trampolines
  to it from the place of the vDSO in the application being restored.
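
The two restore cases above can be sketched as a small decision helper.
This is only an illustration: struct vdso_blob, the enum, and both
function names are hypothetical and are not CRIU or kernel interfaces,
and comparing the blobs byte by byte is an assumption about what
"the same" means here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of a vDSO blob; not a CRIU or kernel type. */
struct vdso_blob {
	const unsigned char *data;
	size_t size;
};

enum vdso_restore_action {
	VDSO_MOVE_TO_ORIGINAL,		/* blobs match: mremap to the dumped address */
	VDSO_PARK_AND_TRAMPOLINE,	/* blobs differ: park in a free range, add jumps */
};

/* Assumption: two blobs are "the same" when size and bytes are identical. */
static bool vdso_blobs_equal(const struct vdso_blob *a,
			     const struct vdso_blob *b)
{
	assert(a && b);
	return a->size == b->size && memcmp(a->data, b->data, a->size) == 0;
}

static enum vdso_restore_action
choose_vdso_action(const struct vdso_blob *dumped,
		   const struct vdso_blob *running)
{
	return vdso_blobs_equal(dumped, running) ? VDSO_MOVE_TO_ORIGINAL
						 : VDSO_PARK_AND_TRAMPOLINE;
}
```

In both cases the vDSO vma has to be moved, which is why mremap of
the vDSO must keep working.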

CRIU code now relies on the ability to mremap the vDSO on x86_64.
Without this patch that may break in the future.
And as I work on C/R of compat 32-bit applications on x86_64,
I need mremap to work for the 32-bit vDSO as well. It currently does
not, because of the context.vdso pointer mentioned above.

Changes:
v10: run selftest after fork() and handle child segfaults to give a
     nice error report.
v9: Added cover-letter with changelog and reasons for patches
v8: Add WARN_ON_ONCE on current->mm != new_vma->vm_mm;
    run test for x86_64 too;
    removed fixed VDSO_SIZE - check EINVAL mremap return for partial remapping
v7: Build fix
v6: Moved vdso_image_32 check and fixup code into vdso_fix_landing function
    with ifdefs around
v5: As Andy suggested, add a check that new_vma->vm_mm and current->mm are
    the same, also check not only in_ia32_syscall() but image == &vdso_image_32;
    added test for mremapping vDSO
v4: Drop __maybe_unused & use image from mm->context instead of vdso_image_32
v3: As Andy suggested, return EINVAL in case of splitting vdso blob on mremap;
    used is_ia32_task instead of ifdefs
v2: Added __maybe_unused for pt_regs in vdso_mremap

Dmitry Safonov (2):
  x86/vdso: add mremap hook to vm_special_mapping
  selftest/x86: add mremap vdso test

 arch/x86/entry/vdso/vma.c                      |  47 +++++++++--
 include/linux/mm_types.h                       |   3 +
 mm/mmap.c                                      |  10 +++
 tools/testing/selftests/x86/Makefile           |   3 +-
 tools/testing/selftests/x86/test_mremap_vdso.c | 111 +++++++++++++++++++++++++
 5 files changed, 168 insertions(+), 6 deletions(-)
 create mode 100644 tools/testing/selftests/x86/test_mremap_vdso.c

-- 
2.9.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* [PATCHv10 1/2] x86/vdso: add mremap hook to vm_special_mapping
  2016-06-28 11:35 [PATCHv10 0/2] mremap vDSO for 32-bit Dmitry Safonov
@ 2016-06-28 11:35 ` Dmitry Safonov
  2016-07-06 14:03   ` Andy Lutomirski
  2016-06-28 11:35 ` [PATCHv10 2/2] selftest/x86: add mremap vdso test Dmitry Safonov
  1 sibling, 1 reply; 5+ messages in thread
From: Dmitry Safonov @ 2016-06-28 11:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: 0x7f454c46, mingo, luto, linux-mm, Dmitry Safonov,
	Thomas Gleixner, H. Peter Anvin, x86

Add the ability for 32-bit userspace applications to move the
vdso mapping. Previously, when a userspace app called
mremap for the vdso, on the return path it would land on the previous
address of the vdso page, resulting in a segmentation violation.
Now it lands fine and returns to userspace with the remapped vdso.
This also fixes the context.vdso pointer for 64-bit, which does not
affect users of the vdso after mremap for now, but this may change.
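
The fixup arithmetic can be modelled in plain userspace C. This is a
sketch under assumptions, not the kernel code: struct model_ctx and
model_vdso_mremap() are hypothetical stand-ins, and the
in_ia32_syscall() and image == &vdso_image_32 conditions from the
patch are left out.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical userspace stand-ins for the kernel state the hook touches. */
struct model_ctx {
	unsigned long vdso_base;	/* models mm->context.vdso */
	unsigned long image_size;	/* models vdso_image->size */
	unsigned long landing_off;	/* models image->sym_int80_landing_pad */
	unsigned long ip;		/* models regs->ip of the task in mremap() */
};

static int model_vdso_mremap(struct model_ctx *c,
			     unsigned long new_start, unsigned long new_end)
{
	unsigned long old_land = c->vdso_base + c->landing_off;

	assert(new_end >= new_start);

	/* A move that would split the image is refused. */
	if (new_end - new_start != c->image_size)
		return -EINVAL;

	/* Rewrite the return address only if it is the old landing pad. */
	if (c->ip == old_land)
		c->ip = new_start + c->landing_off;

	/* Keep the recorded base in sync with the new mapping. */
	c->vdso_base = new_start;
	return 0;
}
```

The landing pad keeps the same offset inside the image, so only the
base changes: new ip = new base + offset.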

As suggested by Andy, return -EINVAL for an mremap that splits the
vdso image.

Rename and move the text_mapping structure declaration inside
map_vdso, as it is used only there and it now complements the
vvar_mapping variable.

There is still a problem with remapping the vdso in glibc applications:
the linker relocates addresses for syscalls on the vdso page, so
you need to relink with the new addresses. Otherwise the next syscall
through glibc may fail:
  Program received signal SIGSEGV, Segmentation fault.
  #0  0xf7fd9b80 in __kernel_vsyscall ()
  #1  0xf7ec8238 in _exit () from /usr/lib32/libc.so.6

Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: linux-mm@kvack.org
Acked-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/entry/vdso/vma.c | 47 ++++++++++++++++++++++++++++++++++++++++++-----
 include/linux/mm_types.h  |  3 +++
 mm/mmap.c                 | 10 ++++++++++
 3 files changed, 55 insertions(+), 5 deletions(-)

diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index ab220ac9b3b9..3329844e3c43 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -12,6 +12,7 @@
 #include <linux/random.h>
 #include <linux/elf.h>
 #include <linux/cpu.h>
+#include <linux/ptrace.h>
 #include <asm/pvclock.h>
 #include <asm/vgtod.h>
 #include <asm/proto.h>
@@ -97,10 +98,40 @@ static int vdso_fault(const struct vm_special_mapping *sm,
 	return 0;
 }
 
-static const struct vm_special_mapping text_mapping = {
-	.name = "[vdso]",
-	.fault = vdso_fault,
-};
+static void vdso_fix_landing(const struct vdso_image *image,
+		struct vm_area_struct *new_vma)
+{
+#if defined CONFIG_X86_32 || defined CONFIG_IA32_EMULATION
+	if (in_ia32_syscall() && image == &vdso_image_32) {
+		struct pt_regs *regs = current_pt_regs();
+		unsigned long vdso_land = image->sym_int80_landing_pad;
+		unsigned long old_land_addr = vdso_land +
+			(unsigned long)current->mm->context.vdso;
+
+		/* Fixing userspace landing - look at do_fast_syscall_32 */
+		if (regs->ip == old_land_addr)
+			regs->ip = new_vma->vm_start + vdso_land;
+	}
+#endif
+}
+
+static int vdso_mremap(const struct vm_special_mapping *sm,
+		struct vm_area_struct *new_vma)
+{
+	unsigned long new_size = new_vma->vm_end - new_vma->vm_start;
+	const struct vdso_image *image = current->mm->context.vdso_image;
+
+	if (image->size != new_size)
+		return -EINVAL;
+
+	if (WARN_ON_ONCE(current->mm != new_vma->vm_mm))
+		return -EFAULT;
+
+	vdso_fix_landing(image, new_vma);
+	current->mm->context.vdso = (void __user *)new_vma->vm_start;
+
+	return 0;
+}
 
 static int vvar_fault(const struct vm_special_mapping *sm,
 		      struct vm_area_struct *vma, struct vm_fault *vmf)
@@ -151,6 +182,12 @@ static int map_vdso(const struct vdso_image *image, bool calculate_addr)
 	struct vm_area_struct *vma;
 	unsigned long addr, text_start;
 	int ret = 0;
+
+	static const struct vm_special_mapping vdso_mapping = {
+		.name = "[vdso]",
+		.fault = vdso_fault,
+		.mremap = vdso_mremap,
+	};
 	static const struct vm_special_mapping vvar_mapping = {
 		.name = "[vvar]",
 		.fault = vvar_fault,
@@ -185,7 +222,7 @@ static int map_vdso(const struct vdso_image *image, bool calculate_addr)
 				       image->size,
 				       VM_READ|VM_EXEC|
 				       VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC,
-				       &text_mapping);
+				       &vdso_mapping);
 
 	if (IS_ERR(vma)) {
 		ret = PTR_ERR(vma);
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index e093e1d3285b..79472b22d23f 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -595,6 +595,9 @@ struct vm_special_mapping {
 	int (*fault)(const struct vm_special_mapping *sm,
 		     struct vm_area_struct *vma,
 		     struct vm_fault *vmf);
+
+	int (*mremap)(const struct vm_special_mapping *sm,
+		     struct vm_area_struct *new_vma);
 };
 
 enum tlb_flush_reason {
diff --git a/mm/mmap.c b/mm/mmap.c
index 25c2b4e0fbdc..86b18f334f4f 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2961,9 +2961,19 @@ static const char *special_mapping_name(struct vm_area_struct *vma)
 	return ((struct vm_special_mapping *)vma->vm_private_data)->name;
 }
 
+static int special_mapping_mremap(struct vm_area_struct *new_vma)
+{
+	struct vm_special_mapping *sm = new_vma->vm_private_data;
+
+	if (sm->mremap)
+		return sm->mremap(sm, new_vma);
+	return 0;
+}
+
 static const struct vm_operations_struct special_mapping_vmops = {
 	.close = special_mapping_close,
 	.fault = special_mapping_fault,
+	.mremap = special_mapping_mremap,
 	.name = special_mapping_name,
 };
 
-- 
2.9.0


* [PATCHv10 2/2] selftest/x86: add mremap vdso test
  2016-06-28 11:35 [PATCHv10 0/2] mremap vDSO for 32-bit Dmitry Safonov
  2016-06-28 11:35 ` [PATCHv10 1/2] x86/vdso: add mremap hook to vm_special_mapping Dmitry Safonov
@ 2016-06-28 11:35 ` Dmitry Safonov
  2016-07-08 12:17   ` Ingo Molnar
  1 sibling, 1 reply; 5+ messages in thread
From: Dmitry Safonov @ 2016-06-28 11:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: 0x7f454c46, mingo, luto, linux-mm, Dmitry Safonov,
	Thomas Gleixner, H. Peter Anvin, Shuah Khan, x86,
	linux-kselftest

Should print on success:
[root@localhost ~]# ./test_mremap_vdso_32
	AT_SYSINFO_EHDR is 0xf773f000
[NOTE]	Moving vDSO: [f773f000, f7740000] -> [a000000, a001000]
[OK]

Or print that mremap for vDSO is unsupported:
[root@localhost ~]# ./test_mremap_vdso_32
	AT_SYSINFO_EHDR is 0xf773c000
[NOTE]	Moving vDSO: [0xf773c000, 0xf773d000] -> [0xf7737000, 0xf7738000]
[FAIL]	mremap() of the vDSO does not work on this kernel!
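
The test finds the vDSO size by retrying with one more page each time
mremap() returns EINVAL. An alternative, shown only as a sketch and not
part of this patch, is to read the range from a maps-style file such as
/proc/self/maps. The helper name find_vdso_range() is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Scan a maps-style stream (same layout as /proc/self/maps) for the
 * "[vdso]" line. Returns 0 and fills start/end on success, -1 otherwise.
 */
static int find_vdso_range(FILE *maps, unsigned long *start,
			   unsigned long *end)
{
	char line[512];

	assert(maps && start && end);

	while (fgets(line, sizeof(line), maps)) {
		if (!strstr(line, "[vdso]"))
			continue;
		/* Each line begins with "<start>-<end>" in hex. */
		if (sscanf(line, "%lx-%lx", start, end) == 2 && *start < *end)
			return 0;
	}
	return -1;
}
```

A caller would pass fopen("/proc/self/maps", "r") and use end - start
as the size for mremap().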

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Cc: x86@kernel.org
Cc: linux-kselftest@vger.kernel.org
Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
---
 tools/testing/selftests/x86/Makefile           |   3 +-
 tools/testing/selftests/x86/test_mremap_vdso.c | 111 +++++++++++++++++++++++++
 2 files changed, 113 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/x86/test_mremap_vdso.c

diff --git a/tools/testing/selftests/x86/Makefile b/tools/testing/selftests/x86/Makefile
index abe9c35c1cb6..c701349d4920 100644
--- a/tools/testing/selftests/x86/Makefile
+++ b/tools/testing/selftests/x86/Makefile
@@ -5,7 +5,8 @@ include ../lib.mk
 .PHONY: all all_32 all_64 warn_32bit_failure clean
 
 TARGETS_C_BOTHBITS := single_step_syscall sysret_ss_attrs syscall_nt ptrace_syscall \
-			check_initial_reg_state sigreturn ldt_gdt iopl mpx-mini-test
+			check_initial_reg_state sigreturn ldt_gdt iopl mpx-mini-test \
+			test_mremap_vdso
 TARGETS_C_32BIT_ONLY := entry_from_vm86 syscall_arg_fault test_syscall_vdso unwind_vdso \
 			test_FCMOV test_FCOMI test_FISTTP \
 			vdso_restorer
diff --git a/tools/testing/selftests/x86/test_mremap_vdso.c b/tools/testing/selftests/x86/test_mremap_vdso.c
new file mode 100644
index 000000000000..a489a2410664
--- /dev/null
+++ b/tools/testing/selftests/x86/test_mremap_vdso.c
@@ -0,0 +1,111 @@
+/*
+ * 32-bit test to check vdso mremap.
+ *
+ * Copyright (c) 2016 Dmitry Safonov
+ * Suggested-by: Andrew Lutomirski
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+/*
+ * Can be built statically:
+ * gcc -Os -Wall -static -m32 test_mremap_vdso.c
+ */
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <errno.h>
+#include <unistd.h>
+#include <string.h>
+
+#include <sys/mman.h>
+#include <sys/auxv.h>
+#include <sys/syscall.h>
+#include <sys/wait.h>
+
+#define PAGE_SIZE	4096
+
+static int try_to_remap(void *vdso_addr, unsigned long size)
+{
+	void *dest_addr, *new_addr;
+
+	/* Searching for memory location where to remap */
+	dest_addr = mmap(0, size, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
+	if (dest_addr == MAP_FAILED) {
+		printf("[WARN]\tmmap failed (%d): %m\n", errno);
+		return 0;
+	}
+
+	printf("[NOTE]\tMoving vDSO: [%p, %#lx] -> [%p, %#lx]\n",
+		vdso_addr, (unsigned long)vdso_addr + size,
+		dest_addr, (unsigned long)dest_addr + size);
+	fflush(stdout);
+
+	new_addr = mremap(vdso_addr, size, size,
+			MREMAP_FIXED|MREMAP_MAYMOVE, dest_addr);
+	if ((unsigned long)new_addr == (unsigned long)-1) {
+		munmap(dest_addr, size);
+		if (errno == EINVAL) {
+			printf("[NOTE]\tvDSO partial move failed, will try with bigger size\n");
+			return -1; /* Retry with larger */
+		}
+		printf("[FAIL]\tmremap failed (%d): %m\n", errno);
+		return 1;
+	}
+
+	return 0;
+
+}
+
+int main(int argc, char **argv, char **envp)
+{
+	pid_t child;
+
+	child = fork();
+	if (child == -1) {
+		printf("[WARN]\tfailed to fork (%d): %m\n", errno);
+		return 1;
+	}
+
+	if (child == 0) {
+		unsigned long vdso_size = PAGE_SIZE;
+		unsigned long auxval;
+		int ret = -1;
+
+		auxval = getauxval(AT_SYSINFO_EHDR);
+		printf("\tAT_SYSINFO_EHDR is %#lx\n", auxval);
+		if (!auxval || auxval == -ENOENT) {
+			printf("[WARN]\tgetauxval failed\n");
+			return 0;
+		}
+
+		/* Simpler than parsing ELF header */
+		while (ret < 0) {
+			ret = try_to_remap((void *)auxval, vdso_size);
+			vdso_size += PAGE_SIZE;
+		}
+
+		/* Glibc is likely to explode now - exit with raw syscall */
+		asm volatile ("int $0x80" : : "a" (__NR_exit), "b" (!!ret));
+	} else {
+		int status;
+
+		if (waitpid(child, &status, 0) != child ||
+			!WIFEXITED(status)) {
+			printf("[FAIL]\tmremap() of the vDSO does not work on this kernel!\n");
+			return 1;
+		} else if (WEXITSTATUS(status) != 0) {
+			printf("[FAIL]\tChild failed with %d\n",
+					WEXITSTATUS(status));
+			return 1;
+		}
+		printf("[OK]\n");
+	}
+
+	return 0;
+}
-- 
2.9.0


* Re: [PATCHv10 1/2] x86/vdso: add mremap hook to vm_special_mapping
  2016-06-28 11:35 ` [PATCHv10 1/2] x86/vdso: add mremap hook to vm_special_mapping Dmitry Safonov
@ 2016-07-06 14:03   ` Andy Lutomirski
  0 siblings, 0 replies; 5+ messages in thread
From: Andy Lutomirski @ 2016-07-06 14:03 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: linux-kernel, Dmitry Safonov, Ingo Molnar, Andrew Lutomirski,
	linux-mm, Thomas Gleixner, H. Peter Anvin, X86 ML

On Tue, Jun 28, 2016 at 4:35 AM, Dmitry Safonov <dsafonov@virtuozzo.com> wrote:
> Add possibility for userspace 32-bit applications to move
> vdso mapping. Previously, when userspace app called
> mremap for vdso, in return path it would land on previous
> address of vdso page, resulting in segmentation violation.
> Now it lands fine and returns to userspace with remapped vdso.
> This will also fix context.vdso pointer for 64-bit, which does not
> affect the user of vdso after mremap by now, but this may change.
>
> As suggested by Andy, return EINVAL for mremap that splits vdso image.
>
> Renamed and moved text_mapping structure declaration inside
> map_vdso, as it used only there and now it complement
> vvar_mapping variable.
>
> There is still problem for remapping vdso in glibc applications:
> linker relocates addresses for syscalls on vdso page, so
> you need to relink with the new addresses. Or the next syscall
> through glibc may fail:
>   Program received signal SIGSEGV, Segmentation fault.
>   #0  0xf7fd9b80 in __kernel_vsyscall ()
>   #1  0xf7ec8238 in _exit () from /usr/lib32/libc.so.6

Acked-by: Andy Lutomirski <luto@kernel.org>


* Re: [PATCHv10 2/2] selftest/x86: add mremap vdso test
  2016-06-28 11:35 ` [PATCHv10 2/2] selftest/x86: add mremap vdso test Dmitry Safonov
@ 2016-07-08 12:17   ` Ingo Molnar
  0 siblings, 0 replies; 5+ messages in thread
From: Ingo Molnar @ 2016-07-08 12:17 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: linux-kernel, 0x7f454c46, mingo, luto, linux-mm, Thomas Gleixner,
	H. Peter Anvin, Shuah Khan, x86, linux-kselftest


* Dmitry Safonov <dsafonov@virtuozzo.com> wrote:

> Or print that mremap for vDSO is unsupported:
> [root@localhost ~]# ./test_mremap_vdso_32
> 	AT_SYSINFO_EHDR is 0xf773c000
> [NOTE]	Moving vDSO: [0xf773c000, 0xf773d000] -> [0xf7737000, 0xf7738000]
> [FAIL]	mremap() of the vDSO does not work on this kernel!

Hm, I tried this on a 64-bit kernel and got:

triton:~/tip/tools/testing/selftests/x86> ./test_mremap_vdso_32 
        AT_SYSINFO_EHDR is 0xf7773000
[NOTE]  Moving vDSO: [0xf7773000, 0xf7774000] -> [0xf776e000, 0xf776f000]
Segmentation fault

Thanks,

	Ingo
