* [PATCHv4 0/6] x86: 32-bit compatible C/R on x86_64
From: Dmitry Safonov @ 2016-08-31 13:59 UTC
  To: linux-kernel
  Cc: 0x7f454c46, luto, oleg, tglx, hpa, mingo, linux-mm, x86,
	gorcunov, xemul, Dmitry Safonov

Changes from v3:
- added proper ifdefs around vdso_image_32
- added the previously missed Reviewed-by tag

Changes from v2:
- reworked the map_vdso() part following Andy's suggestions
- int arch_prctl(ARCH_MAP_VDSO_*, addr) now returns the size of the
  mapped vdso blob on success, which is handy for the subsequent blob
  parsing in userspace
- disallowed mapping two vDSO blobs: as Andy noted,
  __insert_special_mapping may not get all accounting right, which
  could let userspace abuse this API. Return -EEXIST if the process
  already has a vdso blob mapped - this ensures that the caller knows
  what it is doing.
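
As a hedged illustration of the return convention described above, the
following userspace sketch wraps the raw syscall. The wrapper name
map_vdso_blob() is hypothetical; the ARCH_MAP_VDSO_* values are copied
from patch 3/6 of this series and are defined here only if the installed
headers lack them. It assumes a kernel with this series applied and
CONFIG_CHECKPOINT_RESTORE enabled; on other kernels the call is expected
to fail.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Values taken from patch 3/6; defined only if the headers lack them. */
#ifndef ARCH_MAP_VDSO_X32
# define ARCH_MAP_VDSO_X32	0x2001
#endif
#ifndef ARCH_MAP_VDSO_32
# define ARCH_MAP_VDSO_32	0x2002
#endif
#ifndef ARCH_MAP_VDSO_64
# define ARCH_MAP_VDSO_64	0x2003
#endif

/*
 * Hypothetical wrapper: ask the kernel to map the vDSO blob selected by
 * @code at @addr (0 lets the kernel pick a free address).
 * Returns the blob size on success or a negative errno on failure,
 * e.g. -EEXIST while a vdso/vvar mapping is still present.
 */
static long map_vdso_blob(int code, unsigned long addr)
{
	/* This sketch only accepts the three codes added by the series. */
	assert(code == ARCH_MAP_VDSO_X32 || code == ARCH_MAP_VDSO_32 ||
	       code == ARCH_MAP_VDSO_64);
#ifdef SYS_arch_prctl
	long ret = syscall(SYS_arch_prctl, code, addr);

	return ret < 0 ? -(long)errno : ret;
#else
	(void)addr;
	return -ENOSYS;	/* arch_prctl is not available on this build target */
#endif
}
```

A caller would presumably unmap the existing vdso/vvar pair first, since
the call fails with -EEXIST otherwise, and then use the returned size to
bound its parsing of the freshly mapped blob.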

Changes from v1:
- killed the PR_REG_SIZE macro, as Oleg suggested
- cleared SA_IA32_ABI|SA_X32_ABI from oact->sa.sa_flags in do_sigaction(),
  as noticed by Oleg
- moved SA_IA32_ABI|SA_X32_ABI out of the uapi header, as those flags
  shouldn't be exposed to user-space

I also reworked CRIU's patches to work with this patch set, rather than
with the first RFC that swapped TIF_IA32 with arch_prctl. For now it still
fails ~10% of the 32-bit tests in CRIU's test suite, ZDTM.
The CRIU branch for this can be viewed at [6], and v3 patches adding
this functionality have been sent to the mailing list [7].

The patch set is based on [3]; until that is applied, it may make the
kbuild test robot unhappy.

Description from v1 [5]:

This patch set is an attempt to add checkpoint/restore
for 32-bit tasks in compatibility mode on x86_64 hosts.

Restore in CRIU starts from one root restoring process, which
reads info for all threads being restored from image files.
This information is then used to find out which processes
share resources. Later, shared resources are restored by only
one process, and all others inherit them.
After that it calls clone(), and the new threads restore their
properties in parallel. Those threads inherit all of the parent's
mappings and fetch properties from those mappings
(and call clone() themselves if they have children/subthreads). [1]
Then the restorer blob comes into play: it is a PIE binary which
unmaps all VMAs not needed for restoring, maps new VMAs and
finalizes the restore with the sigreturn syscall. [2]

To restore a 32-bit task, four things need to be done in the running
x86_64 restorer blob:
a) set the code selector to __USER32_CS (to run 32-bit code);
b) remap the vdso blob from 64-bit to 32-bit.
   This is primarily needed because restore may happen on a different
   kernel, which has a different vDSO image than the one at dump time;
c) if the 32-bit vDSO differs from the dumped image, move it to a free
   place and add jump trampolines to that place;
d) switch the TIF_IA32 flag, so that the kernel knows it is dealing with
   a compatible 32-bit application.

From all this:
a) setting CS can be done from userspace, no patches needed;
b) patches 1-3 add the ability to map different vDSO blobs on an x86 kernel;
c) patches for remapping/moving the 32-bit vDSO blob have been sent earlier
   and seem to be accepted [3];
d) for swapping the TIF_IA32 flag, the discussion with Andy concluded
   that it's better to remove this flag completely.
   Patches 4-6 remove the usage of TIF_IA32 from the ptrace, signal and
   coredump code. This is a rework/resend of the RFC [4].
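
Step c) mentions jump trampolines without showing what one looks like.
A minimal sketch, assuming the simplest possible encoding (a 5-byte x86
"jmp rel32"); the helper names are hypothetical, and CRIU's actual
trampoline code may well use a different instruction sequence:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define JMP_REL32_LEN 5u	/* opcode 0xE9 + 32-bit displacement */

/*
 * Encode "jmp rel32" into @buf as if @buf lived at 32-bit address @from,
 * so that execution continues at @to. The displacement is relative to
 * the end of the instruction and wraps modulo 2^32.
 */
static void encode_jmp_rel32(uint8_t *buf, uint32_t from, uint32_t to)
{
	uint32_t rel = (uint32_t)(to - (from + JMP_REL32_LEN));

	assert(buf != NULL);
	buf[0] = 0xE9;
	/* The displacement is stored little-endian. */
	buf[1] = (uint8_t)(rel & 0xff);
	buf[2] = (uint8_t)((rel >> 8) & 0xff);
	buf[3] = (uint8_t)((rel >> 16) & 0xff);
	buf[4] = (uint8_t)((rel >> 24) & 0xff);
}

/* Inverse of the above: where would the jump stored at @from land? */
static uint32_t jmp_rel32_target(const uint8_t *buf, uint32_t from)
{
	uint32_t rel;

	assert(buf != NULL && buf[0] == 0xE9);
	rel = (uint32_t)buf[1] | ((uint32_t)buf[2] << 8) |
	      ((uint32_t)buf[3] << 16) | ((uint32_t)buf[4] << 24);
	return (uint32_t)(from + JMP_REL32_LEN + rel);
}
```

In the scenario of step c), each entry point of the dumped vDSO would be
overwritten with such a jump to the matching symbol in the newly mapped
blob, so that old return addresses and cached pointers keep working.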

[1] https://criu.org/Checkpoint/Restore#Restore
[2] https://criu.org/Restorer_context
[3] https://lkml.org/lkml/2016/6/28/489
[4] https://lkml.org/lkml/2016/4/25/650
[5] https://lkml.org/lkml/2016/6/1/425
[6] https://github.com/0x7f454c46/criu/tree/compat-4
[7] https://lists.openvz.org/pipermail/criu/2016-June/029788.html

Dmitry Safonov (6):
  x86/vdso: unmap vdso blob on vvar mapping failure
  x86/vdso: replace calculate_addr in map_vdso() with addr
  x86/arch_prctl/vdso: add ARCH_MAP_VDSO_*
  x86/coredump: use pr_reg size, rather than TIF_IA32 flag
  x86/ptrace: down with test_thread_flag(TIF_IA32)
  x86/signal: add SA_{X32,IA32}_ABI sa_flags

 arch/x86/entry/vdso/vma.c         | 81 +++++++++++++++++++++++++++------------
 arch/x86/ia32/ia32_signal.c       |  2 +-
 arch/x86/include/asm/compat.h     |  8 ++--
 arch/x86/include/asm/fpu/signal.h |  6 +++
 arch/x86/include/asm/signal.h     |  4 ++
 arch/x86/include/asm/vdso.h       |  2 +
 arch/x86/include/uapi/asm/prctl.h |  6 +++
 arch/x86/kernel/process_64.c      | 25 ++++++++++++
 arch/x86/kernel/ptrace.c          |  2 +-
 arch/x86/kernel/signal.c          | 20 +++++-----
 arch/x86/kernel/signal_compat.c   | 34 ++++++++++++++--
 fs/binfmt_elf.c                   | 23 ++++-------
 kernel/signal.c                   |  7 ++++
 13 files changed, 162 insertions(+), 58 deletions(-)

-- 
2.9.0


* [PATCHv4 1/6] x86/vdso: unmap vdso blob on vvar mapping failure
From: Dmitry Safonov @ 2016-08-31 13:59 UTC
  To: linux-kernel
  Cc: 0x7f454c46, luto, oleg, tglx, hpa, mingo, linux-mm, x86,
	gorcunov, xemul, Dmitry Safonov

If mapping the vvar area fails while mapping the vDSO blob,
the previously mapped vDSO blob needs to be unmapped.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-mm@kvack.org
Cc: x86@kernel.org
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/entry/vdso/vma.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index f840766659a8..3bab6ba3ffc5 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -238,12 +238,14 @@ static int map_vdso(const struct vdso_image *image, bool calculate_addr)
 
 	if (IS_ERR(vma)) {
 		ret = PTR_ERR(vma);
-		goto up_fail;
+		do_munmap(mm, text_start, image->size);
 	}
 
 up_fail:
-	if (ret)
+	if (ret) {
 		current->mm->context.vdso = NULL;
+		current->mm->context.vdso_image = NULL;
+	}
 
 	up_write(&mm->mmap_sem);
 	return ret;
-- 
2.9.0
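
The rollback idea in patch 1/6 can be sketched in userspace with two
ordinary mmap() calls. This is only an analogy under stated assumptions:
struct blob_pair and map_blob_pair() are hypothetical names, and anonymous
mappings stand in for the kernel's special vdso/vvar mappings.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

struct blob_pair {
	void *text;	/* stands in for the vdso text mapping */
	void *data;	/* stands in for the vvar mapping */
};

/*
 * Map two regions. If the second mapping fails, unmap the first one so
 * that no half-initialized state is left behind, and report the error.
 */
static int map_blob_pair(struct blob_pair *p, size_t text_len, size_t data_len)
{
	void *text, *data;
	int err;

	assert(p != NULL);
	p->text = NULL;
	p->data = NULL;

	text = mmap(NULL, text_len, PROT_READ,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (text == MAP_FAILED)
		return -errno;

	data = mmap(NULL, data_len, PROT_READ,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (data == MAP_FAILED) {
		err = errno;
		munmap(text, text_len);	/* roll back the first mapping */
		return -err;
	}

	p->text = text;
	p->data = data;
	return 0;
}
```

Passing a zero data_len is a simple way to exercise the failure path,
since mmap() rejects zero-length requests.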


* [PATCHv4 2/6] x86/vdso: replace calculate_addr in map_vdso() with addr
From: Dmitry Safonov @ 2016-08-31 13:59 UTC
  To: linux-kernel
  Cc: 0x7f454c46, luto, oleg, tglx, hpa, mingo, linux-mm, x86,
	gorcunov, xemul, Dmitry Safonov

This allows the caller to specify the address at which the vDSO blob
is mapped. For randomized vDSO mappings, introduce map_vdso_randomized(),
which simplifies the calls to map_vdso().

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-mm@kvack.org
Cc: x86@kernel.org
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
---
 arch/x86/entry/vdso/vma.c | 30 +++++++++++++++++-------------
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index 3bab6ba3ffc5..5bcb25a9e573 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -176,11 +176,16 @@ static int vvar_fault(const struct vm_special_mapping *sm,
 	return VM_FAULT_SIGBUS;
 }
 
-static int map_vdso(const struct vdso_image *image, bool calculate_addr)
+/*
+ * Add vdso and vvar mappings to current process.
+ * @image          - blob to map
+ * @addr           - request a specific address (zero to map at free addr)
+ */
+static int map_vdso(const struct vdso_image *image, unsigned long addr)
 {
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma;
-	unsigned long addr, text_start;
+	unsigned long text_start;
 	int ret = 0;
 
 	static const struct vm_special_mapping vdso_mapping = {
@@ -193,13 +198,6 @@ static int map_vdso(const struct vdso_image *image, bool calculate_addr)
 		.fault = vvar_fault,
 	};
 
-	if (calculate_addr) {
-		addr = vdso_addr(current->mm->start_stack,
-				 image->size - image->sym_vvar_start);
-	} else {
-		addr = 0;
-	}
-
 	if (down_write_killable(&mm->mmap_sem))
 		return -EINTR;
 
@@ -251,13 +249,20 @@ up_fail:
 	return ret;
 }
 
+static int map_vdso_randomized(const struct vdso_image *image)
+{
+	unsigned long addr = vdso_addr(current->mm->start_stack,
+				 image->size - image->sym_vvar_start);
+	return map_vdso(image, addr);
+}
+
 #if defined(CONFIG_X86_32) || defined(CONFIG_IA32_EMULATION)
 static int load_vdso32(void)
 {
 	if (vdso32_enabled != 1)  /* Other values all mean "disabled" */
 		return 0;
 
-	return map_vdso(&vdso_image_32, false);
+	return map_vdso(&vdso_image_32, 0);
 }
 #endif
 
@@ -267,7 +272,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 	if (!vdso64_enabled)
 		return 0;
 
-	return map_vdso(&vdso_image_64, true);
+	return map_vdso_randomized(&vdso_image_64);
 }
 
 #ifdef CONFIG_COMPAT
@@ -278,8 +283,7 @@ int compat_arch_setup_additional_pages(struct linux_binprm *bprm,
 	if (test_thread_flag(TIF_X32)) {
 		if (!vdso64_enabled)
 			return 0;
-
-		return map_vdso(&vdso_image_x32, true);
+		return map_vdso_randomized(&vdso_image_x32);
 	}
 #endif
 #ifdef CONFIG_IA32_EMULATION
-- 
2.9.0


* [PATCHv4 3/6] x86/arch_prctl/vdso: add ARCH_MAP_VDSO_*
From: Dmitry Safonov @ 2016-08-31 13:59 UTC
  To: linux-kernel
  Cc: 0x7f454c46, luto, oleg, tglx, hpa, mingo, linux-mm, x86,
	gorcunov, xemul, Dmitry Safonov

Add an API to change the vdso blob type with arch_prctl.
As this is useful only for the needs of CRIU, expose
this interface under CONFIG_CHECKPOINT_RESTORE.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-mm@kvack.org
Cc: x86@kernel.org
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
---
 arch/x86/entry/vdso/vma.c         | 45 ++++++++++++++++++++++++++++++---------
 arch/x86/include/asm/vdso.h       |  2 ++
 arch/x86/include/uapi/asm/prctl.h |  6 ++++++
 arch/x86/kernel/process_64.c      | 25 ++++++++++++++++++++++
 4 files changed, 68 insertions(+), 10 deletions(-)

diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index 5bcb25a9e573..dad2b2d8ff03 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -176,6 +176,16 @@ static int vvar_fault(const struct vm_special_mapping *sm,
 	return VM_FAULT_SIGBUS;
 }
 
+static const struct vm_special_mapping vdso_mapping = {
+	.name = "[vdso]",
+	.fault = vdso_fault,
+	.mremap = vdso_mremap,
+};
+static const struct vm_special_mapping vvar_mapping = {
+	.name = "[vvar]",
+	.fault = vvar_fault,
+};
+
 /*
  * Add vdso and vvar mappings to current process.
  * @image          - blob to map
@@ -188,16 +198,6 @@ static int map_vdso(const struct vdso_image *image, unsigned long addr)
 	unsigned long text_start;
 	int ret = 0;
 
-	static const struct vm_special_mapping vdso_mapping = {
-		.name = "[vdso]",
-		.fault = vdso_fault,
-		.mremap = vdso_mremap,
-	};
-	static const struct vm_special_mapping vvar_mapping = {
-		.name = "[vvar]",
-		.fault = vvar_fault,
-	};
-
 	if (down_write_killable(&mm->mmap_sem))
 		return -EINTR;
 
@@ -256,6 +256,31 @@ static int map_vdso_randomized(const struct vdso_image *image)
 	return map_vdso(image, addr);
 }
 
+int map_vdso_once(const struct vdso_image *image, unsigned long addr)
+{
+	struct mm_struct *mm = current->mm;
+	struct vm_area_struct *vma;
+
+	down_write(&mm->mmap_sem);
+	/*
+	 * Check if we have already mapped vdso blob - fail to prevent
+	 * userspace from abusing install_special_mapping, which may
+	 * not do accounting and rlimit right.
+	 * We could search vma near context.vdso, but it's a slowpath,
+	 * so let's explicitly check all VMAs to be completely sure.
+	 */
+	for (vma = mm->mmap; vma; vma = vma->vm_next) {
+		if (vma->vm_private_data == &vdso_mapping ||
+				vma->vm_private_data == &vvar_mapping) {
+			up_write(&mm->mmap_sem);
+			return -EEXIST;
+		}
+	}
+	up_write(&mm->mmap_sem);
+
+	return map_vdso(image, addr);
+}
+
 #if defined(CONFIG_X86_32) || defined(CONFIG_IA32_EMULATION)
 static int load_vdso32(void)
 {
diff --git a/arch/x86/include/asm/vdso.h b/arch/x86/include/asm/vdso.h
index 43dc55be524e..2444189cbe28 100644
--- a/arch/x86/include/asm/vdso.h
+++ b/arch/x86/include/asm/vdso.h
@@ -41,6 +41,8 @@ extern const struct vdso_image vdso_image_32;
 
 extern void __init init_vdso_image(const struct vdso_image *image);
 
+extern int map_vdso_once(const struct vdso_image *image, unsigned long addr);
+
 #endif /* __ASSEMBLER__ */
 
 #endif /* _ASM_X86_VDSO_H */
diff --git a/arch/x86/include/uapi/asm/prctl.h b/arch/x86/include/uapi/asm/prctl.h
index 3ac5032fae09..ae135de547f5 100644
--- a/arch/x86/include/uapi/asm/prctl.h
+++ b/arch/x86/include/uapi/asm/prctl.h
@@ -6,4 +6,10 @@
 #define ARCH_GET_FS 0x1003
 #define ARCH_GET_GS 0x1004
 
+#ifdef CONFIG_CHECKPOINT_RESTORE
+# define ARCH_MAP_VDSO_X32	0x2001
+# define ARCH_MAP_VDSO_32	0x2002
+# define ARCH_MAP_VDSO_64	0x2003
+#endif
+
 #endif /* _ASM_X86_PRCTL_H */
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 63236d8f84bf..f240a465920b 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -49,6 +49,7 @@
 #include <asm/debugreg.h>
 #include <asm/switch_to.h>
 #include <asm/xen/hypervisor.h>
+#include <asm/vdso.h>
 
 asmlinkage extern void ret_from_fork(void);
 
@@ -524,6 +525,17 @@ void set_personality_ia32(bool x32)
 }
 EXPORT_SYMBOL_GPL(set_personality_ia32);
 
+static long prctl_map_vdso(const struct vdso_image *image, unsigned long addr)
+{
+	int ret;
+
+	ret = map_vdso_once(image, addr);
+	if (ret)
+		return ret;
+
+	return (long)image->size;
+}
+
 long do_arch_prctl(struct task_struct *task, int code, unsigned long addr)
 {
 	int ret = 0;
@@ -577,6 +589,19 @@ long do_arch_prctl(struct task_struct *task, int code, unsigned long addr)
 		break;
 	}
 
+#ifdef CONFIG_CHECKPOINT_RESTORE
+#ifdef CONFIG_X86_X32
+	case ARCH_MAP_VDSO_X32:
+		return prctl_map_vdso(&vdso_image_x32, addr);
+#endif
+#if defined CONFIG_X86_32 || defined CONFIG_IA32_EMULATION
+	case ARCH_MAP_VDSO_32:
+		return prctl_map_vdso(&vdso_image_32, addr);
+#endif
+	case ARCH_MAP_VDSO_64:
+		return prctl_map_vdso(&vdso_image_64, addr);
+#endif
+
 	default:
 		ret = -EINVAL;
 		break;
-- 
2.9.0
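
The "map only once" check added by patch 3/6 can be modelled in plain
userspace C. This is a simplified sketch, not the kernel code: struct vma,
struct special_mapping and both helpers are stand-ins invented for the
illustration, and no locking is modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-ins for struct vm_special_mapping and struct vm_area_struct. */
struct special_mapping {
	const char *name;
};

struct vma {
	const void *private_data;
	struct vma *next;
};

static const struct special_mapping vdso_mapping = { "[vdso]" };
static const struct special_mapping vvar_mapping = { "[vvar]" };

/* Prepend @v to the list starting at @head; returns the new head. */
static struct vma *vma_push(struct vma *v, struct vma *head)
{
	assert(v != NULL);
	v->next = head;
	return v;
}

/*
 * Walk every VMA and refuse with -EEXIST if any of them is already
 * backed by the vdso or vvar special mapping; return 0 otherwise.
 */
static int check_no_vdso_mapped(const struct vma *head)
{
	const struct vma *v;

	for (v = head; v; v = v->next) {
		if (v->private_data == &vdso_mapping ||
		    v->private_data == &vvar_mapping)
			return -EEXIST;
	}
	return 0;
}
```

Comparing the private data pointer against the addresses of the two
static descriptors is what lets the check recognize a vdso or vvar area
regardless of where it was moved in the address space.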


* [PATCHv4 4/6] x86/coredump: use pr_reg size, rather than TIF_IA32 flag
From: Dmitry Safonov @ 2016-08-31 13:59 UTC
  To: linux-kernel
  Cc: 0x7f454c46, luto, oleg, tglx, hpa, mingo, linux-mm, x86,
	gorcunov, xemul, Dmitry Safonov

Kill the PR_REG_SIZE and PR_REG_PTR macros, as the regset size can be
obtained from the regset view.
I wish I could also kill PRSTATUS_SIZE nicely.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-mm@kvack.org
Cc: x86@kernel.org
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
---
 arch/x86/include/asm/compat.h |  8 ++++----
 fs/binfmt_elf.c               | 23 ++++++++---------------
 2 files changed, 12 insertions(+), 19 deletions(-)

diff --git a/arch/x86/include/asm/compat.h b/arch/x86/include/asm/compat.h
index a18806165fe4..03d269bed941 100644
--- a/arch/x86/include/asm/compat.h
+++ b/arch/x86/include/asm/compat.h
@@ -275,10 +275,10 @@ struct compat_shmid64_ds {
 #ifdef CONFIG_X86_X32_ABI
 typedef struct user_regs_struct compat_elf_gregset_t;
 
-#define PR_REG_SIZE(S) (test_thread_flag(TIF_IA32) ? 68 : 216)
-#define PRSTATUS_SIZE(S) (test_thread_flag(TIF_IA32) ? 144 : 296)
-#define SET_PR_FPVALID(S,V) \
-  do { *(int *) (((void *) &((S)->pr_reg)) + PR_REG_SIZE(0)) = (V); } \
+/* Full regset -- prstatus on x32, otherwise on ia32 */
+#define PRSTATUS_SIZE(S, R) (R != sizeof(S.pr_reg) ? 144 : 296)
+#define SET_PR_FPVALID(S, V, R) \
+  do { *(int *) (((void *) &((S)->pr_reg)) + R) = (V); } \
   while (0)
 
 #define COMPAT_USE_64BIT_TIME \
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 7f6aff3f72eb..8533aaaba2d2 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1624,20 +1624,12 @@ static void do_thread_regset_writeback(struct task_struct *task,
 		regset->writeback(task, regset, 1);
 }
 
-#ifndef PR_REG_SIZE
-#define PR_REG_SIZE(S) sizeof(S)
-#endif
-
 #ifndef PRSTATUS_SIZE
-#define PRSTATUS_SIZE(S) sizeof(S)
-#endif
-
-#ifndef PR_REG_PTR
-#define PR_REG_PTR(S) (&((S)->pr_reg))
+#define PRSTATUS_SIZE(S, R) sizeof(S)
 #endif
 
 #ifndef SET_PR_FPVALID
-#define SET_PR_FPVALID(S, V) ((S)->pr_fpvalid = (V))
+#define SET_PR_FPVALID(S, V, R) ((S)->pr_fpvalid = (V))
 #endif
 
 static int fill_thread_core_info(struct elf_thread_core_info *t,
@@ -1645,6 +1637,7 @@ static int fill_thread_core_info(struct elf_thread_core_info *t,
 				 long signr, size_t *total)
 {
 	unsigned int i;
+	unsigned int regset_size = view->regsets[0].n * view->regsets[0].size;
 
 	/*
 	 * NT_PRSTATUS is the one special case, because the regset data
@@ -1653,12 +1646,11 @@ static int fill_thread_core_info(struct elf_thread_core_info *t,
 	 * We assume that regset 0 is NT_PRSTATUS.
 	 */
 	fill_prstatus(&t->prstatus, t->task, signr);
-	(void) view->regsets[0].get(t->task, &view->regsets[0],
-				    0, PR_REG_SIZE(t->prstatus.pr_reg),
-				    PR_REG_PTR(&t->prstatus), NULL);
+	(void) view->regsets[0].get(t->task, &view->regsets[0], 0, regset_size,
+				    &t->prstatus.pr_reg, NULL);
 
 	fill_note(&t->notes[0], "CORE", NT_PRSTATUS,
-		  PRSTATUS_SIZE(t->prstatus), &t->prstatus);
+		  PRSTATUS_SIZE(t->prstatus, regset_size), &t->prstatus);
 	*total += notesize(&t->notes[0]);
 
 	do_thread_regset_writeback(t->task, &view->regsets[0]);
@@ -1688,7 +1680,8 @@ static int fill_thread_core_info(struct elf_thread_core_info *t,
 						  regset->core_note_type,
 						  size, data);
 				else {
-					SET_PR_FPVALID(&t->prstatus, 1);
+					SET_PR_FPVALID(&t->prstatus,
+							1, regset_size);
 					fill_note(&t->notes[i], "CORE",
 						  NT_PRFPREG, size, data);
 				}
-- 
2.9.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCHv4 5/6] x86/ptrace: down with test_thread_flag(TIF_IA32)
  2016-08-31 13:59 ` Dmitry Safonov
@ 2016-08-31 13:59   ` Dmitry Safonov
  -1 siblings, 0 replies; 40+ messages in thread
From: Dmitry Safonov @ 2016-08-31 13:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: 0x7f454c46, luto, oleg, tglx, hpa, mingo, linux-mm, x86,
	gorcunov, xemul, Dmitry Safonov, Pedro Alves

As the task isn't executing at the moment of {GET,SET}REGS,
return the regset that corresponds to the code selector, rather than
to the value of the TIF_IA32 flag.
I.e. if we ptrace an i386 ELF binary that has just changed its
code selector to __USER_CS, then GET_REGS will return the
full x86_64 register set.

Note that this will work only if the application has changed its CS.
If the application does a 32-bit syscall with __USER_CS, ptrace
will still return the 64-bit register set, which might still be confusing
for tools that expect TS_COMPAT to be exposed [1, 2].

So this change should make PTRACE_GETREGSET more reliable, and
it is another step towards dropping the TIF_{IA32,X32} flags.

[1]: https://sourceforge.net/p/strace/mailman/message/30471411/
[2]: https://lkml.org/lkml/2012/1/18/320

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-mm@kvack.org
Cc: x86@kernel.org
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
---
 arch/x86/kernel/ptrace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index f79576a541ff..ad0bab8fc594 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -1358,7 +1358,7 @@ void update_regset_xstate_info(unsigned int size, u64 xstate_mask)
 const struct user_regset_view *task_user_regset_view(struct task_struct *task)
 {
 #ifdef CONFIG_IA32_EMULATION
-	if (test_tsk_thread_flag(task, TIF_IA32))
+	if (!user_64bit_mode(task_pt_regs(task)))
 #endif
 #if defined CONFIG_X86_32 || defined CONFIG_IA32_EMULATION
 		return &user_x86_32_view;
-- 
2.9.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCHv4 6/6] x86/signal: add SA_{X32,IA32}_ABI sa_flags
  2016-08-31 13:59 ` Dmitry Safonov
@ 2016-08-31 13:59   ` Dmitry Safonov
  -1 siblings, 0 replies; 40+ messages in thread
From: Dmitry Safonov @ 2016-08-31 13:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: 0x7f454c46, luto, oleg, tglx, hpa, mingo, linux-mm, x86,
	gorcunov, xemul, Dmitry Safonov

Introduce new flags that define which ABI to use when creating a sigframe.
The kernel sets those flags according to the ABI of the sigaction syscall
that installed the handler for the signal being delivered.

This drops the dependency on the TIF_IA32/TIF_X32 flags on signal delivery.
The new flags are used only under CONFIG_COMPAT.

ARM uses sa_flags in a similar way to select the mode in which a signal is
delivered to 26-bit applications (look at SA_THIRTYTWO).

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-mm@kvack.org
Cc: x86@kernel.org
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/ia32/ia32_signal.c       |  2 +-
 arch/x86/include/asm/fpu/signal.h |  6 ++++++
 arch/x86/include/asm/signal.h     |  4 ++++
 arch/x86/kernel/signal.c          | 20 +++++++++++---------
 arch/x86/kernel/signal_compat.c   | 34 +++++++++++++++++++++++++++++++---
 kernel/signal.c                   |  7 +++++++
 6 files changed, 60 insertions(+), 13 deletions(-)

diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c
index 2f29f4e407c3..cb13c0564ea7 100644
--- a/arch/x86/ia32/ia32_signal.c
+++ b/arch/x86/ia32/ia32_signal.c
@@ -378,7 +378,7 @@ int ia32_setup_rt_frame(int sig, struct ksignal *ksig,
 		put_user_ex(*((u64 *)&code), (u64 __user *)frame->retcode);
 	} put_user_catch(err);
 
-	err |= copy_siginfo_to_user32(&frame->info, &ksig->info);
+	err |= __copy_siginfo_to_user32(&frame->info, &ksig->info, false);
 	err |= ia32_setup_sigcontext(&frame->uc.uc_mcontext, fpstate,
 				     regs, set->sig[0]);
 	err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set));
diff --git a/arch/x86/include/asm/fpu/signal.h b/arch/x86/include/asm/fpu/signal.h
index 0e970d00dfcd..20a1fbf7fe4e 100644
--- a/arch/x86/include/asm/fpu/signal.h
+++ b/arch/x86/include/asm/fpu/signal.h
@@ -19,6 +19,12 @@ int ia32_setup_frame(int sig, struct ksignal *ksig,
 # define ia32_setup_rt_frame	__setup_rt_frame
 #endif
 
+#ifdef CONFIG_COMPAT
+int __copy_siginfo_to_user32(compat_siginfo_t __user *to,
+		const siginfo_t *from, bool x32_ABI);
+#endif
+
+
 extern void convert_from_fxsr(struct user_i387_ia32_struct *env,
 			      struct task_struct *tsk);
 extern void convert_to_fxsr(struct task_struct *tsk,
diff --git a/arch/x86/include/asm/signal.h b/arch/x86/include/asm/signal.h
index dd1e7d6387ab..8af22be0fe61 100644
--- a/arch/x86/include/asm/signal.h
+++ b/arch/x86/include/asm/signal.h
@@ -23,6 +23,10 @@ typedef struct {
 	unsigned long sig[_NSIG_WORDS];
 } sigset_t;
 
+/* non-uapi in-kernel SA_FLAGS for those indicates ABI for a signal frame */
+#define SA_IA32_ABI	0x02000000u
+#define SA_X32_ABI	0x01000000u
+
 #ifndef CONFIG_COMPAT
 typedef sigset_t compat_sigset_t;
 #endif
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 04cb3212db2d..b1a5d252d482 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -42,6 +42,7 @@
 #include <asm/syscalls.h>
 
 #include <asm/sigframe.h>
+#include <asm/signal.h>
 
 #define COPY(x)			do {			\
 	get_user_ex(regs->x, &sc->x);			\
@@ -547,7 +548,7 @@ static int x32_setup_rt_frame(struct ksignal *ksig,
 		return -EFAULT;
 
 	if (ksig->ka.sa.sa_flags & SA_SIGINFO) {
-		if (copy_siginfo_to_user32(&frame->info, &ksig->info))
+		if (__copy_siginfo_to_user32(&frame->info, &ksig->info, true))
 			return -EFAULT;
 	}
 
@@ -660,20 +661,21 @@ badframe:
 	return 0;
 }
 
-static inline int is_ia32_compat_frame(void)
+static inline int is_ia32_compat_frame(struct ksignal *ksig)
 {
 	return IS_ENABLED(CONFIG_IA32_EMULATION) &&
-	       test_thread_flag(TIF_IA32);
+		ksig->ka.sa.sa_flags & SA_IA32_ABI;
 }
 
-static inline int is_ia32_frame(void)
+static inline int is_ia32_frame(struct ksignal *ksig)
 {
-	return IS_ENABLED(CONFIG_X86_32) || is_ia32_compat_frame();
+	return IS_ENABLED(CONFIG_X86_32) || is_ia32_compat_frame(ksig);
 }
 
-static inline int is_x32_frame(void)
+static inline int is_x32_frame(struct ksignal *ksig)
 {
-	return IS_ENABLED(CONFIG_X86_X32_ABI) && test_thread_flag(TIF_X32);
+	return IS_ENABLED(CONFIG_X86_X32_ABI) &&
+		ksig->ka.sa.sa_flags & SA_X32_ABI;
 }
 
 static int
@@ -684,12 +686,12 @@ setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs)
 	compat_sigset_t *cset = (compat_sigset_t *) set;
 
 	/* Set up the stack frame */
-	if (is_ia32_frame()) {
+	if (is_ia32_frame(ksig)) {
 		if (ksig->ka.sa.sa_flags & SA_SIGINFO)
 			return ia32_setup_rt_frame(usig, ksig, cset, regs);
 		else
 			return ia32_setup_frame(usig, ksig, cset, regs);
-	} else if (is_x32_frame()) {
+	} else if (is_x32_frame(ksig)) {
 		return x32_setup_rt_frame(ksig, cset, regs);
 	} else {
 		return __setup_rt_frame(ksig->sig, ksig, set, regs);
diff --git a/arch/x86/kernel/signal_compat.c b/arch/x86/kernel/signal_compat.c
index b44564bf86a8..40df33753bae 100644
--- a/arch/x86/kernel/signal_compat.c
+++ b/arch/x86/kernel/signal_compat.c
@@ -1,5 +1,6 @@
 #include <linux/compat.h>
 #include <linux/uaccess.h>
+#include <linux/ptrace.h>
 
 /*
  * The compat_siginfo_t structure and handing code is very easy
@@ -92,10 +93,31 @@ static inline void signal_compat_build_tests(void)
 	/* any new si_fields should be added here */
 }
 
-int copy_siginfo_to_user32(compat_siginfo_t __user *to, const siginfo_t *from)
+void sigaction_compat_abi(struct k_sigaction *act, struct k_sigaction *oact)
+{
+	/* Don't leak in-kernel non-uapi flags to user-space */
+	if (oact)
+		oact->sa.sa_flags &= ~(SA_IA32_ABI | SA_X32_ABI);
+
+	if (!act)
+		return;
+
+	/* Don't let flags to be set from userspace */
+	act->sa.sa_flags &= ~(SA_IA32_ABI | SA_X32_ABI);
+
+	if (user_64bit_mode(current_pt_regs()))
+		return;
+
+	if (in_ia32_syscall())
+		act->sa.sa_flags |= SA_IA32_ABI;
+	if (in_x32_syscall())
+		act->sa.sa_flags |= SA_X32_ABI;
+}
+
+int __copy_siginfo_to_user32(compat_siginfo_t __user *to, const siginfo_t *from,
+		bool x32_ABI)
 {
 	int err = 0;
-	bool ia32 = test_thread_flag(TIF_IA32);
 
 	signal_compat_build_tests();
 
@@ -146,7 +168,7 @@ int copy_siginfo_to_user32(compat_siginfo_t __user *to, const siginfo_t *from)
 				put_user_ex(from->si_arch, &to->si_arch);
 				break;
 			case __SI_CHLD >> 16:
-				if (ia32) {
+				if (!x32_ABI) {
 					put_user_ex(from->si_utime, &to->si_utime);
 					put_user_ex(from->si_stime, &to->si_stime);
 				} else {
@@ -180,6 +202,12 @@ int copy_siginfo_to_user32(compat_siginfo_t __user *to, const siginfo_t *from)
 	return err;
 }
 
+/* from syscall's path, where we know the ABI */
+int copy_siginfo_to_user32(compat_siginfo_t __user *to, const siginfo_t *from)
+{
+	return __copy_siginfo_to_user32(to, from, in_x32_syscall());
+}
+
 int copy_siginfo_from_user32(siginfo_t *to, compat_siginfo_t __user *from)
 {
 	int err = 0;
diff --git a/kernel/signal.c b/kernel/signal.c
index af21afc00d08..75761acc77cf 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -3044,6 +3044,11 @@ void kernel_sigaction(int sig, __sighandler_t action)
 }
 EXPORT_SYMBOL(kernel_sigaction);
 
+void __weak sigaction_compat_abi(struct k_sigaction *act,
+		struct k_sigaction *oact)
+{
+}
+
 int do_sigaction(int sig, struct k_sigaction *act, struct k_sigaction *oact)
 {
 	struct task_struct *p = current, *t;
@@ -3059,6 +3064,8 @@ int do_sigaction(int sig, struct k_sigaction *act, struct k_sigaction *oact)
 	if (oact)
 		*oact = *k;
 
+	sigaction_compat_abi(act, oact);
+
 	if (act) {
 		sigdelsetmask(&act->sa.sa_mask,
 			      sigmask(SIGKILL) | sigmask(SIGSTOP));
-- 
2.9.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCHv4 3/6] x86/arch_prctl/vdso: add ARCH_MAP_VDSO_*
  2016-08-31 13:59   ` Dmitry Safonov
@ 2016-08-31 14:04     ` Dmitry Safonov
  -1 siblings, 0 replies; 40+ messages in thread
From: Dmitry Safonov @ 2016-08-31 14:04 UTC (permalink / raw)
  To: Dmitry Safonov, Andy Lutomirski
  Cc: linux-kernel, Oleg Nesterov, Thomas Gleixner, H. Peter Anvin,
	Ingo Molnar, linux-mm, X86 ML, Cyrill Gorcunov, Pavel Emelyanov

Hi Andy,
can I have your acks for patches 2-3, or should I fix something else
in those patches?

2016-08-31 16:59 GMT+03:00 Dmitry Safonov <dsafonov@virtuozzo.com>:
> Add API to change vdso blob type with arch_prctl.
> As this is useful only for the needs of CRIU, expose
> this interface under CONFIG_CHECKPOINT_RESTORE.
>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: linux-mm@kvack.org
> Cc: x86@kernel.org
> Cc: Cyrill Gorcunov <gorcunov@openvz.org>
> Cc: Pavel Emelyanov <xemul@virtuozzo.com>
> Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>

Thanks,
             Dmitry

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCHv4 6/6] x86/signal: add SA_{X32,IA32}_ABI sa_flags
  2016-08-31 13:59   ` Dmitry Safonov
@ 2016-08-31 14:07     ` Dmitry Safonov
  -1 siblings, 0 replies; 40+ messages in thread
From: Dmitry Safonov @ 2016-08-31 14:07 UTC (permalink / raw)
  To: Dmitry Safonov, Oleg Nesterov
  Cc: linux-kernel, Andy Lutomirski, Thomas Gleixner, H. Peter Anvin,
	Ingo Molnar, linux-mm, X86 ML, Cyrill Gorcunov, Pavel Emelyanov

Hi Oleg,
can I have your acks or reviewed-by tags for patches 4-6 in the series,
or is there something left to fix?

2016-08-31 16:59 GMT+03:00 Dmitry Safonov <dsafonov@virtuozzo.com>:
> Introduce new flags that define which ABI to use when creating a sigframe.
> The kernel sets those flags according to the ABI of the sigaction syscall
> that installed the handler for the signal being delivered.
>
> This drops the dependency on the TIF_IA32/TIF_X32 flags on signal delivery.
> The new flags are used only under CONFIG_COMPAT.
>
> ARM uses sa_flags in a similar way to select the mode in which a signal is
> delivered to 26-bit applications (look at SA_THIRTYTWO).
>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: linux-mm@kvack.org
> Cc: x86@kernel.org
> Cc: Cyrill Gorcunov <gorcunov@openvz.org>
> Cc: Pavel Emelyanov <xemul@virtuozzo.com>
> Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
> Reviewed-by: Andy Lutomirski <luto@kernel.org>

Thanks,
             Dmitry

^ permalink raw reply	[flat|nested] 40+ messages in thread
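
The flag scheme from the quoted changelog can be modelled in plain C.
This is a userspace sketch under stated assumptions, not the kernel code:
the two flag values are assumed to match the patch, and tag_sa_flags(),
pick_frame_abi() and user_visible_flags() are hypothetical names chosen
for illustration only.

```c
/* Assumed values for the in-kernel flags added by patch 6/6. */
#define SA_IA32_ABI 0x02000000u
#define SA_X32_ABI  0x01000000u

enum syscall_abi { SYSCALL_64, SYSCALL_IA32, SYSCALL_X32 };
enum frame_abi   { FRAME_64, FRAME_IA32, FRAME_X32 };

/* At sigaction() time: record the ABI of the installing syscall. */
static unsigned int tag_sa_flags(unsigned int sa_flags, enum syscall_abi abi)
{
	/* Userspace must not be able to set the internal bits itself. */
	sa_flags &= ~(SA_IA32_ABI | SA_X32_ABI);

	if (abi == SYSCALL_IA32)
		sa_flags |= SA_IA32_ABI;
	else if (abi == SYSCALL_X32)
		sa_flags |= SA_X32_ABI;
	return sa_flags;
}

/* At delivery time: pick the frame layout from the handler's flags. */
static enum frame_abi pick_frame_abi(unsigned int sa_flags)
{
	if (sa_flags & SA_IA32_ABI)
		return FRAME_IA32;
	if (sa_flags & SA_X32_ABI)
		return FRAME_X32;
	return FRAME_64;
}

/* What is reported back in oact: the internal bits are cleared. */
static unsigned int user_visible_flags(unsigned int sa_flags)
{
	return sa_flags & ~(SA_IA32_ABI | SA_X32_ABI);
}
```

In this model the frame layout follows the ABI of the syscall that
installed the handler, so signal delivery no longer needs to look at a
per-thread TIF_IA32/TIF_X32 flag.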

* Re: [PATCHv4 3/6] x86/arch_prctl/vdso: add ARCH_MAP_VDSO_*
  2016-08-31 14:04     ` Dmitry Safonov
@ 2016-08-31 14:56       ` Andy Lutomirski
  -1 siblings, 0 replies; 40+ messages in thread
From: Andy Lutomirski @ 2016-08-31 14:56 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: Dmitry Safonov, Andy Lutomirski, linux-kernel, Oleg Nesterov,
	Thomas Gleixner, H. Peter Anvin, Ingo Molnar, linux-mm, X86 ML,
	Cyrill Gorcunov, Pavel Emelyanov

On Wed, Aug 31, 2016 at 7:04 AM, Dmitry Safonov <0x7f454c46@gmail.com> wrote:
> Hi Andy,
> can I have your acks for 2-3 patches, or should I fix something else
> in those patches?
>
> 2016-08-31 16:59 GMT+03:00 Dmitry Safonov <dsafonov@virtuozzo.com>:
>> Add an API to change the vdso blob type with arch_prctl.
>> As this is useful only for the needs of CRIU, expose
>> this interface under CONFIG_CHECKPOINT_RESTORE.


I thought the vm_file stuff was still being iterated on.  Did I misunderstand?

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCHv4 3/6] x86/arch_prctl/vdso: add ARCH_MAP_VDSO_*
  2016-08-31 14:56       ` Andy Lutomirski
@ 2016-08-31 15:01         ` Dmitry Safonov
  -1 siblings, 0 replies; 40+ messages in thread
From: Dmitry Safonov @ 2016-08-31 15:01 UTC (permalink / raw)
  To: Andy Lutomirski, Dmitry Safonov
  Cc: Andy Lutomirski, linux-kernel, Oleg Nesterov, Thomas Gleixner,
	H. Peter Anvin, Ingo Molnar, linux-mm, X86 ML, Cyrill Gorcunov,
	Pavel Emelyanov

On 08/31/2016 05:56 PM, Andy Lutomirski wrote:
> On Wed, Aug 31, 2016 at 7:04 AM, Dmitry Safonov <0x7f454c46@gmail.com> wrote:
>> Hi Andy,
>> can I have your acks for 2-3 patches, or should I fix something else
>> in those patches?
>>
>> 2016-08-31 16:59 GMT+03:00 Dmitry Safonov <dsafonov@virtuozzo.com>:
>>> Add an API to change the vdso blob type with arch_prctl.
>>> As this is useful only for the needs of CRIU, expose
>>> this interface under CONFIG_CHECKPOINT_RESTORE.
>
>
> I thought the vm_file stuff was still being iterated on.  Did I misunderstand?

Yep, vm_file is being iterated, separately from vdso-map/compatible
patches.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCHv4 3/6] x86/arch_prctl/vdso: add ARCH_MAP_VDSO_*
  2016-08-31 15:01         ` Dmitry Safonov
@ 2016-08-31 15:08           ` Andy Lutomirski
  -1 siblings, 0 replies; 40+ messages in thread
From: Andy Lutomirski @ 2016-08-31 15:08 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: Dmitry Safonov, Andy Lutomirski, linux-kernel, Oleg Nesterov,
	Thomas Gleixner, H. Peter Anvin, Ingo Molnar, linux-mm, X86 ML,
	Cyrill Gorcunov, Pavel Emelyanov

On Wed, Aug 31, 2016 at 8:01 AM, Dmitry Safonov <dsafonov@virtuozzo.com> wrote:
> On 08/31/2016 05:56 PM, Andy Lutomirski wrote:
>>
>> On Wed, Aug 31, 2016 at 7:04 AM, Dmitry Safonov <0x7f454c46@gmail.com>
>> wrote:
>>>
>>> Hi Andy,
>>> can I have your acks for 2-3 patches, or should I fix something else
>>> in those patches?
>>>
>>> 2016-08-31 16:59 GMT+03:00 Dmitry Safonov <dsafonov@virtuozzo.com>:
>>>>
>>>> Add an API to change the vdso blob type with arch_prctl.
>>>> As this is useful only for the needs of CRIU, expose
>>>> this interface under CONFIG_CHECKPOINT_RESTORE.
>>
>>
>>
>> I thought the vm_file stuff was still being iterated on.  Did I
>> misunderstand?
>
>
> Yep, vm_file is being iterated, separately from vdso-map/compatible
> patches.

OK

-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCHv4 2/6] x86/vdso: replace calculate_addr in map_vdso() with addr
  2016-08-31 13:59   ` Dmitry Safonov
@ 2016-08-31 20:00     ` Andy Lutomirski
  -1 siblings, 0 replies; 40+ messages in thread
From: Andy Lutomirski @ 2016-08-31 20:00 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: linux-kernel, Dmitry Safonov, Andrew Lutomirski, Oleg Nesterov,
	Thomas Gleixner, H. Peter Anvin, Ingo Molnar, linux-mm, X86 ML,
	Cyrill Gorcunov, Pavel Emelyanov

On Wed, Aug 31, 2016 at 6:59 AM, Dmitry Safonov <dsafonov@virtuozzo.com> wrote:
> This allows specifying the address at which to map the vDSO blob.
> For randomized vDSO mappings, introduce map_vdso_randomized(),
> which simplifies calls to map_vdso().

Still Acked-by: Andy Lutomirski <luto@kernel.org>

--Andy

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCHv4 0/6] x86: 32-bit compatible C/R on x86_64
  2016-08-31 13:59 ` Dmitry Safonov
@ 2016-09-01  6:18   ` Ingo Molnar
  -1 siblings, 0 replies; 40+ messages in thread
From: Ingo Molnar @ 2016-09-01  6:18 UTC (permalink / raw)
  To: Dmitry Safonov, Andy Lutomirski, Oleg Nesterov, Al Viro, Andrew Morton
  Cc: linux-kernel, 0x7f454c46, luto, oleg, tglx, hpa, mingo, linux-mm,
	x86, gorcunov, xemul


* Dmitry Safonov <dsafonov@virtuozzo.com> wrote:

> Changes from v3:
> - proper ifdefs around vdso_image_32
> - missed Reviewed-by tag

>  arch/x86/entry/vdso/vma.c         | 81 +++++++++++++++++++++++++++------------
>  arch/x86/ia32/ia32_signal.c       |  2 +-
>  arch/x86/include/asm/compat.h     |  8 ++--
>  arch/x86/include/asm/fpu/signal.h |  6 +++
>  arch/x86/include/asm/signal.h     |  4 ++
>  arch/x86/include/asm/vdso.h       |  2 +
>  arch/x86/include/uapi/asm/prctl.h |  6 +++
>  arch/x86/kernel/process_64.c      | 25 ++++++++++++
>  arch/x86/kernel/ptrace.c          |  2 +-
>  arch/x86/kernel/signal.c          | 20 +++++-----
>  arch/x86/kernel/signal_compat.c   | 34 ++++++++++++++--
>  fs/binfmt_elf.c                   | 23 ++++-------
>  kernel/signal.c                   |  7 ++++
>  13 files changed, 162 insertions(+), 58 deletions(-)

Ok, this series looks good to me - does anyone have any objections?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCHv4 0/6] x86: 32-bit compatible C/R on x86_64
  2016-09-01  6:18   ` Ingo Molnar
@ 2016-09-01  8:19     ` Dmitry Safonov
  -1 siblings, 0 replies; 40+ messages in thread
From: Dmitry Safonov @ 2016-09-01  8:19 UTC (permalink / raw)
  To: Ingo Molnar, Andy Lutomirski, Oleg Nesterov, Al Viro, Andrew Morton
  Cc: linux-kernel, 0x7f454c46, tglx, hpa, mingo, linux-mm, x86,
	gorcunov, xemul

On 09/01/2016 09:18 AM, Ingo Molnar wrote:
>
> * Dmitry Safonov <dsafonov@virtuozzo.com> wrote:
>
>> Changes from v3:
>> - proper ifdefs around vdso_image_32
>> - missed Reviewed-by tag
>
>>  arch/x86/entry/vdso/vma.c         | 81 +++++++++++++++++++++++++++------------
>>  arch/x86/ia32/ia32_signal.c       |  2 +-
>>  arch/x86/include/asm/compat.h     |  8 ++--
>>  arch/x86/include/asm/fpu/signal.h |  6 +++
>>  arch/x86/include/asm/signal.h     |  4 ++
>>  arch/x86/include/asm/vdso.h       |  2 +
>>  arch/x86/include/uapi/asm/prctl.h |  6 +++
>>  arch/x86/kernel/process_64.c      | 25 ++++++++++++
>>  arch/x86/kernel/ptrace.c          |  2 +-
>>  arch/x86/kernel/signal.c          | 20 +++++-----
>>  arch/x86/kernel/signal_compat.c   | 34 ++++++++++++++--
>>  fs/binfmt_elf.c                   | 23 ++++-------
>>  kernel/signal.c                   |  7 ++++
>>  13 files changed, 162 insertions(+), 58 deletions(-)
>
> Ok, this series looks good to me - does anyone have any objections?

Thanks, Ingo!

There is a nitpick from Andy about checking both vm_ops and
vm_private_data to avoid (unlikely) confusion with some other VMA
in map_vdso_once().
I'll fix that for the next version, which will be ready to be applied,
if no one has any other objections.

-- 
              Dmitry

^ permalink raw reply	[flat|nested] 40+ messages in thread
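
The nitpick above (match on both vm_ops and vm_private_data before
deciding that a VMA is the vDSO) can be illustrated with a small
userspace model. This is a sketch only: the structures below are
simplified, hypothetical stand-ins for the kernel's, and
check_no_vdso_mapped() is an invented name for the kind of scan that
map_vdso_once() is described as doing before returning -EEXIST.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for kernel structures. */
struct vm_ops { int unused; };
struct special_mapping { const char *name; };

struct vma {
	const struct vm_ops *vm_ops;
	const void *vm_private_data;
	const struct vma *vm_next;
};

static const struct vm_ops special_mapping_vmops = { 0 };
static const struct special_mapping vdso_mapping = { "[vdso]" };
static const struct special_mapping vvar_mapping = { "[vvar]" };

/* True only if BOTH the ops table and the private data identify sm. */
static int vma_is_special_mapping(const struct vma *vma,
				  const struct special_mapping *sm)
{
	return vma->vm_ops == &special_mapping_vmops &&
	       vma->vm_private_data == sm;
}

/* Returns -EEXIST if a vdso or vvar mapping is already present, else 0. */
static int check_no_vdso_mapped(const struct vma *head)
{
	const struct vma *v;

	for (v = head; v; v = v->vm_next)
		if (vma_is_special_mapping(v, &vdso_mapping) ||
		    vma_is_special_mapping(v, &vvar_mapping))
			return -EEXIST;
	return 0;
}
```

Checking only vm_private_data could misfire on an unrelated VMA whose
private pointer happens to have the same value; requiring the ops table
to match as well rules out that (unlikely) case.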

* Re: [PATCHv4 6/6] x86/signal: add SA_{X32,IA32}_ABI sa_flags
  2016-08-31 14:07     ` Dmitry Safonov
@ 2016-09-01 12:27       ` Oleg Nesterov
  -1 siblings, 0 replies; 40+ messages in thread
From: Oleg Nesterov @ 2016-09-01 12:27 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: Dmitry Safonov, linux-kernel, Andy Lutomirski, Thomas Gleixner,
	H. Peter Anvin, Ingo Molnar, linux-mm, X86 ML, Cyrill Gorcunov,
	Pavel Emelyanov

On 08/31, Dmitry Safonov wrote:
>
> Hi Oleg,
> can I have your acks or reviewed-by tags for 4-5-6 patches in the series,
> or there is something left to fix?

Well yes... Although let me repeat, I am not sure I personally like
the very idea of 3/6 and 6/6. But as I already said I do not feel I
understand the problem space enough, so I won't argue.

However, let me ask again. Did you consider another option? Why criu
can't exec a dummy 32-bit binary before anything else?

Oleg.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCHv4 6/6] x86/signal: add SA_{X32,IA32}_ABI sa_flags
  2016-09-01 12:27       ` Oleg Nesterov
@ 2016-09-01 12:45         ` Cyrill Gorcunov
  -1 siblings, 0 replies; 40+ messages in thread
From: Cyrill Gorcunov @ 2016-09-01 12:45 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Dmitry Safonov, Dmitry Safonov, linux-kernel, Andy Lutomirski,
	Thomas Gleixner, H. Peter Anvin, Ingo Molnar, linux-mm, X86 ML,
	Pavel Emelyanov

On Thu, Sep 01, 2016 at 02:27:44PM +0200, Oleg Nesterov wrote:
> > Hi Oleg,
> > can I have your acks or reviewed-by tags for 4-5-6 patches in the series,
> > or there is something left to fix?
> 
> Well yes... Although let me repeat, I am not sure I personally like
> the very idea of 3/6 and 6/6. But as I already said I do not feel I
> understand the problem space enough, so I won't argue.
> 
> However, let me ask again. Did you consider another option? Why criu
> can't exec a dummy 32-bit binary before anything else?

I'm not really sure how this would look then. If I understand you
correctly, you propose to exec a dummy 32-bit binary during the "forking"
stage, where we're recreating the process tree, before anything else. If
so, this implies that we will need two criu engines: one compiled
for 64 bits and a second (otherwise identical) one compiled for 32 bits, no?

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCHv4 6/6] x86/signal: add SA_{X32,IA32}_ABI sa_flags
  2016-09-01 12:45         ` Cyrill Gorcunov
@ 2016-09-01 13:47           ` Dmitry Safonov
  -1 siblings, 0 replies; 40+ messages in thread
From: Dmitry Safonov @ 2016-09-01 13:47 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Oleg Nesterov, Dmitry Safonov, linux-kernel, Andy Lutomirski,
	Thomas Gleixner, H. Peter Anvin, Ingo Molnar, linux-mm, X86 ML,
	Pavel Emelyanov

Thanks for your replies Oleg, Cyrill,

2016-09-01 15:45 GMT+03:00 Cyrill Gorcunov <gorcunov@gmail.com>:
> On Thu, Sep 01, 2016 at 02:27:44PM +0200, Oleg Nesterov wrote:
>> > Hi Oleg,
>> > can I have your acks or reviewed-by tags for 4-5-6 patches in the series,
>> > or there is something left to fix?
>>
>> Well yes... Although let me repeat, I am not sure I personally like
>> the very idea of 3/6 and 6/6. But as I already said I do not feel I
>> understand the problem space enough, so I won't argue.
>>
>> However, let me ask again. Did you consider another option? Why criu
>> can't exec a dummy 32-bit binary before anything else?
>
> I'm not really sure how this would look then. If I understand you
> correctly, you propose to exec a dummy 32-bit binary during the "forking"
> stage, where we're recreating the process tree, before anything else. If
> so, this implies that we will need two criu engines: one compiled
> for 64 bits and a second (otherwise identical) one compiled for 32 bits, no?

Yep, we would then need a full CRIU, but compiled for 32 bits.
And it can get even more complicated, as a 64-bit parent
can have a 32-bit child, which can have a 64-bit child... et cetera.

And the biggest problem with this approach would not be the size of
the code changes to CRIU (which are already quite large with this
patch set) but, AFAICS, a big performance penalty:
we would need to bounce the process tree and the processes' properties
from the parent CRIU to the child CRIU after the exec() call and on down
the process hierarchy, recreating processes while synchronizing
each process's data from the images.

As of now, we already have time-critical problems in CRIU and
we try to reduce the number of system calls, yet it's still slow
in some places. But that approach would lead to:
o exec'ing a different CRIU
o initializing it (i.e., parsing /proc/self/maps to learn its VMAs)
o transferring the process tree, and each process's properties, over IPC
   after exec()
All of that adds up to a large number of syscalls in total.

So, with the current patch set the performance penalty is one call
to arch_prctl() to map the 32-bit vdso blob. It's even smaller, as one
specifies the address at which to map the blob and doesn't need
additional mremap() calls to move the blob to the needed location.
And this arch_prctl() API is visible only under the CHECKPOINT_RESTORE
config option, so it will not bother anyone.

-- 
             Dmitry

^ permalink raw reply	[flat|nested] 40+ messages in thread
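
For readers who want to see the shape of the single arch_prctl() call
discussed above, here is a rough userspace sketch. It is not code from
the series or from CRIU. The value of ARCH_MAP_VDSO_32 is assumed to
match the uapi header once the patches are merged, the helper names
map_vdso_32() and explain_map_vdso() are hypothetical, and the return
convention (blob size on success, -EEXIST if a vDSO is already mapped)
is taken from the cover letter.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumed to match arch/x86/include/uapi/asm/prctl.h after the series. */
#ifndef ARCH_MAP_VDSO_32
#define ARCH_MAP_VDSO_32 0x2002
#endif

/*
 * Hypothetical helper: ask the kernel to map the 32-bit vDSO blob at the
 * requested address. Returns the blob size on success or a negative
 * errno value on failure.
 */
static long map_vdso_32(unsigned long addr)
{
#ifdef SYS_arch_prctl
	long ret = syscall(SYS_arch_prctl, ARCH_MAP_VDSO_32, addr);

	return ret < 0 ? -(long)errno : ret;
#else
	(void)addr;
	return -ENOSYS;	/* arch_prctl() is not available in this build */
#endif
}

/* Hypothetical helper: turn the result into a short reason string. */
static const char *explain_map_vdso(long ret)
{
	if (ret > 0)
		return "mapped";
	if (ret == -EEXIST)
		return "a vDSO blob is already mapped";
	if (ret == -EINVAL || ret == -ENOSYS)
		return "not supported by this kernel or build";
	return "failed";
}
```

In an ordinary process the call should be expected to fail, since a vDSO
is already mapped at exec time; a restorer would presumably unmap the old
blob first and then issue this one call with the address recorded in the
checkpoint image, which is what makes extra mremap() calls unnecessary.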

* Re: [PATCHv4 6/6] x86/signal: add SA_{X32,IA32}_ABI sa_flags
  2016-09-01 13:47           ` Dmitry Safonov
@ 2016-09-01 13:59             ` Cyrill Gorcunov
  -1 siblings, 0 replies; 40+ messages in thread
From: Cyrill Gorcunov @ 2016-09-01 13:59 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: Oleg Nesterov, Dmitry Safonov, linux-kernel, Andy Lutomirski,
	Thomas Gleixner, H. Peter Anvin, Ingo Molnar, linux-mm, X86 ML,
	Pavel Emelyanov

On Thu, Sep 01, 2016 at 04:47:23PM +0300, Dmitry Safonov wrote:
> Thanks for your replies Oleg, Cyrill,
> 
> 2016-09-01 15:45 GMT+03:00 Cyrill Gorcunov <gorcunov@gmail.com>:
> > On Thu, Sep 01, 2016 at 02:27:44PM +0200, Oleg Nesterov wrote:
> >> > Hi Oleg,
> >> > can I have your acks or reviewed-by tags for 4-5-6 patches in the series,
> >> > or there is something left to fix?
> >>
> >> Well yes... Although let me repeat, I am not sure I personally like
> >> the very idea of 3/6 and 6/6. But as I already said I do not feel I
> >> understand the problem space enough, so I won't argue.
> >>
> >> However, let me ask again. Did you consider another option? Why criu
> >> can't exec a dummy 32-bit binary before anything else?
> >
> > I'm not really sure how this would look then. If I understand you
> > correctly, you propose to exec a dummy 32-bit binary during the "forking"
> > stage, where we're recreating the process tree, before anything else. If
> > so, this implies that we will need two criu engines: one compiled
> > for 64 bits and a second (otherwise identical) one compiled for 32 bits, no?
> 
> Yep, we would then need a full CRIU, but compiled for 32 bits.
> And it can get even more complicated, as a 64-bit parent
> can have a 32-bit child, which can have a 64-bit child... et cetera.

Yup, this gonna be a mess, that's why I asked, because I suspect
Oleg meant something else maybe?

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCHv4 6/6] x86/signal: add SA_{X32,IA32}_ABI sa_flags
  2016-09-01 13:47           ` Dmitry Safonov
@ 2016-09-01 16:56             ` Oleg Nesterov
  -1 siblings, 0 replies; 40+ messages in thread
From: Oleg Nesterov @ 2016-09-01 16:56 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: Cyrill Gorcunov, Dmitry Safonov, linux-kernel, Andy Lutomirski,
	Thomas Gleixner, H. Peter Anvin, Ingo Molnar, linux-mm, X86 ML,
	Pavel Emelyanov

On 09/01, Dmitry Safonov wrote:
>
> And the biggest problem in this approach would be not the size of
> code changes to CRIU (which are already quite large with this
> patches set), but AFAICS, it will have big performance penalty:
> we would need to bounce process tree, processes properties
> from parent-CRIU to child-CRIU after exec() call and down on
> the processes hierarchy, recreating processes while synchronizing
> process's data from images.
>
> As for now, we already have time-critical problems in CRIU and
> we try to reduce the number of system calls, while it's still slow
> at some places. But that approach will lead to:
> o exec different CRIU
> o initialize it (i.e., parse /proc/self/maps to know its vmas)
> o transfer process tree, for each process its properties with IPC
>    after exec()
> It will all go for a large number of syscalls in total.

I do not really understand why it has to be so complicated, but
I could easily be wrong.

> And this arch_prctl() API is visible under CHECKPOINT_RESTORE
> config option, so will not bother anyone.

I mostly dislike 6/6. This new feature looks a bit strange to me.

Nevermind, let me repeat once again, I am not trying to argue with
this series. No objections from me.

Oleg.
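
For readers following the cost argument quoted above: the "initialize it"
step refers to a freshly exec()ed restorer re-reading its own memory map.
A minimal sketch of that single step is below. It is not CRIU's code; the
function name count_self_vmas() is hypothetical and the 4096-byte line
buffer is an assumption that covers typical /proc/self/maps entries.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not taken from CRIU): count the VMAs of the
 * calling process by reading /proc/self/maps, one mapping per line.
 * Returns the number of mappings found, or -1 if the file cannot be
 * opened.
 */
static int count_self_vmas(void)
{
	FILE *f = fopen("/proc/self/maps", "r");
	char line[4096];	/* assumed large enough for one entry */
	int n = 0;

	if (!f)
		return -1;

	while (fgets(line, sizeof(line), f)) {
		unsigned long start, end;
		char perms[5];

		/* Each entry begins with "start-end perms", e.g. "00400000-0040c000 r-xp" */
		if (sscanf(line, "%lx-%lx %4s", &start, &end, perms) == 3)
			n++;
	}

	fclose(f);
	return n;
}
```

Even this small helper costs an open, several reads and a close, and in
the exec-based scheme it would be repeated after every exec() in the
restored tree, which is the syscall overhead Dmitry describes.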

end of thread, other threads:[~2016-09-01 21:13 UTC | newest]

Thread overview: 40+ messages
2016-08-31 13:59 [PATCHv4 0/6] x86: 32-bit compatible C/R on x86_64 Dmitry Safonov
2016-08-31 13:59 ` [PATCHv4 1/6] x86/vdso: unmap vdso blob on vvar mapping failure Dmitry Safonov
2016-08-31 13:59 ` [PATCHv4 2/6] x86/vdso: replace calculate_addr in map_vdso() with addr Dmitry Safonov
2016-08-31 20:00   ` Andy Lutomirski
2016-08-31 13:59 ` [PATCHv4 3/6] x86/arch_prctl/vdso: add ARCH_MAP_VDSO_* Dmitry Safonov
2016-08-31 14:04   ` Dmitry Safonov
2016-08-31 14:56     ` Andy Lutomirski
2016-08-31 15:01       ` Dmitry Safonov
2016-08-31 15:08         ` Andy Lutomirski
2016-08-31 13:59 ` [PATCHv4 4/6] x86/coredump: use pr_reg size, rather that TIF_IA32 flag Dmitry Safonov
2016-08-31 13:59 ` [PATCHv4 5/6] x86/ptrace: down with test_thread_flag(TIF_IA32) Dmitry Safonov
2016-08-31 13:59 ` [PATCHv4 6/6] x86/signal: add SA_{X32,IA32}_ABI sa_flags Dmitry Safonov
2016-08-31 14:07   ` Dmitry Safonov
2016-09-01 12:27     ` Oleg Nesterov
2016-09-01 12:45       ` Cyrill Gorcunov
2016-09-01 13:47         ` Dmitry Safonov
2016-09-01 13:59           ` Cyrill Gorcunov
2016-09-01 16:56           ` Oleg Nesterov
2016-09-01  6:18 ` [PATCHv4 0/6] x86: 32-bit compatible C/R on x86_64 Ingo Molnar
2016-09-01  8:19   ` Dmitry Safonov
