* [PATCHv2] x86/mm: set x32 syscall bit in SET_PERSONALITY()
@ 2017-03-21 16:37 ` Dmitry Safonov
  0 siblings, 0 replies; 40+ messages in thread
From: Dmitry Safonov @ 2017-03-21 16:37 UTC (permalink / raw)
  To: linux-kernel
  Cc: 0x7f454c46, Dmitry Safonov, Adam Borowski, linux-mm,
	Andrei Vagin, Cyrill Gorcunov, Borislav Petkov,
	Kirill A. Shutemov, x86, H. Peter Anvin, Andy Lutomirski,
	Ingo Molnar, Thomas Gleixner

After my changes to mmap(), its code relies on the bitness of the
syscall being performed and chooses the allocation base accordingly:
mmap_base for a 64-bit mmap() and mmap_compat_base for a 32-bit one.
That was introduced by:
  commit 1b028f784e8c ("x86/mm: Introduce mmap_compat_base() for
32-bit mmap()").

Since then the code relies on in_compat_syscall() returning true for
32-bit syscalls. That usually holds while we're in the context of an
application that makes 32-bit syscalls, but it does not hold during
exec() of an x32 ELF: the application hasn't made any syscall yet, so
the x32 bit has not been set.
As a result, exec() of an x32 ELF file fails with -ENOMEM because
BAD_ADDR() fires in elf_map(), which is called from
do_execve()->load_elf_binary().
For i386 ELFs it works because SET_PERSONALITY() sets the TS_COMPAT flag.

I suggest setting the x32 bit before the first return to userspace,
while setting the personality at exec(). This way we can rely on
in_compat_syscall() during exec().

Fixes: 1b028f784e8c ("x86/mm: Introduce mmap_compat_base() for 32-bit mmap()")
Cc: 0x7f454c46@gmail.com
Cc: linux-mm@kvack.org
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: x86@kernel.org
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Adam Borowski <kilobyte@angband.pl>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
---
v2:
- specifying mmap() allocation path which failed during exec()
- fix comment style

 arch/x86/kernel/process_64.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index d6b784a5520d..d3d4d9abcaf8 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -519,8 +519,14 @@ void set_personality_ia32(bool x32)
 		if (current->mm)
 			current->mm->context.ia32_compat = TIF_X32;
 		current->personality &= ~READ_IMPLIES_EXEC;
-		/* in_compat_syscall() uses the presence of the x32
-		   syscall bit flag to determine compat status */
+		/*
+		 * in_compat_syscall() uses the presence of the x32
+		 * syscall bit flag to determine compat status.
+		 * On the bitness of syscall relies x86 mmap() code,
+		 * so set x32 syscall bit right here to make
+		 * in_compat_syscall() work during exec().
+		 */
+		task_pt_regs(current)->orig_ax |= __X32_SYSCALL_BIT;
 		current->thread.status &= ~TS_COMPAT;
 	} else {
 		set_thread_flag(TIF_IA32);
-- 
2.12.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread
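
The failure described in the patch above can be modelled in userspace.
This is only a sketch under stated assumptions: struct model_task and
the model_* helpers are hypothetical stand-ins for task_pt_regs(current)
and current->thread.status, not kernel code. It shows why a check that
looks at TS_COMPAT or bit 30 of orig_ax stays false during exec() of an
x32 binary unless SET_PERSONALITY() sets the bit.

```c
#include <stdbool.h>

/* Bit 30 of the syscall number, as the thread describes __X32_SYSCALL_BIT. */
#define X32_SYSCALL_BIT 0x40000000UL
/* Stand-in flag value for the model; only its distinctness matters here. */
#define TS_COMPAT       0x0002U

/* Hypothetical model of the per-task state the compat check reads. */
struct model_task {
	unsigned long orig_ax;  /* models task_pt_regs(current)->orig_ax */
	unsigned int  status;   /* models current->thread.status */
};

/* Compat if either the i386 flag or the x32 syscall bit is present. */
static inline bool model_in_compat_syscall(const struct model_task *t)
{
	return (t->status & TS_COMPAT) || (t->orig_ax & X32_SYSCALL_BIT);
}

/*
 * Models the x32 branch of set_personality_ia32().
 * patched == false: behaviour before this patch (bit left untouched).
 * patched == true:  behaviour with this patch (bit set at exec()).
 */
static inline void model_set_personality_x32(struct model_task *t, bool patched)
{
	if (patched)
		t->orig_ax |= X32_SYSCALL_BIT;
	t->status &= ~TS_COMPAT;
}
```

With a 64-bit parent calling execve() (orig_ax without bit 30), the
unpatched model reports a non-compat syscall while the x32 image is being
mapped, which is the situation that ends in -ENOMEM in the report.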

* Re: [PATCHv2] x86/mm: set x32 syscall bit in SET_PERSONALITY()
  2017-03-21 16:37 ` Dmitry Safonov
@ 2017-03-21 17:17   ` Cyrill Gorcunov
  -1 siblings, 0 replies; 40+ messages in thread
From: Cyrill Gorcunov @ 2017-03-21 17:17 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: linux-kernel, 0x7f454c46, Adam Borowski, linux-mm, Andrei Vagin,
	Borislav Petkov, Kirill A. Shutemov, x86, H. Peter Anvin,
	Andy Lutomirski, Ingo Molnar, Thomas Gleixner

On Tue, Mar 21, 2017 at 07:37:12PM +0300, Dmitry Safonov wrote:
...
> diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
> index d6b784a5520d..d3d4d9abcaf8 100644
> --- a/arch/x86/kernel/process_64.c
> +++ b/arch/x86/kernel/process_64.c
> @@ -519,8 +519,14 @@ void set_personality_ia32(bool x32)
>  		if (current->mm)
>  			current->mm->context.ia32_compat = TIF_X32;
>  		current->personality &= ~READ_IMPLIES_EXEC;
> -		/* in_compat_syscall() uses the presence of the x32
> -		   syscall bit flag to determine compat status */
> +		/*
> +		 * in_compat_syscall() uses the presence of the x32
> +		 * syscall bit flag to determine compat status.
> +		 * On the bitness of syscall relies x86 mmap() code,
> +		 * so set x32 syscall bit right here to make
> +		 * in_compat_syscall() work during exec().
> +		 */
> +		task_pt_regs(current)->orig_ax |= __X32_SYSCALL_BIT;
>  		current->thread.status &= ~TS_COMPAT;

Hi! I must admit I didn't follow the overall series closely (so I can't
comment much here :) but I have a slightly unrelated question -- is
there a way to figure out whether a task is running in x32 mode, say via
some ptrace or procfs sign?

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCHv2] x86/mm: set x32 syscall bit in SET_PERSONALITY()
  2017-03-21 16:37 ` Dmitry Safonov
@ 2017-03-21 17:27   ` hpa
  -1 siblings, 0 replies; 40+ messages in thread
From: hpa @ 2017-03-21 17:27 UTC (permalink / raw)
  To: Dmitry Safonov, linux-kernel
  Cc: 0x7f454c46, Adam Borowski, linux-mm, Andrei Vagin,
	Cyrill Gorcunov, Borislav Petkov, Kirill A. Shutemov, x86,
	Andy Lutomirski, Ingo Molnar, Thomas Gleixner

On March 21, 2017 9:37:12 AM PDT, Dmitry Safonov <dsafonov@virtuozzo.com> wrote:
>After my changes to mmap(), its code relies on the bitness of the
>syscall being performed and chooses the allocation base accordingly:
>mmap_base for a 64-bit mmap() and mmap_compat_base for a 32-bit one.
>That was introduced by:
>  commit 1b028f784e8c ("x86/mm: Introduce mmap_compat_base() for
>32-bit mmap()").
>
>Since then the code relies on in_compat_syscall() returning true for
>32-bit syscalls. That usually holds while we're in the context of an
>application that makes 32-bit syscalls, but it does not hold during
>exec() of an x32 ELF: the application hasn't made any syscall yet, so
>the x32 bit has not been set.
>As a result, exec() of an x32 ELF file fails with -ENOMEM because
>BAD_ADDR() fires in elf_map(), which is called from
>do_execve()->load_elf_binary().
>For i386 ELFs it works because SET_PERSONALITY() sets the TS_COMPAT flag.
>
>I suggest setting the x32 bit before the first return to userspace,
>while setting the personality at exec(). This way we can rely on
>in_compat_syscall() during exec().
>
>Fixes: commit 1b028f784e8c ("x86/mm: Introduce mmap_compat_base() for
>32-bit mmap()")
>Cc: 0x7f454c46@gmail.com
>Cc: linux-mm@kvack.org
>Cc: Andrei Vagin <avagin@gmail.com>
>Cc: Cyrill Gorcunov <gorcunov@openvz.org>
>Cc: Borislav Petkov <bp@suse.de>
>Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
>Cc: x86@kernel.org
>Cc: H. Peter Anvin <hpa@zytor.com>
>Cc: Andy Lutomirski <luto@kernel.org>
>Cc: Ingo Molnar <mingo@redhat.com>
>Cc: Thomas Gleixner <tglx@linutronix.de>
>Reported-by: Adam Borowski <kilobyte@angband.pl>
>Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
>---
>v2:
>- specifying mmap() allocation path which failed during exec()
>- fix comment style
>
> arch/x86/kernel/process_64.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
>diff --git a/arch/x86/kernel/process_64.c
>b/arch/x86/kernel/process_64.c
>index d6b784a5520d..d3d4d9abcaf8 100644
>--- a/arch/x86/kernel/process_64.c
>+++ b/arch/x86/kernel/process_64.c
>@@ -519,8 +519,14 @@ void set_personality_ia32(bool x32)
> 		if (current->mm)
> 			current->mm->context.ia32_compat = TIF_X32;
> 		current->personality &= ~READ_IMPLIES_EXEC;
>-		/* in_compat_syscall() uses the presence of the x32
>-		   syscall bit flag to determine compat status */
>+		/*
>+		 * in_compat_syscall() uses the presence of the x32
>+		 * syscall bit flag to determine compat status.
>+		 * On the bitness of syscall relies x86 mmap() code,
>+		 * so set x32 syscall bit right here to make
>+		 * in_compat_syscall() work during exec().
>+		 */
>+		task_pt_regs(current)->orig_ax |= __X32_SYSCALL_BIT;
> 		current->thread.status &= ~TS_COMPAT;
> 	} else {
> 		set_thread_flag(TIF_IA32);

You also need to clear the bit for an x32 -> x86-64 exec.  Otherwise it seems okay to me.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 40+ messages in thread
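
The review comment above (clearing the bit on an x32 -> x86-64 exec) can
be sketched as follows. This is an illustration only, not the v3 patch:
struct model_regs and both helpers are hypothetical, and the sketch
assumes that keeping bit 30 of orig_ax in sync with the new personality
is all that is required.

```c
#include <stdbool.h>

/* Bit 30 of the syscall number, as the thread describes __X32_SYSCALL_BIT. */
#define X32_SYSCALL_BIT 0x40000000UL

/* Hypothetical stand-in for the saved user registers of the task. */
struct model_regs {
	unsigned long orig_ax;
};

/*
 * Hypothetical helper: make bit 30 match the personality chosen at exec().
 * x32 == true  models exec() of an x32 binary (set the bit).
 * x32 == false models exec() of an x86-64 binary (clear a stale bit).
 */
static inline void model_sync_x32_bit(struct model_regs *regs, bool x32)
{
	if (x32)
		regs->orig_ax |= X32_SYSCALL_BIT;
	else
		regs->orig_ax &= ~X32_SYSCALL_BIT;
}

/* True if the saved syscall number still carries the x32 bit. */
static inline bool model_in_x32_syscall(const struct model_regs *regs)
{
	return (regs->orig_ax & X32_SYSCALL_BIT) != 0;
}
```

Without the clearing branch, an x32 process that calls execve() on an
x86-64 binary would carry the bit over from its own syscall, so a check
based on orig_ax would still report compat mode while the 64-bit image
is being mapped.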

* Re: [PATCHv2] x86/mm: set x32 syscall bit in SET_PERSONALITY()
  2017-03-21 17:27   ` hpa
@ 2017-03-21 17:27     ` Dmitry Safonov
  -1 siblings, 0 replies; 40+ messages in thread
From: Dmitry Safonov @ 2017-03-21 17:27 UTC (permalink / raw)
  To: hpa, linux-kernel
  Cc: 0x7f454c46, Adam Borowski, linux-mm, Andrei Vagin,
	Cyrill Gorcunov, Borislav Petkov, Kirill A. Shutemov, x86,
	Andy Lutomirski, Ingo Molnar, Thomas Gleixner

On 03/21/2017 08:27 PM, hpa@zytor.com wrote:
> On March 21, 2017 9:37:12 AM PDT, Dmitry Safonov <dsafonov@virtuozzo.com> wrote:
>> After my changes to mmap(), its code relies on the bitness of the
>> syscall being performed and chooses the allocation base accordingly:
>> mmap_base for a 64-bit mmap() and mmap_compat_base for a 32-bit one.
>> That was introduced by:
>>  commit 1b028f784e8c ("x86/mm: Introduce mmap_compat_base() for
>> 32-bit mmap()").
>>
>> Since then the code relies on in_compat_syscall() returning true for
>> 32-bit syscalls. That usually holds while we're in the context of an
>> application that makes 32-bit syscalls, but it does not hold during
>> exec() of an x32 ELF: the application hasn't made any syscall yet, so
>> the x32 bit has not been set.
>> As a result, exec() of an x32 ELF file fails with -ENOMEM because
>> BAD_ADDR() fires in elf_map(), which is called from
>> do_execve()->load_elf_binary().
>> For i386 ELFs it works because SET_PERSONALITY() sets the TS_COMPAT flag.
>>
>> I suggest setting the x32 bit before the first return to userspace,
>> while setting the personality at exec(). This way we can rely on
>> in_compat_syscall() during exec().
>>
>> Fixes: commit 1b028f784e8c ("x86/mm: Introduce mmap_compat_base() for
>> 32-bit mmap()")
>> Cc: 0x7f454c46@gmail.com
>> Cc: linux-mm@kvack.org
>> Cc: Andrei Vagin <avagin@gmail.com>
>> Cc: Cyrill Gorcunov <gorcunov@openvz.org>
>> Cc: Borislav Petkov <bp@suse.de>
>> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
>> Cc: x86@kernel.org
>> Cc: H. Peter Anvin <hpa@zytor.com>
>> Cc: Andy Lutomirski <luto@kernel.org>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Reported-by: Adam Borowski <kilobyte@angband.pl>
>> Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
>> ---
>> v2:
>> - specifying mmap() allocation path which failed during exec()
>> - fix comment style
>>
>> arch/x86/kernel/process_64.c | 10 ++++++++--
>> 1 file changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kernel/process_64.c
>> b/arch/x86/kernel/process_64.c
>> index d6b784a5520d..d3d4d9abcaf8 100644
>> --- a/arch/x86/kernel/process_64.c
>> +++ b/arch/x86/kernel/process_64.c
>> @@ -519,8 +519,14 @@ void set_personality_ia32(bool x32)
>> 		if (current->mm)
>> 			current->mm->context.ia32_compat = TIF_X32;
>> 		current->personality &= ~READ_IMPLIES_EXEC;
>> -		/* in_compat_syscall() uses the presence of the x32
>> -		   syscall bit flag to determine compat status */
>> +		/*
>> +		 * in_compat_syscall() uses the presence of the x32
>> +		 * syscall bit flag to determine compat status.
>> +		 * On the bitness of syscall relies x86 mmap() code,
>> +		 * so set x32 syscall bit right here to make
>> +		 * in_compat_syscall() work during exec().
>> +		 */
>> +		task_pt_regs(current)->orig_ax |= __X32_SYSCALL_BIT;
>> 		current->thread.status &= ~TS_COMPAT;
>> 	} else {
>> 		set_thread_flag(TIF_IA32);
>
> You also need to clear the bit for an x32 -> x86-64 exec.  Otherwise it seems okay to me.

Oh, indeed!
Thanks for catching that, I'll send v3 with the fix.

-- 
              Dmitry

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCHv2] x86/mm: set x32 syscall bit in SET_PERSONALITY()
  2017-03-21 17:17   ` Cyrill Gorcunov
@ 2017-03-21 17:45     ` Andy Lutomirski
  -1 siblings, 0 replies; 40+ messages in thread
From: Andy Lutomirski @ 2017-03-21 17:45 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Dmitry Safonov, linux-kernel, Dmitry Safonov, Adam Borowski,
	linux-mm, Andrei Vagin, Borislav Petkov, Kirill A. Shutemov,
	X86 ML, H. Peter Anvin, Andy Lutomirski, Ingo Molnar,
	Thomas Gleixner

On Tue, Mar 21, 2017 at 10:17 AM, Cyrill Gorcunov <gorcunov@gmail.com> wrote:
> On Tue, Mar 21, 2017 at 07:37:12PM +0300, Dmitry Safonov wrote:
> ...
>> diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
>> index d6b784a5520d..d3d4d9abcaf8 100644
>> --- a/arch/x86/kernel/process_64.c
>> +++ b/arch/x86/kernel/process_64.c
>> @@ -519,8 +519,14 @@ void set_personality_ia32(bool x32)
>>               if (current->mm)
>>                       current->mm->context.ia32_compat = TIF_X32;
>>               current->personality &= ~READ_IMPLIES_EXEC;
>> -             /* in_compat_syscall() uses the presence of the x32
>> -                syscall bit flag to determine compat status */
>> +             /*
>> +              * in_compat_syscall() uses the presence of the x32
>> +              * syscall bit flag to determine compat status.
>> +              * On the bitness of syscall relies x86 mmap() code,
>> +              * so set x32 syscall bit right here to make
>> +              * in_compat_syscall() work during exec().
>> +              */
>> +             task_pt_regs(current)->orig_ax |= __X32_SYSCALL_BIT;
>>               current->thread.status &= ~TS_COMPAT;
>
> Hi! I must admit I didn't follow the overall series closely (so I can't
> comment much here :) but I have a slightly unrelated question -- is
> there a way to figure out whether a task is running in x32 mode, say via
> some ptrace or procfs sign?

You should be able to figure out if a *syscall* is x32 by simply
looking at bit 30 in the syscall number.  (This is unlike i386, which
is currently not reflected in ptrace.)

Do we actually have an x32 per-task mode at all?  If so, maybe we can
just remove it on top of Dmitry's series.

^ permalink raw reply	[flat|nested] 40+ messages in thread
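
The bit-30 check mentioned above can be sketched for a tracer like this.
The helper names are hypothetical. The sketch assumes the tracer has
already fetched the tracee's syscall number, for example from the
orig_rax field of struct user_regs_struct filled in by PTRACE_GETREGS,
and that a negative value means the tracee is not inside a syscall.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 30 of the syscall number marks an x32 syscall. */
#define X32_SYSCALL_BIT UINT64_C(0x40000000)

/*
 * Hypothetical helper: decide whether a syscall number read from a tracee
 * belongs to the x32 ABI. Negative values (e.g. -1 when the task is not
 * in a syscall) have every bit set, so they are rejected first.
 */
static inline bool syscall_nr_is_x32(uint64_t nr)
{
	if ((int64_t)nr < 0)
		return false;
	return (nr & X32_SYSCALL_BIT) != 0;
}

/* Hypothetical helper: strip the x32 marker to get the table index. */
static inline uint64_t syscall_nr_strip_x32(uint64_t nr)
{
	return nr & ~X32_SYSCALL_BIT;
}
```

As the reply notes, this only classifies the *syscall* currently in
flight; it says nothing about a per-task mode.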

* [Q] Figuring out task mode
  2017-03-21 17:45     ` Andy Lutomirski
@ 2017-03-21 18:05       ` Cyrill Gorcunov
  -1 siblings, 0 replies; 40+ messages in thread
From: Cyrill Gorcunov @ 2017-03-21 18:05 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Dmitry Safonov, linux-kernel, Dmitry Safonov, Adam Borowski,
	linux-mm, Andrei Vagin, Borislav Petkov, Kirill A. Shutemov,
	X86 ML, H. Peter Anvin, Andy Lutomirski, Ingo Molnar,
	Thomas Gleixner

/I renamed the mail's subject/

On Tue, Mar 21, 2017 at 10:45:57AM -0700, Andy Lutomirski wrote:
> >> +             task_pt_regs(current)->orig_ax |= __X32_SYSCALL_BIT;
> >>               current->thread.status &= ~TS_COMPAT;
> >
> > Hi! I must admit I didn't follow the overall series closely (so I can't
> > comment much here :) but I have a slightly unrelated question -- is
> > there a way to figure out whether a task is running in x32 mode, say via
> > some ptrace or procfs sign?
> 
> You should be able to figure out if a *syscall* is x32 by simply
> looking at bit 30 in the syscall number.  (This is unlike i386, which
> is currently not reflected in ptrace.)

Yes, the syscall number will help, but from criu's perspective (until
Dima's patches are merged into mainline) we need to figure out
whether we can dump x32 tasks without running parasite code inside,
i.e. via a plain ptrace call or some procfs output. But it looks like
that's impossible for now.

> Do we actually have an x32 per-task mode at all?  If so, maybe we can
> just remove it on top of Dmitry's series.

Don't think so, x32 should be set upon exec and without Dima's series
it is immutable I think.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCHv2] x86/mm: set x32 syscall bit in SET_PERSONALITY()
  2017-03-21 17:45     ` Andy Lutomirski
@ 2017-03-21 18:09       ` Dmitry Safonov
  -1 siblings, 0 replies; 40+ messages in thread
From: Dmitry Safonov @ 2017-03-21 18:09 UTC (permalink / raw)
  To: Andy Lutomirski, Cyrill Gorcunov
  Cc: linux-kernel, Dmitry Safonov, Adam Borowski, linux-mm,
	Andrei Vagin, Borislav Petkov, Kirill A. Shutemov, X86 ML,
	H. Peter Anvin, Andy Lutomirski, Ingo Molnar, Thomas Gleixner

On 03/21/2017 08:45 PM, Andy Lutomirski wrote:
> On Tue, Mar 21, 2017 at 10:17 AM, Cyrill Gorcunov <gorcunov@gmail.com> wrote:
>> On Tue, Mar 21, 2017 at 07:37:12PM +0300, Dmitry Safonov wrote:
>> ...
>>> diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
>>> index d6b784a5520d..d3d4d9abcaf8 100644
>>> --- a/arch/x86/kernel/process_64.c
>>> +++ b/arch/x86/kernel/process_64.c
>>> @@ -519,8 +519,14 @@ void set_personality_ia32(bool x32)
>>>               if (current->mm)
>>>                       current->mm->context.ia32_compat = TIF_X32;
>>>               current->personality &= ~READ_IMPLIES_EXEC;
>>> -             /* in_compat_syscall() uses the presence of the x32
>>> -                syscall bit flag to determine compat status */
>>> +             /*
>>> +              * in_compat_syscall() uses the presence of the x32
>>> +              * syscall bit flag to determine compat status.
>>> +              * On the bitness of syscall relies x86 mmap() code,
>>> +              * so set x32 syscall bit right here to make
>>> +              * in_compat_syscall() work during exec().
>>> +              */
>>> +             task_pt_regs(current)->orig_ax |= __X32_SYSCALL_BIT;
>>>               current->thread.status &= ~TS_COMPAT;
>>
>> Hi! I must admit I didn't follow close the overall series (so can't
>> comment much here :) but I have a slightly unrelated question -- is
>> there a way to figure out if task is running in x32 mode say with
>> some ptrace or procfs sign?
>
> You should be able to figure out if a *syscall* is x32 by simply
> looking at bit 30 in the syscall number.  (This is unlike i386, which
> is currently not reflected in ptrace.)

The process could be stopped with PTRACE_SEIZE and I think, it'll not
have x32 syscall bit at that moment.
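
As a rough illustration of the bit-30 check and of the caveat just raised,
here is a minimal C sketch. It assumes the tracer has already read orig_rax
(for example from struct user_regs_struct via PTRACE_GETREGS). The helper
name is_x32_syscall() and the local X32_SYSCALL_BIT macro are made up for
this example; the value mirrors __X32_SYSCALL_BIT (0x40000000). Treating a
negative orig_rax as "not in a syscall" is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copy of the x32 marker, bit 30 of the syscall number. */
#define X32_SYSCALL_BIT 0x40000000UL

/*
 * orig_rax is the syscall number as seen by a tracer.
 * Assumption: a negative value means the task was not stopped inside
 * a syscall, so there is no syscall number to inspect at all.
 */
static bool is_x32_syscall(uint64_t orig_rax)
{
	if ((int64_t)orig_rax < 0)
		return false;
	return (orig_rax & X32_SYSCALL_BIT) != 0;
}
```

A false result does not show that the task is native. It only says that no
x32 syscall was in flight at this particular stop, which is the limitation
described above.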

I guess the question comes from the fact that we're releasing CRIU 3.0
with 32-bit C/R and some other cool stuff, but we don't support x32 yet,
as we don't want to release a thing that we aren't properly testing.
So for a while we should error out when dumping x32 applications.

I think the best way for now is to check the physical address of the vdso
from /proc/.../pagemap. On a CONFIG_VDSO=n kernel, I guess we could
also add a check for %ds from ptrace's register set. For x32 it's set to
__USER_DS, while for native it's 0 (looking at start_thread() and
compat_start_thread()). The application can simply change it without
any consequence, so it's not very reliable: we could only warn on
catching it, not rely on it.
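
The %ds heuristic could look roughly like the C sketch below. It assumes
the tracer has already fetched %ds (for example the ds field of
struct user_regs_struct). The helper name ds_hints_compat_start() and the
ASSUMED_USER_DS macro are hypothetical, and 0x2b is only the value I would
expect __USER_DS to have on x86-64; verify it against the target kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed selector value of __USER_DS on x86-64 kernels.
 * This is an assumption of the sketch, not taken from the thread.
 */
#define ASSUMED_USER_DS 0x2bU

/*
 * ds is the %ds value read from the tracee. Returns true when it
 * still holds the value a compat start would have loaded, false
 * otherwise (a native start leaves it at 0).
 * This is only a hint: the task may have rewritten %ds itself.
 */
static bool ds_hints_compat_start(uint64_t ds)
{
	return (ds & 0xffffU) == ASSUMED_USER_DS;
}
```

Because userspace can change %ds freely, a caller should at most print a
warning based on this result rather than refuse the dump outright.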

>
> Do we actually have an x32 per-task mode at all?  If so, maybe we can
> just remove it on top of Dmitry's series.

-- 
              Dmitry

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCHv2] x86/mm: set x32 syscall bit in SET_PERSONALITY()
  2017-03-21 18:09       ` Dmitry Safonov
@ 2017-03-21 18:40         ` Cyrill Gorcunov
  -1 siblings, 0 replies; 40+ messages in thread
From: Cyrill Gorcunov @ 2017-03-21 18:40 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: Andy Lutomirski, linux-kernel, Dmitry Safonov, Adam Borowski,
	linux-mm, Andrei Vagin, Borislav Petkov, Kirill A. Shutemov,
	X86 ML, H. Peter Anvin, Andy Lutomirski, Ingo Molnar,
	Thomas Gleixner

On Tue, Mar 21, 2017 at 09:09:40PM +0300, Dmitry Safonov wrote:
> 
> I guess the question comes from that we're releasing CRIU 3.0 with
> 32-bit C/R and some other cool stuff, but we don't support x32 yet.
> As we don't want release a thing that we aren't properly testing.
> So for a while we should error on dumping x32 applications.

yes

> I think, the best way for now is to check physical address of vdso
> from /proc/.../pagemap. If it's CONFIG_VDSO=n kernel, I guess we could
> also add check for %ds from ptrace's register set. For x32 it's set to
> __USER_DS, while for native it's 0 (looking at start_thread() and
> compat_start_thread()). The application can simply change it without
> any consequence - so it's not very reliable, we could only warn at
> catching it, not rely on this.

indeed, thanks!

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCHv2] x86/mm: set x32 syscall bit in SET_PERSONALITY()
  2017-03-21 17:45     ` Andy Lutomirski
@ 2017-03-21 18:49       ` hpa
  -1 siblings, 0 replies; 40+ messages in thread
From: hpa @ 2017-03-21 18:49 UTC (permalink / raw)
  To: Andy Lutomirski, Cyrill Gorcunov
  Cc: Dmitry Safonov, linux-kernel, Dmitry Safonov, Adam Borowski,
	linux-mm, Andrei Vagin, Borislav Petkov, Kirill A. Shutemov,
	X86 ML, Andy Lutomirski, Ingo Molnar, Thomas Gleixner

On March 21, 2017 10:45:57 AM PDT, Andy Lutomirski <luto@amacapital.net> wrote:
>On Tue, Mar 21, 2017 at 10:17 AM, Cyrill Gorcunov <gorcunov@gmail.com> wrote:
>> On Tue, Mar 21, 2017 at 07:37:12PM +0300, Dmitry Safonov wrote:
>> ...
>>> diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
>>> index d6b784a5520d..d3d4d9abcaf8 100644
>>> --- a/arch/x86/kernel/process_64.c
>>> +++ b/arch/x86/kernel/process_64.c
>>> @@ -519,8 +519,14 @@ void set_personality_ia32(bool x32)
>>>               if (current->mm)
>>>                       current->mm->context.ia32_compat = TIF_X32;
>>>               current->personality &= ~READ_IMPLIES_EXEC;
>>> -             /* in_compat_syscall() uses the presence of the x32
>>> -                syscall bit flag to determine compat status */
>>> +             /*
>>> +              * in_compat_syscall() uses the presence of the x32
>>> +              * syscall bit flag to determine compat status.
>>> +              * On the bitness of syscall relies x86 mmap() code,
>>> +              * so set x32 syscall bit right here to make
>>> +              * in_compat_syscall() work during exec().
>>> +              */
>>> +             task_pt_regs(current)->orig_ax |= __X32_SYSCALL_BIT;
>>>               current->thread.status &= ~TS_COMPAT;
>>
>> Hi! I must admit I didn't follow close the overall series (so can't
>> comment much here :) but I have a slightly unrelated question -- is
>> there a way to figure out if task is running in x32 mode say with
>> some ptrace or procfs sign?
>
>You should be able to figure out if a *syscall* is x32 by simply
>looking at bit 30 in the syscall number.  (This is unlike i386, which
>is currently not reflected in ptrace.)
>
>Do we actually have an x32 per-task mode at all?  If so, maybe we can
>just remove it on top of Dmitry's series.

We do, for things like signal delivery mostly.  We have tried relying on it as little as possible, intentionally.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCHv2] x86/mm: set x32 syscall bit in SET_PERSONALITY()
  2017-03-21 18:40         ` Cyrill Gorcunov
@ 2017-03-21 18:51           ` hpa
  -1 siblings, 0 replies; 40+ messages in thread
From: hpa @ 2017-03-21 18:51 UTC (permalink / raw)
  To: Cyrill Gorcunov, Dmitry Safonov
  Cc: Andy Lutomirski, linux-kernel, Dmitry Safonov, Adam Borowski,
	linux-mm, Andrei Vagin, Borislav Petkov, Kirill A. Shutemov,
	X86 ML, Andy Lutomirski, Ingo Molnar, Thomas Gleixner

On March 21, 2017 11:40:58 AM PDT, Cyrill Gorcunov <gorcunov@gmail.com> wrote:
>On Tue, Mar 21, 2017 at 09:09:40PM +0300, Dmitry Safonov wrote:
>> 
>> I guess the question comes from that we're releasing CRIU 3.0 with
>> 32-bit C/R and some other cool stuff, but we don't support x32 yet.
>> As we don't want release a thing that we aren't properly testing.
>> So for a while we should error on dumping x32 applications.
>
>yes
>
>> I think, the best way for now is to check physical address of vdso
>> from /proc/.../pagemap. If it's CONFIG_VDSO=n kernel, I guess we could
>> also add check for %ds from ptrace's register set. For x32 it's set to
>> __USER_DS, while for native it's 0 (looking at start_thread() and
>> compat_start_thread()). The application can simply change it without
>> any consequence - so it's not very reliable, we could only warn at
>> catching it, not rely on this.
>
>indeed, thanks!

I proposed to the ptrace people a virtual register for this and a few other things, but it got bikeshed to death.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCHv2] x86/mm: set x32 syscall bit in SET_PERSONALITY()
  2017-03-21 18:51           ` hpa
@ 2017-03-21 19:07             ` Cyrill Gorcunov
  -1 siblings, 0 replies; 40+ messages in thread
From: Cyrill Gorcunov @ 2017-03-21 19:07 UTC (permalink / raw)
  To: hpa
  Cc: Dmitry Safonov, Andy Lutomirski, linux-kernel, Dmitry Safonov,
	Adam Borowski, linux-mm, Andrei Vagin, Borislav Petkov,
	Kirill A. Shutemov, X86 ML, Andy Lutomirski, Ingo Molnar,
	Thomas Gleixner

On Tue, Mar 21, 2017 at 11:51:09AM -0700, hpa@zytor.com wrote:
> >
> >indeed, thanks!
> 
> I proposed to the ptrace people a virtual register for this and a few other things, but it got bikeshed to death.

Any mail reference left? Would like to read it.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCHv2] x86/mm: set x32 syscall bit in SET_PERSONALITY()
  2017-03-21 18:40         ` Cyrill Gorcunov
@ 2017-03-21 19:19           ` Dmitry Safonov
  -1 siblings, 0 replies; 40+ messages in thread
From: Dmitry Safonov @ 2017-03-21 19:19 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Andy Lutomirski, linux-kernel, Dmitry Safonov, Adam Borowski,
	linux-mm, Andrei Vagin, Borislav Petkov, Kirill A. Shutemov,
	X86 ML, H. Peter Anvin, Andy Lutomirski, Ingo Molnar,
	Thomas Gleixner

On 03/21/2017 09:40 PM, Cyrill Gorcunov wrote:
> On Tue, Mar 21, 2017 at 09:09:40PM +0300, Dmitry Safonov wrote:
>>
>> I guess the question comes from that we're releasing CRIU 3.0 with
>> 32-bit C/R and some other cool stuff, but we don't support x32 yet.
>> As we don't want release a thing that we aren't properly testing.
>> So for a while we should error on dumping x32 applications.
>
> yes
>
>> I think, the best way for now is to check physical address of vdso
>> from /proc/.../pagemap. If it's CONFIG_VDSO=n kernel, I guess we could
>> also add check for %ds from ptrace's register set. For x32 it's set to
>> __USER_DS, while for native it's 0 (looking at start_thread() and
>> compat_start_thread()). The application can simply change it without
>> any consequence - so it's not very reliable, we could only warn at
>> catching it, not rely on this.
>
> indeed, thanks!

Also, even more simple-minded: for now we could just check binary magic
from /proc/.../exe, for now stopping on x32 binaries.
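
A minimal C sketch of such a magic check follows. It relies on x32
binaries being 32-bit-class ELF files whose machine type is EM_X86_64.
The helper names elf_header_is_x32() and path_is_x32() are invented for
this example, and the check is best effort only, since it trusts whatever
file the path resolves to.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <elf.h>

/*
 * Returns true if the buffer starts with an ELF header that looks
 * like x32: 32-bit class, little-endian, machine EM_X86_64.
 */
static bool elf_header_is_x32(const unsigned char *hdr, size_t len)
{
	unsigned int machine;

	if (len < sizeof(Elf32_Ehdr))
		return false;
	if (memcmp(hdr, ELFMAG, SELFMAG) != 0)
		return false;
	if (hdr[EI_CLASS] != ELFCLASS32 || hdr[EI_DATA] != ELFDATA2LSB)
		return false;

	/* e_machine is a 16-bit little-endian field at offset 18. */
	machine = hdr[18] | ((unsigned int)hdr[19] << 8);
	return machine == EM_X86_64;
}

/*
 * Reads the header from a path such as /proc/<pid>/exe.
 * Returns 1 for x32, 0 for anything else, -1 if it cannot be read.
 */
static int path_is_x32(const char *path)
{
	unsigned char hdr[sizeof(Elf32_Ehdr)];
	FILE *f = fopen(path, "rb");
	size_t n;

	if (!f)
		return -1;
	n = fread(hdr, 1, sizeof(hdr), f);
	fclose(f);
	if (n != sizeof(hdr))
		return -1;
	return elf_header_is_x32(hdr, n) ? 1 : 0;
}
```

Keeping the -1 case separate from 0 lets a caller tell "could not check"
apart from "checked, not x32" and decide how strict to be.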

-- 
              Dmitry

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCHv2] x86/mm: set x32 syscall bit in SET_PERSONALITY()
  2017-03-21 19:07             ` Cyrill Gorcunov
@ 2017-03-21 19:20               ` hpa
  -1 siblings, 0 replies; 40+ messages in thread
From: hpa @ 2017-03-21 19:20 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Dmitry Safonov, Andy Lutomirski, linux-kernel, Dmitry Safonov,
	Adam Borowski, linux-mm, Andrei Vagin, Borislav Petkov,
	Kirill A. Shutemov, X86 ML, Andy Lutomirski, Ingo Molnar,
	Thomas Gleixner

On March 21, 2017 12:07:13 PM PDT, Cyrill Gorcunov <gorcunov@gmail.com> wrote:
>On Tue, Mar 21, 2017 at 11:51:09AM -0700, hpa@zytor.com wrote:
>> >
>> >indeed, thanks!
>> 
>> I proposed to the ptrace people a virtual register for this and a few other things, but it got bikeshed to death.
>
>Any mail reference left? Would like to read it.

Not sure...
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCHv2] x86/mm: set x32 syscall bit in SET_PERSONALITY()
  2017-03-21 19:19           ` Dmitry Safonov
@ 2017-03-21 19:24             ` Cyrill Gorcunov
  -1 siblings, 0 replies; 40+ messages in thread
From: Cyrill Gorcunov @ 2017-03-21 19:24 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: Andy Lutomirski, linux-kernel, Dmitry Safonov, Adam Borowski,
	linux-mm, Andrei Vagin, Borislav Petkov, Kirill A. Shutemov,
	X86 ML, H. Peter Anvin, Andy Lutomirski, Ingo Molnar,
	Thomas Gleixner

On Tue, Mar 21, 2017 at 10:19:01PM +0300, Dmitry Safonov wrote:
> > 
> > indeed, thanks!
> 
> Also, even more simple-minded: for now we could just check binary magic
> from /proc/.../exe, for now stopping on x32 binaries.

The file may not exist, and the ELF header may be wiped out as well.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCHv2] x86/mm: set x32 syscall bit in SET_PERSONALITY()
  2017-03-21 18:09       ` Dmitry Safonov
@ 2017-03-21 19:31         ` Andy Lutomirski
  -1 siblings, 0 replies; 40+ messages in thread
From: Andy Lutomirski @ 2017-03-21 19:31 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: Cyrill Gorcunov, linux-kernel, Dmitry Safonov, Adam Borowski,
	linux-mm, Andrei Vagin, Borislav Petkov, Kirill A. Shutemov,
	X86 ML, H. Peter Anvin, Andy Lutomirski, Ingo Molnar,
	Thomas Gleixner

On Tue, Mar 21, 2017 at 11:09 AM, Dmitry Safonov <dsafonov@virtuozzo.com> wrote:
> On 03/21/2017 08:45 PM, Andy Lutomirski wrote:
>>
>> On Tue, Mar 21, 2017 at 10:17 AM, Cyrill Gorcunov <gorcunov@gmail.com>
>> wrote:
>>>
>>> On Tue, Mar 21, 2017 at 07:37:12PM +0300, Dmitry Safonov wrote:
>>> ...
>>>>
>>>> diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
>>>> index d6b784a5520d..d3d4d9abcaf8 100644
>>>> --- a/arch/x86/kernel/process_64.c
>>>> +++ b/arch/x86/kernel/process_64.c
>>>> @@ -519,8 +519,14 @@ void set_personality_ia32(bool x32)
>>>>               if (current->mm)
>>>>                       current->mm->context.ia32_compat = TIF_X32;
>>>>               current->personality &= ~READ_IMPLIES_EXEC;
>>>> -             /* in_compat_syscall() uses the presence of the x32
>>>> -                syscall bit flag to determine compat status */
>>>> +             /*
>>>> +              * in_compat_syscall() uses the presence of the x32
>>>> +              * syscall bit flag to determine compat status.
>>>> +              * On the bitness of syscall relies x86 mmap() code,
>>>> +              * so set x32 syscall bit right here to make
>>>> +              * in_compat_syscall() work during exec().
>>>> +              */
>>>> +             task_pt_regs(current)->orig_ax |= __X32_SYSCALL_BIT;
>>>>               current->thread.status &= ~TS_COMPAT;
>>>
>>>
>>> Hi! I must admit I didn't follow close the overall series (so can't
>>> comment much here :) but I have a slightly unrelated question -- is
>>> there a way to figure out if task is running in x32 mode say with
>>> some ptrace or procfs sign?
>>
>>
>> You should be able to figure out if a *syscall* is x32 by simply
>> looking at bit 30 in the syscall number.  (This is unlike i386, which
>> is currently not reflected in ptrace.)
>
>
> The process could be stopped with PTRACE_SEIZE and I think, it'll not
> have x32 syscall bit at that moment.
>
> I guess the question comes from that we're releasing CRIU 3.0 with
> 32-bit C/R and some other cool stuff, but we don't support x32 yet.
> As we don't want release a thing that we aren't properly testing.
> So for a while we should error on dumping x32 applications.

I'm curious: shouldn't x32 CRIU just work?  What goes wrong?

--Andy

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCHv2] x86/mm: set x32 syscall bit in SET_PERSONALITY()
  2017-03-21 19:24             ` Cyrill Gorcunov
@ 2017-03-21 19:34               ` Dmitry Safonov
  -1 siblings, 0 replies; 40+ messages in thread
From: Dmitry Safonov @ 2017-03-21 19:34 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Andy Lutomirski, linux-kernel, Dmitry Safonov, Adam Borowski,
	linux-mm, Andrei Vagin, Borislav Petkov, Kirill A. Shutemov,
	X86 ML, H. Peter Anvin, Andy Lutomirski, Ingo Molnar,
	Thomas Gleixner

On 03/21/2017 10:24 PM, Cyrill Gorcunov wrote:
> On Tue, Mar 21, 2017 at 10:19:01PM +0300, Dmitry Safonov wrote:
>>>
>>> indeed, thanks!
>>
>> Also, even more simple-minded: for now we could just check the binary's
>> magic from /proc/.../exe and stop on x32 binaries.
>
> The file may not exist, and the ELF header may be wiped out as well.

Yep, not very reliable.
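
For reference, the magic check discussed above could look roughly like the
sketch below. It assumes the first bytes of the file behind /proc/.../exe
have already been read into a buffer and that the host is little-endian
(true for x86). It tells x32 apart from x86-64 by the ELF class, since both
use the same e_machine value. The names classify_elf_header() and
enum elf_abi are made up for illustration; this is not CRIU code.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type, for illustration only. */
enum elf_abi { ABI_UNKNOWN, ABI_X86_64, ABI_X32, ABI_I386 };

/* Classify an ELF image from its leading bytes (hypothetical helper). */
static enum elf_abi classify_elf_header(const unsigned char *buf, size_t len)
{
	Elf32_Ehdr hdr;

	assert(buf != NULL);
	if (len < sizeof(hdr) || memcmp(buf, ELFMAG, SELFMAG) != 0)
		return ABI_UNKNOWN;

	/*
	 * e_machine sits at the same offset in Elf32_Ehdr and Elf64_Ehdr,
	 * so the 32-bit layout is enough to read it. Assumes the host
	 * byte order matches the file (little-endian on x86).
	 */
	memcpy(&hdr, buf, sizeof(hdr));

	if (hdr.e_machine == EM_X86_64) {
		if (buf[EI_CLASS] == ELFCLASS64)
			return ABI_X86_64;
		if (buf[EI_CLASS] == ELFCLASS32)
			return ABI_X32;	/* 32-bit class, 64-bit machine */
		return ABI_UNKNOWN;
	}
	if (hdr.e_machine == EM_386 && buf[EI_CLASS] == ELFCLASS32)
		return ABI_I386;
	return ABI_UNKNOWN;
}
```

As noted above, this only helps while the file still exists and its header
is intact, so it cannot be more than a best-effort check.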

-- 
              Dmitry

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCHv2] x86/mm: set x32 syscall bit in SET_PERSONALITY()
  2017-03-21 19:31         ` Andy Lutomirski
@ 2017-03-21 19:34           ` Cyrill Gorcunov
  -1 siblings, 0 replies; 40+ messages in thread
From: Cyrill Gorcunov @ 2017-03-21 19:34 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Dmitry Safonov, linux-kernel, Dmitry Safonov, Adam Borowski,
	linux-mm, Andrei Vagin, Borislav Petkov, Kirill A. Shutemov,
	X86 ML, H. Peter Anvin, Andy Lutomirski, Ingo Molnar,
	Thomas Gleixner

On Tue, Mar 21, 2017 at 12:31:51PM -0700, Andy Lutomirski wrote:
...
> > I guess the question comes up because we're releasing CRIU 3.0 with
> > 32-bit C/R and some other cool stuff, but we don't support x32 yet,
> > and we don't want to release a thing that we aren't properly testing.
> > So for a while we should error out on dumping x32 applications.
> 
> I'm curious: shouldn't x32 CRIU just work?  What goes wrong?

Anything ;) We haven't tried, as far as I know, but I bet
something will be broken for sure.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCHv2] x86/mm: set x32 syscall bit in SET_PERSONALITY()
  2017-03-21 19:31         ` Andy Lutomirski
@ 2017-03-21 19:42           ` Dmitry Safonov
  -1 siblings, 0 replies; 40+ messages in thread
From: Dmitry Safonov @ 2017-03-21 19:42 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Cyrill Gorcunov, linux-kernel, Dmitry Safonov, Adam Borowski,
	linux-mm, Andrei Vagin, Borislav Petkov, Kirill A. Shutemov,
	X86 ML, H. Peter Anvin, Andy Lutomirski, Ingo Molnar,
	Thomas Gleixner

On 03/21/2017 10:31 PM, Andy Lutomirski wrote:
> On Tue, Mar 21, 2017 at 11:09 AM, Dmitry Safonov <dsafonov@virtuozzo.com> wrote:
>> On 03/21/2017 08:45 PM, Andy Lutomirski wrote:
>>>
>>> On Tue, Mar 21, 2017 at 10:17 AM, Cyrill Gorcunov <gorcunov@gmail.com>
>>> wrote:
>>>>
>>>> On Tue, Mar 21, 2017 at 07:37:12PM +0300, Dmitry Safonov wrote:
>>>> ...
>>>>>
>>>>> diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
>>>>> index d6b784a5520d..d3d4d9abcaf8 100644
>>>>> --- a/arch/x86/kernel/process_64.c
>>>>> +++ b/arch/x86/kernel/process_64.c
>>>>> @@ -519,8 +519,14 @@ void set_personality_ia32(bool x32)
>>>>>               if (current->mm)
>>>>>                       current->mm->context.ia32_compat = TIF_X32;
>>>>>               current->personality &= ~READ_IMPLIES_EXEC;
>>>>> -             /* in_compat_syscall() uses the presence of the x32
>>>>> -                syscall bit flag to determine compat status */
>>>>> +             /*
>>>>> +              * in_compat_syscall() uses the presence of the x32
>>>>> +              * syscall bit flag to determine compat status.
>>>>> +              * On the bitness of syscall relies x86 mmap() code,
>>>>> +              * so set x32 syscall bit right here to make
>>>>> +              * in_compat_syscall() work during exec().
>>>>> +              */
>>>>> +             task_pt_regs(current)->orig_ax |= __X32_SYSCALL_BIT;
>>>>>               current->thread.status &= ~TS_COMPAT;
>>>>
>>>>
>>>> Hi! I must admit I didn't follow the overall series closely (so can't
>>>> comment much here :) but I have a slightly unrelated question -- is
>>>> there a way to figure out if a task is running in x32 mode, say with
>>>> some ptrace or procfs sign?
>>>
>>>
>>> You should be able to figure out if a *syscall* is x32 by simply
>>> looking at bit 30 in the syscall number.  (This is unlike i386, which
>>> is currently not reflected in ptrace.)
>>
>>
>> The process could be stopped with PTRACE_SEIZE and, I think, it will not
>> have the x32 syscall bit set at that moment.
>>
>> I guess the question comes up because we're releasing CRIU 3.0 with
>> 32-bit C/R and some other cool stuff, but we don't support x32 yet,
>> and we don't want to release a thing that we aren't properly testing.
>> So for a while we should error out on dumping x32 applications.
>
> I'm curious: shouldn't x32 CRIU just work?  What goes wrong?

I also think it should be quite easy to add, as we have arch_prctl()
for the vdso, etc.
But there are things that will not work if we just dump the application
as 64-bit.

For example, what comes to mind is sys_get_robust_list(): it has different
pointers for 64-bit and for x32/ia32 applications, robust_list
and compat_robust_list. So during C/R we should sometimes call
compat syscalls for x32 applications to dump/restore; for the futex
list, e.g., the native syscall will return NULL or an empty list.
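
To make the robust list point concrete, here is a minimal sketch of the
native query from a 64-bit process. It uses the real get_robust_list(2)
syscall through syscall(2); the wrapper name query_robust_list() is made up
for illustration. Per the description above, a native caller only sees the
native robust_list pointer, so a list registered through the compat entry
point would not show up here.

```c
#include <assert.h>
#include <linux/futex.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: fetch the robust futex list head of @pid
 * (0 means the calling thread) via the native syscall.
 * Returns 0 on success, -1 on failure with errno set by syscall().
 */
static int query_robust_list(pid_t pid, struct robust_list_head **head,
			     size_t *len)
{
	assert(head != NULL && len != NULL);
	return syscall(SYS_get_robust_list, pid, head, len) == 0 ? 0 : -1;
}
```

A dumper that wants both lists would have to issue the compat variant of
the same call as well, which is the extra work described above.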

>
> --Andy
>


-- 
              Dmitry

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCHv2] x86/mm: set x32 syscall bit in SET_PERSONALITY()
  2017-03-21 19:42           ` Dmitry Safonov
@ 2017-03-21 20:04             ` Dmitry Safonov
  -1 siblings, 0 replies; 40+ messages in thread
From: Dmitry Safonov @ 2017-03-21 20:04 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Cyrill Gorcunov, linux-kernel, Dmitry Safonov, Adam Borowski,
	linux-mm, Andrei Vagin, Borislav Petkov, Kirill A. Shutemov,
	X86 ML, H. Peter Anvin, Andy Lutomirski, Ingo Molnar,
	Thomas Gleixner

On 03/21/2017 10:42 PM, Dmitry Safonov wrote:
> On 03/21/2017 10:31 PM, Andy Lutomirski wrote:
>> On Tue, Mar 21, 2017 at 11:09 AM, Dmitry Safonov
>> <dsafonov@virtuozzo.com> wrote:
>>> On 03/21/2017 08:45 PM, Andy Lutomirski wrote:
>>>>
>>>> On Tue, Mar 21, 2017 at 10:17 AM, Cyrill Gorcunov <gorcunov@gmail.com>
>>>> wrote:
>>>>>
>>>>> On Tue, Mar 21, 2017 at 07:37:12PM +0300, Dmitry Safonov wrote:
>>>>> ...
>>>>>>
>>>>>> diff --git a/arch/x86/kernel/process_64.c
>>>>>> b/arch/x86/kernel/process_64.c
>>>>>> index d6b784a5520d..d3d4d9abcaf8 100644
>>>>>> --- a/arch/x86/kernel/process_64.c
>>>>>> +++ b/arch/x86/kernel/process_64.c
>>>>>> @@ -519,8 +519,14 @@ void set_personality_ia32(bool x32)
>>>>>>               if (current->mm)
>>>>>>                       current->mm->context.ia32_compat = TIF_X32;
>>>>>>               current->personality &= ~READ_IMPLIES_EXEC;
>>>>>> -             /* in_compat_syscall() uses the presence of the x32
>>>>>> -                syscall bit flag to determine compat status */
>>>>>> +             /*
>>>>>> +              * in_compat_syscall() uses the presence of the x32
>>>>>> +              * syscall bit flag to determine compat status.
>>>>>> +              * On the bitness of syscall relies x86 mmap() code,
>>>>>> +              * so set x32 syscall bit right here to make
>>>>>> +              * in_compat_syscall() work during exec().
>>>>>> +              */
>>>>>> +             task_pt_regs(current)->orig_ax |= __X32_SYSCALL_BIT;
>>>>>>               current->thread.status &= ~TS_COMPAT;
>>>>>
>>>>>
>>>>> Hi! I must admit I didn't follow the overall series closely (so can't
>>>>> comment much here :) but I have a slightly unrelated question -- is
>>>>> there a way to figure out if a task is running in x32 mode, say with
>>>>> some ptrace or procfs sign?
>>>>
>>>>
>>>> You should be able to figure out if a *syscall* is x32 by simply
>>>> looking at bit 30 in the syscall number.  (This is unlike i386, which
>>>> is currently not reflected in ptrace.)
>>>
>>>
>>> The process could be stopped with PTRACE_SEIZE and, I think, it will not
>>> have the x32 syscall bit set at that moment.
>>>
>>> I guess the question comes up because we're releasing CRIU 3.0 with
>>> 32-bit C/R and some other cool stuff, but we don't support x32 yet,
>>> and we don't want to release a thing that we aren't properly testing.
>>> So for a while we should error out on dumping x32 applications.
>>
>> I'm curious: shouldn't x32 CRIU just work?  What goes wrong?
>
> I also think it should be quite easy to add, as we have arch_prctl()
> for the vdso, etc.
> But there are things that will not work if we just dump the application
> as 64-bit.
>
> For example, what comes to mind is sys_get_robust_list(): it has different
> pointers for 64-bit and for x32/ia32 applications, robust_list
> and compat_robust_list. So during C/R we should sometimes call
> compat syscalls for x32 applications to dump/restore; for the futex
> list, e.g., the native syscall will return NULL or an empty list.

Maybe we should just save both pointers with CRIU for simplicity,
which will add an extra syscall for most applications, as they define
only one of the compat/native lists.

I think there are some other things like that, but it's the end of the
day and nothing comes to mind.

Anyway, I wouldn't want to release anything without adding it to the
regular tests, so that would also need some time to do. And a funny
thing: there are many folks who run 32-bit containers on x86_64 to save
memory, but they use ia32, not x32. Maybe because the environment is
easier to get (for x32 there are no templates, for example). Maybe just
because.

So I haven't seen any requests for x32 C/R so far, while for ia32 there
is a crowded room. And there are quite a lot of people asking for
arm/arm64, where the kernel doesn't yet support vdso mremap() and CRIU
doesn't run tests for them regularly yet.

But well, x32 should still be quite easy to add, say, for the next
release after ia32 C/R; not sure if it'll be planned, though.


-- 
              Dmitry

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Q] Figuring out task mode
  2017-03-21 18:05       ` Cyrill Gorcunov
@ 2017-03-21 23:54         ` Andy Lutomirski
  -1 siblings, 0 replies; 40+ messages in thread
From: Andy Lutomirski @ 2017-03-21 23:54 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Dmitry Safonov, linux-kernel, Dmitry Safonov, Adam Borowski,
	linux-mm, Andrei Vagin, Borislav Petkov, Kirill A. Shutemov,
	X86 ML, H. Peter Anvin, Andy Lutomirski, Ingo Molnar,
	Thomas Gleixner

On Tue, Mar 21, 2017 at 11:05 AM, Cyrill Gorcunov <gorcunov@gmail.com> wrote:
> /I renamed the mail's subject/
>
> On Tue, Mar 21, 2017 at 10:45:57AM -0700, Andy Lutomirski wrote:
>> >> +             task_pt_regs(current)->orig_ax |= __X32_SYSCALL_BIT;
>> >>               current->thread.status &= ~TS_COMPAT;
>> >
>> > Hi! I must admit I didn't follow the overall series closely (so can't
>> > comment much here :) but I have a slightly unrelated question -- is
>> > there a way to figure out if a task is running in x32 mode, say with
>> > some ptrace or procfs sign?
>>
>> You should be able to figure out if a *syscall* is x32 by simply
>> looking at bit 30 in the syscall number.  (This is unlike i386, which
>> is currently not reflected in ptrace.)
>
> Yes, the syscall number will help, but from the CRIU perspective (until
> Dima's patches are merged into mainline) we need to figure out
> if we can dump x32 tasks without running parasite code inside,
> i.e. via a plain ptrace call or some procfs output. But it looks like
> it's impossible for now.
>
>> Do we actually have an x32 per-task mode at all?  If so, maybe we can
>> just remove it on top of Dmitry's series.
>
> Don't think so, x32 should be set upon exec and without Dima's series
> it is immutable I think.

What I mean is: why should the kernel care about per-task X32 state
*at all*?  On top of Dmitry's series, TIF_X32 appears to be used to
determine which vDSO to map, which mm layout to use, *and nothing
else*.  Want to write a trivial patch to get rid of it entirely?

Ideally we could get rid of mm->context.ia32_compat, too.  The only
interesting use it has is MPX, and we should probably instead track
mm->context.mpx_layout and determine *that* from the prctl() bitness.
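
For reference, the bit-30 test on the syscall number quoted above can be
sketched as below, from the point of view of a tracer that has already
fetched the tracee's orig_ax (for instance with PTRACE_GETREGS). The
constant mirrors the kernel's __X32_SYSCALL_BIT; the names X32_SYSCALL_BIT
and is_x32_syscall_nr() are made up for illustration. Treating a negative
orig_ax as "not in a syscall" is an assumption of this sketch, needed
because -1 has every bit set, bit 30 included.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the kernel's __X32_SYSCALL_BIT (bit 30 of the syscall number). */
#define X32_SYSCALL_BIT 0x40000000LL

static_assert(X32_SYSCALL_BIT == (1LL << 30), "x32 marker must be bit 30");

/* Hypothetical helper: does this orig_ax value denote an x32 syscall? */
static bool is_x32_syscall_nr(long long orig_ax)
{
	/* A negative value means the task is not inside a syscall. */
	if (orig_ax < 0)
		return false;
	return (orig_ax & X32_SYSCALL_BIT) != 0;
}
```

This only says something about the syscall in flight, not about the task:
as pointed out earlier in the thread, a task stopped outside a syscall does
not carry the bit at that moment.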

^ permalink raw reply	[flat|nested] 40+ messages in thread
