linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [GIT PULL] x86/paravirt for v2.6.33
@ 2009-12-03 21:09 Ingo Molnar
  2009-12-08 21:34 ` Linus Torvalds
  0 siblings, 1 reply; 17+ messages in thread
From: Ingo Molnar @ 2009-12-03 21:09 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: linux-kernel, Thomas Gleixner, H. Peter Anvin, Andrew Morton

Linus,

Please pull the latest x86-paravirt-for-linus git tree from:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git x86-paravirt-for-linus

 Thanks,

	Ingo

------------------>
Jeremy Fitzhardinge (1):
      x86: unify sys_iopl


 arch/x86/include/asm/syscalls.h |    8 +++-----
 arch/x86/kernel/ioport.c        |   11 ++---------
 2 files changed, 5 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/syscalls.h b/arch/x86/include/asm/syscalls.h
index 372b76e..5336ce2 100644
--- a/arch/x86/include/asm/syscalls.h
+++ b/arch/x86/include/asm/syscalls.h
@@ -33,11 +33,11 @@ long sys_rt_sigreturn(struct pt_regs *);
 asmlinkage int sys_set_thread_area(struct user_desc __user *);
 asmlinkage int sys_get_thread_area(struct user_desc __user *);
 
-/* X86_32 only */
-#ifdef CONFIG_X86_32
 /* kernel/ioport.c */
-long sys_iopl(struct pt_regs *);
+asmlinkage long sys_iopl(unsigned int);
 
+/* X86_32 only */
+#ifdef CONFIG_X86_32
 /* kernel/process_32.c */
 int sys_clone(struct pt_regs *);
 int sys_execve(struct pt_regs *);
@@ -70,8 +70,6 @@ int sys_vm86(struct pt_regs *);
 #else /* CONFIG_X86_32 */
 
 /* X86_64 only */
-/* kernel/ioport.c */
-asmlinkage long sys_iopl(unsigned int, struct pt_regs *);
 
 /* kernel/process_64.c */
 asmlinkage long sys_clone(unsigned long, unsigned long,
diff --git a/arch/x86/kernel/ioport.c b/arch/x86/kernel/ioport.c
index 99c4d30..bad4f22 100644
--- a/arch/x86/kernel/ioport.c
+++ b/arch/x86/kernel/ioport.c
@@ -119,11 +119,10 @@ static int do_iopl(unsigned int level, struct pt_regs *regs)
 	return 0;
 }
 
-#ifdef CONFIG_X86_32
-long sys_iopl(struct pt_regs *regs)
+asmlinkage long sys_iopl(unsigned int level)
 {
-	unsigned int level = regs->bx;
 	struct thread_struct *t = &current->thread;
+	struct pt_regs *regs = task_pt_regs(current);
 	int rc;
 
 	rc = do_iopl(level, regs);
@@ -135,9 +134,3 @@ long sys_iopl(struct pt_regs *regs)
 out:
 	return rc;
 }
-#else
-asmlinkage long sys_iopl(unsigned int level, struct pt_regs *regs)
-{
-	return do_iopl(level, regs);
-}
-#endif

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [GIT PULL] x86/paravirt for v2.6.33
  2009-12-03 21:09 [GIT PULL] x86/paravirt for v2.6.33 Ingo Molnar
@ 2009-12-08 21:34 ` Linus Torvalds
  2009-12-09  7:36   ` Ingo Molnar
  2009-12-09 18:18   ` Jeremy Fitzhardinge
  0 siblings, 2 replies; 17+ messages in thread
From: Linus Torvalds @ 2009-12-08 21:34 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Thomas Gleixner, H. Peter Anvin, Andrew Morton



On Thu, 3 Dec 2009, Ingo Molnar wrote:
> 
> Please pull the latest x86-paravirt-for-linus git tree from:
> 
>    git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git x86-paravirt-for-linus

I _really_ don't like this:

> -long sys_iopl(struct pt_regs *regs)
> +asmlinkage long sys_iopl(unsigned int level)
>  {
> -	unsigned int level = regs->bx;
>  	struct thread_struct *t = &current->thread;
> +	struct pt_regs *regs = task_pt_regs(current);

I do _not_ want to add any more task_pt_regs() crap, please.

Why? It's wrong for at least vm86 mode (and from kernel system calls). 
Maybe we can't get into system calls from vm86 mode, and the kernel 
hopefully doesn't do those things anyway, but the point is, you chose the 
wrong way to go.

The old version that actually passed the stack frame was better. Why pick 
the inferior version?

		Linus

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [GIT PULL] x86/paravirt for v2.6.33
  2009-12-08 21:34 ` Linus Torvalds
@ 2009-12-09  7:36   ` Ingo Molnar
  2009-12-09 18:19     ` Jeremy Fitzhardinge
  2009-12-09 18:31     ` Jeremy Fitzhardinge
  2009-12-09 18:18   ` Jeremy Fitzhardinge
  1 sibling, 2 replies; 17+ messages in thread
From: Ingo Molnar @ 2009-12-09  7:36 UTC (permalink / raw)
  To: Linus Torvalds, Jeremy Fitzhardinge
  Cc: linux-kernel, Thomas Gleixner, H. Peter Anvin, Andrew Morton


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Thu, 3 Dec 2009, Ingo Molnar wrote:
> > 
> > Please pull the latest x86-paravirt-for-linus git tree from:
> > 
> >    git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git x86-paravirt-for-linus
> 
> I _really_ don't like this:
> 
> > -long sys_iopl(struct pt_regs *regs)
> > +asmlinkage long sys_iopl(unsigned int level)
> >  {
> > -	unsigned int level = regs->bx;
> >  	struct thread_struct *t = &current->thread;
> > +	struct pt_regs *regs = task_pt_regs(current);
> 
> I do _not_ want to add any more task_pt_regs() crap, please.
> 
> Why? It's wrong for at least vm86 mode (and from kernel system calls). 
> Maybe we can't get into system calls from vm86 mode, and the kernel 
> hopefully doesn't do those things anyway, but the point is, you chose 
> the wrong way to go.
> 
> The old version that actually passed the stack frame was better. Why 
> pick the inferior version?

Yeah, agreed. I missed that detail.

Jeremy, mind sending a patch that updates this code to use the less 
obfuscated 32-bit version, not the 64-bit version? (a delta patch 
against tip:master would be nice, as there's a fair amount of testing in 
the unification change itself already, which we dont want to discard.)

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [GIT PULL] x86/paravirt for v2.6.33
  2009-12-08 21:34 ` Linus Torvalds
  2009-12-09  7:36   ` Ingo Molnar
@ 2009-12-09 18:18   ` Jeremy Fitzhardinge
  2009-12-09 21:58     ` Linus Torvalds
  1 sibling, 1 reply; 17+ messages in thread
From: Jeremy Fitzhardinge @ 2009-12-09 18:18 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ingo Molnar, linux-kernel, Thomas Gleixner, H. Peter Anvin,
	Andrew Morton, Hugh Dickens

On 12/08/09 13:34, Linus Torvalds wrote:
> I do _not_ want to add any more task_pt_regs() crap, please.
>
> Why? It's wrong for at least vm86 mode (and from kernel system calls).
>    

Would the stack frame version work in these cases?

> Maybe we can't get into system calls from vm86 mode, and the kernel
> hopefully doesn't do those things anyway, but the point is, you chose the
> wrong way to go.
>    

iopl doesn't make much sense as a kernel-called syscall, unless the 
caller is intending to change the usermode iopl.  In which case, won't 
task_pt_regs() do the right thing - by pointing to the saved usermode 
register set - vs modifying the ptregs the caller may pass in?

iopl is also one of the special set of syscalls which get special 
handing in entry_*.S, so I don't think doing a direct call from within 
the kernel is ever sensible, and it should always be possible to make 
task_pt_regs return meaningful results.

I agree with you that vm86 would be a problem if its possible to call iopl.

> The old version that actually passed the stack frame was better. Why pick
> the inferior version?
>    

Mainly because it exposes the difference between the 32 and 64-bit ABIs, 
requiring separate code for each case; it seemed like an opportunity to 
remove the differences.

Anyway, I'll post a patch to revert to the pt_regs-based version shortly.

     J

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [GIT PULL] x86/paravirt for v2.6.33
  2009-12-09  7:36   ` Ingo Molnar
@ 2009-12-09 18:19     ` Jeremy Fitzhardinge
  2009-12-09 18:31     ` Jeremy Fitzhardinge
  1 sibling, 0 replies; 17+ messages in thread
From: Jeremy Fitzhardinge @ 2009-12-09 18:19 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, linux-kernel, Thomas Gleixner, H. Peter Anvin,
	Andrew Morton, Hugh Dickens

On 12/08/09 23:36, Ingo Molnar wrote:
>> The old version that actually passed the stack frame was better. Why
>> pick the inferior version?
>>      
> Yeah, agreed. I missed that detail.
>    

Which detail is that?  The whole patch? ;)

> Jeremy, mind sending a patch that updates this code to use the less
> obfuscated 32-bit version, not the 64-bit version? (a delta patch
> against tip:master would be nice, as there's a fair amount of testing in
> the unification change itself already, which we dont want to discard.)
>    

Sure.

But I'm not sure I understand the objection to task_pt_regs(); is it 
considered deprecated?   This patch received quite a lot of discussion 
with no mention of it.  Should we consider all its uses as suspect?

Would it be better to have something similar which just returns a 
pointer to the saved [re]flags, since that's all we care about?  That 
should be easier to make robust against

     J

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [GIT PULL] x86/paravirt for v2.6.33
  2009-12-09  7:36   ` Ingo Molnar
  2009-12-09 18:19     ` Jeremy Fitzhardinge
@ 2009-12-09 18:31     ` Jeremy Fitzhardinge
  2009-12-09 18:47       ` Linus Torvalds
  2009-12-09 18:49       ` H. Peter Anvin
  1 sibling, 2 replies; 17+ messages in thread
From: Jeremy Fitzhardinge @ 2009-12-09 18:31 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, linux-kernel, Thomas Gleixner, H. Peter Anvin,
	Andrew Morton

On 12/08/09 23:36, Ingo Molnar wrote:
> Jeremy, mind sending a patch that updates this code to use the less
> obfuscated 32-bit version, not the 64-bit version? (a delta patch
> against tip:master would be nice, as there's a fair amount of testing in
> the unification change itself already, which we dont want to discard.)
>    

How does this look?

git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen.git fix-iopl

From: Jeremy Fitzhardinge<jeremy.fitzhardinge@citrix.com>
Date: Wed, 9 Dec 2009 10:26:59 -0800
Subject: [PATCH] x86: Make sys_iopl use passed-in pt_regs

Rather than using task_pt_regs, use the pt_regs * passed into the syscall.
The ABI differences are handled in small 32/64-bit specific functions,
and everything else is handled in the common do_iopl().

Signed-off-by: Jeremy Fitzhardinge<jeremy.fitzhardinge@citrix.com>

diff --git a/arch/x86/include/asm/syscalls.h b/arch/x86/include/asm/syscalls.h
index 5336ce2..70497f0 100644
--- a/arch/x86/include/asm/syscalls.h
+++ b/arch/x86/include/asm/syscalls.h
@@ -33,11 +33,11 @@ long sys_rt_sigreturn(struct pt_regs *);
  asmlinkage int sys_set_thread_area(struct user_desc __user *);
  asmlinkage int sys_get_thread_area(struct user_desc __user *);

-/* kernel/ioport.c */
-asmlinkage long sys_iopl(unsigned int);
-
  /* X86_32 only */
  #ifdef CONFIG_X86_32
+/* kernel/ioport.c */
+asmlinkage long sys_iopl(struct pt_regs *);
+
  /* kernel/process_32.c */
  int sys_clone(struct pt_regs *);
  int sys_execve(struct pt_regs *);
@@ -71,6 +71,9 @@ int sys_vm86(struct pt_regs *);

  /* X86_64 only */

+/* kernel/ioport.c */
+asmlinkage long sys_iopl(unsigned int, struct pt_regs *);
+
  /* kernel/process_64.c */
  asmlinkage long sys_clone(unsigned long, unsigned long,
  			  void __user *, void __user *,
diff --git a/arch/x86/kernel/ioport.c b/arch/x86/kernel/ioport.c
index bad4f22..ac3cf88 100644
--- a/arch/x86/kernel/ioport.c
+++ b/arch/x86/kernel/ioport.c
@@ -105,6 +105,7 @@ asmlinkage long sys_ioperm(unsigned long from, unsigned long num, int turn_on)
   */
  static int do_iopl(unsigned int level, struct pt_regs *regs)
  {
+	struct thread_struct *t =&current->thread;
  	unsigned int old = (regs->flags>>  12)&  3;

  	if (level>  3)
@@ -116,21 +117,20 @@ static int do_iopl(unsigned int level, struct pt_regs *regs)
  	}
  	regs->flags = (regs->flags&  ~X86_EFLAGS_IOPL) | (level<<  12);

+	t->iopl = level<<  12;
+	set_iopl_mask(t->iopl);
+
  	return 0;
  }

-asmlinkage long sys_iopl(unsigned int level)
+#ifdef CONFIG_X86_64
+asmlinkage long sys_iopl(unsigned int level, struct pt_regs *regs)
  {
-	struct thread_struct *t =&current->thread;
-	struct pt_regs *regs = task_pt_regs(current);
-	int rc;
-
-	rc = do_iopl(level, regs);
-	if (rc<  0)
-		goto out;
-
-	t->iopl = level<<  12;
-	set_iopl_mask(t->iopl);
-out:
-	return rc;
+	return do_iopl(level, regs);
+}
+#else
+asmlinkage long sys_iopl(struct pt_regs *regs)
+{
+	return do_iopl(regs->bx, regs);
  }
+#endif



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [GIT PULL] x86/paravirt for v2.6.33
  2009-12-09 18:31     ` Jeremy Fitzhardinge
@ 2009-12-09 18:47       ` Linus Torvalds
  2009-12-09 18:54         ` H. Peter Anvin
  2009-12-09 19:32         ` Jeremy Fitzhardinge
  2009-12-09 18:49       ` H. Peter Anvin
  1 sibling, 2 replies; 17+ messages in thread
From: Linus Torvalds @ 2009-12-09 18:47 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Ingo Molnar, linux-kernel, Thomas Gleixner, H. Peter Anvin,
	Andrew Morton



On Wed, 9 Dec 2009, Jeremy Fitzhardinge wrote:
> 
> How does this look?

I would actually prefer it if the calling convention was just made to 
match on both x86 and x86-64. Wouldn't it be nice if they both just had

> +/* kernel/ioport.c */
> +asmlinkage long sys_iopl(unsigned int, struct pt_regs *);

as the prototype, and looked the same?

I realize that right now the 32-bit PTREGSCALL() thing doesn't support 
that (very different macros for entry.S x86-32 and -64), but isn't that 
just another thing we should try to fix too?

IOW, maybe something like this would be good, and would change the x86-32 
calling convention to match the x86-64 one?

NOTE NOTE NOTE! Totally untested. Is the second argument even in %edx? I 
don't remember, I didn't check, I'm just throwing this out as a "hey, 
maybe something _like_ this can work" patch, and will be immediately 
removing it from my machine after sending this email.

		Linus

---
 arch/x86/kernel/entry_32.S |   24 ++++++++++++------------
 1 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index 50b9c22..22b4431 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -725,22 +725,22 @@ END(syscall_badsys)
 /*
  * System calls that need a pt_regs pointer.
  */
-#define PTREGSCALL(name) \
+#define PTREGSCALL(name, reg) \
 	ALIGN; \
 ptregs_##name: \
-	leal 4(%esp),%eax; \
+	leal 4(%esp),reg; \
 	jmp sys_##name;
 
-PTREGSCALL(iopl)
-PTREGSCALL(fork)
-PTREGSCALL(clone)
-PTREGSCALL(vfork)
-PTREGSCALL(execve)
-PTREGSCALL(sigaltstack)
-PTREGSCALL(sigreturn)
-PTREGSCALL(rt_sigreturn)
-PTREGSCALL(vm86)
-PTREGSCALL(vm86old)
+PTREGSCALL(iopl,%edx)
+PTREGSCALL(fork,%eax)
+PTREGSCALL(clone,%eax)
+PTREGSCALL(vfork,%eax)
+PTREGSCALL(execve,%eax)
+PTREGSCALL(sigaltstack,%eax)
+PTREGSCALL(sigreturn,%eax)
+PTREGSCALL(rt_sigreturn,%eax)
+PTREGSCALL(vm86,%eax)
+PTREGSCALL(vm86old,%eax)
 
 .macro FIXUP_ESPFIX_STACK
 /*

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [GIT PULL] x86/paravirt for v2.6.33
  2009-12-09 18:31     ` Jeremy Fitzhardinge
  2009-12-09 18:47       ` Linus Torvalds
@ 2009-12-09 18:49       ` H. Peter Anvin
  1 sibling, 0 replies; 17+ messages in thread
From: H. Peter Anvin @ 2009-12-09 18:49 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Ingo Molnar, Linus Torvalds, linux-kernel, Thomas Gleixner,
	Andrew Morton

On 12/09/2009 10:31 AM, Jeremy Fitzhardinge wrote:
> 
> How does this look?
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen.git fix-iopl
> 
> From: Jeremy Fitzhardinge<jeremy.fitzhardinge@citrix.com>
> Date: Wed, 9 Dec 2009 10:26:59 -0800
> Subject: [PATCH] x86: Make sys_iopl use passed-in pt_regs
> 
> Rather than using task_pt_regs, use the pt_regs * passed into the syscall.
> The ABI differences are handled in small 32/64-bit specific functions,
> and everything else is handled in the common do_iopl().
> 
> Signed-off-by: Jeremy Fitzhardinge<jeremy.fitzhardinge@citrix.com>
> 
> diff --git a/arch/x86/include/asm/syscalls.h b/arch/x86/include/asm/syscalls.h
> index 5336ce2..70497f0 100644
> --- a/arch/x86/include/asm/syscalls.h
> +++ b/arch/x86/include/asm/syscalls.h
> @@ -33,11 +33,11 @@ long sys_rt_sigreturn(struct pt_regs *);
>   asmlinkage int sys_set_thread_area(struct user_desc __user *);
>   asmlinkage int sys_get_thread_area(struct user_desc __user *);
> 
> -/* kernel/ioport.c */
> -asmlinkage long sys_iopl(unsigned int);
> -
>   /* X86_32 only */
>   #ifdef CONFIG_X86_32
> +/* kernel/ioport.c */
> +asmlinkage long sys_iopl(struct pt_regs *);
> +
>   /* kernel/process_32.c */

This is the main ugliness that led me to pass up on this in the first
place.  What it really comes down to is that on 32 bits we need the
analogue with what 64 bits have with different stubs for different
number of arguments.

	-hpa

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [GIT PULL] x86/paravirt for v2.6.33
  2009-12-09 18:47       ` Linus Torvalds
@ 2009-12-09 18:54         ` H. Peter Anvin
  2009-12-09 19:08           ` Linus Torvalds
  2009-12-09 19:25           ` Brian Gerst
  2009-12-09 19:32         ` Jeremy Fitzhardinge
  1 sibling, 2 replies; 17+ messages in thread
From: H. Peter Anvin @ 2009-12-09 18:54 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jeremy Fitzhardinge, Ingo Molnar, linux-kernel, Thomas Gleixner,
	Andrew Morton

On 12/09/2009 10:47 AM, Linus Torvalds wrote:
> 
> NOTE NOTE NOTE! Totally untested. Is the second argument even in %edx? I 
> don't remember, I didn't check, I'm just throwing this out as a "hey, 
> maybe something _like_ this can work" patch, and will be immediately 
> removing it from my machine after sending this email.
> 

The second argument is in %edx, but unlike 64 bits, it is not loaded
into that register a priory ("asmlinkage" means arguments are on the stack.)

As such, we need something looking like:

#define PTREGSCALL0(name)	\
ptregs_##name:			\
	leal	4(%esp),%eax;	\
	jmp sys_##name

#define PTREGSCALL1(name)	\
ptregs_##name:			\
	movl	4(%esp),%eax;	\
	leal	4(%esp),%edx;	\
	jmp sys_##name

#define PTREGSCALL2(name)	\
ptregs_##name:			\
	movl	4(%esp),%eax;	\
	movl	8(%esp),%edx;	\
	leal	4(%esp),%ecx;	\
	jmp sys_##name

If we need more than two arguments + pt_regs, then we have to set up a
temporary stack frame.

[Sorry, I'm sitting in a meeting so I can't actually write up a real patch]

	-hpa

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [GIT PULL] x86/paravirt for v2.6.33
  2009-12-09 18:54         ` H. Peter Anvin
@ 2009-12-09 19:08           ` Linus Torvalds
  2009-12-09 19:25           ` Brian Gerst
  1 sibling, 0 replies; 17+ messages in thread
From: Linus Torvalds @ 2009-12-09 19:08 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Jeremy Fitzhardinge, Ingo Molnar, linux-kernel, Thomas Gleixner,
	Andrew Morton



On Wed, 9 Dec 2009, H. Peter Anvin wrote:
>
> The second argument is in %edx, but unlike 64 bits, it is not loaded
> into that register a priory ("asmlinkage" means arguments are on the stack.)

Oh, I missed the fact that we don't actually use asmlinkage on 
sys_iopl() at all on x86-32 for this very reason.

And on x86-64, I think asmlinkage is a no-op, so that's ok - we should 
just make the prototype be

	long sys_iopl(long level, struct pt_regs *regs);

and your fancier macros.

			Linus



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [GIT PULL] x86/paravirt for v2.6.33
  2009-12-09 18:54         ` H. Peter Anvin
  2009-12-09 19:08           ` Linus Torvalds
@ 2009-12-09 19:25           ` Brian Gerst
  2009-12-09 19:35             ` H. Peter Anvin
  1 sibling, 1 reply; 17+ messages in thread
From: Brian Gerst @ 2009-12-09 19:25 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Linus Torvalds, Jeremy Fitzhardinge, Ingo Molnar, linux-kernel,
	Thomas Gleixner, Andrew Morton

On Wed, Dec 9, 2009 at 1:54 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 12/09/2009 10:47 AM, Linus Torvalds wrote:
>>
>> NOTE NOTE NOTE! Totally untested. Is the second argument even in %edx? I
>> don't remember, I didn't check, I'm just throwing this out as a "hey,
>> maybe something _like_ this can work" patch, and will be immediately
>> removing it from my machine after sending this email.
>>
>
> The second argument is in %edx, but unlike 64 bits, it is not loaded
> into that register a priory ("asmlinkage" means arguments are on the stack.)
>
> As such, we need something looking like:
>
> #define PTREGSCALL0(name)       \
> ptregs_##name:                  \
>        leal    4(%esp),%eax;   \
>        jmp sys_##name
>
> #define PTREGSCALL1(name)       \
> ptregs_##name:                  \
>        movl    4(%esp),%eax;   \
>        leal    4(%esp),%edx;   \
>        jmp sys_##name
>
> #define PTREGSCALL2(name)       \
> ptregs_##name:                  \
>        movl    4(%esp),%eax;   \
>        movl    8(%esp),%edx;   \
>        leal    4(%esp),%ecx;   \
>        jmp sys_##name
>
> If we need more than two arguments + pt_regs, then we have to set up a
> temporary stack frame.
>
> [Sorry, I'm sitting in a meeting so I can't actually write up a real patch]
>
>        -hpa

I had started working on a patchset like this when this issue came up
the first time.  I'll get it up to date and send it out.  It would be
nice if this could be done for all syscalls, so that
asmlinkage_protect() isn't needed.

--
Brian Gerst

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [GIT PULL] x86/paravirt for v2.6.33
  2009-12-09 18:47       ` Linus Torvalds
  2009-12-09 18:54         ` H. Peter Anvin
@ 2009-12-09 19:32         ` Jeremy Fitzhardinge
  2009-12-09 20:05           ` H. Peter Anvin
  1 sibling, 1 reply; 17+ messages in thread
From: Jeremy Fitzhardinge @ 2009-12-09 19:32 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ingo Molnar, linux-kernel, Thomas Gleixner, H. Peter Anvin,
	Andrew Morton

On 12/09/09 10:47, Linus Torvalds wrote:
>
> On Wed, 9 Dec 2009, Jeremy Fitzhardinge wrote:
>    
>> How does this look?
>>      
> I would actually prefer it if the calling convention was just made to
> match on both x86 and x86-64. Wouldn't it be nice if they both just had
>
>    
>> +/* kernel/ioport.c */
>> +asmlinkage long sys_iopl(unsigned int, struct pt_regs *);
>>      
> as the prototype, and looked the same?
>
> I realize that right now the 32-bit PTREGSCALL() thing doesn't support
> that (very different macros for entry.S x86-32 and -64), but isn't that
> just another thing we should try to fix too?
>
> IOW, maybe something like this would be good, and would change the x86-32
> calling convention to match the x86-64 one?
>
> NOTE NOTE NOTE! Totally untested. Is the second argument even in %edx? I
> don't remember, I didn't check, I'm just throwing this out as a "hey,
> maybe something _like_ this can work" patch, and will be immediately
> removing it from my machine after sending this email.
>    

I came up with this:

From: Jeremy Fitzhardinge<jeremy.fitzhardinge@citrix.com>
Date: Wed, 9 Dec 2009 11:17:52 -0800
Subject: [PATCH] x86/iopl: make 32bit iopl also get level argument

This makes it match the 64-bit prototype, and simplifies the whole thing.

Signed-off-by: Jeremy Fitzhardinge<jeremy.fitzhardinge@citrix.com>

diff --git a/arch/x86/include/asm/syscalls.h b/arch/x86/include/asm/syscalls.h
index 70497f0..4e567d5 100644
--- a/arch/x86/include/asm/syscalls.h
+++ b/arch/x86/include/asm/syscalls.h
@@ -18,6 +18,7 @@
  /* Common in X86_32 and X86_64 */
  /* kernel/ioport.c */
  asmlinkage long sys_ioperm(unsigned long, unsigned long, int);
+long sys_iopl(unsigned int, struct pt_regs *);

  /* kernel/process.c */
  int sys_fork(struct pt_regs *);
@@ -36,7 +37,6 @@ asmlinkage int sys_get_thread_area(struct user_desc __user *);
  /* X86_32 only */
  #ifdef CONFIG_X86_32
  /* kernel/ioport.c */
-asmlinkage long sys_iopl(struct pt_regs *);

  /* kernel/process_32.c */
  int sys_clone(struct pt_regs *);
@@ -71,9 +71,6 @@ int sys_vm86(struct pt_regs *);

  /* X86_64 only */

-/* kernel/ioport.c */
-asmlinkage long sys_iopl(unsigned int, struct pt_regs *);
-
  /* kernel/process_64.c */
  asmlinkage long sys_clone(unsigned long, unsigned long,
  			  void __user *, void __user *,
diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index 50b9c22..737b81f 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -725,13 +725,15 @@ END(syscall_badsys)
  /*
   * System calls that need a pt_regs pointer.
   */
-#define PTREGSCALL(name) \
+#define PTREGSCALL_ARG(name,arg) \
  	ALIGN; \
  ptregs_##name: \
-	leal 4(%esp),%eax; \
+	leal 4(%esp),arg; \
  	jmp sys_##name;
-
-PTREGSCALL(iopl)
+#define PTREGSCALL(name) \
+	PTREGSCALL_ARG(name, %eax)
+	
+PTREGSCALL_ARG(iopl,%edx)
  PTREGSCALL(fork)
  PTREGSCALL(clone)
  PTREGSCALL(vfork)
diff --git a/arch/x86/kernel/ioport.c b/arch/x86/kernel/ioport.c
index b1cbac5..f435e42 100644
--- a/arch/x86/kernel/ioport.c
+++ b/arch/x86/kernel/ioport.c
@@ -103,7 +103,7 @@ asmlinkage long sys_ioperm(unsigned long from, unsigned long num, int turn_on)
   * on system-call entry - see also fork() and the signal handling
   * code.
   */
-static int do_iopl(unsigned int level, struct pt_regs *regs)
+long sys_iopl(unsigned int level, struct pt_regs *regs)
  {
  	struct thread_struct *t =&current->thread;
  	unsigned int old = (regs->flags>>  12)&  3;
@@ -122,15 +122,3 @@ static int do_iopl(unsigned int level, struct pt_regs *regs)

  	return 0;
  }
-
-#ifdef CONFIG_X86_64
-long sys_iopl(unsigned int level, struct pt_regs *regs)
-{
-	return do_iopl(level, regs);
-}
-#else
-long sys_iopl(struct pt_regs *regs)
-{
-	return do_iopl(regs->bx, regs);
-}
-#endif



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [GIT PULL] x86/paravirt for v2.6.33
  2009-12-09 19:25           ` Brian Gerst
@ 2009-12-09 19:35             ` H. Peter Anvin
  0 siblings, 0 replies; 17+ messages in thread
From: H. Peter Anvin @ 2009-12-09 19:35 UTC (permalink / raw)
  To: Brian Gerst
  Cc: Linus Torvalds, Jeremy Fitzhardinge, Ingo Molnar, linux-kernel,
	Thomas Gleixner, Andrew Morton

On 12/09/2009 11:25 AM, Brian Gerst wrote:
> 
> I had started working on a patchset like this when this issue came up
> the first time.  I'll get it up to date and send it out.  It would be
> nice if this could be done for all syscalls, so that
> asmlinkage_protect() isn't needed.
> 

asmlinkage_protect() is evil in the extreme and really should be killed.

In fact, I started working on a patchset to eliminate asmlinkage completely.

	-hpa

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [GIT PULL] x86/paravirt for v2.6.33
  2009-12-09 19:32         ` Jeremy Fitzhardinge
@ 2009-12-09 20:05           ` H. Peter Anvin
  0 siblings, 0 replies; 17+ messages in thread
From: H. Peter Anvin @ 2009-12-09 20:05 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Linus Torvalds, Ingo Molnar, linux-kernel, Thomas Gleixner,
	Andrew Morton

On 12/09/2009 11:32 AM, Jeremy Fitzhardinge wrote:
>   /*
>    * System calls that need a pt_regs pointer.
>    */
> -#define PTREGSCALL(name) \
> +#define PTREGSCALL_ARG(name,arg) \
>   	ALIGN; \
>   ptregs_##name: \
> -	leal 4(%esp),%eax; \
> +	leal 4(%esp),arg; \
>   	jmp sys_##name;
> -

I believe this will not work for the reason outlined in the other post.

	-hpa

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [GIT PULL] x86/paravirt for v2.6.33
  2009-12-09 18:18   ` Jeremy Fitzhardinge
@ 2009-12-09 21:58     ` Linus Torvalds
  0 siblings, 0 replies; 17+ messages in thread
From: Linus Torvalds @ 2009-12-09 21:58 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Ingo Molnar, linux-kernel, Thomas Gleixner, H. Peter Anvin,
	Andrew Morton, Hugh Dickens



On Wed, 9 Dec 2009, Jeremy Fitzhardinge wrote:

> On 12/08/09 13:34, Linus Torvalds wrote:
> > I do _not_ want to add any more task_pt_regs() crap, please.
> > 
> > Why? It's wrong for at least vm86 mode (and from kernel system calls).
> >    
> 
> Would the stack frame version work in these cases?

It would "work" in the sense that at least it wouldn't corrupt the "outer" 
stack frame - it would only change the inner one. For vm86 mode, that 
would actually matter (iopl is meaningful), but as Peter also said, I 
don't think we actually allow direct system calls from vm86 mode.

For me it's actually more of a conceptual complaint: I really think 
'task_pt_regs()' is only reliable for ptrace and is simply _wrong_ in 
other situations. On other architectures, you literally need to set up the 
stack _differently_ on the signal handling path - which is what ptrace 
does - than on regular system call paths.

So conceptually, the system call stack layout is simply _different_ than 
the ptrace stack. And I'd hate to have x86 code that teaches people to do 
things that really don't work in general.

			Linus

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [GIT PULL] x86/paravirt for v2.6.33
  2009-12-09 18:29 H. Peter Anvin
@ 2009-12-09 18:38 ` Linus Torvalds
  0 siblings, 0 replies; 17+ messages in thread
From: Linus Torvalds @ 2009-12-09 18:38 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Jeremy Fitzhardinge, Ingo Molnar, linux-kernel, Thomas Gleixner,
	Andrew Morton, Hugh Dickens



On Wed, 9 Dec 2009, H. Peter Anvin wrote:
> "Jeremy Fitzhardinge" <jeremy@goop.org> wrote:
> >
> >But I'm not sure I understand the objection to task_pt_regs(); is it 
> >considered deprecated?   This patch received quite a lot of discussion 
> >with no mention of it.  Should we consider all its uses as suspect?

I personally would be much happier without ever seeing a single 
task_pt_regs() call outside of pure ptrace() users (and quite frankly, 
even there I'd be happier if it was basically done once, and then 'struct 
pt_regs' being passed around as an argument as much as possible).

In the case of 'ptrace' we control the stack of the tracee. 

In other cases, we do _not_.

For example, think about us ever implementing 'tasklets' or async system 
calls in general. It's _very_ possible that we'd have a special stolen 
stack for them, and run the low-level system call function from that 
stack. Then something like task_pt_regs() migth very well do the wrong 
thing.

Now, I'm not saying that 'sys_iopl()' would ever be a valid target for 
async system calls, but I _am_ saying that system calls that depend on 
task_pt_regs() are fundamnetally fragile and broken, and have subtle 
interactions.

In contrast, if you make it very explicit that the system call gets passed 
a pointer to its pt_regs, then it still has all the same issues, but now 
it's not subtle any more - now it's explicitly encoded in the call 
sequence on both the caller and callee sides, and somebody doing tasklets 
would hopefully see that "oh, iopl() has that same special thing that 
fork/clone/vfork/execve has, and touches the stack frame register set 
explicitly".

			Linus

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [GIT PULL] x86/paravirt for v2.6.33
@ 2009-12-09 18:29 H. Peter Anvin
  2009-12-09 18:38 ` Linus Torvalds
  0 siblings, 1 reply; 17+ messages in thread
From: H. Peter Anvin @ 2009-12-09 18:29 UTC (permalink / raw)
  To: Jeremy Fitzhardinge, Ingo Molnar
  Cc: Linus Torvalds, linux-kernel, Thomas Gleixner, Andrew Morton,
	Hugh Dickens

[-- Attachment #1: Type: text/plain, Size: 1543 bytes --]

Linus clearly prefers the style with pt_regs passed on the stack as the sole form.  Since we *have* to use that form for things like clone(), it makes sense to use it as the only form.

For what it's worth I did look at this when the patch first came up; it does make the individual patch a fair bit uglier, but I can understand Linus' consistency argument.

As far as I know, we don't allow any system calls from inside v86 mode.


	-hpa

"Jeremy Fitzhardinge" <jeremy@goop.org> wrote:

>On 12/08/09 23:36, Ingo Molnar wrote:
>>> The old version that actually passed the stack frame was better. Why
>>> pick the inferior version?
>>>      
>> Yeah, agreed. I missed that detail.
>>    
>
>Which detail is that?  The whole patch? ;)
>
>> Jeremy, mind sending a patch that updates this code to use the less
>> obfuscated 32-bit version, not the 64-bit version? (a delta patch
>> against tip:master would be nice, as there's a fair amount of testing in
>> the unification change itself already, which we dont want to discard.)
>>    
>
>Sure.
>
>But I'm not sure I understand the objection to task_pt_regs(); is it 
>considered deprecated?   This patch received quite a lot of discussion 
>with no mention of it.  Should we consider all its uses as suspect?
>
>Would it be better to have something similar which just returns a 
>pointer to the saved [re]flags, since that's all we care about?  That 
>should be easier to make robust against
>
>     J

--
Sent from my mobile phone. Please excuse any lack of formatting.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2009-12-09 21:58 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-12-03 21:09 [GIT PULL] x86/paravirt for v2.6.33 Ingo Molnar
2009-12-08 21:34 ` Linus Torvalds
2009-12-09  7:36   ` Ingo Molnar
2009-12-09 18:19     ` Jeremy Fitzhardinge
2009-12-09 18:31     ` Jeremy Fitzhardinge
2009-12-09 18:47       ` Linus Torvalds
2009-12-09 18:54         ` H. Peter Anvin
2009-12-09 19:08           ` Linus Torvalds
2009-12-09 19:25           ` Brian Gerst
2009-12-09 19:35             ` H. Peter Anvin
2009-12-09 19:32         ` Jeremy Fitzhardinge
2009-12-09 20:05           ` H. Peter Anvin
2009-12-09 18:49       ` H. Peter Anvin
2009-12-09 18:18   ` Jeremy Fitzhardinge
2009-12-09 21:58     ` Linus Torvalds
2009-12-09 18:29 H. Peter Anvin
2009-12-09 18:38 ` Linus Torvalds

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).