linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* context switch code question
@ 2003-04-17 21:19 Robert Schweikert
  2003-04-24  8:11 ` Gabriel Paubert
  0 siblings, 1 reply; 3+ messages in thread
From: Robert Schweikert @ 2003-04-17 21:19 UTC (permalink / raw)
  To: linux-kernel; +Cc: Robert Schweikert

Can someone please point me to the context switching code. I am
interested in the context switch structure and the values that are
saved. I am chasing a weird problem with some numerical code that uses
mmx instructions to get flush to zero to work. Specifically I am calling
the

_MM_SET_FLUSH_TO_ZERO_MODE

macro which in turn ends up calling _mm_setcsr(), wherever that might be
implemented.

What I am trying to figure out is a.) is this register value properly
set/reset during context switch and b.) is this particular register
properly transfered when the process gets moved to another CPU. 

I assume that this is a configurable option in the kernel since PII and
Pentium do not have this instruction, and thus it could just be that my
kernel is not configured correctly.

I tried to find this type of info without posting but failed.

Thanks,
Robert
-- 
Robert Schweikert <Robert.Schweikert@abaqus.com>
ABAQUS

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: context switch code question
  2003-04-17 21:19 context switch code question Robert Schweikert
@ 2003-04-24  8:11 ` Gabriel Paubert
  2003-04-24 10:33   ` Robert Schweikert
  0 siblings, 1 reply; 3+ messages in thread
From: Gabriel Paubert @ 2003-04-24  8:11 UTC (permalink / raw)
  To: Robert Schweikert; +Cc: linux-kernel, Robert Schweikert

On Thu, Apr 17, 2003 at 05:19:32PM -0400, Robert Schweikert wrote:
> Can someone please point me to the context switching code. I am
> interested in the context switch structure and the values that are
> saved. I am chasing a weird problem with some numerical code that uses
> mmx instructions to get flush to zero to work. Specifically I am calling
> the
> 
> _MM_SET_FLUSH_TO_ZERO_MODE
> 
> macro which in turn ends up calling _mm_setcsr(), wherever that might be
> implemented.
> 
> What I am trying to figure out is a.) is this register value properly
> set/reset during context switch and b.) is this particular register
> properly transfered when the process gets moved to another CPU. 

Well, there is at least one bug in arch/i386/kernel/i387.c regarding the
handling of mxcsr on processors with SSE2 extensions. A new mxcsr bit 
(bit 6, denormals are zero or DAZ) was defined by Intel which will be 
cleared under you when you get a signal and with some ptrace operations.

The following patch should fix this, but I'm not sure that it is your
problem, and the behaviour of SSE code will vary depending on whether
the processor has SSE2 (but blame Intel for that, not me ;-)). You don't
say which kernel you are using, so I made the patch against a very
recent pull from the from 2.5 tree, but it is trivially portable to 2.4 
and might even apply as is.


===== arch/i386/kernel/i387.c 1.16 vs edited =====
--- 1.16/arch/i386/kernel/i387.c	Wed Apr  9 07:45:37 2003
+++ edited/arch/i386/kernel/i387.c	Thu Apr 24 09:52:42 2003
@@ -25,6 +25,12 @@
 #define HAVE_HWFP 1
 #endif
 
+/* mxcsr 31-16 must be zero for security reasons,
+ * bit 6 unfortunately depends on cpu features...
+ */
+#define MXSCR_MASK (cpu_has_sse2 ? 0xffff : 0xffbf )
+
+
 /*
  * The _current_ task is using the FPU for the first time
  * so initialize it and set the mxcsr to its default
@@ -208,7 +214,7 @@
 void set_fpu_mxcsr( struct task_struct *tsk, unsigned short mxcsr )
 {
 	if ( cpu_has_xmm ) {
-		tsk->thread.i387.fxsave.mxcsr = (mxcsr & 0xffbf);
+		tsk->thread.i387.fxsave.mxcsr = (mxcsr & MXCSR_MASK);
 	}
 }
 
@@ -356,8 +362,7 @@
 	clear_fpu( tsk );
 	err = __copy_from_user( &tsk->thread.i387.fxsave, &buf->_fxsr_env[0],
 				sizeof(struct i387_fxsave_struct) );
-	/* mxcsr bit 6 and 31-16 must be zero for security reasons */
-	tsk->thread.i387.fxsave.mxcsr &= 0xffbf;
+	tsk->thread.i387.fxsave.mxcsr &= MXCSR_MASK;
 	return err ? 1 : convert_fxsr_from_user( &tsk->thread.i387.fxsave, buf );
 }
 
@@ -455,8 +460,7 @@
 	if ( cpu_has_fxsr ) {
 		__copy_from_user( &tsk->thread.i387.fxsave, buf,
 				  sizeof(struct user_fxsr_struct) );
-		/* mxcsr bit 6 and 31-16 must be zero for security reasons */
-		tsk->thread.i387.fxsave.mxcsr &= 0xffbf;
+		tsk->thread.i387.fxsave.mxcsr &= MXCSR_MASK;
 		return 0;
 	} else {
 		return -EIO;

The code in arch/x86-64/ia32/fpu32.c also has a couple of 0xffbf
masks, maybe they should really be 0xffff. But I lost my x86-64 doc
in a disk crash and have not yet downloaded it again.

	Regards,
	Gabriel

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: context switch code question
  2003-04-24  8:11 ` Gabriel Paubert
@ 2003-04-24 10:33   ` Robert Schweikert
  0 siblings, 0 replies; 3+ messages in thread
From: Robert Schweikert @ 2003-04-24 10:33 UTC (permalink / raw)
  To: Gabriel Paubert; +Cc: Robert Schweikert, linux kernel

Gabriel,

I am using the 2.4.20 kernel and will apply the patch and test the code.
The problem is twofold.

First bit for bit repeatability when running an application that
produces denormalized numbers. By setting flush to zero these should all
be zero and thus rounding in the least significant bit should not effect
us.

Second the following test case fails

#define FPE_SMALL_VALUE        5e-21

float sqrd(float v) { return v*v; }

int main (int argc, char** argv)
{
  _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
  dumm = sqrd(FPE_SMALL_VALUE);
  cerr << " before dumm = " << dumm << '\n';
}

dumm is not zero

dumm = 2.50006e-41

If I print the bit pattern I get the following

dumm hex = 0x45B1

This is compiled with the Intel compiler, although that should not make
a difference here.

Thanks,
Robert

On Thu, 2003-04-24 at 04:11, Gabriel Paubert wrote:
> On Thu, Apr 17, 2003 at 05:19:32PM -0400, Robert Schweikert wrote:
> > Can someone please point me to the context switching code. I am
> > interested in the context switch structure and the values that are
> > saved. I am chasing a weird problem with some numerical code that uses
> > mmx instructions to get flush to zero to work. Specifically I am calling
> > the
> > 
> > _MM_SET_FLUSH_TO_ZERO_MODE
> > 
> > macro which in turn ends up calling _mm_setcsr(), wherever that might be
> > implemented.
> > 
> > What I am trying to figure out is a.) is this register value properly
> > set/reset during context switch and b.) is this particular register
> > properly transfered when the process gets moved to another CPU. 
> 
> Well, there is at least one bug in arch/i386/kernel/i387.c regarding the
> handling of mxcsr on processors with SSE2 extensions. A new mxcsr bit 
> (bit 6, denormals are zero or DAZ) was defined by Intel which will be 
> cleared under you when you get a signal and with some ptrace operations.
> 
> The following patch should fix this, but I'm not sure that it is your
> problem, and the behaviour of SSE code will vary depending on whether
> the processor has SSE2 (but blame Intel for that, not me ;-)). You don't
> say which kernel you are using, so I made the patch against a very
> recent pull from the from 2.5 tree, but it is trivially portable to 2.4 
> and might even apply as is.
> 
> 
> ===== arch/i386/kernel/i387.c 1.16 vs edited =====
> --- 1.16/arch/i386/kernel/i387.c	Wed Apr  9 07:45:37 2003
> +++ edited/arch/i386/kernel/i387.c	Thu Apr 24 09:52:42 2003
> @@ -25,6 +25,12 @@
>  #define HAVE_HWFP 1
>  #endif
>  
> +/* mxcsr 31-16 must be zero for security reasons,
> + * bit 6 unfortunately depends on cpu features...
> + */
> +#define MXSCR_MASK (cpu_has_sse2 ? 0xffff : 0xffbf )
> +
> +
>  /*
>   * The _current_ task is using the FPU for the first time
>   * so initialize it and set the mxcsr to its default
> @@ -208,7 +214,7 @@
>  void set_fpu_mxcsr( struct task_struct *tsk, unsigned short mxcsr )
>  {
>  	if ( cpu_has_xmm ) {
> -		tsk->thread.i387.fxsave.mxcsr = (mxcsr & 0xffbf);
> +		tsk->thread.i387.fxsave.mxcsr = (mxcsr & MXCSR_MASK);
>  	}
>  }
>  
> @@ -356,8 +362,7 @@
>  	clear_fpu( tsk );
>  	err = __copy_from_user( &tsk->thread.i387.fxsave, &buf->_fxsr_env[0],
>  				sizeof(struct i387_fxsave_struct) );
> -	/* mxcsr bit 6 and 31-16 must be zero for security reasons */
> -	tsk->thread.i387.fxsave.mxcsr &= 0xffbf;
> +	tsk->thread.i387.fxsave.mxcsr &= MXCSR_MASK;
>  	return err ? 1 : convert_fxsr_from_user( &tsk->thread.i387.fxsave, buf );
>  }
>  
> @@ -455,8 +460,7 @@
>  	if ( cpu_has_fxsr ) {
>  		__copy_from_user( &tsk->thread.i387.fxsave, buf,
>  				  sizeof(struct user_fxsr_struct) );
> -		/* mxcsr bit 6 and 31-16 must be zero for security reasons */
> -		tsk->thread.i387.fxsave.mxcsr &= 0xffbf;
> +		tsk->thread.i387.fxsave.mxcsr &= MXCSR_MASK;
>  		return 0;
>  	} else {
>  		return -EIO;
> 
> The code in arch/x86-64/ia32/fpu32.c also has a couple of 0xffbf
> masks, maybe they should really be 0xffff. But I lost my x86-64 doc
> in a disk crash and have not yet downloaded it again.
> 
> 	Regards,
> 	Gabriel


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2003-04-24 10:21 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-04-17 21:19 context switch code question Robert Schweikert
2003-04-24  8:11 ` Gabriel Paubert
2003-04-24 10:33   ` Robert Schweikert

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).