* [PATCH v2 0/2] x86: also use alternative asm on xsave side
@ 2016-02-03 12:32 Jan Beulich
  2016-02-03 12:39 ` [PATCH v2 1/2] x86: support 2- and 3-way alternatives Jan Beulich
  2016-02-03 12:39 ` [PATCH v2 2/2] x86/xstate: also use alternative asm on xsave side Jan Beulich
  0 siblings, 2 replies; 9+ messages in thread
From: Jan Beulich @ 2016-02-03 12:32 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Keir Fraser, Shuai Ruan

1: support 2- and 3-way alternatives
2: xstate: also use alternative asm on xsave side

Signed-off-by: Jan Beulich <jbeulich@suse.com>


* [PATCH v2 1/2] x86: support 2- and 3-way alternatives
  2016-02-03 12:32 [PATCH v2 0/2] x86: also use alternative asm on xsave side Jan Beulich
@ 2016-02-03 12:39 ` Jan Beulich
  2016-02-03 13:15   ` Andrew Cooper
  2016-02-03 12:39 ` [PATCH v2 2/2] x86/xstate: also use alternative asm on xsave side Jan Beulich
  1 sibling, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2016-02-03 12:39 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Keir Fraser, Shuai Ruan


Parts taken from Linux, but implementing the ALTERNATIVE*() macros
recursively to avoid needless redundancy.

Also make the .discard section non-writable (we might even consider
dropping its alloc flag too) and limit the pushing and popping of
sections.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/include/asm-x86/alternative.h
+++ b/xen/include/asm-x86/alternative.h
@@ -46,18 +46,28 @@ extern void alternative_instructions(voi
 #define ALTINSTR_REPLACEMENT(newinstr, feature, number) /* replacement */     \
         b_replacement(number)":\n\t" newinstr "\n" e_replacement(number) ":\n\t"
 
+#define ALTERNATIVE_N(newinstr, feature, number)	\
+        ".pushsection .altinstructions,\"a\"\n"		\
+        ALTINSTR_ENTRY(feature, number)			\
+        ".section .discard,\"a\",@progbits\n"		\
+        DISCARD_ENTRY(number)				\
+        ".section .altinstr_replacement, \"ax\"\n"	\
+        ALTINSTR_REPLACEMENT(newinstr, feature, number)	\
+        ".popsection\n"
+
 /* alternative assembly primitive: */
-#define ALTERNATIVE(oldinstr, newinstr, feature)                        \
-        OLDINSTR(oldinstr)                                              \
-        ".pushsection .altinstructions,\"a\"\n"                         \
-        ALTINSTR_ENTRY(feature, 1)                                      \
-        ".popsection\n"                                                 \
-        ".pushsection .discard,\"aw\",@progbits\n"                      \
-        DISCARD_ENTRY(1)                                                \
-        ".popsection\n"                                                 \
-        ".pushsection .altinstr_replacement, \"ax\"\n"                  \
-        ALTINSTR_REPLACEMENT(newinstr, feature, 1)                      \
-        ".popsection"
+#define ALTERNATIVE(oldinstr, newinstr, feature)			  \
+        OLDINSTR(oldinstr)						  \
+	ALTERNATIVE_N(newinstr, feature, 1)
+
+#define ALTERNATIVE_2(oldinstr, newinstr1, feature1, newinstr2, feature2) \
+        ALTERNATIVE(oldinstr, newinstr1, feature1)			  \
+	ALTERNATIVE_N(newinstr2, feature2, 2)
+
+#define ALTERNATIVE_3(oldinstr, newinstr1, feature1, newinstr2, feature2, \
+		      newinstr3, feature3)				  \
+        ALTERNATIVE_2(oldinstr, newinstr1, feature1, newinstr2, feature2) \
+	ALTERNATIVE_N(newinstr3, feature3, 3)
 
 /*
  * Alternative instructions for different CPU types or capabilities.
@@ -93,6 +103,37 @@ extern void alternative_instructions(voi
 	asm volatile (ALTERNATIVE(oldinstr, newinstr, feature)		\
 		      : output : input)
 
+/*
+ * This is similar to alternative_io. But it has two features and
+ * respective instructions.
+ *
+ * If CPU has feature2, newinstr2 is used.
+ * Otherwise, if CPU has feature1, newinstr1 is used.
+ * Otherwise, oldinstr is used.
+ */
+#define alternative_io_2(oldinstr, newinstr1, feature1, newinstr2,	\
+			 feature2, output, input...)			\
+	asm volatile(ALTERNATIVE_2(oldinstr, newinstr1, feature1,	\
+				   newinstr2, feature2)			\
+		     : output : input)
+
+/*
+ * This is similar to alternative_io. But it has three features and
+ * respective instructions.
+ *
+ * If CPU has feature3, newinstr3 is used.
+ * Otherwise, if CPU has feature2, newinstr2 is used.
+ * Otherwise, if CPU has feature1, newinstr1 is used.
+ * Otherwise, oldinstr is used.
+ */
+#define alternative_io_3(oldinstr, newinstr1, feature1, newinstr2,	\
+			 feature2, newinstr3, feature3, output,		\
+			 input...)					\
+	asm volatile(ALTERNATIVE_3(oldinstr, newinstr1, feature1,	\
+				   newinstr2, feature2, newinstr3,	\
+				   feature3)				\
+		     : output : input)
+
 /* Use this macro(s) if you need more than one output parameter. */
 #define ASM_OUTPUT2(a...) a
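For readers skimming the macro chain: the precedence documented in the comments above (feature3 beats feature2 beats feature1 beats oldinstr) can be modeled in a few lines of plain C. This is purely an illustrative sketch, not part of the patch; pick_variant() and its string labels are made-up names, and it assumes the alternatives machinery applies matching entries in the order they are listed, as the comments imply.

```c
#include <assert.h>
#include <string.h>

/*
 * Illustrative model of ALTERNATIVE_3 patching semantics: entries are
 * applied in listed order, so the replacement for the highest-numbered
 * feature the CPU supports is what finally ends up in the instruction
 * stream; with no feature present, the original instruction stays.
 */
static const char *pick_variant(int feature1, int feature2, int feature3)
{
    const char *chosen = "oldinstr";  /* default: unpatched code */
    if (feature1)
        chosen = "newinstr1";         /* first alternative applied */
    if (feature2)
        chosen = "newinstr2";         /* patches over newinstr1 */
    if (feature3)
        chosen = "newinstr3";         /* applied last, so it wins */
    return chosen;
}
```

With all three features clear this yields "oldinstr"; with feature2 and feature3 both set it yields "newinstr3", matching the comment on alternative_io_3().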
 





_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


* [PATCH v2 2/2] x86/xstate: also use alternative asm on xsave side
  2016-02-03 12:32 [PATCH v2 0/2] x86: also use alternative asm on xsave side Jan Beulich
  2016-02-03 12:39 ` [PATCH v2 1/2] x86: support 2- and 3-way alternatives Jan Beulich
@ 2016-02-03 12:39 ` Jan Beulich
  2016-02-03 13:26   ` Andrew Cooper
  2016-02-04  5:27   ` Shuai Ruan
  1 sibling, 2 replies; 9+ messages in thread
From: Jan Beulich @ 2016-02-03 12:39 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Keir Fraser, Shuai Ruan


From: Shuai Ruan <shuai.ruan@linux.intel.com>

This patch uses alternative asm on the xsave side.
As xsaves uses a modified optimization like xsaveopt, xsaves
may not write the FPU portion of the save image either.
So xsaves also needs some extra tweaks.

Signed-off-by: Shuai Ruan <shuai.ruan@linux.intel.com>

Fix XSAVES opcode. Extend the other respective XSAVEOPT conditional to
cover XSAVES as well. Re-wrap comment being adjusted.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -250,27 +250,29 @@ void xsave(struct vcpu *v, uint64_t mask
     uint32_t hmask = mask >> 32;
     uint32_t lmask = mask;
     int word_size = mask & XSTATE_FP ? (cpu_has_fpu_sel ? 8 : 0) : -1;
+#define XSAVE(pfx) \
+        alternative_io_3(".byte " pfx "0x0f,0xae,0x27\n", \
+                         ".byte " pfx "0x0f,0xae,0x37\n", \
+                         X86_FEATURE_XSAVEOPT, \
+                         ".byte " pfx "0x0f,0xc7,0x27\n", \
+                         X86_FEATURE_XSAVEC, \
+                         ".byte " pfx "0x0f,0xc7,0x2f\n", \
+                         X86_FEATURE_XSAVES, \
+                         "=m" (*ptr), \
+                         "a" (lmask), "d" (hmask), "D" (ptr))
 
     if ( word_size <= 0 || !is_pv_32bit_vcpu(v) )
     {
         typeof(ptr->fpu_sse.fip.sel) fcs = ptr->fpu_sse.fip.sel;
         typeof(ptr->fpu_sse.fdp.sel) fds = ptr->fpu_sse.fdp.sel;
 
-        if ( cpu_has_xsaves )
-            asm volatile ( ".byte 0x48,0x0f,0xc7,0x2f"
-                           : "=m" (*ptr)
-                           : "a" (lmask), "d" (hmask), "D" (ptr) );
-        else if ( cpu_has_xsavec )
-            asm volatile ( ".byte 0x48,0x0f,0xc7,0x27"
-                           : "=m" (*ptr)
-                           : "a" (lmask), "d" (hmask), "D" (ptr) );
-        else if ( cpu_has_xsaveopt )
+        if ( cpu_has_xsaveopt || cpu_has_xsaves )
         {
             /*
-             * xsaveopt may not write the FPU portion even when the respective
-             * mask bit is set. For the check further down to work we hence
-             * need to put the save image back into the state that it was in
-             * right after the previous xsaveopt.
+             * XSAVEOPT/XSAVES may not write the FPU portion even when the
+             * respective mask bit is set. For the check further down to work
+             * we hence need to put the save image back into the state that
+             * it was in right after the previous XSAVEOPT.
              */
             if ( word_size > 0 &&
                  (ptr->fpu_sse.x[FPU_WORD_SIZE_OFFSET] == 4 ||
@@ -279,14 +281,9 @@ void xsave(struct vcpu *v, uint64_t mask
                 ptr->fpu_sse.fip.sel = 0;
                 ptr->fpu_sse.fdp.sel = 0;
             }
-            asm volatile ( ".byte 0x48,0x0f,0xae,0x37"
-                           : "=m" (*ptr)
-                           : "a" (lmask), "d" (hmask), "D" (ptr) );
         }
-        else
-            asm volatile ( ".byte 0x48,0x0f,0xae,0x27"
-                           : "=m" (*ptr)
-                           : "a" (lmask), "d" (hmask), "D" (ptr) );
+
+        XSAVE("0x48,");
 
         if ( !(mask & ptr->xsave_hdr.xstate_bv & XSTATE_FP) ||
              /*
@@ -296,7 +293,7 @@ void xsave(struct vcpu *v, uint64_t mask
              (!(ptr->fpu_sse.fsw & 0x0080) &&
               boot_cpu_data.x86_vendor == X86_VENDOR_AMD) )
         {
-            if ( cpu_has_xsaveopt && word_size > 0 )
+            if ( (cpu_has_xsaveopt || cpu_has_xsaves) && word_size > 0 )
             {
                 ptr->fpu_sse.fip.sel = fcs;
                 ptr->fpu_sse.fdp.sel = fds;
@@ -317,24 +314,10 @@ void xsave(struct vcpu *v, uint64_t mask
     }
     else
     {
-        if ( cpu_has_xsaves )
-            asm volatile ( ".byte 0x0f,0xc7,0x2f"
-                           : "=m" (*ptr)
-                           : "a" (lmask), "d" (hmask), "D" (ptr) );
-        else if ( cpu_has_xsavec )
-            asm volatile ( ".byte 0x0f,0xc7,0x27"
-                           : "=m" (*ptr)
-                           : "a" (lmask), "d" (hmask), "D" (ptr) );
-        else if ( cpu_has_xsaveopt )
-            asm volatile ( ".byte 0x0f,0xae,0x37"
-                           : "=m" (*ptr)
-                           : "a" (lmask), "d" (hmask), "D" (ptr) );
-        else
-            asm volatile ( ".byte 0x0f,0xae,0x27"
-                           : "=m" (*ptr)
-                           : "a" (lmask), "d" (hmask), "D" (ptr) );
+        XSAVE("");
         word_size = 4;
     }
+#undef XSAVE
     if ( word_size >= 0 )
         ptr->fpu_sse.x[FPU_WORD_SIZE_OFFSET] = word_size;
 }
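Since the raw .byte encodings in the XSAVE() macro are easy to misread (the commit message notes v1 had a wrong XSAVES opcode), here is a small, purely illustrative decoder showing how the ModRM reg field distinguishes the four operations. The helper name is hypothetical; the encodings follow the bytes used in the macro: XSAVE is 0F AE /4, XSAVEOPT 0F AE /6, XSAVEC 0F C7 /4, XSAVES 0F C7 /5, each with a memory operand at (%rdi), i.e. rm=7.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Illustrative decoder for the .byte sequences in the XSAVE() macro.
 * The second opcode byte (AE vs C7) plus the ModRM.reg field select
 * the operation; mod=00, rm=7 encodes the (%rdi) memory operand.
 */
static const char *xsave_mnemonic(uint8_t opcode, uint8_t modrm)
{
    unsigned reg = (modrm >> 3) & 7;  /* ModRM.reg: opcode extension */

    if (opcode == 0xae)               /* 0F AE group */
        return reg == 4 ? "xsave" : reg == 6 ? "xsaveopt" : "?";
    if (opcode == 0xc7)               /* 0F C7 group */
        return reg == 4 ? "xsavec" : reg == 5 ? "xsaves" : "?";
    return "?";
}
```

For example, xsave_mnemonic(0xc7, 0x2f) yields "xsaves", the encoding the fix in this patch switches to.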






* Re: [PATCH v2 1/2] x86: support 2- and 3-way alternatives
  2016-02-03 12:39 ` [PATCH v2 1/2] x86: support 2- and 3-way alternatives Jan Beulich
@ 2016-02-03 13:15   ` Andrew Cooper
  2016-02-03 13:24     ` Jan Beulich
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Cooper @ 2016-02-03 13:15 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Keir Fraser, Shuai Ruan

On 03/02/16 12:39, Jan Beulich wrote:
> Parts taken from Linux, but implementing the ALTERNATIVE*() macros
> recursively to avoid needless redundancy.
>
> Also make the .discard section non-writable (we might even consider
> dropping its alloc flag too) and limit the pushing and popping of
> sections.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

There is quite a lot of whitespace damage and mixed tabs/spaces, but the
content looks ok.

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>


* Re: [PATCH v2 1/2] x86: support 2- and 3-way alternatives
  2016-02-03 13:15   ` Andrew Cooper
@ 2016-02-03 13:24     ` Jan Beulich
  0 siblings, 0 replies; 9+ messages in thread
From: Jan Beulich @ 2016-02-03 13:24 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Keir Fraser, Shuai Ruan

>>> On 03.02.16 at 14:15, <andrew.cooper3@citrix.com> wrote:
> On 03/02/16 12:39, Jan Beulich wrote:
>> Parts taken from Linux, but implementing the ALTERNATIVE*() macros
>> recursively to avoid needless redundancy.
>>
>> Also make the .discard section non-writable (we might even consider
>> dropping its alloc flag too) and limit the pushing and popping of
>> sections.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> There is quite a lot of whitespace damage and mixed tabs/spaces, but the
> content looks ok.

Damage? Mixture? I've tried to make it all Linux style (as the file was
meant to be) for the changes/additions, but I see I failed for some lines
of the macro bodies (just fixed).

> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

Thanks.

Jan


* Re: [PATCH v2 2/2] x86/xstate: also use alternative asm on xsave side
  2016-02-03 12:39 ` [PATCH v2 2/2] x86/xstate: also use alternative asm on xsave side Jan Beulich
@ 2016-02-03 13:26   ` Andrew Cooper
  2016-02-03 13:34     ` Jan Beulich
  2016-02-04  5:27   ` Shuai Ruan
  1 sibling, 1 reply; 9+ messages in thread
From: Andrew Cooper @ 2016-02-03 13:26 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Keir Fraser, Shuai Ruan

On 03/02/16 12:39, Jan Beulich wrote:
> From: Shuai Ruan <shuai.ruan@linux.intel.com>
>
> This patch uses alternative asm on the xsave side.
> As xsaves uses a modified optimization like xsaveopt, xsaves
> may not write the FPU portion of the save image either.
> So xsaves also needs some extra tweaks.
>
> Signed-off-by: Shuai Ruan <shuai.ruan@linux.intel.com>
>
> Fix XSAVES opcode. Extend the other respective XSAVEOPT conditional to
> cover XSAVES as well. Re-wrap comment being adjusted.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> --- a/xen/arch/x86/xstate.c
> +++ b/xen/arch/x86/xstate.c
> @@ -250,27 +250,29 @@ void xsave(struct vcpu *v, uint64_t mask
>      uint32_t hmask = mask >> 32;
>      uint32_t lmask = mask;
>      int word_size = mask & XSTATE_FP ? (cpu_has_fpu_sel ? 8 : 0) : -1;
> +#define XSAVE(pfx) \
> +        alternative_io_3(".byte " pfx "0x0f,0xae,0x27\n", \
> +                         ".byte " pfx "0x0f,0xae,0x37\n", \
> +                         X86_FEATURE_XSAVEOPT, \
> +                         ".byte " pfx "0x0f,0xc7,0x27\n", \
> +                         X86_FEATURE_XSAVEC, \
> +                         ".byte " pfx "0x0f,0xc7,0x2f\n", \
> +                         X86_FEATURE_XSAVES, \

Given that the options are a little out of order and using raw bytes,
would you mind annotating the lines with the operations? E.g.

+        alternative_io_3(".byte " pfx "0x0f,0xae,0x27\n", /* xsave */ \
+                         ".byte " pfx "0x0f,0xae,0x37\n", /* xsaveopt */ \
+                         X86_FEATURE_XSAVEOPT, \
+                         ".byte " pfx "0x0f,0xc7,0x27\n", /* xsavec */ \
+                         X86_FEATURE_XSAVEC, \
+                         ".byte " pfx "0x0f,0xc7,0x2f\n", /* xsaves */ \
+                         X86_FEATURE_XSAVES, \

IMO, this is somewhat clearer to read.

Otherwise,

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>


* Re: [PATCH v2 2/2] x86/xstate: also use alternative asm on xsave side
  2016-02-03 13:26   ` Andrew Cooper
@ 2016-02-03 13:34     ` Jan Beulich
  2016-02-03 13:50       ` Andrew Cooper
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2016-02-03 13:34 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Keir Fraser, Shuai Ruan

>>> On 03.02.16 at 14:26, <andrew.cooper3@citrix.com> wrote:
> On 03/02/16 12:39, Jan Beulich wrote:
>> --- a/xen/arch/x86/xstate.c
>> +++ b/xen/arch/x86/xstate.c
>> @@ -250,27 +250,29 @@ void xsave(struct vcpu *v, uint64_t mask
>>      uint32_t hmask = mask >> 32;
>>      uint32_t lmask = mask;
>>      int word_size = mask & XSTATE_FP ? (cpu_has_fpu_sel ? 8 : 0) : -1;
>> +#define XSAVE(pfx) \
>> +        alternative_io_3(".byte " pfx "0x0f,0xae,0x27\n", \
>> +                         ".byte " pfx "0x0f,0xae,0x37\n", \
>> +                         X86_FEATURE_XSAVEOPT, \
>> +                         ".byte " pfx "0x0f,0xc7,0x27\n", \
>> +                         X86_FEATURE_XSAVEC, \
>> +                         ".byte " pfx "0x0f,0xc7,0x2f\n", \
>> +                         X86_FEATURE_XSAVES, \
> 
> Given that the options are a little out of order and using raw bytes,
> would you mind annotating the lines with the operations. e.g.
> 
> +        alternative_io_3(".byte " pfx "0x0f,0xae,0x27\n", /* xsave */ \
> +                         ".byte " pfx "0x0f,0xae,0x37\n", /* xsaveopt */ \
> +                         X86_FEATURE_XSAVEOPT, \
> +                         ".byte " pfx "0x0f,0xc7,0x27\n", /* xsavec */ \
> +                         X86_FEATURE_XSAVEC, \
> +                         ".byte " pfx "0x0f,0xc7,0x2f\n", /* xsaves */ \
> +                         X86_FEATURE_XSAVES, \
> 
> IMO, this is somewhat clearer to read.

Okay, since I had been considering this too, I've just done so.

> Otherwise,
> 
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

Thanks, Jan


* Re: [PATCH v2 2/2] x86/xstate: also use alternative asm on xsave side
  2016-02-03 13:34     ` Jan Beulich
@ 2016-02-03 13:50       ` Andrew Cooper
  0 siblings, 0 replies; 9+ messages in thread
From: Andrew Cooper @ 2016-02-03 13:50 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Keir Fraser, Shuai Ruan

On 03/02/16 13:34, Jan Beulich wrote:
>>>> On 03.02.16 at 14:26, <andrew.cooper3@citrix.com> wrote:
>> On 03/02/16 12:39, Jan Beulich wrote:
>>> --- a/xen/arch/x86/xstate.c
>>> +++ b/xen/arch/x86/xstate.c
>>> @@ -250,27 +250,29 @@ void xsave(struct vcpu *v, uint64_t mask
>>>      uint32_t hmask = mask >> 32;
>>>      uint32_t lmask = mask;
>>>      int word_size = mask & XSTATE_FP ? (cpu_has_fpu_sel ? 8 : 0) : -1;
>>> +#define XSAVE(pfx) \
>>> +        alternative_io_3(".byte " pfx "0x0f,0xae,0x27\n", \
>>> +                         ".byte " pfx "0x0f,0xae,0x37\n", \
>>> +                         X86_FEATURE_XSAVEOPT, \
>>> +                         ".byte " pfx "0x0f,0xc7,0x27\n", \
>>> +                         X86_FEATURE_XSAVEC, \
>>> +                         ".byte " pfx "0x0f,0xc7,0x2f\n", \
>>> +                         X86_FEATURE_XSAVES, \
>> Given that the options are a little out of order and using raw bytes,
>> would you mind annotating the lines with the operations. e.g.
>>
>> +        alternative_io_3(".byte " pfx "0x0f,0xae,0x27\n", /* xsave */ \
>> +                         ".byte " pfx "0x0f,0xae,0x37\n", /* xsaveopt */ \
>> +                         X86_FEATURE_XSAVEOPT, \
>> +                         ".byte " pfx "0x0f,0xc7,0x27\n", /* xsavec */ \
>> +                         X86_FEATURE_XSAVEC, \
>> +                         ".byte " pfx "0x0f,0xc7,0x2f\n", /* xsaves */ \
>> +                         X86_FEATURE_XSAVES, \
>>
>> IMO, this is somewhat clearer to read.
> Okay, since I had been considering this too, I've just done so.
>
>> Otherwise,
>>
>> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Thanks, Jan

On further thoughts, it would be nice to annotate the XRSTOR() side as well.

~Andrew


* Re: [PATCH v2 2/2] x86/xstate: also use alternative asm on xsave side
  2016-02-03 12:39 ` [PATCH v2 2/2] x86/xstate: also use alternative asm on xsave side Jan Beulich
  2016-02-03 13:26   ` Andrew Cooper
@ 2016-02-04  5:27   ` Shuai Ruan
  1 sibling, 0 replies; 9+ messages in thread
From: Shuai Ruan @ 2016-02-04  5:27 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Keir Fraser, Andrew Cooper

On Wed, Feb 03, 2016 at 05:39:53AM -0700, Jan Beulich wrote:
> From: Shuai Ruan <shuai.ruan@linux.intel.com>
> 
> This patch uses alternative asm on the xsave side.
> As xsaves uses a modified optimization like xsaveopt, xsaves
> may not write the FPU portion of the save image either.
> So xsaves also needs some extra tweaks.
> 
> Signed-off-by: Shuai Ruan <shuai.ruan@linux.intel.com>
> 
> Fix XSAVES opcode. Extend the other respective XSAVEOPT conditional to
> cover XSAVES as well. Re-wrap comment being adjusted.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
Thanks for pointing out the bug and fixing it.
I tested the v2 patch, and it works well.


