linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dave Hansen <dave.hansen@intel.com>
To: Pengfei Xu <pengfei.xu@intel.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>,
	linux-kselftest <linux-kselftest@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Heng Su <heng.su@intel.com>, Luck Tony <tony.luck@intel.com>,
	Mehta Sohil <sohil.mehta@intel.com>,
	Chen Yu C <yu.c.chen@intel.com>,
	Andy Lutomirski <luto@kernel.org>, Borislav Petkov <bp@suse.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	Bae Chang Seok <chang.seok.bae@intel.com>
Subject: Re: [PATCH v8 1/1] selftests/x86/xstate: Add xstate test cases for XSAVE feature
Date: Mon, 11 Apr 2022 07:42:53 -0700	[thread overview]
Message-ID: <c9f881fb-460e-de44-a6f8-0faf05dda9b7@intel.com> (raw)
In-Reply-To: <YlP/gvE4qzPj8sWF@xpf.sh.intel.com>

On 4/11/22 03:14, Pengfei Xu wrote:
> On 2022-04-08 at 09:58:50 -0700, Dave Hansen wrote:
>> On 3/16/22 05:40, Pengfei Xu wrote:
>>> +#ifndef __cpuid_count
>>> +#define __cpuid_count(level, count, a, b, c, d) ({	\
>>> +	__asm__ __volatile__ ("cpuid\n\t"	\
>>> +			: "=a" (a), "=b" (b), "=c" (c), "=d" (d)	\
>>> +			: "0" (level), "2" (count));	\
>>> +})
>>> +#endif
>>
>>
>> By the time you post the next revision, I hope these are merged:
>>
>>> https://lore.kernel.org/all/cover.1647360971.git.reinette.chatre@intel.com/
>>
>> Could you rebase on top of those to avoid duplication, please?
>>
>   Yes, thanks for remind. I would like to use the helper __cpuid_count() based
>   on above fixed patch.
>   And I have a concern with stdlib.h with "-mno-sse" issue, it's a GCC bug,
>   and it will be fixed in GCC11. And I could not use kselftest.h, because
>   kselftest.h used stdlib.h also...

Ugh.  What a mess.

>   Thanks for the link! From above "id=27600"link, Lu Hongjiu mentioned that:
>   It's a "GCC bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99652"
>   And id=99652 shows that it's fixed in GCC 11.
>   I tried "-mgeneral-regs-only"(it includes "-mno-sse"), it also gcc failed
>   with same error as "-mno-sse":
>   "
> /usr/include/bits/stdlib-float.h: In function ‘atof’:
> /usr/include/bits/stdlib-float.h:26:1: error: SSE register return with SSE disabled
>   "
>   And the error shows that: it's related to "stdlib-float.h", seems I didn't
>   use stdlib-float.h related function.
> 
>   In order for this test code to support GCC versions before GCC11, such as
>   GCC8, etc., I used this workaround "avoid stdlib.h" for GCC bug, and add
>   *aligned_alloc() declaration above.
> 
>   Because kselftest.h uses stdlib.h also, so I could not use kselftest.h with
>   "-mno-see" special GCC requirement, and seems I could not use __cpuid_count()
>   patch in kselftest.h from Reinette Chatre.
>   Thanks!

Can you break this test up into two pieces which are spread across two
.c files?  One that *just* does the sequences that need XSAVE and to
preserve FPU state with -mno-sse, and then a separate one that uses
stdlib.h and also the kselftest infrastructure?

For instance, all of the detection and enumeration can be in the nornal
.c file.

...
>>> +/* Fill FP/XMM/YMM/OPMASK and PKRU xstates into buffer. */
>>> +static void fill_xstates_buf(struct xsave_buffer *buf, uint32_t xsave_mask)
>>> +{
>>> +	uint32_t i;
>>> +	/* The data of FP x87 state are as follows. */
>>
>> OK, but what *is* this?  Random data?  Or did you put some data in that
>> means something?
>>
> I learned from filling the fp data by follow functions: fill FPU xstate
> by fldl instructions, push the source operand onto the FPU register stack
>  and recorded the fp xstate, then used buffer filling way
> to fill the fpu xstates:
> Some fp data could be set as random value under the "FP in SDM rules".
> Shall I add the comment for the fpu data filling like as follow:
> "
> /*
>  * The data of FP x87 state has the same effect as pushing source operand
>  * 0x1f2f3f4f1f2f3f4f onto the FPU stack ST0-7.
>  */
> unsigned char fp_data[160] = {...
> "

No, that's hideous.  If you generated the fp state with code, let's
include the *code*.

> As follow is the pushing source operand 0x1f2f3f4f1f2f3f4f onto the FPU stack
> ST0-7:
> "
> static inline void prepare_fp_buf(uint64_t ui64_fp)
> {
> 	/* Populate ui64_fp value onto FP registers stack ST0-7. */
> 	asm volatile("finit");
> 	asm volatile("fldl %0" : : "m" (ui64_fp));
> 	asm volatile("fldl %0" : : "m" (ui64_fp));
> 	asm volatile("fldl %0" : : "m" (ui64_fp));
> 	asm volatile("fldl %0" : : "m" (ui64_fp));
> 	asm volatile("fldl %0" : : "m" (ui64_fp));
> 	asm volatile("fldl %0" : : "m" (ui64_fp));
> 	asm volatile("fldl %0" : : "m" (ui64_fp));
> 	asm volatile("fldl %0" : : "m" (ui64_fp));
> }
> ...
> prepare_fp_buf(0x1f2f3f4f);
> __xsave(buf, xstate_info.mask);
> "
> 
> The code to set fp data and display xstate is as follow:
> https://github.com/xupengfe/xstate_show/blob/0411_fp/x86/xstate.c
> 
> I found there were 2 different:
>>> +	unsigned char fp_data[160] = {
>>> +		0x7f, 0x03, 0x00, 0x00, 0xff, 0x00, 0x00, 0x00,
>>> +		0xf0, 0x19, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00,
>             ^Above function shows "0xff, 0x12" not "0xf0, 0x19" in 0x8/0x9
> offset bytes:
> Bytes 11:8 are used for bits 31:0 of the x87 FPU Instruction Pointer
> Offset(FIP). It could be impacted if I added "vzeroall" and so on instructions,
> in order to match with above fpu function result, will change to "0xff, 0x12".
> 
>>> +		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
>>> +		0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0x00, 0x00,
>             ^Above function shows "0x80, 0x1f" not "0x00, 0x00" in offset
> 0x18/0x19 bytes:
> Bytes 27:24 are used for the MXCSR register. XRSTOR and XRSTORS generate
> general-protection faults (#GP) in response to attempts to set any of the
> reserved bits of the MXCSR register.
> It could be set as "0x00, 0x00", in order to match with result of above
> function, will fill as "0x80, 0x1f".

I'm totally lost with what you are trying to say here.

...
>> I have to wonder if we can do this in a bit more structured way:
>>
>> struct xstate_test
>> {
>> 	bool (*init)(void);
>> 	bool (*fill_state)(struct xsave_buffer *buf);
>> 	...
>> }
>>
>> Yes, that means indirect function calls, but that's OK since we know it
>> will all be compiled with the "no-sse" flag (and friends).
>>
>> Then fill_xstates_buf() just becomes a loop over an xstate_tests[] array.
>>
>   Yes, it's much better to fill the buf in a loop! Thanks!
>   Because it's special for pkru filling, so I will improve it like as follow:
> "
> 	uint32_t xfeature_num;
> ...
> 
> 	/* Fill test byte value into each tested xstate buffer(offset/size). */
> 	for (xfeature_num = XFEATURE_YMM; xfeature_num < XFEATURE_MAX;
> 			xfeature_num++) {
> 		if (xstate_tested(xfeature_num)) {
> 			if (xfeature_num == XFEATURE_PKRU) {
> 				/* Only 0-3 bytes of pkru xstates are allowed to be written. */
> 				memset((unsigned char *)buf + xstate_info.offset[XFEATURE_PKRU],
> 					PKRU_TESTBYTE, sizeof(uint32_t));
> 				continue;
> 			}
> 
> 			memset((unsigned char *)buf + xstate_info.offset[xfeature_num],
> 				XSTATE_TESTBYTE, xstate_info.size[xfeature_num]);
> 		}
> 	}
> "

I'm not sure that's an improvement.

>>> +/*
>>> + * Because xstate like XMM, YMM registers are not preserved across function
>>> + * calls, so use inline function with assembly code only to raise signal.
>>> + */
>>> +static inline long long __raise(long long pid_num, long long sig_num)
>>> +{
>>> +	long long ret, nr = SYS_kill;
>>> +
>>> +	register long long arg1 asm("rdi") = pid_num;
>>> +	register long long arg2 asm("rsi") = sig_num;
>>
>> I really don't like register variables.  They're very fragile.  Imagine
>> if someone put a printf() right here.  Why don't you just do this with
>> inline assembly directives?
>>
>   It seems that adding printf() to an inline function is not good.
>   I'm not sure if I understand correctly: should I use inline assembly to
>   assign variables to rdi, rsi and then syscall?
>   It it's right, I will think about how to implement it by inline assembly way
>   and fix it.

Yes, do it with inline assembly.


  reply	other threads:[~2022-04-11 14:42 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-16 12:40 [PATCH v8 0/1] Introduce XSAVE feature self-test Pengfei Xu
2022-03-16 12:40 ` [PATCH v8 1/1] selftests/x86/xstate: Add xstate test cases for XSAVE feature Pengfei Xu
2022-03-24 10:06   ` Chang S. Bae
2022-03-24 11:37     ` Pengfei Xu
2022-04-08 16:58   ` Dave Hansen
2022-04-11 10:14     ` Pengfei Xu
2022-04-11 14:42       ` Dave Hansen [this message]
2022-04-12 14:01         ` Pengfei Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c9f881fb-460e-de44-a6f8-0faf05dda9b7@intel.com \
    --to=dave.hansen@intel.com \
    --cc=bp@suse.de \
    --cc=chang.seok.bae@intel.com \
    --cc=heng.su@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=pengfei.xu@intel.com \
    --cc=skhan@linuxfoundation.org \
    --cc=sohil.mehta@intel.com \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@intel.com \
    --cc=yu.c.chen@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).