All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Hansen <dave.hansen@intel.com>
To: Pengfei Xu <pengfei.xu@intel.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>,
	linux-kselftest <linux-kselftest@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Heng Su <heng.su@intel.com>, Luck Tony <tony.luck@intel.com>,
	Mehta Sohil <sohil.mehta@intel.com>,
	Chen Yu C <yu.c.chen@intel.com>,
	Andy Lutomirski <luto@kernel.org>, Borislav Petkov <bp@suse.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	Bae Chang Seok <chang.seok.bae@intel.com>
Subject: Re: [PATCH v8 1/1] selftests/x86/xstate: Add xstate test cases for XSAVE feature
Date: Mon, 11 Apr 2022 07:42:53 -0700	[thread overview]
Message-ID: <c9f881fb-460e-de44-a6f8-0faf05dda9b7@intel.com> (raw)
In-Reply-To: <YlP/gvE4qzPj8sWF@xpf.sh.intel.com>

On 4/11/22 03:14, Pengfei Xu wrote:
> On 2022-04-08 at 09:58:50 -0700, Dave Hansen wrote:
>> On 3/16/22 05:40, Pengfei Xu wrote:
>>> +#ifndef __cpuid_count
>>> +#define __cpuid_count(level, count, a, b, c, d) ({	\
>>> +	__asm__ __volatile__ ("cpuid\n\t"	\
>>> +			: "=a" (a), "=b" (b), "=c" (c), "=d" (d)	\
>>> +			: "0" (level), "2" (count));	\
>>> +})
>>> +#endif
>>
>>
>> By the time you post the next revision, I hope these are merged:
>>
>>> https://lore.kernel.org/all/cover.1647360971.git.reinette.chatre@intel.com/
>>
>> Could you rebase on top of those to avoid duplication, please?
>>
>   Yes, thanks for remind. I would like to use the helper __cpuid_count() based
>   on above fixed patch.
>   And I have a concern with stdlib.h with "-mno-sse" issue, it's a GCC bug,
>   and it will be fixed in GCC11. And I could not use kselftest.h, because
>   kselftest.h used stdlib.h also...

Ugh.  What a mess.

>   Thanks for the link! From above "id=27600"link, Lu Hongjiu mentioned that:
>   It's a "GCC bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99652"
>   And id=99652 shows that it's fixed in GCC 11.
>   I tried "-mgeneral-regs-only"(it includes "-mno-sse"), it also gcc failed
>   with same error as "-mno-sse":
>   "
> /usr/include/bits/stdlib-float.h: In function ‘atof’:
> /usr/include/bits/stdlib-float.h:26:1: error: SSE register return with SSE disabled
>   "
>   And the error shows that: it's related to "stdlib-float.h", seems I didn't
>   use stdlib-float.h related function.
> 
>   In order for this test code to support GCC versions before GCC11, such as
>   GCC8, etc., I used this workaround "avoid stdlib.h" for GCC bug, and add
>   *aligned_alloc() declaration above.
> 
>   Because kselftest.h uses stdlib.h also, so I could not use kselftest.h with
>   "-mno-see" special GCC requirement, and seems I could not use __cpuid_count()
>   patch in kselftest.h from Reinette Chatre.
>   Thanks!

Can you break this test up into two pieces which are spread across two
.c files?  One that *just* does the sequences that need XSAVE and to
preserve FPU state with -mno-sse, and then a separate one that uses
stdlib.h and also the kselftest infrastructure?

For instance, all of the detection and enumeration can be in the nornal
.c file.

...
>>> +/* Fill FP/XMM/YMM/OPMASK and PKRU xstates into buffer. */
>>> +static void fill_xstates_buf(struct xsave_buffer *buf, uint32_t xsave_mask)
>>> +{
>>> +	uint32_t i;
>>> +	/* The data of FP x87 state are as follows. */
>>
>> OK, but what *is* this?  Random data?  Or did you put some data in that
>> means something?
>>
> I learned from filling the fp data by follow functions: fill FPU xstate
> by fldl instructions, push the source operand onto the FPU register stack
>  and recorded the fp xstate, then used buffer filling way
> to fill the fpu xstates:
> Some fp data could be set as random value under the "FP in SDM rules".
> Shall I add the comment for the fpu data filling like as follow:
> "
> /*
>  * The data of FP x87 state has the same effect as pushing source operand
>  * 0x1f2f3f4f1f2f3f4f onto the FPU stack ST0-7.
>  */
> unsigned char fp_data[160] = {...
> "

No, that's hideous.  If you generated the fp state with code, let's
include the *code*.

> As follow is the pushing source operand 0x1f2f3f4f1f2f3f4f onto the FPU stack
> ST0-7:
> "
> static inline void prepare_fp_buf(uint64_t ui64_fp)
> {
> 	/* Populate ui64_fp value onto FP registers stack ST0-7. */
> 	asm volatile("finit");
> 	asm volatile("fldl %0" : : "m" (ui64_fp));
> 	asm volatile("fldl %0" : : "m" (ui64_fp));
> 	asm volatile("fldl %0" : : "m" (ui64_fp));
> 	asm volatile("fldl %0" : : "m" (ui64_fp));
> 	asm volatile("fldl %0" : : "m" (ui64_fp));
> 	asm volatile("fldl %0" : : "m" (ui64_fp));
> 	asm volatile("fldl %0" : : "m" (ui64_fp));
> 	asm volatile("fldl %0" : : "m" (ui64_fp));
> }
> ...
> prepare_fp_buf(0x1f2f3f4f);
> __xsave(buf, xstate_info.mask);
> "
> 
> The code to set fp data and display xstate is as follow:
> https://github.com/xupengfe/xstate_show/blob/0411_fp/x86/xstate.c
> 
> I found there were 2 different:
>>> +	unsigned char fp_data[160] = {
>>> +		0x7f, 0x03, 0x00, 0x00, 0xff, 0x00, 0x00, 0x00,
>>> +		0xf0, 0x19, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00,
>             ^Above function shows "0xff, 0x12" not "0xf0, 0x19" in 0x8/0x9
> offset bytes:
> Bytes 11:8 are used for bits 31:0 of the x87 FPU Instruction Pointer
> Offset(FIP). It could be impacted if I added "vzeroall" and so on instructions,
> in order to match with above fpu function result, will change to "0xff, 0x12".
> 
>>> +		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
>>> +		0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0x00, 0x00,
>             ^Above function shows "0x80, 0x1f" not "0x00, 0x00" in offset
> 0x18/0x19 bytes:
> Bytes 27:24 are used for the MXCSR register. XRSTOR and XRSTORS generate
> general-protection faults (#GP) in response to attempts to set any of the
> reserved bits of the MXCSR register.
> It could be set as "0x00, 0x00", in order to match with result of above
> function, will fill as "0x80, 0x1f".

I'm totally lost with what you are trying to say here.

...
>> I have to wonder if we can do this in a bit more structured way:
>>
>> struct xstate_test
>> {
>> 	bool (*init)(void);
>> 	bool (*fill_state)(struct xsave_buffer *buf);
>> 	...
>> }
>>
>> Yes, that means indirect function calls, but that's OK since we know it
>> will all be compiled with the "no-sse" flag (and friends).
>>
>> Then fill_xstates_buf() just becomes a loop over an xstate_tests[] array.
>>
>   Yes, it's much better to fill the buf in a loop! Thanks!
>   Because it's special for pkru filling, so I will improve it like as follow:
> "
> 	uint32_t xfeature_num;
> ...
> 
> 	/* Fill test byte value into each tested xstate buffer(offset/size). */
> 	for (xfeature_num = XFEATURE_YMM; xfeature_num < XFEATURE_MAX;
> 			xfeature_num++) {
> 		if (xstate_tested(xfeature_num)) {
> 			if (xfeature_num == XFEATURE_PKRU) {
> 				/* Only 0-3 bytes of pkru xstates are allowed to be written. */
> 				memset((unsigned char *)buf + xstate_info.offset[XFEATURE_PKRU],
> 					PKRU_TESTBYTE, sizeof(uint32_t));
> 				continue;
> 			}
> 
> 			memset((unsigned char *)buf + xstate_info.offset[xfeature_num],
> 				XSTATE_TESTBYTE, xstate_info.size[xfeature_num]);
> 		}
> 	}
> "

I'm not sure that's an improvement.

>>> +/*
>>> + * Because xstate like XMM, YMM registers are not preserved across function
>>> + * calls, so use inline function with assembly code only to raise signal.
>>> + */
>>> +static inline long long __raise(long long pid_num, long long sig_num)
>>> +{
>>> +	long long ret, nr = SYS_kill;
>>> +
>>> +	register long long arg1 asm("rdi") = pid_num;
>>> +	register long long arg2 asm("rsi") = sig_num;
>>
>> I really don't like register variables.  They're very fragile.  Imagine
>> if someone put a printf() right here.  Why don't you just do this with
>> inline assembly directives?
>>
>   It seems that adding printf() to an inline function is not good.
>   I'm not sure if I understand correctly: should I use inline assembly to
>   assign variables to rdi, rsi and then syscall?
>   It it's right, I will think about how to implement it by inline assembly way
>   and fix it.

Yes, do it with inline assembly.


  reply	other threads:[~2022-04-11 14:42 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-16 12:40 [PATCH v8 0/1] Introduce XSAVE feature self-test Pengfei Xu
2022-03-16 12:40 ` [PATCH v8 1/1] selftests/x86/xstate: Add xstate test cases for XSAVE feature Pengfei Xu
2022-03-24 10:06   ` Chang S. Bae
2022-03-24 11:37     ` Pengfei Xu
2022-04-08 16:58   ` Dave Hansen
2022-04-11 10:14     ` Pengfei Xu
2022-04-11 14:42       ` Dave Hansen [this message]
2022-04-12 14:01         ` Pengfei Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c9f881fb-460e-de44-a6f8-0faf05dda9b7@intel.com \
    --to=dave.hansen@intel.com \
    --cc=bp@suse.de \
    --cc=chang.seok.bae@intel.com \
    --cc=heng.su@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=pengfei.xu@intel.com \
    --cc=skhan@linuxfoundation.org \
    --cc=sohil.mehta@intel.com \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@intel.com \
    --cc=yu.c.chen@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.