* [LTP] Question about kernel/syscall/signal/signal06.c
@ 2019-07-17  9:21 Hongzhi, Song
  2019-07-17  9:46 ` Cyril Hrubis
  0 siblings, 1 reply; 7+ messages in thread
From: Hongzhi, Song @ 2019-07-17  9:21 UTC (permalink / raw)
  To: ltp

Hi Wang,

Sorry to bother you.

I find that signal06 fails on qemux86-64 when qemu has a small number of
cores, e.g. "qemu -smp 1/2/4/6".

ERROR INFO:

signal06    0  TINFO  :  loop = 23
signal06    1  TFAIL  :  signal06.c:87: Bug Reproduced!

But if I boot qemu with "-smp 16", the test case is very likely to pass.


I have two questions about this case:

1. Why does the number of cores affect the test case?

2. In the failure case, what breaks the "while loop" shown in the code
below?

    while (D == VALUE && loop < LOOPS) {
        /* sys_tkill(pid, SIGHUP); asm to avoid save/reload
         * fp regs around c call */
        asm ("" : : "a"(__NR_tkill), "D"(pid), "S"(SIGHUP));
        asm ("syscall" : : : "ax");

        loop++;
    }

    ...

    if (loop == LOOPS) {
        tst_resm(TPASS, "%s call succeeded", TCID);
    } else {
        /* this is the call reported above as signal06.c:87 */
        tst_resm(TFAIL, "Bug Reproduced!");
        tst_exit();
    }


Thanks.

--Hongzhi



* [LTP] Question about kernel/syscall/signal/signal06.c
  2019-07-17  9:21 [LTP] Question about kernel/syscall/signal/signal06.c Hongzhi, Song
@ 2019-07-17  9:46 ` Cyril Hrubis
  2019-07-19  8:13   ` Hongzhi, Song
  0 siblings, 1 reply; 7+ messages in thread
From: Cyril Hrubis @ 2019-07-17  9:46 UTC (permalink / raw)
  To: ltp

Hi!
> I find that signal06 fails on qemux86-64 when qemu has a small number of
> cores, e.g. "qemu -smp 1/2/4/6".
>
> ERROR INFO:
>
> signal06    0  TINFO  :  loop = 23
> signal06    1  TFAIL  :  signal06.c:87: Bug Reproduced!
>
> But if I boot qemu with "-smp 16", the test case is very likely to pass.
>
>
> I have two questions about this case:
>
> 1. Why does the number of cores affect the test case?

Have you looked into the code? The test is trying to reproduce a race
condition between two threads, so of course the number of cores affects
the reproducibility.

> 2. In the failure case, what breaks the "while loop" shown in the code
> below?

A bug in a kernel that fails to restore the FPU state.
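
To make that answer concrete: the loop in the quoted excerpt can only stop
before loop reaches LOOPS if the comparison D == VALUE turns false, which
means the floating-point value did not survive a signal round trip. The
sketch below simulates that decision logic only; it does not send signals
or touch real FPU state. run_check, healthy_step and buggy_step are
hypothetical names, and the LOOPS and VALUE constants are placeholders,
since the thread does not quote the real ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LOOPS 30000   /* placeholder iteration cap, not quoted in the thread */
#define VALUE 123.456 /* placeholder sentinel, not quoted in the thread */

/* Hypothetical stand-in for "send the signal and return from the handler".
 * It returns the value the comparison would see afterwards. */
typedef double (*signal_step_fn)(int iteration);

struct loop_result {
	int loop;    /* iterations completed */
	bool passed; /* true only if the loop ran all the way to LOOPS */
};

struct loop_result run_check(signal_step_fn step)
{
	assert(step != NULL);

	double d = VALUE;
	int loop = 0;

	/* Same exit conditions as the quoted loop: value changed, or cap hit. */
	while (d == VALUE && loop < LOOPS) {
		d = step(loop);
		loop++;
	}

	return (struct loop_result){ .loop = loop, .passed = (loop == LOOPS) };
}

/* Healthy kernel: the value always survives the round trip. */
double healthy_step(int iteration)
{
	(void)iteration;
	return VALUE;
}

/* Simulated bug: the value is lost during the 23rd round trip. */
double buggy_step(int iteration)
{
	return iteration == 22 ? 0.0 : VALUE;
}
```

With buggy_step the loop stops after 23 iterations, the same shape as the
"loop = 23" line in the report, and the TFAIL branch would be taken. With
healthy_step the loop runs to LOOPS and the TPASS branch would be taken.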

-- 
Cyril Hrubis
chrubis@suse.cz


* [LTP] Question about kernel/syscall/signal/signal06.c
  2019-07-17  9:46 ` Cyril Hrubis
@ 2019-07-19  8:13   ` Hongzhi, Song
  2019-07-19  8:44     ` Li Wang
  0 siblings, 1 reply; 7+ messages in thread
From: Hongzhi, Song @ 2019-07-19  8:13 UTC (permalink / raw)
  To: ltp

This case fails when qemux86-64 is booted with 1 or 2 cores.

Using git bisect, I found that [kernel 5.2-rc1: 0d714dba162] causes the
failure.

If I check out a commit before 0d714dba162, the case passes on the
same qemu configuration.


--Hongzhi




* [LTP] Question about kernel/syscall/signal/signal06.c
  2019-07-19  8:13   ` Hongzhi, Song
@ 2019-07-19  8:44     ` Li Wang
  2019-07-22  1:56       ` Hongzhi, Song
  0 siblings, 1 reply; 7+ messages in thread
From: Li Wang @ 2019-07-19  8:44 UTC (permalink / raw)
  To: ltp

On Fri, Jul 19, 2019 at 4:14 PM Hongzhi, Song
<hongzhi.song@windriver.com> wrote:
>
> This case fails when qemux86-64 is booted with 1 or 2 cores.
>
> Using git bisect, I found that [kernel 5.2-rc1: 0d714dba162] causes
> the failure.
>
> If I check out a commit before 0d714dba162, the case passes on the
> same qemu configuration.

It sounds like a new FPU regression. I will give this test a try.

@Hongzhi, could you provide more info about your test machine (e.g.
lscpu, uname -r) and the test results with 1 vCPU and 2 vCPUs?

[CCing the FPU developers on this thread]




-- 
Regards,
Li Wang


* [LTP] Question about kernel/syscall/signal/signal06.c
  2019-07-19  8:44     ` Li Wang
@ 2019-07-22  1:56       ` Hongzhi, Song
  2019-07-24  9:56         ` Li Wang
  2019-08-07 10:15         ` Sebastian Andrzej Siewior
  0 siblings, 2 replies; 7+ messages in thread
From: Hongzhi, Song @ 2019-07-22  1:56 UTC (permalink / raw)
  To: ltp


On 7/19/19 4:44 PM, Li Wang wrote:
> On Fri, Jul 19, 2019 at 4:14 PM Hongzhi, Song
> <hongzhi.song@windriver.com> wrote:
>> This case fails when qemux86-64 is booted with 1 or 2 cores.
>>
>> Using git bisect, I found that [kernel 5.2-rc1: 0d714dba162] causes
>> the failure.

Hi Li Wang,


Sorry for my small mistake; the exact tag is [5.1-rc3 : 0d714dba162]

commit 0d714dba162620fd8b9f5b3104a487e041353c4d
Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date:   Wed Apr 3 18:41:48 2019 +0200

    x86/fpu: Update xstate's PKRU value on write_pkru()

    During the context switch the xstate is loaded which also includes the
    PKRU value.

    If xstate is restored on return to userland it is required
    that the PKRU value in xstate is the same as the one in the CPU.

    Save the PKRU in xstate during modification.


>>
>> If I check out a commit before 0d714dba162, the case passes on the
>> same qemu configuration.
> It sounds like a new FPU regression. I will give this test a try.
>
> @Hongzhi, could you provide more info about your test machine (e.g.
> lscpu, uname -r) and the test results with 1 vCPU and 2 vCPUs?


I tested "-smp 1/2/4" and "-cpu Skylake-Client-IBRS/core2duo"; all of
them failed.


1. This is my qemu boot cmdline:

qemu-system-x86_64 -device 
virtio-net-pci,netdev=net0,mac=52:54:00:12:35:02 -netdev 
user,id=net0,hostfwd=tcp::2222-:22,hostfwd=tcp::2323-:23,tftp=images/qemux86-64 
-drive file=image.rootfs.ext4,if=virtio,format=raw -vga vmware 
-show-cursor -usb -device usb-tablet -object 
rng-random,filename=/dev/urandom,id=rng0 -device 
virtio-rng-pci,rng=rng0  -nographic  -m 256  -cpu Skylake-Client-IBRS 
-serial mon:stdio -serial null -kernel linux/arch/x86/boot/bzImage 
-append 'root=/dev/vda rw highres=off console=ttyS0 mem=256M ip=dhcp 
vga=0 uvesafb.mode_option=640x480-32 oprofile.timer=1 
uvesafb.task_timeout=-1 '

2. lscpu

root@qemux86-64:~# lscpu
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   40 bits physical, 48 bits virtual
CPU(s):                          4
On-line CPU(s) list:             0
Thread(s) per core:              1
Core(s) per socket:              1
Socket(s):                       1
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           94
Model name:                      Intel Core Processor (Skylake, IBRS)
Stepping:                        3
CPU MHz:                         3100.012
BogoMIPS:                        6200.02
L1d cache:                       32 KiB
L1i cache:                       32 KiB
L2 cache:                        4 MiB
L3 cache:                        16 MiB
Vulnerability L1tf:              Mitigation; PTE Inversion
Vulnerability Meltdown:          Mitigation; PTI
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1:        Mitigation; __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Full generic retpoline,
                                 STIBP disabled, RSB filling
Flags:                           fpu de pse tsc msr pae mce cx8 apic sep
                                 mtrr pge mca cmov pat pse36 clflush mmx
                                 fxsr sse sse2 syscall nx rdtscp lm
                                 constant_tsc rep_good nopl xtopology
                                 cpuid pni pclmulqdq ssse3 cx16 sse4_1
                                 sse4_2 movbe popcnt aes xsave hypervisor
                                 lahf_lm abm pti fsgsbase bmi1 smep bmi2
                                 erms adx smap xsaveopt xgetbv1 arat


3. uname -r

root@qemux86-64:~# uname -r
5.1.0-rc3-Linux-standard


Thanks.

--Hongzhi




* [LTP] Question about kernel/syscall/signal/signal06.c
  2019-07-22  1:56       ` Hongzhi, Song
@ 2019-07-24  9:56         ` Li Wang
  2019-08-07 10:15         ` Sebastian Andrzej Siewior
  1 sibling, 0 replies; 7+ messages in thread
From: Li Wang @ 2019-07-24  9:56 UTC (permalink / raw)
  To: ltp

Hi Hongzhi,

On Mon, Jul 22, 2019 at 9:59 AM Hongzhi, Song
<hongzhi.song@windriver.com> wrote:
>
>
> On 7/19/19 4:44 PM, Li Wang wrote:
> > On Fri, Jul 19, 2019 at 4:14 PM Hongzhi, Song
> > <hongzhi.song@windriver.com> wrote:
> >> This case fails when qemux86-64 is booted with 1 or 2 cores.
> >>
> >> Using git bisect, I found that [kernel 5.2-rc1: 0d714dba162] causes
> >> the failure.
>
> Hi Li Wang,
>
>
> Sorry for my small mistake; the exact tag is [5.1-rc3 : 0d714dba162]
>
> commit 0d714dba162620fd8b9f5b3104a487e041353c4d
> Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Date:   Wed Apr 3 18:41:48 2019 +0200
>
>      x86/fpu: Update xstate's PKRU value on write_pkru()
>
>      During the context switch the xstate is loaded which also includes the
>      PKRU value.
>
>      If xstate is restored on return to userland it is required
>      that the PKRU value in xstate is the same as the one in the CPU.
>
>      Save the PKRU in xstate during modification.
>
>
> >>
> >> If I check out a commit before 0d714dba162, the case passes on the
> >> same qemu configuration.
> > It sounds like a new FPU regression. I will give this test a try.
> >
> > @Hongzhi, could you provide more info about your test machine (e.g.
> > lscpu, uname -r) and the test results with 1 vCPU and 2 vCPUs?
>
>
> I tested "-smp 1/2/4" and "-cpu Skylake-Client-IBRS/core2duo"; all of
> them failed.
>
>
> 1. This is my qemu boot cmdline:
>
> qemu-system-x86_64 -device
> virtio-net-pci,netdev=net0,mac=52:54:00:12:35:02 -netdev
> user,id=net0,hostfwd=tcp::2222-:22,hostfwd=tcp::2323-:23,tftp=images/qemux86-64
> -drive file=image.rootfs.ext4,if=virtio,format=raw -vga vmware
> -show-cursor -usb -device usb-tablet -object
> rng-random,filename=/dev/urandom,id=rng0 -device
> virtio-rng-pci,rng=rng0  -nographic  -m 256  -cpu Skylake-Client-IBRS
> -serial mon:stdio -serial null -kernel linux/arch/x86/boot/bzImage
> -append 'root=/dev/vda rw highres=off console=ttyS0 mem=256M ip=dhcp
> vga=0 uvesafb.mode_option=640x480-32 oprofile.timer=1
> uvesafb.task_timeout=-1 '
>
> 2. lscpu
>
> root@qemux86-64:~# lscpu
> Architecture:                    x86_64
> CPU op-mode(s):                  32-bit, 64-bit
> Byte Order:                      Little Endian
> Address sizes:                   40 bits physical, 48 bits virtual
> CPU(s):                          4
> On-line CPU(s) list:             0
> Thread(s) per core:              1
> Core(s) per socket:              1
> Socket(s):                       1
> Vendor ID:                       GenuineIntel
> CPU family:                      6
> Model:                           94
> Model name:                      Intel Core Processor (Skylake, IBRS)

Thanks for the information.

I tried the mainline kernel-v5.2 on the kvm system(with 1/2 Skylake
vCPUs) but didn't reproduce your failure, I'm not sure if I missed
anything there, maybe the virtualization way is related, I will have a
try on your command when I available.

--
Regards,
Li Wang


* [LTP] Question about kernel/syscall/signal/signal06.c
  2019-07-22  1:56       ` Hongzhi, Song
  2019-07-24  9:56         ` Li Wang
@ 2019-08-07 10:15         ` Sebastian Andrzej Siewior
  1 sibling, 0 replies; 7+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-08-07 10:15 UTC (permalink / raw)
  To: ltp

I just woke up from hibernation and assume that this has not been
handled yet, so...

On 2019-07-22 09:56:55 [+0800], Hongzhi, Song wrote:
> 
> On 7/19/19 4:44 PM, Li Wang wrote:
> > On Fri, Jul 19, 2019 at 4:14 PM Hongzhi, Song
> > <hongzhi.song@windriver.com> wrote:
> > > This case fails when qemux86-64 is booted with 1 or 2 cores.
> > > 
> > > Using git bisect, I found that [kernel 5.2-rc1: 0d714dba162] causes
> > > the failure.
> 
> Hi Li Wang,
> 
> 
> Sorry for my small mistake; the exact tag is [5.1-rc3 : 0d714dba162]
> 
> commit 0d714dba162620fd8b9f5b3104a487e041353c4d
> Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Date:   Wed Apr 3 18:41:48 2019 +0200
> 
>     x86/fpu: Update xstate's PKRU value on write_pkru()
> 
>     During the context switch the xstate is loaded which also includes the
>     PKRU value.
> 
>     If xstate is restored on return to userland it is required
>     that the PKRU value in xstate is the same as the one in the CPU.
> 
>     Save the PKRU in xstate during modification.

This commit is about PKRU handling, and I don't see the PKU bit in your
lscpu output. So I assume the failure is not related to this commit
specifically but to the FPU rework in general.
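
The check described here, looking for a PKU entry in the CPU flags, can be
sketched as a whole-word search over the flags string. This is only an
illustration: has_cpu_flag and REPORTED_FLAGS are hypothetical names, the
flags string is the lscpu output from this thread with the terminal line
wraps rejoined, and it assumes the flag would be spelled "pku", as Linux
reports it in /proc/cpuinfo.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Flags from the lscpu output posted in this thread, wraps rejoined. */
const char REPORTED_FLAGS[] =
	"fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 "
	"clflush mmx fxsr sse sse2 syscall nx rdtscp lm constant_tsc rep_good "
	"nopl xtopology cpuid pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 movbe "
	"popcnt aes xsave hypervisor lahf_lm abm pti fsgsbase bmi1 smep bmi2 "
	"erms adx smap xsaveopt xgetbv1 arat";

/* Returns true if `flag` occurs in `flags` as a whole
 * whitespace-separated word, so "xsave" does not match "xsaveopt". */
bool has_cpu_flag(const char *flags, const char *flag)
{
	assert(flags != NULL && flag != NULL);

	size_t len = strlen(flag);
	if (len == 0)
		return false;

	for (const char *p = strstr(flags, flag); p != NULL;
	     p = strstr(p + 1, flag)) {
		bool starts_word = (p == flags) ||
				   isspace((unsigned char)p[-1]);
		bool ends_word = p[len] == '\0' ||
				 isspace((unsigned char)p[len]);

		if (starts_word && ends_word)
			return true;
	}
	return false;
}
```

For the reported flags, has_cpu_flag(REPORTED_FLAGS, "pku") is false while
has_cpu_flag(REPORTED_FLAGS, "xsave") is true.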

> 3. uname -r
> 
> root@qemux86-64:~# uname -r
> 5.1.0-rc3-Linux-standard

This information is confusing. I can reproduce a test case failure at
0d714dba162, but the test passes with the latest supported kernel.
Please let me know if this problem still exists with 5.3-rc3 or 5.2.7; I
can't reproduce it on either of those kernels.
5.1 is EOL, and the commit in question was merged into 5.2-rc1.
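
A rough way to act on this is to compare a "uname -r" string against the
5.2 release line mentioned above. The sketch below parses only the leading
major.minor[.patch] numbers, so "-rc" suffixes are ignored and a 5.2-rc
kernel compares equal to 5.2.0. struct kver, parse_kver and kver_cmp are
hypothetical helpers, not part of LTP.

```c
#include <assert.h>
#include <stdio.h>

struct kver {
	int major;
	int minor;
	int patch;
};

/* Parses the leading "major.minor[.patch]" of a release string.
 * Returns 0 on success, -1 if fewer than two numbers were found. */
int parse_kver(const char *release, struct kver *out)
{
	assert(release != NULL && out != NULL);

	out->major = out->minor = out->patch = 0;

	/* sscanf stops at the first mismatch, so "5.3-rc3" fills only
	 * two fields and patch keeps its default of 0. */
	int fields = sscanf(release, "%d.%d.%d",
			    &out->major, &out->minor, &out->patch);

	return fields >= 2 ? 0 : -1;
}

/* Returns <0, 0 or >0 if a is older than, equal to or newer than b. */
int kver_cmp(struct kver a, struct kver b)
{
	if (a.major != b.major)
		return a.major < b.major ? -1 : 1;
	if (a.minor != b.minor)
		return a.minor < b.minor ? -1 : 1;
	if (a.patch != b.patch)
		return a.patch < b.patch ? -1 : 1;
	return 0;
}
```

For the strings in this thread, "5.1.0-rc3-Linux-standard" compares lower
than 5.2.0, while "5.2.7" and "5.3-rc3" compare higher.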

> Thanks.
> 
> --Hongzhi

Sebastian

