* [LTP] [PATCH] syscalls/signal06: fix test for regression with earlier version of gcc and kernel
@ 2016-08-05  6:58 Guangwen Feng
  2016-08-08  8:44 ` Li Wang
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Guangwen Feng @ 2016-08-05  6:58 UTC
  To: ltp

1. Currently, the following code is incorrect on some releases with
   an earlier version of gcc (tested on RHEL5.11GA):

	while (D == d && loop < LOOPS) {

   The argument of test(double d) is read from the stack via (%rsp),
   but we actually need an xmm register to trigger the FPU bug.
   So use a global constant (VALUE) instead, to make sure an xmm
   register is used.

2. Although this regression test is designed to trigger SIGSEGV
   intentionally, on some releases with an old kernel (tested on
   RHEL5.11GA) it still hits a segmentation fault that terminates the
   program and breaks the test, even when compiled with -O2.  So
   slightly adjust the weight of the code in the child thread to reduce
   the chance of triggering SIGSEGV, and increase LOOPS to keep the bug
   reproducible.

Signed-off-by: Guangwen Feng <fenggw-fnst@cn.fujitsu.com>
---
 testcases/kernel/syscalls/signal/signal06.c | 29 +++++++++++++++++------------
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/testcases/kernel/syscalls/signal/signal06.c b/testcases/kernel/syscalls/signal/signal06.c
index 75ef733..81fd138 100644
--- a/testcases/kernel/syscalls/signal/signal06.c
+++ b/testcases/kernel/syscalls/signal/signal06.c
@@ -31,8 +31,8 @@
  * 		Date:   Tue Sep 2 19:57:17 2014 +0200
  *
  *      commit 66463db4fc5605d51c7bb81d009d5bf30a783a2c
- *               Author: Oleg Nesterov <oleg@redhat.com>
- *               Date:   Tue Sep 2 19:57:13 2014 +0200
+ *              Author: Oleg Nesterov <oleg@redhat.com>
+ *              Date:   Tue Sep 2 19:57:13 2014 +0200
  *
  * Reproduce:
  *	Test-case (needs -O2).
@@ -55,20 +55,21 @@ int TST_TOTAL = 5;
 
 #if __x86_64__
 
-#define LOOPS 10000
+#define LOOPS 100000
+#define VALUE 123.456
 
 volatile double D;
 volatile int FLAGE;
 
 char altstack[4096 * 10] __attribute__((aligned(4096)));
 
-void test(double d)
+void test(void)
 {
 	int loop = 0;
 	int pid = getpid();
 
-	D = d;
-	while (D == d && loop < LOOPS) {
+	D = VALUE;
+	while (D == VALUE && loop < LOOPS) {
 		/* sys_tkill(pid, SIGHUP); asm to avoid save/reload
 		 * fp regs around c call */
 		asm ("" : : "a"(__NR_tkill), "D"(pid), "S"(SIGHUP));
@@ -94,12 +95,16 @@ void sigh(int sig LTP_ATTRIBUTE_UNUSED)
 
 void *tfunc(void *arg LTP_ATTRIBUTE_UNUSED)
 {
-	for (; ;) {
-		TEST(mprotect(altstack, sizeof(altstack), PROT_READ));
-		if (TEST_RETURN == -1)
-			tst_brkm(TBROK | TTERRNO, NULL, "mprotect failed");
+	int i;
+
+	for (i = -1; ; i *= -1) {
+		if (i == -1) {
+			TEST(mprotect(altstack, sizeof(altstack), PROT_READ));
+			if (TEST_RETURN == -1)
+				tst_brkm(TBROK | TTERRNO, NULL, "mprotect failed");
+		}
 
-		TEST(mprotect(altstack, sizeof(altstack), PROT_READ|PROT_WRITE));
+		TEST(mprotect(altstack, sizeof(altstack), PROT_READ | PROT_WRITE));
 		if (TEST_RETURN == -1)
 			tst_brkm(TBROK | TTERRNO, NULL, "mprotect failed");
 
@@ -148,7 +153,7 @@ int main(int ac, char **av)
 				tst_brkm(TBROK | TRERRNO, NULL,
 						"pthread_create failed");
 
-			test(123.456);
+			test();
 
 			TEST(pthread_join(pt, NULL));
 			if (TEST_RETURN)
-- 
1.8.4.2





* [LTP] [PATCH] syscalls/signal06: fix test for regression with earlier version of gcc and kernel
  2016-08-05  6:58 [LTP] [PATCH] syscalls/signal06: fix test for regression with earlier version of gcc and kernel Guangwen Feng
@ 2016-08-08  8:44 ` Li Wang
  2016-08-16  3:08   ` Guangwen Feng
  2016-09-20  9:59 ` Guangwen Feng
  2016-10-05 13:43 ` Cyril Hrubis
  2 siblings, 1 reply; 15+ messages in thread
From: Li Wang @ 2016-08-08  8:44 UTC
  To: ltp

On 5 August 2016 at 14:58, Guangwen Feng <fenggw-fnst@cn.fujitsu.com> wrote:
> 1. Currently, the following code is incorrect on some releases with
>    an earlier version of gcc (tested on RHEL5.11GA):
>
>         while (D == d && loop < LOOPS) {
>
>    The argument of test(double d) is read from the stack via (%rsp),
>    but we actually need an xmm register to trigger the FPU bug.
>    So use a global constant (VALUE) instead, to make sure an xmm
>    register is used.

Sounds reasonable! To verify that, I disassembled signal06 before and
after the patch for comparison:

signal06 old version
-------------------------
void test(double d)
{
        int loop = 0;
        int pid = getpid();
  402846:       89 c7                   mov    %eax,%edi

        D = d;
  402848:       f2 0f 11 05 b0 27 21    movsd  %xmm0,0x2127b0(%rip)        # 615000 <D>
  40284f:       00
        while (D == d && loop < LOOPS) {
  402850:       f2 0f 10 05 a8 27 21    movsd  0x2127a8(%rip),%xmm0        # 615000 <D>
  402857:       00
  402858:       66 0f 2e 44 24 08       ucomisd 0x8(%rsp),%xmm0
  40285e:       0f 85 b8 00 00 00       jne    40291c <test+0xec>
  402864:       0f 8a b2 00 00 00       jp     40291c <test+0xec>
  40286a:       31 db                   xor    %ebx,%ebx
  40286c:       ba c8 00 00 00          mov    $0xc8,%edx
  402871:       be 01 00 00 00          mov    $0x1,%esi
  402876:       eb 0a                   jmp    402882 <test+0x52>
  402878:       81 fb 10 27 00 00       cmp    $0x2710,%ebx
  40287e:       66 90                   xchg   %ax,%ax
  402880:       74 6d                   je     4028ef <test+0xbf>


after applying this patch
------------------------------
void test(void)
{
        int loop = 0;
  402c58:       bb 00 00 00 00          mov    $0x0,%ebx
        int pid = getpid();

        D = VALUE;
        while (D == VALUE && loop < LOOPS) {
  402c5d:       0f 85 b5 00 00 00       jne    402d18 <test+0xe8>
  402c63:       89 c7                   mov    %eax,%edi
                /* sys_tkill(pid, SIGHUP); asm to avoid save/reload
                 * fp regs around c call */
                asm ("" : : "a"(__NR_tkill), "D"(pid), "S"(SIGHUP));
  402c65:       ba c8 00 00 00          mov    $0xc8,%edx
  402c6a:       be 01 00 00 00          mov    $0x1,%esi
{
        int loop = 0;
        int pid = getpid();

        D = VALUE;
        while (D == VALUE && loop < LOOPS) {
  402c6f:       66 0f 28 d1             movapd %xmm1,%xmm2
  402c73:       eb 11                   jmp    402c86 <test+0x56>
  402c75:       0f 1f 00                nopl   (%rax)
  402c78:       66 0f 2e c2             ucomisd %xmm2,%xmm0
  402c7c:       75 1d                   jne    402c9b <test+0x6b>
  402c7e:       81 fb a0 86 01 00       cmp    $0x186a0,%ebx
  402c84:       74 65                   je     402ceb <test+0xbb>


>
> 2. Although this regression test is designed to trigger SIGSEGV
>    intentionally, on some releases with an old kernel (tested on
>    RHEL5.11GA) it still hits a segmentation fault that terminates the
>    program and breaks the test, even when compiled with -O2.  So
>    slightly adjust the weight of the code in the child thread to reduce
>    the chance of triggering SIGSEGV, and increase LOOPS to keep the bug
>    reproducible.

This is also acceptable. It seems that more signals are sent than the
kernel can handle, which easily causes a segmentation fault at the moment.
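
To make the "weight" adjustment in the child thread concrete, here is a
minimal sketch that only counts how many mprotect() calls of each kind the
patched tfunc() loop would issue. It reuses the loop header from the patch;
the struct and function names are hypothetical and not part of signal06.c,
and no real mprotect() call is made.

```c
/* Hypothetical helper, not part of signal06.c: counts the calls the
 * patched tfunc() loop would make in a given number of iterations. */
struct call_counts {
	long read_only;		/* mprotect(..., PROT_READ) */
	long read_write;	/* mprotect(..., PROT_READ | PROT_WRITE) */
};

struct call_counts count_mprotect_calls(long iterations)
{
	struct call_counts c = { 0, 0 };
	long n;
	int i;

	/* Same control flow as the patch: i alternates -1, 1, -1, ... */
	for (i = -1, n = 0; n < iterations; i *= -1, n++) {
		if (i == -1)
			c.read_only++;	/* only on every other pass */
		c.read_write++;		/* on every pass */
	}

	return c;
}
```

Under this reading, the stack is made read-only on every other pass only,
whereas the unpatched loop did it on every pass.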

I tested this patch on RHEL5.11 (reproduced) and RHEL7.2 (pass); it works fine.

Regards,
Li Wang


* [LTP] [PATCH] syscalls/signal06: fix test for regression with earlier version of gcc and kernel
  2016-08-08  8:44 ` Li Wang
@ 2016-08-16  3:08   ` Guangwen Feng
  0 siblings, 0 replies; 15+ messages in thread
From: Guangwen Feng @ 2016-08-16  3:08 UTC
  To: ltp

Hi!

Thanks for your review and confirmation!

Regards,
Guangwen Feng


* [LTP] [PATCH] syscalls/signal06: fix test for regression with earlier version of gcc and kernel
  2016-08-05  6:58 [LTP] [PATCH] syscalls/signal06: fix test for regression with earlier version of gcc and kernel Guangwen Feng
  2016-08-08  8:44 ` Li Wang
@ 2016-09-20  9:59 ` Guangwen Feng
  2016-10-05 13:43 ` Cyril Hrubis
  2 siblings, 0 replies; 15+ messages in thread
From: Guangwen Feng @ 2016-09-20  9:59 UTC
  To: ltp

Hi!

Ping, thanks!


Regards,
Guangwen Feng


* [LTP] [PATCH] syscalls/signal06: fix test for regression with earlier version of gcc and kernel
  2016-08-05  6:58 [LTP] [PATCH] syscalls/signal06: fix test for regression with earlier version of gcc and kernel Guangwen Feng
  2016-08-08  8:44 ` Li Wang
  2016-09-20  9:59 ` Guangwen Feng
@ 2016-10-05 13:43 ` Cyril Hrubis
  2016-10-06 10:31   ` Guangwen Feng
  2 siblings, 1 reply; 15+ messages in thread
From: Cyril Hrubis @ 2016-10-05 13:43 UTC
  To: ltp

Hi!
First of all sorry for the delay.

> 1. Currently, the following code is incorrect on some releases with
>    an earlier version of gcc (tested on RHEL5.11GA):
>
> 	while (D == d && loop < LOOPS) {
>
>    The argument of test(double d) is read from the stack via (%rsp),
>    but we actually need an xmm register to trigger the FPU bug.
>    So use a global constant (VALUE) instead, to make sure an xmm
>    register is used.

This looks OK.

> 2. Although this regression test is designed to trigger SIGSEGV
>    intentionally, on some releases with an old kernel (tested on
>    RHEL5.11GA) it still hits a segmentation fault that terminates the
>    program and breaks the test, even when compiled with -O2.  So
>    slightly adjust the weight of the code in the child thread to reduce
>    the chance of triggering SIGSEGV, and increase LOOPS to keep the bug
>    reproducible.

Hmm, what is the exact problem here? Does the old kernel break if we
send the signal too fast?

I do not like much that the test takes ten times more time to finish
now.

-- 
Cyril Hrubis
chrubis@suse.cz


* [LTP] [PATCH] syscalls/signal06: fix test for regression with earlier version of gcc and kernel
  2016-10-05 13:43 ` Cyril Hrubis
@ 2016-10-06 10:31   ` Guangwen Feng
  2016-10-06 11:15     ` Cyril Hrubis
  0 siblings, 1 reply; 15+ messages in thread
From: Guangwen Feng @ 2016-10-06 10:31 UTC
  To: ltp

Hi!

Thanks for your comments!

On 10/05/2016 09:43 PM, Cyril Hrubis wrote:
> Hi!
> First of all sorry for the delay.
> 
>> 1. Currently, the following code is incorrect on some releases with
>>    an earlier version of gcc (tested on RHEL5.11GA):
>>
>> 	while (D == d && loop < LOOPS) {
>>
>>    The argument of test(double d) is read from the stack via (%rsp),
>>    but we actually need an xmm register to trigger the FPU bug.
>>    So use a global constant (VALUE) instead, to make sure an xmm
>>    register is used.
> 
> This looks OK.
> 
>> 2. Although this regression test is designed to trigger SIGSEGV
>>    intentionally, on some releases with an old kernel (tested on
>>    RHEL5.11GA) it still hits a segmentation fault that terminates the
>>    program and breaks the test, even when compiled with -O2.  So
>>    slightly adjust the weight of the code in the child thread to reduce
>>    the chance of triggering SIGSEGV, and increase LOOPS to keep the bug
>>    reproducible.
> 
> Hmm, what is the exact problem here? Does the old kernel break if we
> send the signal too fast?

Yes, running signal06 reports segmentation fault and breaks the test
if we send the signal too fast on the old kernel.

> 
> I do not like much that the test takes ten times more time to finish
> now.

Yes, it will take ten times more time than before, but it only takes
3~4 seconds to finish...

With the current LOOPS (10000), the bug is reproduced on the buggy
kernel most of the time, but there is still a chance (about 0.5% in my
environment) of missing it.
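
As a rough way to reason about how the miss rate could scale with LOOPS,
the sketch below assumes that every loop iteration is an independent trial
with the same chance of hitting the bug. That independence is an
assumption, not something measured here. Under it, multiplying LOOPS by an
integer factor raises the miss probability to that power. The function
name is hypothetical.

```c
/* Hypothetical model: if a run of N iterations misses the bug with
 * probability base_miss, and iterations are independent, a run of
 * factor * N iterations misses it with probability base_miss^factor. */
double miss_probability(double base_miss, int factor)
{
	double p = 1.0;
	int k;

	for (k = 0; k < factor; k++)
		p *= base_miss;

	return p;
}
```

With the 0.5% figure above, miss_probability(0.005, 2) is about 0.0025%
under this model; real runs may well deviate from it, so measured numbers
should take precedence.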


Best Regards,
Guangwen Feng




* [LTP] [PATCH] syscalls/signal06: fix test for regression with earlier version of gcc and kernel
  2016-10-06 10:31   ` Guangwen Feng
@ 2016-10-06 11:15     ` Cyril Hrubis
  2016-10-10  7:05       ` Guangwen Feng
  0 siblings, 1 reply; 15+ messages in thread
From: Cyril Hrubis @ 2016-10-06 11:15 UTC
  To: ltp

Hi!
> > Hmm, what is the exact problem here? Does the old kernel break if we
> > send the signal too fast?
> 
> Yes, running signal06 reports segmentation fault and breaks the test
> if we send the signal too fast on the old kernel.

Hmm, that sounds like a bug itself since we set signal handler for the
SIGSEGV, it shouldn't kill the process while we try to write to the read
only memory. Can you try to run it in a debugger and send a trace?

What we do is disable/enable write access to the alternate signal stack
while simultaneously hammering the process with SIGHUP in order to
trigger a segfault inside of the SIGHUP signal handler.

I wonder why we have to do that asynchronously in the first place, if
there is a need to hit a particular spot in the kernel while the signal
stack is being written.

Anyway on which systems is this bug reproducible? I can try to make it
both reliable and fast.

> > I do not like much that the test takes ten times more time to finish
> > now.
> 
> Yes, it will take ten times more time than before, but it only takes
> 3~4 seconds to finish...

Let me put that in perspective. We have more than 3000 tests in LTP, if
each of these would take just a few seconds more, the test run would
take a few more hours to finish.
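
The arithmetic behind that estimate can be sketched as follows; the
function name is hypothetical, and the 3-second figure plugged in below is
only an illustrative stand-in for "a few seconds".

```c
/* Hypothetical helper: extra wall-clock hours added to a full run when
 * each of `tests` test cases takes `extra_seconds` longer. */
double extra_run_hours(int tests, double extra_seconds)
{
	return tests * extra_seconds / 3600.0;
}
```

For example, extra_run_hours(3000, 3.0) evaluates to 2.5, i.e. two and a
half additional hours per full run.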

And there is a reason why we need the LTP run to be as fast as possible.
The longer the run takes, the fewer people will use it and the less
frequently it will be used, which would make all the work put into these
testcases less valuable.

All in all I'm mildly opposed to each change that increases the runtime
if it's not absolutely necessary.

> With the current LOOPS (10000), the bug is reproduced on the buggy
> kernel most of the time, but there is still a chance (about 0.5% in my
> environment) of missing it.

Hmm, what about increasing it twice or four times? What is the
probability of missing the bug then?

-- 
Cyril Hrubis
chrubis@suse.cz


* [LTP] [PATCH] syscalls/signal06: fix test for regression with earlier version of gcc and kernel
  2016-10-06 11:15     ` Cyril Hrubis
@ 2016-10-10  7:05       ` Guangwen Feng
  2016-10-10 13:15         ` Cyril Hrubis
  0 siblings, 1 reply; 15+ messages in thread
From: Guangwen Feng @ 2016-10-10  7:05 UTC
  To: ltp

Hi!

On 10/06/2016 07:15 PM, Cyril Hrubis wrote:
> Hi!
>>> Hmm, what is the exact problem here? Does the old kernel break if we
>>> send the signal too fast?
>>
>> Yes, running signal06 reports segmentation fault and breaks the test
>> if we send the signal too fast on the old kernel.
> 
> Hmm, that sounds like a bug itself since we set signal handler for the
> SIGSEGV, it shouldn't kill the process while we try to write to the read
> only memory. Can you try to run it in a debugger and send a trace?

Yes, I guess this is a kernel bug, but sorry, I didn't look much into
the real cause of the segfault.

[root@RHEL5U11ga_Intel64 signal]# gdb ./signal06 core.18623 
GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-45.el5)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /root/ltp/testcases/kernel/syscalls/signal/signal06...done.
[New Thread 18623]
[New Thread 18625]
Reading symbols from /lib64/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2

warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7fffe3384000
Core was generated by `./signal06'.
Program terminated with signal 11, Segmentation fault.
#0  test (d=123.456) at signal06.c:71
71		while (D == d && loop < LOOPS) {
(gdb) bt
#0  test (d=123.456) at signal06.c:71
#1  0x0000000000402bb7 in main (ac=<value optimized out>, av=<value optimized out>) at signal06.c:151

(gdb) disassemble 
Dump of assembler code for function test:
0x0000000000402910 <test+0>:	push   %rbx
0x0000000000402911 <test+1>:	sub    $0x10,%rsp
0x0000000000402915 <test+5>:	movsd  %xmm0,0x8(%rsp)
0x000000000040291b <test+11>:	callq  0x402278 <getpid@plt>
0x0000000000402920 <test+16>:	movsd  0x8(%rsp),%xmm0
0x0000000000402926 <test+22>:	mov    %eax,%edi
0x0000000000402928 <test+24>:	movsd  %xmm0,0x2136d0(%rip)        # 0x616000 <D>
0x0000000000402930 <test+32>:	movsd  0x2136c8(%rip),%xmm0        # 0x616000 <D>
0x0000000000402938 <test+40>:	ucomisd 0x8(%rsp),%xmm0
0x000000000040293e <test+46>:	jne    0x4029fc <test+236>
0x0000000000402944 <test+52>:	jp     0x4029fc <test+236>
0x000000000040294a <test+58>:	xor    %ebx,%ebx
0x000000000040294c <test+60>:	mov    $0xc8,%edx
0x0000000000402951 <test+65>:	mov    $0x1,%esi
0x0000000000402956 <test+70>:	jmp    0x402962 <test+82>
0x0000000000402958 <test+72>:	cmp    $0x2710,%ebx
0x000000000040295e <test+78>:	xchg   %ax,%ax
0x0000000000402960 <test+80>:	je     0x4029cf <test+191>
0x0000000000402962 <test+82>:	mov    %edx,%eax
0x0000000000402964 <test+84>:	syscall 
0x0000000000402966 <test+86>:	movsd  0x213692(%rip),%xmm0        # 0x616000 <D>
0x000000000040296e <test+94>:	add    $0x1,%ebx
0x0000000000402971 <test+97>:	ucomisd 0x8(%rsp),%xmm0
0x0000000000402977 <test+103>:	jp     0x40297b <test+107>
0x0000000000402979 <test+105>:	je     0x402958 <test+72>
0x000000000040297b <test+107>:	xor    %eax,%eax
0x000000000040297d <test+109>:	mov    %ebx,%r8d
0x0000000000402980 <test+112>:	mov    $0x40b2cb,%ecx
0x0000000000402985 <test+117>:	mov    $0x10,%edx
0x000000000040298a <test+122>:	mov    $0x51,%esi
0x000000000040298f <test+127>:	mov    $0x40b2c0,%edi
0x0000000000402994 <test+132>:	movl   $0x1,0x21e662(%rip)        # 0x621000 <FLAGE>
0x000000000040299e <test+142>:	callq  0x4043d0 <tst_resm_>
0x00000000004029a3 <test+147>:	cmp    $0x2710,%ebx
0x00000000004029a9 <test+153>:	jne    0x402a24 <test+276>
0x00000000004029ab <test+155>:	mov    0x20ec16(%rip),%r8        # 0x6115c8 <TCID>
0x00000000004029b2 <test+162>:	add    $0x10,%rsp
0x00000000004029b6 <test+166>:	mov    $0x40b2d5,%ecx
0x00000000004029bb <test+171>:	pop    %rbx
0x00000000004029bc <test+172>:	xor    %edx,%edx
0x00000000004029be <test+174>:	mov    $0x54,%esi
0x00000000004029c3 <test+179>:	mov    $0x40b2c0,%edi
0x00000000004029c8 <test+184>:	xor    %eax,%eax
0x00000000004029ca <test+186>:	jmpq   0x4043d0 <tst_resm_>
0x00000000004029cf <test+191>:	mov    $0x2710,%r8d
0x00000000004029d5 <test+197>:	mov    $0x40b2cb,%ecx
0x00000000004029da <test+202>:	mov    $0x10,%edx
0x00000000004029df <test+207>:	mov    $0x51,%esi
0x00000000004029e4 <test+212>:	mov    $0x40b2c0,%edi
0x00000000004029e9 <test+217>:	xor    %eax,%eax
0x00000000004029eb <test+219>:	movl   $0x1,0x21e60b(%rip)        # 0x621000 <FLAGE>
0x00000000004029f5 <test+229>:	callq  0x4043d0 <tst_resm_>
0x00000000004029fa <test+234>:	jmp    0x4029ab <test+155>
0x00000000004029fc <test+236>:	xor    %r8d,%r8d
0x00000000004029ff <test+239>:	mov    $0x40b2cb,%ecx
0x0000000000402a04 <test+244>:	mov    $0x10,%edx
0x0000000000402a09 <test+249>:	mov    $0x51,%esi
0x0000000000402a0e <test+254>:	mov    $0x40b2c0,%edi
0x0000000000402a13 <test+259>:	xor    %eax,%eax
0x0000000000402a15 <test+261>:	movl   $0x1,0x21e5e1(%rip)        # 0x621000 <FLAGE>
0x0000000000402a1f <test+271>:	callq  0x4043d0 <tst_resm_>
0x0000000000402a24 <test+276>:	mov    $0x40b2e7,%ecx
0x0000000000402a29 <test+281>:	mov    $0x1,%edx
0x0000000000402a2e <test+286>:	mov    $0x56,%esi
0x0000000000402a33 <test+291>:	mov    $0x40b2c0,%edi
0x0000000000402a38 <test+296>:	xor    %eax,%eax
0x0000000000402a3a <test+298>:	callq  0x4043d0 <tst_resm_>
0x0000000000402a3f <test+303>:	callq  0x404670 <tst_exit>
End of assembler dump.

> 
> What we do is disable/enable write access to the alternate signal stack
> while simultaneously hammering the process with SIGHUP in order to
> trigger a segfault inside of the SIGHUP signal handler.
> 
> I wonder why we have to do that asynchronously in the first place, if
> there is a need to hit a particular spot in the kernel while the signal
> stack is being written.
> 
> Anyway on which systems is this bug reproducible? I can try to make it
> both reliable and fast.

Thanks very much.
The segfault issue can be reproduced reliably on RHEL5.11GA (2.6.18-398.el5).

> 
>>> I do not like much that the test takes ten times more time to finish
>>> now.
>>
>> Yes, it will take ten times more time than before, but it only takes
>> 3~4 seconds to finish...
> 
> Let me put that in perspective. We have more than 3000 tests in LTP, if
> each of these would take just a few seconds more, the test run would
> take a few more hours to finish.
> 
> And there is a reason why we need the LTP run to be as fast as possible.
> The longer the run takes, the fewer people will use it and the less
> frequently it will be used, which would make all the work put into these
> testcases less valuable.
> 
> All in all I'm mildly opposed to each change that increases the runtime
> if it's not absolutely necessary.
> 

OK, I understand; I will keep that in mind next time, thanks.

>> With the current LOOPS (10000), the bug is reproduced on the buggy
>> kernel most of the time, but there is still a chance (about 0.5% in my
>> environment) of missing it.
> 
> Hmm, what about increasing it twice or four times? What is the
> probability of missing the bug then?
> 

2 times(20000 loops):	99.9% reproducible
4 times(40000 loops):	100% reproducible

Is it acceptable to increase it four times?


Best Regards,
Guangwen Feng




* [LTP] [PATCH] syscalls/signal06: fix test for regression with earlier version of gcc and kernel
  2016-10-10  7:05       ` Guangwen Feng
@ 2016-10-10 13:15         ` Cyril Hrubis
  2016-10-11  7:17           ` Guangwen Feng
  2016-11-16  8:14           ` [LTP] [PATCH v2] " Guangwen Feng
  0 siblings, 2 replies; 15+ messages in thread
From: Cyril Hrubis @ 2016-10-10 13:15 UTC
  To: ltp

Hi!
> Yes, I guess this is a kernel bug, but sorry, I didn't look much into
> the real cause of the segfault.
> 
> [root@RHEL5U11ga_Intel64 signal]# gdb ./signal06 core.18623 
> GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-45.el5)
> Copyright (C) 2009 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-redhat-linux-gnu".
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>...
> Reading symbols from /root/ltp/testcases/kernel/syscalls/signal/signal06...done.
> [New Thread 18623]
> [New Thread 18625]
> Reading symbols from /lib64/libpthread.so.0...(no debugging symbols found)...done.
> [Thread debugging using libthread_db enabled]
> Loaded symbols for /lib64/libpthread.so.0
> Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
> Loaded symbols for /lib64/libc.so.6
> Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
> Loaded symbols for /lib64/ld-linux-x86-64.so.2
> 
> warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7fffe3384000
> Core was generated by `./signal06'.
> Program terminated with signal 11, Segmentation fault.
> #0  test (d=123.456) at signal06.c:71
> 71		while (D == d && loop < LOOPS) {
> (gdb) bt
> #0  test (d=123.456) at signal06.c:71
> #1  0x0000000000402bb7 in main (ac=<value optimized out>, av=<value optimized out>) at signal06.c:151

So it segfaults in the test() function, we aren't doing anything wrong
there. So this really looks like a bug itself.

Presumably this causes the test to segfault on a system where the
original bug cannot be reproduced, right?

> >> With the current LOOPS (10000), the bug is reproduced on the buggy
> >> kernel most of the time, but there is still a chance (about 0.5% in my
> >> environment) of missing it.
> > 
> > Hmm, what about increasing it twice or four times? What is the
> > probability of missing the bug then?
> > 
> 
> 2 times(20000 loops):	99.9% reproducible
> 4 times(40000 loops):	100% reproducible
> 
> Is it acceptable to increase it four times?

That is 2.5 times better :)

-- 
Cyril Hrubis
chrubis@suse.cz


* [LTP] [PATCH] syscalls/signal06: fix test for regression with earlier version of gcc and kernel
  2016-10-10 13:15         ` Cyril Hrubis
@ 2016-10-11  7:17           ` Guangwen Feng
  2016-11-16  8:14           ` [LTP] [PATCH v2] " Guangwen Feng
  1 sibling, 0 replies; 15+ messages in thread
From: Guangwen Feng @ 2016-10-11  7:17 UTC (permalink / raw)
  To: ltp

Hi!

On 10/10/2016 09:15 PM, Cyril Hrubis wrote:
> Hi!
>> Yes, I guess this is the kernel's bug, but sorry I didn't look into
>> the real reason of the segfault issue much.
>>
>> [root@RHEL5U11ga_Intel64 signal]# gdb ./signal06 core.18623 
>> GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-45.el5)
>> Copyright (C) 2009 Free Software Foundation, Inc.
>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>> This is free software: you are free to change and redistribute it.
>> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
>> and "show warranty" for details.
>> This GDB was configured as "x86_64-redhat-linux-gnu".
>> For bug reporting instructions, please see:
>> <http://www.gnu.org/software/gdb/bugs/>...
>> Reading symbols from /root/ltp/testcases/kernel/syscalls/signal/signal06...done.
>> [New Thread 18623]
>> [New Thread 18625]
>> Reading symbols from /lib64/libpthread.so.0...(no debugging symbols found)...done.
>> [Thread debugging using libthread_db enabled]
>> Loaded symbols for /lib64/libpthread.so.0
>> Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
>> Loaded symbols for /lib64/libc.so.6
>> Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
>> Loaded symbols for /lib64/ld-linux-x86-64.so.2
>>
>> warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7fffe3384000
>> Core was generated by `./signal06'.
>> Program terminated with signal 11, Segmentation fault.
>> #0  test (d=123.456) at signal06.c:71
>> 71		while (D == d && loop < LOOPS) {
>> (gdb) bt
>> #0  test (d=123.456) at signal06.c:71
>> #1  0x0000000000402bb7 in main (ac=<value optimized out>, av=<value optimized out>) at signal06.c:151
> 
> So it segfaults in the test() function, where we aren't doing anything
> wrong. So this really looks like a bug in itself.
> 
> Presumably this makes the test segfault on a system where the original
> bug cannot be reproduced, right?

Yes, the original bug should be reproducible by signal06 on this system,
but the segfault issue breaks the regression test.

> 
>>>> By current LOOPS(10000), most of the time, the buggy kernel can be
>>>> reproduced, but there is still a chance(about 0.5% in my environment)
>>>> to miss.
>>>
>>> Hmm, what about increasing it twice or four times? What is the
>>> probability of missing the bug then?
>>>
>>
>> 2 times(20000 loops):	99.9% reproducible
>> 4 times(40000 loops):	100% reproducible
>>
>> Is it acceptable to increase it four times?
> 
> That is 2.5 times better :)

So, may I just increase it to 4 times and send a v2?
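
As a rough sanity check on these numbers, the sketch below assumes (this is
an assumption, not something measured in this thread) that each block of
10000 loops misses the bug independently with the roughly 0.5% probability
reported above. Under that assumption the chance of a miss shrinks
geometrically with the multiplier. The helper name miss_probability() is
hypothetical.

```c
#include <assert.h>

/*
 * Hypothetical helper: probability that `factor` consecutive blocks of
 * loops all miss the bug, assuming each block misses independently with
 * probability `base_miss` (an assumption, not a measured property).
 */
double miss_probability(double base_miss, int factor)
{
	double p = 1.0;
	int i;

	assert(base_miss >= 0.0 && base_miss <= 1.0);
	assert(factor >= 1);

	/* Independent misses multiply: p = base_miss ^ factor. */
	for (i = 0; i < factor; i++)
		p *= base_miss;

	return p;
}
```

With base_miss = 0.005, a factor of 2 gives about 0.0025% and a factor of 4
gives less than one in a billion. The measured 99.9% at 20000 loops is worse
than this estimate, which suggests the misses are not fully independent, so
the model is optimistic and only indicates the trend.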


Best Regards,
Guangwen Feng




* [LTP] [PATCH v2] syscalls/signal06: fix test for regression with earlier version of gcc and kernel
  2016-10-10 13:15         ` Cyril Hrubis
  2016-10-11  7:17           ` Guangwen Feng
@ 2016-11-16  8:14           ` Guangwen Feng
  2017-02-09  9:00             ` Guangwen Feng
  2017-02-09 16:19             ` Cyril Hrubis
  1 sibling, 2 replies; 15+ messages in thread
From: Guangwen Feng @ 2016-11-16  8:14 UTC (permalink / raw)
  To: ltp

1. Currently, following code is incorrect on some releases with
   earlier version of gcc(tested on RHEL5.11GA):

	while (D == d && loop < LOOPS) {

   Because the argument in function test(double d) is used via (%rsp),
   but here we actually need a xmm register to trigger the fpu bug.
   So use global value instead to make sure to take use of xmm.

2. Although this regression test is designed to trigger SIGSEGV
   intentionally, on some releases with old kernel(tested on RHEL5.11GA),
   this will still lead to segmentation fault that terminate the program
   and break the test even though compiling with -O2.  So slightly adjust
   the weight of the codes in child thread to depress SIGSEGV trigger's
   chance while increase LOOPS to ensure reproducible.

Signed-off-by: Guangwen Feng <fenggw-fnst@cn.fujitsu.com>
---
 testcases/kernel/syscalls/signal/signal06.c | 29 +++++++++++++++++------------
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/testcases/kernel/syscalls/signal/signal06.c b/testcases/kernel/syscalls/signal/signal06.c
index 75ef733..1f1b31c 100644
--- a/testcases/kernel/syscalls/signal/signal06.c
+++ b/testcases/kernel/syscalls/signal/signal06.c
@@ -31,8 +31,8 @@
  * 		Date:   Tue Sep 2 19:57:17 2014 +0200
  *
  *      commit 66463db4fc5605d51c7bb81d009d5bf30a783a2c
- *               Author: Oleg Nesterov <oleg@redhat.com>
- *               Date:   Tue Sep 2 19:57:13 2014 +0200
+ *              Author: Oleg Nesterov <oleg@redhat.com>
+ *              Date:   Tue Sep 2 19:57:13 2014 +0200
  *
  * Reproduce:
  *	Test-case (needs -O2).
@@ -55,20 +55,21 @@ int TST_TOTAL = 5;
 
 #if __x86_64__
 
-#define LOOPS 10000
+#define LOOPS 30000
+#define VALUE 123.456
 
 volatile double D;
 volatile int FLAGE;
 
 char altstack[4096 * 10] __attribute__((aligned(4096)));
 
-void test(double d)
+void test(void)
 {
 	int loop = 0;
 	int pid = getpid();
 
-	D = d;
-	while (D == d && loop < LOOPS) {
+	D = VALUE;
+	while (D == VALUE && loop < LOOPS) {
 		/* sys_tkill(pid, SIGHUP); asm to avoid save/reload
 		 * fp regs around c call */
 		asm ("" : : "a"(__NR_tkill), "D"(pid), "S"(SIGHUP));
@@ -94,12 +95,16 @@ void sigh(int sig LTP_ATTRIBUTE_UNUSED)
 
 void *tfunc(void *arg LTP_ATTRIBUTE_UNUSED)
 {
-	for (; ;) {
-		TEST(mprotect(altstack, sizeof(altstack), PROT_READ));
-		if (TEST_RETURN == -1)
-			tst_brkm(TBROK | TTERRNO, NULL, "mprotect failed");
+	int i;
+
+	for (i = -1; ; i *= -1) {
+		if (i == -1) {
+			TEST(mprotect(altstack, sizeof(altstack), PROT_READ));
+			if (TEST_RETURN == -1)
+				tst_brkm(TBROK | TTERRNO, NULL, "mprotect failed");
+		}
 
-		TEST(mprotect(altstack, sizeof(altstack), PROT_READ|PROT_WRITE));
+		TEST(mprotect(altstack, sizeof(altstack), PROT_READ | PROT_WRITE));
 		if (TEST_RETURN == -1)
 			tst_brkm(TBROK | TTERRNO, NULL, "mprotect failed");
 
@@ -148,7 +153,7 @@ int main(int ac, char **av)
 				tst_brkm(TBROK | TRERRNO, NULL,
 						"pthread_create failed");
 
-			test(123.456);
+			test();
 
 			TEST(pthread_join(pt, NULL));
 			if (TEST_RETURN)
-- 
1.8.4.2





* [LTP] [PATCH v2] syscalls/signal06: fix test for regression with earlier version of gcc and kernel
  2016-11-16  8:14           ` [LTP] [PATCH v2] " Guangwen Feng
@ 2017-02-09  9:00             ` Guangwen Feng
  2017-02-09 16:19             ` Cyril Hrubis
  1 sibling, 0 replies; 15+ messages in thread
From: Guangwen Feng @ 2017-02-09  9:00 UTC (permalink / raw)
  To: ltp

Hi!

Ping, thanks!


Best Regards,
Guangwen Feng

On 11/16/2016 04:14 PM, Guangwen Feng wrote:
> 1. Currently, following code is incorrect on some releases with
>    earlier version of gcc(tested on RHEL5.11GA):
> 
> 	while (D == d && loop < LOOPS) {
> 
>    Because the argument in function test(double d) is used via (%rsp),
>    but here we actually need a xmm register to trigger the fpu bug.
>    So use global value instead to make sure to take use of xmm.
> 
> 2. Although this regression test is designed to trigger SIGSEGV
>    intentionally, on some releases with old kernel(tested on RHEL5.11GA),
>    this will still lead to segmentation fault that terminate the program
>    and break the test even though compiling with -O2.  So slightly adjust
>    the weight of the codes in child thread to depress SIGSEGV trigger's
>    chance while increase LOOPS to ensure reproducible.
> 
> Signed-off-by: Guangwen Feng <fenggw-fnst@cn.fujitsu.com>
> ---
>  testcases/kernel/syscalls/signal/signal06.c | 29 +++++++++++++++++------------
>  1 file changed, 17 insertions(+), 12 deletions(-)
> 
> diff --git a/testcases/kernel/syscalls/signal/signal06.c b/testcases/kernel/syscalls/signal/signal06.c
> index 75ef733..1f1b31c 100644
> --- a/testcases/kernel/syscalls/signal/signal06.c
> +++ b/testcases/kernel/syscalls/signal/signal06.c
> @@ -31,8 +31,8 @@
>   * 		Date:   Tue Sep 2 19:57:17 2014 +0200
>   *
>   *      commit 66463db4fc5605d51c7bb81d009d5bf30a783a2c
> - *               Author: Oleg Nesterov <oleg@redhat.com>
> - *               Date:   Tue Sep 2 19:57:13 2014 +0200
> + *              Author: Oleg Nesterov <oleg@redhat.com>
> + *              Date:   Tue Sep 2 19:57:13 2014 +0200
>   *
>   * Reproduce:
>   *	Test-case (needs -O2).
> @@ -55,20 +55,21 @@ int TST_TOTAL = 5;
>  
>  #if __x86_64__
>  
> -#define LOOPS 10000
> +#define LOOPS 30000
> +#define VALUE 123.456
>  
>  volatile double D;
>  volatile int FLAGE;
>  
>  char altstack[4096 * 10] __attribute__((aligned(4096)));
>  
> -void test(double d)
> +void test(void)
>  {
>  	int loop = 0;
>  	int pid = getpid();
>  
> -	D = d;
> -	while (D == d && loop < LOOPS) {
> +	D = VALUE;
> +	while (D == VALUE && loop < LOOPS) {
>  		/* sys_tkill(pid, SIGHUP); asm to avoid save/reload
>  		 * fp regs around c call */
>  		asm ("" : : "a"(__NR_tkill), "D"(pid), "S"(SIGHUP));
> @@ -94,12 +95,16 @@ void sigh(int sig LTP_ATTRIBUTE_UNUSED)
>  
>  void *tfunc(void *arg LTP_ATTRIBUTE_UNUSED)
>  {
> -	for (; ;) {
> -		TEST(mprotect(altstack, sizeof(altstack), PROT_READ));
> -		if (TEST_RETURN == -1)
> -			tst_brkm(TBROK | TTERRNO, NULL, "mprotect failed");
> +	int i;
> +
> +	for (i = -1; ; i *= -1) {
> +		if (i == -1) {
> +			TEST(mprotect(altstack, sizeof(altstack), PROT_READ));
> +			if (TEST_RETURN == -1)
> +				tst_brkm(TBROK | TTERRNO, NULL, "mprotect failed");
> +		}
>  
> -		TEST(mprotect(altstack, sizeof(altstack), PROT_READ|PROT_WRITE));
> +		TEST(mprotect(altstack, sizeof(altstack), PROT_READ | PROT_WRITE));
>  		if (TEST_RETURN == -1)
>  			tst_brkm(TBROK | TTERRNO, NULL, "mprotect failed");
>  
> @@ -148,7 +153,7 @@ int main(int ac, char **av)
>  				tst_brkm(TBROK | TRERRNO, NULL,
>  						"pthread_create failed");
>  
> -			test(123.456);
> +			test();
>  
>  			TEST(pthread_join(pt, NULL));
>  			if (TEST_RETURN)
> 




* [LTP] [PATCH v2] syscalls/signal06: fix test for regression with earlier version of gcc and kernel
  2016-11-16  8:14           ` [LTP] [PATCH v2] " Guangwen Feng
  2017-02-09  9:00             ` Guangwen Feng
@ 2017-02-09 16:19             ` Cyril Hrubis
  2017-02-13  3:38               ` Guangwen Feng
  1 sibling, 1 reply; 15+ messages in thread
From: Cyril Hrubis @ 2017-02-09 16:19 UTC (permalink / raw)
  To: ltp

Hi!
> 1. Currently, following code is incorrect on some releases with
>    earlier version of gcc(tested on RHEL5.11GA):
> 
> 	while (D == d && loop < LOOPS) {
> 
>    Because the argument in function test(double d) is used via (%rsp),
>    but here we actually need a xmm register to trigger the fpu bug.
>    So use global value instead to make sure to take use of xmm.
> 
> 2. Although this regression test is designed to trigger SIGSEGV
>    intentionally, on some releases with old kernel(tested on RHEL5.11GA),
>    this will still lead to segmentation fault that terminate the program
>    and break the test even though compiling with -O2.  So slightly adjust
>    the weight of the codes in child thread to depress SIGSEGV trigger's
>    chance while increase LOOPS to ensure reproducible.

This needs to be worded a bit better; I had to read the email discussion
to remember what we are trying to avoid here.

Something like:

2. Although this regression test is designed to trigger SIGSEGV
   intentionally in order to invoke the signal handler, producing it
   too fast can lead to program termination on older kernels (found
   in RHEL5.11GA). Hence we halve the frequency and increase the
   number of LOOPS at the same time as compensation to make sure
   that the bug is 100% reproducible.

OK to commit with this change to the patch description?
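
The halving described in point 2 can be sketched as follows. This is a
minimal simulation, assuming the v2 loop shape `for (i = -1; ; i *= -1)`
from the patch; it only counts which branch would run and does not call
mprotect(). The names call_counts and count_calls() are hypothetical.

```c
#include <assert.h>

/* Hypothetical counters for the two mprotect() variants in tfunc(). */
struct call_counts {
	long readonly;	/* would-be mprotect(..., PROT_READ) calls */
	long readwrite;	/* would-be mprotect(..., PROT_READ | PROT_WRITE) calls */
};

/*
 * Simulate `iterations` passes of the toggling loop. The sign of `i`
 * flips on every pass, so the read-only branch runs on every other pass
 * while the read-write call runs on every pass.
 */
struct call_counts count_calls(long iterations)
{
	struct call_counts c = { 0, 0 };
	long n;
	int i;

	assert(iterations >= 0);

	for (i = -1, n = 0; n < iterations; i *= -1, n++) {
		if (i == -1)
			c.readonly++;

		c.readwrite++;
	}

	return c;
}
```

For example, count_calls(10) reports 5 read-only calls and 10 read-write
calls, so the window in which altstack is read-only opens on every other
iteration only.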

-- 
Cyril Hrubis
chrubis@suse.cz


* [LTP] [PATCH v2] syscalls/signal06: fix test for regression with earlier version of gcc and kernel
  2017-02-09 16:19             ` Cyril Hrubis
@ 2017-02-13  3:38               ` Guangwen Feng
  2017-02-13  9:12                 ` Cyril Hrubis
  0 siblings, 1 reply; 15+ messages in thread
From: Guangwen Feng @ 2017-02-13  3:38 UTC (permalink / raw)
  To: ltp

Hi!

On 02/10/2017 12:19 AM, Cyril Hrubis wrote:
> Hi!
>> 1. Currently, following code is incorrect on some releases with
>>    earlier version of gcc(tested on RHEL5.11GA):
>>
>> 	while (D == d && loop < LOOPS) {
>>
>>    Because the argument in function test(double d) is used via (%rsp),
>>    but here we actually need a xmm register to trigger the fpu bug.
>>    So use global value instead to make sure to take use of xmm.
>>
>> 2. Although this regression test is designed to trigger SIGSEGV
>>    intentionally, on some releases with old kernel(tested on RHEL5.11GA),
>>    this will still lead to segmentation fault that terminate the program
>>    and break the test even though compiling with -O2.  So slightly adjust
>>    the weight of the codes in child thread to depress SIGSEGV trigger's
>>    chance while increase LOOPS to ensure reproducible.
> 
> This needs to be worded a bit better; I had to read the email discussion
> to remember what we are trying to avoid here.
> 
> Something like:
> 
> 2. Although this regression test is designed to trigger SIGSEGV
>    intentionally in order to invoke the signal handler, producing it
>    too fast can lead to program termination on older kernels (found
>    in RHEL5.11GA). Hence we halve the frequency and increase the
>    number of LOOPS at the same time as compensation to make sure
>    that the bug is 100% reproducible.
> 
> OK to commit with this change to the patch description?

Sure, the description is much better now.
Thanks very much!

Best Regards,
Guangwen Feng




* [LTP] [PATCH v2] syscalls/signal06: fix test for regression with earlier version of gcc and kernel
  2017-02-13  3:38               ` Guangwen Feng
@ 2017-02-13  9:12                 ` Cyril Hrubis
  0 siblings, 0 replies; 15+ messages in thread
From: Cyril Hrubis @ 2017-02-13  9:12 UTC (permalink / raw)
  To: ltp

Hi!
Pushed, thanks.

-- 
Cyril Hrubis
chrubis@suse.cz


Thread overview: 15+ messages
2016-08-05  6:58 [LTP] [PATCH] syscalls/signal06: fix test for regression with earlier version of gcc and kernel Guangwen Feng
2016-08-08  8:44 ` Li Wang
2016-08-16  3:08   ` Guangwen Feng
2016-09-20  9:59 ` Guangwen Feng
2016-10-05 13:43 ` Cyril Hrubis
2016-10-06 10:31   ` Guangwen Feng
2016-10-06 11:15     ` Cyril Hrubis
2016-10-10  7:05       ` Guangwen Feng
2016-10-10 13:15         ` Cyril Hrubis
2016-10-11  7:17           ` Guangwen Feng
2016-11-16  8:14           ` [LTP] [PATCH v2] " Guangwen Feng
2017-02-09  9:00             ` Guangwen Feng
2017-02-09 16:19             ` Cyril Hrubis
2017-02-13  3:38               ` Guangwen Feng
2017-02-13  9:12                 ` Cyril Hrubis
