* [LTP] LTP: futex_wake04 hangs forever on i386
@ 2018-10-08 13:21 Naresh Kamboju
  2018-10-08 13:30 ` Cyril Hrubis
  0 siblings, 1 reply; 20+ messages in thread
From: Naresh Kamboju @ 2018-10-08 13:21 UTC (permalink / raw)
  To: ltp

LTP syscall test case futex_wake04 hangs forever on an i386 kernel
running on an x86_64 server machine.

Test PASS on qemu_i386 and other devices.
Test HANGs on i386 (i386 kernel booting on x86_64 machine).

output,
incrementing stop
futex_wake04    0  TINFO  :  Hugepagesize 4194304

LAVA job id,
https://lkft.validation.linaro.org/scheduler/job/399913

strace output of this test case,
https://lkft.validation.linaro.org/scheduler/job/405706

Best regards
Naresh Kamboju

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [LTP] LTP: futex_wake04 hangs forever on i386
  2018-10-08 13:21 [LTP] LTP: futex_wake04 hangs forever on i386 Naresh Kamboju
@ 2018-10-08 13:30 ` Cyril Hrubis
  2018-10-09  2:05   ` Rafael David Tinoco
  0 siblings, 1 reply; 20+ messages in thread
From: Cyril Hrubis @ 2018-10-08 13:30 UTC (permalink / raw)
  To: ltp

Hi!
> LTP syscall test case futex_wake04 hangs forever on i386 kernel
> running on x86_64 server machine.
> 
> Test PASS on qemu_i386 and other devices.
> Test HANGs on i386 (i386 kernel booting on x86_64 machine).
> 
> output,
> incrementing stop
> futex_wake04    0  TINFO  :  Hugepagesize 4194304

Looks like we are trying to fault a 4GB hugepage, which may be quite slow,
but then, the test is a kernel reproducer as well.

Can you connect to the process with gdb and check if we happened to
timeout before or after the two threads that are supposed to trigger the
kernel bug are started?

-- 
Cyril Hrubis
chrubis@suse.cz


* [LTP] LTP: futex_wake04 hangs forever on i386
  2018-10-08 13:30 ` Cyril Hrubis
@ 2018-10-09  2:05   ` Rafael David Tinoco
  2018-10-09  9:28     ` Cyril Hrubis
  0 siblings, 1 reply; 20+ messages in thread
From: Rafael David Tinoco @ 2018-10-09  2:05 UTC (permalink / raw)
  To: ltp

On 10/8/18 10:30 AM, Cyril Hrubis wrote:
> Hi!
>> LTP syscall test case futex_wake04 hangs forever on i386 kernel 
>> running on x86_64 server machine.
>> 
>> Test PASS on qemu_i386 and other devices. Test HANGs on i386 (i386
>>  kernel booting on x86_64 machine).
>> 
>> output, incrementing stop futex_wake04    0  TINFO  : Hugepagesize
>> 4194304
> 
> Looks like we are trying to fault 4GB hugepage, that may be quite 
> slow, but then, the test is a kernel reproducer as well.

This is i386 without PAE, so the hugepage here is 4 MB (it would be
2 MB with PAE). I have configured the same environment on my side:

(k)inaddy@bug3984:~$ sudo hugeadm --hard --pool-pages-max 4M:128
(k)inaddy@bug3984:~$ sudo hugeadm --hard --pool-pages-min 4M:128
(k)inaddy@bug3984:~$ sudo hugeadm --pool-list
       Size  Minimum  Current  Maximum  Default
    4194304      128      128      128        *

And ran the test with 500 iterations, multiple times, without a single
issue.

(k)root@bug3984:~$ while true; do cat /proc/meminfo | grep -i
hugepages_free; done

Shows me that the hugepages are being used:

HugePages_Free:       93
HugePages_Free:        4
HugePages_Free:      128
HugePages_Free:        0
HugePages_Free:      128
HugePages_Free:        1
HugePages_Free:        8
HugePages_Free:      128
HugePages_Free:       27
HugePages_Free:        0
HugePages_Free:      102

> Can you connect to the process with gdb and check if we happened to 
> timeout before or after the two threads that are supposed to 
> trigger the kernel bug are started?
> 

Reading the test and commit 13d60f4b6a, it looks like this test
exercises futex indexes stored in hugepages (one in the hugepage head
and the other outside the head) and an indexing problem in which futex()
was signaled as non-contended.

The kernel code was using page->index (of the page hosting the futex)
for the futex offset and, after this commit, it started taking into
consideration whether the page was a hugepage (head or tail) or a
compound page, in which case it uses compound_idx for the futex offset,
fixing the issue.

In order for the issue to happen we would have to face:

test thread #1                    test thread #2
--------------                    --------------
[hugepage #1 head] *futex1 offset [hugepage #1 head]
[page]                            [page] *futex2 offset
[page]                            [page]
[page]                            [page]
...                               ...

where futex2 key points to futex1.

And the test case does NOT look like something intermittent (since it is
a lock ordering issue - depending on pthreads sharing futexes - with
wrong futex indexes, as described in the kernel commit).

* Test case problem would either happen or not *

I tried to extend a bit number of iterations and something caught my 
attention... when running the same test, multiple times, after some time 
I would face something like:

futex_wake04    0  TINFO  :  Thread 7495 not sleeping yet
futex_wake04    0  TINFO  :  Thread 7495 not sleeping yet
futex_wake04    0  TINFO  :  Thread 7495 not sleeping yet
futex_wake04    0  TINFO  :  Thread 7495 not sleeping yet
futex_wake04    0  TINFO  :  Thread 7495 not sleeping yet
futex_wake04    0  TINFO  :  Thread 7495 not sleeping yet
futex_wake04    0  TINFO  :  Thread 7495 not sleeping yet
futex_wake04    0  TINFO  :  Thread 7495 not sleeping yet
futex_wake04    0  TINFO  :  Thread 7495 not sleeping yet

And there wasn't a second thread task id inside:

/proc/<pid>/task/{<pid>, <tid1>, <tid2>}

like it should.

Likely the task creation logic here (because of an error? check the last 
part) was faster than the wait_for_threads() logic could expect... causing 
a race in the test's logic.

* This would explain the timeout error we had, right ? *

Since stdout/stderr are being redirected to a file: @Naresh, is it 
possible to get the stdout/stderr files from the same LKFT URL you 
showed?

* Wait, there is more ... *

While running the test, and monitoring meminfo, I could see the 
HugePages_Total oscillating:

HugePages_Total:       1
HugePages_Free:        1
HugePages_Rsvd:        0
HugePages_Surp:        0

HugePages_Total:       1
HugePages_Free:        1
HugePages_Rsvd:        0
HugePages_Surp:        0

HugePages_Total:      54
HugePages_Free:       53
HugePages_Rsvd:        0
HugePages_Surp:        0

HugePages_Total:       1
HugePages_Free:        1
HugePages_Rsvd:        0
HugePages_Surp:        0

HugePages_Total:       1
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0

HugePages_Total:     101
HugePages_Free:      101
HugePages_Rsvd:        0
HugePages_Surp:        0

HugePages_Total:     128
HugePages_Free:      128
HugePages_Rsvd:        0
HugePages_Surp:        0

HugePages_Total:       1
HugePages_Free:        1
HugePages_Rsvd:        0
HugePages_Surp:        0

HugePages_Total:       1
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0

despite the fact that I had configured:

(k)inaddy@bug3984:~$ sudo hugeadm --hard --pool-pages-min 4M:128
(k)inaddy@bug3984:~$ sudo hugeadm --hard --pool-pages-max 4M:128

And HugePages_Total was "128" at the beginning of the test. The total 
number of hugepages oscillates during a successful test and, at the very 
end, HugePages_Total goes back to 128. I disabled THP completely after 
this test and got the same result.

Reading hugetlb_report_meminfo() in hugetlb.c, I am not sure (struct 
hstate *h)->nr_huge_pages should change that much (through new hugepages 
being requested, sysctl, or NUMA balancing, among other options). I'll try 
to trace this tomorrow to see who is touching HugePages_Total during 
the test.

* Theory *

HugePages_Total fluctuates and causes 1 pthread to hit errors before the 
test's readdir() logic on /proc/<pid>/task/X catches whether the thread 
went to sleep, causing a race in that readdir() logic.

* Obs *

I still need to finish this last part investigation. I haven't tried 
with other architectures yet, only x86 on non-pae kernel. I have tested 
a 4.18 kernel (debian) and latest mainline and both can reproduce this.

Will get back to this soon...




* [LTP] LTP: futex_wake04 hangs forever on i386
  2018-10-09  2:05   ` Rafael David Tinoco
@ 2018-10-09  9:28     ` Cyril Hrubis
  2018-10-09 11:45       ` Rafael David Tinoco
  2018-10-09 18:49       ` Rafael David Tinoco
  0 siblings, 2 replies; 20+ messages in thread
From: Cyril Hrubis @ 2018-10-09  9:28 UTC (permalink / raw)
  To: ltp

Hi!
> >> LTP syscall test case futex_wake04 hangs forever on i386 kernel 
> >> running on x86_64 server machine.
> >> 
> >> Test PASS on qemu_i386 and other devices. Test HANGs on i386 (i386
> >>  kernel booting on x86_64 machine).
> >> 
> >> output, incrementing stop futex_wake04    0  TINFO  : Hugepagesize
> >> 4194304
> > 
> > Looks like we are trying to fault 4GB hugepage, that may be quite 
> > slow, but then, the test is a kernel reproducer as well.
> 
> This is i386 without PAE so the hugepage here is 4 MB (or else it would
> be a 2 MB with PAE). I have configured the same environment on my side:

Ah, my bad, the test prints the size in bytes and not in kbytes as it's
in /proc/meminfo. So indeed this is 4MB.

> (k)inaddy@bug3984:~$ sudo hugeadm --hard --pool-pages-max 4M:128
> (k)inaddy@bug3984:~$ sudo hugeadm --hard --pool-pages-min 4M:128
> (k)inaddy@bug3984:~$ sudo hugeadm --pool-list
>        Size  Minimum  Current  Maximum  Default
>     4194304      128      128      128        *
> 
> And ran the test with 500 iterations, multiple times, without a single
> issue.
> 
> (k)root@bug3984:~$ while true; do cat /proc/meminfo | grep -i
> hugepages_free; done
> 
> Shows me that the hugepages are being used:
> 
> HugePages_Free:       93
> HugePages_Free:        4
> HugePages_Free:      128
> HugePages_Free:        0
> HugePages_Free:      128
> HugePages_Free:        1
> HugePages_Free:        8
> HugePages_Free:      128
> HugePages_Free:       27
> HugePages_Free:        0
> HugePages_Free:      102
> 
> > Can you connect to the process with gdb and check if we happened to 
> > timeout before or after the two threads that are supposed to 
> > trigger the kernel bug are started?
> > 
> 
> Reading the test and commit 13d60f4b6a it looks like this test is
> testing futexes indexes stored in hugepages (one in head and the other
> outside the hugepage head) and a indexing problem on which futex() was
> signalized as non-contended.
> 
> Kernel code was using page->index (of page hosting the futex) for futex
> offset and, after this commit, it started taking in consideration if the
> page was a hugepage (head or tail) or a compound page, when then it
> would use the compound_idx for the futex offset, fixing the issue.
> 
> In order for the issue to happen we would have to face:
> 
> test thread #1                    test thread #2
> --------------                    --------------
> [hugepage #1 head] *futex1 offset [hugepage #1 head]
> [page]                            [page] *futex2 offset
> [page]                            [page]
> [page]                            [page]
> ...                               ...
> 
> where futex2 key points to futex1.
> 
> And test case does NOT look like something intermittent (since it is a
> lock ordering issue - depending on pthreads sharing futexes - with wrong
> futexes indexes, like described in kernel commit).
> 
> * Test case problem would either happen or not *
> 
> I tried to extend a bit number of iterations and something caught my 
> attention... when running the same test, multiple times, after some time 
> I would face something like:
> 
> futex_wake04    0  TINFO  :  Thread 7495 not sleeping yet
> futex_wake04    0  TINFO  :  Thread 7495 not sleeping yet
> futex_wake04    0  TINFO  :  Thread 7495 not sleeping yet
> futex_wake04    0  TINFO  :  Thread 7495 not sleeping yet
> futex_wake04    0  TINFO  :  Thread 7495 not sleeping yet
> futex_wake04    0  TINFO  :  Thread 7495 not sleeping yet
> futex_wake04    0  TINFO  :  Thread 7495 not sleeping yet
> futex_wake04    0  TINFO  :  Thread 7495 not sleeping yet
> futex_wake04    0  TINFO  :  Thread 7495 not sleeping yet
> 
> And there wasn't a second thread task id inside:
> 
> /proc/<pid>/task/{<pid>, <tid1>, <tid2>}
> 
> like it should.
> 
> Likely the task creation (because of error ? check last part) logic here 
> was faster than wait_for_threads() logic could expect... causing a race 
> for the test's logic.

That shouldn't really happen, the wait_for_threads(2) function is called
in a loop until there are two sleeping threads under the
/proc/$pid/task/ directory (minus the main thread). So unless there is a
problem with starting a thread or locking on a futex the loop will end
eventually and the futexes are unlocked only after this loop finishes.

> * This would explain the timeout error we had, right ? *
> 
> Since stdout/stderr are being directed to a file: @Naresh, is it 
> possible to get output from stdout/err files from the same LKFT URL you 
> showed ?
> 
> * Wait, there is more ... *
> 
> While running the test, and monitoring meminfo, I could see the 
> HugePages_Total oscillating:
> 
> HugePages_Total:       1
> HugePages_Free:        1
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> 
> HugePages_Total:       1
> HugePages_Free:        1
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> 
> HugePages_Total:      54
> HugePages_Free:       53
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> 
> HugePages_Total:       1
> HugePages_Free:        1
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> 
> HugePages_Total:       1
> HugePages_Free:        0
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> 
> HugePages_Total:     101
> HugePages_Free:      101
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> 
> HugePages_Total:     128
> HugePages_Free:      128
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> 
> HugePages_Total:       1
> HugePages_Free:        1
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> 
> HugePages_Total:       1
> HugePages_Free:        0
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> 
> despite the fact that I had configured:
> 
> (k)inaddy@bug3984:~$ sudo hugeadm --hard --pool-pages-min 4M:128
> (k)inaddy@bug3984:~$ sudo hugeadm --hard --pool-pages-max 4M:128
> 
> And had "128" for HugePages_Total in the beginning of the test. Number 
> of hugepages total oscillates during a successful test and, at the very 
> end, the number for HugePages_Total goes back to 128. I have disabled 
> THP completely after this test and there was the same result.
>
> Reading hugetlb_report_meminfo() in hugetlb.c, I am not sure (struct 
> hstat *h)->nr_huge_pages should change that much (through new hugepages 
> being asked for, sysctl, or numa balance, among other options). I'll try 
> to trace this tomorrow to see who is touching the HugePages_Total during 
> the test.

That is because the test modifies the size of the huge pages pool itself
in the test setup so that it can run on systems where the default is
set to 0. I guess we should change it to check whether orig_hugepages
is non-zero and only change the value when it is zero.

-- 
Cyril Hrubis
chrubis@suse.cz


* [LTP] LTP: futex_wake04 hangs forever on i386
  2018-10-09  9:28     ` Cyril Hrubis
@ 2018-10-09 11:45       ` Rafael David Tinoco
  2018-10-09 18:49       ` Rafael David Tinoco
  1 sibling, 0 replies; 20+ messages in thread
From: Rafael David Tinoco @ 2018-10-09 11:45 UTC (permalink / raw)
  To: ltp

>> And there wasn't a second thread task id inside:
>>
>> /proc/<pid>/task/{<pid>, <tid1>, <tid2>}
>>
>> like it should.
>>
>> Likely the task creation (because of error ? check last part) logic here
>> was faster than wait_for_threads() logic could expect... causing a race
>> for the test's logic.
> 
> That shouldn't really happen, the wait_for_threads(2) function is called
> in a loop until there are two sleeping threads under the
> /proc/$pid/task/ directory (minus the main thread). So unless there is a
> problem with starting a thread or locking on a futex the loop will end
> eventually and the futexes are unlocked only after this loop finishes.

Yes, I'll check why.

>> Reading hugetlb_report_meminfo() in hugetlb.c, I am not sure (struct
>> hstat *h)->nr_huge_pages should change that much (through new hugepages
>> being asked for, sysctl, or numa balance, among other options). I'll try
>> to trace this tomorrow to see who is touching the HugePages_Total during
>> the test.
> 
> That is because the test modifies the size of the huge pages pool itself
> in the test setup so that it can run on a systems where the default is
> set to 0. I guess that we should change it so that it check if the
> orig_hugepages is non-zero or not and only change the value if it's
> zero.

Hum, I missed that =), tks. Will get back to you soon on this.



* [LTP] LTP: futex_wake04 hangs forever on i386
  2018-10-09  9:28     ` Cyril Hrubis
  2018-10-09 11:45       ` Rafael David Tinoco
@ 2018-10-09 18:49       ` Rafael David Tinoco
  2018-10-09 21:06         ` [LTP] [PATCH] futex/futex_wake04.c: fix issues with hugepages and usleep Rafael David Tinoco
  2018-10-10 10:41         ` [LTP] LTP: futex_wake04 hangs forever on i386 Cyril Hrubis
  1 sibling, 2 replies; 20+ messages in thread
From: Rafael David Tinoco @ 2018-10-09 18:49 UTC (permalink / raw)
  To: ltp

>> futex_wake04    0  TINFO  :  Thread 7495 not sleeping yet
>> futex_wake04    0  TINFO  :  Thread 7495 not sleeping yet
>> futex_wake04    0  TINFO  :  Thread 7495 not sleeping yet
>> futex_wake04    0  TINFO  :  Thread 7495 not sleeping yet
>> futex_wake04    0  TINFO  :  Thread 7495 not sleeping yet
>> futex_wake04    0  TINFO  :  Thread 7495 not sleeping yet
>> futex_wake04    0  TINFO  :  Thread 7495 not sleeping yet
>> futex_wake04    0  TINFO  :  Thread 7495 not sleeping yet
>> futex_wake04    0  TINFO  :  Thread 7495 not sleeping yet
>>
>> And there wasn't a second thread task id inside:
>>
>> /proc/<pid>/task/{<pid>, <tid1>, <tid2>}
>>
>> like it should.
>>
>> Likely the task creation (because of error ? check last part) logic here
>> was faster than wait_for_threads() logic could expect... causing a race
>> for the test's logic.
> 
> That shouldn't really happen, the wait_for_threads(2) function is called
> in a loop until there are two sleeping threads under the
> /proc/$pid/task/ directory (minus the main thread). So unless there is a
> problem with starting a thread or locking on a futex the loop will end
> eventually and the futexes are unlocked only after this loop finishes.

Okay, so this is what is happening...

Like we predicted, one of the threads had an issue and the logic was 
stuck with only 2 tasks: the parent and 1 thread. The parent in (R) state 
and the thread in (S) state.

Quoting this situation as an example:

futex_wake04    0  TINFO  :  Thread 11405 not sleeping yet
futex_wake04    0  TINFO  :  Thread 11405 not sleeping yet
futex_wake04    0  TINFO  :  Thread 11405 not sleeping yet
futex_wake04    0  TINFO  :  Thread 11405 not sleeping yet
...

Even though, at the beginning, we had a situation that would allow the 
"while (wait_for_threads(2))" loop to break, as shown here:

(k)inaddy@bug3984:/proc$ PID=11405 ; echo ==== PID=$PID ====; ls -d 
$PID/task/* ; echo COMM ==== ; cat $PID/task/*/comm ; echo STAT ==== ; 
cat $PID/task/*/stat | awk '{print $1" "$2" "$3}' ; echo STACK ====; 
sudo cat $PID/task/*/stack

==== PID=11405 ====
11405/task/11403  11405/task/11404  11405/task/11405
COMM ====
futex_wake04
futex_wake04
futex_wake04
STAT ====
11403 (futex_wake04) S
11404 (futex_wake04) S
11405 (futex_wake04) R
STACK ====
[<0>] hrtimer_nanosleep+0xe6/0x250
[<0>] sys_nanosleep+0x64/0xa0
[<0>] do_fast_syscall_32+0x7f/0x1b0
[<0>] entry_SYSENTER_32+0x4e/0x7c
[<0>] 0xffffffff
[<0>] futex_wait_queue_me+0xb8/0x120
[<0>] futex_wait+0xda/0x210
[<0>] do_futex+0x433/0x9b0
[<0>] sys_futex+0xe9/0x120
[<0>] do_fast_syscall_32+0x7f/0x1b0
[<0>] entry_SYSENTER_32+0x4e/0x7c
[<0>] 0xffffffff
[<0>] apic_timer_interrupt+0x3c/0x44
[<0>] 0xffffffff

Something else happened, interfering with the expected behavior:

(k)inaddy@bug3984:/proc$ PID=11405 ; echo ==== PID=$PID ====; ls -d 
$PID/task/* ; echo COMM ==== ; cat $PID/task/*/comm ; echo STAT ==== ; 
cat $PID/task/*/stat | awk '{print $1" "$2" "$3}' ; echo STACK ====; 
sudo cat $PID/task/*/{comm,stack}

==== PID=11405 ====
11405/task/11403  11405/task/11405
COMM ====
futex_wake04
futex_wake04
STAT ====
11403 (futex_wake04) S
11405 (futex_wake04) R
STACK ====
futex_wake04
futex_wake04
[<0>] wait_woken+0x37/0x70
[<0>] n_tty_write+0x393/0x400
[<0>] tty_write+0x17d/0x310
[<0>] __vfs_write+0x1c/0x50
[<0>] vfs_write+0x94/0x1a0
[<0>] ksys_write+0x49/0xb0
[<0>] sys_write+0x16/0x20
[<0>] do_fast_syscall_32+0x7f/0x1b0
[<0>] entry_SYSENTER_32+0x4e/0x7c
[<0>] 0xffffffff
[<0>] apic_timer_interrupt+0x3c/0x44
[<0>] 0xffffffff

The 11404 thread is missing in this last sample, meaning it could already 
be gone, but the parent, 11405, running the wait_for_threads() function, 
did not catch that (yet?) due to a CPU stall related to the terminal 
controller interrupts:

==============================================================
[  147.582036] INFO: rcu_sched detected stalls on CPUs/tasks:
[  147.583782] 	1-....: (2 GPs behind) idle=602/1/1073741828 
softirq=20147/20148 fqs=2239
[  147.585973] 	(detected by 2, t=5252 jiffies, g=7585, c=7584, q=62923)
[  147.587806] Sending NMI from CPU 2 to CPUs 1:
[  147.590144] NMI backtrace for cpu 1

[  147.590145] CPU: 1 PID: 11405 Comm: futex_wake04 Not tainted 
4.18.0-1-686 #1 Debian 4.18.6-1

[  147.590146] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
BIOS 1.11.1-1 04/01/2014
[  147.590146] EIP: io_serial_out+0x17/0x20
[  147.590147] Code: ...
[  147.590165] EAX: 00000079 EBX: 00000079 ECX: 00000000 EDX: 000003f8
[  147.590165] ESI: f3be0000 EDI: d1446d60 EBP: f393bf2c ESP: f393bf28
[  147.590166] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00000002
[  147.590166] CR0: 80050033 CR2: 0192144c CR3: 31f3b000 CR4: 000406d0
[  147.590167] Call Trace:
[  147.590167]  <IRQ>
[  147.590168]  serial8250_tx_chars+0xf5/0x1f0
[  147.590168]  serial8250_handle_irq.part.21+0xa7/0xd0
[  147.590168]  serial8250_default_handle_irq+0x49/0x60
[  147.590169]  serial8250_interrupt+0x61/0x130
[  147.590169]  __handle_irq_event_percpu+0x42/0x1a0
[  147.590170]  ? handle_level_irq+0x100/0x100
[  147.590170]  handle_irq_event_percpu+0x2b/0x70
[  147.590171]  handle_irq_event+0x2f/0x50
[  147.590171]  ? handle_level_irq+0x100/0x100
[  147.590172]  handle_edge_irq+0x71/0x160
[  147.590172]  handle_irq+0x92/0xa0
[  147.590173]  </IRQ>
[  147.590173]  <SOFTIRQ>
[  147.590173]  do_IRQ+0x44/0xc0
[  147.590174]  ? __softirqentry_text_start+0x8/0x8
[  147.590174]  common_interrupt+0x3b/0x40
[  147.590174] EIP: __do_softirq+0x5b/0x255
[  147.590175] Code: ...
[  147.590192] EAX: 00000282 EBX: f273bf84 ECX: 0000000f EDX: f393e000
[  147.590193] ESI: 00400040 EDI: d1647580 EBP: f393dff8 ESP: f393dfd4
[  147.590194] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00000202
[  147.590194]  ? __softirqentry_text_start+0x8/0x8
[  147.590194]  ? __softirqentry_text_start+0x8/0x8
[  147.590195]  call_on_stack+0x45/0x50
[  147.590196]  </SOFTIRQ>
[  147.590196]  ? irq_exit+0xb5/0xc0
[  147.590196]  ? smp_apic_timer_interrupt+0x6c/0x120
[  147.590197]  ? apic_timer_interrupt+0x3c/0x44
==============================================================

Possibly the terminal I/O rate (stdout) has interfered with the test 
behavior. Combined with the adding/removing of hugepages, I was able to 
"lock" this "sleep task detector" behavior for quite some time.

Testing all this in an ssh session I had no issues. Adding a 1 second 
delay between while() loop iterations also mitigated the issue =).

LKFT depends on the serial console for the test results. Unfortunately we 
don't have a stack trace to analyze, but I would bet the console rate 
limit could play tricks on us, especially if we have already overloaded 
the serial console lines with buffers from other runs.

So, with all that, if you agree, I'll send a patch fixing both issues:

1) add a delay between iterations of the wait_for_threads() loop
2) keep any existing hugepages configuration untouched

Open to any comments...

Best,
-Rafael


* [LTP] [PATCH] futex/futex_wake04.c: fix issues with hugepages and usleep
  2018-10-09 18:49       ` Rafael David Tinoco
@ 2018-10-09 21:06         ` Rafael David Tinoco
  2018-10-10 10:43           ` Cyril Hrubis
  2018-10-10 10:41         ` [LTP] LTP: futex_wake04 hangs forever on i386 Cyril Hrubis
  1 sibling, 1 reply; 20+ messages in thread
From: Rafael David Tinoco @ 2018-10-09 21:06 UTC (permalink / raw)
  To: ltp

This commit fixes 2 observed issues:

1) usleep() time is too small if the test is being executed on slower
terminal devices (especially embedded systems). Raising it to 0.001
seconds was enough to finish 10240 iterations in around 90 seconds in a
4-vcpu KVM guest (with a fully emulated serial port used as console).

2) The test was changing the number of hugepages during the
setup()/cleanup() phase regardless of whether the system had available
hugepages. This caused the overhead of destroying (or creating)
hugepages during the test execution. Now, if the system has > 0
hugepages available, the test doesn't touch them.

Link: https://bugs.linaro.org/show_bug.cgi?id=3984
Signed-off-by: Rafael David Tinoco <rafael.tinoco@linaro.org>
---
 testcases/kernel/syscalls/futex/futex_wake04.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/testcases/kernel/syscalls/futex/futex_wake04.c b/testcases/kernel/syscalls/futex/futex_wake04.c
index 3c7038317..93d36ac33 100644
--- a/testcases/kernel/syscalls/futex/futex_wake04.c
+++ b/testcases/kernel/syscalls/futex/futex_wake04.c
@@ -76,14 +76,17 @@ static void setup(void)
 	tst_tmpdir();
 
 	SAFE_FILE_SCANF(NULL, PATH_NR_HUGEPAGES, "%ld", &orig_hugepages);
-	SAFE_FILE_PRINTF(NULL, PATH_NR_HUGEPAGES, "%d", 1);
+
+	if (!orig_hugepages)
+		SAFE_FILE_PRINTF(NULL, PATH_NR_HUGEPAGES, "%d", 1);
 
 	TEST_PAUSE;
 }
 
 static void cleanup(void)
 {
-	SAFE_FILE_PRINTF(NULL, PATH_NR_HUGEPAGES, "%ld", orig_hugepages);
+	if (!orig_hugepages)
+		SAFE_FILE_PRINTF(NULL, PATH_NR_HUGEPAGES, "%d", 0);
 
 	tst_rmdir();
 }
@@ -172,8 +175,9 @@ static void wakeup_thread2(void)
 				tst_strerrno(res));
 	}
 
+	/* 0.001 seconds: less might cause lockups for slower terminals */
 	while (wait_for_threads(2))
-		usleep(100);
+		usleep(1000);
 
 	futex_wake(futex2, 1, 0);
 
-- 
2.19.1



* [LTP] LTP: futex_wake04 hangs forever on i386
  2018-10-09 18:49       ` Rafael David Tinoco
  2018-10-09 21:06         ` [LTP] [PATCH] futex/futex_wake04.c: fix issues with hugepages and usleep Rafael David Tinoco
@ 2018-10-10 10:41         ` Cyril Hrubis
  2018-10-10 11:13           ` Rafael David Tinoco
  1 sibling, 1 reply; 20+ messages in thread
From: Cyril Hrubis @ 2018-10-10 10:41 UTC (permalink / raw)
  To: ltp

Jo!
> > That shouldn't really happen, the wait_for_threads(2) function is called
> > in a loop until there are two sleeping threads under the
> > /proc/$pid/task/ directory (minus the main thread). So unless there is a
> > problem with starting a thread or locking on a futex the loop will end
> > eventually and the futexes are unlocked only after this loop finishes.
> 
> Okay, so this is what is happening...
> 
> Like we predicted, one of the threads had an issue and the logic was 
> stuck with 2 tasks only: parent and 1 thread. The parent in (R) mode and 
> the thread in (S) mode.

Actually the parent is in the (S) state, which is because the parent
spends most of its time in sleep() anyway.

> Quoting as an example this situation:
> 
> futex_wake04    0  TINFO  :  Thread 11405 not sleeping yet
> futex_wake04    0  TINFO  :  Thread 11405 not sleeping yet
> futex_wake04    0  TINFO  :  Thread 11405 not sleeping yet
> futex_wake04    0  TINFO  :  Thread 11405 not sleeping yet
> ...
> 
> Despite, in the beginning, we had a situation that would allow the 
> "while(wait_for_threads(2))" to break, like showed here:

That is not true, see below.

> (k)inaddy@bug3984:/proc$ PID=11405 ; echo ==== PID=$PID ====; ls -d 
> $PID/task/* ; echo COMM ==== ; cat $PID/task/*/comm ; echo STAT ==== ; 
> cat $PID/task/*/stat | awk '{print $1" "$2" "$3}' ; echo STACK ====; 
> sudo cat $PID/task/*/stack
> 
> ==== PID=11405 ====
> 11405/task/11403  11405/task/11404  11405/task/11405
> COMM ====
> futex_wake04
> futex_wake04
> futex_wake04
> STAT ====
> 11403 (futex_wake04) S
> 11404 (futex_wake04) S
> 11405 (futex_wake04) R

The pids (tids) increment, so the first one is the parent sleeping in
sleep(), the second one is the first forked thread locked on a futex, and
the third one is the child that is supposed to lock on a futex.

> STACK ====
> [<0>] hrtimer_nanosleep+0xe6/0x250
> [<0>] sys_nanosleep+0x64/0xa0
> [<0>] do_fast_syscall_32+0x7f/0x1b0
> [<0>] entry_SYSENTER_32+0x4e/0x7c
> [<0>] 0xffffffff

So this is the parent trace, as expected it sleeps in the nanosleep()
timer.

> [<0>] futex_wait_queue_me+0xb8/0x120
> [<0>] futex_wait+0xda/0x210
> [<0>] do_futex+0x433/0x9b0
> [<0>] sys_futex+0xe9/0x120
> [<0>] do_fast_syscall_32+0x7f/0x1b0
> [<0>] entry_SYSENTER_32+0x4e/0x7c
> [<0>] 0xffffffff

This is the thread that is locked on a futex.

> [<0>] apic_timer_interrupt+0x3c/0x44
> [<0>] 0xffffffff

Huh, are we even supposed to get apic_timer_interrupt in the process
trace?

> Something else happened, interfering in the expected behavior:
> 
> (k)inaddy@bug3984:/proc$ PID=11405 ; echo ==== PID=$PID ====; ls -d 
> $PID/task/* ; echo COMM ==== ; cat $PID/task/*/comm ; echo STAT ==== ; 
> cat $PID/task/*/stat | awk '{print $1" "$2" "$3}' ; echo STACK ====; 
> sudo cat $PID/task/*/{comm,stack}
> 
> ==== PID=11405 ====
> 11405/task/11403  11405/task/11405
> COMM ====
> futex_wake04
> futex_wake04
> STAT ====
> 11403 (futex_wake04) S
> 11405 (futex_wake04) R

Looks like the child that was locked on a futex is no longer there.

If this was running for long enough (5 minutes) the test library process
(parent of 11403 here) sent SIGKILL to the whole process group and the
only process killed was the one that was not stuck in the kernel at the
moment.

> STACK ====
> futex_wake04
> futex_wake04
> [<0>] wait_woken+0x37/0x70
> [<0>] n_tty_write+0x393/0x400
> [<0>] tty_write+0x17d/0x310
> [<0>] __vfs_write+0x1c/0x50
> [<0>] vfs_write+0x94/0x1a0
> [<0>] ksys_write+0x49/0xb0
> [<0>] sys_write+0x16/0x20
> [<0>] do_fast_syscall_32+0x7f/0x1b0
> [<0>] entry_SYSENTER_32+0x4e/0x7c
> [<0>] 0xffffffff

So supposedly the parent is writing something to stderr at this point.

> [<0>] apic_timer_interrupt+0x3c/0x44
> [<0>] 0xffffffff

And the child is (again) in the interrupt handler.

> 11404 thread is missing in this last sample, meaning it could already be 
> gone, but the parent, 11405, running the wait_for_threads() function, 
> did not catch that (yet ?) due to a CPU stall related to the terminal 
> controller interrupts:

The parent will not catch that at all, as wait_for_threads() is a very
simple function that stops the loop if and only if there are two
threads, different from the caller thread, sleeping in the kernel.

> ==============================================================
> [  147.582036] INFO: rcu_sched detected stalls on CPUs/tasks:
> [  147.583782] 	1-....: (2 GPs behind) idle=602/1/1073741828 
> softirq=20147/20148 fqs=2239
> [  147.585973] 	(detected by 2, t=5252 jiffies, g=7585, c=7584, q=62923)
> [  147.587806] Sending NMI from CPU 2 to CPUs 1:
> [  147.590144] NMI backtrace for cpu 1
> 
> [  147.590145] CPU: 1 PID: 11405 Comm: futex_wake04 Not tainted 
> 4.18.0-1-686 #1 Debian 4.18.6-1
> 
> [  147.590146] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
> BIOS 1.11.1-1 04/01/2014
> [  147.590146] EIP: io_serial_out+0x17/0x20
> [  147.590147] Code: ...
> [  147.590165] EAX: 00000079 EBX: 00000079 ECX: 00000000 EDX: 000003f8
> [  147.590165] ESI: f3be0000 EDI: d1446d60 EBP: f393bf2c ESP: f393bf28
> [  147.590166] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00000002
> [  147.590166] CR0: 80050033 CR2: 0192144c CR3: 31f3b000 CR4: 000406d0
> [  147.590167] Call Trace:
> [  147.590167]  <IRQ>
> [  147.590168]  serial8250_tx_chars+0xf5/0x1f0
> [  147.590168]  serial8250_handle_irq.part.21+0xa7/0xd0
> [  147.590168]  serial8250_default_handle_irq+0x49/0x60
> [  147.590169]  serial8250_interrupt+0x61/0x130
> [  147.590169]  __handle_irq_event_percpu+0x42/0x1a0
> [  147.590170]  ? handle_level_irq+0x100/0x100
> [  147.590170]  handle_irq_event_percpu+0x2b/0x70
> [  147.590171]  handle_irq_event+0x2f/0x50
> [  147.590171]  ? handle_level_irq+0x100/0x100
> [  147.590172]  handle_edge_irq+0x71/0x160
> [  147.590172]  handle_irq+0x92/0xa0
> [  147.590173]  </IRQ>
> [  147.590173]  <SOFTIRQ>
> [  147.590173]  do_IRQ+0x44/0xc0
> [  147.590174]  ? __softirqentry_text_start+0x8/0x8
> [  147.590174]  common_interrupt+0x3b/0x40
> [  147.590174] EIP: __do_softirq+0x5b/0x255
> [  147.590175] Code: ...
> [  147.590192] EAX: 00000282 EBX: f273bf84 ECX: 0000000f EDX: f393e000
> [  147.590193] ESI: 00400040 EDI: d1647580 EBP: f393dff8 ESP: f393dfd4
> [  147.590194] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00000202
> [  147.590194]  ? __softirqentry_text_start+0x8/0x8
> [  147.590194]  ? __softirqentry_text_start+0x8/0x8
> [  147.590195]  call_on_stack+0x45/0x50
> [  147.590196]  </SOFTIRQ>
> [  147.590196]  ? irq_exit+0xb5/0xc0
> [  147.590196]  ? smp_apic_timer_interrupt+0x6c/0x120
> [  147.590197]  ? apic_timer_interrupt+0x3c/0x44
> ==============================================================
> 
> Possibly the terminal I/O rate (stdout) has interfered with the test
> behavior. When combining this with the adding/removing of hugepages, I
> was able to "lock" this "sleep task detector" behavior for quite some
> time.
> 
> Testing all this in an ssh session, I had no issues. Adding a 1 second
> delay inside the while() loop for the logic also mitigated the issue =).
> 
> LKFT depends on the serial console for the test results. Unfortunately
> we don't have a stack trace to analyze, but I would bet the console rate
> limit could play tricks on us, especially if we have already overloaded
> the console serial lines with buffered output from other runs.

Doing some simple math: if you have a serial line at 115200 baud it can
transfer about 11.5 bits in 100us, so yes, we can easily overload the
serial connection if we print messages that fast, and increasing the
sleep 10x sounds reasonable.

On the other hand I guess that the kernel shouldn't produce stalls just
because we write to the serial line too fast. I guess that the task
writing to serial should be sleeping in a waitqueue somewhere in the
kernel if it attempts to write to a full serial buffer, which would
effectively slow down the writer, and the test would work fine in this
case...

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [LTP] [PATCH] futex/futex_wake04.c: fix issues with hugepages and usleep
  2018-10-09 21:06         ` [LTP] [PATCH] futex/futex_wake04.c: fix issues with hugepages and usleep Rafael David Tinoco
@ 2018-10-10 10:43           ` Cyril Hrubis
  2018-10-10 11:14             ` Rafael David Tinoco
                               ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Cyril Hrubis @ 2018-10-10 10:43 UTC (permalink / raw)
  To: ltp

Hi!
> This commit fixes 2 observed issues:
> 
> 1) The usleep() time is too small if the test is executed on slower
> terminal devices (especially embedded systems). Raising it to 0.001
> seconds was enough to finish 10240 iterations in around 90 seconds in a
> 4 vcpu kvm guest (fully emulated serial being used as the console).
> 
> 2) The test was changing the number of hugepages during the
> setup()/cleanup() phase regardless of whether the system had hugepages
> available. This was causing the overhead of destroying (or creating)
> hugepages during the test execution. Now, if the system has > 0
> hugepages available, the test doesn't touch it.

Can you please split this into separate patches? Generally it's
considered a good practice to split unrelated changes like these.

> Link: https://bugs.linaro.org/show_bug.cgi?id=3984
> Signed-off-by: Rafael David Tinoco <rafael.tinoco@linaro.org>
> ---
>  testcases/kernel/syscalls/futex/futex_wake04.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/testcases/kernel/syscalls/futex/futex_wake04.c b/testcases/kernel/syscalls/futex/futex_wake04.c
> index 3c7038317..93d36ac33 100644
> --- a/testcases/kernel/syscalls/futex/futex_wake04.c
> +++ b/testcases/kernel/syscalls/futex/futex_wake04.c
> @@ -76,14 +76,17 @@ static void setup(void)
>  	tst_tmpdir();
>  
>  	SAFE_FILE_SCANF(NULL, PATH_NR_HUGEPAGES, "%ld", &orig_hugepages);
> -	SAFE_FILE_PRINTF(NULL, PATH_NR_HUGEPAGES, "%d", 1);
> +
> +	if (!orig_hugepages)

Maybe <= 0 would be a bit more robust.

> +		SAFE_FILE_PRINTF(NULL, PATH_NR_HUGEPAGES, "%d", 1);
>  
>  	TEST_PAUSE;
>  }
>  
>  static void cleanup(void)
>  {
> -	SAFE_FILE_PRINTF(NULL, PATH_NR_HUGEPAGES, "%ld", orig_hugepages);
> +	if (!orig_hugepages)
> +		SAFE_FILE_PRINTF(NULL, PATH_NR_HUGEPAGES, "%d", 0);
>  
>  	tst_rmdir();
>  }
> @@ -172,8 +175,9 @@ static void wakeup_thread2(void)
>  				tst_strerrno(res));
>  	}
>  
> +	/* 0.001 seconds: less might cause lockups for slower terminals */

I would avoid comments like this, you can check the git changelog to
figure out why the value was changed.

>  	while (wait_for_threads(2))
> -		usleep(100);
> +		usleep(1000);
>  
>  	futex_wake(futex2, 1, 0);
>  
> -- 
> 2.19.1
> 

-- 
Cyril Hrubis
chrubis@suse.cz


* [LTP] LTP: futex_wake04 hangs forever on i386
  2018-10-10 10:41         ` [LTP] LTP: futex_wake04 hangs forever on i386 Cyril Hrubis
@ 2018-10-10 11:13           ` Rafael David Tinoco
  2018-10-10 12:05             ` Cyril Hrubis
  0 siblings, 1 reply; 20+ messages in thread
From: Rafael David Tinoco @ 2018-10-10 11:13 UTC (permalink / raw)
  To: ltp

On 10/10/18 7:41 AM, Cyril Hrubis wrote:
> Jo!
>>> That shouldn't really happen, the wait_for_threads(2) function is called
>>> in a loop until there are two sleeping threads under the
>>> /proc/$pid/task/ directory (minus the main thread). So unless there is a
>>> problem with starting a thread or locking on a futex the loop will end
>>> eventually and the futexes are unlocked only after this loop finishes.
>>
>> Okay, so this is what is happening...
>>
>> Like we predicted, one of the threads had an issue and the logic was
>> stuck with 2 tasks only: parent and 1 thread. The parent in (R) mode and
>> the thread in (S) mode.
> 
> Actually the parent is in the (S) mode; that is because the parent
> spends most of its time in sleep() anyway.
> 
>> Quoting as an example this situation:
>>
>> futex_wake04    0  TINFO  :  Thread 11405 not sleeping yet
>> futex_wake04    0  TINFO  :  Thread 11405 not sleeping yet
>> futex_wake04    0  TINFO  :  Thread 11405 not sleeping yet
>> futex_wake04    0  TINFO  :  Thread 11405 not sleeping yet
>> ...
>>
>> Despite, in the beginning, we had a situation that would allow the
>> "while(wait_for_threads(2))" to break, like showed here:
> 
> That is not true, see below.
> 
>> (k)inaddy@bug3984:/proc$ PID=11405 ; echo ==== PID=$PID ====; ls -d
>> $PID/task/* ; echo COMM ==== ; cat $PID/task/*/comm ; echo STAT ==== ;
>> cat $PID/task/*/stat | awk '{print $1" "$2" "$3}' ; echo STACK ====;
>> sudo cat $PID/task/*/stack
>>
>> ==== PID=11405 ====
>> 11405/task/11403  11405/task/11404  11405/task/11405
>> COMM ====
>> futex_wake04
>> futex_wake04
>> futex_wake04
>> STAT ====
>> 11403 (futex_wake04) S
>> 11404 (futex_wake04) S
>> 11405 (futex_wake04) R
> 
> The pids (tids) increment, so the first one is the parent sleeping in
> sleep(), the second one is the first forked thread locked on the futex,
> and the third one is the child that is supposed to lock on a futex.

Yes, I realized this after I wrote this part, during the conclusion, and
never revisited the upper part of this e-mail, sorry.

> 
>> STACK ====
>> [<0>] hrtimer_nanosleep+0xe6/0x250
>> [<0>] sys_nanosleep+0x64/0xa0
>> [<0>] do_fast_syscall_32+0x7f/0x1b0
>> [<0>] entry_SYSENTER_32+0x4e/0x7c
>> [<0>] 0xffffffff
> 
> So this is the parent trace, as expected it sleeps in the nanosleep()
> timer.

Yep

> 
>> [<0>] futex_wait_queue_me+0xb8/0x120
>> [<0>] futex_wait+0xda/0x210
>> [<0>] do_futex+0x433/0x9b0
>> [<0>] sys_futex+0xe9/0x120
>> [<0>] do_fast_syscall_32+0x7f/0x1b0
>> [<0>] entry_SYSENTER_32+0x4e/0x7c
>> [<0>] 0xffffffff
> 
> This is the thread that is locked on a futex.
> 
>> [<0>] apic_timer_interrupt+0x3c/0x44
>> [<0>] 0xffffffff
> 
> Huh, are we even supposed to get apic_timer_interrupt in the process
> trace?

I believe this is the kernel stack for the process, showing it was 
preempted by the IRQ handler for the clock right after it entered the 
kernel. IRQ handlers run with their own stacks, so we can't see the 
rest here, but the function that prints the stack in dmesg does get it 
by checking the "__softirqentry_text_start" section.

>> Something else happened, interfering with the expected behavior:
>>
>> (k)inaddy@bug3984:/proc$ PID=11405 ; echo ==== PID=$PID ====; ls -d
>> $PID/task/* ; echo COMM ==== ; cat $PID/task/*/comm ; echo STAT ==== ;
>> cat $PID/task/*/stat | awk '{print $1" "$2" "$3}' ; echo STACK ====;
>> sudo cat $PID/task/*/{comm,stack}
>>
>> ==== PID=11405 ====
>> 11405/task/11403  11405/task/11405
>> COMM ====
>> futex_wake04
>> futex_wake04
>> STAT ====
>> 11403 (futex_wake04) S
>> 11405 (futex_wake04) R
> 
> Looks like the child that was locked on a futex is no longer there.
> 
> If this was running for long enough (5 minutes), the test library
> process (the parent of 11403 here) sent SIGKILL to the whole process
> group, and the only process killed was the one that was not stuck in
> the kernel at that moment.
>
>> STACK ====
>> futex_wake04
>> futex_wake04
>> [<0>] wait_woken+0x37/0x70
>> [<0>] n_tty_write+0x393/0x400
>> [<0>] tty_write+0x17d/0x310
>> [<0>] __vfs_write+0x1c/0x50
>> [<0>] vfs_write+0x94/0x1a0
>> [<0>] ksys_write+0x49/0xb0
>> [<0>] sys_write+0x16/0x20
>> [<0>] do_fast_syscall_32+0x7f/0x1b0
>> [<0>] entry_SYSENTER_32+0x4e/0x7c
>> [<0>] 0xffffffff
> 
> So supposedly the parent is writing something to stderr at this point.
> 
>> [<0>] apic_timer_interrupt+0x3c/0x44
>> [<0>] 0xffffffff
> 
> And the child is (again) in the interrupt handler.
> 
>> 11404 thread is missing in this last sample, meaning it could already be
>> gone, but the parent, 11405, running the wait_for_threads() function,
>> did not catch that (yet ?) due to a CPU stall related to the terminal
>> controller interrupts:
> 
> The parent will not catch that at all, as wait_for_threads() is a very
> simple function that stops the loop if and only if there are two
> threads, other than the caller thread, sleeping in the kernel.

Agreed.

> 
>> ==============================================================
>> [  147.582036] INFO: rcu_sched detected stalls on CPUs/tasks:
>> [  147.583782] 	1-....: (2 GPs behind) idle=602/1/1073741828
>> softirq=20147/20148 fqs=2239
>> [  147.585973] 	(detected by 2, t=5252 jiffies, g=7585, c=7584, q=62923)
>> [  147.587806] Sending NMI from CPU 2 to CPUs 1:
>> [  147.590144] NMI backtrace for cpu 1
>>
>> [  147.590145] CPU: 1 PID: 11405 Comm: futex_wake04 Not tainted
>> 4.18.0-1-686 #1 Debian 4.18.6-1
>>
>> [  147.590146] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>> BIOS 1.11.1-1 04/01/2014
>> [  147.590146] EIP: io_serial_out+0x17/0x20
>> [  147.590147] Code: ...
>> [  147.590165] EAX: 00000079 EBX: 00000079 ECX: 00000000 EDX: 000003f8
>> [  147.590165] ESI: f3be0000 EDI: d1446d60 EBP: f393bf2c ESP: f393bf28
>> [  147.590166] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00000002
>> [  147.590166] CR0: 80050033 CR2: 0192144c CR3: 31f3b000 CR4: 000406d0
>> [  147.590167] Call Trace:
>> [  147.590167]  <IRQ>
>> [  147.590168]  serial8250_tx_chars+0xf5/0x1f0
>> [  147.590168]  serial8250_handle_irq.part.21+0xa7/0xd0
>> [  147.590168]  serial8250_default_handle_irq+0x49/0x60
>> [  147.590169]  serial8250_interrupt+0x61/0x130
>> [  147.590169]  __handle_irq_event_percpu+0x42/0x1a0
>> [  147.590170]  ? handle_level_irq+0x100/0x100
>> [  147.590170]  handle_irq_event_percpu+0x2b/0x70
>> [  147.590171]  handle_irq_event+0x2f/0x50
>> [  147.590171]  ? handle_level_irq+0x100/0x100
>> [  147.590172]  handle_edge_irq+0x71/0x160
>> [  147.590172]  handle_irq+0x92/0xa0
>> [  147.590173]  </IRQ>
>> [  147.590173]  <SOFTIRQ>
>> [  147.590173]  do_IRQ+0x44/0xc0
>> [  147.590174]  ? __softirqentry_text_start+0x8/0x8
>> [  147.590174]  common_interrupt+0x3b/0x40
>> [  147.590174] EIP: __do_softirq+0x5b/0x255
>> [  147.590175] Code: ...
>> [  147.590192] EAX: 00000282 EBX: f273bf84 ECX: 0000000f EDX: f393e000
>> [  147.590193] ESI: 00400040 EDI: d1647580 EBP: f393dff8 ESP: f393dfd4
>> [  147.590194] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00000202
>> [  147.590194]  ? __softirqentry_text_start+0x8/0x8
>> [  147.590194]  ? __softirqentry_text_start+0x8/0x8
>> [  147.590195]  call_on_stack+0x45/0x50
>> [  147.590196]  </SOFTIRQ>
>> [  147.590196]  ? irq_exit+0xb5/0xc0
>> [  147.590196]  ? smp_apic_timer_interrupt+0x6c/0x120
>> [  147.590197]  ? apic_timer_interrupt+0x3c/0x44
>> ==============================================================
>>
>> Possibly the terminal I/O rate (stdout) has interfered with the test
>> behavior. When combining this with the adding/removing of hugepages, I
>> was able to "lock" this "sleep task detector" behavior for quite some
>> time.
>>
>> Testing all this in an ssh session, I had no issues. Adding a 1 second
>> delay inside the while() loop for the logic also mitigated the issue =).
>>
>> LKFT depends on the serial console for the test results. Unfortunately
>> we don't have a stack trace to analyze, but I would bet the console rate
>> limit could play tricks on us, especially if we have already overloaded
>> the console serial lines with buffered output from other runs.
> 
> Doing some simple math: if you have a serial line at 115200 baud it can
> transfer about 11.5 bits in 100us, so yes, we can easily overload the
> serial connection if we print messages that fast, and increasing the
> sleep 10x sounds reasonable.
> 
> On the other hand I guess that the kernel shouldn't produce stalls just
> because we write to the serial line too fast. I guess that the task
> writing to serial should be sleeping in a waitqueue somewhere in the
> kernel if it attempts to write to a full serial buffer, which would
> effectively slow down the writer, and the test would work fine in this
> case...

That is what happened, no? The stack trace shows that possibly the same 
process that had its execution interrupted by the timer handler in the 
recent past (with a leftover frame pointer indicated by 
"? apic_timer_interrupt") also ran the handler for 
serial8250_interrupt(), and that execution caused some dmesg warnings 
like:

serial8250: too much work for irq

which I did get. I guess having the terminal with a smaller buffer would 
help here as well, since the calls to the interrupt handler would be 
more frequent and less time consuming (because of the baud rate).

Continuing: there was a software interrupt to flush the serial 8250 
contents, and it took too long to flush to the device, causing the task 
waiting on the line to stall.

HOWEVER, in this case, every single byte written to a terminal 
memory-mapped I/O register causes a VM_EXIT (in the host: 
cpu->kvm_run->exit_reason == 2), giving execution back to the QEMU vCPU 
thread (which had entered VM mode), which will schedule a function 
inside the emulation thread to deal with that I/O (for the virtualized 
serial HW being emulated) and call VM_ENTER again at some later point. 
This could make the interrupt handler even slower.

I sent a patch in the thread yesterday; feel free to change it (the 
interval) if you need. After those 2 fixes I would ask Naresh to run 
this same test for some time (patching the LTP version we run on LKFT) 
to make sure we solved the original complaint (despite this behavior) 
and that the interpretations were correct.


* [LTP] [PATCH] futex/futex_wake04.c: fix issues with hugepages and usleep
  2018-10-10 10:43           ` Cyril Hrubis
@ 2018-10-10 11:14             ` Rafael David Tinoco
  2018-10-10 12:06               ` Cyril Hrubis
  2018-10-10 11:41             ` [LTP] [PATCH v2 1/2] futex/futex_wake04.c: fix hugepages setup for test Rafael David Tinoco
  2018-10-10 14:20             ` [LTP] [PATCH] futex/futex_wake04.c: fix issues with hugepages and usleep Rafael David Tinoco
  2 siblings, 1 reply; 20+ messages in thread
From: Rafael David Tinoco @ 2018-10-10 11:14 UTC (permalink / raw)
  To: ltp

On 10/10/18 7:43 AM, Cyril Hrubis wrote:
> Hi!
>> This commit fixes 2 observed issues:
>>
>> 1) The usleep() time is too small if the test is executed on slower
>> terminal devices (especially embedded systems). Raising it to 0.001
>> seconds was enough to finish 10240 iterations in around 90 seconds in a
>> 4 vcpu kvm guest (fully emulated serial being used as the console).
>>
>> 2) The test was changing the number of hugepages during the
>> setup()/cleanup() phase regardless of whether the system had hugepages
>> available. This was causing the overhead of destroying (or creating)
>> hugepages during the test execution. Now, if the system has > 0
>> hugepages available, the test doesn't touch it.
> 
> Can you please split this into separate patches? Generally it's
> considered a good practice to split unrelated changes like these.

Yes, will do. Ignore my last sentence in the other e-mail on the same 
thread (about this patch). Will post 2 patches "v2".

Thanks a lot Cyril!


* [LTP] [PATCH v2 1/2] futex/futex_wake04.c: fix hugepages setup for test
  2018-10-10 10:43           ` Cyril Hrubis
  2018-10-10 11:14             ` Rafael David Tinoco
@ 2018-10-10 11:41             ` Rafael David Tinoco
  2018-10-10 11:41               ` [LTP] [PATCH v2 2/2] futex/futex_wake04.c: raise delay waiting for threads Rafael David Tinoco
  2018-10-10 14:20             ` [LTP] [PATCH] futex/futex_wake04.c: fix issues with hugepages and usleep Rafael David Tinoco
  2 siblings, 1 reply; 20+ messages in thread
From: Rafael David Tinoco @ 2018-10-10 11:41 UTC (permalink / raw)
  To: ltp

The test was changing the number of hugepages during the
setup()/cleanup() phase, regardless of whether the system already had
hugepages configured. This was causing some overhead - destroying (or
creating) hugepages - during the test execution, and also changing the
number of hugepages available to the rest of the system.

Now, if the system has > 0 hugepages available, the test doesn't touch
them.

Link: https://bugs.linaro.org/show_bug.cgi?id=3984
Signed-off-by: Rafael David Tinoco <rafael.tinoco@linaro.org>
---
 testcases/kernel/syscalls/futex/futex_wake04.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/testcases/kernel/syscalls/futex/futex_wake04.c b/testcases/kernel/syscalls/futex/futex_wake04.c
index 3c7038317..1ae9599ab 100644
--- a/testcases/kernel/syscalls/futex/futex_wake04.c
+++ b/testcases/kernel/syscalls/futex/futex_wake04.c
@@ -76,14 +76,17 @@ static void setup(void)
 	tst_tmpdir();
 
 	SAFE_FILE_SCANF(NULL, PATH_NR_HUGEPAGES, "%ld", &orig_hugepages);
-	SAFE_FILE_PRINTF(NULL, PATH_NR_HUGEPAGES, "%d", 1);
+
+	if (!orig_hugepages)
+		SAFE_FILE_PRINTF(NULL, PATH_NR_HUGEPAGES, "%d", 1);
 
 	TEST_PAUSE;
 }
 
 static void cleanup(void)
 {
-	SAFE_FILE_PRINTF(NULL, PATH_NR_HUGEPAGES, "%ld", orig_hugepages);
+	if (!orig_hugepages)
+		SAFE_FILE_PRINTF(NULL, PATH_NR_HUGEPAGES, "%d", 0);
 
 	tst_rmdir();
 }
-- 
2.19.1



* [LTP] [PATCH v2 2/2] futex/futex_wake04.c: raise delay waiting for threads
  2018-10-10 11:41             ` [LTP] [PATCH v2 1/2] futex/futex_wake04.c: fix hugepages setup for test Rafael David Tinoco
@ 2018-10-10 11:41               ` Rafael David Tinoco
  0 siblings, 0 replies; 20+ messages in thread
From: Rafael David Tinoco @ 2018-10-10 11:41 UTC (permalink / raw)
  To: ltp

The usleep() time was too small when the test is executed on slower
terminal devices (especially embedded systems). Raising it to 0.001
seconds was enough to finish 10240 iterations in around 90 seconds in a
4 vcpu kvm guest (fully emulated serial being used as the console).

Link: https://bugs.linaro.org/show_bug.cgi?id=3984
Signed-off-by: Rafael David Tinoco <rafael.tinoco@linaro.org>
---
 testcases/kernel/syscalls/futex/futex_wake04.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/testcases/kernel/syscalls/futex/futex_wake04.c b/testcases/kernel/syscalls/futex/futex_wake04.c
index 1ae9599ab..bd10e56af 100644
--- a/testcases/kernel/syscalls/futex/futex_wake04.c
+++ b/testcases/kernel/syscalls/futex/futex_wake04.c
@@ -175,8 +175,9 @@ static void wakeup_thread2(void)
 				tst_strerrno(res));
 	}
 
+    /* 0.001 seconds: less might cause lockups for slower terminals */
 	while (wait_for_threads(2))
-		usleep(100);
+		usleep(1000);
 
 	futex_wake(futex2, 1, 0);
 
-- 
2.19.1



* [LTP] LTP: futex_wake04 hangs forever on i386
  2018-10-10 11:13           ` Rafael David Tinoco
@ 2018-10-10 12:05             ` Cyril Hrubis
  2018-10-10 12:15               ` Rafael David Tinoco
  0 siblings, 1 reply; 20+ messages in thread
From: Cyril Hrubis @ 2018-10-10 12:05 UTC (permalink / raw)
  To: ltp

Hi!
> > Doing some simple math: if you have a serial line at 115200 baud it can
> > transfer about 11.5 bits in 100us, so yes, we can easily overload the
> > serial connection if we print messages that fast, and increasing the
> > sleep 10x sounds reasonable.
> > 
> > On the other hand I guess that the kernel shouldn't produce stalls just
> > because we write to the serial line too fast. I guess that the task
> > writing to serial should be sleeping in a waitqueue somewhere in the
> > kernel if it attempts to write to a full serial buffer, which would
> > effectively slow down the writer, and the test would work fine in this
> > case...
> 
> That is what happened, no? The stack trace shows that possibly the same 
> process that had its execution interrupted by the timer handler in the 
> recent past (with a leftover frame pointer indicated by 
> "? apic_timer_interrupt") also ran the handler for 
> serial8250_interrupt(), and that execution caused some dmesg warnings 
> like:
> 
> serial8250: too much work for irq
> 
> which I did get. I guess having the terminal with a smaller buffer would 
> help here as well, since the calls to the interrupt handler would be 
> more frequent and less time consuming (because of the baud rate).
> 
> Continuing: there was a software interrupt to flush the serial 8250 
> contents, and it took too long to flush to the device, causing the task 
> waiting on the line to stall.
> 
> HOWEVER, in this case, every single byte written to a terminal 
> memory-mapped I/O register causes a VM_EXIT (in the host: 
> cpu->kvm_run->exit_reason == 2), giving execution back to the QEMU vCPU 
> thread (which had entered VM mode), which will schedule a function 
> inside the emulation thread to deal with that I/O (for the virtualized 
> serial HW being emulated) and call VM_ENTER again at some later point. 
> This could make the interrupt handler even slower.

Ok, I guess that I understand why this is so slow, we emulate PC memory
mapped serial port and we go back and forth for each byte we write. And
I guess that we cannot do much about this.

Also I suppose that it could be "fixed" by switching to virtio serial
driver that should be able to read the whole buffer in one go.

-- 
Cyril Hrubis
chrubis@suse.cz


* [LTP] [PATCH] futex/futex_wake04.c: fix issues with hugepages and usleep
  2018-10-10 11:14             ` Rafael David Tinoco
@ 2018-10-10 12:06               ` Cyril Hrubis
  0 siblings, 0 replies; 20+ messages in thread
From: Cyril Hrubis @ 2018-10-10 12:06 UTC (permalink / raw)
  To: ltp

Hi!
> > Can you please split this into separate patches? Generally it's
> > considered a good practice to split unrelated changes like these.
> 
> Yes, will do. Ignore my last sentence in the other e-mail on the same 
> thread (about this patch). Will post 2 patches "v2".

There were some minor comments; if you agree with them I can amend the
patches you sent before pushing them...

-- 
Cyril Hrubis
chrubis@suse.cz


* [LTP] LTP: futex_wake04 hangs forever on i386
  2018-10-10 12:05             ` Cyril Hrubis
@ 2018-10-10 12:15               ` Rafael David Tinoco
  2018-10-10 12:33                 ` Cyril Hrubis
  0 siblings, 1 reply; 20+ messages in thread
From: Rafael David Tinoco @ 2018-10-10 12:15 UTC (permalink / raw)
  To: ltp

On 10/10/18 9:05 AM, Cyril Hrubis wrote:
>> That is what happened, no? The stack trace shows that possibly the same
>> process that had its execution interrupted by the timer handler in the
>> recent past (with a leftover frame pointer indicated by
>> "? apic_timer_interrupt") also ran the handler for
>> serial8250_interrupt(), and that execution caused some dmesg warnings
>> like:
>>
>> serial8250: too much work for irq
>>
>> which I did get. I guess having the terminal with a smaller buffer would
>> help here as well, since the calls to the interrupt handler would be
>> more frequent and less time consuming (because of the baud rate).
>>
>> Continuing: there was a software interrupt to flush the serial 8250
>> contents, and it took too long to flush to the device, causing the task
>> waiting on the line to stall.
>>
>> HOWEVER, in this case, every single byte written to a terminal
>> memory-mapped I/O register causes a VM_EXIT (in the host:
>> cpu->kvm_run->exit_reason == 2), giving execution back to the QEMU vCPU
>> thread (which had entered VM mode), which will schedule a function
>> inside the emulation thread to deal with that I/O (for the virtualized
>> serial HW being emulated) and call VM_ENTER again at some later point.
>> This could make the interrupt handler even slower.
> Ok, I guess that I understand why this is so slow, we emulate PC memory
> mapped serial port and we go back and forth for each byte we write. And
> I guess that we cannot do much about this.
> 
> Also I suppose that it could be "fixed" by switching to virtio serial
> driver that should be able to read the whole buffer in one go.

Just out of curiosity... Unfortunately, ARM QEMU does not seem to have a 
virtio serial device (yet?), and x86 can't have boot loaders on a virtio 
serial device, only systemd output and the console, afaik. =(.


* [LTP] LTP: futex_wake04 hangs forever on i386
  2018-10-10 12:15               ` Rafael David Tinoco
@ 2018-10-10 12:33                 ` Cyril Hrubis
  2018-10-10 12:48                   ` Rafael David Tinoco
  0 siblings, 1 reply; 20+ messages in thread
From: Cyril Hrubis @ 2018-10-10 12:33 UTC (permalink / raw)
  To: ltp

Hi!
> > Ok, I guess that I understand why this is so slow, we emulate PC memory
> > mapped serial port and we go back and forth for each byte we write. And
> > I guess that we cannot do much about this.
> > 
> > Also I suppose that it could be "fixed" by switching to virtio serial
> > driver that should be able to read the whole buffer in one go.
> 
> Just as curiosity... Unfortunately, ARM QEMU does not seem to have a 
> virtio serial device (yet ?) and x86 can't have boot loaders on a virtio 
> serial device, only systemd output and console, afaik. =(.

Huh, I'm pretty sure we run aarch64 VMs with virtio serial devices.

Also AFAICT you can enable both PC serial port emulation and virtio
serial, and configure the system to start the console on the virtio
serial device.

-- 
Cyril Hrubis
chrubis@suse.cz


* [LTP] LTP: futex_wake04 hangs forever on i386
  2018-10-10 12:33                 ` Cyril Hrubis
@ 2018-10-10 12:48                   ` Rafael David Tinoco
  0 siblings, 0 replies; 20+ messages in thread
From: Rafael David Tinoco @ 2018-10-10 12:48 UTC (permalink / raw)
  To: ltp

On 10/10/18 9:33 AM, Cyril Hrubis wrote:
> Hi!
>>> Ok, I guess that I understand why this is so slow, we emulate PC memory
>>> mapped serial port and we go back and forth for each byte we write. And
>>> I guess that we cannot do much about this.
>>>
>>> Also I suppose that it could be "fixed" by switching to virtio serial
>>> driver that should be able to read the whole buffer in one go.
>>
>> Just as curiosity... Unfortunately, ARM QEMU does not seem to have a
>> virtio serial device (yet ?) and x86 can't have boot loaders on a virtio
>> serial device, only systemd output and console, afaik. =(.
> 
> Huh, I'm pretty sure we run aarch64 VMs with virtio serial devices.

Yes, you are right. This is what I have in my dev machine:

libvirt xml:

     <serial type='pty'>
       <source path='/dev/pts/1'/>
       <target type='system-serial' port='0'>
         <model name='pl011'/>
       </target>
       <alias name='serial0'/>
     </serial>
     <console type='pty' tty='/dev/pts/1'>
       <source path='/dev/pts/1'/>
       <target type='serial' port='0'/>
       <alias name='serial0'/>
     </console>
     <console type='pty'>
       <source path='/dev/pts/2'/>
       <target type='virtio' port='1'/>
       <alias name='console1'/>
     </console>

And the cmdline:

-device 
pcie-root-port,port=0xb,chassis=4,id=pci.4,bus=pcie.0,addr=0x1.0x3 
-device virtio-serial-pci,id=virtio-serial0,bus=pci.2,addr=0x0

-chardev pty,id=charserial0 -serial chardev:charserial0 -chardev 
pty,id=charconsole1 -device virtconsole,chardev=charconsole1,id=console1

> Also AFAIC you can enable both PC serial port emulation and virtio
> serial and configure the system to start console on the virtio serial
> device.

True.

Best,
Rafael


* [LTP] [PATCH] futex/futex_wake04.c: fix issues with hugepages and usleep
  2018-10-10 10:43           ` Cyril Hrubis
  2018-10-10 11:14             ` Rafael David Tinoco
  2018-10-10 11:41             ` [LTP] [PATCH v2 1/2] futex/futex_wake04.c: fix hugepages setup for test Rafael David Tinoco
@ 2018-10-10 14:20             ` Rafael David Tinoco
  2018-10-11 12:23               ` Cyril Hrubis
  2 siblings, 1 reply; 20+ messages in thread
From: Rafael David Tinoco @ 2018-10-10 14:20 UTC (permalink / raw)
  To: ltp

On 10/10/18 7:43 AM, Cyril Hrubis wrote:
> Hi!
>> This commit fixes 2 observed issues:
>>
>> 1) The usleep() time is too small if the test is executed on slower
>> terminal devices (especially embedded systems). Raising it to 0.001
>> seconds was enough to finish 10240 iterations in around 90 seconds in a
>> 4 vcpu kvm guest (fully emulated serial being used as the console).
>>
>> 2) The test was changing the number of hugepages during the
>> setup()/cleanup() phase regardless of whether the system had hugepages
>> available. This was causing the overhead of destroying (or creating)
>> hugepages during the test execution. Now, if the system has > 0
>> hugepages available, the test doesn't touch it.
> 
> Can you please split this into separate patches? Generally it's
> considered a good practice to split unrelated changes like these.
> 
>> Link: https://bugs.linaro.org/show_bug.cgi?id=3984
>> Signed-off-by: Rafael David Tinoco <rafael.tinoco@linaro.org>
>> ---
>>   testcases/kernel/syscalls/futex/futex_wake04.c | 10 +++++++---
>>   1 file changed, 7 insertions(+), 3 deletions(-)
>>
>> diff --git a/testcases/kernel/syscalls/futex/futex_wake04.c b/testcases/kernel/syscalls/futex/futex_wake04.c
>> index 3c7038317..93d36ac33 100644
>> --- a/testcases/kernel/syscalls/futex/futex_wake04.c
>> +++ b/testcases/kernel/syscalls/futex/futex_wake04.c
>> @@ -76,14 +76,17 @@ static void setup(void)
>>   	tst_tmpdir();
>>   
>>   	SAFE_FILE_SCANF(NULL, PATH_NR_HUGEPAGES, "%ld", &orig_hugepages);
>> -	SAFE_FILE_PRINTF(NULL, PATH_NR_HUGEPAGES, "%d", 1);
>> +
>> +	if (!orig_hugepages)
> 
> Maybe <= 0 would be a bit more robust.
> 
>> +		SAFE_FILE_PRINTF(NULL, PATH_NR_HUGEPAGES, "%d", 1);
>>   
>>   	TEST_PAUSE;
>>   }
>>   
>>   static void cleanup(void)
>>   {
>> -	SAFE_FILE_PRINTF(NULL, PATH_NR_HUGEPAGES, "%ld", orig_hugepages);
>> +	if (!orig_hugepages)
>> +		SAFE_FILE_PRINTF(NULL, PATH_NR_HUGEPAGES, "%d", 0);
>>   
>>   	tst_rmdir();
>>   }
>> @@ -172,8 +175,9 @@ static void wakeup_thread2(void)
>>   				tst_strerrno(res));
>>   	}
>>   
>> +	/* 0.001 seconds: less might cause lockups for slower terminals */
> 
> I would avoid comments like this, you can check the git changelog to
> figure out why the value was changed.
> 
>>   	while (wait_for_threads(2))
>> -		usleep(100);
>> +		usleep(1000);
>>   
>>   	futex_wake(futex2, 1, 0);
>>   
>> -- 
>> 2.19.1
>>
> 

I missed these comments in the two v2 patches I sent. Please feel free 
to change them, or let me know if you prefer a v3. Thanks a lot.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [LTP] [PATCH] futex/futex_wake04.c: fix issues with hugepages and usleep
  2018-10-10 14:20             ` [LTP] [PATCH] futex/futex_wake04.c: fix issues with hugepages and usleep Rafael David Tinoco
@ 2018-10-11 12:23               ` Cyril Hrubis
  0 siblings, 0 replies; 20+ messages in thread
From: Cyril Hrubis @ 2018-10-11 12:23 UTC (permalink / raw)
  To: ltp

Hi!
> I missed these comments in the 2 v2 I sent. Please feel free to change 
> or let me know if you prefer a v3. Tks a lot.

Amended and pushed, thanks.

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2018-10-11 12:23 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-08 13:21 [LTP] LTP: futex_wake04 hangs forever on i386 Naresh Kamboju
2018-10-08 13:30 ` Cyril Hrubis
2018-10-09  2:05   ` Rafael David Tinoco
2018-10-09  9:28     ` Cyril Hrubis
2018-10-09 11:45       ` Rafael David Tinoco
2018-10-09 18:49       ` Rafael David Tinoco
2018-10-09 21:06         ` [LTP] [PATCH] futex/futex_wake04.c: fix issues with hugepages and usleep Rafael David Tinoco
2018-10-10 10:43           ` Cyril Hrubis
2018-10-10 11:14             ` Rafael David Tinoco
2018-10-10 12:06               ` Cyril Hrubis
2018-10-10 11:41             ` [LTP] [PATCH v2 1/2] futex/futex_wake04.c: fix hugepages setup for test Rafael David Tinoco
2018-10-10 11:41               ` [LTP] [PATCH v2 2/2] futex/futex_wake04.c: raise delay waiting for threads Rafael David Tinoco
2018-10-10 14:20             ` [LTP] [PATCH] futex/futex_wake04.c: fix issues with hugepages and usleep Rafael David Tinoco
2018-10-11 12:23               ` Cyril Hrubis
2018-10-10 10:41         ` [LTP] LTP: futex_wake04 hangs forever on i386 Cyril Hrubis
2018-10-10 11:13           ` Rafael David Tinoco
2018-10-10 12:05             ` Cyril Hrubis
2018-10-10 12:15               ` Rafael David Tinoco
2018-10-10 12:33                 ` Cyril Hrubis
2018-10-10 12:48                   ` Rafael David Tinoco
