* concurrent rhashtable test failure
@ 2016-10-24 12:11 Geert Uytterhoeven
  2016-10-26  7:33 ` Phil Sutter
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Geert Uytterhoeven @ 2016-10-24 12:11 UTC (permalink / raw)
  To: Phil Sutter; +Cc: Thomas Graf, Herbert Xu, netdev, linux-m68k

Hi Phil,

On m68k/ARAnyM, test_rhashtable fails with:

    Test failed: thread 0 returned: -4

(-4 = -EINTR)

I traced this back to your commit f4a3e90ba5739cfd ("rhashtable-test: extend
to test concurrency"), which added that part of the test.

Diff of the test output between the failing commit and its parent:

 Running rhashtable test nelem=8, max_size=65536, shrinking=0
 Test 00:
   Adding 50000 keys
   Traversal complete: counted=50000, nelems=50000, entries=50000, table-jumps=0
   Traversal complete: counted=50000, nelems=50000, entries=50000, table-jumps=0
   Deleting 50000 keys
-  Duration of test: 1029960000 ns
+  Duration of test: 990000000 ns
 Test 01:
   Adding 50000 keys
   Traversal complete: counted=50000, nelems=50000, entries=50000, table-jumps=0
   Traversal complete: counted=50000, nelems=50000, entries=50000, table-jumps=0
   Deleting 50000 keys
-  Duration of test: 990000000 ns
+  Duration of test: 720000000 ns
 Test 02:
   Adding 50000 keys
   Traversal complete: counted=50000, nelems=50000, entries=50000, table-jumps=0
   Traversal complete: counted=50000, nelems=50000, entries=50000, table-jumps=0
   Deleting 50000 keys
-  Duration of test: 1130000000 ns
+  Duration of test: 700000000 ns
 Test 03:
   Adding 50000 keys
   Traversal complete: counted=50000, nelems=50000, entries=50000, table-jumps=0
   Traversal complete: counted=50000, nelems=50000, entries=50000, table-jumps=0
   Deleting 50000 keys
-  Duration of test: 1080000000 ns
-Average test time: 1057490000
+  Duration of test: 700000000 ns
+Average test time: 777500000
+Testing concurrent rhashtable access from 10 threads
+Test failed: thread 0 returned: -4
+Started 10 threads, 1 failed

Do you have any clue?

Thanks!

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: concurrent rhashtable test failure
  2016-10-24 12:11 concurrent rhashtable test failure Geert Uytterhoeven
  2016-10-26  7:33 ` Phil Sutter
@ 2016-10-26  7:33 ` Phil Sutter
  2016-10-26  9:11   ` Geert Uytterhoeven
  2016-10-26  9:51 ` Thomas Graf
  2016-10-26  9:51 ` Thomas Graf
  3 siblings, 1 reply; 8+ messages in thread
From: Phil Sutter @ 2016-10-26  7:33 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: Thomas Graf, Herbert Xu, netdev, linux-m68k

Hi Geert,

On Mon, Oct 24, 2016 at 02:11:32PM +0200, Geert Uytterhoeven wrote:
> On m68k/ARAnyM, test_rhashtable fails with:
> 
>     Test failed: thread 0 returned: -4
> 
> (-4 = -EINTR)

How reproducible is this? I wonder why out of the ten threads only the
first one fails.

> I traced this back to your commit f4a3e90ba5739cfd ("rhashtable-test: extend
> to test concurrency"), which added that part of the test.
> 
> Diff of the test output between the failing commit and its parent:
> 
>  Running rhashtable test nelem=8, max_size=65536, shrinking=0
>  Test 00:
>    Adding 50000 keys
>    Traversal complete: counted=50000, nelems=50000, entries=50000, table-jumps=0
>    Traversal complete: counted=50000, nelems=50000, entries=50000, table-jumps=0
>    Deleting 50000 keys
> -  Duration of test: 1029960000 ns
> +  Duration of test: 990000000 ns
>  Test 01:
>    Adding 50000 keys
>    Traversal complete: counted=50000, nelems=50000, entries=50000, table-jumps=0
>    Traversal complete: counted=50000, nelems=50000, entries=50000, table-jumps=0
>    Deleting 50000 keys
> -  Duration of test: 990000000 ns
> +  Duration of test: 720000000 ns
>  Test 02:
>    Adding 50000 keys
>    Traversal complete: counted=50000, nelems=50000, entries=50000, table-jumps=0
>    Traversal complete: counted=50000, nelems=50000, entries=50000, table-jumps=0
>    Deleting 50000 keys
> -  Duration of test: 1130000000 ns
> +  Duration of test: 700000000 ns
>  Test 03:
>    Adding 50000 keys
>    Traversal complete: counted=50000, nelems=50000, entries=50000, table-jumps=0
>    Traversal complete: counted=50000, nelems=50000, entries=50000, table-jumps=0
>    Deleting 50000 keys
> -  Duration of test: 1080000000 ns
> -Average test time: 1057490000
> +  Duration of test: 700000000 ns
> +Average test time: 777500000
> +Testing concurrent rhashtable access from 10 threads
> +Test failed: thread 0 returned: -4
> +Started 10 threads, 1 failed
> 
> Do you have any clue?

Not really, I merely implemented the test.

Thanks, Phil

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: concurrent rhashtable test failure
  2016-10-26  7:33 ` Phil Sutter
@ 2016-10-26  9:11   ` Geert Uytterhoeven
  0 siblings, 0 replies; 8+ messages in thread
From: Geert Uytterhoeven @ 2016-10-26  9:11 UTC (permalink / raw)
  To: Phil Sutter, Geert Uytterhoeven, Thomas Graf, Herbert Xu, netdev,
	linux-m68k

On Wed, Oct 26, 2016 at 9:33 AM, Phil Sutter <phil@nwl.cc> wrote:
> On Mon, Oct 24, 2016 at 02:11:32PM +0200, Geert Uytterhoeven wrote:
>> On m68k/ARAnyM, test_rhashtable fails with:
>>
>>     Test failed: thread 0 returned: -4
>>
>> (-4 = -EINTR)
>
> How reproducible is this? I wonder why out of the ten threads only the
> first one fails.

100% reproducible.

Does it need SMP?

Gr{oetje,eeting}s,

                        Geert

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: concurrent rhashtable test failure
  2016-10-24 12:11 concurrent rhashtable test failure Geert Uytterhoeven
                   ` (2 preceding siblings ...)
  2016-10-26  9:51 ` Thomas Graf
@ 2016-10-26  9:51 ` Thomas Graf
  2016-10-26 11:45   ` Geert Uytterhoeven
  2016-10-26 11:45   ` Geert Uytterhoeven
  3 siblings, 2 replies; 8+ messages in thread
From: Thomas Graf @ 2016-10-26  9:51 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: Phil Sutter, Herbert Xu, netdev, linux-m68k

On 10/24/16 at 02:11pm, Geert Uytterhoeven wrote:
> Hi Phil,
> 
> On m68k/ARAnyM, test_rhashtable fails with:
> 
>     Test failed: thread 0 returned: -4
> 
> (-4 = -EINTR)

The error is returned by kthread_stop(). I suspect we are running into
this path, where the thread is asked to stop before it ever gets to call
threadfn():

static int kthread(void *_create)
{
        [...]
        complete(done);
        schedule();

        ret = -EINTR;

        if (!test_bit(KTHREAD_SHOULD_STOP, &self.flags)) {
                __kthread_parkme(&self);
                ret = threadfn(data);
        }
        /* we can't just return, we must preserve "self" on stack */
        do_exit(ret);
}
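
To make the suspected ordering concrete, here is a minimal userspace sketch
of the same decision. It is not kernel code: struct fake_kthread,
fake_kthread_run() and the two scenario helpers are hypothetical names made
up for illustration, and the should_stop field merely stands in for
KTHREAD_SHOULD_STOP. It only models the branch quoted above, assuming the
stop request can land before the new thread is scheduled for the first time.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-thread state used by kthread(). */
struct fake_kthread {
	bool should_stop;             /* models KTHREAD_SHOULD_STOP */
	int (*threadfn)(void *data);  /* worker body, like threadfn above */
	void *data;
	int calls;                    /* how often threadfn was entered */
};

/* Models the tail of kthread(): what happens once the thread first runs. */
static int fake_kthread_run(struct fake_kthread *kt)
{
	int ret = -EINTR;

	assert(kt != NULL && kt->threadfn != NULL);

	/* threadfn is skipped entirely if stop was requested already. */
	if (!kt->should_stop) {
		kt->calls++;
		ret = kt->threadfn(kt->data);
	}
	return ret;
}

/* Trivial worker that always succeeds. */
static int worker_ok(void *data)
{
	(void)data;
	return 0;
}

/* Scenario A: the stop request wins the race against the first schedule. */
static int stop_before_first_run(void)
{
	struct fake_kthread kt = { .threadfn = worker_ok };

	kt.should_stop = true;         /* stop requested first */
	return fake_kthread_run(&kt);  /* thread runs only afterwards */
}

/* Scenario B: the thread runs its worker before the stop request. */
static int stop_after_first_run(void)
{
	struct fake_kthread kt = { .threadfn = worker_ok };
	int ret = fake_kthread_run(&kt);

	kt.should_stop = true;         /* too late to affect the result */
	return ret;
}
```

Calling stop_before_first_run() from a small main() yields -EINTR without
the worker ever being entered, while stop_after_first_run() yields the
worker's own return value. Whether this is what actually happens on
m68k/ARAnyM is still only a suspicion at this point in the thread.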

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: concurrent rhashtable test failure
  2016-10-26  9:51 ` Thomas Graf
@ 2016-10-26 11:45   ` Geert Uytterhoeven
  2016-10-26 11:45   ` Geert Uytterhoeven
  1 sibling, 0 replies; 8+ messages in thread
From: Geert Uytterhoeven @ 2016-10-26 11:45 UTC (permalink / raw)
  To: Thomas Graf; +Cc: Phil Sutter, Herbert Xu, netdev, linux-m68k

Hi Thomas,

On Wed, Oct 26, 2016 at 11:51 AM, Thomas Graf <tgraf@suug.ch> wrote:
> On 10/24/16 at 02:11pm, Geert Uytterhoeven wrote:
>> On m68k/ARAnyM, test_rhashtable fails with:
>>
>>     Test failed: thread 0 returned: -4
>>
>> (-4 = -EINTR)
>
> The error is returned by kthread_stop(), I suspect we are running into
> this:
>
> static int kthread(void *_create)
> {
>         [...]
>         complete(done);
>         schedule();
>
>         ret = -EINTR;
>
>         if (!test_bit(KTHREAD_SHOULD_STOP, &self.flags)) {
>                 __kthread_parkme(&self);
>                 ret = threadfn(data);
>         }
>         /* we can't just return, we must preserve "self" on stack */
>         do_exit(ret);
> }

Looks reasonable. Adding a small delay like in the (whitespace-damaged)
patch below fixes the issue for me.

However, shouldn't the prestart_sem take care of making sure that
all threads have been started?

--- a/lib/test_rhashtable.c
+++ b/lib/test_rhashtable.c
@@ -13,6 +13,7 @@
  * Self Test
  **************************************************************************/

+#include <linux/delay.h>
 #include <linux/init.h>
 #include <linux/jhash.h>
 #include <linux/kernel.h>
@@ -403,6 +404,7 @@ static int __init test_rht_init(void)
                pr_err("  down interruptible failed\n");
        for (i = 0; i < tcount; i++)
                up(&startup_sem);
+       msleep(1000);
        for (i = 0; i < tcount; i++) {
                if (IS_ERR(tdata[i].task))
                        continue;

Gr{oetje,eeting}s,

                        Geert

^ permalink raw reply	[flat|nested] 8+ messages in thread
