From: Akira Yokosawa <akiyks@gmail.com>
To: "Paul E. McKenney" <paulmck@linux.ibm.com>
Cc: perfbook@vger.kernel.org, Akira Yokosawa <akiyks@gmail.com>
Subject: Re: [PATCH] EXP hashtorture.h: Avoid sporadic SIGSEGV in hash_bkt_rcu
Date: Thu, 3 Jan 2019 00:02:56 +0900
Message-ID: <3fa9dcca-14d0-876d-fdcb-5db7eff3a97b@gmail.com>
In-Reply-To: <20190101180025.GY4170@linux.ibm.com>

On 2019/01/01 10:00:25 -0800, Paul E. McKenney wrote:
> On Tue, Jan 01, 2019 at 09:27:41AM +0900, Akira Yokosawa wrote:
>> On 2018/12/31 13:03:07 -0800, Paul E. McKenney wrote:
>>> On Tue, Jan 01, 2019 at 12:15:23AM +0900, Akira Yokosawa wrote:
>>>> From 52f5d218442eb64f2798335d56a1838f90d96d5f Mon Sep 17 00:00:00 2001
>>>> From: Akira Yokosawa <akiyks@gmail.com>
>>>> Date: Mon, 30 Dec 2018 22:54:43 +0900
>>>> Subject: [PATCH] EXP hashtorture.h: Avoid sporadic SIGSEGV in hash_bkt_rcu
>>>>
>>>> Commit 4e22bdc905ff ("Wait at end of test for call_rcu() to finish")
>>>> added a couple of synchronize_rcu()s in perftest_update()
>>>> and zoo_reader().
>>>>
>>>> However, there still remains sporadic SIGSEGV in
>>>>
>>>>     $ ./hash_bkt_rcu --perftest --nupdaters 3
>>>>
>>>> On the other hand,
>>>>
>>>>     $ ./hash_bkt_rcu --schroedinger --nupdaters 3
>>>>
>>>> does not show such an issue. Just moving the synchronize_rcu() calls
>>>> from zoo_reader() to zoo_updater() does not resolve the
>>>> SIGSEGV.
>>>>
>>>>
>>>> This commit defines rcu_barrier() if it is not available,
>>>> and calls it both before and after the final loop
>>>> of perftest_updater() and zoo_updater().
>>>>
>>>> It looks like this change can fix the above mentioned
>>>> SIGSEGV in "--perftest".
>>>>
>>>> [Tested on Ubuntu Xenial with liburcu-dev/xenial,now 0.9.1-3 and
>>>> liburcu4/xenial,now 0.9.1-3 installed.]
>>>>
>>>> NOTE:
>>>>
>>>>     $ ./hash_resize --schroedinger --resizemult 2 --duration 20
>>>
>>> I get SIGSEGV and hangs from time to time, so I am looking into this.
>>> Thank you for calling it to my attention!
>>
>> I've found some suspicious code in hash_resize.c
>>
>> hashtab_lock_mod() takes care of ongoing resizing and spin_lock()s
>> the new bucket if necessary. This is good for add, but for delete
>> we may still need to lock the old bucket.
>>
>> And hashtab_unlock_mod() doesn't take ongoing resizing into account, so
>> there can be a mismatch between spin_lock() and spin_unlock().
>>
>> Also, htp_master->ht_cur can change during the
>> hashtab_lock_mod() -- hashtab_unlock_mod() critical section
>> because the update of the pointer by rcu_assign_pointer()
>> is ahead of synchronize_rcu().
>>
>> Given the resizing is infrequent, the simplest way might be to
>> block hashtab_lock_mod while resizing is going on.
> 
> I do believe you have found something here, and thank you!  So the
> answer to my earlier question as to whether I was smarter when writing
> it than now is clearly that I was equally stupid in both cases.  ;-)
> 
> Well, it is conference-driven code, but still high time for me to
> clean it up.
> 
>> There can be a better way to keep concurrent add/del/resize, though.
>> Happy hacking! ;-) 
> 
> I do believe that I can preserve concurrency between resizing and
> deletion, but that is clearly for me to prove.

There is one more thing I've noticed with "hash_resize --schroedinger".
*Without* resizing enabled, it says:

    $ ./hash_resize --schroedinger
    nlookups: 91373 91373  ncats: 0  nadds: 5  ndels: 6  duration: 10.851
    ns/read: 118.755  ns/update: 986455

The second nlookups figure equals the first, which means that all the
lookups failed. OTOH, hash_bkt_rcu works as expected:

    $ ./hash_bkt_rcu --schroedinger
    nlookups: 56064 28004  ncats: 0  nadds: 5  ndels: 5  duration: 10.373
    ns/read: 185.021  ns/update: 1.0373e+06

(ns/read looks slow because compiler optimization is disabled.)

There seems to be some mismatch in hash/key handling between hash_resize.c
and hashtorture.h. I've not yet figured out the cause, though.

        Thanks, Akira

> 
> And thank you again!
> 
> 							Thanx, Paul
> 
>>         Thanks, Akira
>>>
>>>> still fails with SIGSEGV frequently in zoo_del(). GDB says:
>>>>
>>>>     (gdb) where
>>>>     #0  0x0000000000402b27 in cds_list_del_rcu (elem=0x7ff8fc0138f0)
>>>>         at /usr/include/urcu/rculist.h:71
>>>>     #1  hashtab_del (htep=0x7ff8fc0138d0, htp_master=<optimized out>)
>>>>         at hash_resize.c:261
>>>>     #2  zoo_del (zhep=0x7ff8fc0138d0) at hashtorture.h:1007
>>>>     #3  zoo_updater (arg=0x1e8b298) at hashtorture.h:1153
>>>>     #4  0x00007ff9057d16ba in start_thread (arg=0x7ff903fed700)
>>>>         at pthread_create.c:333
>>>>     #5  0x00007ff9050f741d in clone ()
>>>>         at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
>>>>
>>>> Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
>>>
>>> Good catch, queued and pushed, thank you!
>>>
>>> With one small modification -- given that liburcu has had rcu_barrier()
>>> for some years now, I removed the "training wheels" (and unreliable)
>>> use of the wait and pair of synchronize_rcu() calls.
>>>
>>>> ---
>>>> Hi Paul,
>>>>
>>>> This is a partial fix, but it resolves SIGSEGV in "--perftest" of
>>>> hash_bkt_rcu and hash_resize.
>>>>
>>>> "--schroedinger" of hash_resize with resizing enabled still seg faults
>>>> as mentioned in the commit log.
>>>>
>>>> By the way, what version of liburcu are you using?
>>>
>>> It is about two years old, but it does have rcu_barrier().
>>>
>>> 								Thanx, Paul
>>>
>>>>         Thanks, Akira
>>>> --
>>>>  CodeSamples/datastruct/hash/hashtorture.h | 24 ++++++++++++++++--------
>>>>  1 file changed, 16 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/CodeSamples/datastruct/hash/hashtorture.h b/CodeSamples/datastruct/hash/hashtorture.h
>>>> index 0e90220..9ae3dfa 100644
>>>> --- a/CodeSamples/datastruct/hash/hashtorture.h
>>>> +++ b/CodeSamples/datastruct/hash/hashtorture.h
>>>> @@ -55,6 +55,15 @@ void (*defer_del_done)(struct ht_elem *htep) = NULL;
>>>>  #ifndef quiescent_state
>>>>  #define quiescent_state() do ; while (0)
>>>>  #define synchronize_rcu() do ; while (0)
>>>> +#define rcu_barrier() do ; while (0)
>>>> +#else
>>>> +#ifndef rcu_barrier
>>>> +#define rcu_barrier() do { \
>>>> +		synchronize_rcu(); \
>>>> +		poll(NULL, 0, 100); \
>>>> +		synchronize_rcu(); \
>>>> +	} while (0)
>>>> +#endif /* #ifndef rcu_barrier */
>>>>  #endif /* #ifndef quiescent_state */
>>>>  
>>>>  /*
>>>> @@ -765,6 +774,7 @@ void *perftest_reader(void *arg)
>>>>  		if (i >= ne)
>>>>  			i = i % ne + offset;
>>>>  	}
>>>> +
>>>>  	pap->nlookups = nlookups;
>>>>  	pap->nlookupfails = nlookupfails;
>>>>  	hash_unregister_thread();
>>>> @@ -839,6 +849,7 @@ void *perftest_updater(void *arg)
>>>>  			quiescent_state();
>>>>  	}
>>>>  
>>>> +	rcu_barrier();
>>>>  	/* Test over, so remove all our elements from the hash table. */
>>>>  	for (i = 0; i < elperupdater; i++) {
>>>>  		if (thep[i].in_table != 1)
>>>> @@ -846,10 +857,7 @@ void *perftest_updater(void *arg)
>>>>  		BUG_ON(!perftest_lookup(thep[i].data));
>>>>  		perftest_del(&thep[i]);
>>>>  	}
>>>> -	/* Really want rcu_barrier(), but missing from old liburcu versions. */
>>>> -	synchronize_rcu();
>>>> -	poll(NULL, 0, 100);
>>>> -	synchronize_rcu();
>>>> +	rcu_barrier();
>>>>  
>>>>  	hash_unregister_thread();
>>>>  	free(thep);
>>>> @@ -1048,10 +1056,6 @@ void *zoo_reader(void *arg)
>>>>  		if (i >= ne)
>>>>  			i = i % ne + offset;
>>>>  	}
>>>> -	/* Really want rcu_barrier(), but missing from old liburcu versions. */
>>>> -	synchronize_rcu();
>>>> -	poll(NULL, 0, 100);
>>>> -	synchronize_rcu();
>>>>  
>>>>  	pap->nlookups = nlookups;
>>>>  	pap->nlookupfails = nlookupfails;
>>>> @@ -1136,15 +1140,19 @@ void *zoo_updater(void *arg)
>>>>  			quiescent_state();
>>>>  	}
>>>>  
>>>> +	rcu_barrier();
>>>>  	/* Test over, so remove all our elements from the hash table. */
>>>>  	for (i = 0; i < elperupdater; i++) {
>>>>  		if (!zheplist[i])
>>>>  			continue;
>>>>  		zoo_del(zheplist[i]);
>>>>  	}
>>>> +	rcu_barrier();
>>>> +
>>>>  	hash_unregister_thread();
>>>>  	pap->nadds = nadds;
>>>>  	pap->ndels = ndels;
>>>> +	free(zheplist);
>>>>  	return NULL;
>>>>  }
>>>>  
>>>> -- 
>>>> 2.7.4
>>>>
>>>>
>>>
>>
> 

