From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Wang, Yipeng1"
Subject: Re: [PATCH 3/4] hash: fix rw concurrency while moving keys
Date: Mon, 1 Oct 2018 22:56:28 +0000
References: <1536253938-192391-1-git-send-email-honnappa.nagarahalli@arm.com> <1536253938-192391-4-git-send-email-honnappa.nagarahalli@arm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Cc: "dev@dpdk.org", "Gavin Hu (Arm Technology China)", Steve Capper, Ola Liljedahl, nd, "Gobriel, Sameh"
To: Honnappa Nagarahalli, "Richardson, Bruce", "De Lara Guarch, Pablo"
Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by dpdk.org (Postfix) with ESMTP id 2DDA95B20 for ; Tue, 2 Oct 2018 01:05:11 +0200 (CEST)
Content-Language: en-US
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

>-----Original Message-----
>From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
>Sent: Sunday, September 30, 2018 4:06 PM
>To: Wang, Yipeng1; Richardson, Bruce; De Lara Guarch, Pablo
>Cc: dev@dpdk.org; Gavin Hu (Arm Technology China); Steve Capper; Ola Liljedahl; nd
>Subject: RE: [dpdk-dev] [PATCH 3/4] hash: fix rw concurrency while moving keys
>
>> >+		__atomic_store_n(&h->tbl_chng_cnt,
>> >+				h->tbl_chng_cnt + 1,
>> >+				__ATOMIC_RELEASE);
>> >+		/* The stores to sig_alt and sig_current should not
>> >+		 * move above the store to tbl_chng_cnt.
>> >+		 */
>> >+		__atomic_thread_fence(__ATOMIC_RELEASE);
>> >+
>> [Wang, Yipeng] I believe for X86 this fence should not be compiled to any
>> code, otherwise we need macros for the compile time check.
>'__atomic_thread_fence(__ATOMIC_RELEASE)' provides load-store and store-store barriers [1]. Hence, it should not add any
>barriers for x86.
>
>[1] https://preshing.com/20130922/acquire-and-release-fences/
>
[Wang, Yipeng] Thanks for the link, it is very informative!
>>
>> >@@ -926,30 +957,56 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
>> >	uint32_t bucket_idx;
>> >	hash_sig_t alt_hash;
>> >	struct rte_hash_bucket *bkt;
>> >+	uint32_t cnt_b, cnt_a;
>> >	int ret;
>> >
>> >-	bucket_idx = sig & h->bucket_bitmask;
>> >-	bkt = &h->buckets[bucket_idx];
>> >-
>> >	__hash_rw_reader_lock(h);
>> >
>> >-	/* Check if key is in primary location */
>> >-	ret = search_one_bucket(h, key, sig, data, bkt);
>> >-	if (ret != -1) {
>> >-		__hash_rw_reader_unlock(h);
>> >-		return ret;
>> >-	}
>> >-	/* Calculate secondary hash */
>> >-	alt_hash = rte_hash_secondary_hash(sig);
>> >-	bucket_idx = alt_hash & h->bucket_bitmask;
>> >-	bkt = &h->buckets[bucket_idx];
>> >+	do {
>> [Wang, Yipeng] As far as I know, the MemC3 paper "MemC3: Compact and
>> Concurrent MemCache with Dumber Caching and Smarter Hashing"
>> as well as OvS cmap use a similar version counter to implement read-write
>> concurrency for the hash table, but one difference is that the reader checks whether
>> the version counter is even or odd to make sure there is no concurrent writer. Could you just
>> double check and confirm that this is not needed for your implementation?
>>
>I relooked at this paper. My patch makes use of the fact that during the process of shifting, the key will be present in both primary and
>secondary buckets. The check for an odd version counter is not required, as the full key comparison would have identified any false
>signature matches.
[Wang, Yipeng] I was thinking about another corner case and wondering if the version counter needs to be bumped on key deletion as well.
For example, a reader reads out the index and falsely matches a signature; before it reads out the data and the key, the key-data pair gets deleted, removed, and recycled by another writer thread.
This writer partially overwrote the key (the key store is too large to be written atomically). Now the reader begins to compare the key and accidentally matches it (because the partially written key happens to match). Will the reader read the wrong data out (when it should have been a lookup miss)?