On Fri, Dec 07 2018, Herbert Xu wrote: > On Wed, Dec 05, 2018 at 02:51:02PM +1100, NeilBrown wrote: >> >> If the sequence: >> obj = rhashtable_walk_next(iter); >> rhashtable_walk_stop(iter); >> rhashtable_remove_fast(ht, &obj->head, params); >> rhashtable_walk_start(iter); >> >> races with another thread inserting or removing >> an object on the same hash chain, a subsequent >> rhashtable_walk_next() is not guaranteed to get the "next" >> object. It is possible that an object could be >> repeated, or missed. >> >> This can be made more reliable by keeping the objects in a hash chain >> sorted by memory address. A subsequent rhashtable_walk_next() >> call can reliably find the correct position in the list, and thus >> find the 'next' object. >> >> It is not possible to take this approach with an rhltable as keeping >> the hash chain in order is not so easy. When the first object with a >> given key is removed, it is replaced in the chain with the next >> object with the same key, and the address of that object may not be >> correctly ordered. >> I have not yet found any way to achieve the same stability >> with rhltables, that doesn't have a major impact on lookup >> or insert. No code currently in Linux would benefit from >> such extra stability. >> >> With this patch: >> - a new object is always inserted after the last object with a >> smaller address, or at the start. >> - when rhashtable_walk_start() is called, it records that 'p' is not >> 'safe', meaning that it cannot be dereferenced. The revalidation >> that was previously done here is moved to rhashtable_walk_next() >> - when rhashtable_walk_next() is called while p is not NULL and not >> safe, it walks the chain looking for the first object with an >> address greater than p and returns that. If there is none, it moves >> to the next hash chain. >> >> Signed-off-by: NeilBrown >> --- >> >> This is a resend of a patch that I sent back in July. I couldn't >> applied then because it assumed another rhashtable patch which hadn't >> landed yet - it now has. > > I thought we had agreed to drop this because nobody needs it > currently and it doesn't handle rhlist? Hi Herbert, I think it was agreed that I would not pursue features that were only of use to out-of-tree code, but I don't think that applies here. This is not a feature, this is a quality-of-implementation improvement. There are users in the kernel today which use rhashtable_walk_stop()/rhashtable_walk_start() to drop out of RCU protection for periods during the walk. Any such user might miss seeing an object that has been in the table for a while - sure that is less than optimal, and should be fixed if the cost is small. There are currently no rhlist users which use stop/start to drop out of RCU, so there is no clear value in fixing that case, or cost in not fixing it. Thanks, NeilBrown > > Cheers, > -- > Email: Herbert Xu > Home Page: http://gondor.apana.org.au/~herbert/ > PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt