Date: Wed, 24 Feb 2016 08:56:22 +0100
From: Jan Kara
To: Waiman Long
Cc: Alexander Viro, Jan Kara, Jeff Layton, "J. Bruce Fields", Tejun Heo,
	Christoph Lameter, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, Ingo Molnar, Peter Zijlstra, Andi Kleen,
	Dave Chinner, Scott J Norton, Douglas Hatch
Subject: Re: [PATCH v3 1/3] lib/percpu-list: Per-cpu list with associated per-cpu locks
Message-ID: <20160224075622.GA10096@quack.suse.cz>
References: <1456254272-42313-1-git-send-email-Waiman.Long@hpe.com>
	<1456254272-42313-2-git-send-email-Waiman.Long@hpe.com>
In-Reply-To: <1456254272-42313-2-git-send-email-Waiman.Long@hpe.com>

On Tue 23-02-16 14:04:30, Waiman Long wrote:
> Linked lists are used everywhere in the Linux kernel. However, if many
> threads are trying to add or delete entries on the same linked list,
> it can create a performance bottleneck.
>
> This patch introduces a new per-cpu list subsystem with associated
> per-cpu locks for protecting each of the lists individually. This
> allows list entry insertion and deletion operations to happen in
> parallel instead of being serialized with a global list and lock.
>
> List entry insertion is strictly per cpu. List deletion, however, can
> happen on a cpu other than the one that did the insertion, so we still
> need a lock to protect the list. Because of that, there may still be
> a small amount of contention when deletions are being done.
>
> A new header file include/linux/percpu-list.h will be added with the
> associated pcpu_list_head and pcpu_list_node structures. The following
> functions are provided to manage the per-cpu list:
>
>  1. int init_pcpu_list_head(struct pcpu_list_head **ppcpu_head)
>  2. void pcpu_list_add(struct pcpu_list_node *node,
>                        struct pcpu_list_head *head)
>  3. void pcpu_list_del(struct pcpu_list_node *node)
>
> Iteration over all the list entries within a group of per-cpu
> lists is done by calling either the pcpu_list_iterate() or
> pcpu_list_iterate_safe() function in a while loop. They correspond
> to the list_for_each_entry() and list_for_each_entry_safe() macros
> respectively. The iteration state is kept in a pcpu_list_state
> structure that is passed to the iteration functions.
>
> Signed-off-by: Waiman Long

Two comments below.

> +/*
> + * Helper function to find the first entry of the next per-cpu list.
> + * It works somewhat like for_each_possible_cpu(cpu).
> + *
> + * Return: true if an entry is found, false if all the lists are exhausted
> + */
> +static __always_inline bool
> +__pcpu_list_next_cpu(struct pcpu_list_head *head, struct pcpu_list_state *state)
> +{
> +	if (state->lock)
> +		spin_unlock(state->lock);
> +next_cpu:
> +	/*
> +	 * for_each_possible_cpu(cpu)
> +	 */
> +	state->cpu = cpumask_next(state->cpu, cpu_possible_mask);
> +	if (state->cpu >= nr_cpu_ids)
> +		return false;	/* All the per-cpu lists iterated */
> +
> +	state->head = &per_cpu_ptr(head, state->cpu)->list;
> +	state->lock = &per_cpu_ptr(head, state->cpu)->lock;
> +	state->curr = list_entry(state->head->next,
> +				 struct pcpu_list_node, list);
> +	if (&state->curr->list == state->head)
> +		goto next_cpu;

This might be more comprehensible as:

	if (list_empty(state->head))
		goto next_cpu;

and you can do it just after updating state->head (no need to initialize
state->lock and state->curr if the list is empty).

Another note: the initialization of state->curr is IMO racy - you need to
hold state->lock to grab state->curr reliably, don't you? Otherwise someone
can remove the entry while you are working with it. So you need to move that
down to just before returning.
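(Purely as an illustration, not the actual revised patch: with both
suggestions applied, i.e. the unlocked list_empty() check done right after
state->head is set and state->curr only picked up once state->lock is held,
the tail of the helper might look roughly like the sketch below. Whether the
emptiness then needs to be re-checked under the lock is a separate question
that the sketch leaves open.)

next_cpu:
	state->cpu = cpumask_next(state->cpu, cpu_possible_mask);
	if (state->cpu >= nr_cpu_ids)
		return false;	/* all the per-cpu lists iterated */

	state->head = &per_cpu_ptr(head, state->cpu)->list;
	if (list_empty(state->head))
		goto next_cpu;	/* skip empty lists without touching the lock */

	state->lock = &per_cpu_ptr(head, state->cpu)->lock;
	spin_lock(state->lock);
	/* state->curr is only stable while state->lock is held */
	state->curr = list_entry(state->head->next,
				 struct pcpu_list_node, list);
	return true;

That keeps the lock acquisition and the first dereference of the list
together, at the cost of peeking at the list emptiness without the lock.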
> +
> +	spin_lock(state->lock);
> +	return true;
> +}
> +#endif /* NR_CPUS == 1 */

...

> +/*
> + * Delete a node from a percpu list
> + *
> + * We need to check the lock pointer again after taking the lock to guard
> + * against concurrent deletion of the same node. If the lock pointer changes
> + * (becomes NULL or points to a different one), we assume that the deletion
> + * was done elsewhere.
> + */
> +void pcpu_list_del(struct pcpu_list_node *node)
> +{
> +	spinlock_t *lock = READ_ONCE(node->lockptr);
> +
> +	if (unlikely(!lock))
> +		return;
> +
> +	spin_lock(lock);
> +	if (likely(lock == node->lockptr)) {
> +		list_del_init(&node->list);
> +		node->lockptr = NULL;
> +	}

But someone changing lockptr under your hands would mean that there are two
processes racing to remove entries, and that would generally point to a
problem (and a likely use-after-free) in the caller, wouldn't it? Or do you
have some particular use case in mind?

								Honza
--
Jan Kara
SUSE Labs, CR