All of lore.kernel.org
 help / color / mirror / Atom feed
From: Waiman Long <waiman.long@hpe.com>
To: Jan Kara <jack@suse.cz>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
	Jan Kara <jack@suse.com>, Jeff Layton <jlayton@poochiereds.net>,
	"J. Bruce Fields" <bfields@fieldses.org>,
	Tejun Heo <tj@kernel.org>,
	Christoph Lameter <cl@linux-foundation.org>,
	<linux-fsdevel@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Andi Kleen <andi@firstfloor.org>,
	Dave Chinner <dchinner@redhat.com>,
	Scott J Norton <scott.norton@hp.com>,
	Douglas Hatch <doug.hatch@hp.com>
Subject: Re: [PATCH v3 1/3] lib/percpu-list: Per-cpu list with associated per-cpu locks
Date: Wed, 24 Feb 2016 14:51:17 -0500	[thread overview]
Message-ID: <56CE09B5.2000602@hpe.com> (raw)
In-Reply-To: <20160224075622.GA10096@quack.suse.cz>

On 02/24/2016 02:56 AM, Jan Kara wrote:
> On Tue 23-02-16 14:04:30, Waiman Long wrote:
>> Linked list is used everywhere in the Linux kernel. However, if many
>> threads are trying to add or delete entries into the same linked list,
>> it can create a performance bottleneck.
>>
>> This patch introduces a new per-cpu list subystem with associated
>> per-cpu locks for protecting each of the lists individually. This
>> allows list entries insertion and deletion operations to happen in
>> parallel instead of being serialized with a global list and lock.
>>
>> List entry insertion is strictly per cpu. List deletion, however, can
>> happen in a cpu other than the one that did the insertion. So we still
>> need lock to protect the list. Because of that, there may still be
>> a small amount of contention when deletion is being done.
>>
>> A new header file include/linux/percpu-list.h will be added with the
>> associated pcpu_list_head and pcpu_list_node structures. The following
>> functions are provided to manage the per-cpu list:
>>
>>   1. int init_pcpu_list_head(struct pcpu_list_head **ppcpu_head)
>>   2. void pcpu_list_add(struct pcpu_list_node *node,
>> 		       struct pcpu_list_head *head)
>>   3. void pcpu_list_del(struct pcpu_list *node)
>>
>> Iteration of all the list entries within a group of per-cpu
>> lists is done by calling either the pcpu_list_iterate() or
>> pcpu_list_iterate_safe() functions in a while loop. They correspond
>> to the list_for_each_entry() and list_for_each_entry_safe() macros
>> respectively. The iteration states are keep in a pcpu_list_state
>> structure that is passed to the iteration functions.
>>
>> Signed-off-by: Waiman Long<Waiman.Long@hpe.com>
> Two comments below.
>
>> +/*
>> + * Helper function to find the first entry of the next per-cpu list
>> + * It works somewhat like for_each_possible_cpu(cpu).
>> + *
>> + * Return: true if the entry is found, false if all the lists exhausted
>> + */
>> +static __always_inline bool
>> +__pcpu_list_next_cpu(struct pcpu_list_head *head, struct pcpu_list_state *state)
>> +{
>> +	if (state->lock)
>> +		spin_unlock(state->lock);
>> +next_cpu:
>> +	/*
>> +	 * for_each_possible_cpu(cpu)
>> +	 */
>> +	state->cpu = cpumask_next(state->cpu, cpu_possible_mask);
>> +	if (state->cpu>= nr_cpu_ids)
>> +		return false;	/* All the per-cpu lists iterated */
>> +
>> +	state->head =&per_cpu_ptr(head, state->cpu)->list;
>> +	state->lock =&per_cpu_ptr(head, state->cpu)->lock;
>> +	state->curr = list_entry(state->head->next,
>> +				 struct pcpu_list_node, list);
>> +	if (&state->curr->list == state->head)
>> +		goto next_cpu;
> This might be more comprehensible as:
>
> 	if (list_empty(state->head))
> 		goto next_cpu;
>
> and you can do it just after updating state->head (no need to init
> state->lock&  state->curr if the list is empty).

Thank for the suggestion. Will change the code accordingly.
> Another note: Initialization of state->curr is IMO racy - you need to hold
> state->lock to grab state->curr reliably, don't you? Otherwise someone can
> remove the entry while you are working with it. So you need to move that
> down just before returning.

Right. I will move the initialization of state->curr after the spin_lock().

>> +
>> +	spin_lock(state->lock);
>> +	return true;
>> +}
>> +#endif /* NR_CPUS == 1 */
> ...
>
>> +/*
>> + * Delete a node from a percpu list
>> + *
>> + * We need to check the lock pointer again after taking the lock to guard
>> + * against concurrent delete of the same node. If the lock pointer changes
>> + * (becomes NULL or to a different one), we assume that the deletion was done
>> + * elsewhere.
>> + */
>> +void pcpu_list_del(struct pcpu_list_node *node)
>> +{
>> +	spinlock_t *lock = READ_ONCE(node->lockptr);
>> +
>> +	if (unlikely(!lock))
>> +		return;
>> +
>> +	spin_lock(lock);
>> +	if (likely(lock == node->lockptr)) {
>> +		list_del_init(&node->list);
>> +		node->lockptr = NULL;
>> +	}
> But someone changing lockptr under your hands would mean that there are
> two processes racing to remove entries and that would generally point to a
> problem (and likely use-after-free) in the caller, won't it? Or do you have
> some particular usecase in mind?
>
> 								Honza
>

This is just defensive programming to guard against unforeseen case. I 
don't have any particular use case in mind that will make that happen. 
Maybe I should put a WARN_ON if this really happens.

Cheers,
Longman

  reply	other threads:[~2016-02-24 19:51 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-02-23 19:04 [PATCH v3 0/3] vfs: Use per-cpu list for SB's s_inodes list Waiman Long
2016-02-23 19:04 ` [PATCH v3 1/3] lib/percpu-list: Per-cpu list with associated per-cpu locks Waiman Long
2016-02-24  2:00   ` Boqun Feng
2016-02-24  4:01     ` Waiman Long
2016-02-24  7:56   ` Jan Kara
2016-02-24 19:51     ` Waiman Long [this message]
2016-02-23 19:04 ` [PATCH v3 2/3] fsnotify: Simplify inode iteration on umount Waiman Long
2016-02-23 19:04 ` [PATCH v3 3/3] vfs: Use per-cpu list for superblock's inode list Waiman Long
2016-02-24  8:28   ` Jan Kara
2016-02-24  8:36     ` Ingo Molnar
2016-02-24  8:58       ` Jan Kara
2016-02-25  8:06         ` Ingo Molnar
2016-02-25 14:43           ` Waiman Long
2016-02-24 20:23     ` Waiman Long
2016-02-25 14:50       ` Waiman Long

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=56CE09B5.2000602@hpe.com \
    --to=waiman.long@hpe.com \
    --cc=andi@firstfloor.org \
    --cc=bfields@fieldses.org \
    --cc=cl@linux-foundation.org \
    --cc=dchinner@redhat.com \
    --cc=doug.hatch@hp.com \
    --cc=jack@suse.com \
    --cc=jack@suse.cz \
    --cc=jlayton@poochiereds.net \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=scott.norton@hp.com \
    --cc=tj@kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.