From: Boqun Feng <boqun.feng@gmail.com>
To: Waiman Long <longman@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
	Jan Kara <jack@suse.com>, Jeff Layton <jlayton@poochiereds.net>,
	"J. Bruce Fields" <bfields@fieldses.org>,
	Tejun Heo <tj@kernel.org>,
	Christoph Lameter <cl@linux-foundation.org>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Andi Kleen <andi@firstfloor.org>,
	Dave Chinner <dchinner@redhat.com>,
	Davidlohr Bueso <dave@stgolabs.net>
Subject: Re: [PATCH v7 1/6] lib/dlock-list: Distributed and lock-protected lists
Date: Tue, 10 Oct 2017 13:35:04 +0800	[thread overview]
Message-ID: <20171010053504.m37pzhammhgucqyy@tardis> (raw)
In-Reply-To: <1507229008-20569-2-git-send-email-longman@redhat.com>


On Thu, Oct 05, 2017 at 06:43:23PM +0000, Waiman Long wrote:
[...]
> +/*
> + * As all the locks in the dlock list are dynamically allocated, they need
> + * to belong to their own special lock class to avoid warnings and stack
> + * traces in the kernel log when lockdep is enabled. Statically allocated
> + * locks don't have this problem.
> + */
> +static struct lock_class_key dlock_list_key;
> +

So this way you make all dlock_lists share the same lock_class_key,
which means that if there are two structures:

	struct some_a {
		...
		struct dlock_list_heads dlists;
	};

	struct some_b {
		...
		struct dlock_list_heads dlists;
	};

some_a::dlists and some_b::dlists are going to have the same lockdep
class. Is this what you want? If not, you may want to do something like
init_srcu_struct() does, which gives each call site its own static
lock_class_key.
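
I.e., just a sketch of what I mean; a __alloc_dlock_list_heads() helper
that takes a key argument is hypothetical at this point:

	#ifdef CONFIG_DEBUG_LOCK_ALLOC
	/* One static key per call site gives each user its own class. */
	#define alloc_dlock_list_heads(dlist)				\
	({								\
		static struct lock_class_key __key;			\
									\
		__alloc_dlock_list_heads((dlist), &__key);		\
	})
	#endif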

> +/*
> + * Initialize the cpu2idx mapping table
> + *
> + * It is possible for a dlock-list to be allocated before cpu2idx is
> + * initialized. In that case, all the CPUs are mapped to the first entry
> + * until this initcall runs (static per-cpu variables are zero-initialized).
> + */
> +static int __init cpu2idx_init(void)
> +{
> +	int idx, cpu;
> +
> +	idx = 0;
> +	for_each_possible_cpu(cpu)
> +		per_cpu(cpu2idx, cpu) = idx++;
> +	return 0;
> +}
> +postcore_initcall(cpu2idx_init);
> +
> +/**
> + * alloc_dlock_list_heads - Allocate and initialize the list of head entries
> + * @dlist: Pointer to the dlock_list_heads structure to be initialized
> + * Return: 0 if successful, -ENOMEM on memory allocation failure
> + *
> + * This function does not allocate the dlock_list_heads structure itself. The
> + * callers will have to do their own memory allocation, if necessary. However,
> + * this allows embedding the dlock_list_heads structure directly into other
> + * structures.
> + */
> +int alloc_dlock_list_heads(struct dlock_list_heads *dlist)
> +{
> +	int idx;
> +
> +	dlist->heads = kcalloc(nr_cpu_ids, sizeof(struct dlock_list_head),
> +			       GFP_KERNEL);
> +
> +	if (!dlist->heads)
> +		return -ENOMEM;
> +
> +	for (idx = 0; idx < nr_cpu_ids; idx++) {
> +		struct dlock_list_head *head = &dlist->heads[idx];
> +
> +		INIT_LIST_HEAD(&head->list);
> +		head->lock = __SPIN_LOCK_UNLOCKED(&head->lock);
> +		lockdep_set_class(&head->lock, &dlock_list_key);
> +	}
> +	return 0;
> +}
> +
> +/**
> + * free_dlock_list_heads - Free all the head entries of the dlock list
> + * @dlist: Pointer to the dlock_list_heads structure to be freed
> + *
> + * This function doesn't free the dlock_list_heads structure itself, so
> + * the caller will have to do that, if necessary.
> + */
> +void free_dlock_list_heads(struct dlock_list_heads *dlist)
> +{
> +	kfree(dlist->heads);
> +	dlist->heads = NULL;
> +}
> +
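
FWIW, the embedding the comment above describes would look roughly like
this ("struct foo" and its helpers are made-up names):

	struct foo {
		struct dlock_list_heads dlists;
	};

	static int foo_init(struct foo *foo)
	{
		/* allocates nr_cpu_ids head entries; returns 0 or -ENOMEM */
		return alloc_dlock_list_heads(&foo->dlists);
	}

	static void foo_exit(struct foo *foo)
	{
		/* frees the heads array, but not foo itself */
		free_dlock_list_heads(&foo->dlists);
	}
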
> +/**
> + * dlock_lists_empty - Check if all the dlock lists are empty
> + * @dlist: Pointer to the dlock_list_heads structure
> + * Return: true if all the lists are empty, false otherwise.
> + *
> + * This can be a pretty expensive function call. If this function is required
> + * in a performance-critical path, we may have to maintain a global count
> + * of the list entries in the global dlock_list_heads structure instead.
> + */
> +bool dlock_lists_empty(struct dlock_list_heads *dlist)
> +{
> +	int idx;
> +
> +	for (idx = 0; idx < nr_cpu_ids; idx++)
> +		if (!list_empty(&dlist->heads[idx].list))
> +			return false;
> +	return true;
> +}
> +
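
If this ever shows up in a hot path, the global count mentioned in the
comment could be as simple as the sketch below, where ->used is a
hypothetical atomic_long_t added to struct dlock_list_heads and updated
in dlock_lists_add()/dlock_lists_del():

	/* Sketch only: O(1) emptiness check via a global entry count. */
	static bool dlock_lists_empty_fast(struct dlock_list_heads *dlist)
	{
		return atomic_long_read(&dlist->used) == 0;
	}
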
> +/**
> + * dlock_lists_add - Adds a node to the given dlock list
> + * @node : The node to be added
> + * @dlist: The dlock list where the node is to be added
> + *
> + * List selection is based on the CPU being used when the dlock_lists_add()
> + * function is called. However, deletion may be done by a different CPU.
> + */
> +void dlock_lists_add(struct dlock_list_node *node,
> +		     struct dlock_list_heads *dlist)
> +{
> +	struct dlock_list_head *head = &dlist->heads[this_cpu_read(cpu2idx)];
> +
> +	/*
> +	 * There is no need to disable preemption: if the task migrates
> +	 * after cpu2idx is read, it simply adds to another CPU's list.
> +	 * That is still correct, as node->head records the head that was
> +	 * actually used.
> +	 */
> +	spin_lock(&head->lock);
> +	node->head = head;
> +	list_add(&node->list, &head->list);
> +	spin_unlock(&head->lock);
> +}
> +
> +/**
> + * dlock_lists_del - Delete a node from a dlock list
> + * @node : The node to be deleted
> + *
> + * We need to check the lock pointer again after taking the lock to guard
> + * against concurrent deletion of the same node. If the lock pointer changes
> + * (becomes NULL or points to a different head), we assume that the deletion was done
> + * elsewhere. A warning will be printed if this happens as it is likely to be
> + * a bug.
> + */
> +void dlock_lists_del(struct dlock_list_node *node)
> +{
> +	struct dlock_list_head *head;
> +	bool retry;
> +
> +	do {
> +		head = READ_ONCE(node->head);

Since we read node->head locklessly here, I think we should use
WRITE_ONCE() for all the stores to node->head, to avoid store tearing.
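
I.e., something like this in dlock_lists_add() and in the locked
section of dlock_lists_del():

	-	node->head = head;
	+	WRITE_ONCE(node->head, head);

	-			node->head = NULL;
	+			WRITE_ONCE(node->head, NULL);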

Regards,
Boqun

> +		if (WARN_ONCE(!head, "%s: node 0x%lx has no associated head\n",
> +			      __func__, (unsigned long)node))
> +			return;
> +
> +		spin_lock(&head->lock);
> +		if (likely(head == node->head)) {
> +			list_del_init(&node->list);
> +			node->head = NULL;
> +			retry = false;
> +		} else {
> +			/*
> +			 * The head has changed under us. Retry if it is
> +			 * not NULL; otherwise the node was already deleted
> +			 * elsewhere, so just ignore this delete operation.
> +			 */
> +			retry = (node->head != NULL);
> +		}
> +		spin_unlock(&head->lock);
> +	} while (retry);
> +}
> +
[...]

