Subject: Re: [PATCH v7 4/6] lib/dlock-list: Make sibling CPUs share the same linked list
From: Waiman Long
To: Jan Kara
Cc: Alexander Viro, Jan Kara, Jeff Layton, "J. Bruce Fields", Tejun Heo, Christoph Lameter, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Ingo Molnar, Peter Zijlstra, Andi Kleen, Dave Chinner, Boqun Feng, Davidlohr Bueso
Date: Mon, 9 Oct 2017 12:14:25 -0400
Message-ID: <819ead2f-1181-3d8a-8903-e34dab927d04@redhat.com>
In-Reply-To: <20171009154056.GP17917@quack2.suse.cz>
References: <1507229008-20569-1-git-send-email-longman@redhat.com> <1507229008-20569-5-git-send-email-longman@redhat.com> <20171009154056.GP17917@quack2.suse.cz>

On 10/09/2017 11:40 AM, Jan Kara wrote:
> On Thu 05-10-17 14:43:26, Waiman Long wrote:
>> The dlock list needs one list for each of the CPUs available. However,
>> sibling CPUs share the L2 and probably the L1 cache too. As a result,
>> there is not much to gain in terms of avoiding cacheline contention,
>> while keeping separate lists increases the cacheline footprint of the
>> L1/L2 caches, since each list may need to be cached.
>>
>> This patch makes all the sibling CPUs share the same list, thus
>> reducing the number of lists that need to be maintained in each
>> dlock list without any noticeable impact on performance. It also
>> improves dlock-list iteration performance, as fewer lists need to
>> be iterated.
>>
>> Signed-off-by: Waiman Long
> ...
>> @@ -118,7 +156,7 @@ bool dlock_lists_empty(struct dlock_list_heads *dlist)
>>  {
>>  	int idx;
>>  
>> -	for (idx = 0; idx < nr_cpu_ids; idx++)
>> +	for (idx = 0; idx < nr_dlock_lists; idx++)
>>  		if (!list_empty(&dlist->heads[idx].list))
>>  			return false;
>>  	return true;
>> @@ -207,7 +245,7 @@ struct dlock_list_node *__dlock_list_next_list(struct dlock_list_iter *iter)
>>  	/*
>>  	 * Try next list
>>  	 */
>> -	if (++iter->index >= nr_cpu_ids)
>> +	if (++iter->index >= nr_dlock_lists)
>>  		return NULL;	/* All the entries iterated */
>>  
>>  	if (list_empty(&iter->head[iter->index].list))
> Why do these two not need a similar treatment as alloc_dlist_heads()?
>
> 								Honza

I am aware that there is one vfs superblock allocation before
nr_dlock_lists is initialized. However, I don't believe a list emptiness
test or an iteration will happen that early in the boot sequence. Maybe
I should put a BUG_ON(!nr_dlock_lists) test in those functions just to
be sure.

Cheers,
Longman
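P.S. For illustration, here is a minimal userspace C sketch of the two ideas in this thread: sibling CPUs mapping to one shared list, and the emptiness test iterating only nr_dlock_lists entries with an init guard. The names and numbers (cpu_to_list_index, THREADS_PER_CORE, an 8-CPU/2-sibling topology) are made up for the sketch; the actual kernel code derives the mapping from the CPU topology (topology_sibling_cpumask()) rather than a fixed divisor, and the lists hold real list_head entries rather than counters.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPU_IDS		8	/* assumed: 8 logical CPUs       */
#define THREADS_PER_CORE	2	/* assumed: 2 siblings per core  */
#define NR_DLOCK_LISTS		(NR_CPU_IDS / THREADS_PER_CORE)

/* Stand-in for the kernel's nr_dlock_lists variable. */
static int nr_dlock_lists = NR_DLOCK_LISTS;

/* Sibling CPUs (e.g. 0 and 1) map to the same list index. */
static int cpu_to_list_index(int cpu)
{
	return cpu / THREADS_PER_CORE;
}

/* Per-list entry counts; zero-initialized, i.e. all lists empty. */
static int list_count[NR_DLOCK_LISTS];

static bool dlock_lists_empty(void)
{
	/* The guard proposed in the mail: catch use before init. */
	assert(nr_dlock_lists > 0);

	/* Iterate nr_dlock_lists entries, not nr_cpu_ids. */
	for (int idx = 0; idx < nr_dlock_lists; idx++)
		if (list_count[idx] != 0)
			return false;
	return true;
}
```

With this mapping, an entry added from CPU 4 and one added from CPU 5 land in the same list (index 2), so only four lists exist for eight logical CPUs, and both the emptiness test and iteration touch half as many cachelines.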