From: Waiman Long
To: Alexander Viro, Jan Kara, Jeff Layton, "J. Bruce Fields",
    Tejun Heo, Christoph Lameter
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    Ingo Molnar, Peter Zijlstra, Andi Kleen, Dave Chinner, Boqun Feng,
    Davidlohr Bueso, Waiman Long
Subject: [PATCH v8 0/6] vfs: Use dlock list for SB's s_inodes list
Date: Tue, 31 Oct 2017 14:50:54 -0400
Message-Id: <1509475860-16139-1-git-send-email-longman@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

v7->v8:
 - Integrate the additional patches 8, 9 and 10 sent to fix issues in
   the original v7 patchset into patch 1 and adjust the other patches
   accordingly.

v6->v7:
 - Fix outdated email address.
 - Add a comment to patch 4 to explain allocation issue & fix a
   compilation problem with cpumask.
 - Replace patch 6 with another one that adds an irqsafe mode argument
   in alloc_dlock_list_heads() instead of adding new APIs.

v5->v6:
 - Rebased the patch to 4.14-rc3.
 - Drop the fsnotify patch as it had been merged somehow.
 - Add a new patch 5 with alternative way of selecting list by hashing
   instead of cpu #.
 - Add a new patch 6 to provide a set of IRQ-safe APIs to be used in
   interrupt context.
 - Update the CPU to index mapping code.

v4->v5:
 - Rebased the patch to 4.8-rc1 (changes to fs/fs-writeback.c were
   dropped).
 - Use kcalloc() instead of percpu_alloc() to allocate the dlock list
   heads structure as suggested by Christoph Lameter.
 - Replaced patch 5 by another one that made sibling CPUs use the same
   dlock list head, thus reducing the number of list heads that needed
   to be maintained.

v3->v4:
 - As suggested by Al, encapsulate the dlock list mechanism into
   dlist_for_each_entry() and dlist_for_each_entry_safe(), which are
   the equivalents of list_for_each_entry() and
   list_for_each_entry_safe() for regular linked lists. That simplifies
   the changes in the call sites that perform dlock list iterations.
 - Add a new patch to make the percpu head structure cacheline aligned
   to prevent cacheline contention from disrupting the performance of
   nearby percpu variables.

v2->v3:
 - Remove the 2 persubnode API patches.
 - Merge __percpu tag patch 2 into patch 1.
 - As suggested by Tejun Heo, restructure the dlock_list_head data
   structure to hide the __percpu tag and rename some of the functions
   and structures.
 - Move most of the code from dlock_list.h to dlock_list.c and export
   the symbols.

v1->v2:
 - Add a set of simple per-subnode APIs that is between percpu and
   per-node in granularity.
 - Make dlock list use the per-subnode APIs so as to reduce the total
   number of separate linked lists that need to be managed and
   iterated.
 - There is no change in patches 1-5.

This patchset provides new APIs for a set of distributed locked lists
(one per CPU core) to minimize lock and cacheline contention. Insertion
into and deletion from the list will be cheap and relatively contention
free. Lookup, on the other hand, may be a bit more costly as there are
multiple lists to iterate.
This is not really a problem for the replacement of the superblock's
inode list by a dlock list included in this patchset, as lookup isn't
needed.

For use cases that need to do lookup, the dlock list can also be
treated as a set of hashed lists that scales with the number of CPU
cores in the system.

Patches 5 and 6 are added to support other use cases, such as epoll
nested callbacks, which could use the dlock list to reduce lock
contention.

Patch 1 introduces the dlock list. The list heads are allocated by
kcalloc() instead of percpu_alloc(). Each list head entry is cacheline
aligned to minimize contention.

Patch 2 replaces the use of list_for_each_entry_safe() in
evict_inodes() and invalidate_inodes() by list_for_each_entry().

Patch 3 modifies the superblock and inode structures to use the dlock
list. The corresponding functions that reference those structures are
modified.

Patch 4 makes the sibling CPUs use the same dlock list head to reduce
the number of list heads that need to be iterated.

Patch 5 enables the alternative use case of treating the dlock list as
a set of hashed lists.

Patch 6 provides an IRQ-safe mode, specified at dlock-list allocation
time, so that the list can be used within interrupt context.
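
For readers who want a concrete picture of the idea described above,
here is a minimal userspace sketch. It is not code from the patchset:
every name in it (dl_heads, dl_add(), dl_del(), dl_count(),
dlock_demo()) is hypothetical, a C11 atomic_flag spin loop stands in
for the kernel spinlock, and the caller-supplied key stands in for
whatever selects a list (a CPU number, or a hash as in patch 5). It
only illustrates the trade-off: insertion and deletion take one
per-list lock, while a full walk has to visit every list.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct dl_node;

/* One list head with its own lock (hypothetical layout). */
struct dl_head {
	atomic_flag lock;
	struct dl_node *first;
};

/* A node remembers which head it is on so deletion needs no lookup. */
struct dl_node {
	struct dl_node *next, **pprev;
	struct dl_head *head;
};

struct dl_heads {
	unsigned int nr;
	struct dl_head *heads;
};

static void dl_lock(struct dl_head *h)
{
	/* Simple spin lock; stands in for a kernel spinlock. */
	while (atomic_flag_test_and_set_explicit(&h->lock, memory_order_acquire))
		;
}

static void dl_unlock(struct dl_head *h)
{
	atomic_flag_clear_explicit(&h->lock, memory_order_release);
}

static int dl_init(struct dl_heads *d, unsigned int nr)
{
	d->heads = calloc(nr, sizeof(*d->heads));
	if (!d->heads)
		return -1;
	d->nr = nr;
	for (unsigned int i = 0; i < nr; i++) {
		atomic_flag_clear(&d->heads[i].lock);
		d->heads[i].first = NULL;
	}
	return 0;
}

/* Insert at the front of the list selected by key; only that list is locked. */
static void dl_add(struct dl_heads *d, struct dl_node *n, unsigned int key)
{
	struct dl_head *h = &d->heads[key % d->nr];

	dl_lock(h);
	n->head = h;
	n->next = h->first;
	n->pprev = &h->first;
	if (h->first)
		h->first->pprev = &n->next;
	h->first = n;
	dl_unlock(h);
}

/* Unlink a node using the head pointer it recorded at insertion time. */
static void dl_del(struct dl_node *n)
{
	struct dl_head *h = n->head;

	assert(h != NULL);	/* node must currently be on a list */
	dl_lock(h);
	*n->pprev = n->next;
	if (n->next)
		n->next->pprev = n->pprev;
	n->head = NULL;
	dl_unlock(h);
}

/* A full walk visits every list in turn, locking one list at a time. */
static unsigned int dl_count(struct dl_heads *d)
{
	unsigned int count = 0;

	for (unsigned int i = 0; i < d->nr; i++) {
		dl_lock(&d->heads[i]);
		for (struct dl_node *n = d->heads[i].first; n; n = n->next)
			count++;
		dl_unlock(&d->heads[i]);
	}
	return count;
}

/* Usage: spread 5 nodes over 4 lists, delete one, count the rest. */
int dlock_demo(void)
{
	struct dl_heads d;
	struct dl_node nodes[5];
	int count;

	if (dl_init(&d, 4))
		return -1;
	for (unsigned int i = 0; i < 5; i++)
		dl_add(&d, &nodes[i], i);
	dl_del(&nodes[2]);
	count = (int)dl_count(&d);
	free(d.heads);
	return count;
}
```

Storing the head pointer in each node is a design choice made for this
sketch so that deletion can find the right lock directly; the actual
data structures are defined in patch 1.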

Jan Kara (1):
  vfs: Remove unnecessary list_for_each_entry_safe() variants

Waiman Long (5):
  lib/dlock-list: Distributed and lock-protected lists
  vfs: Use dlock list for superblock's inode list
  lib/dlock-list: Make sibling CPUs share the same linked list
  lib/dlock-list: Enable faster lookup with hashing
  lib/dlock-list: Add an IRQ-safe mode to be used in interrupt handler

 fs/block_dev.c             |   9 +-
 fs/drop_caches.c           |   9 +-
 fs/inode.c                 |  38 ++----
 fs/notify/fsnotify.c       |   9 +-
 fs/quota/dquot.c           |  14 +-
 fs/super.c                 |   7 +-
 include/linux/dlock-list.h | 263 +++++++++++++++++++++++++++++++++++
 include/linux/fs.h         |   8 +-
 lib/Makefile               |   2 +-
 lib/dlock-list.c           | 333 +++++++++++++++++++++++++++++++++++++++++++++
 10 files changed, 638 insertions(+), 54 deletions(-)
 create mode 100644 include/linux/dlock-list.h
 create mode 100644 lib/dlock-list.c

-- 
1.8.3.1