From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752246AbdGSBqj (ORCPT ); Tue, 18 Jul 2017 21:46:39 -0400 Received: from smtp2.provo.novell.com ([137.65.250.81]:48776 "EHLO smtp2.provo.novell.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751560AbdGSBqi (ORCPT ); Tue, 18 Jul 2017 21:46:38 -0400 From: Davidlohr Bueso To: akpm@linux-foundation.org Cc: mingo@kernel.org, peterz@infradead.org, jack@suse.cz, torvalds@linux-foundation.org, kirill.shutemov@linux.intel.com, hch@infradead.org, ldufour@linux.vnet.ibm.com, mhocko@suse.com, mgorman@techsingularity.net, dave@stgolabs.net, linux-kernel@vger.kernel.org Subject: [PATCH -next v4 00/17] rbtree: cache leftmost node internally Date: Tue, 18 Jul 2017 18:45:46 -0700 Message-Id: <20170719014603.19029-1-dave@stgolabs.net> X-Mailer: git-send-email 2.12.0 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Changes from v3 (https://lwn.net/Articles/726809/): - Updated Documentation/rbtree.txt in patch 1. - Explicitly mentioned the asymmetry wrt rightmost node in patch 1 (hch). - Most changes in the set is adding more patches: patch 2,3,4,5,6,15,16,17. - Added Peter's ack. Changes from v2 (https://lkml.org/lkml/2017/6/8/857): - Fixed 0day reported crash for drm_mm selftest program. We were not correctly using the cached version of rbtree with the allocated nodes. - Added cfq patch to use internal rbtree caching. - Added Christian's and Jan's reviews. Changes from v1 (https://marc.info/?l=linux-kernel&m=149611025616685): - No longer rfc. - Removed bogus semimcolon in rb_first_cached() - Updated missing interval tree user drivers/infiniband/hw/hfi1/ - Removed redundant @cached arg in when erasing a node. - Added more patches that make use of rb_first_cached(), which I thought might be worth it: procfs and epoll. - Cc more people for patch 5, which touches drivers such as infiniband and gpu. The rest of the changes are pretty covered with the current Cc'ed maintainers and mm folks. Hi, Here's a proposal for extending rbtrees to internally cache the leftmost node such that we can have fast overlap check optimization for all interval tree users[1]. The benefits of this series are that: (i) Unify users that do internal leftmost node caching. (ii) Optimize all interval tree users. (iii) Convert at least two new users (epoll and procfs) to the new interface. (iv) Enable user rightmost node caching, similar to how we deal with the leftmost case before this patchset. Patch 1: Layout the rb machinery. Patch 2,3: Adds an unlikely() prediction to root-most node for insertions as well as some minor comment enhancements. Patches 4-6: Improve and extend rbrtee_test.c program. Patches 7-10: Make use of the internal leftmost node in scheduler, realtime mutexes and cfq. Patch 11: Implements fast overlap checks for interval trees. Patch 12: rocket science. Patches 13-15: Patches that convert to O(1) rb_first_cached(). Patches 16,17: Apply rightmost node optimization explicitly by two users. The series has survived booting, kernel builds and pistress workloads. Applies on top of today's -next. Thanks! Davidlohr Bueso (17): rbtree: cache leftmost node internally rbtree: optimize root-check during rebalancing loop rbtree: add some additional comments for rebalancing cases lib/rbtree_test.c: make input module parameters lib/rbtree_test.c: add (inorder) traversal test lib/rbtree_test.c: support rb_root_cached sched/fair: replace cfs_rq->rb_leftmost sched/deadline: replace earliest dl and rq leftmost caching locking/rtmutex: replace top-waiter and pi_waiters leftmost caching block/cfq: replace cfq_rb_root leftmost caching lib/interval_tree: fast overlap detection lib/interval-tree: correct comment wrt generic flavor procfs: use faster rb_first_cached() fs/epoll: use faster rb_first_cached() fs/ext4: use cached rbtrees mem/memcg: cache rightmost node block/cfq: cache rightmost rb_node Documentation/rbtree.txt | 33 +++ block/cfq-iosched.c | 81 +++----- drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c | 8 +- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 7 +- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 2 +- drivers/gpu/drm/drm_mm.c | 19 +- drivers/gpu/drm/drm_vma_manager.c | 2 +- drivers/gpu/drm/i915/i915_gem_userptr.c | 6 +- drivers/gpu/drm/radeon/radeon.h | 2 +- drivers/gpu/drm/radeon/radeon_mn.c | 8 +- drivers/gpu/drm/radeon/radeon_vm.c | 7 +- drivers/infiniband/core/umem_rbtree.c | 4 +- drivers/infiniband/core/uverbs_cmd.c | 2 +- drivers/infiniband/hw/hfi1/mmu_rb.c | 10 +- drivers/infiniband/hw/usnic/usnic_uiom.c | 6 +- drivers/infiniband/hw/usnic/usnic_uiom.h | 2 +- .../infiniband/hw/usnic/usnic_uiom_interval_tree.c | 15 +- .../infiniband/hw/usnic/usnic_uiom_interval_tree.h | 12 +- drivers/vhost/vhost.c | 2 +- drivers/vhost/vhost.h | 2 +- fs/eventpoll.c | 30 +-- fs/ext4/dir.c | 24 ++- fs/ext4/ext4.h | 4 +- fs/ext4/mballoc.c | 22 +- fs/hugetlbfs/inode.c | 6 +- fs/inode.c | 2 +- fs/proc/generic.c | 26 +-- fs/proc/internal.h | 2 +- fs/proc/proc_net.c | 2 +- fs/proc/root.c | 2 +- include/drm/drm_mm.h | 2 +- include/linux/fs.h | 4 +- include/linux/init_task.h | 5 +- include/linux/interval_tree.h | 8 +- include/linux/interval_tree_generic.h | 48 ++++- include/linux/mm.h | 17 +- include/linux/rbtree.h | 21 ++ include/linux/rbtree_augmented.h | 33 ++- include/linux/rmap.h | 4 +- include/linux/rtmutex.h | 11 +- include/linux/sched.h | 3 +- include/rdma/ib_umem_odp.h | 11 +- include/rdma/ib_verbs.h | 2 +- kernel/fork.c | 3 +- kernel/locking/rtmutex-debug.c | 2 +- kernel/locking/rtmutex.c | 35 +--- kernel/locking/rtmutex_common.h | 12 +- kernel/sched/deadline.c | 50 ++--- kernel/sched/debug.c | 2 +- kernel/sched/fair.c | 39 ++-- kernel/sched/sched.h | 9 +- lib/interval_tree_test.c | 4 +- lib/rbtree.c | 65 ++++-- lib/rbtree_test.c | 230 +++++++++++++++++---- mm/interval_tree.c | 10 +- mm/memcontrol.c | 23 ++- mm/memory.c | 4 +- mm/mmap.c | 10 +- mm/rmap.c | 4 +- 59 files changed, 643 insertions(+), 378 deletions(-) -- 2.12.0