From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751844AbbCJIUf (ORCPT ); Tue, 10 Mar 2015 04:20:35 -0400 Received: from mail.bmw-carit.de ([62.245.222.98]:37554 "EHLO mail.bmw-carit.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751562AbbCJIUb (ORCPT ); Tue, 10 Mar 2015 04:20:31 -0400 Message-ID: <54FEA948.2040209@bmw-carit.de> Date: Tue, 10 Mar 2015 09:20:24 +0100 From: Daniel Wagner User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: Jeff Layton CC: , , Alexander Viro , "J. Bruce Fields" , Andi Kleen Subject: Re: [PATCH v3 0/2] Use blocked_lock_lock only to protect blocked_hash References: <1425628412-30259-1-git-send-email-daniel.wagner@bmw-carit.de> <20150307090041.16fbf4f8@tlielax.poochiereds.net> In-Reply-To: <20150307090041.16fbf4f8@tlielax.poochiereds.net> Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi, On 03/07/2015 03:00 PM, Jeff Layton wrote: > On Fri, 6 Mar 2015 08:53:30 +0100 > Daniel Wagner wrote: > >> Hi, >> >> Finally, I got a bigger machine and did a quick test round. I expected >> to see some improvements but the resutls do not show any real gain. So >> they are merely refactoring patches. >> > > Ok, in that case is there any point in merging these? I'm all for > breaking up global locks when it makes sense, but if you can't > demonstrate a clear benefit then I'm less inclined to take the churn. > > Perhaps we should wait to see if a benefit emerges when/if you convert > the lglock code to use normal spinlocks (like Andi suggested)? That > seems like a rather simple conversion, and I don't think it's > "cheating" in any sense of the word. > > I do however wonder why Nick used arch_spinlock_t there when he wrote > the lglock code instead of normal spinlocks. Was it simply memory usage > considerations or something else? I did a complete test run with 4.0.0-rc3, the two patches from this thread (fs-locks-v10), the spinlock_t conversion (lglock-v0) and fs-locks-v10 and lglock-v0 combined. Some of the test take rather long on my machine (12 minutes per run) so I tweaked it a bit to get more samples. Each test was run 100 times. The raw data and scripts are here: http://www.monom.org/lglock/data/ flock01 mean variance sigma max min 4.0.0-rc3 8930.8708 282098.1663 531.1291 9612.7085 4681.8674 fs-locks-v10 9081.6789 43493.0287 208.5498 9747.8491 8072.6508 lglock-v0 9004.9252 12339.3832 111.0828 9489.5420 8493.0763 fs-locks-v10+lglock-v0 9053.6680 17714.5623 133.0961 9588.7413 8555.0727 flock02 mean variance sigma max min 4.0.0-rc3 553.1720 1057.6026 32.5208 606.2989 479.5528 fs-locks-v10 596.0683 1486.8345 38.5595 662.6566 512.4865 lglock-v0 595.2150 976.8544 31.2547 646.7506 527.2517 fs-locks-v10+lglock-v0 587.5492 989.9467 31.4634 646.2098 509.6020 lease01 mean variance sigma max min 4.0.0-rc3 505.2654 780.7545 27.9420 564.2530 433.7727 fs-locks-v10 523.6855 705.2606 26.5567 570.3401 442.6539 lglock-v0 516.7525 1026.7596 32.0431 573.8766 433.4124 fs-locks-v10+lglock-v0 513.6507 751.1674 27.4074 567.0972 435.6154 lease02 mean variance sigma max min 4.0.0-rc3 3588.2069 26736.0422 163.5116 3769.7430 3154.8405 fs-locks-v10 3566.0658 34225.6410 185.0017 3747.6039 3188.5478 lglock-v0 3585.0648 28720.1679 169.4703 3758.7240 3150.9310 fs-locks-v10+lglock-v0 3621.9347 17706.2354 133.0648 3744.0095 3174.0998 posix01 mean variance sigma max min 4.0.0-rc3 9297.5030 26911.6602 164.0477 9941.8094 8963.3528 fs-locks-v10 9462.8665 44762.9316 211.5725 10504.3205 9202.5748 lglock-v0 9320.4716 47168.9903 217.1842 10401.6565 9054.1950 fs-locks-v10+lglock-v0 9458.1463 58231.8844 241.3128 10564.2086 9193.1177 posix02 mean variance sigma max min 4.0.0-rc3 920.6533 2648.1293 51.4600 983.4213 790.1792 fs-locks-v10 915.3972 4384.6821 66.2169 1004.2339 795.4129 lglock-v0 888.1910 4644.0478 68.1473 983.8412 777.4851 fs-locks-v10+lglock-v0 926.4184 1834.6481 42.8328 975.8544 794.4582 posix03 mean variance sigma max min 4.0.0-rc3 7.5202 0.0456 0.2136 7.9697 6.9803 fs-locks-v10 7.5203 0.0640 0.2529 7.9619 7.0063 lglock-v0 7.4738 0.0349 0.1867 7.8011 7.0973 fs-locks-v10+lglock-v0 7.5856 0.0595 0.2439 8.1098 6.8695 posix04 mean variance sigma max min 4.0.0-rc3 0.6699 0.1091 0.3303 3.3845 0.5247 fs-locks-v10 0.6320 0.0026 0.0511 0.9064 0.5195 lglock-v0 0.6460 0.0039 0.0623 1.0830 0.5438 fs-locks-v10+lglock-v0 0.6589 0.0338 0.1838 2.0346 0.5393 Hmm, not really convincing numbers. I hoped to see scaling effects but nope, no fun. cheers, daniel The spinlock_t conversion patch (lglock-v0) I used: diff --git a/include/linux/lglock.h b/include/linux/lglock.h index 0081f00..ea97f74 100644 --- a/include/linux/lglock.h +++ b/include/linux/lglock.h @@ -34,7 +34,7 @@ #endif struct lglock { - arch_spinlock_t __percpu *lock; + spinlock_t __percpu *lock; #ifdef CONFIG_DEBUG_LOCK_ALLOC struct lock_class_key lock_key; struct lockdep_map lock_dep_map; @@ -42,13 +42,13 @@ struct lglock { }; #define DEFINE_LGLOCK(name) \ - static DEFINE_PER_CPU(arch_spinlock_t, name ## _lock) \ - = __ARCH_SPIN_LOCK_UNLOCKED; \ + static DEFINE_PER_CPU(spinlock_t, name ## _lock) \ + = __SPIN_LOCK_UNLOCKED(name ## _lock); \ struct lglock name = { .lock = &name ## _lock } #define DEFINE_STATIC_LGLOCK(name) \ - static DEFINE_PER_CPU(arch_spinlock_t, name ## _lock) \ - = __ARCH_SPIN_LOCK_UNLOCKED; \ + static DEFINE_PER_CPU(spinlock_t, name ## _lock) \ + = __SPIN_LOCK_UNLOCKED(name ## _lock); \ static struct lglock name = { .lock = &name ## _lock } void lg_lock_init(struct lglock *lg, char *name); diff --git a/kernel/locking/lglock.c b/kernel/locking/lglock.c index 86ae2ae..34077a7 100644 --- a/kernel/locking/lglock.c +++ b/kernel/locking/lglock.c @@ -18,44 +18,44 @@ EXPORT_SYMBOL(lg_lock_init); void lg_local_lock(struct lglock *lg) { - arch_spinlock_t *lock; + spinlock_t *lock; preempt_disable(); lock_acquire_shared(&lg->lock_dep_map, 0, 0, NULL, _RET_IP_); lock = this_cpu_ptr(lg->lock); - arch_spin_lock(lock); + spin_lock(lock); } EXPORT_SYMBOL(lg_local_lock); void lg_local_unlock(struct lglock *lg) { - arch_spinlock_t *lock; + spinlock_t *lock; lock_release(&lg->lock_dep_map, 1, _RET_IP_); lock = this_cpu_ptr(lg->lock); - arch_spin_unlock(lock); + spin_unlock(lock); preempt_enable(); } EXPORT_SYMBOL(lg_local_unlock); void lg_local_lock_cpu(struct lglock *lg, int cpu) { - arch_spinlock_t *lock; + spinlock_t *lock; preempt_disable(); lock_acquire_shared(&lg->lock_dep_map, 0, 0, NULL, _RET_IP_); lock = per_cpu_ptr(lg->lock, cpu); - arch_spin_lock(lock); + spin_lock(lock); } EXPORT_SYMBOL(lg_local_lock_cpu); void lg_local_unlock_cpu(struct lglock *lg, int cpu) { - arch_spinlock_t *lock; + spinlock_t *lock; lock_release(&lg->lock_dep_map, 1, _RET_IP_); lock = per_cpu_ptr(lg->lock, cpu); - arch_spin_unlock(lock); + spin_unlock(lock); preempt_enable(); } EXPORT_SYMBOL(lg_local_unlock_cpu); @@ -67,9 +67,9 @@ void lg_global_lock(struct lglock *lg) preempt_disable(); lock_acquire_exclusive(&lg->lock_dep_map, 0, 0, NULL, _RET_IP_); for_each_possible_cpu(i) { - arch_spinlock_t *lock; + spinlock_t *lock; lock = per_cpu_ptr(lg->lock, i); - arch_spin_lock(lock); + spin_lock(lock); } } EXPORT_SYMBOL(lg_global_lock); @@ -80,9 +80,9 @@ void lg_global_unlock(struct lglock *lg) lock_release(&lg->lock_dep_map, 1, _RET_IP_); for_each_possible_cpu(i) { - arch_spinlock_t *lock; + spinlock_t *lock; lock = per_cpu_ptr(lg->lock, i); - arch_spin_unlock(lock); + spin_unlock(lock); } preempt_enable(); }