From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1759647Ab3BYT2l (ORCPT ); Mon, 25 Feb 2013 14:28:41 -0500
Received: from e23smtp01.au.ibm.com ([202.81.31.143]:39474 "EHLO
	e23smtp01.au.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1759503Ab3BYT2i (ORCPT ); Mon, 25 Feb 2013 14:28:38 -0500
Message-ID: <512BBAD8.8010006@linux.vnet.ibm.com>
Date: Tue, 26 Feb 2013 00:56:16 +0530
From: "Srivatsa S. Bhat"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:15.0) Gecko/20120828 Thunderbird/15.0
MIME-Version: 1.0
To: Lai Jiangshan
CC: Michel Lespinasse, linux-doc@vger.kernel.org, peterz@infradead.org,
	fweisbec@gmail.com, linux-kernel@vger.kernel.org, namhyung@kernel.org,
	mingo@kernel.org, linux-arch@vger.kernel.org, linux@arm.linux.org.uk,
	xiaoguangrong@linux.vnet.ibm.com, wangyun@linux.vnet.ibm.com,
	paulmck@linux.vnet.ibm.com, nikunj@linux.vnet.ibm.com,
	linux-pm@vger.kernel.org, rusty@rustcorp.com.au, rostedt@goodmis.org,
	rjw@sisk.pl, vincent.guittot@linaro.org, tglx@linutronix.de,
	linux-arm-kernel@lists.infradead.org, netdev@vger.kernel.org,
	oleg@redhat.com, sbw@mit.edu, tj@kernel.org, akpm@linux-foundation.org,
	linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH v6 04/46] percpu_rwlock: Implement the core design of
	Per-CPU Reader-Writer Locks
References: <20130218123714.26245.61816.stgit@srivatsabhat.in.ibm.com>
	<20130218123856.26245.46705.stgit@srivatsabhat.in.ibm.com>
	<5122551E.1080703@linux.vnet.ibm.com>
	<51226B46.9080707@linux.vnet.ibm.com>
	<51226F91.7000108@linux.vnet.ibm.com>
In-Reply-To:
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Content-Scanned: Fidelis XPS MAILER
x-cbid: 13022519-1618-0000-0000-00000365E528
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Lai,

On 02/25/2013 09:23 PM, Lai Jiangshan wrote:
> Hi, Srivatsa,
>
> The target of the whole patchset is nice for me.

Cool! Thanks :-)

> A question: How did you find out the such usages of
> "preempt_disable()" and convert them? did all are converted?
>

Well, I scanned through the source tree for usages which implicitly
disabled CPU offline and converted them over. It's not limited to uses
of preempt_disable() alone - spin_locks, rwlocks, local_irq_disable()
etc. also help disable CPU offline. So I tried to dig out all such uses
and converted them. However, since the merge window is open, a lot of
new code is flowing into the tree. So I'll have to rescan the tree to
see if there are any more places to convert.

> And I think the lock is too complex and reinvent the wheel, why don't
> you reuse the lglock?

lglocks? No way! ;-) See below...

> I wrote an untested draft here.
>
> Thanks,
> Lai
>
> PS: Some HA tools(I'm writing one) which takes checkpoints of
> virtual-machines frequently, I guess this patchset can speedup the
> tools.
>
> From 01db542693a1b7fc6f9ece45d57cb529d9be5b66 Mon Sep 17 00:00:00 2001
> From: Lai Jiangshan
> Date: Mon, 25 Feb 2013 23:14:27 +0800
> Subject: [PATCH] lglock: add read-preference local-global rwlock
>
> locality via lglock(trylock)
> read-preference read-write-lock via fallback rwlock_t
>
> Signed-off-by: Lai Jiangshan
> ---
>  include/linux/lglock.h | 31 +++++++++++++++++++++++++++++++
>  kernel/lglock.c        | 45 +++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 76 insertions(+), 0 deletions(-)
>
> diff --git a/include/linux/lglock.h b/include/linux/lglock.h
> index 0d24e93..30fe887 100644
> --- a/include/linux/lglock.h
> +++ b/include/linux/lglock.h
> @@ -67,4 +67,35 @@ void lg_local_unlock_cpu(struct lglock *lg, int cpu);
>  void lg_global_lock(struct lglock *lg);
>  void lg_global_unlock(struct lglock *lg);
>
> +struct lgrwlock {
> +	unsigned long __percpu *fallback_reader_refcnt;
> +	struct lglock lglock;
> +	rwlock_t fallback_rwlock;
> +};
> +
> +#define DEFINE_LGRWLOCK(name) \
> +	static DEFINE_PER_CPU(arch_spinlock_t, name ## _lock) \
> +	= __ARCH_SPIN_LOCK_UNLOCKED; \
> +	static DEFINE_PER_CPU(unsigned long, name ## _refcnt); \
> +	struct lgrwlock name = { \
> +		.fallback_reader_refcnt = &name ## _refcnt, \
> +		.lglock = { .lock = &name ## _lock } }
> +
> +#define DEFINE_STATIC_LGRWLOCK(name) \
> +	static DEFINE_PER_CPU(arch_spinlock_t, name ## _lock) \
> +	= __ARCH_SPIN_LOCK_UNLOCKED; \
> +	static DEFINE_PER_CPU(unsigned long, name ## _refcnt); \
> +	static struct lgrwlock name = { \
> +		.fallback_reader_refcnt = &name ## _refcnt, \
> +		.lglock = { .lock = &name ## _lock } }
> +
> +static inline void lg_rwlock_init(struct lgrwlock *lgrw, char *name)
> +{
> +	lg_lock_init(&lgrw->lglock, name);
> +}
> +
> +void lg_rwlock_local_read_lock(struct lgrwlock *lgrw);
> +void lg_rwlock_local_read_unlock(struct lgrwlock *lgrw);
> +void lg_rwlock_global_write_lock(struct lgrwlock *lgrw);
> +void lg_rwlock_global_write_unlock(struct lgrwlock *lgrw);
>  #endif
> diff --git a/kernel/lglock.c b/kernel/lglock.c
> index 6535a66..463543a 100644
> --- a/kernel/lglock.c
> +++ b/kernel/lglock.c
> @@ -87,3 +87,48 @@ void lg_global_unlock(struct lglock *lg)
>  	preempt_enable();
>  }
>  EXPORT_SYMBOL(lg_global_unlock);
> +
> +void lg_rwlock_local_read_lock(struct lgrwlock *lgrw)
> +{
> +	struct lglock *lg = &lgrw->lglock;
> +
> +	preempt_disable();
> +	if (likely(!__this_cpu_read(*lgrw->fallback_reader_refcnt))) {
> +		if (likely(arch_spin_trylock(this_cpu_ptr(lg->lock)))) {
> +			rwlock_acquire_read(&lg->lock_dep_map, 0, 0, _RET_IP_);
> +			return;
> +		}
> +		read_lock(&lgrw->fallback_rwlock);
> +	}
> +
> +	__this_cpu_inc(*lgrw->fallback_reader_refcnt);
> +}
> +EXPORT_SYMBOL(lg_rwlock_local_read_lock);
> +
> +void lg_rwlock_local_read_unlock(struct lgrwlock *lgrw)
> +{
> +	if (likely(!__this_cpu_read(*lgrw->fallback_reader_refcnt))) {
> +		lg_local_unlock(&lgrw->lglock);
> +		return;
> +	}
> +
> +	if (!__this_cpu_dec_return(*lgrw->fallback_reader_refcnt))
> +		read_unlock(&lgrw->fallback_rwlock);
> +
> +	preempt_enable();
> +}
> +EXPORT_SYMBOL(lg_rwlock_local_read_unlock);
> +

If I read the code above correctly, all you are doing is implementing a
recursive reader-side primitive (i.e., allowing the reader to call these
functions recursively, without resulting in a self-deadlock).

But the thing is, making the reader-side recursive is the least of our
problems! Our main challenge is to make the locking extremely flexible
and also safeguard it against circular locking dependencies and
deadlocks. Please take a look at the changelog of patch 1 - it explains
the situation with an example.

> +void lg_rwlock_global_write_lock(struct lgrwlock *lgrw)
> +{
> +	lg_global_lock(&lgrw->lglock);

This does a for-loop on all CPUs and takes their locks one-by-one. That's
exactly what we want to prevent, because that is the _source_ of all our
deadlock woes in this case. In the presence of perfect lock ordering
guarantees, this wouldn't have been a problem (that's why lglocks are
being used successfully elsewhere in the kernel). In the stop_machine()
removal case, the over-flexibility of preempt_disable() forces us to
provide an equally flexible locking alternative. Hence we can't use such
per-cpu locking schemes.

You might note that, for exactly this reason, I haven't actually used any
per-cpu _locks_ in this synchronization scheme, though it is named
"per-cpu rwlocks". The only per-cpu components here are the refcounts,
and we consciously avoid waiting/spinning on them (because then that
would be equivalent to having per-cpu locks, which are deadlock-prone).
We use global rwlocks to get the deadlock-safety that we need.

> +	write_lock(&lgrw->fallback_rwlock);
> +}
> +EXPORT_SYMBOL(lg_rwlock_global_write_lock);
> +
> +void lg_rwlock_global_write_unlock(struct lgrwlock *lgrw)
> +{
> +	write_unlock(&lgrw->fallback_rwlock);
> +	lg_global_unlock(&lgrw->lglock);
> +}
> +EXPORT_SYMBOL(lg_rwlock_global_write_unlock);
>

Regards,
Srivatsa S.
Bhat
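
A rough user-space model of the idea described in the reply above
(per-CPU reader refcounts layered on one global rwlock, with no per-CPU
locks for a writer to walk) might look like the sketch below. It is only
an illustration under stated assumptions, not the percpu_rwlock code from
the patch series: every name in it (sketch_rwlock, SKETCH_NR_CPUS,
sketch_demo and the helpers) is hypothetical, the global lock is modelled
with a C11 atomic counter instead of rwlock_t, and the caller passes its
CPU slot explicitly, whereas kernel code would stay on one CPU by
disabling preemption.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 4	/* hypothetical CPU count for this model */

struct sketch_rwlock {
	atomic_int state;	/* global lock: -1 = writer, n >= 0 = n readers */
	unsigned long reader_refcnt[SKETCH_NR_CPUS];	/* per-"CPU" nesting depth */
};

static void sketch_init(struct sketch_rwlock *l)
{
	atomic_init(&l->state, 0);
	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		l->reader_refcnt[cpu] = 0;
}

/* Try to join the readers of the global lock; fails while a writer holds it. */
static bool global_read_trylock(struct sketch_rwlock *l)
{
	int old = atomic_load(&l->state);

	while (old >= 0) {
		if (atomic_compare_exchange_weak(&l->state, &old, old + 1))
			return true;
	}
	return false;
}

/* Try to take the global lock exclusively; fails while anyone holds it. */
static bool global_write_trylock(struct sketch_rwlock *l)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&l->state, &expected, -1);
}

/* Reader: only the outermost acquisition touches the global lock. */
static void sketch_read_lock(struct sketch_rwlock *l, int cpu)
{
	if (l->reader_refcnt[cpu] == 0) {
		while (!global_read_trylock(l))
			;	/* spin on the single global lock only */
	}
	l->reader_refcnt[cpu]++;
}

static void sketch_read_unlock(struct sketch_rwlock *l, int cpu)
{
	assert(l->reader_refcnt[cpu] > 0);
	if (--l->reader_refcnt[cpu] == 0)
		atomic_fetch_sub(&l->state, 1);
}

/* Writer: one global acquisition, no loop over per-CPU locks. */
static void sketch_write_lock(struct sketch_rwlock *l)
{
	while (!global_write_trylock(l))
		;
}

static void sketch_write_unlock(struct sketch_rwlock *l)
{
	atomic_store(&l->state, 0);
}

/* Single-threaded walk-through; returns 0 if every step behaves as described. */
static int sketch_demo(void)
{
	struct sketch_rwlock l;

	sketch_init(&l);
	sketch_read_lock(&l, 0);
	sketch_read_lock(&l, 0);		/* nested: bumps the refcount only */
	if (atomic_load(&l.state) != 1)
		return 1;			/* global lock was taken once */
	if (global_write_trylock(&l))
		return 2;			/* writer must be excluded */
	sketch_read_unlock(&l, 0);
	if (global_write_trylock(&l))
		return 3;			/* outer reader still holds it */
	sketch_read_unlock(&l, 0);
	sketch_write_lock(&l);			/* uncontended now */
	if (global_read_trylock(&l))
		return 4;			/* readers excluded by the writer */
	sketch_write_unlock(&l);
	return 0;
}
```

In this model the writer only ever contends on the one global lock, so
there is no per-CPU lock acquisition order for readers to conflict with;
each per-CPU counter is touched only by its own slot and nobody waits on
it.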