From: Lai Jiangshan <eag0628@gmail.com>
To: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Michel Lespinasse <walken@google.com>,
linux-doc@vger.kernel.org, peterz@infradead.org,
fweisbec@gmail.com, linux-kernel@vger.kernel.org,
namhyung@kernel.org, mingo@kernel.org,
linux-arch@vger.kernel.org, linux@arm.linux.org.uk,
xiaoguangrong@linux.vnet.ibm.com, wangyun@linux.vnet.ibm.com,
paulmck@linux.vnet.ibm.com, nikunj@linux.vnet.ibm.com,
linux-pm@vger.kernel.org, rusty@rustcorp.com.au,
rostedt@goodmis.org, rjw@sisk.pl, vincent.guittot@linaro.org,
tglx@linutronix.de, linux-arm-kernel@lists.infradead.org,
netdev@vger.kernel.org, oleg@redhat.com, sbw@mit.edu,
tj@kernel.org, akpm@linux-foundation.org,
linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH v6 04/46] percpu_rwlock: Implement the core design of Per-CPU Reader-Writer Locks
Date: Tue, 26 Feb 2013 08:19:19 +0800
Message-ID: <CACvQF50-ZqE3=baerh5oVk6q+TxmNt53pp-vHkK=QfM1rtPEuw@mail.gmail.com>
In-Reply-To: <CACvQF51jCxk5jUqmhD=QBBtUsBkQWZzakacrKO4Gsk=w61rNwQ@mail.gmail.com>
On Tue, Feb 26, 2013 at 8:17 AM, Lai Jiangshan <eag0628@gmail.com> wrote:
> On Tue, Feb 26, 2013 at 3:26 AM, Srivatsa S. Bhat
> <srivatsa.bhat@linux.vnet.ibm.com> wrote:
>> Hi Lai,
>>
>> On 02/25/2013 09:23 PM, Lai Jiangshan wrote:
>>> Hi, Srivatsa,
>>>
>>> The goal of the whole patchset looks good to me.
>>
>> Cool! Thanks :-)
>>
>>> A question: How did you find all such usages of
>>> "preempt_disable()" and convert them? Have all of them been converted?
>>>
>>
>> Well, I scanned through the source tree for usages which implicitly
>> disabled CPU offline and converted them over. It's not limited to uses
>> of preempt_disable() alone - even spin_locks, rwlocks, local_irq_disable()
>> etc also help disable CPU offline. So I tried to dig out all such uses
>> and converted them. However, since the merge window is open, a lot of
>> new code is flowing into the tree. So I'll have to rescan the tree to
>> see if there are any more places to convert.
>>
>>> And I think the lock is too complex and reinvents the wheel. Why don't
>>> you reuse the lglock?
>>
>> lglocks? No way! ;-) See below...
>>
>>> I wrote an untested draft here.
>>>
>>> Thanks,
>>> Lai
>>>
>>> PS: Some HA tools (I'm writing one) take checkpoints of
>>> virtual machines frequently; I guess this patchset can speed up such
>>> tools.
>>>
>>> From 01db542693a1b7fc6f9ece45d57cb529d9be5b66 Mon Sep 17 00:00:00 2001
>>> From: Lai Jiangshan <laijs@cn.fujitsu.com>
>>> Date: Mon, 25 Feb 2013 23:14:27 +0800
>>> Subject: [PATCH] lglock: add read-preference local-global rwlock
>>>
>>> locality via lglock(trylock)
>>> read-preference read-write-lock via fallback rwlock_t
>>>
>>> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
>>> ---
>>> include/linux/lglock.h | 31 +++++++++++++++++++++++++++++++
>>> kernel/lglock.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
>>> 2 files changed, 76 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/include/linux/lglock.h b/include/linux/lglock.h
>>> index 0d24e93..30fe887 100644
>>> --- a/include/linux/lglock.h
>>> +++ b/include/linux/lglock.h
>>> @@ -67,4 +67,35 @@ void lg_local_unlock_cpu(struct lglock *lg, int cpu);
>>> void lg_global_lock(struct lglock *lg);
>>> void lg_global_unlock(struct lglock *lg);
>>>
>>> +struct lgrwlock {
>>> +	unsigned long __percpu *fallback_reader_refcnt;
>>> +	struct lglock lglock;
>>> +	rwlock_t fallback_rwlock;
>>> +};
>>> +
>>> +#define DEFINE_LGRWLOCK(name) \
>>> +	static DEFINE_PER_CPU(arch_spinlock_t, name ## _lock) \
>>> +	= __ARCH_SPIN_LOCK_UNLOCKED; \
>>> +	static DEFINE_PER_CPU(unsigned long, name ## _refcnt); \
>>> +	struct lgrwlock name = { \
>>> +		.fallback_reader_refcnt = &name ## _refcnt, \
>>> +		.lglock = { .lock = &name ## _lock } }
>>> +
>>> +#define DEFINE_STATIC_LGRWLOCK(name) \
>>> +	static DEFINE_PER_CPU(arch_spinlock_t, name ## _lock) \
>>> +	= __ARCH_SPIN_LOCK_UNLOCKED; \
>>> +	static DEFINE_PER_CPU(unsigned long, name ## _refcnt); \
>>> +	static struct lgrwlock name = { \
>>> +		.fallback_reader_refcnt = &name ## _refcnt, \
>>> +		.lglock = { .lock = &name ## _lock } }
>>> +
>>> +static inline void lg_rwlock_init(struct lgrwlock *lgrw, char *name)
>>> +{
>>> +	lg_lock_init(&lgrw->lglock, name);
>>> +}
>>> +
>>> +void lg_rwlock_local_read_lock(struct lgrwlock *lgrw);
>>> +void lg_rwlock_local_read_unlock(struct lgrwlock *lgrw);
>>> +void lg_rwlock_global_write_lock(struct lgrwlock *lgrw);
>>> +void lg_rwlock_global_write_unlock(struct lgrwlock *lgrw);
>>> #endif
>>> diff --git a/kernel/lglock.c b/kernel/lglock.c
>>> index 6535a66..463543a 100644
>>> --- a/kernel/lglock.c
>>> +++ b/kernel/lglock.c
>>> @@ -87,3 +87,48 @@ void lg_global_unlock(struct lglock *lg)
>>>  	preempt_enable();
>>> }
>>> EXPORT_SYMBOL(lg_global_unlock);
>>> +
>>> +void lg_rwlock_local_read_lock(struct lgrwlock *lgrw)
>>> +{
>>> +	struct lglock *lg = &lgrw->lglock;
>>> +
>>> +	preempt_disable();
>>> +	if (likely(!__this_cpu_read(*lgrw->fallback_reader_refcnt))) {
>>> +		if (likely(arch_spin_trylock(this_cpu_ptr(lg->lock)))) {
>>> +			rwlock_acquire_read(&lg->lock_dep_map, 0, 0, _RET_IP_);
>>> +			return;
>>> +		}
>>> +		read_lock(&lgrw->fallback_rwlock);
>>> +	}
>>> +
>>> +	__this_cpu_inc(*lgrw->fallback_reader_refcnt);
>>> +}
>>> +EXPORT_SYMBOL(lg_rwlock_local_read_lock);
>>> +
>>> +void lg_rwlock_local_read_unlock(struct lgrwlock *lgrw)
>>> +{
>>> +	if (likely(!__this_cpu_read(*lgrw->fallback_reader_refcnt))) {
>>> +		lg_local_unlock(&lgrw->lglock);
>>> +		return;
>>> +	}
>>> +
>>> +	if (!__this_cpu_dec_return(*lgrw->fallback_reader_refcnt))
>>> +		read_unlock(&lgrw->fallback_rwlock);
>>> +
>>> +	preempt_enable();
>>> +}
>>> +EXPORT_SYMBOL(lg_rwlock_local_read_unlock);
>>> +
>>
>> If I read the code above correctly, all you are doing is implementing a
>> recursive reader-side primitive (i.e., allowing the reader to call these
>> functions recursively, without resulting in a self-deadlock).
>>
>> But the thing is, making the reader-side recursive is the least of our
>> problems! Our main challenge is to make the locking extremely flexible
>> and also safe-guard it against circular-locking-dependencies and deadlocks.
>> Please take a look at the changelog of patch 1 - it explains the situation
>> with an example.
>
>
> My lock fixes your requirements (I read patches 1-6 before I sent this). On
s/fixes/fits/
> the read side, the lglock's per-CPU lock is taken via trylock, so the lglock
> doesn't contribute to deadlocks; we can treat it as nonexistent when
> looking for deadlocks. And the global fallback rwlock doesn't result in
> deadlocks because it is read-preference (you need to increment
> fallback_reader_refcnt inside the cpu-hotplug write side; I don't do
> that in the generic lgrwlock).
>
>
> If lg_rwlock_local_read_lock() spins, it is spinning on
> fallback_rwlock, which means lg_rwlock_global_write_lock() has
> successfully taken the lgrwlock and returned, which in turn means
> lg_rwlock_local_read_lock() will stop spinning once the write side
> finishes.
>
>
>>
>>> +void lg_rwlock_global_write_lock(struct lgrwlock *lgrw)
>>> +{
>>> +	lg_global_lock(&lgrw->lglock);
>>
>> This does a for-loop on all CPUs and takes their locks one-by-one. That's
>> exactly what we want to prevent, because that is the _source_ of all our
>> deadlock woes in this case. In the presence of perfect lock ordering
>> guarantees, this wouldn't have been a problem (that's why lglocks are
>> being used successfully elsewhere in the kernel). In the stop-machine()
>> removal case, the over-flexibility of preempt_disable() forces us to provide
>> an equally flexible locking alternative. Hence we can't use such per-cpu
>> locking schemes.
>>
>> You might note that, for exactly this reason, I haven't actually used any
>> per-cpu _locks_ in this synchronization scheme, though it is named as
>> "per-cpu rwlocks". The only per-cpu components here are the refcounts, and
>> we consciously avoid waiting/spinning on them (because then that would be
>> equivalent to having per-cpu locks, which are deadlock-prone). We use
>> global rwlocks to get the deadlock-safety that we need.
>>
>>> +	write_lock(&lgrw->fallback_rwlock);
>>> +}
>>> +EXPORT_SYMBOL(lg_rwlock_global_write_lock);
>>> +
>>> +void lg_rwlock_global_write_unlock(struct lgrwlock *lgrw)
>>> +{
>>> +	write_unlock(&lgrw->fallback_rwlock);
>>> +	lg_global_unlock(&lgrw->lglock);
>>> +}
>>> +EXPORT_SYMBOL(lg_rwlock_global_write_unlock);
>>>
>>
>> Regards,
>> Srivatsa S. Bhat
>>
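
The read and write paths of the quoted draft can be modelled in user space
to make the control flow easier to follow. What follows is a minimal sketch
under stated assumptions, not the kernel code: C11 atomics stand in for
arch_spinlock_t and rwlock_t, the "CPU" is an index passed in by the caller
(one thread per modelled CPU is assumed), preemption and lockdep are not
modelled, and every sketch_* name is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 4	/* hypothetical CPU count for the model */

struct sketch_lgrwlock {
	atomic_bool cpu_lock[SKETCH_NR_CPUS];	/* models the per-CPU arch_spinlock_t */
	unsigned long refcnt[SKETCH_NR_CPUS];	/* models fallback_reader_refcnt */
	atomic_int fallback;	/* models fallback_rwlock: -1 writer, n >= 0 readers */
};

static void sketch_init(struct sketch_lgrwlock *l)
{
	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		atomic_init(&l->cpu_lock[cpu], false);
		l->refcnt[cpu] = 0;
	}
	atomic_init(&l->fallback, 0);
}

/* Returns true if the per-CPU lock was free and is now held by the caller. */
static bool sketch_cpu_trylock(struct sketch_lgrwlock *l, int cpu)
{
	return !atomic_exchange(&l->cpu_lock[cpu], true);
}

static void sketch_read_lock(struct sketch_lgrwlock *l, int cpu)
{
	if (l->refcnt[cpu] == 0) {
		if (sketch_cpu_trylock(l, cpu))
			return;	/* fast path: the reader never spins on the per-CPU lock */
		/* Slow path: wait only while a writer holds the fallback lock. */
		int s = atomic_load(&l->fallback);
		for (;;) {
			if (s < 0)
				s = atomic_load(&l->fallback);
			else if (atomic_compare_exchange_weak(&l->fallback, &s, s + 1))
				break;
		}
	}
	l->refcnt[cpu]++;	/* nested readers only bump the per-CPU count */
}

static void sketch_read_unlock(struct sketch_lgrwlock *l, int cpu)
{
	if (l->refcnt[cpu] == 0) {
		atomic_store(&l->cpu_lock[cpu], false);	/* undo the fast path */
		return;
	}
	if (--l->refcnt[cpu] == 0)
		atomic_fetch_sub(&l->fallback, 1);	/* last fallback reader on this CPU */
}

static void sketch_write_lock(struct sketch_lgrwlock *l)
{
	/* Take every per-CPU lock in order, then the fallback lock exclusively. */
	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		while (!sketch_cpu_trylock(l, cpu))
			;
	int expected = 0;
	while (!atomic_compare_exchange_weak(&l->fallback, &expected, -1))
		expected = 0;
}

static void sketch_write_unlock(struct sketch_lgrwlock *l)
{
	atomic_store(&l->fallback, 0);
	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		atomic_store(&l->cpu_lock[cpu], false);
}
```

In this model a reader blocks only in the slow path, and only while the
fallback state is negative, that is, while a writer already holds everything.
A nested reader on the same modelled CPU fails the trylock, takes the
fallback lock for reading and counts itself in refcnt, which is the behaviour
the draft relies on.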