From: "Long, Wai Man" <waiman.long@hp.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>,
	"H. Peter Anvin" <hpa@zytor.com>, linux-arch@vger.kernel.org,
	x86@kernel.org, linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org,
	xen-devel@lists.xenproject.org, kvm@vger.kernel.org,
	Paolo Bonzini <paolo.bonzini@gmail.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Rik van Riel <riel@redhat.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>,
	David Vrabel <david.vrabel@citrix.com>,
	Oleg Nesterov <oleg@redhat.com>, Gleb Natapov <gleb@redhat.com>,
	Scott J Norton <scott.norton@hp.com>,
	Chegu Vinod <chegu_vinod@hp.com>
Subject: Re: [PATCH v11 06/16] qspinlock: prolong the stay in the pending bit path
Date: Wed, 11 Jun 2014 17:22:28 -0400
Message-ID: <5398C894.6040808@hp.com>
In-Reply-To: <20140611102606.GK3213@twins.programming.kicks-ass.net>

On 6/11/2014 6:26 AM, Peter Zijlstra wrote:
> On Fri, May 30, 2014 at 11:43:52AM -0400, Waiman Long wrote:
>> ---
>>  kernel/locking/qspinlock.c |   18 ++++++++++++++++--
>>  1 files changed, 16 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
>> index fc7fd8c..7f10758 100644
>> --- a/kernel/locking/qspinlock.c
>> +++ b/kernel/locking/qspinlock.c
>> @@ -233,11 +233,25 @@ void queue_spin_lock_slowpath(struct qspinlock *lock, u32 val)
>>  	 */
>>  	for (;;) {
>>  		/*
>> -		 * If we observe any contention; queue.
>> +		 * If we observe that the queue is not empty or both
>> +		 * the pending and lock bits are set, queue
>>  		 */
>> -		if (val & ~_Q_LOCKED_MASK)
>> +		if ((val & _Q_TAIL_MASK) ||
>> +		    (val == (_Q_LOCKED_VAL|_Q_PENDING_VAL)))
>>  			goto queue;
>>
>> +		if (val == _Q_PENDING_VAL) {
>> +			/*
>> +			 * Pending bit is set, but not the lock bit.
>> +			 * Assuming that the pending bit holder is going to
>> +			 * set the lock bit and clear the pending bit soon,
>> +			 * it is better to wait than to exit at this point.
>> +			 */
>> +			cpu_relax();
>> +			val = atomic_read(&lock->val);
>> +			continue;
>> +		}
>> +
>>  		new = _Q_LOCKED_VAL;
>>  		if (val == new)
>>  			new |= _Q_PENDING_VAL;
>
> So, again, you just posted a new version without replying to the
> previous discussion; so let me try again, what's wrong with the proposal
> here:
>
>   lkml.kernel.org/r/20140417163640.GT11096@twins.programming.kicks-ass.net
>
>

I thought I had answered you before; maybe the message was lost or the
answer was not complete. Anyway, I will try to respond to your question
again here.

> Wouldn't something like:
>
>	while (atomic_read(&lock->val) == _Q_PENDING_VAL)
>		cpu_relax();
>
> before the cmpxchg loop have gotten you all this?

That is not exactly the same. The loop will exit if other bits are set
or the pending bit is cleared. In that case, we will need to do the same
check at the beginning of the for loop in order to avoid doing an extra
cmpxchg that is not necessary.

> I just tried this on my code and I cannot see a difference.

As I said before, I did see a difference with that change. I think it
depends on the CPU chip used for testing. I ran my test on a 10-core
Westmere-EX chip. I ran my microbenchmark on different pairs of cores
within the same chip. It produced results that varied from 779.5ms up
to 1192ms. Without that patch, the lowest value I can get is still
close to 800ms, but the highest can be up to 1800ms or so. So I believe
it is just a matter of timing that you did not observe on your test
machine.

-Longman
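
For readers following the argument above, here is a minimal user-space sketch of the control flow the patch describes: queue when the tail is set or when both the pending and lock bits are set, re-read while only the pending bit is set, and otherwise attempt the cmpxchg. It is not the kernel code. The bit positions (Q_LOCKED_VAL, Q_PENDING_VAL, Q_TAIL_MASK), the q_result enum and the q_pending_path() name are assumptions made for illustration, and C11 atomics stand in for the kernel's atomic_read()/atomic_cmpxchg() and cpu_relax().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumed bit layout, for illustration only (not taken from the patch). */
#define Q_LOCKED_VAL   0x00000001U   /* lock bit */
#define Q_PENDING_VAL  0x00000100U   /* pending bit */
#define Q_TAIL_MASK    0xFFFF0000U   /* queue tail: non-zero means waiters are queued */

/* Hypothetical result type: what the caller should do next. */
enum q_result { Q_MUST_QUEUE, Q_GOT_LOCK, Q_GOT_PENDING };

/*
 * Sketch of the pending-bit path with the in-loop check from the patch.
 * The check sits inside the retry loop, so it is repeated after every
 * failed compare-exchange, not only once before the loop.
 */
static enum q_result q_pending_path(_Atomic uint32_t *lock)
{
	uint32_t val, newval;

	assert(lock);
	val = atomic_load(lock);

	for (;;) {
		/* Queue is not empty, or both pending and lock bits are set. */
		if ((val & Q_TAIL_MASK) ||
		    val == (Q_LOCKED_VAL | Q_PENDING_VAL))
			return Q_MUST_QUEUE;

		/*
		 * Only the pending bit is set: its holder is expected to take
		 * the lock soon, so re-read instead of queuing. The kernel
		 * would call cpu_relax() here.
		 */
		if (val == Q_PENDING_VAL) {
			val = atomic_load(lock);
			continue;
		}

		newval = Q_LOCKED_VAL;
		if (val == newval)
			newval |= Q_PENDING_VAL;

		/* On failure, val is refreshed with the current lock word. */
		if (atomic_compare_exchange_strong(lock, &val, newval))
			return newval == Q_LOCKED_VAL ? Q_GOT_LOCK : Q_GOT_PENDING;
	}
}
```

Starting from an unlocked word, the first caller takes the lock, the second takes the pending bit, and a third is told to queue. A wait loop placed only before the for loop would cover the first read but not the values observed after a failed compare-exchange, which is the difference the reply points out.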