From: Waiman Long <waiman.long@hp.com>
To: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Arnd Bergmann <arnd@arndb.de>,
	linux-arch@vger.kernel.org, x86@kernel.org,
	linux-kernel@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Richard Weinberger <richard@nod.at>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Matt Fleming <matt.fleming@intel.com>,
	Herbert Xu <herbert@gondor.hengli.com.au>,
	Akinobu Mita <akinobu.mita@gmail.com>,
	Rusty Russell <rusty@rustcorp.com.au>,
	Michel Lespinasse <walken@google.com>,
	Andi Kleen <andi@firstfloor.org>, Rik van Riel <riel@redhat.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	"Chandramouleeswaran, Aswin" <aswin@hp.com>,
	"Norton, Scott J" <scott.norton@hp.com>
Subject: Re: [PATCH RFC 1/2] qrwlock: A queue read/write lock implementation
Date: Tue, 23 Jul 2013 19:54:12 -0400	[thread overview]
Message-ID: <51EF17A4.5040300@hp.com> (raw)
In-Reply-To: <51EB74C7.7060503@linux.vnet.ibm.com>

On 07/21/2013 01:42 AM, Raghavendra K T wrote:
> On 07/18/2013 07:49 PM, Waiman Long wrote:
>> On 07/18/2013 06:22 AM, Thomas Gleixner wrote:
>>> Waiman,
>>>
>>> On Mon, 15 Jul 2013, Waiman Long wrote:
>>>> On 07/15/2013 06:31 PM, Thomas Gleixner wrote:
>>>>> On Fri, 12 Jul 2013, Waiman Long wrote:
> [...]
>>>
>>>>>> + * an increase in lock size is not an issue.
>>>>> So is it faster in the general case, or only for the high
>>>>> contention or single thread operation cases?
>>>>>
>>>>> And you still haven't explained WHY it is faster. Can you please
>>>>> explain properly WHY it is faster, and WHY we can't apply the
>>>>> technique you implemented for qrwlocks to writer-only locks (aka
>>>>> spinlocks) with a smaller lock size?
>>>> I will try to collect more data to justify the usefulness of qrwlock.
>>> And please provide a proper argument why we can't use the same
>>> technique for spinlocks.
>>
>> Of course, we can use the same technique for spinlocks. Since we only
>> need 1 bit for the lock, we could combine the lock bit with the queue
>> address at the cost of a little more overhead in terms of coding and
>> speed. That will make the new lock 4 bytes in size for 32-bit code &
>> 8 bytes for 64-bit code. That could solve a lot of the performance
>> problems that we have with spinlocks. However, I am aware that
>> increasing the size of the spinlock (for 64-bit systems) may break a
>> lot of inherent alignment in many of the data structures. That is why
>> I am not proposing such a change right now. But if there is enough
>> interest, we could certainly go ahead and see how things go.
>
> Setting aside the lock size issue: for spinlocks, is it that the
> fastpath overhead is less significant in low contention scenarios for
> qlocks?

Fastpath speed is an important consideration for accepting changes to a 
lock, especially if the critical section is short. That is the 
impression that I have gotten so far. When the critical section is 
long, however, the speed of the fastpath will be less important.

> Also let me know if you have a POC implementation for the spinlocks
> that you can share. I am happy to test it.

I don't have a POC implementation for the spinlocks, as I am aware that 
any increase in spinlock size will make it hard to get merged. I could 
make one after I finish the current set of patches that I am working on.

> Sorry, different context:
> apart from AIM7 fserver, is there any other benchmark that exercises
> this qrwlock series? (to help in the testing)
>
For the AIM7 test suite, fserver & new_fserver with ext4 are the best 
ones for exercising the qrwlock series, but you do need a lot of cores 
to see the effect. I haven't tried to find other suitable benchmark 
tests yet.

Actually, improving fserver and new_fserver performance is not my 
primary objective. My primary goal is to have a fair rwlock 
implementation that can be used to replace selected spinlocks that are 
under high contention, without losing the fairness attribute of the 
ticket spinlock, much like the replacement of mutexes by rwsems.

Regards,
Longman

Thread overview: 49+ messages
2013-07-13  1:34 [PATCH RFC 0/2] qrwlock: Introducing a queue read/write lock implementation Waiman Long
2013-07-13  1:34 ` [PATCH RFC 1/2] qrwlock: A " Waiman Long
2013-07-15 14:39   ` Steven Rostedt
2013-07-15 20:44     ` Waiman Long
2013-07-15 22:31   ` Thomas Gleixner
2013-07-16  1:19     ` Waiman Long
2013-07-18  7:42       ` Ingo Molnar
2013-07-18 13:40         ` Waiman Long
2013-07-19  8:40           ` Ingo Molnar
2013-07-19 15:30             ` Waiman Long
2013-07-22 10:34               ` Ingo Molnar
2013-07-24  0:03                 ` Waiman Long
2013-07-18 10:22       ` Thomas Gleixner
2013-07-18 14:19         ` Waiman Long
2013-07-21  5:42           ` Raghavendra K T
2013-07-23 23:54             ` Waiman Long [this message]
2013-07-13  1:34 ` [PATCH RFC 2/2] x86 qrwlock: Enable x86 to use queue read/write lock Waiman Long
2013-07-18 12:55 [PATCH RFC 1/2] qrwlock: A queue read/write lock implementation George Spelvin
2013-07-18 13:43 ` Waiman Long
2013-07-18 18:46   ` George Spelvin
2013-07-19 15:43     ` Waiman Long
2013-07-19 21:11       ` George Spelvin
2013-07-19 21:35         ` Waiman Long
2013-07-18 13:18 George Spelvin
