All of lore.kernel.org
 help / color / mirror / Atom feed
From: Manfred Spraul <manfred@colorfullife.com>
To: Chris Mason <chris.mason@oracle.com>,
	Nick Piggin <npiggin@suse.de>,
	zach.brown@oracle.com, jens.axboe@oracle.com,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] ipc semaphores: reduce ipc_lock contention in semtimedop
Date: Wed, 14 Apr 2010 21:11:44 +0200	[thread overview]
Message-ID: <4BC61370.7020700@colorfullife.com> (raw)
In-Reply-To: <20100414173319.GA3228@think>

[-- Attachment #1: Type: text/plain, Size: 3388 bytes --]

On 04/14/2010 07:33 PM, Chris Mason wrote:
> On Wed, Apr 14, 2010 at 06:16:53PM +0200, Manfred Spraul wrote:
>    
>> On 04/13/2010 08:19 PM, Chris Mason wrote:
>>      
>>> On Wed, Apr 14, 2010 at 04:09:45AM +1000, Nick Piggin wrote:
>>>        
>>>> On Tue, Apr 13, 2010 at 01:39:41PM -0400, Chris Mason wrote:
>>>> The other thing I don't know if your patch gets right is requeueing on
>>>> of the operations. When you requeue from one list to another, then you
>>>> seem to lose ordering with other pending operations, so that would
>>>> seem to break the API as well (can't remember if the API strictly
>>>> mandates FIFO, but anyway it can open up starvation cases).
>>>>          
>>> I don't see anything in the docs about the FIFO order.  I could add an
>>> extra sort on sequence number pretty easily, but is the starvation case
>>> really that bad?
>>>
>>>        
>> How do you want to determine the sequence number?
>> Is atomic_inc_return() on a per-semaphore array counter sufficiently fast?
>>      
> I haven't tried yet, but hopefully it won't be a problem.  A later patch
> does atomics on the reference count and it doesn't show up in the
> profiles.
>
>    
>>      
>>>> I was looking at doing a sequence number to be able to sort these, but
>>>> it ended up getting over complex (and SAP was only using simple ops so
>>>> it didn't seem to need much better).
>>>>
>>>> We want to be careful not to change semantics at all. And it gets
>>>> tricky quickly :( What about Zach's simpler wakeup API?
>>>>          
>>> Yeah, that's why my patches include code to handle userland sending
>>> duplicate semids.  Zach's simpler API is cooking too, but if I can get
>>> this done without insane complexity it helps with more than just the
>>> post/wait oracle workload.
>>>
>>>        
>> What is the oracle workload, which multi-sembuf operations does it use?
>> How many semaphores are in one array?
>>
>> When the last optimizations were written, I've searched a bit:
>> - postgres uses per-process semaphores, with small semaphore arrays.
>>      [process sleeps on it's own semaphore and is woken up by someone
>> else when it can make progress]
>>      
> This is similar to Oracle (and the sembench program).  Each process has
> a semaphore and when it is waiting for a commit it goes to sleep on it.
> They are woken up in bulk with semtimedop calls from a single process.
>
>    
Hmm. Thus you have:
- single sembuf decrease operations that are waiting frequently.
- multi-sembuf  increase operations.

What about optimizing for that case?
Increase operations succeed immediately. Thus complex_count is 0.

If we have performed an update operation, then we can scan all 
simple_lists that have seen an increase instead of checking the global 
list - as long as there are no complex operations waiting.
Right now, we give up if the update operation was a complex operation - 
but that does not matter.
All that matters are the sleeping operations, not the operation that did 
the wakeup.
I've attached an untested idea.

> But oracle also uses semaphores for locking in a traditional sense.
>
> Putting the waiters into a per-semaphore list is really only part of the
> speedup.  The real boost comes from the patch to break up the locks into
> a per semaphore lock.
>
>    
Ok. Then simple tricks won't help.
How many semaphores are in one array?

--
     Manfred

[-- Attachment #2: patch-ipc-optimize_bulkwakeup --]
[-- Type: text/plain, Size: 646 bytes --]

diff --git a/ipc/sem.c b/ipc/sem.c
index dbef95b..8986239 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -1224,8 +1224,18 @@ SYSCALL_DEFINE4(semtimedop, int, semid, struct sembuf __user *, tsops,
 
 	error = try_atomic_semop (sma, sops, nsops, un, task_tgid_vnr(current));
 	if (error <= 0) {
-		if (alter && error == 0)
-			update_queue(sma, (nsops == 1) ? sops[0].sem_num : -1);
+		if (alter && error == 0) {
+			if (sma->complex_count) {
+				update_queue(sma, -1);
+			} else {
+				int i;
+				
+				for (i=0;i<nsops;i++) {
+					if (sops[i].sem_op > 0)
+						update_queue(sma, sops[i].sem_num);
+				}
+			}
+		}
 
 		goto out_unlock_free;
 	}

  reply	other threads:[~2010-04-14 19:11 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-04-12 18:49 [PATCH RFC] Optimize semtimedop Chris Mason
2010-04-12 18:49 ` [PATCH 1/2] ipc semaphores: reduce ipc_lock contention in semtimedop Chris Mason
2010-04-13 17:15   ` Manfred Spraul
2010-04-13 17:39     ` Chris Mason
2010-04-13 18:09       ` Nick Piggin
2010-04-13 18:19         ` Chris Mason
2010-04-13 18:57           ` Nick Piggin
2010-04-13 19:01             ` Chris Mason
2010-04-13 19:25               ` Nick Piggin
2010-04-13 19:38                 ` Chris Mason
2010-04-13 20:05                   ` Nick Piggin
2010-05-16 16:57             ` Manfred Spraul
2010-05-16 22:40               ` Chris Mason
2010-05-17  7:20               ` Nick Piggin
2010-04-14 16:16           ` Manfred Spraul
2010-04-14 17:33             ` Chris Mason
2010-04-14 19:11               ` Manfred Spraul [this message]
2010-04-14 19:50                 ` Chris Mason
2010-04-15 16:33                   ` Manfred Spraul
2010-04-15 16:34                     ` Chris Mason
2010-04-13 18:24         ` Zach Brown
2010-04-16 11:26   ` Manfred Spraul
2010-04-16 11:45     ` Chris Mason
2010-04-12 18:49 ` [PATCH 2/2] ipc semaphores: order wakeups based on waiter CPU Chris Mason
2010-04-17 10:24   ` Manfred Spraul

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4BC61370.7020700@colorfullife.com \
    --to=manfred@colorfullife.com \
    --cc=chris.mason@oracle.com \
    --cc=jens.axboe@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=npiggin@suse.de \
    --cc=zach.brown@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.