[PATCH RFC] Optimize semtimedop

* [PATCH RFC] Optimize semtimedop
@ 2010-04-12 18:49 Chris Mason
  2010-04-12 18:49 ` [PATCH 1/2] ipc semaphores: reduce ipc_lock contention in semtimedop Chris Mason
  2010-04-12 18:49 ` [PATCH 2/2] ipc semaphores: order wakeups based on waiter CPU Chris Mason
  0 siblings, 2 replies; 25+ messages in thread
From: Chris Mason @ 2010-04-12 18:49 UTC (permalink / raw)
  To: chris.mason, zach.brown, jens.axboe, linux-kernel, Nick Piggin,
	Manfred Spraul

We've been poking around in semtimedop for a while now, mostly because
it is consistently showing up at the top of the CPU profiles for benchmarking
runs on big numa systems.  The biggest problem seems to be the IPC lock, and
the fact that we hold it for a long time while we loop over different lists and
try to do semaphore operations.

Zach Brown came up with a set of patches a while ago that switched away from
the global pending list, and semtimedop was recently optimized for the
single sop case by Nick and Manfred.

This patch series tries to build on ideas from all of these patches.  The
list of pending semaphore operations is pushed down to the individual
semaphore and the locking is also pushed down into the semaphore.  The
result is much faster with my micro benchmark:

http://oss.oracle.com/~mason/sembench.c

It more than doubles the total number of post/wait cycles the benchmark
is able to get in 30s.  Before this patch, semtimedop scored about the
same as futexes for the post/wait cycles, and so now it is 2x faster.

I did run this code through all of the ltp ipc tests, and later this week I
hope to get a full tpc database benchmark on it.

^ permalink raw reply	[flat|nested] 25+ messages in thread