[PATCH v2 0/5] mutex: Mutex scalability patches

* [PATCH v2 0/5] mutex: Mutex scalability patches
@ 2014-01-28 19:13 Jason Low
  2014-01-28 19:13 ` [PATCH v2 1/5] mutex: In mutex_can_spin_on_owner(), return false if task need_resched() Jason Low
                   ` (5 more replies)
  0 siblings, 6 replies; 51+ messages in thread
From: Jason Low @ 2014-01-28 19:13 UTC (permalink / raw)
  To: mingo, peterz, paulmck, Waiman.Long, torvalds, tglx, jason.low2
  Cc: linux-kernel, riel, akpm, davidlohr, hpa, andi, aswin,
	scott.norton, chegu_vinod

v1->v2:
- Replace the previous patch that limits the # of times a thread can spin with
  !lock->owner with a patch that releases the mutex before holding the wait_lock
  in the __mutex_unlock_common_slowpath() function.
- Add a patch which allows a thread to attempt 1 mutex_spin_on_owner() without
  checking need_resched() if need_resched() triggered while in the MCS queue.
- Add a patch which disables preemption between modifying lock->owner and
  acquiring/releasing the mutex.

This patchset addresses a few scalability issues with mutexes.

Patch 1 has the mutex_can_spin_on_owner() funtion check for need_resched()
before being added to MCS queue. 

Patches 2, 3 are to fix issues with threads spinning when
there is no lock owner when the mutex is under high contention.

Patch 4 and 5 are RFC patches. Patch 4 disables preemption between modifying
lock->owner and locking/unlocking the mutex. Patch 5 addresses the situation
where spinners can potentially wait a long time in the MCS queue for a chance
to spin on mutex owner (not checking for need_resched()), yet ends up not
getting to spin.

These changes benefit the AIM7 fserver and high_systime workloads (run on disk)
on an 8 socket, 80 core box. The table below shows the performance
improvements with 3.13 + patches 1, 2, 3 when compared to the 3.13 baseline,
and the performance improvements with 3.13 + all 5 patches compared to
the 3.13 baseline.

Note: I split the % improvement into these 2 categories because
patch 3 and patch 5 are the most interesting/important patches in
this patchset in terms of performance improvements.

---------------------------------------------------------------
		high_systime
---------------------------------------------------------------
# users   | avg % improvement with | avg % improvement with
          | 3.13 + patch 1, 2, 3   | 3.13 + patch 1, 2, 3, 4, 5
---------------------------------------------------------------
1000-2000 |    +27.05%             |    +53.35%
---------------------------------------------------------------
100-900   |    +36.11%             |    +52.56%
---------------------------------------------------------------
10-90     |     +2.47%             |     +4.05%
---------------------------------------------------------------

---------------------------------------------------------------
		fserver
---------------------------------------------------------------
# users   | avg % improvement with | avg % improvement with
          | 3.13 + patch 1, 2, 3   | 3.13 + patch 1, 2, 3, 4, 5
---------------------------------------------------------------
1000-2000 |    +18.31%             |    +37.65%
---------------------------------------------------------------
100-900   |     +5.99%             |    +17.50%
---------------------------------------------------------------
10-90     |     +2.47%             |     +6.10%

^ permalink raw reply	[flat|nested] 51+ messages in thread