* Batch locking for rmap fork/exit processing
@ 2011-05-05 19:32 Andi Kleen
  2011-05-05 19:32 ` [PATCH 1/4] VM/RMAP: Add infrastructure for batching the rmap chain locking Andi Kleen
                   ` (4 more replies)
  0 siblings, 5 replies; 21+ messages in thread
From: Andi Kleen @ 2011-05-05 19:32 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrea Arcangeli, Rik van Riel, tim.c.chen, torvalds, lwoodman, mel

Commit 012f18004da33ba67 in 2.6.36 caused a significant performance regression in
fork/exit intensive workloads with a lot of sharing. The problem is that
fork/exit now contend heavily on the lock of the root anon_vma.

This patchkit attempts to lower this a bit by batching the lock acquisitions.
Right now the lock is taken for every shared vma individually; this
patchkit batches that and only reacquires the lock when actually needed.

When multiple processes are doing this in parallel, they will now
spend much less time bouncing the lock cache line around. In addition
there should also be lower overhead in the uncontended case, because
lock operations are relatively slow (not measured).

This doesn't completely fix the regression on a 4S system, but cuts 
it down somewhat. One particular workload suffering from this gets
about 5% faster.

This is essentially a micro-optimization that just tries to mitigate
the problem a bit.

Better would be to switch back to the more local locking that .35 had, but I
guess then we would be back to the old deadlocks? I was also thinking of
adding some deadlock avoidance as an alternative.

-Andi

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* [PATCH 1/4] VM/RMAP: Add infrastructure for batching the rmap chain locking
  2011-05-05 19:32 Batch locking for rmap fork/exit processing Andi Kleen
@ 2011-05-05 19:32 ` Andi Kleen
  2011-05-06 15:59   ` Rik van Riel
  2011-05-09 21:43   ` Andrew Morton
  2011-05-05 19:32 ` [PATCH 2/4] VM/RMAP: Batch anon vma chain root locking in fork Andi Kleen
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 21+ messages in thread
From: Andi Kleen @ 2011-05-05 19:32 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrea Arcangeli, Rik van Riel, tim.c.chen, torvalds, lwoodman,
	mel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

In fork and exit it's quite common to take the same rmap chain locks
again and again when the whole address space is processed for an
address space that has a lot of sharing. Also, since the locking
has changed to always lock the root anon_vma, this lock can be very
contended.

This patch adds a simple wrapper to batch these lock acquisitions
and only reacquire the lock when a different one is needed. The main
advantage is that when multiple processes are doing this in
parallel they will avoid a lot of communication overhead
on the lock cache line.

I added a simple lock break (every 500 locks) for paranoia reasons,
but it's unclear whether that's needed or not.

Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 include/linux/rmap.h |   38 ++++++++++++++++++++++++++++++++++++++
 1 files changed, 38 insertions(+), 0 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 830e65d..d5bb9f8 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -113,6 +113,44 @@ static inline void anon_vma_unlock(struct anon_vma *anon_vma)
 	spin_unlock(&anon_vma->root->lock);
 }
 
+/* 
+ * Batched locking for anon VMA chains to avoid too much cache line 
+ * bouncing.
+ */
+
+#define AVL_LOCKBREAK 500
+
+struct anon_vma_lock_state {
+	struct anon_vma *root_anon_vma;
+	int counter;
+};
+
+static inline void init_anon_vma_lock_batch(struct anon_vma_lock_state *avs)
+{
+	avs->root_anon_vma = NULL;
+	avs->counter = 0;
+}
+
+static inline void anon_vma_lock_batch(struct anon_vma *anon_vma,
+				       struct anon_vma_lock_state *state)
+{
+	if (state->root_anon_vma == anon_vma->root &&
+	    state->counter++ < AVL_LOCKBREAK)
+		return;
+	if (state->root_anon_vma) {
+		state->counter = 0;
+		spin_unlock(&state->root_anon_vma->lock);
+	}
+	state->root_anon_vma = anon_vma->root;
+	spin_lock(&state->root_anon_vma->lock);
+}
+
+static inline void anon_vma_unlock_batch(struct anon_vma_lock_state *avs)
+{
+	if (avs->root_anon_vma)
+		spin_unlock(&avs->root_anon_vma->lock);
+}
+
 /*
  * anon_vma helper functions.
  */
-- 
1.7.4.4


* [PATCH 2/4] VM/RMAP: Batch anon vma chain root locking in fork
  2011-05-05 19:32 Batch locking for rmap fork/exit processing Andi Kleen
  2011-05-05 19:32 ` [PATCH 1/4] VM/RMAP: Add infrastructure for batching the rmap chain locking Andi Kleen
@ 2011-05-05 19:32 ` Andi Kleen
  2011-05-06 15:59   ` Rik van Riel
  2011-05-05 19:32 ` [PATCH 3/4] VM/RMAP: Batch anon_vma_unlink in exit Andi Kleen
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 21+ messages in thread
From: Andi Kleen @ 2011-05-05 19:32 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrea Arcangeli, Rik van Riel, tim.c.chen, torvalds, lwoodman,
	mel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

We found that the changes to take the anon vma root chain lock led
to excessive lock contention on a fork intensive workload on a 4S
system.

Use the new batch lock infrastructure to optimize the fork()
path, where it is very common to repeatedly acquire the same lock.

This patch does not really lower the contention, but batches
the lock taking/releasing to lower the bouncing overhead when
multiple forks are working at the same time. Essentially each
user will get more work done inside a locking region.

Reported-by: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 mm/rmap.c |   69 +++++++++++++++++++++++++++++++++++++++++++-----------------
 1 files changed, 49 insertions(+), 20 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 8da044a..fbac55a 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -177,44 +177,72 @@ int anon_vma_prepare(struct vm_area_struct *vma)
 	return -ENOMEM;
 }
 
+/* Caller must call anon_vma_unlock_batch. */
 static void anon_vma_chain_link(struct vm_area_struct *vma,
 				struct anon_vma_chain *avc,
-				struct anon_vma *anon_vma)
+				struct anon_vma *anon_vma,
+				struct anon_vma_lock_state *avs)
 {
 	avc->vma = vma;
 	avc->anon_vma = anon_vma;
 	list_add(&avc->same_vma, &vma->anon_vma_chain);
 
-	anon_vma_lock(anon_vma);
+	anon_vma_lock_batch(anon_vma, avs);
 	/*
 	 * It's critical to add new vmas to the tail of the anon_vma,
 	 * see comment in huge_memory.c:__split_huge_page().
 	 */
 	list_add_tail(&avc->same_anon_vma, &anon_vma->head);
-	anon_vma_unlock(anon_vma);
+	/* unlock in caller */
 }
 
+
 /*
  * Attach the anon_vmas from src to dst.
  * Returns 0 on success, -ENOMEM on failure.
+ * Caller must call anon_vma_unlock_batch.
  */
-int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src)
+static int anon_vma_clone_batch(struct vm_area_struct *dst, 
+				struct vm_area_struct *src,
+				struct anon_vma_lock_state *avs)
 {
-	struct anon_vma_chain *avc, *pavc;
-
+	struct anon_vma_chain *avc, *pavc, *avc_next;
+	LIST_HEAD(head);
+	
+	/* First allocate with sleeping */
 	list_for_each_entry_reverse(pavc, &src->anon_vma_chain, same_vma) {
 		avc = anon_vma_chain_alloc();
 		if (!avc)
 			goto enomem_failure;
-		anon_vma_chain_link(dst, avc, pavc->anon_vma);
+		list_add_tail(&avc->same_anon_vma, &head);
+	}	
+
+	/* Now take locks and link in */
+	init_anon_vma_lock_batch(avs);
+	avc = list_first_entry(&head, struct anon_vma_chain, same_anon_vma);
+	list_for_each_entry_reverse(pavc, &src->anon_vma_chain, same_vma) {
+		avc_next = list_entry(avc->same_anon_vma.next, 
+				      struct anon_vma_chain,
+				      same_anon_vma);
+		anon_vma_chain_link(dst, avc, pavc->anon_vma, avs);
+		avc = avc_next;
 	}
 	return 0;
 
  enomem_failure:
-	unlink_anon_vmas(dst);
+	list_for_each_entry_safe(avc, avc_next, &head, same_anon_vma)
+		anon_vma_chain_free(avc);
 	return -ENOMEM;
 }
 
+int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src)
+{
+	struct anon_vma_lock_state avs;
+	int n = anon_vma_clone_batch(dst, src, &avs);
+	anon_vma_unlock_batch(&avs);
+	return n;
+}
+
 /*
  * Attach vma to its own anon_vma, as well as to the anon_vmas that
  * the corresponding VMA in the parent process is attached to.
@@ -224,27 +252,27 @@ int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma)
 {
 	struct anon_vma_chain *avc;
 	struct anon_vma *anon_vma;
+	struct anon_vma_lock_state avs;
 
 	/* Don't bother if the parent process has no anon_vma here. */
 	if (!pvma->anon_vma)
 		return 0;
 
-	/*
-	 * First, attach the new VMA to the parent VMA's anon_vmas,
-	 * so rmap can find non-COWed pages in child processes.
-	 */
-	if (anon_vma_clone(vma, pvma))
-		return -ENOMEM;
-
-	/* Then add our own anon_vma. */
 	anon_vma = anon_vma_alloc();
 	if (!anon_vma)
-		goto out_error;
+		return -ENOMEM;
 	avc = anon_vma_chain_alloc();
 	if (!avc)
 		goto out_error_free_anon_vma;
 
 	/*
+	 * First, attach the new VMA to the parent VMA's anon_vmas,
+	 * so rmap can find non-COWed pages in child processes.
+	 */
+	if (anon_vma_clone_batch(vma, pvma, &avs))
+		goto out_error_free_vma_chain;
+
+	/*
 	 * The root anon_vma's spinlock is the lock actually used when we
 	 * lock any of the anon_vmas in this anon_vma tree.
 	 */
@@ -257,14 +285,15 @@ int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma)
 	get_anon_vma(anon_vma->root);
 	/* Mark this anon_vma as the one where our new (COWed) pages go. */
 	vma->anon_vma = anon_vma;
-	anon_vma_chain_link(vma, avc, anon_vma);
+	anon_vma_chain_link(vma, avc, anon_vma, &avs);
+	anon_vma_unlock_batch(&avs);
 
 	return 0;
 
+ out_error_free_vma_chain:
+	anon_vma_chain_free(avc);
  out_error_free_anon_vma:
 	put_anon_vma(anon_vma);
- out_error:
-	unlink_anon_vmas(vma);
 	return -ENOMEM;
 }
 
-- 
1.7.4.4


* [PATCH 3/4] VM/RMAP: Batch anon_vma_unlink in exit
  2011-05-05 19:32 Batch locking for rmap fork/exit processing Andi Kleen
  2011-05-05 19:32 ` [PATCH 1/4] VM/RMAP: Add infrastructure for batching the rmap chain locking Andi Kleen
  2011-05-05 19:32 ` [PATCH 2/4] VM/RMAP: Batch anon vma chain root locking in fork Andi Kleen
@ 2011-05-05 19:32 ` Andi Kleen
  2011-05-06 16:48   ` Rik van Riel
  2011-05-05 19:32 ` [PATCH 4/4] VM/RMAP: Move avc freeing outside the lock Andi Kleen
  2011-05-06 16:44 ` Batch locking for rmap fork/exit processing Linus Torvalds
  4 siblings, 1 reply; 21+ messages in thread
From: Andi Kleen @ 2011-05-05 19:32 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrea Arcangeli, Rik van Riel, tim.c.chen, torvalds, lwoodman,
	mel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Apply the rmap chain lock batching to anon_vma_unlink() too.
This speeds up exit() on process chains with many processes,
when there is a lot of sharing.

Unfortunately this doesn't fix all lock contention -- file vmas
have a mapping lock that is also a problem. And even existing
anon_vmas still contend. But it's better than before.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 mm/rmap.c |   14 +++++++++-----
 1 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index fbac55a..2076d78 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -297,7 +297,9 @@ int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma)
 	return -ENOMEM;
 }
 
-static void anon_vma_unlink(struct anon_vma_chain *anon_vma_chain)
+/* Caller must call anon_vma_unlock_batch */
+static void anon_vma_unlink(struct anon_vma_chain *anon_vma_chain,
+			    struct anon_vma_lock_state *avs)
 {
 	struct anon_vma *anon_vma = anon_vma_chain->anon_vma;
 	int empty;
@@ -306,12 +308,11 @@ static void anon_vma_unlink(struct anon_vma_chain *anon_vma_chain)
 	if (!anon_vma)
 		return;
 
-	anon_vma_lock(anon_vma);
+	anon_vma_lock_batch(anon_vma, avs);
 	list_del(&anon_vma_chain->same_anon_vma);
 
 	/* We must garbage collect the anon_vma if it's empty */
 	empty = list_empty(&anon_vma->head);
-	anon_vma_unlock(anon_vma);
 
 	if (empty)
 		put_anon_vma(anon_vma);
@@ -320,16 +321,19 @@ static void anon_vma_unlink(struct anon_vma_chain *anon_vma_chain)
 void unlink_anon_vmas(struct vm_area_struct *vma)
 {
 	struct anon_vma_chain *avc, *next;
-
+	struct anon_vma_lock_state avs;
+	
 	/*
 	 * Unlink each anon_vma chained to the VMA.  This list is ordered
 	 * from newest to oldest, ensuring the root anon_vma gets freed last.
 	 */
+	init_anon_vma_lock_batch(&avs);
 	list_for_each_entry_safe(avc, next, &vma->anon_vma_chain, same_vma) {
-		anon_vma_unlink(avc);
+		anon_vma_unlink(avc, &avs);
 		list_del(&avc->same_vma);
 		anon_vma_chain_free(avc);
 	}
+	anon_vma_unlock_batch(&avs);
 }
 
 static void anon_vma_ctor(void *data)
-- 
1.7.4.4


* [PATCH 4/4] VM/RMAP: Move avc freeing outside the lock
  2011-05-05 19:32 Batch locking for rmap fork/exit processing Andi Kleen
                   ` (2 preceding siblings ...)
  2011-05-05 19:32 ` [PATCH 3/4] VM/RMAP: Batch anon_vma_unlink in exit Andi Kleen
@ 2011-05-05 19:32 ` Andi Kleen
  2011-05-06 16:52   ` Rik van Riel
  2011-05-06 16:44 ` Batch locking for rmap fork/exit processing Linus Torvalds
  4 siblings, 1 reply; 21+ messages in thread
From: Andi Kleen @ 2011-05-05 19:32 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrea Arcangeli, Rik van Riel, tim.c.chen, torvalds, lwoodman,
	mel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Now that the avc locking is batched, move the freeing of AVCs
outside the lock. This lowers lock contention somewhat more on
a fork/exit intensive workload.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 mm/rmap.c |   24 ++++++++++++++----------
 1 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 2076d78..92070f4 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -302,7 +302,6 @@ static void anon_vma_unlink(struct anon_vma_chain *anon_vma_chain,
 			    struct anon_vma_lock_state *avs)
 {
 	struct anon_vma *anon_vma = anon_vma_chain->anon_vma;
-	int empty;
 
 	/* If anon_vma_fork fails, we can get an empty anon_vma_chain. */
 	if (!anon_vma)
@@ -310,19 +309,14 @@ static void anon_vma_unlink(struct anon_vma_chain *anon_vma_chain,
 
 	anon_vma_lock_batch(anon_vma, avs);
 	list_del(&anon_vma_chain->same_anon_vma);
-
-	/* We must garbage collect the anon_vma if it's empty */
-	empty = list_empty(&anon_vma->head);
-
-	if (empty)
-		put_anon_vma(anon_vma);
 }
 
 void unlink_anon_vmas(struct vm_area_struct *vma)
 {
 	struct anon_vma_chain *avc, *next;
 	struct anon_vma_lock_state avs;
-	
+	LIST_HEAD(avmas);
+
 	/*
 	 * Unlink each anon_vma chained to the VMA.  This list is ordered
 	 * from newest to oldest, ensuring the root anon_vma gets freed last.
@@ -330,10 +324,20 @@ void unlink_anon_vmas(struct vm_area_struct *vma)
 	init_anon_vma_lock_batch(&avs);
 	list_for_each_entry_safe(avc, next, &vma->anon_vma_chain, same_vma) {
 		anon_vma_unlink(avc, &avs);
-		list_del(&avc->same_vma);
-		anon_vma_chain_free(avc);
+		list_move(&avc->same_vma, &avmas);
 	}
 	anon_vma_unlock_batch(&avs);
+
+	/* Now free them outside the lock */
+	list_for_each_entry_safe(avc, next, &avmas, same_vma) {
+		/*
+		 * The list_empty check can be done locklessly because
+		 * once the list is empty no one will re-add to it.
+		 */
+		if (list_empty(&avc->anon_vma->head))
+			put_anon_vma(avc->anon_vma);
+		anon_vma_chain_free(avc);		
+	}
 }
 
 static void anon_vma_ctor(void *data)
-- 
1.7.4.4


* Re: [PATCH 1/4] VM/RMAP: Add infrastructure for batching the rmap chain locking
  2011-05-05 19:32 ` [PATCH 1/4] VM/RMAP: Add infrastructure for batching the rmap chain locking Andi Kleen
@ 2011-05-06 15:59   ` Rik van Riel
  2011-05-09 21:43   ` Andrew Morton
  1 sibling, 0 replies; 21+ messages in thread
From: Rik van Riel @ 2011-05-06 15:59 UTC (permalink / raw)
  To: Andi Kleen
  Cc: linux-mm, Andrea Arcangeli, tim.c.chen, torvalds, lwoodman, mel,
	Andi Kleen

On 05/05/2011 03:32 PM, Andi Kleen wrote:
> From: Andi Kleen<ak@linux.intel.com>
>
> In fork and exit it's quite common to take the same rmap chain locks
> again and again when the whole address space is processed for an
> address space that has a lot of sharing. Also, since the locking
> has changed to always lock the root anon_vma, this lock can be very
> contended.
>
> This patch adds a simple wrapper to batch these lock acquisitions
> and only reacquire the lock when a different one is needed. The main
> advantage is that when multiple processes are doing this in
> parallel they will avoid a lot of communication overhead
> on the lock cache line.
>
> I added a simple lock break (every 500 locks) for paranoia reasons,
> but it's unclear whether that's needed or not.
>
> Cc: Andrea Arcangeli<aarcange@redhat.com>
> Cc: Rik van Riel<riel@redhat.com>
> Signed-off-by: Andi Kleen<ak@linux.intel.com>

Reviewed-by: Rik van Riel<riel@redhat.com>

-- 
All rights reversed


* Re: [PATCH 2/4] VM/RMAP: Batch anon vma chain root locking in fork
  2011-05-05 19:32 ` [PATCH 2/4] VM/RMAP: Batch anon vma chain root locking in fork Andi Kleen
@ 2011-05-06 15:59   ` Rik van Riel
  0 siblings, 0 replies; 21+ messages in thread
From: Rik van Riel @ 2011-05-06 15:59 UTC (permalink / raw)
  To: Andi Kleen
  Cc: linux-mm, Andrea Arcangeli, tim.c.chen, torvalds, lwoodman, mel,
	Andi Kleen

On 05/05/2011 03:32 PM, Andi Kleen wrote:
> From: Andi Kleen<ak@linux.intel.com>
>
> We found that the changes to take the anon vma root chain lock led
> to excessive lock contention on a fork intensive workload on a 4S
> system.
>
> Use the new batch lock infrastructure to optimize the fork()
> path, where it is very common to repeatedly acquire the same lock.
>
> This patch does not really lower the contention, but batches
> the lock taking/releasing to lower the bouncing overhead when
> multiple forks are working at the same time. Essentially each
> user will get more work done inside a locking region.
>
> Reported-by: Tim Chen<tim.c.chen@linux.intel.com>
> Cc: Andrea Arcangeli<aarcange@redhat.com>
> Cc: Rik van Riel<riel@redhat.com>
> Signed-off-by: Andi Kleen<ak@linux.intel.com>

Reviewed-by: Rik van Riel<riel@redhat.com>

-- 
All rights reversed


* Re: Batch locking for rmap fork/exit processing
  2011-05-05 19:32 Batch locking for rmap fork/exit processing Andi Kleen
                   ` (3 preceding siblings ...)
  2011-05-05 19:32 ` [PATCH 4/4] VM/RMAP: Move avc freeing outside the lock Andi Kleen
@ 2011-05-06 16:44 ` Linus Torvalds
  2011-05-06 17:05   ` Andi Kleen
  4 siblings, 1 reply; 21+ messages in thread
From: Linus Torvalds @ 2011-05-06 16:44 UTC (permalink / raw)
  To: Andi Kleen, Andrew Morton
  Cc: linux-mm, Andrea Arcangeli, Rik van Riel, tim.c.chen, lwoodman, mel

Hmm. Andrew wasn't cc'd on this series, and usually things like this
go through the -mm tree.

Maybe Andrew saw it by virtue of the linux-mm list, but maybe he
didn't. So here he is cc'd directly.

The series looks reasonable to me,

                         Linus

On Thu, May 5, 2011 at 12:32 PM, Andi Kleen <andi@firstfloor.org> wrote:
> Commit 012f18004da33ba67 in 2.6.36 caused a significant performance regression in
> fork/exit intensive workloads with a lot of sharing. The problem is that
> fork/exit now contend heavily on the lock of the root anon_vma.
>
> This patchkit attempts to lower this a bit by batching the lock acquisitions.
> Right now the lock is taken for every shared vma individually; this
> patchkit batches that and only reacquires the lock when actually needed.
>
> When multiple processes are doing this in parallel, they will now
> spend much less time bouncing the lock cache line around. In addition
> there should also be lower overhead in the uncontended case, because
> lock operations are relatively slow (not measured).
>
> This doesn't completely fix the regression on a 4S system, but cuts
> it down somewhat. One particular workload suffering from this gets
> about 5% faster.
>
> This is essentially a micro-optimization that just tries to mitigate
> the problem a bit.
>
> Better would be to switch back to the more local locking that .35 had, but I
> guess then we would be back to the old deadlocks? I was also thinking of
> adding some deadlock avoidance as an alternative.
>
> -Andi
>


* Re: [PATCH 3/4] VM/RMAP: Batch anon_vma_unlink in exit
  2011-05-05 19:32 ` [PATCH 3/4] VM/RMAP: Batch anon_vma_unlink in exit Andi Kleen
@ 2011-05-06 16:48   ` Rik van Riel
  0 siblings, 0 replies; 21+ messages in thread
From: Rik van Riel @ 2011-05-06 16:48 UTC (permalink / raw)
  To: Andi Kleen
  Cc: linux-mm, Andrea Arcangeli, tim.c.chen, torvalds, lwoodman, mel,
	Andi Kleen

On 05/05/2011 03:32 PM, Andi Kleen wrote:
> From: Andi Kleen<ak@linux.intel.com>
>
> Apply the rmap chain lock batching to anon_vma_unlink() too.
> This speeds up exit() on process chains with many processes,
> when there is a lot of sharing.
>
> Unfortunately this doesn't fix all lock contention -- file vmas
> have a mapping lock that is also a problem. And even existing
> anon_vmas still contend. But it's better than before.
>
> Signed-off-by: Andi Kleen<ak@linux.intel.com>

Reviewed-by: Rik van Riel<riel@redhat.com>

-- 
All rights reversed


* Re: [PATCH 4/4] VM/RMAP: Move avc freeing outside the lock
  2011-05-05 19:32 ` [PATCH 4/4] VM/RMAP: Move avc freeing outside the lock Andi Kleen
@ 2011-05-06 16:52   ` Rik van Riel
  0 siblings, 0 replies; 21+ messages in thread
From: Rik van Riel @ 2011-05-06 16:52 UTC (permalink / raw)
  To: Andi Kleen
  Cc: linux-mm, Andrea Arcangeli, tim.c.chen, torvalds, lwoodman, mel,
	Andi Kleen

On 05/05/2011 03:32 PM, Andi Kleen wrote:
> From: Andi Kleen<ak@linux.intel.com>
>
> Now that the avc locking is batched, move the freeing of AVCs
> outside the lock. This lowers lock contention somewhat more on
> a fork/exit intensive workload.
>
> Signed-off-by: Andi Kleen<ak@linux.intel.com>

Acked-by: Rik van Riel<riel@redhat.com>

I believe that calling put_anon_vma outside of the lock is
safe, but am not 100% sure.

-- 
All rights reversed


* Re: Batch locking for rmap fork/exit processing
  2011-05-06 16:44 ` Batch locking for rmap fork/exit processing Linus Torvalds
@ 2011-05-06 17:05   ` Andi Kleen
  0 siblings, 0 replies; 21+ messages in thread
From: Andi Kleen @ 2011-05-06 17:05 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andi Kleen, Andrew Morton, linux-mm, Andrea Arcangeli,
	Rik van Riel, tim.c.chen, lwoodman, mel

On Fri, May 06, 2011 at 09:44:40AM -0700, Linus Torvalds wrote:
> Hmm. Andrew wasn't cc'd on this series, and usually things like this
> go through the -mm tree.
> 
> Maybe Andrew saw it by virtue of the linux-mm list, but maybe he
> didn't. So here he is cc'd directly.
> 
> The series looks reasonable to me,

Thanks.

To be honest I'm not really happy about it; it's more a short-term
bandaid. The anon_vma lock and the file mapping i_mmap_lock are still very hot
in the test compared to .35, and I didn't find any way to improve
i_mmap_lock.

If someone has better ideas to fix this I would be interested.

Still, it's probably a useful short-term bandaid.

-Andi


* Re: [PATCH 1/4] VM/RMAP: Add infrastructure for batching the rmap chain locking
  2011-05-05 19:32 ` [PATCH 1/4] VM/RMAP: Add infrastructure for batching the rmap chain locking Andi Kleen
  2011-05-06 15:59   ` Rik van Riel
@ 2011-05-09 21:43   ` Andrew Morton
  2011-05-09 22:23     ` Andi Kleen
  1 sibling, 1 reply; 21+ messages in thread
From: Andrew Morton @ 2011-05-09 21:43 UTC (permalink / raw)
  To: Andi Kleen
  Cc: linux-mm, Andrea Arcangeli, Rik van Riel, tim.c.chen, torvalds,
	lwoodman, mel, Andi Kleen

On Thu,  5 May 2011 12:32:49 -0700
Andi Kleen <andi@firstfloor.org> wrote:

> From: Andi Kleen <ak@linux.intel.com>
> 
> In fork and exit it's quite common to take the same rmap chain locks
> again and again when the whole address space is processed for an
> address space that has a lot of sharing. Also, since the locking
> has changed to always lock the root anon_vma, this lock can be very
> contended.
> 
> This patch adds a simple wrapper to batch these lock acquisitions
> and only reacquire the lock when a different one is needed. The main
> advantage is that when multiple processes are doing this in
> parallel they will avoid a lot of communication overhead
> on the lock cache line.
> 
> I added a simple lock break (every 500 locks) for paranoia reasons,
> but it's unclear whether that's needed or not.
> 
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Rik van Riel <riel@redhat.com>
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> ---
>  include/linux/rmap.h |   38 ++++++++++++++++++++++++++++++++++++++
>  1 files changed, 38 insertions(+), 0 deletions(-)
> 
> diff --git a/include/linux/rmap.h b/include/linux/rmap.h
> index 830e65d..d5bb9f8 100644
> --- a/include/linux/rmap.h
> +++ b/include/linux/rmap.h
> @@ -113,6 +113,44 @@ static inline void anon_vma_unlock(struct anon_vma *anon_vma)
>  	spin_unlock(&anon_vma->root->lock);
>  }
>  
> +/* 
> + * Batched locking for anon VMA chains to avoid too much cache line 
> + * bouncing.
> + */
> +
> +#define AVL_LOCKBREAK 500
> +
> +struct anon_vma_lock_state {
> +	struct anon_vma *root_anon_vma;
> +	int counter;
> +};
> +
> +static inline void init_anon_vma_lock_batch(struct anon_vma_lock_state *avs)
> +{
> +	avs->root_anon_vma = NULL;
> +	avs->counter = 0;
> +}

This should be called anon_vma_lock_batch_init().

> +static inline void anon_vma_lock_batch(struct anon_vma *anon_vma,
> +				       struct anon_vma_lock_state *state)
> +{
> +	if (state->root_anon_vma == anon_vma->root &&
> +	    state->counter++ < AVL_LOCKBREAK)
> +		return;
> +	if (state->root_anon_vma) {
> +		state->counter = 0;
> +		spin_unlock(&state->root_anon_vma->lock);
> +	}
> +	state->root_anon_vma = anon_vma->root;
> +	spin_lock(&state->root_anon_vma->lock);
> +}

hm, that's a bit large for inlining.

> +static inline void anon_vma_unlock_batch(struct anon_vma_lock_state *avs)
> +{
> +	if (avs->root_anon_vma)
> +		spin_unlock(&avs->root_anon_vma->lock);
> +}
> +
>  /*
>   * anon_vma helper functions.
>   */

The code doesn't build - the patchset forgot to add `spinlock_t lock'
to the anon_vma.

After fixing that and doing an allnoconfig x86_64 build, the patchset
takes rmap.o's .text from 6167 bytes to 6551.  This is likely to be a
regression for uniprocessor machines.  What can we do about this?




* Re: [PATCH 1/4] VM/RMAP: Add infrastructure for batching the rmap chain locking
  2011-05-09 21:43   ` Andrew Morton
@ 2011-05-09 22:23     ` Andi Kleen
  2011-05-09 22:28       ` Andrew Morton
  0 siblings, 1 reply; 21+ messages in thread
From: Andi Kleen @ 2011-05-09 22:23 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, linux-mm, Andrea Arcangeli, Rik van Riel, tim.c.chen,
	torvalds, lwoodman, mel


>> +static inline void anon_vma_unlock_batch(struct anon_vma_lock_state *avs)
>> +{
>> +	if (avs->root_anon_vma)
>> +		spin_unlock(&avs->root_anon_vma->lock);
>> +}
>> +
>>   /*
>>    * anon_vma helper functions.
>>    */
> The code doesn't build - the patchset forgot to add `spinlock_t lock'
> to the anon_vma.

Hmm, maybe I made a mistake in refactoring.

> After fixing that and doing an allnoconfig x86_64 build, the patchset
> takes rmap.o's .text from 6167 bytes to 6551.  This is likely to be a
> regression for uniprocessor machines.  What can we do about this?
>

Regression in what way? I guess I can move some of the functions out of 
line.

-Andi



* Re: [PATCH 1/4] VM/RMAP: Add infrastructure for batching the rmap chain locking
  2011-05-09 22:23     ` Andi Kleen
@ 2011-05-09 22:28       ` Andrew Morton
  2011-05-09 23:02         ` Andi Kleen
  0 siblings, 1 reply; 21+ messages in thread
From: Andrew Morton @ 2011-05-09 22:28 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Andi Kleen, linux-mm, Andrea Arcangeli, Rik van Riel, tim.c.chen,
	torvalds, lwoodman, mel

On Mon, 09 May 2011 15:23:03 -0700
Andi Kleen <ak@linux.intel.com> wrote:

> > After fixing that and doing an allnoconfig x86_64 build, the patchset
> > takes rmap.o's .text from 6167 bytes to 6551.  This is likely to be a
> > regression for uniprocessor machines.  What can we do about this?
> >
> 
> Regression in what way?

It makes the code larger and probably slower, for no gain?

> I guess I can move some of the functions out of 
> line.

I don't know how much that will help.  Perhaps a wholesale refactoring
and making it all SMP-only will be justified.  


* Re: [PATCH 1/4] VM/RMAP: Add infrastructure for batching the rmap chain locking
  2011-05-09 22:28       ` Andrew Morton
@ 2011-05-09 23:02         ` Andi Kleen
  2011-05-09 23:16           ` Andrew Morton
  0 siblings, 1 reply; 21+ messages in thread
From: Andi Kleen @ 2011-05-09 23:02 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Andi Kleen, linux-mm, Andrea Arcangeli, Rik van Riel,
	tim.c.chen, torvalds, lwoodman, mel

On Mon, May 09, 2011 at 03:28:41PM -0700, Andrew Morton wrote:
> On Mon, 09 May 2011 15:23:03 -0700
> Andi Kleen <ak@linux.intel.com> wrote:
> 
> > > After fixing that and doing an allnoconfig x86_64 build, the patchset
> > > takes rmap.o's .text from 6167 bytes to 6551.  This is likely to be a
> > > regression for uniprocessor machines.  What can we do about this?
> > >
> > 
> > Regression in what way?
> 
> It makes the code larger and probably slower, for no gain?

It should actually be faster because there are far fewer atomic ops.
Atomic ops are quite expensive -- especially on some older CPUs, even when
not contended.

> 
> > I guess I can move some of the functions out of 
> > line.
> 
> I don't know how much that will help.  Perhaps a wholesale refactoring
> and making it all SMP-only will be justified.  

Yes I don't think there were a lot of callers.

I can take out the lockbreak. I was a bit dubious on its utility
anyways.

-Andi


* Re: [PATCH 1/4] VM/RMAP: Add infrastructure for batching the rmap chain locking
  2011-05-09 23:02         ` Andi Kleen
@ 2011-05-09 23:16           ` Andrew Morton
  2011-05-10  0:05             ` Andi Kleen
  0 siblings, 1 reply; 21+ messages in thread
From: Andrew Morton @ 2011-05-09 23:16 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Andi Kleen, linux-mm, Andrea Arcangeli, Rik van Riel, tim.c.chen,
	torvalds, lwoodman, mel

On Tue, 10 May 2011 01:02:55 +0200
Andi Kleen <andi@firstfloor.org> wrote:

> On Mon, May 09, 2011 at 03:28:41PM -0700, Andrew Morton wrote:
> > On Mon, 09 May 2011 15:23:03 -0700
> > Andi Kleen <ak@linux.intel.com> wrote:
> > 
> > > > After fixing that and doing an allnoconfig x86_64 build, the patchset
> > > > takes rmap.o's .text from 6167 bytes to 6551.  This is likely to be a
> > > > regression for uniprocessor machines.  What can we do about this?
> > > >
> > > 
> > > Regression in what way?
> > 
> > It makes the code larger and probably slower, for no gain?
> 
> It should actually be faster because there are far fewer atomic ops.
> Atomic ops are quite expensive -- especially on some older CPUs, even when
> not contended.

hm, which atomic ops are those?  We shouldn't need buslocked operations
on UP.

> > 
> > > I guess I can move some of the functions out of 
> > > line.
> > 
> > I don't know how much that will help.  Perhaps a wholesale refactoring
> > and making it all SMP-only will be justified.  
> 
> Yes I don't think there were a lot of callers.
> 
> I can take out the lockbreak. I was a bit dubious on its utility
> anyways.

I guess that running something like latencytop with a suitably nasty
workload would permit detection of any problems.


* Re: [PATCH 1/4] VM/RMAP: Add infrastructure for batching the rmap chain locking
  2011-05-09 23:16           ` Andrew Morton
@ 2011-05-10  0:05             ` Andi Kleen
  2011-05-10  1:26               ` Andrew Morton
  0 siblings, 1 reply; 21+ messages in thread
From: Andi Kleen @ 2011-05-10  0:05 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Andi Kleen, linux-mm, Andrea Arcangeli, Rik van Riel,
	tim.c.chen, torvalds, lwoodman, mel

> hm, which atomic ops are those?  We shouldn't need buslocked operations
> on UP.

Good point. I forgot they're patched out.

Then it's likely somewhat slower, but I doubt it's a significant
slowdown. Essentially it's just a few more if ()s

-Andi


* Re: [PATCH 1/4] VM/RMAP: Add infrastructure for batching the rmap chain locking
  2011-05-10  0:05             ` Andi Kleen
@ 2011-05-10  1:26               ` Andrew Morton
  2011-05-10  1:27                 ` Rik van Riel
  0 siblings, 1 reply; 21+ messages in thread
From: Andrew Morton @ 2011-05-10  1:26 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Andi Kleen, linux-mm, Andrea Arcangeli, Rik van Riel, tim.c.chen,
	torvalds, lwoodman, mel

On Tue, 10 May 2011 02:05:17 +0200 Andi Kleen <andi@firstfloor.org> wrote:

> > hm, which atomic ops are those?  We shouldn't need buslocked operations
> > on UP.
> 
> Good point. I forgot they're patched out.
> 
> Then it's likely somewhat slower, but I doubt it's a significant
> slowdown. Essentially it's just a few more if ()s
> 

And presumably some more instruction cachelines here and there.  It'll
be small, but 100*small == large.  We can reduce the overhead to zero,
it's a question of how ugly the end result is.


* Re: [PATCH 1/4] VM/RMAP: Add infrastructure for batching the rmap chain locking
  2011-05-10  1:26               ` Andrew Morton
@ 2011-05-10  1:27                 ` Rik van Riel
  2011-05-10  1:48                   ` Andrew Morton
  0 siblings, 1 reply; 21+ messages in thread
From: Rik van Riel @ 2011-05-10  1:27 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Andi Kleen, linux-mm, Andrea Arcangeli, tim.c.chen,
	torvalds, lwoodman, mel

On 05/09/2011 09:26 PM, Andrew Morton wrote:
> On Tue, 10 May 2011 02:05:17 +0200 Andi Kleen<andi@firstfloor.org>  wrote:
>
>>> hm, which atomic ops are those?  We shouldn't need buslocked operations
>>> on UP.
>>
>> Good point. I forgot they're patched out.
>>
>> Then it's likely somewhat slower, but I doubt it's a significant
>> slowdown. Essentially it's just a few more if ()s
>
> And presumably some more instruction cachelines here and there.  It'll
> be small, but 100*small == large.  We can reduce the overhead to zero,
> it's a question of how ugly the end result is.

More instruction cachelines, but fewer pointer dereferences
due to not looking up the lock all the time (if the root
anon_vma is the same).

I suspect that will balance out pretty well.

-- 
All rights reversed


* Re: [PATCH 1/4] VM/RMAP: Add infrastructure for batching the rmap chain locking
  2011-05-10  1:27                 ` Rik van Riel
@ 2011-05-10  1:48                   ` Andrew Morton
  0 siblings, 0 replies; 21+ messages in thread
From: Andrew Morton @ 2011-05-10  1:48 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Andi Kleen, Andi Kleen, linux-mm, Andrea Arcangeli, tim.c.chen,
	torvalds, lwoodman, mel

On Mon, 09 May 2011 21:27:28 -0400 Rik van Riel <riel@redhat.com> wrote:

> On 05/09/2011 09:26 PM, Andrew Morton wrote:
> > On Tue, 10 May 2011 02:05:17 +0200 Andi Kleen<andi@firstfloor.org>  wrote:
> >
> >>> hm, which atomic ops are those?  We shouldn't need buslocked operations
> >>> on UP.
> >>
> >> Good point. I forgot they're patched out.
> >>
> >> Then it's likely somewhat slower, but I doubt it's a significant
> >> slowdown. Essentially it's just a few more if ()s
> >
> > And presumably some more instruction cachelines here and there.  It'll
> > be small, but 100*small == large.  We can reduce the overhead to zero,
> > it's a question of how ugly the end result is.
> 
> More instruction cachelines, but fewer pointer dereferences
> due to not looking up the lock all the time (if the root
> anon_vma is the same).
> 

Why would we be looking up a lock which doesn't exist?


* [PATCH 4/4] VM/RMAP: Move avc freeing outside the lock
  2011-05-13 23:46 Batch locking for rmap fork/exit processing v2 Andi Kleen
@ 2011-05-13 23:46 ` Andi Kleen
  0 siblings, 0 replies; 21+ messages in thread
From: Andi Kleen @ 2011-05-13 23:46 UTC (permalink / raw)
  To: akpm; +Cc: linux-mm, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Now that the avc locking is batched, move the freeing of AVCs
outside the lock. This lowers lock contention somewhat more on
a fork/exit intensive workload.

Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 mm/rmap.c |   24 ++++++++++++++----------
 1 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index d4d3680..57f8ba9 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -302,7 +302,6 @@ static void anon_vma_unlink(struct anon_vma_chain *anon_vma_chain,
 			    struct anon_vma_lock_state *avs)
 {
 	struct anon_vma *anon_vma = anon_vma_chain->anon_vma;
-	int empty;
 
 	/* If anon_vma_fork fails, we can get an empty anon_vma_chain. */
 	if (!anon_vma)
@@ -310,19 +309,14 @@ static void anon_vma_unlink(struct anon_vma_chain *anon_vma_chain,
 
 	anon_vma_lock_batch(anon_vma, avs);
 	list_del(&anon_vma_chain->same_anon_vma);
-
-	/* We must garbage collect the anon_vma if it's empty */
-	empty = list_empty(&anon_vma->head);
-
-	if (empty)
-		put_anon_vma(anon_vma);
 }
 
 void unlink_anon_vmas(struct vm_area_struct *vma)
 {
 	struct anon_vma_chain *avc, *next;
 	struct anon_vma_lock_state avs;
-	
+	LIST_HEAD(avmas);
+
 	/*
 	 * Unlink each anon_vma chained to the VMA.  This list is ordered
 	 * from newest to oldest, ensuring the root anon_vma gets freed last.
@@ -330,10 +324,20 @@ void unlink_anon_vmas(struct vm_area_struct *vma)
 	anon_vma_lock_batch_init(&avs);
 	list_for_each_entry_safe(avc, next, &vma->anon_vma_chain, same_vma) {
 		anon_vma_unlink(avc, &avs);
-		list_del(&avc->same_vma);
-		anon_vma_chain_free(avc);
+		list_move(&avc->same_vma, &avmas);
 	}
 	anon_vma_unlock_batch(&avs);
+
+	/* Now free them outside the lock */
+	list_for_each_entry_safe(avc, next, &avmas, same_vma) {
+		/*
+		 * The list_empty check can be done locklessly because
+		 * once the list is empty no one will re-add to it.
+		 */
+		if (list_empty(&avc->anon_vma->head))
+			put_anon_vma(avc->anon_vma);
+		anon_vma_chain_free(avc);		
+	}
 }
 
 static void anon_vma_ctor(void *data)
-- 
1.7.4.4


end of thread, other threads:[~2011-05-13 23:47 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-05-05 19:32 Batch locking for rmap fork/exit processing Andi Kleen
2011-05-05 19:32 ` [PATCH 1/4] VM/RMAP: Add infrastructure for batching the rmap chain locking Andi Kleen
2011-05-06 15:59   ` Rik van Riel
2011-05-09 21:43   ` Andrew Morton
2011-05-09 22:23     ` Andi Kleen
2011-05-09 22:28       ` Andrew Morton
2011-05-09 23:02         ` Andi Kleen
2011-05-09 23:16           ` Andrew Morton
2011-05-10  0:05             ` Andi Kleen
2011-05-10  1:26               ` Andrew Morton
2011-05-10  1:27                 ` Rik van Riel
2011-05-10  1:48                   ` Andrew Morton
2011-05-05 19:32 ` [PATCH 2/4] VM/RMAP: Batch anon vma chain root locking in fork Andi Kleen
2011-05-06 15:59   ` Rik van Riel
2011-05-05 19:32 ` [PATCH 3/4] VM/RMAP: Batch anon_vma_unlink in exit Andi Kleen
2011-05-06 16:48   ` Rik van Riel
2011-05-05 19:32 ` [PATCH 4/4] VM/RMAP: Move avc freeing outside the lock Andi Kleen
2011-05-06 16:52   ` Rik van Riel
2011-05-06 16:44 ` Batch locking for rmap fork/exit processing Linus Torvalds
2011-05-06 17:05   ` Andi Kleen
2011-05-13 23:46 Batch locking for rmap fork/exit processing v2 Andi Kleen
2011-05-13 23:46 ` [PATCH 4/4] VM/RMAP: Move avc freeing outside the lock Andi Kleen
