* [PATCH 8/7] net/netfilter/nf_conntrack_core: Remove another memory barrier
@ 2016-09-01 15:27 Manfred Spraul
  2016-09-01 15:30 ` Will Deacon
From: Manfred Spraul @ 2016-09-01 15:27 UTC (permalink / raw)
  To: benh, paulmck, Ingo Molnar, Boqun Feng, Peter Zijlstra, Andrew Morton
  Cc: LKML, will.deacon, 1vier1, Davidlohr Bueso, Manfred Spraul,
	Pablo Neira Ayuso, netfilter-devel

Since spin_unlock_wait() is defined as equivalent to spin_lock();
spin_unlock(), the memory barrier before spin_unlock_wait() is
also not required.
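
For reference, the semantics that this reasoning assumes -- a sketch,
not any architecture's actual implementation:

	static inline void spin_unlock_wait(spinlock_t *lock)
	{
		spin_lock(lock);	/* ACQUIRE */
		spin_unlock(lock);	/* RELEASE */
	}

Under that assumption, the ACQUIRE on each nf_conntrack_locks[i] in
the loop below already orders the store of nf_conntrack_locks_all,
which is what the removed comment asked for.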

Not for stable!

Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: netfilter-devel@vger.kernel.org
---
 net/netfilter/nf_conntrack_core.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 7a3b5e6..0591a25 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -139,13 +139,7 @@ static void nf_conntrack_all_lock(void)
 
 	spin_lock(&nf_conntrack_locks_all_lock);
 
-	/*
-	 * Order the store of 'nf_conntrack_locks_all' against
-	 * the spin_unlock_wait() loads below, such that if
-	 * nf_conntrack_lock() observes 'nf_conntrack_locks_all'
-	 * we must observe nf_conntrack_locks[] held:
-	 */
-	smp_store_mb(nf_conntrack_locks_all, true);
+	nf_conntrack_locks_all = true;
 
 	for (i = 0; i < CONNTRACK_LOCKS; i++) {
 		spin_unlock_wait(&nf_conntrack_locks[i]);
-- 
2.7.4


* Re: [PATCH 8/7] net/netfilter/nf_conntrack_core: Remove another memory barrier
  2016-09-01 15:27 [PATCH 8/7] net/netfilter/nf_conntrack_core: Remove another memory barrier Manfred Spraul
@ 2016-09-01 15:30 ` Will Deacon
  2016-09-01 16:41   ` Peter Zijlstra
From: Will Deacon @ 2016-09-01 15:30 UTC (permalink / raw)
  To: Manfred Spraul
  Cc: benh, paulmck, Ingo Molnar, Boqun Feng, Peter Zijlstra,
	Andrew Morton, LKML, 1vier1, Davidlohr Bueso, Pablo Neira Ayuso,
	netfilter-devel

On Thu, Sep 01, 2016 at 05:27:52PM +0200, Manfred Spraul wrote:
> Since spin_unlock_wait() is defined as equivalent to spin_lock();
> spin_unlock(), the memory barrier before spin_unlock_wait() is
> also not required.
> 
> Not for stable!
> 
> Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
> Cc: Pablo Neira Ayuso <pablo@netfilter.org>
> Cc: netfilter-devel@vger.kernel.org
> ---
>  net/netfilter/nf_conntrack_core.c | 8 +-------
>  1 file changed, 1 insertion(+), 7 deletions(-)
> 
> diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
> index 7a3b5e6..0591a25 100644
> --- a/net/netfilter/nf_conntrack_core.c
> +++ b/net/netfilter/nf_conntrack_core.c
> @@ -139,13 +139,7 @@ static void nf_conntrack_all_lock(void)
>  
>  	spin_lock(&nf_conntrack_locks_all_lock);
>  
> -	/*
> -	 * Order the store of 'nf_conntrack_locks_all' against
> -	 * the spin_unlock_wait() loads below, such that if
> -	 * nf_conntrack_lock() observes 'nf_conntrack_locks_all'
> -	 * we must observe nf_conntrack_locks[] held:
> -	 */
> -	smp_store_mb(nf_conntrack_locks_all, true);
> +	nf_conntrack_locks_all = true;

Don't you at least need WRITE_ONCE if you're going to do this?
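
Presumably something like this sketch, i.e. still no barrier, but no
compiler tearing or fusing of the plain store:

	WRITE_ONCE(nf_conntrack_locks_all, true);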

Will


* Re: [PATCH 8/7] net/netfilter/nf_conntrack_core: Remove another memory barrier
  2016-09-01 15:30 ` Will Deacon
@ 2016-09-01 16:41   ` Peter Zijlstra
  2016-09-02  6:17     ` Boqun Feng
  2016-09-02  6:35     ` Manfred Spraul
From: Peter Zijlstra @ 2016-09-01 16:41 UTC (permalink / raw)
  To: Will Deacon
  Cc: Manfred Spraul, benh, paulmck, Ingo Molnar, Boqun Feng,
	Andrew Morton, LKML, 1vier1, Davidlohr Bueso, Pablo Neira Ayuso,
	netfilter-devel

On Thu, Sep 01, 2016 at 04:30:39PM +0100, Will Deacon wrote:
> On Thu, Sep 01, 2016 at 05:27:52PM +0200, Manfred Spraul wrote:
> > Since spin_unlock_wait() is defined as equivalent to spin_lock();
> > spin_unlock(), the memory barrier before spin_unlock_wait() is
> > also not required.

Note that ACQUIRE+RELEASE isn't a barrier.

Both are semi-permeable and things can cross in the middle, like:


	x = 1;
	LOCK
	UNLOCK
	r = y;

can (validly) get re-ordered like:

	LOCK
	r = y;
	x = 1;
	UNLOCK

So if you want things ordered, as I think you do, I think the smp_mb()
is still needed.

RELEASE + ACQUIRE otoh, that is a load-store barrier (but not
transitive).
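
A sketch of what the removed smp_store_mb() provided here --
smp_store_mb(v, x) is essentially WRITE_ONCE(v, x) followed by a full
smp_mb():

	WRITE_ONCE(nf_conntrack_locks_all, true);
	smp_mb();	/* full barrier: the store cannot slip into or
			 * past the critical sections below */
	for (i = 0; i < CONNTRACK_LOCKS; i++)
		spin_unlock_wait(&nf_conntrack_locks[i]);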


* Re: [PATCH 8/7] net/netfilter/nf_conntrack_core: Remove another memory barrier
  2016-09-01 16:41   ` Peter Zijlstra
@ 2016-09-02  6:17     ` Boqun Feng
  2016-09-02  6:35     ` Manfred Spraul
From: Boqun Feng @ 2016-09-02  6:17 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Will Deacon, Manfred Spraul, benh, paulmck, Ingo Molnar,
	Andrew Morton, LKML, 1vier1, Davidlohr Bueso, Pablo Neira Ayuso,
	netfilter-devel


Hi Manfred,

On Thu, Sep 01, 2016 at 06:41:26PM +0200, Peter Zijlstra wrote:
> On Thu, Sep 01, 2016 at 04:30:39PM +0100, Will Deacon wrote:
> > On Thu, Sep 01, 2016 at 05:27:52PM +0200, Manfred Spraul wrote:
> > > Since spin_unlock_wait() is defined as equivalent to spin_lock();
> > > spin_unlock(), the memory barrier before spin_unlock_wait() is
> > > also not required.
> 

As Peter said below, ACQUIRE+RELEASE is not a barrier.

What we rely on here is that spin_unlock_wait() can pair with another
LOCK or UNLOCK (as spin_unlock_wait() acts as spin_lock();
spin_unlock()). Once paired, we get the necessary ordering guarantee
between the code preceding or following spin_unlock_wait() and the
code inside the lock's critical sections.
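
A sketch of that pairing against the conntrack code from this patch,
assuming spin_unlock_wait() really does act as LOCK; UNLOCK on the
same lock:

	CPU1 (nf_conntrack_all_lock):
	nf_conntrack_locks_all = true;
	spin_unlock_wait(&nf_conntrack_locks[i]); /* = LOCK(i); UNLOCK(i) */

	CPU2 (nf_conntrack_lock):
	spin_lock(&nf_conntrack_locks[i]);	/* pairs with CPU1's UNLOCK(i) */
	if (nf_conntrack_locks_all)
		<slow path>	/* CPU1's store must be visible here */
	spin_unlock(&nf_conntrack_locks[i]);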

Regards,
Boqun

> Note that ACQUIRE+RELEASE isn't a barrier.
> 
> Both are semi-permeable and things can cross in the middle, like:
> 
> 
> 	x = 1;
> 	LOCK
> 	UNLOCK
> 	r = y;
> 
> can (validly) get re-ordered like:
> 
> 	LOCK
> 	r = y;
> 	x = 1;
> 	UNLOCK
> 
> So if you want things ordered, as I think you do, I think the smp_mb()
> is still needed.
> 
> RELEASE + ACQUIRE otoh, that is a load-store barrier (but not
> transitive).



* Re: [PATCH 8/7] net/netfilter/nf_conntrack_core: Remove another memory barrier
  2016-09-01 16:41   ` Peter Zijlstra
  2016-09-02  6:17     ` Boqun Feng
@ 2016-09-02  6:35     ` Manfred Spraul
  2016-09-02 19:22       ` Peter Zijlstra
From: Manfred Spraul @ 2016-09-02  6:35 UTC (permalink / raw)
  To: Peter Zijlstra, Will Deacon
  Cc: benh, paulmck, Ingo Molnar, Boqun Feng, Andrew Morton, LKML,
	1vier1, Davidlohr Bueso, Pablo Neira Ayuso, netfilter-devel

On 09/01/2016 06:41 PM, Peter Zijlstra wrote:
> On Thu, Sep 01, 2016 at 04:30:39PM +0100, Will Deacon wrote:
>> On Thu, Sep 01, 2016 at 05:27:52PM +0200, Manfred Spraul wrote:
>>> Since spin_unlock_wait() is defined as equivalent to spin_lock();
>>> spin_unlock(), the memory barrier before spin_unlock_wait() is
>>> also not required.
> Note that ACQUIRE+RELEASE isn't a barrier.
>
> Both are semi-permeable and things can cross in the middle, like:
>
>
> 	x = 1;
> 	LOCK
> 	UNLOCK
> 	r = y;
>
> can (validly) get re-ordered like:
>
> 	LOCK
> 	r = y;
> 	x = 1;
> 	UNLOCK
>
> So if you want things ordered, as I think you do, I think the smp_mb()
> is still needed.
CPU1:
x=1; /* without WRITE_ONCE */
LOCK(l);
UNLOCK(l);
<do_semop>
smp_store_release(x,0)


CPU2:
LOCK(l)
if (smp_load_acquire(x)==1) goto slow_path
<do_semop>
UNLOCK(l)

Ordering is enforced because both CPUs access the same lock.

x=1 can't be reordered past the UNLOCK(l); I don't see that any further
guarantees are necessary.

Correct?

--
     Manfred


* Re: [PATCH 8/7] net/netfilter/nf_conntrack_core: Remove another memory barrier
  2016-09-02  6:35     ` Manfred Spraul
@ 2016-09-02 19:22       ` Peter Zijlstra
  2016-09-03  5:33         ` Manfred Spraul
  2016-09-05 18:57         ` Manfred Spraul
From: Peter Zijlstra @ 2016-09-02 19:22 UTC (permalink / raw)
  To: Manfred Spraul
  Cc: Will Deacon, benh, paulmck, Ingo Molnar, Boqun Feng,
	Andrew Morton, LKML, 1vier1, Davidlohr Bueso, Pablo Neira Ayuso,
	netfilter-devel

On Fri, Sep 02, 2016 at 08:35:55AM +0200, Manfred Spraul wrote:
> On 09/01/2016 06:41 PM, Peter Zijlstra wrote:
> >On Thu, Sep 01, 2016 at 04:30:39PM +0100, Will Deacon wrote:
> >>On Thu, Sep 01, 2016 at 05:27:52PM +0200, Manfred Spraul wrote:
> >>>Since spin_unlock_wait() is defined as equivalent to spin_lock();
> >>>spin_unlock(), the memory barrier before spin_unlock_wait() is
> >>>also not required.
> >Note that ACQUIRE+RELEASE isn't a barrier.
> >
> >Both are semi-permeable and things can cross in the middle, like:
> >
> >
> >	x = 1;
> >	LOCK
> >	UNLOCK
> >	r = y;
> >
> >can (validly) get re-ordered like:
> >
> >	LOCK
> >	r = y;
> >	x = 1;
> >	UNLOCK
> >
> >So if you want things ordered, as I think you do, I think the smp_mb()
> >is still needed.
> CPU1:
> x=1; /* without WRITE_ONCE */
> LOCK(l);
> UNLOCK(l);
> <do_semop>
> smp_store_release(x,0)
> 
> 
> CPU2:
> LOCK(l)
> if (smp_load_acquire(x)==1) goto slow_path
> <do_semop>
> UNLOCK(l)
> 
> Ordering is enforced because both CPUs access the same lock.
> 
> x=1 can't be reordered past the UNLOCK(l); I don't see that any further
> guarantees are necessary.
> 
> Correct?

Correct, sadly implementations do not comply :/ In fact, even x86 is
broken here.

I spoke to Will earlier today and he suggests either making
spin_unlock_wait() stronger to avoid any and all such surprises or just
getting rid of the thing.

I'm not sure which way we should go, but please hold off on these two
patches until I've had a chance to audit all of those implementations
again.

I'll try and have a look at your other patches before that.


* Re: [PATCH 8/7] net/netfilter/nf_conntrack_core: Remove another memory barrier
  2016-09-02 19:22       ` Peter Zijlstra
@ 2016-09-03  5:33         ` Manfred Spraul
  2016-09-05 18:57         ` Manfred Spraul
From: Manfred Spraul @ 2016-09-03  5:33 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Will Deacon, benh, paulmck, Ingo Molnar, Boqun Feng,
	Andrew Morton, LKML, 1vier1, Davidlohr Bueso, Pablo Neira Ayuso,
	netfilter-devel

On 09/02/2016 09:22 PM, Peter Zijlstra wrote:
> On Fri, Sep 02, 2016 at 08:35:55AM +0200, Manfred Spraul wrote:
>> On 09/01/2016 06:41 PM, Peter Zijlstra wrote:
>>> On Thu, Sep 01, 2016 at 04:30:39PM +0100, Will Deacon wrote:
>>>> On Thu, Sep 01, 2016 at 05:27:52PM +0200, Manfred Spraul wrote:
>>>>> Since spin_unlock_wait() is defined as equivalent to spin_lock();
>>>>> spin_unlock(), the memory barrier before spin_unlock_wait() is
>>>>> also not required.
>>> Note that ACQUIRE+RELEASE isn't a barrier.
>>>
>>> Both are semi-permeable and things can cross in the middle, like:
>>>
>>>
>>> 	x = 1;
>>> 	LOCK
>>> 	UNLOCK
>>> 	r = y;
>>>
>>> can (validly) get re-ordered like:
>>>
>>> 	LOCK
>>> 	r = y;
>>> 	x = 1;
>>> 	UNLOCK
>>>
>>> So if you want things ordered, as I think you do, I think the smp_mb()
>>> is still needed.
>> CPU1:
>> x=1; /* without WRITE_ONCE */
>> LOCK(l);
>> UNLOCK(l);
>> <do_semop>
>> smp_store_release(x,0)
>>
>>
>> CPU2:
>> LOCK(l)
>> if (smp_load_acquire(x)==1) goto slow_path
>> <do_semop>
>> UNLOCK(l)
>>
>> Ordering is enforced because both CPUs access the same lock.
>>
>> x=1 can't be reordered past the UNLOCK(l); I don't see that any further
>> guarantees are necessary.
>>
>> Correct?
> Correct, sadly implementations do not comply :/ In fact, even x86 is
> broken here.
>
> I spoke to Will earlier today and he suggests either making
> spin_unlock_wait() stronger to avoid any and all such surprises or just
> getting rid of the thing.
>
> I'm not sure which way we should go, but please hold off on these two
> patches until I've had a chance to audit all of those implementations
> again.
For me, it doesn't really matter.
spin_unlock_wait() as "R", as "RAcq" or as "spin_lock(); spin_lock();" - 
I just want a usable definition for ipc/sem.c

So (just to keep Andrew updated):
Ready for merging (bugfixes, safe even with spin_unlock_wait() as just "R"):

- 45a449340cd1 ("ipc/sem.c: fix complex_count vs. simple op race")
   Cc stable, back to 3.10 ...
- 7fd5653d9986 ("net/netfilter/nf_conntrack_core: Fix memory barriers.")
   Cc stable, back to ~4.5

--
     Manfred


* Re: [PATCH 8/7] net/netfilter/nf_conntrack_core: Remove another memory barrier
  2016-09-02 19:22       ` Peter Zijlstra
  2016-09-03  5:33         ` Manfred Spraul
@ 2016-09-05 18:57         ` Manfred Spraul
  2016-09-06 17:56           ` Will Deacon
From: Manfred Spraul @ 2016-09-05 18:57 UTC (permalink / raw)
  To: Peter Zijlstra, Will Deacon, benh, paulmck
  Cc: Ingo Molnar, Boqun Feng, Andrew Morton, LKML, 1vier1,
	Davidlohr Bueso, Pablo Neira Ayuso, netfilter-devel


Hi Peter,

On 09/02/2016 09:22 PM, Peter Zijlstra wrote:
> On Fri, Sep 02, 2016 at 08:35:55AM +0200, Manfred Spraul wrote:
>> On 09/01/2016 06:41 PM, Peter Zijlstra wrote:
>>> On Thu, Sep 01, 2016 at 04:30:39PM +0100, Will Deacon wrote:
>>>> On Thu, Sep 01, 2016 at 05:27:52PM +0200, Manfred Spraul wrote:
>>>>> Since spin_unlock_wait() is defined as equivalent to spin_lock();
>>>>> spin_unlock(), the memory barrier before spin_unlock_wait() is
>>>>> also not required.
>>> Note that ACQUIRE+RELEASE isn't a barrier.
>>>
>>> Both are semi-permeable and things can cross in the middle, like:
>>>
>>>
>>> 	x = 1;
>>> 	LOCK
>>> 	UNLOCK
>>> 	r = y;
>>>
>>> can (validly) get re-ordered like:
>>>
>>> 	LOCK
>>> 	r = y;
>>> 	x = 1;
>>> 	UNLOCK
>>>
>>> So if you want things ordered, as I think you do, I think the smp_mb()
>>> is still needed.
>> CPU1:
>> x=1; /* without WRITE_ONCE */
>> LOCK(l);
>> UNLOCK(l);
>> <do_semop>
>> smp_store_release(x,0)
>>
>>
>> CPU2:
>> LOCK(l)
>> if (smp_load_acquire(x)==1) goto slow_path
>> <do_semop>
>> UNLOCK(l)
>>
>> Ordering is enforced because both CPUs access the same lock.
>>
>> x=1 can't be reordered past the UNLOCK(l); I don't see that any further
>> guarantees are necessary.
>>
>> Correct?
> Correct, sadly implementations do not comply :/ In fact, even x86 is
> broken here.
>
> I spoke to Will earlier today and he suggests either making
> spin_unlock_wait() stronger to avoid any and all such surprises or just
> getting rid of the thing.
I've tried the trivial solution:
Replace spin_unlock_wait() with spin_lock(); spin_unlock().
With sem-scalebench, I get around a factor-2 slowdown with an array of
16 semaphores and a factor-13 slowdown with an array of 256 semaphores
:-( [with LOCKDEP+DEBUG_SPINLOCK].

Is anyone around with a ppc or arm machine? How slow is the loop of
spin_unlock_wait() calls? A single CPU is sufficient.

Question 1: How large is the difference between:
#./sem-scalebench -t 10 -c 1 -p 1 -o 4 -f -d 1
#./sem-scalebench -t 10 -c 1 -p 1 -o 4 -f -d 256
https://github.com/manfred-colorfu/ipcscale

For x86, the difference is only ~30%.

Question 2:
Is it faster if the attached patch is applied? (relative to mmots)

--
     Manfred

[-- Attachment #2: 0001-ipc-sem.c-Avoid-spin_unlock_wait.patch --]

From b063c9edbb264cfcbca6c23eee3c85f90cd77ae1 Mon Sep 17 00:00:00 2001
From: Manfred Spraul <manfred@colorfullife.com>
Date: Mon, 5 Sep 2016 20:45:38 +0200
Subject: [PATCH] ipc/sem.c: Avoid spin_unlock_wait()

Experimental, not fully tested!
spin_unlock_wait() may be expensive because it must ensure memory
ordering.
Test: would it be faster if an explicit is_locked flag were used?
For large arrays, only one barrier would be required.

Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
---
 ipc/sem.c | 27 ++++++++++++++++++---------
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/ipc/sem.c b/ipc/sem.c
index 5e318c5..062ece2d 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -101,6 +101,7 @@ struct sem {
 	 */
 	int	sempid;
 	spinlock_t	lock;	/* spinlock for fine-grained semtimedop */
+	int		is_locked;	/* locked flag */
 	struct list_head pending_alter; /* pending single-sop operations */
 					/* that alter the semaphore */
 	struct list_head pending_const; /* pending single-sop operations */
@@ -282,17 +283,22 @@ static void complexmode_enter(struct sem_array *sma)
 
 	/* We need a full barrier after setting complex_mode:
 	 * The write to complex_mode must be visible
-	 * before we read the first sem->lock spinlock state.
+	 * before we read the first sem->is_locked state.
 	 */
 	smp_store_mb(sma->complex_mode, true);
 
 	for (i = 0; i < sma->sem_nsems; i++) {
 		sem = sma->sem_base + i;
-		spin_unlock_wait(&sem->lock);
+		if (sem->is_locked) {
+			spin_lock(&sem->lock);
+			spin_unlock(&sem->lock);
+		}
 	}
 	/*
-	 * spin_unlock_wait() is not a memory barriers, it is only a
-	 * control barrier. The code must pair with spin_unlock(&sem->lock),
+	 * If spin_lock(); spin_unlock() is used, then everything is
+	 * ordered. Otherwise: Reading sem->is_locked is only a control
+	 * barrier.
+	 * The code must pair with smp_store_release(&sem->is_locked),
 	 * thus just the control barrier is insufficient.
 	 *
 	 * smp_rmb() is sufficient, as writes cannot pass the control barrier.
@@ -364,17 +370,16 @@ static inline int sem_lock(struct sem_array *sma, struct sembuf *sops,
 		spin_lock(&sem->lock);
 
 		/*
-		 * See 51d7d5205d33
-		 * ("powerpc: Add smp_mb() to arch_spin_is_locked()"):
-		 * A full barrier is required: the write of sem->lock
-		 * must be visible before the read is executed
+		 * set is_locked. It must be ordered before
+		 * reading sma->complex_mode.
 		 */
-		smp_mb();
+		smp_store_mb(sem->is_locked, true);
 
 		if (!smp_load_acquire(&sma->complex_mode)) {
 			/* fast path successful! */
 			return sops->sem_num;
 		}
+		smp_store_release(&sem->is_locked, false);
 		spin_unlock(&sem->lock);
 	}
 
@@ -387,6 +392,8 @@ static inline int sem_lock(struct sem_array *sma, struct sembuf *sops,
 		 * back to the fast path.
 		 */
 		spin_lock(&sem->lock);
+		/* no need for smp_mb, we own the global lock */
+		sem->is_locked = true;
 		ipc_unlock_object(&sma->sem_perm);
 		return sops->sem_num;
 	} else {
@@ -406,6 +413,7 @@ static inline void sem_unlock(struct sem_array *sma, int locknum)
 		ipc_unlock_object(&sma->sem_perm);
 	} else {
 		struct sem *sem = sma->sem_base + locknum;
+		smp_store_release(&sem->is_locked, false);
 		spin_unlock(&sem->lock);
 	}
 }
@@ -551,6 +559,7 @@ static int newary(struct ipc_namespace *ns, struct ipc_params *params)
 		INIT_LIST_HEAD(&sma->sem_base[i].pending_alter);
 		INIT_LIST_HEAD(&sma->sem_base[i].pending_const);
 		spin_lock_init(&sma->sem_base[i].lock);
+		sma->sem_base[i].is_locked = false;
 	}
 
 	sma->complex_count = 0;
-- 
2.7.4



* Re: [PATCH 8/7] net/netfilter/nf_conntrack_core: Remove another memory barrier
  2016-09-05 18:57         ` Manfred Spraul
@ 2016-09-06 17:56           ` Will Deacon
From: Will Deacon @ 2016-09-06 17:56 UTC (permalink / raw)
  To: Manfred Spraul
  Cc: Peter Zijlstra, benh, paulmck, Ingo Molnar, Boqun Feng,
	Andrew Morton, LKML, 1vier1, Davidlohr Bueso, Pablo Neira Ayuso,
	netfilter-devel

On Mon, Sep 05, 2016 at 08:57:19PM +0200, Manfred Spraul wrote:
> On 09/02/2016 09:22 PM, Peter Zijlstra wrote:
> Anyone around with a ppc or arm? How slow is the loop of the
> spin_unlock_wait() calls?
> Single CPU is sufficient.
> 
> Question 1: How large is the difference between:
> #./sem-scalebench -t 10 -c 1 -p 1 -o 4 -f -d 1
> #./sem-scalebench -t 10 -c 1 -p 1 -o 4 -f -d 256
> https://github.com/manfred-colorfu/ipcscale

Not sure exactly what you want me to run here, but with an arm64
defconfig -rc3 kernel, those two invocations give me "Max total" values
where the first is 20x bigger than the second.

Will

