* [RFC PATCH] ipc/sem.c: Add one more memory barrier to sem_lock().
@ 2015-02-25 19:36 Manfred Spraul
  2015-02-26 19:29 ` Oleg Nesterov
  0 siblings, 1 reply; 3+ messages in thread
From: Manfred Spraul @ 2015-02-25 19:36 UTC (permalink / raw)
  To: Oleg Nesterov, Paul E. McKenney
  Cc: LKML, 1vier1, Peter Zijlstra, Kirill Tkhai, Ingo Molnar,
	Josh Poimboeuf, Manfred Spraul

Hi,

What do you think about the following patch for sem_lock()?

Other options:

1) I don't like

	#define smp_mb__after_unlock_wait()	smp_rmb()

	I think it is too specific: the last block in sem_lock uses

		if (sma->complex_count == 0) {
			smp_rmb();
			return;
		}

2) What about

	#define smp_aquire__after_control_barrier()	smp_rmb()


Best regards,
	Manfred


xxxxx

sem_lock() does not properly pair memory barriers.

Theoretically an acquire barrier would be the right operation.
But since the existing control boundary is a write memory barrier,
it is cheaper to use an smp_rmb().

Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
---
 ipc/sem.c | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/ipc/sem.c b/ipc/sem.c
index 9284211..d43011d 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -267,6 +267,10 @@ static void sem_wait_array(struct sem_array *sma)
 	if (sma->complex_count)  {
 		/* The thread that increased sma->complex_count waited on
 		 * all sem->lock locks. Thus we don't need to wait again.
+		 *
+		 * There is no need for memory barriers: with
+		 * complex_count>0, all threads acquire/release
+		 * sem_perm.lock, thus spin_lock/unlock is the barrier.
 		 */
 		return;
 	}
@@ -275,6 +279,20 @@ static void sem_wait_array(struct sem_array *sma)
 		sem = sma->sem_base + i;
 		spin_unlock_wait(&sem->lock);
 	}
+	/*
+	 * We own sem_perm.lock, all owners of sma->sem_base[i].lock have
+	 * dropped their locks. But we still need a memory barrier:
+	 * - The lock dropping thread did a spin_unlock(), which is the
+	 *   release memory barrier.
+	 * - But the spin_unlock(&sma->sem_base[i].lock) might have
+	 *   happened after this thread did spin_lock(&sma->sem_perm.lock),
+	 *   thus the acquire memory barrier in this thread is missing.
+	 * - spin_unlock_wait() is internally a loop, thus we have a control
+	 *   boundary. As writes are not speculated, we already have a barrier
+	 *   for writes. Reads can be performed speculatively, therefore a
+	 *   smp_rmb() is necessary.
+	 */
+	smp_rmb();
 }
 
 /*
@@ -341,7 +359,13 @@ static inline int sem_lock(struct sem_array *sma, struct sembuf *sops,
 			 * Thus: if is now 0, then it will stay 0.
 			 */
 			if (sma->complex_count == 0) {
-				/* fast path successful! */
+				/*
+				 * Fast path successful!
+				 * We only need a final memory barrier.
+				 * (see sem_wait_array() for details).
+				 */
+				smp_rmb();
+
 				return sops->sem_num;
 			}
 		}
-- 
2.1.0
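
For readers who want to experiment outside the kernel, the pairing described in the sem_wait_array() comment can be approximated in user space with C11 atomics. This is only a sketch under stated assumptions: struct demo_sem, demo_unlock(), demo_unlock_wait() and demo_wait_array() are hypothetical names, not kernel APIs, and an acquire fence stands in for the smp_rmb() of the patch. The holder drops its lock with a release store; the waiter spins with relaxed loads (a control dependency only) and then issues one acquire fence before reading the protected data.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for one per-semaphore lock plus its data. */
struct demo_sem {
	atomic_int locked;	/* 1 while a holder is inside its critical section */
	int value;		/* plain data protected by 'locked' */
};

/*
 * Holder side: update the data, then drop the lock with release
 * semantics. This plays the role of spin_unlock().
 */
static void demo_unlock(struct demo_sem *s, int new_value)
{
	s->value = new_value;
	atomic_store_explicit(&s->locked, 0, memory_order_release);
}

/*
 * Rough analogue of spin_unlock_wait(): spin until the lock is free
 * without ever taking it. The relaxed loads give a control dependency
 * but no acquire ordering for later reads.
 */
static void demo_unlock_wait(struct demo_sem *s)
{
	while (atomic_load_explicit(&s->locked, memory_order_relaxed))
		;
}

/*
 * Waiter side: wait for every lock, then issue a single acquire fence.
 * The fence pairs with each release store observed by the loops above,
 * so the reads of ->value below see the holders' updates.
 */
static int demo_wait_array(struct demo_sem *sems, int n)
{
	int i, sum = 0;

	for (i = 0; i < n; i++)
		demo_unlock_wait(&sems[i]);

	atomic_thread_fence(memory_order_acquire);

	for (i = 0; i < n; i++)
		sum += sems[i].value;
	return sum;
}
```

In real use demo_unlock() and demo_wait_array() would run on different threads. Without the fence, the reads of ->value could be satisfied before the loops observe the lock as free.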



* Re: [RFC PATCH] ipc/sem.c: Add one more memory barrier to sem_lock().
  2015-02-25 19:36 [RFC PATCH] ipc/sem.c: Add one more memory barrier to sem_lock() Manfred Spraul
@ 2015-02-26 19:29 ` Oleg Nesterov
  2015-02-26 19:46   ` Manfred Spraul
  0 siblings, 1 reply; 3+ messages in thread
From: Oleg Nesterov @ 2015-02-26 19:29 UTC (permalink / raw)
  To: Manfred Spraul
  Cc: Paul E. McKenney, LKML, 1vier1, Peter Zijlstra, Kirill Tkhai,
	Ingo Molnar, Josh Poimboeuf

Sorry Manfred, I initiated this discussion and then disappeared. Currently
I am buried in the ancient 2.6.18 bugs ;)

On 02/25, Manfred Spraul wrote:
> Hi,
>
> What do you think about the following patch for sem_lock()?
>
> Other options:
>
> 1) I don't like
>
> 	#define smp_mb__after_unlock_wait()	smp_rmb()
>
> 	I think it is too specific: the last block in sem_lock uses
>
> 		if (sma->complex_count == 0) {
> 			smp_rmb();
> 			return;
> 		}

See below.

>
> 2) What about
>
> 	#define smp_aquire__after_control_barrier()	smp_rmb()


I agree with any naming. The only point of the new helper is that we can
factor out the comment, otherwise we would need to repeat it again and again.


> @@ -341,7 +359,13 @@ static inline int sem_lock(struct sem_array *sma, struct sembuf *sops,
>  			 * Thus: if is now 0, then it will stay 0.
>  			 */
>  			if (sma->complex_count == 0) {
> -				/* fast path successful! */
> +				/*
> +				 * Fast path successful!
> +				 * We only need a final memory barrier.
> +				 * (see sem_wait_array() for details).
> +				 */
> +				smp_rmb();
> +

I'll try to read this again tomorrow, but so far I am confused.

Most probably I missed something, but this looks unneeded at first glance.

We already have another smp_rmb() above this check. And it should act as
a "final" barrier, or we can not trust this ->complex_count check ?

And (if I am right) this means that the comment above that rmb() should
be updated. And that is why I think the helper makes sense, the comment
should be almost the same as in sem_wait_array().

If not, could you please spell out why we need another rmb()?

Oleg.



* Re: [RFC PATCH] ipc/sem.c: Add one more memory barrier to sem_lock().
  2015-02-26 19:29 ` Oleg Nesterov
@ 2015-02-26 19:46   ` Manfred Spraul
  0 siblings, 0 replies; 3+ messages in thread
From: Manfred Spraul @ 2015-02-26 19:46 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Paul E. McKenney, LKML, 1vier1, Peter Zijlstra, Kirill Tkhai,
	Ingo Molnar, Josh Poimboeuf

Hi Oleg,

On 02/26/2015 08:29 PM, Oleg Nesterov wrote:
>> @@ -341,7 +359,13 @@ static inline int sem_lock(struct sem_array *sma, struct sembuf *sops,
>>   			 * Thus: if is now 0, then it will stay 0.
>>   			 */
>>   			if (sma->complex_count == 0) {
>> -				/* fast path successful! */
>> +				/*
>> +				 * Fast path successful!
>> +				 * We only need a final memory barrier.
>> +				 * (see sem_wait_array() for details).
>> +				 */
>> +				smp_rmb();
>> +
> I'll try to read this again tomorrow, but so far I am confused.
>
> Most probably I missed something, but this looks unneeded at first glance.
No, my fault:
I thought long about sem_wait_array() and then I did copy&paste without 
thinking properly.

The sequence is:

thread A:
     spin_lock(&local)

thread B:
     complex_count=??;
     spin_unlock(&global); <<< release_mb

thread A:
     spin_unlock_wait(&global); <<< control_mb
     smp_mb__after_control_barrier(); <<< acquire_mb

     <<< now everything from thread B is visible.
     <<< and: thread B has dropped the lock, it can't change any
     <<<      protected var
     <<< and: a new thread C can't acquire a lock, we hold &local.

     if (complex_count == 0) goto success;

I'll update the patch.
(cc stable, starting from 3.10...)

--
     Manfred
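
The thread A sequence above can be sketched in user space with C11 atomics. This is an illustration under assumptions, not the kernel implementation: struct demo_array and demo_fast_path() are hypothetical names, an atomic_flag stands in for the per-semaphore lock, an atomic_int for the global lock, and an acquire fence for the proposed barrier helper. The sketch only shows where the single barrier sits relative to the complex_count check; it does not model the complex-operation side and is not a complete locking protocol.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the parts of sem_array used by the fast path. */
struct demo_array {
	atomic_flag local_lock;		/* plays the role of sem->lock */
	atomic_int global_locked;	/* plays the role of sem_perm.lock */
	int complex_count;		/* written only under the global lock */
};

/*
 * Returns true with local_lock held if the fast path succeeded,
 * false with local_lock released if the caller must fall back.
 */
static bool demo_fast_path(struct demo_array *a)
{
	/* thread A: spin_lock(&local) */
	while (atomic_flag_test_and_set_explicit(&a->local_lock,
						 memory_order_acquire))
		;

	/* spin_unlock_wait(&global): relaxed loads, control dependency only */
	while (atomic_load_explicit(&a->global_locked, memory_order_relaxed))
		;

	/*
	 * The one barrier before the check: it pairs with the release
	 * store that dropped the global lock, so the write to
	 * complex_count made under that lock is visible below.
	 */
	atomic_thread_fence(memory_order_acquire);

	if (a->complex_count == 0)
		return true;	/* no second barrier needed here */

	atomic_flag_clear_explicit(&a->local_lock, memory_order_release);
	return false;
}
```

Because the fence already orders the complex_count read after the observed unlock, adding another read barrier inside the success branch would not buy anything, which matches the conclusion reached in this reply.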
