All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mikulas Patocka <mpatocka@redhat.com>
To: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Ingo Molnar <mingo@elte.hu>,
	Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
	Ananth N Mavinakayanahalli <ananth@in.ibm.com>,
	Anton Arapov <anton@redhat.com>,
	linux-kernel@vger.kernel.org,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH 1/2] brw_mutex: big read-write mutex
Date: Mon, 22 Oct 2012 19:09:38 -0400 (EDT)	[thread overview]
Message-ID: <Pine.LNX.4.64.1210191948130.12753@file.rdu.redhat.com> (raw)
In-Reply-To: <20121019174947.GA1206@redhat.com>



On Fri, 19 Oct 2012, Oleg Nesterov wrote:

> On 10/19, Mikulas Patocka wrote:
> >
> > synchronize_rcu() is way slower than msleep(1) -
> 
> This depends, I guess. but this doesn't mmatter,
> 
> > so I don't see a reason
> > why should it be complicated to avoid msleep(1).
> 
> I don't think this really needs complications. Please look at this
> patch for example. Or initial (single writer) version below. It is
> not finished and lacks the barriers too, but I do not think it is
> more complex.

Hi

My implementation has a smaller structure (it doesn't have 
wait_queue_head_t).

Using preempt_disable()/synchronize_sched() instead of RCU seems like a 
good idea. Here, the locked region is so small that it doesn't make sense 
to play tricks with preemptible RCU.

Your implementation is prone to starvation - if the writer has a high 
priority and if it is doing back-to-back write unlocks/locks, it may 
happen that the readers have no chance to run.

The use of mutex instead of a wait queue in my implementation is unusual, 
but I don't see anything wrong with it - it makes the structure smaller 
and it solves the starvation problem (which would otherwise be complicated 
to solve).

Mikulas

> Oleg.
> 
> struct brw_sem {
> 	long __percpu		*read_ctr;
> 	wait_queue_head_t	read_waitq;
> 	struct mutex		writer_mutex;
> 	struct task_struct	*writer;
> };
> 
> int brw_init(struct brw_sem *brw)
> {
> 	brw->writer = NULL;
> 	mutex_init(&brw->writer_mutex);
> 	init_waitqueue_head(&brw->read_waitq);
> 	brw->read_ctr = alloc_percpu(long);
> 	return brw->read_ctr ? 0 : -ENOMEM;
> }
> 
> void brw_down_read(struct brw_sem *brw)
> {
> 	for (;;) {
> 		bool done = false;
> 
> 		preempt_disable();
> 		if (likely(!brw->writer)) {
> 			__this_cpu_inc(*brw->read_ctr);
> 			done = true;
> 		}
> 		preempt_enable();
> 
> 		if (likely(done))
> 			break;
> 
> 		__wait_event(brw->read_waitq, !brw->writer);
> 	}
> }
> 
> void brw_up_read(struct brw_sem *brw)
> {
> 	struct task_struct *writer;
> 
> 	preempt_disable();
> 	__this_cpu_dec(*brw->read_ctr);
> 	writer = ACCESS_ONCE(brw->writer);
> 	if (unlikely(writer))
> 		wake_up_process(writer);
> 	preempt_enable();
> }
> 
> static inline long brw_read_ctr(struct brw_sem *brw)
> {
> 	long sum = 0;
> 	int cpu;
> 
> 	for_each_possible_cpu(cpu)
> 		sum += per_cpu(*brw->read_ctr, cpu);

Integer overflow on signed types is undefined - you should use unsigned 
long - you can use -fwrapv option to gcc to make signed overflow defined, 
but Linux doesn't use it.

> 
> 	return sum;
> }
> 
> void brw_down_write(struct brw_sem *brw)
> {
> 	mutex_lock(&brw->writer_mutex);
> 	brw->writer = current;
> 	synchronize_sched();
> 	/*
> 	 * Thereafter brw_*_read() must see ->writer != NULL,
> 	 * and we should see the result of __this_cpu_inc().
> 	 */
> 	for (;;) {
> 		set_current_state(TASK_UNINTERRUPTIBLE);
> 		if (brw_read_ctr(brw) == 0)
> 			break;
> 		schedule();
> 	}
> 	__set_current_state(TASK_RUNNING);
> 	/*
> 	 * We can add another synchronize_sched() to avoid the
> 	 * spurious wakeups from brw_up_read() after return.
> 	 */
> }
> 
> void brw_up_write(struct brw_sem *brw)
> {
> 	brw->writer = NULL;
> 	synchronize_sched();

That synchronize_sched should be put before brw->writer = NULL. This is 
incorrect, because brw->writer = NULL may be reordered with previous 
writes done by this process and the other CPU may see brw->writer == NULL 
(and think that the lock is unlocked) while it doesn't see previous writes 
done by the writer.

I had this bug in my implementation too.

> 	wake_up_all(&brw->read_waitq);
> 	mutex_unlock(&brw->writer_mutex);
> }

Mikulas

  reply	other threads:[~2012-10-22 23:22 UTC|newest]

Thread overview: 103+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-10-15 19:09 [RFC PATCH 0/2] uprobes: register/unregister can race with fork Oleg Nesterov
2012-10-15 19:10 ` [PATCH 1/2] brw_mutex: big read-write mutex Oleg Nesterov
2012-10-15 23:28   ` Paul E. McKenney
2012-10-16 15:56     ` Oleg Nesterov
2012-10-16 18:58       ` Paul E. McKenney
2012-10-17 16:37         ` Oleg Nesterov
2012-10-17 22:28           ` Paul E. McKenney
2012-10-16 19:56   ` Linus Torvalds
2012-10-17 16:59     ` Oleg Nesterov
2012-10-17 22:44       ` Paul E. McKenney
2012-10-18 16:24         ` Oleg Nesterov
2012-10-18 16:38           ` Paul E. McKenney
2012-10-18 17:57             ` Oleg Nesterov
2012-10-18 19:28               ` Mikulas Patocka
2012-10-19 12:38                 ` Peter Zijlstra
2012-10-19 15:32                   ` Mikulas Patocka
2012-10-19 17:40                     ` Peter Zijlstra
2012-10-19 17:57                       ` Oleg Nesterov
2012-10-19 22:54                       ` Mikulas Patocka
2012-10-24  3:08                         ` Dave Chinner
2012-10-25 14:09                           ` Mikulas Patocka
2012-10-25 23:40                             ` Dave Chinner
2012-10-26 12:06                               ` Oleg Nesterov
2012-10-26 13:22                                 ` Mikulas Patocka
2012-10-26 14:12                                   ` Oleg Nesterov
2012-10-26 15:23                                     ` mark_files_ro && sb_end_write Oleg Nesterov
2012-10-26 16:09                                     ` [PATCH 1/2] brw_mutex: big read-write mutex Mikulas Patocka
2012-10-19 17:49                     ` Oleg Nesterov
2012-10-22 23:09                       ` Mikulas Patocka [this message]
2012-10-23 15:12                         ` Oleg Nesterov
2012-10-19 19:28               ` Paul E. McKenney
2012-10-22 23:36                 ` [PATCH 0/2] fix and improvements for percpu-rw-semaphores (was: brw_mutex: big read-write mutex) Mikulas Patocka
2012-10-22 23:37                   ` [PATCH 1/2] percpu-rw-semaphores: use light/heavy barriers Mikulas Patocka
2012-10-22 23:39                     ` [PATCH 2/2] percpu-rw-semaphores: use rcu_read_lock_sched Mikulas Patocka
2012-10-24 16:16                       ` Paul E. McKenney
2012-10-24 17:18                         ` Oleg Nesterov
2012-10-24 18:20                           ` Paul E. McKenney
2012-10-24 18:43                             ` Oleg Nesterov
2012-10-24 19:43                               ` Paul E. McKenney
2012-10-25 14:54                         ` Mikulas Patocka
2012-10-25 15:07                           ` Paul E. McKenney
2012-10-25 16:15                             ` Mikulas Patocka
2012-10-23 16:59                     ` [PATCH 1/2] percpu-rw-semaphores: use light/heavy barriers Oleg Nesterov
2012-10-23 18:05                       ` Paul E. McKenney
2012-10-23 18:27                         ` Oleg Nesterov
2012-10-23 18:41                         ` Oleg Nesterov
2012-10-23 20:29                           ` Paul E. McKenney
2012-10-23 20:32                             ` Paul E. McKenney
2012-10-23 21:39                               ` Mikulas Patocka
2012-10-24 16:23                                 ` Paul E. McKenney
2012-10-24 20:22                                   ` Mikulas Patocka
2012-10-24 20:36                                     ` Paul E. McKenney
2012-10-24 20:44                                       ` Mikulas Patocka
2012-10-24 23:57                                         ` Paul E. McKenney
2012-10-25 12:39                                           ` Paul E. McKenney
2012-10-25 13:48                                           ` Mikulas Patocka
2012-10-23 19:23                       ` Oleg Nesterov
2012-10-23 20:45                         ` Peter Zijlstra
2012-10-23 20:57                         ` Peter Zijlstra
2012-10-24 15:11                           ` Oleg Nesterov
2012-10-23 21:26                         ` Mikulas Patocka
2012-10-23 20:32                     ` Peter Zijlstra
2012-10-30 18:48                   ` [PATCH 0/2] fix and improvements for percpu-rw-semaphores (was: brw_mutex: big read-write mutex) Oleg Nesterov
2012-10-31 19:41                     ` [PATCH 0/1] percpu_rw_semaphore: reimplement to not block the readers unnecessarily Oleg Nesterov
2012-10-31 19:41                       ` [PATCH 1/1] " Oleg Nesterov
2012-11-01 15:10                         ` Linus Torvalds
2012-11-01 15:34                           ` Oleg Nesterov
2012-11-02 18:06                           ` [PATCH v2 0/1] " Oleg Nesterov
2012-11-02 18:06                             ` [PATCH v2 1/1] " Oleg Nesterov
2012-11-07 17:04                               ` [PATCH v3 " Mikulas Patocka
2012-11-07 17:47                                 ` Oleg Nesterov
2012-11-07 19:17                                   ` Mikulas Patocka
2012-11-08 13:42                                     ` Oleg Nesterov
2012-11-08  1:23                                 ` Paul E. McKenney
2012-11-08  1:16                               ` [PATCH v2 " Paul E. McKenney
2012-11-08 13:33                                 ` Oleg Nesterov
2012-11-08 16:27                                   ` Paul E. McKenney
2012-11-08 13:48                             ` [PATCH RESEND v2 0/1] " Oleg Nesterov
2012-11-08 13:48                               ` [PATCH RESEND v2 1/1] " Oleg Nesterov
2012-11-08 20:07                                 ` Andrew Morton
2012-11-08 21:08                                   ` Paul E. McKenney
2012-11-08 23:41                                     ` Mikulas Patocka
2012-11-09  0:41                                       ` Paul E. McKenney
2012-11-09  3:23                                         ` Paul E. McKenney
2012-11-09 16:35                                           ` Oleg Nesterov
2012-11-09 16:59                                             ` Paul E. McKenney
2012-11-09 12:47                                   ` Mikulas Patocka
2012-11-09 15:46                                   ` Oleg Nesterov
2012-11-09 17:01                                     ` Paul E. McKenney
2012-11-09 18:10                                       ` Oleg Nesterov
2012-11-09 18:19                                         ` Oleg Nesterov
2012-11-10  0:55                                         ` Paul E. McKenney
2012-11-11 15:45                                           ` Oleg Nesterov
2012-11-12 18:38                                             ` Paul E. McKenney
2012-11-11 18:27                                   ` [PATCH -mm] percpu_rw_semaphore-reimplement-to-not-block-the-readers-unnecessari ly.fix Oleg Nesterov
2012-11-12 18:31                                     ` Paul E. McKenney
2012-11-16 23:22                                     ` Andrew Morton
2012-11-18 19:32                                       ` Oleg Nesterov
2012-11-01 15:43                         ` [PATCH 1/1] percpu_rw_semaphore: reimplement to not block the readers unnecessarily Paul E. McKenney
2012-11-01 18:33                           ` Oleg Nesterov
2012-11-02 16:18                             ` Oleg Nesterov
2012-10-15 19:10 ` [PATCH 2/2] uprobes: Use brw_mutex to fix register/unregister vs dup_mmap() race Oleg Nesterov
2012-10-18  7:03   ` Srikar Dronamraju

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.64.1210191948130.12753@file.rdu.redhat.com \
    --to=mpatocka@redhat.com \
    --cc=ananth@in.ibm.com \
    --cc=anton@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=oleg@redhat.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=srikar@linux.vnet.ibm.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.