LKML Archive on lore.kernel.org
From: Peter Zijlstra <peterz@infradead.org>
To: Oleg Nesterov <oleg@redhat.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	linux-rt-users <linux-rt-users@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Daniel Bristot de Oliveira <bristot@redhat.com>,
	Clark Williams <williams@redhat.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	jack@suse.com, Waiman Long <longman@redhat.com>,
	Davidlohr Bueso <dave@stgolabs.net>
Subject: Re: [RT WARNING] DEBUG_LOCKS_WARN_ON(rt_mutex_owner(lock) != current) with fsfreeze (4.19.25-rt16)
Date: Wed, 19 Jun 2019 11:50:43 +0200
Message-ID: <20190619095043.GT3402@hirez.programming.kicks-ass.net>
In-Reply-To: <20190506165009.GA28959@redhat.com>


Sorry, I seem to have missed this email.

On Mon, May 06, 2019 at 06:50:09PM +0200, Oleg Nesterov wrote:
> On 05/03, Peter Zijlstra wrote:
> >
> > -static void lockdep_sb_freeze_release(struct super_block *sb)
> > -{
> > -	int level;
> > -
> > -	for (level = SB_FREEZE_LEVELS - 1; level >= 0; level--)
> > -		percpu_rwsem_release(sb->s_writers.rw_sem + level, 0, _THIS_IP_);
> > -}
> > -
> > -/*
> > - * Tell lockdep we are holding these locks before we call ->unfreeze_fs(sb).
> > - */
> > -static void lockdep_sb_freeze_acquire(struct super_block *sb)
> > -{
> > -	int level;
> > -
> > -	for (level = 0; level < SB_FREEZE_LEVELS; ++level)
> > -		percpu_rwsem_acquire(sb->s_writers.rw_sem + level, 0, _THIS_IP_);
> > +	percpu_down_write_non_owner(sb->s_writers.rw_sem + level-1);
> >  }
> 
> I'd suggest to not change fs/super.c, keep these helpers, and even not introduce
> xxx_write_non_owner().
> 
> freeze_super() takes other locks, it calls sync_filesystem(), freeze_fs(), lockdep
> should know that this task holds SB_FREEZE_XXX locks for writing.

Bah, I so hate these games. But OK, I suppose.

> > @@ -80,14 +83,8 @@ int __percpu_down_read(struct percpu_rw_
> >  	 * and reschedule on the preempt_enable() in percpu_down_read().
> >  	 */
> >  	preempt_enable_no_resched();
> > -
> > -	/*
> > -	 * Avoid lockdep for the down/up_read() we already have them.
> > -	 */
> > -	__down_read(&sem->rw_sem);
> > +	wait_event(sem->waiters, !atomic_read(&sem->block));
> >  	this_cpu_inc(*sem->read_count);
> 
> Argh, this looks racy :/
> 
> Suppose that sem->block == 0 when wait_event() is called, iow the writer released
> the lock.
> 
> Now suppose that this __percpu_down_read() races with another percpu_down_write().
> The new writer can set sem->block == 1 and call readers_active_check() in between,
> after wait_event() and before this_cpu_inc(*sem->read_count).


CPU0			CPU1			CPU2

percpu_up_write()
  sem->block = 0;

			__percpu_down_read()
			  wait_event(, !sem->block);

						percpu_down_write()
						  wait_event_exclusive(, xchg(sem->block,1)==0);
						  readers_active_check()

			  this_cpu_inc();

			  *whoopsy* reader while write owned.
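
For anyone who wants to poke at that interleaving outside the kernel, the
following is a single-threaded userspace sketch that replays the three
columns above in order. It is not kernel code and rests on assumptions:
struct toy_sem, its single read_count (standing in for the per-CPU
counters) and every function name here are made up for illustration, and
C11 atomics stand in for the kernel primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct percpu_rw_semaphore. */
struct toy_sem {
	atomic_int block;	/* 1 while a writer owns or is taking the lock */
	atomic_int read_count;	/* models the sum of the per-CPU read counts */
};

/* Reader: the wait_event() condition, true means "no writer, carry on". */
bool toy_reader_sees_unblocked(struct toy_sem *sem)
{
	return atomic_load(&sem->block) == 0;
}

/* Reader: models this_cpu_inc(*sem->read_count). */
void toy_reader_inc(struct toy_sem *sem)
{
	atomic_fetch_add(&sem->read_count, 1);
}

/* Writer: models xchg(sem->block, 1) == 0 followed by readers_active_check(). */
bool toy_writer_acquire(struct toy_sem *sem)
{
	if (atomic_exchange(&sem->block, 1) != 0)
		return false;	/* another writer already holds it */
	return atomic_load(&sem->read_count) == 0;
}

/*
 * Replay the CPU0/CPU1/CPU2 ordering from the diagram.
 * Returns true when the reader ends up counted while the writer owns the lock.
 */
bool toy_replay_race(void)
{
	struct toy_sem sem;
	bool reader_go, writer_owns;

	atomic_init(&sem.block, 1);	/* a previous writer still holds it */
	atomic_init(&sem.read_count, 0);

	atomic_store(&sem.block, 0);			/* CPU0: percpu_up_write() */
	reader_go = toy_reader_sees_unblocked(&sem);	/* CPU1: wait_event() */
	writer_owns = toy_writer_acquire(&sem);		/* CPU2: percpu_down_write() */
	if (reader_go)
		toy_reader_inc(&sem);			/* CPU1: this_cpu_inc() */

	return writer_owns && atomic_load(&sem.read_count) > 0;
}
```

Calling toy_replay_race() from a main() reports the overlap: the writer's
check ran before the reader's increment, so neither side noticed the other.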



I suppose we can 'patch' that by checking ->block again after we've
incremented, something like the below.

But looking at percpu_down_write() we have two wait_event*() on the same
queue back to back, which is 'odd' at best. Let me ponder that a little
more.


---

--- a/kernel/locking/percpu-rwsem.c
+++ b/kernel/locking/percpu-rwsem.c
@@ -61,6 +61,7 @@ int __percpu_down_read(struct percpu_rw_
 	 * writer missed them.
 	 */
 
+again:
 	smp_mb(); /* A matches D */
 
 	/*
@@ -87,7 +88,13 @@ int __percpu_down_read(struct percpu_rw_
 	wait_event(sem->waiters, !atomic_read_acquire(&sem->block));
 	this_cpu_inc(*sem->read_count);
 	preempt_disable();
-	return 1;
+
+	/*
+	 * percpu_down_write() could've set ->block right after we've seen it
+	 * 0 but missed our this_cpu_inc(), which is exactly the condition we
+	 * get called for from percpu_down_read().
+	 */
+	goto again;
 }
 EXPORT_SYMBOL_GPL(__percpu_down_read);
 


Thread overview: 26+ messages
2019-03-26  9:34 Juri Lelli
2019-03-28 10:17 ` Sebastian Andrzej Siewior
2019-04-19  8:56 ` Juri Lelli
2019-04-30 12:51   ` Sebastian Andrzej Siewior
2019-04-30 13:28     ` Peter Zijlstra
2019-04-30 13:45       ` Sebastian Andrzej Siewior
2019-04-30 14:01         ` Peter Zijlstra
2019-04-30 14:15       ` Oleg Nesterov
2019-04-30 14:29         ` Peter Zijlstra
2019-04-30 14:42         ` Oleg Nesterov
2019-04-30 14:44           ` Peter Zijlstra
2019-04-30 14:53             ` Oleg Nesterov
2019-05-01 17:09       ` Peter Zijlstra
2019-05-01 17:26         ` Waiman Long
2019-05-01 18:54           ` Peter Zijlstra
2019-05-01 19:22             ` Davidlohr Bueso
2019-05-01 19:25               ` Peter Zijlstra
2019-05-02 10:09         ` Oleg Nesterov
2019-05-02 11:42           ` Oleg Nesterov
2019-05-03 14:50             ` Peter Zijlstra
2019-05-03 15:25               ` Peter Zijlstra
2019-05-06 16:50               ` Oleg Nesterov
2019-06-19  9:50                 ` Peter Zijlstra [this message]
2019-05-03 14:16           ` Peter Zijlstra
2019-05-03 15:37             ` Oleg Nesterov
2019-05-03 15:46               ` Peter Zijlstra
