Linux-rt-users archive on lore.kernel.org
 help / color / Atom feed
* Re: [PATCH 1/2] dm-snapshot: fix crash with the realtime kernel
       [not found]       ` <a6f588d3-2403-d50a-70a1-ed644082cc83@arrikto.com>
@ 2019-11-13  6:01         ` Scott Wood
  0 siblings, 0 replies; only message in thread
From: Scott Wood @ 2019-11-13  6:01 UTC (permalink / raw)
  To: Nikos Tsironis, Mikulas Patocka, Mike Snitzer
  Cc: Ilias Tsitsimpis, dm-devel, linux-kernel, linux-fsdevel,
	Sebastian Andrzej Siewior, Daniel Wagner, tglx, linux-rt-users,
	Paul Gortmaker

On Tue, 2019-11-12 at 13:45 +0200, Nikos Tsironis wrote:
> On 11/12/19 9:50 AM, Mikulas Patocka wrote:
> > 
> > On Mon, 11 Nov 2019, Mike Snitzer wrote:
> > 
> > > On Mon, Nov 11 2019 at 11:37am -0500,
> > > Nikos Tsironis <ntsironis@arrikto.com> wrote:
> > > 
> > > > On 11/11/19 3:59 PM, Mikulas Patocka wrote:
> > > > > Snapshot doesn't work with realtime kernels since the commit
> > > > > f79ae415b64c.
> > > > > hlist_bl is implemented as a raw spinlock and the code takes two
> > > > > non-raw
> > > > > spinlocks while holding hlist_bl (non-raw spinlocks are blocking
> > > > > mutexes
> > > > > in the realtime kernel, so they couldn't be taken inside a raw
> > > > > spinlock).
> > > > > 
> > > > > This patch fixes the problem by using non-raw spinlock
> > > > > exception_table_lock instead of the hlist_bl lock.
> > > > > 
> > > > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > > > > Fixes: f79ae415b64c ("dm snapshot: Make exception tables
> > > > > scalable")
> > > > > 
> > > > 
> > > > Hi Mikulas,
> > > > 
> > > > I wasn't aware that hlist_bl is implemented as a raw spinlock in the
> > > > real time kernel. I would expect it to be a standard non-raw
> > > > spinlock,
> > > > so everything works as expected. But, after digging further in the
> > > > real
> > > > time tree, I found commit ad7675b15fd87f1 ("list_bl: Make list head
> > > > locking RT safe") which suggests that such a conversion would break
> > > > other parts of the kernel.
> > > 
> > > Right, the proper fix is to update list_bl to work on realtime (which
> > > I
> > > assume the referenced commit does).  I do not want to take this
> > > dm-snapshot specific workaround that open-codes what should be done
> > > within hlist_{bl_lock,unlock}, etc.
> > 
> > If we change list_bl to use non-raw spinlock, it fails in dentry lookup 
> > code. The dentry code takes a seqlock (which is implemented as preempt 
> > disable in the realtime kernel) and then takes a list_bl lock.
> > 
> > This is wrong from the real-time perspective (the chain in the hash
> > could 
> > be arbitrarily long, so using non-raw spinlock could cause unbounded 
> > wait), however we can't do anything with it.
> > 
> > I think that fixing dm-snapshot is way easier than fixing the dentry
> > code. 
> > If you have an idea how to fix the dentry code, tell us.
> 
> I too think that it would be better to fix list_bl. dm-snapshot isn't
> really broken. One should be able to acquire a spinlock while holding
> another spinlock.

That's not universally true -- even in the absence of RT there are nesting
considerations.  But it would probably be good if raw locks weren't hidden
inside other locking primitives without making it clear (ideally in the
function names) that it's a raw lock.

> Moreover, apart from dm-snapshot, anyone ever using list_bl is at risk
> of breaking the realtime kernel, if he or she is not aware of that
> particular limitation of list_bl's implementation in the realtime tree.

In particular the non-rcu variant seems inherently bad unless you protect
traversal with some other lock (in which case why use bl at all?).  Maybe
fully separate the rcu version of list_bl and keep using raw locks there
(with the name clearly indicating so), with the side benefit that
accidentally mixing rcu and non-rcu operations on the same list would become
a build error, and convert the non-rcu list_bl to use non-raw locks on RT.

BTW, I'm wondering what makes bit spinlocks worse than raw spinlocks on
RT...  commit ad7675b15fd87f19 says there's no lockdep visibility, but that
seems orthogonal to RT, and could be addressed by adding a dep_map on debug
builds the same way the raw lock is currently added.  The other bit spinlock
conversion commits that I could find are replacing them with non-raw locks.

-Scott



^ permalink raw reply	[flat|nested] only message in thread

only message in thread, back to index

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <alpine.LRH.2.02.1911110811060.28408@file01.intranet.prod.int.rdu2.redhat.com>
     [not found] ` <c9a772e9-e305-cf0b-1155-fb19bdb84e55@arrikto.com>
     [not found]   ` <20191112011444.GA32220@redhat.com>
     [not found]     ` <alpine.LRH.2.02.1911120240020.25757@file01.intranet.prod.int.rdu2.redhat.com>
     [not found]       ` <a6f588d3-2403-d50a-70a1-ed644082cc83@arrikto.com>
2019-11-13  6:01         ` [PATCH 1/2] dm-snapshot: fix crash with the realtime kernel Scott Wood

Linux-rt-users archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-rt-users/0 linux-rt-users/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-rt-users linux-rt-users/ https://lore.kernel.org/linux-rt-users \
		linux-rt-users@vger.kernel.org
	public-inbox-index linux-rt-users

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-rt-users


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git