linux-kernel.vger.kernel.org archive mirror
* [PATCH RFC 0/4] change sb_writers to use percpu_rw_semaphore
@ 2015-07-13 21:25 Oleg Nesterov
  2015-07-13 21:25 ` [PATCH 1/4] change get_super_thawed() to use sb_start/end_write() Oleg Nesterov
                   ` (5 more replies)
  0 siblings, 6 replies; 27+ messages in thread
From: Oleg Nesterov @ 2015-07-13 21:25 UTC (permalink / raw)
  To: Al Viro, Jan Kara, Linus Torvalds, Paul McKenney, Peter Zijlstra
  Cc: Daniel Wagner, Davidlohr Bueso, Ingo Molnar, Tejun Heo, linux-kernel

Hello,

Al, Jan, could you comment? I mean the intent, the patches are
obviously not for inclusion yet.

We can remove everything from struct sb_writers except frozen
(which can become a boolean, it seems) and add the array of
percpu_rw_semaphore's instead.

__sb_start/end_write() can use percpu_down/up_read(), and
freeze/thaw_super() can use percpu_down/up_write().

Why:

	- Firstly, __sb_start_write() simply looks buggy. It does
	  __sb_end_write() if it sees ->frozen, but if the task migrates
	  to another CPU before percpu_counter_dec(), sb_wait_write()
	  can wrongly succeed while another task still holds
	  the same "semaphore": sb_wait_write() can miss the result
	  of the previous percpu_counter_inc() but see the result
	  of this percpu_counter_dec().

	- This code doesn't look simple. It would be better to rely
	  on the generic locking code.

	- __sb_start_write() will be a little bit faster, but this
	  is minor.
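
The first point can be replayed deterministically in a small userspace model. This is a sketch under simplifying assumptions: two plain array slots stand in for the per-CPU counters, the freezer sums them one CPU at a time, and the interleaving is hard-coded. All model_* names are hypothetical.

```c
#include <assert.h>

enum { MODEL_NR_CPUS = 2 };

struct model_race_result {
	long observed;	/* what the freezer's CPU-by-CPU sum sees */
	long actual;	/* the real number of holders at the end */
};

/* Replays one bad interleaving of the counter-based scheme. */
struct model_race_result model_replay_race(void)
{
	long counter[MODEL_NR_CPUS] = { 0, 0 };
	struct model_race_result r = { 0, 0 };

	/* Task A took the "semaphore" earlier; its inc landed on CPU 1. */
	counter[1]++;

	/* The freezer has set ->frozen and starts summing with CPU 0. */
	r.observed += counter[0];	/* reads 0, before B's inc below */

	/* Task B runs on CPU 0: inc first, then it notices ->frozen ... */
	counter[0]++;
	/* ... migrates to CPU 1, and undoes its inc there. */
	counter[1]--;

	/* The freezer now reads CPU 1: A's inc is cancelled by B's dec. */
	r.observed += counter[1];

	/* A never released, so the true total is still one holder. */
	r.actual = counter[0] + counter[1];
	assert(r.actual == 1);

	return r;
}
```

In this replay the freezer's sum comes out as zero although task A still holds the lock, which is the wrong success described above.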

Todo:

	- __sb_start_write(wait => false) always fails.

	  Trivial, we already have percpu_down_read_trylock(); that
	  patch just wasn't merged yet.

	- sb_lockdep_release() and sb_lockdep_acquire() play with
	  percpu_rw_semaphore's internals.

	  Trivial, we need a couple of new helpers in percpu-rwsem.c.

	- Fix get_super_thawed(), it will spin if MS_RDONLY...

	  It is not clear to me what exactly we should do, but this
	  doesn't look hard. Perhaps it can just return if MS_RDONLY.

	- Most probably I missed something else, and I do not know
	  how to test.

Finally, freeze_super() calls synchronize_sched_expedited() 3 times in
a row. This is bad and just stupid. But if we change percpu_rw_semaphore
to use rcu_sync (see https://lkml.org/lkml/2015/7/11/211) we can avoid
this and do synchronize_sched() only once. We just need some more simple
changes in percpu-rwsem.c, so that all sb_writers->rw_sem[] semaphores
can share a single sb_writers->rss.

In this case destroy_super() needs some modifications too, because
percpu_free_rwsem() will be might_sleep(). But this looks simple too.
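
The saving from a shared rss can be illustrated by counting grace periods instead of waiting for them. The sketch is a heavily simplified model, assuming only that the first writer to enter an idle rcu_sync pays for one grace period and that nested writers entering while it is still held pay nothing. The model_* names are hypothetical.

```c
#include <assert.h>

/* Hypothetical model: counts grace periods rather than waiting for them. */
struct model_rcu_sync {
	int gp_count;		/* writers currently inside */
	int grace_periods;	/* synchronize_sched() calls we would have made */
};

void model_rcu_sync_enter(struct model_rcu_sync *rss)
{
	/* Only the writer that finds the state idle pays for a grace period. */
	if (rss->gp_count++ == 0)
		rss->grace_periods++;
}

void model_rcu_sync_exit(struct model_rcu_sync *rss)
{
	assert(rss->gp_count > 0);
	rss->gp_count--;
}

/* Each of the three semaphores owns its rcu_sync: three grace periods. */
int model_freeze_private_rss(void)
{
	struct model_rcu_sync rss[3] = { { 0, 0 }, { 0, 0 }, { 0, 0 } };
	int total = 0;

	for (int i = 0; i < 3; i++)
		model_rcu_sync_enter(&rss[i]);
	for (int i = 0; i < 3; i++) {
		total += rss[i].grace_periods;
		model_rcu_sync_exit(&rss[i]);
	}
	return total;
}

/* All three semaphores share one rcu_sync: a single grace period. */
int model_freeze_shared_rss(void)
{
	struct model_rcu_sync rss = { 0, 0 };

	for (int i = 0; i < 3; i++)
		model_rcu_sync_enter(&rss);
	for (int i = 0; i < 3; i++)
		model_rcu_sync_exit(&rss);
	return rss.grace_periods;
}
```

With private state the model counts three grace periods per freeze, with the shared rss only one.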

Oleg.

 fs/super.c         |  147 +++++++++++++++++++--------------------------------
 include/linux/fs.h |   14 +----
 2 files changed, 58 insertions(+), 103 deletions(-)




Thread overview: 27+ messages
2015-07-13 21:25 [PATCH RFC 0/4] change sb_writers to use percpu_rw_semaphore Oleg Nesterov
2015-07-13 21:25 ` [PATCH 1/4] change get_super_thawed() to use sb_start/end_write() Oleg Nesterov
2015-07-14 10:49   ` Jan Kara
2015-07-14 13:38     ` Oleg Nesterov
2015-07-13 21:25 ` [PATCH 2/4] introduce sb_unlock_frozen() Oleg Nesterov
2015-07-13 21:25 ` [PATCH 3/4] introduce sb_lockdep_release() Oleg Nesterov
2015-07-13 21:25 ` [PATCH 4/4] change sb_writers to use percpu_rw_semaphore Oleg Nesterov
2015-07-13 22:23 ` [PATCH RFC 0/4] " Dave Chinner
2015-07-13 22:42   ` Oleg Nesterov
2015-07-13 23:14     ` Dave Chinner
2015-07-14 10:48 ` Jan Kara
2015-07-14 13:37   ` Oleg Nesterov
2015-07-14 21:17     ` Dave Hansen
2015-07-14 21:22       ` Oleg Nesterov
2015-07-14 21:41         ` Dave Hansen
2015-07-15  6:47           ` Jan Kara
2015-07-15 18:19             ` Oleg Nesterov
2015-07-16  7:26               ` Jan Kara
2015-07-16  7:30                 ` Dave Hansen
2015-07-16  8:55                   ` Jan Kara
2015-07-16 17:32                 ` Oleg Nesterov
2015-07-17  1:27                   ` Dave Chinner
2015-07-17 17:31                     ` Oleg Nesterov
2015-07-17 22:40                       ` Dave Chinner
2015-07-20  8:26                         ` Jan Kara
2015-07-22 21:09                           ` Oleg Nesterov
2015-07-20 16:23                         ` Oleg Nesterov
