All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Tim Murray <timmurray@google.com>
Cc: Waiman Long <longman@redhat.com>,
	Christoph Hellwig <hch@infradead.org>,
	Jaegeuk Kim <jaegeuk@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-f2fs-devel@lists.sourceforge.net,
	Ingo Molnar <mingo@redhat.com>, Will Deacon <will@kernel.org>,
	Boqun Feng <boqun.feng@gmail.com>
Subject: Re: [PATCH] f2fs: move f2fs to use reader-unfair rwsems
Date: Wed, 12 Jan 2022 15:06:12 +0100	[thread overview]
Message-ID: <Yd7gVLdHW11TQUAi@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <CAEe=SxnWeK0pSfijPKJSTxBiMgD1Ev69fV3qSTCgWASk0b3vhA@mail.gmail.com>

On Mon, Jan 10, 2022 at 11:41:23AM -0800, Tim Murray wrote:

> 1. f2fs-ckpt thread is running f2fs_write_checkpoint(), holding the
> cp_rwsem write lock while doing so via f2fs_lock_all() in
> block_operations().
> 2. Random very-low-priority thread A makes some other f2fs call that
> tries to get the cp_rwsem read lock by atomically adding on the rwsem,
> fails and deschedules in uninterruptible sleep. cp_rwsem now has a
> non-zero reader count but is write-locked.
> 3. f2fs-ckpt thread releases the cp_rwsem write lock. cp_rwsem now has
> a non-zero reader count and is not write-locked, so is reader-locked.
> 4. Other threads call fsync(), which requests checkpoints from
> f2fs-ckpt, and block on a completion event that f2fs-ckpt dispatches.
> cp_rwsem still has a non-zero reader count because the low-prio thread
> A from (2) has not been scheduled again yet.
> 5. f2fs-ckpt wakes up to perform checkpoints, but it stalls on the
> write lock via cmpxchg in block_operations() until the low-prio thread
> A has run and released the cp_rwsem read lock. Because f2fs-ckpt can't
> run, all fsync() callers are also effectively blocked by the
> low-priority thread holding the read lock.
> 
> I think this is the rough shape of the problem (vs readers holding the
> lock for too long or something like that) because the low-priority
> thread is never run between when it is initially made runnable by
> f2fs-ckpt and when it runs tens/hundreds of milliseconds later then
> immediately unblocks f2fs-ckpt.

*urgh*... so you're making the worst case less likely but fundamentally
you don't change anything.

If one of those low prio threads manages to block while holding
cp_rwsem your checkpoint thread will still block for a very long time.

So while you improve the average case, the worst case doesn't improve
much I think.

Also, given that this is a system wide rwsem, would percpu-rwsem not be
'better' ? Arguably with the same hack cgroups uses for it (see
cgroup_init()) to lower the cost of percpu_down_write().

Now, I'm not a filesystem developer and I'm not much familiar with the
problem space, but this locking reads like a fairly big problem. I'm not
sure optimizing the lock is the answer.



WARNING: multiple messages have this Message-ID (diff)
From: Peter Zijlstra <peterz@infradead.org>
To: Tim Murray <timmurray@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-f2fs-devel@lists.sourceforge.net,
	Christoph Hellwig <hch@infradead.org>,
	Ingo Molnar <mingo@redhat.com>, Jaegeuk Kim <jaegeuk@kernel.org>,
	Waiman Long <longman@redhat.com>, Will Deacon <will@kernel.org>
Subject: Re: [f2fs-dev] [PATCH] f2fs: move f2fs to use reader-unfair rwsems
Date: Wed, 12 Jan 2022 15:06:12 +0100	[thread overview]
Message-ID: <Yd7gVLdHW11TQUAi@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <CAEe=SxnWeK0pSfijPKJSTxBiMgD1Ev69fV3qSTCgWASk0b3vhA@mail.gmail.com>

On Mon, Jan 10, 2022 at 11:41:23AM -0800, Tim Murray wrote:

> 1. f2fs-ckpt thread is running f2fs_write_checkpoint(), holding the
> cp_rwsem write lock while doing so via f2fs_lock_all() in
> block_operations().
> 2. Random very-low-priority thread A makes some other f2fs call that
> tries to get the cp_rwsem read lock by atomically adding on the rwsem,
> fails and deschedules in uninterruptible sleep. cp_rwsem now has a
> non-zero reader count but is write-locked.
> 3. f2fs-ckpt thread releases the cp_rwsem write lock. cp_rwsem now has
> a non-zero reader count and is not write-locked, so is reader-locked.
> 4. Other threads call fsync(), which requests checkpoints from
> f2fs-ckpt, and block on a completion event that f2fs-ckpt dispatches.
> cp_rwsem still has a non-zero reader count because the low-prio thread
> A from (2) has not been scheduled again yet.
> 5. f2fs-ckpt wakes up to perform checkpoints, but it stalls on the
> write lock via cmpxchg in block_operations() until the low-prio thread
> A has run and released the cp_rwsem read lock. Because f2fs-ckpt can't
> run, all fsync() callers are also effectively blocked by the
> low-priority thread holding the read lock.
> 
> I think this is the rough shape of the problem (vs readers holding the
> lock for too long or something like that) because the low-priority
> thread is never run between when it is initially made runnable by
> f2fs-ckpt and when it runs tens/hundreds of milliseconds later then
> immediately unblocks f2fs-ckpt.

*urgh*... so you're making the worst case less likely but fundamentally
you don't change anything.

If one of those low prio threads manages to block while holding
cp_rwsem your checkpoint thread will still block for a very long time.

So while you improve the average case, the worst case doesn't improve
much I think.

Also, given that this is a system wide rwsem, would percpu-rwsem not be
'better' ? Arguably with the same hack cgroups uses for it (see
cgroup_init()) to lower the cost of percpu_down_write().

Now, I'm not a filesystem developer and I'm not much familiar with the
problem space, but this locking reads like a fairly big problem. I'm not
sure optimizing the lock is the answer.




_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

  parent reply	other threads:[~2022-01-12 14:06 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-08 16:46 [PATCH] f2fs: move f2fs to use reader-unfair rwsems Jaegeuk Kim
2022-01-08 16:46 ` [f2fs-dev] " Jaegeuk Kim
2022-01-10  8:05 ` Christoph Hellwig
2022-01-10  8:05   ` [f2fs-dev] " Christoph Hellwig
2022-01-10 15:02   ` Peter Zijlstra
2022-01-10 15:02     ` [f2fs-dev] " Peter Zijlstra
2022-01-10 16:18   ` Waiman Long
2022-01-10 16:18     ` [f2fs-dev] " Waiman Long
2022-01-10 18:45     ` Peter Zijlstra
2022-01-10 18:45       ` [f2fs-dev] " Peter Zijlstra
2022-01-10 18:55       ` Waiman Long
2022-01-10 18:55         ` [f2fs-dev] " Waiman Long
2022-01-10 19:41     ` Tim Murray
2022-01-10 19:41       ` [f2fs-dev] " Tim Murray via Linux-f2fs-devel
2022-01-11  4:15       ` Waiman Long
2022-01-11  4:15         ` [f2fs-dev] " Waiman Long
2022-01-11  6:53         ` Tim Murray
2022-01-11  6:53           ` [f2fs-dev] " Tim Murray via Linux-f2fs-devel
2022-01-11 15:04           ` Waiman Long
2022-01-11 15:04             ` [f2fs-dev] " Waiman Long
2022-01-11 17:07             ` Jaegeuk Kim
2022-01-11 17:07               ` [f2fs-dev] " Jaegeuk Kim
2022-01-11 22:01               ` Waiman Long
2022-01-11 22:01                 ` [f2fs-dev] " Waiman Long
2022-01-12  0:25                 ` Jaegeuk Kim
2022-01-12  0:25                   ` [f2fs-dev] " Jaegeuk Kim
2022-01-12 15:08                   ` Waiman Long
2022-01-12 15:08                     ` [f2fs-dev] " Waiman Long
2022-01-12 14:06       ` Peter Zijlstra [this message]
2022-01-12 14:06         ` Peter Zijlstra
2022-01-12 22:06         ` Tim Murray
2022-01-12 22:06           ` [f2fs-dev] " Tim Murray via Linux-f2fs-devel
2022-02-23  7:42         ` Christoph Hellwig
2022-02-23  7:42           ` [f2fs-dev] " Christoph Hellwig
2022-02-23 17:48           ` Jaegeuk Kim
2022-02-23 17:48             ` [f2fs-dev] " Jaegeuk Kim
2022-02-24  8:47 ` Hillf Danton
2022-02-25  1:47   ` Hillf Danton
2022-02-25  8:32     ` Hillf Danton
2022-02-28  1:12       ` Hillf Danton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yd7gVLdHW11TQUAi@hirez.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=boqun.feng@gmail.com \
    --cc=hch@infradead.org \
    --cc=jaegeuk@kernel.org \
    --cc=linux-f2fs-devel@lists.sourceforge.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longman@redhat.com \
    --cc=mingo@redhat.com \
    --cc=timmurray@google.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.