From: Omar Sandoval <osandov@osandov.com>
To: Nikolay Borisov <nborisov@suse.com>
Cc: linux-btrfs@vger.kernel.org, kernel-team@fb.com,
Tejun Heo <tj@kernel.org>
Subject: Re: [PATCH] Btrfs: fix workqueue deadlock on dependent filesystems
Date: Wed, 7 Aug 2019 10:08:57 -0700 [thread overview]
Message-ID: <20190807170857.GA18948@vader> (raw)
In-Reply-To: <70f6b4fa-26b1-9225-7509-aa89bb7e067c@suse.com>
On Wed, Aug 07, 2019 at 10:17:26AM +0300, Nikolay Borisov wrote:
>
>
> On 6.08.19 г. 20:34 ч., Omar Sandoval wrote:
> > From: Omar Sandoval <osandov@fb.com>
> >
> > We hit a the following very strange deadlock on a system with Btrfs on a
> > loop device backed by another Btrfs filesystem:
> >
> > 1. The top (loop device) filesystem queues an async_cow work item from
> > cow_file_range_async(). We'll call this work X.
> > 2. Worker thread A starts work X (normal_work_helper()).
> > 3. Worker thread A executes the ordered work for the top filesystem
> > (run_ordered_work()).
> > 4. Worker thread A finishes the ordered work for work X and frees X
> > (work->ordered_free()).
> > 5. Worker thread A executes another ordered work and gets blocked on I/O
> > to the bottom filesystem (still in run_ordered_work()).
> > 6. Meanwhile, the bottom filesystem allocates and queues an async_cow
> > work item which happens to be the recently-freed X.
> > 7. The workqueue code sees that X is already being executed by worker
> > thread A, so it schedules X to be executed _after_ worker thread A
> > finishes (see the find_worker_executing_work() call in
> > process_one_work()).
>
> Isn't the bigger problem that a single run_ordered_work could
> potentially run the ordered work for more than one normal work? E.g.
> what if btrfs' code is reworked such that run_ordered_work executes
> ordered_func for just one work item (the one which called the function
> in the first place) ? Wouldn't that also resolve the issue? Correct me
> if I'm wrong but it seems silly to have one work item outlive
> ordered_free which is what currently happens, right?
We can't always run the ordered work for the normal work because then it
wouldn't be ordered :) If work item N completes before item N-1, then we
can't run the ordered work for N yet. Then, when N-1 completes, we need
to do the ordered work for N-1 _and_ N, which is how we get in this
situation.
next prev parent reply other threads:[~2019-08-07 17:09 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-08-06 17:34 [PATCH] Btrfs: fix workqueue deadlock on dependent filesystems Omar Sandoval
2019-08-07 7:17 ` Nikolay Borisov
2019-08-07 17:08 ` Omar Sandoval [this message]
2019-08-12 11:38 ` Filipe Manana
2019-08-12 18:48 ` Omar Sandoval
2019-08-12 18:53 ` Filipe Manana
2019-08-19 16:37 ` David Sterba
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190807170857.GA18948@vader \
--to=osandov@osandov.com \
--cc=kernel-team@fb.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=nborisov@suse.com \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).