From: Eric Biggers <ebiggers@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org,
linux-ext4@vger.kernel.org
Subject: Re: [PATCH] fs/direct-io: avoid data race on ->s_dio_done_wq
Date: Tue, 14 Jul 2020 19:37:14 -0700 [thread overview]
Message-ID: <20200715023714.GA38091@sol.localdomain> (raw)
In-Reply-To: <20200715013008.GD2005@dread.disaster.area>
On Wed, Jul 15, 2020 at 11:30:08AM +1000, Dave Chinner wrote:
> On Sun, Jul 12, 2020 at 08:33:30PM -0700, Eric Biggers wrote:
> > From: Eric Biggers <ebiggers@google.com>
> >
> > Fix the preliminary checks for ->s_dio_done_wq to use READ_ONCE(), since
> > it's a data race, and technically the behavior is undefined without
> > READ_ONCE(). Also, on one CPU architecture (Alpha), the data read
> > dependency barrier included in READ_ONCE() is needed to guarantee that
> > the pointed-to struct is seen as fully initialized.
> >
> > Signed-off-by: Eric Biggers <ebiggers@google.com>
> > ---
> > fs/direct-io.c | 8 +++-----
> > fs/internal.h | 9 ++++++++-
> > fs/iomap/direct-io.c | 3 +--
> > 3 files changed, 12 insertions(+), 8 deletions(-)
> >
> > diff --git a/fs/direct-io.c b/fs/direct-io.c
> > index 6d5370eac2a8..26221ae24156 100644
> > --- a/fs/direct-io.c
> > +++ b/fs/direct-io.c
> > @@ -590,7 +590,7 @@ static inline int dio_bio_reap(struct dio *dio, struct dio_submit *sdio)
> > * filesystems that don't need it and also allows us to create the workqueue
> > * late enough so the we can include s_id in the name of the workqueue.
> > */
> > -int sb_init_dio_done_wq(struct super_block *sb)
> > +int __sb_init_dio_done_wq(struct super_block *sb)
> > {
> > struct workqueue_struct *old;
> > struct workqueue_struct *wq = alloc_workqueue("dio/%s",
> > @@ -615,9 +615,7 @@ static int dio_set_defer_completion(struct dio *dio)
> > if (dio->defer_completion)
> > return 0;
> > dio->defer_completion = true;
> > - if (!sb->s_dio_done_wq)
> > - return sb_init_dio_done_wq(sb);
> > - return 0;
> > + return sb_init_dio_done_wq(sb);
> > }
> >
> > /*
> > @@ -1250,7 +1248,7 @@ do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
> > retval = 0;
> > if (iocb->ki_flags & IOCB_DSYNC)
> > retval = dio_set_defer_completion(dio);
> > - else if (!dio->inode->i_sb->s_dio_done_wq) {
> > + else {
> > /*
> > * In case of AIO write racing with buffered read we
> > * need to defer completion. We can't decide this now,
> > diff --git a/fs/internal.h b/fs/internal.h
> > index 9b863a7bd708..6736c9eee978 100644
> > --- a/fs/internal.h
> > +++ b/fs/internal.h
> > @@ -178,7 +178,14 @@ extern void mnt_pin_kill(struct mount *m);
> > extern const struct dentry_operations ns_dentry_operations;
> >
> > /* direct-io.c: */
> > -int sb_init_dio_done_wq(struct super_block *sb);
> > +int __sb_init_dio_done_wq(struct super_block *sb);
> > +static inline int sb_init_dio_done_wq(struct super_block *sb)
> > +{
> > + /* pairs with cmpxchg() in __sb_init_dio_done_wq() */
> > + if (likely(READ_ONCE(sb->s_dio_done_wq)))
> > + return 0;
> > + return __sb_init_dio_done_wq(sb);
> > +}
>
> Ummm, why don't you just add this check in sb_init_dio_done_wq(). I
> don't see any need for adding another level of function call
> abstraction in the source code?
This keeps the fast path doing no function calls and one fewer branch, as it was
before. People care a lot about minimizing direct I/O overhead, so it seems
desirable to keep this simple optimization. Would you rather it be removed?
>
> Also, you need to explain the reason for the READ_ONCE() existing
> rather than just saying "it pairs with <some other operation>".
> Knowing what operation it pairs with doesn't explain why the pairing
> is necessary in the first place, and that leads to nobody reading
> the code being able to understand what this is protecting against.
>
How about this?
/*
* Nothing to do if ->s_dio_done_wq is already set. But since another
* process may set it concurrently, we need to use READ_ONCE() rather
* than a plain read to avoid a data race (undefined behavior) and to
* ensure we observe the pointed-to struct to be fully initialized.
*/
if (likely(READ_ONCE(sb->s_dio_done_wq)))
return 0;
next prev parent reply other threads:[~2020-07-15 2:37 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-07-13 3:33 [PATCH] fs/direct-io: avoid data race on ->s_dio_done_wq Eric Biggers
2020-07-15 1:30 ` Dave Chinner
2020-07-15 2:37 ` Eric Biggers [this message]
2020-07-15 8:01 ` Dave Chinner
2020-07-15 16:13 ` Eric Biggers
2020-07-15 16:41 ` Darrick J. Wong
2020-07-16 1:46 ` Dave Chinner
2020-07-16 2:39 ` Eric Biggers
2020-07-16 2:47 ` Matthew Wilcox
2020-07-16 3:19 ` Eric Biggers
2020-07-16 5:33 ` Eric Biggers
2020-07-16 8:16 ` Dave Chinner
2020-07-17 1:36 ` Darrick J. Wong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200715023714.GA38091@sol.localdomain \
--to=ebiggers@kernel.org \
--cc=david@fromorbit.com \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).