From: "Darrick J. Wong" <darrick.wong@oracle.com> To: Eric Biggers <ebiggers@kernel.org> Cc: Dave Chinner <david@fromorbit.com>, linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org Subject: Re: [PATCH] fs/direct-io: avoid data race on ->s_dio_done_wq Date: Wed, 15 Jul 2020 09:41:18 -0700 [thread overview] Message-ID: <20200715164118.GB848607@magnolia> (raw) In-Reply-To: <20200715161342.GA1167@sol.localdomain> On Wed, Jul 15, 2020 at 09:13:42AM -0700, Eric Biggers wrote: > On Wed, Jul 15, 2020 at 06:01:44PM +1000, Dave Chinner wrote: > > > > > /* direct-io.c: */ > > > > > -int sb_init_dio_done_wq(struct super_block *sb); > > > > > +int __sb_init_dio_done_wq(struct super_block *sb); > > > > > +static inline int sb_init_dio_done_wq(struct super_block *sb) > > > > > +{ > > > > > + /* pairs with cmpxchg() in __sb_init_dio_done_wq() */ > > > > > + if (likely(READ_ONCE(sb->s_dio_done_wq))) > > > > > + return 0; > > > > > + return __sb_init_dio_done_wq(sb); > > > > > +} > > > > > > > > Ummm, why don't you just add this check in sb_init_dio_done_wq(). I > > > > don't see any need for adding another level of function call > > > > abstraction in the source code? > > > > > > This keeps the fast path doing no function calls and one fewer branch, as it was > > > before. People care a lot about minimizing direct I/O overhead, so it seems > > > desirable to keep this simple optimization. Would you rather it be removed? > > > > No. > > > > What I'm trying to say is that I'd prefer fast path checks don't get > > hidden away in a static inline function wrappers that require the > > reader to go look up code in a different file to understand that > > code in yet another different file is conditionally executed. > > > > Going from obvious, easy to read fast path code to spreading the > > fast path logic over functions in 3 different files is not an > > improvement in the code - it is how we turn good code into an > > unmaintainable mess... > > The alternative would be to duplicate the READ_ONCE() at all 3 call sites -- > including the explanatory comment. That seems strictly worse. > > And the code before was broken, so I disagree it was "obvious" or "good". > > > > > > > Also, you need to explain the reason for the READ_ONCE() existing > > > > rather than just saying "it pairs with <some other operation>". > > > > Knowing what operation it pairs with doesn't explain why the pairing > > > > is necessary in the first place, and that leads to nobody reading > > > > the code being able to understand what this is protecting against. > > > > > > > > > > How about this? > > > > > > /* > > > * Nothing to do if ->s_dio_done_wq is already set. But since another > > > * process may set it concurrently, we need to use READ_ONCE() rather > > > * than a plain read to avoid a data race (undefined behavior) and to > > > * ensure we observe the pointed-to struct to be fully initialized. > > > */ > > > if (likely(READ_ONCE(sb->s_dio_done_wq))) > > > return 0; > > > > You still need to document what it pairs with, as "data race" doesn't > > describe the actual dependency we are synchronising against is. > > > > AFAICT from your description, the data race is not on > > sb->s_dio_done_wq itself, but on seeing the contents of the > > structure being pointed to incorrectly. i.e. we need to ensure that > > writes done before the cmpxchg are ordered correctly against > > reads done after the pointer can be seen here. > > > > No, the data race is on ->s_dio_done_wq itself. How about this: > > /* > * Nothing to do if ->s_dio_done_wq is already set. The READ_ONCE() > * here pairs with the cmpxchg() in __sb_init_dio_done_wq(). Since the > * cmpxchg() may set ->s_dio_done_wq concurrently, a plain load would be > * a data race (undefined behavior), so READ_ONCE() is needed. > * READ_ONCE() also includes any needed read data dependency barrier to > * ensure that the pointed-to struct is seen to be fully initialized. > */ > > FWIW, long-term we really need to get developers to understand these sorts of > issues, so that the code is written correctly in the first place and we don't > need to annotate common patterns like one-time-init with a long essay and have a > long discussion. Recently KCSAN was merged upstream > (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/dev-tools/kcsan.rst) > and the memory model documentation was improved > (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/memory-model/Documentation/explanation.txt?h=v5.8-rc5#n1922), > so hopefully that will raise awareness... I tried to understand that, but TBH this whole topic area seems very complex and difficult to understand. I more or less understand what READ_ONCE and WRITE_ONCE do wrt restricting compiler optimizations, but I wouldn't say that I understand all that machinery. Granted, my winning strategy so far is to write a simple version with big dumb locks and let the rest of you argue over slick optimizations. :P <shrug> If using READ_ONCE and cmpxchg for pointer initialization (or I guess smp_store_release and smp_load_acquire?) are a commonly used paradigm, then maybe that should get its own section in tools/memory-model/Documentation/recipes.txt and then any code that uses it can point readers at that? --D > > If so, can't we just treat this as a normal > > store-release/load-acquire ordering pattern and hence use more > > relaxed memory barriers instead of have to patch up what we have now > > to specifically make ancient platforms that nobody actually uses > > with weird and unusual memory models work correctly? > > READ_ONCE() is already as relaxed as it can get, as it includes a read data > dependency barrier only (which is no-op on everything other than Alpha). > > If anything it should be upgraded to smp_load_acquire(), which handles control > dependencies too. I didn't see anything obvious in the workqueue code that > would need that (i.e. accesses to some global structure that isn't transitively > reachable via the workqueue_struct itself). But we could use it to be safe if > we're okay with any performance implications of the additional memory barrier it > would add. > > - Eric
next prev parent reply other threads:[~2020-07-15 16:41 UTC|newest] Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-07-13 3:33 Eric Biggers 2020-07-15 1:30 ` Dave Chinner 2020-07-15 2:37 ` Eric Biggers 2020-07-15 8:01 ` Dave Chinner 2020-07-15 16:13 ` Eric Biggers 2020-07-15 16:41 ` Darrick J. Wong [this message] 2020-07-16 1:46 ` Dave Chinner 2020-07-16 2:39 ` Eric Biggers 2020-07-16 2:47 ` Matthew Wilcox 2020-07-16 3:19 ` Eric Biggers 2020-07-16 5:33 ` Eric Biggers 2020-07-16 8:16 ` Dave Chinner 2020-07-17 1:36 ` Darrick J. Wong
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20200715164118.GB848607@magnolia \ --to=darrick.wong@oracle.com \ --cc=david@fromorbit.com \ --cc=ebiggers@kernel.org \ --cc=linux-ext4@vger.kernel.org \ --cc=linux-fsdevel@vger.kernel.org \ --cc=linux-xfs@vger.kernel.org \ --subject='Re: [PATCH] fs/direct-io: avoid data race on ->s_dio_done_wq' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).