From: Josef Bacik <josef@toxicpanda.com>
To: viro@ZenIV.linux.org.uk
Cc: linux-fsdevel@vger.kernel.org, kernel-team@fb.com, jack@suse.cz,
linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] fs: use READ_ONCE/WRITE_ONCE with the i_size helpers
Date: Thu, 24 Oct 2019 08:08:44 -0400 [thread overview]
Message-ID: <20191024120843.4n2eh47okn4c635f@MacBook-Pro-91.local> (raw)
In-Reply-To: <20191011202050.8656-1-josef@toxicpanda.com>
On Fri, Oct 11, 2019 at 04:20:50PM -0400, Josef Bacik wrote:
> I spent the last few weeks running down a weird regression in btrfs we
> were seeing in production. It turned out to be introduced by
> 62b37622718c, which took the following
>
> loff_t isize = i_size_read(inode);
>
> actual_end = min_t(u64, isize, end + 1);
>
> and turned it into
>
> actual_end = min_t(u64, i_size_read(inode), end + 1);
>
> The problem here is that the compiler is optimizing out the temporary
> variables used in __cmp_once, so the resulting assembly looks like this
>
> 498 actual_end = min_t(u64, i_size_read(inode), end + 1);
> 0xffffffff814b08c1 <+145>: 48 8b 44 24 28 mov 0x28(%rsp),%rax
> 0xffffffff814b08c6 <+150>: 48 39 45 50 cmp %rax,0x50(%rbp)
> 0xffffffff814b08ca <+154>: 48 89 c6 mov %rax,%rsi
> 0xffffffff814b08cd <+157>: 48 0f 46 75 50 cmovbe 0x50(%rbp),%rsi
>
> as you can see we read the value of the inode to compare, and then we
> read it a second time to assign it.
>
> This code is simply an optimization, so there's no locking to keep
> i_size from changing, however we really need min_t to actually return
> the minimum value for these two values, which it is failing to do.
>
> We've reverted that patch for now to fix the problem, but it's only a
> matter of time before the compiler becomes smart enough to optimize out
> the loff_t isize intermediate variable as well.
>
> Instead we want to make it explicit that i_size_read() should only read
> the value once. This will keep this class of problem from happening in
> the future, regardless of what the compiler chooses to do. With this
> change we get the following assembly generated for this code
>
> 491 actual_end = min_t(u64, i_size_read(inode), end + 1);
> 0xffffffff8148f625 <+149>: 48 8b 44 24 20 mov 0x20(%rsp),%rax
>
> ./include/linux/compiler.h:
> 199 __READ_ONCE_SIZE;
> 0xffffffff8148f62a <+154>: 4c 8b 75 50 mov 0x50(%rbp),%r14
>
> fs/btrfs/inode.c:
> 491 actual_end = min_t(u64, i_size_read(inode), end + 1);
> 0xffffffff8148f62e <+158>: 49 39 c6 cmp %rax,%r14
> 0xffffffff8148f631 <+161>: 4c 0f 47 f0 cmova %rax,%r14
>
> and this works out properly, we only read the value once and so we won't
> trip over this problem again.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Al,
Will you pick this up, or do you want me to send it along? Thanks,
Josef
next prev parent reply other threads:[~2019-10-24 12:08 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-10-11 20:20 [PATCH] fs: use READ_ONCE/WRITE_ONCE with the i_size helpers Josef Bacik
2019-10-12 0:34 ` Rik van Riel
2019-10-14 8:51 ` Jan Kara
2019-10-14 12:02 ` David Sterba
2019-10-24 12:08 ` Josef Bacik [this message]
2019-10-29 17:22 ` David Sterba
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20191024120843.4n2eh47okn4c635f@MacBook-Pro-91.local \
--to=josef@toxicpanda.com \
--cc=jack@suse.cz \
--cc=kernel-team@fb.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=viro@ZenIV.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).