From: Krzysztof Kozlowski <krzk@kernel.org>
To: Jeff Layton <jlayton@kernel.org>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
viro@zeniv.linux.org.uk, linux-nfs@vger.kernel.org,
bfields@fieldses.org, neilb@suse.de, jack@suse.de,
linux-ext4@vger.kernel.org, tytso@mit.edu,
adilger.kernel@dilger.ca, linux-xfs@vger.kernel.org,
darrick.wong@oracle.com, david@fromorbit.com,
linux-btrfs@vger.kernel.org, clm@fb.com, jbacik@fb.com,
dsterba@suse.com, linux-integrity@vger.kernel.org,
zohar@linux.vnet.ibm.com, dmitry.kasatkin@gmail.com,
linux-afs@lists.infradead.org, dhowells@redhat.com,
jaltman@auristor.com, linux-samsung-soc@vger.kernel.org,
"Marek Szyprowski" <m.szyprowski@samsung.com>,
"Bartłomiej Żołnierkiewicz" <b.zolnierkie@samsung.com>,
"Sylwester Nawrocki" <s.nawrocki@samsung.com>
Subject: Re: [PATCH v4 19/19] fs: handle inode->i_version more efficiently
Date: Sun, 7 Jan 2018 13:44:52 +0100 [thread overview]
Message-ID: <CAJKOXPd8ojTjiGF3nKQKWtVrEsjWfjvg6rLL+RGnbu_SZMDu7g@mail.gmail.com> (raw)
In-Reply-To: <20171222120556.7435-20-jlayton@kernel.org>
On Fri, Dec 22, 2017 at 1:05 PM, Jeff Layton <jlayton@kernel.org> wrote:
> From: Jeff Layton <jlayton@redhat.com>
>
> Since i_version is mostly treated as an opaque value, we can exploit that
> fact to avoid incrementing it when no one is watching. With that change,
> we can avoid incrementing the counter on writes, unless someone has
> queried for it since it was last incremented. If the a/c/mtime don't
> change, and the i_version hasn't changed, then there's no need to dirty
> the inode metadata on a write.
>
> Convert the i_version counter to an atomic64_t, and use the lowest order
> bit to hold a flag that will tell whether anyone has queried the value
> since it was last incremented.
>
> When we go to maybe increment it, we fetch the value and check the flag
> bit. If it's clear then we don't need to do anything if the update
> isn't being forced.
>
> If we do need to update, then we increment the counter by 2, and clear
> the flag bit, and then use a CAS op to swap it into place. If that
> works, we return true. If it doesn't then do it again with the value
> that we fetch from the CAS operation.
>
> On the query side, if the flag is already set, then we just shift the
> value down by 1 bit and return it. Otherwise, we set the flag in our
> on-stack value and again use cmpxchg to swap it into place if it hasn't
> changed. If it has, then we use the value from the cmpxchg as the new
> "old" value and try again.
>
> This method allows us to avoid incrementing the counter on writes (and
> dirtying the metadata) under typical workloads. We only need to increment
> if it has been queried since it was last changed.
>
> Signed-off-by: Jeff Layton <jlayton@redhat.com>
> ---
> include/linux/fs.h | 2 +-
> include/linux/iversion.h | 208 ++++++++++++++++++++++++++++++++++-------------
> 2 files changed, 154 insertions(+), 56 deletions(-)
>
Hi,
On recent linux-next my ARM/Exynos boards fail to boot over nfsroot.
This commit popped up through bisect (log at the end). Systemd
timeouts on some device-specific services, including mounting ext4
/home:
[ *** ] (1 of 4) A start job is running for…ress polling (1min 41s / no limit)
[ TIME ] Timed out waiting for device dev-ttySAC2.device.
Jan 07 13:29:38 [DEPEND] Dependency failed for Serial Getty on ttySAC2.
Jan 07 13:29:38 [ TIME ] Timed out waiting for device
dev-disk-by\x2dlabel-home.device.
Jan 07 13:29:38 [DEPEND] Dependency failed for /home.
Jan 07 13:29:38 [DEPEND] Dependency failed for Local File Systems.
Jan 07 13:29:38 [DEPEND] Dependency failed for File System Check on
/dev/disk/by-label/home.
Jan 07 13:30:02 [ *** ] (1 of 2) A start job is running for…ress
polling (1min 53s / no limit)
Kernel command line:
console=tty1 console=ttySAC2,115200n8
ip=192.168.1.11:192.168.1.10:192.168.1.1:255.255.255.0::eth0:none
nfsrootdebug root=/dev/nfs
nfsroot=192.168.1.10:/srv/nfs/odroidxu3,vers=4,nolock rootwait rw
smsc95xx.macaddr=00:1e:06:61:7a:93 no_console_suspend
/home is /dev/mmcblk1p2:
kozik@odroidxu3:~$ tune2fs -l /dev/mmcblk1p2
tune2fs 1.43.7 (16-Oct-2017)
Filesystem volume name: home
Last mounted on: /home
Filesystem UUID: 3f9dbeba-2738-45d3-807e-c1b2e21128ed
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index
filetype needs_recovery extent flex_bg sparse_super large_file
uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 1430800
Block count: 5717760
Reserved block count: 285888
Free blocks: 5467576
Free inodes: 1428301
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 1022
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8176
Inode blocks per group: 511
Flex block group size: 16
Filesystem created: Thu May 21 12:17:05 2015
Last mount time: Thu Dec 21 13:31:26 2017
Last write time: Thu Dec 21 13:31:26 2017
Mount count: 1
Maximum mount count: -1
Last checked: Thu Dec 21 13:31:25 2017
Check interval: 0 (<none>)
Lifetime writes: 126 GB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: 42e17e23-86b2-4356-ad63-78aa51651d03
Journal backup: inode blocks
Full dmesg log:
http://www.krzk.eu/#/builders/1/builds/1258/steps/10/logs/serial0
The regular boot from rootfs on SD card also fails - but without any
serial console logs (just "Starting kernel...") so the real cause is
unknown.
Any hints?
Best regards,
Krzysztof
bisect log:
# bad: [73005e1a35fd67c644b0645c9e4c1efabd0fe62c] Add linux-next
specific files for 20180103
# good: [30a7acd573899fd8b8ac39236eff6468b195ac7d] Linux 4.15-rc6
git bisect start 'next/master' 'next/stable'
# bad: [c1d290f9ce8daa2b0a79d2fe48c1b7c3c5370f5a] Merge
remote-tracking branch 'crypto/master'
git bisect bad c1d290f9ce8daa2b0a79d2fe48c1b7c3c5370f5a
# bad: [55695d94f0915121d106cd2d1ab94983a32f3e9a] Merge
remote-tracking branch 'hid/for-next'
git bisect bad 55695d94f0915121d106cd2d1ab94983a32f3e9a
# good: [cffae1eead0dd91be1a3069a8348127bb00158f3] Merge
remote-tracking branch 'realtek/for-next'
git bisect good cffae1eead0dd91be1a3069a8348127bb00158f3
# good: [5f889f1176dc99636c6bf8af7c286decc888c007] Merge
remote-tracking branch 'btrfs/next'
git bisect good 5f889f1176dc99636c6bf8af7c286decc888c007
# good: [984c35877f36bee305e43a1c58176169854d85cf] Merge
remote-tracking branch 'xfs/for-next'
git bisect good 984c35877f36bee305e43a1c58176169854d85cf
# bad: [f9fec502daea2a869232b6dff33ba3de79dd0d61] Merge
remote-tracking branch 'printk/for-next'
git bisect bad f9fec502daea2a869232b6dff33ba3de79dd0d61
# good: [c71d227fc4133f949dae620ed5e3a250b43f2415] make kernel-side
POLL... arch-independent
git bisect good c71d227fc4133f949dae620ed5e3a250b43f2415
# good: [416d20e8c31107f5dfd45d1d80d1e6c8e4871180] Merge branches
'work.get_user_pages_fast', 'work.wmci', 'work.sock_recvmsg',
'misc.netdrv', 'misc.poll', 'work.mqueue', 'work.whack-a-mole' and
'work.misc' into for-next
git bisect good 416d20e8c31107f5dfd45d1d80d1e6c8e4871180
# good: [325a1de4a691512a48c1426b943a7b0b9f8d6744] xfs: convert to new
i_version API
git bisect good 325a1de4a691512a48c1426b943a7b0b9f8d6744
# good: [a94fe10fb114c169e7ddaecd8251521886409121] checkpatch: add
pF/pf deprecation warning
git bisect good a94fe10fb114c169e7ddaecd8251521886409121
# good: [6b3911dffd1184fdcd63299a5fee59ac000f2067] btrfs: only dirty
the inode in btrfs_update_time if something was changed
git bisect good 6b3911dffd1184fdcd63299a5fee59ac000f2067
# bad: [448f8c749a7a0ae03505823910ec45a112678048] Merge
remote-tracking branch 'iversion/iversion-next'
git bisect bad 448f8c749a7a0ae03505823910ec45a112678048
# bad: [8618bff776758ebff5b55211e7b5a60a0fc119a5] fs: handle
inode->i_version more efficiently
git bisect bad 8618bff776758ebff5b55211e7b5a60a0fc119a5
# first bad commit: [8618bff776758ebff5b55211e7b5a60a0fc119a5] fs:
handle inode->i_version more efficiently
next prev parent reply other threads:[~2018-01-07 12:45 UTC|newest]
Thread overview: 51+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-12-22 12:05 [PATCH v4 00/19] fs: rework and optimize i_version handling in filesystems Jeff Layton
2017-12-22 12:05 ` [PATCH v4 01/19] fs: new API for handling inode->i_version Jeff Layton
2017-12-22 23:14 ` NeilBrown
2017-12-22 23:54 ` Jeff Layton
2018-01-02 17:01 ` Jan Kara
2017-12-25 14:50 ` Jeff Layton
2017-12-22 12:05 ` [PATCH v4 02/19] fs: don't take the i_lock in inode_inc_iversion Jeff Layton
2017-12-22 12:05 ` [PATCH v4 03/19] fat: convert to new i_version API Jeff Layton
2017-12-22 12:05 ` [PATCH v4 04/19] affs: " Jeff Layton
2017-12-22 12:05 ` [PATCH v4 05/19] afs: " Jeff Layton
2017-12-22 12:05 ` [PATCH v4 06/19] btrfs: " Jeff Layton
2018-01-08 17:59 ` David Sterba
2017-12-22 12:05 ` [PATCH v4 07/19] exofs: switch " Jeff Layton
2017-12-22 12:05 ` [PATCH v4 08/19] ext2: convert " Jeff Layton
2017-12-22 12:05 ` [PATCH v4 09/19] ext4: " Jeff Layton
2017-12-22 12:05 ` [PATCH v4 10/19] nfs: " Jeff Layton
2017-12-22 12:05 ` [PATCH v4 11/19] nfsd: " Jeff Layton
2017-12-22 12:05 ` [PATCH v4 12/19] ocfs2: " Jeff Layton
2018-01-04 13:34 ` Jeff Layton
2017-12-22 12:05 ` [PATCH v4 13/19] ufs: use " Jeff Layton
2017-12-22 12:05 ` [PATCH v4 14/19] xfs: convert to " Jeff Layton
2017-12-23 0:05 ` Darrick J. Wong
2017-12-22 12:05 ` [PATCH v4 15/19] IMA: switch IMA over " Jeff Layton
2017-12-22 12:05 ` [PATCH v4 16/19] fs: only set S_VERSION when updating times if necessary Jeff Layton
2018-01-02 16:50 ` Jan Kara
2018-01-02 19:03 ` Jeff Layton
2017-12-22 12:05 ` [PATCH v4 17/19] xfs: avoid setting XFS_ILOG_CORE if i_version doesn't need incrementing Jeff Layton
2017-12-23 0:07 ` Darrick J. Wong
2017-12-22 12:05 ` [PATCH v4 18/19] btrfs: only dirty the inode in btrfs_update_time if something was changed Jeff Layton
2018-01-08 17:59 ` David Sterba
2017-12-22 12:05 ` [PATCH v4 19/19] fs: handle inode->i_version more efficiently Jeff Layton
2018-01-02 17:00 ` Jan Kara
2018-01-07 12:44 ` Krzysztof Kozlowski [this message]
2018-01-08 12:56 ` Jeff Layton
2018-01-08 13:21 ` Krzysztof Kozlowski
2018-01-08 13:29 ` Jeff Layton
2018-01-08 17:29 ` Krzysztof Kozlowski
2018-01-08 18:00 ` Jeff Layton
2018-01-08 18:33 ` Krzysztof Kozlowski
2018-01-08 19:15 ` Jeff Layton
2018-01-08 20:05 ` Jeff Layton
2018-01-08 20:17 ` Krzysztof Kozlowski
2018-01-08 21:39 ` Jeff Layton
2018-01-09 9:27 ` Krzysztof Kozlowski
2018-01-08 13:30 ` Matthew Wilcox
2018-01-08 13:46 ` Jeff Layton
2017-12-22 14:43 ` (Lack of) i_version handling in udf Steve Magnani
2017-12-22 15:54 ` Jeff Layton
2018-01-02 17:20 ` [PATCH v4 05/19] afs: convert to new i_version API David Howells
2018-01-02 18:57 ` Jeff Layton
2018-01-03 16:28 ` David Howells
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAJKOXPd8ojTjiGF3nKQKWtVrEsjWfjvg6rLL+RGnbu_SZMDu7g@mail.gmail.com \
--to=krzk@kernel.org \
--cc=adilger.kernel@dilger.ca \
--cc=b.zolnierkie@samsung.com \
--cc=bfields@fieldses.org \
--cc=clm@fb.com \
--cc=darrick.wong@oracle.com \
--cc=david@fromorbit.com \
--cc=dhowells@redhat.com \
--cc=dmitry.kasatkin@gmail.com \
--cc=dsterba@suse.com \
--cc=jack@suse.de \
--cc=jaltman@auristor.com \
--cc=jbacik@fb.com \
--cc=jlayton@kernel.org \
--cc=linux-afs@lists.infradead.org \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-integrity@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=linux-samsung-soc@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=m.szyprowski@samsung.com \
--cc=neilb@suse.de \
--cc=s.nawrocki@samsung.com \
--cc=tytso@mit.edu \
--cc=viro@zeniv.linux.org.uk \
--cc=zohar@linux.vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).