From: Mikulas Patocka <mpatocka@redhat.com>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Dan Williams <dan.j.williams@intel.com>,
Vishal Verma <vishal.l.verma@intel.com>,
Dave Jiang <dave.jiang@intel.com>,
Ira Weiny <ira.weiny@intel.com>,
Matthew Wilcox <willy@infradead.org>, Jan Kara <jack@suse.cz>,
Steven Whitehouse <swhiteho@redhat.com>,
Eric Sandeen <esandeen@redhat.com>,
Dave Chinner <dchinner@redhat.com>,
"Theodore Ts'o" <tytso@mit.edu>,
Wang Jianchao <jianchao.wan9@gmail.com>,
"Kani, Toshi" <toshi.kani@hpe.com>,
"Norton, Scott J" <scott.norton@hpe.com>,
"Tadakamadla, Rajesh" <rajesh.tadakamadla@hpe.com>,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-nvdimm@lists.01.org
Subject: Re: [RFC v2] nvfs: a filesystem for persistent memory
Date: Mon, 11 Jan 2021 06:41:36 -0500 (EST)
Message-ID: <alpine.LRH.2.02.2101110631330.4356@file01.intranet.prod.int.rdu2.redhat.com>
In-Reply-To: <20210110234042.GX3579531@ZenIV.linux.org.uk>

On Sun, 10 Jan 2021, Al Viro wrote:
> On Sun, Jan 10, 2021 at 04:14:55PM -0500, Mikulas Patocka wrote:
>
> > That's a good point. I split nvfs_rw_iter to separate functions
> > nvfs_read_iter and nvfs_write_iter - and inlined nvfs_rw_iter_locked into
> > both of them. It improved performance by 1.3%.
> >
> > > Not that it had been more useful on the write side, really,
> > > but that's another story (nvfs_write_pages() handling of
> > > copyin is... interesting). Let's figure out what's going
> > > on with the read overhead first...
> > >
> > > lib/iov_iter.c primitives certainly could use massage for
> > > better code generation, but let's find out how much of the
> > > PITA is due to those and how much comes from you fighting
> > > the damn thing instead of using it sanely...
> >
> > The results are:
> >
> > read: 6.744s
> > read_iter: 7.417s
> > read_iter - separate read and write path: 7.321s
> > Al's read_iter: 7.182s
> > Al's read_iter with _copy_to_iter: 7.181s
>
> So
> * overhead of hardening stuff is noise here
> * switching to more straightforward ->read_iter() cuts
> the overhead by about 1/3.
>
> Interesting... I wonder how much of that is spent in
> iterate_and_advance() glue inside copy_to_iter() here. There's
> certainly quite a bit of optimizations possible in those
> primitives and your usecase makes a decent test for that...
>
> Could you profile that and see where it is spending
> the time, on instruction level?
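
As a sanity check on the "about 1/3" estimate, the arithmetic can be redone
from the five timings quoted above. This is only a sketch: the numbers are
copied from the results list, while the dictionary keys and the helper names
(overhead, overhead_reduction) are made up for illustration.

```python
# Wall-clock timings in seconds, copied from the results quoted above.
timings = {
    "read": 6.744,
    "read_iter": 7.417,
    "read_iter, separate read and write path": 7.321,
    "Al's read_iter": 7.182,
    "Al's read_iter with _copy_to_iter": 7.181,
}

# The plain ->read method is the baseline every variant is compared against.
baseline = timings["read"]


def overhead(name):
    """Extra seconds a variant takes compared with the plain ->read method."""
    return timings[name] - baseline


def overhead_reduction(old, new):
    """Fraction of old's overhead over ->read that new eliminates."""
    return (overhead(old) - overhead(new)) / overhead(old)


if __name__ == "__main__":
    for name in timings:
        print(f"{name}: +{overhead(name):.3f}s over read")
    cut = overhead_reduction("read_iter", "Al's read_iter")
    print(f"overhead cut by Al's read_iter: {cut:.1%}")
```

With these numbers the original read_iter costs 0.673s more than read and
Al's version costs 0.438s more, which is roughly 35% less overhead. The
_copy_to_iter variant differs from Al's read_iter by 0.001s.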

This is the read method profile:
time 9.056s
52.69% pread [kernel.vmlinux] [k] copy_user_generic_string
6.24% pread [kernel.vmlinux] [k] current_time
6.22% pread [kernel.vmlinux] [k] entry_SYSCALL_64
4.88% pread libc-2.31.so [.] __libc_pread
3.75% pread [kernel.vmlinux] [k] syscall_return_via_sysret
3.63% pread [nvfs] [k] nvfs_read
2.83% pread [nvfs] [k] nvfs_bmap
2.81% pread [kernel.vmlinux] [k] vfs_read
2.63% pread [kernel.vmlinux] [k] __x64_sys_pread64
2.27% pread [kernel.vmlinux] [k] __fsnotify_parent
2.19% pread [kernel.vmlinux] [k] entry_SYSCALL_64_after_hwframe
1.55% pread [kernel.vmlinux] [k] atime_needs_update
1.17% pread [kernel.vmlinux] [k] syscall_enter_from_user_mode
1.15% pread [kernel.vmlinux] [k] touch_atime
0.84% pread [kernel.vmlinux] [k] down_read
0.82% pread [kernel.vmlinux] [k] syscall_exit_to_user_mode
0.71% pread [kernel.vmlinux] [k] do_syscall_64
0.68% pread [kernel.vmlinux] [k] ktime_get_coarse_real_ts64
0.66% pread [kernel.vmlinux] [k] __fget_light
0.53% pread [kernel.vmlinux] [k] exit_to_user_mode_prepare
0.45% pread [kernel.vmlinux] [k] up_read
0.44% pread pread [.] main
0.44% pread [kernel.vmlinux] [k] syscall_exit_to_user_mode_prepare
0.26% pread [kernel.vmlinux] [k] entry_SYSCALL_64_safe_stack
0.12% pread pread [.] pread@plt
0.07% pread [kernel.vmlinux] [k] __fdget
0.00% perf [kernel.vmlinux] [k] x86_pmu_enable_all

This is the profile of "read_iter - separate read and write path":
time 10.058s
53.05% pread [kernel.vmlinux] [k] copy_user_generic_string
6.82% pread [kernel.vmlinux] [k] current_time
6.27% pread [nvfs] [k] nvfs_read_iter
4.70% pread [kernel.vmlinux] [k] entry_SYSCALL_64
3.20% pread libc-2.31.so [.] __libc_pread
2.77% pread [kernel.vmlinux] [k] syscall_return_via_sysret
2.31% pread [kernel.vmlinux] [k] vfs_read
2.15% pread [kernel.vmlinux] [k] new_sync_read
2.06% pread [kernel.vmlinux] [k] __fsnotify_parent
2.02% pread [nvfs] [k] nvfs_bmap
1.87% pread [kernel.vmlinux] [k] entry_SYSCALL_64_after_hwframe
1.86% pread [kernel.vmlinux] [k] iov_iter_advance
1.62% pread [kernel.vmlinux] [k] __x64_sys_pread64
1.40% pread [kernel.vmlinux] [k] atime_needs_update
0.99% pread [kernel.vmlinux] [k] syscall_enter_from_user_mode
0.85% pread [kernel.vmlinux] [k] touch_atime
0.85% pread [kernel.vmlinux] [k] down_read
0.84% pread [kernel.vmlinux] [k] syscall_exit_to_user_mode
0.78% pread [kernel.vmlinux] [k] ktime_get_coarse_real_ts64
0.65% pread [kernel.vmlinux] [k] __fget_light
0.57% pread [kernel.vmlinux] [k] exit_to_user_mode_prepare
0.53% pread [kernel.vmlinux] [k] syscall_exit_to_user_mode_prepare
0.45% pread pread [.] main
0.43% pread [kernel.vmlinux] [k] up_read
0.43% pread [kernel.vmlinux] [k] do_syscall_64
0.28% pread [kernel.vmlinux] [k] iov_iter_init
0.16% pread [kernel.vmlinux] [k] entry_SYSCALL_64_safe_stack
0.09% pread pread [.] pread@plt
0.03% pread [kernel.vmlinux] [k] __fdget
0.00% pread [kernel.vmlinux] [k] update_rt_rq_load_avg
0.00% perf [kernel.vmlinux] [k] x86_pmu_enable_all

This is your read_iter_locked profile (read_iter_locked is inlined into
nvfs_read_iter):
time 10.056s
50.71% pread [kernel.vmlinux] [k] copy_user_generic_string
6.95% pread [kernel.vmlinux] [k] current_time
5.22% pread [kernel.vmlinux] [k] entry_SYSCALL_64
4.29% pread libc-2.31.so [.] __libc_pread
4.17% pread [nvfs] [k] nvfs_read_iter
3.20% pread [kernel.vmlinux] [k] syscall_return_via_sysret
2.66% pread [kernel.vmlinux] [k] _copy_to_iter
2.44% pread [kernel.vmlinux] [k] __x64_sys_pread64
2.38% pread [kernel.vmlinux] [k] new_sync_read
2.37% pread [kernel.vmlinux] [k] entry_SYSCALL_64_after_hwframe
2.26% pread [kernel.vmlinux] [k] vfs_read
2.02% pread [nvfs] [k] nvfs_bmap
1.88% pread [kernel.vmlinux] [k] __fsnotify_parent
1.46% pread [kernel.vmlinux] [k] atime_needs_update
1.08% pread [kernel.vmlinux] [k] touch_atime
0.83% pread [kernel.vmlinux] [k] syscall_exit_to_user_mode
0.82% pread [kernel.vmlinux] [k] syscall_enter_from_user_mode
0.75% pread [kernel.vmlinux] [k] syscall_exit_to_user_mode_prepare
0.73% pread [kernel.vmlinux] [k] __fget_light
0.65% pread [kernel.vmlinux] [k] down_read
0.58% pread pread [.] main
0.58% pread [kernel.vmlinux] [k] exit_to_user_mode_prepare
0.52% pread [kernel.vmlinux] [k] ktime_get_coarse_real_ts64
0.48% pread [kernel.vmlinux] [k] up_read
0.42% pread [kernel.vmlinux] [k] do_syscall_64
0.28% pread [kernel.vmlinux] [k] iov_iter_init
0.13% pread [kernel.vmlinux] [k] __fdget
0.12% pread [kernel.vmlinux] [k] entry_SYSCALL_64_safe_stack
0.03% pread pread [.] pread@plt
0.00% perf [kernel.vmlinux] [k] x86_pmu_enable_all
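
To compare the three profiles at a glance, the shares of the filesystem and
iov_iter symbols can be summed per run. This is a rough sketch, not a
measurement: the percentages are copied from the listings above, the choice
of which symbols to group together is my assumption, and the names profiles
and path_share are made up for illustration. Each percentage is relative to
its own run, so the sums are only approximately comparable.

```python
# Percentages copied from the three perf profiles above. Each value is a
# share of that run's own samples, so cross-run comparison is approximate.
# Which symbols are grouped here is an assumption made for illustration.
profiles = {
    "read": {
        "nvfs_read": 3.63,
        "nvfs_bmap": 2.83,
    },
    "read_iter, separate read and write path": {
        "nvfs_read_iter": 6.27,
        "nvfs_bmap": 2.02,
        "new_sync_read": 2.15,
        "iov_iter_advance": 1.86,
        "iov_iter_init": 0.28,
    },
    "Al's read_iter": {
        "nvfs_read_iter": 4.17,
        "nvfs_bmap": 2.02,
        "_copy_to_iter": 2.66,
        "new_sync_read": 2.38,
        "iov_iter_init": 0.28,
    },
}


def path_share(name):
    """Summed percentage of the selected symbols in one profile."""
    return round(sum(profiles[name].values()), 2)


if __name__ == "__main__":
    for name in profiles:
        print(f"{name}: {path_share(name):.2f}%")
```

By this grouping the selected symbols account for 6.46% of the read profile,
12.58% of the separate-path read_iter profile and 11.51% of your read_iter
profile.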
Mikulas