From: kernel test robot <rong.a.chen@intel.com>
To: lkp@lists.01.org
Subject: Re: [fsnotify] 60f7ed8c7c: will-it-scale.per_thread_ops -5.9% regression
Date: Thu, 18 Oct 2018 17:22:59 +0800 [thread overview]
Message-ID: <20181018092259.GE16117@shao2-debian> (raw)
In-Reply-To: <CAOQ4uxi-xfukNt0dHMrjokcXJ9keeaD9Xx9Q4DHRv_SH_YcQXQ@mail.gmail.com>
On Mon, Oct 15, 2018 at 03:26:13PM +0300, Amir Goldstein wrote:
> On Mon, Oct 15, 2018 at 12:27 PM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Mon, Oct 15, 2018 at 10:50 AM Rong Chen <rong.a.chen@intel.com> wrote:
> > [...]
> > > the patch does not seem to work.
> > >
> > > tests: 1
> > > testcase/path_params/tbox_group/run: will-it-scale/16-thread-unlink2-performance/lkp-bdw-ep3d
> > >
> > > commit:
> > > 1e6cb72399 ("fsnotify: add super block object type")
> > > 298cd0b2f4 (the below patch)
> > >
> > > 1e6cb72399fd58b3 298cd0b2f481d9cc2e2cd5bfd3
> > > ---------------- --------------------------
> > > %stddev change %stddev
> > > \ | \
> > > 103.21 -5% 98.54 will-it-scale.time.user_time
> > > 46266 -6% 43516 will-it-scale.time.involuntary_context_switches
> > > 54483 -7% 50610 will-it-scale.per_thread_ops
> > > 871749 -7% 809765 will-it-scale.workload
> >
> > Thanks for testing my patch. As Jan commented, it is not surprising
> > that the patch makes no difference.
> >
> > I would like to clarify a few things about how you ran the test
> > before I continue to investigate:
> >
> > 1. When I ran the workload I saw that it writes files to whatever filesystem is
> > mounted on /tmp. Can I assume you have tmpfs mounted at /tmp?
Yes, it's tmpfs.
> >
> > 2. Can you confirm that there is no fanotify mount mark on the /tmp mount?
> > for example:
> > # ls -l /proc/*/fd/*|grep fanotify
> > lrwx------ 1 root root 64 Oct 15 08:36 /proc/3927/fd/3 -> anon_inode:[fanotify]
> > # grep fanotify.mnt_id /proc/3927/fdinfo/3
> > fanotify mnt_id:33 mflags:0 mask:3b ignored_mask:0
> > # grep ^$(( 0x33 )) /proc/3927/mountinfo
> > 51 16 0:27 / /tmp rw,relatime shared:18 - tmpfs tmpfs rw
No fanotify mount mark.
> >
> > 3. I saw that LKP caches the results for a specific commit
> > (i.e. 1e6cb72399 ("fsnotify: add super block object type")).
> > Did you use cached results when comparing to patch or did you re-run the
> > test with the "good" commit? The reason I am asking is because
> > sometimes performance result may differ between boots even with no
> > kernel code change.
> > Were all the "good" bisect samples taken from the same boot/machine?
> > or different boots/machines?
From the same boot/machine; the only difference is the commit.
> >
> > 4. If this regression is reliably reproduced, then our best bet is on the
> > cost of access to s_fsnotify_{marks,mask} fields.
> > The patch below moves those frequently accessed fields near the
> > frequently accessed fields s_time_gran,s_writers and moves
> > the seldom accessed fields s_id,s_uuid further away.
> > Could you please try this patch?
> >
>
> Better test this patch instead. It does a bit more re-organizing.
> If this works well for 16-thread-unlink2 workload, could you please
> also run it through other workloads to see if it improves them as well?
> and does not degrade them...
The patch looks good.
tests: 1
testcase/path_params/tbox_group/run: will-it-scale/16-thread-unlink2-performance/lkp-bdw-ep3d
1e6cb72399fd58b3 bfb397545bb4d4bbca5ffb5e52
---------------- --------------------------
%stddev change %stddev
\ | \
81035 -8% 74707 interrupts.CAL:Function_call_interrupts
137 135 turbostat.PkgWatt
1507 -8% 1382 vmstat.system.cs
452379 -8% 414960 perf-stat.context-switches
1.02e+08 ± 5% -17% 84231388 ± 7% perf-stat.dTLB-store-misses
0.01 ± 7% -20% 0.01 ± 7% perf-stat.dTLB-store-miss-rate%
3937 52% 5968 proc-vmstat.nr_zone_inactive_anon
3937 52% 5968 proc-vmstat.nr_inactive_anon
8101 ± 7% 17% 9456 ± 7% proc-vmstat.nr_shmem
6766 13% 7662 proc-vmstat.nr_mapped
18520 18807 proc-vmstat.nr_slab_reclaimable
4784 4724 proc-vmstat.numa_other
0 9e+05 917236 ±119% latency_stats.avg.io_schedule.nfs_lock_and_join_requests.nfs_updatepage.nfs_write_end.generic_perform_write.nfs_file_write.__vfs_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
3597 ± 18% 9e+05 919831 ±118% latency_stats.avg.max
269 ± 81% 8e+03 8196 ±217% latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup_revalidate.lookup_fast.walk_component.link_path_walk.path_lookupat.filename_lookup
0 7e+03 6926 ±185% latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_do_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open.do_syscall_64
0 9e+05 929948 ±116% latency_stats.max.io_schedule.nfs_lock_and_join_requests.nfs_updatepage.nfs_write_end.generic_perform_write.nfs_file_write.__vfs_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
4934 9e+05 932944 ±116% latency_stats.max.max
300 ± 86% 2e+04 16025 ±222% latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup_revalidate.lookup_fast.walk_component.link_path_walk.path_lookupat.filename_lookup
0 7e+03 6926 ±185% latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_do_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open.do_syscall_64
0 1e+06 1209492 ±123% latency_stats.sum.io_schedule.nfs_lock_and_join_requests.nfs_updatepage.nfs_write_end.generic_perform_write.nfs_file_write.__vfs_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
10126 ± 27% 2e+04 32104 ± 72% latency_stats.sum.do_syslog.kmsg_read.proc_reg_read.__vfs_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
663 ±113% 2e+04 16393 ±217% latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup_revalidate.lookup_fast.walk_component.link_path_walk.path_lookupat.filename_lookup
0 7e+03 6926 ±185% latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_do_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open.do_syscall_64
2357 ± 62% 3e+03 5243 ±123% latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_getattr.__nfs_revalidate_inode.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_lookupat
Best Regards,
Rong Chen
>
> Thanks,
> Amir.
>
> ---
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 25a449f37bb1..baec0b3ff53f 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1393,17 +1393,24 @@ struct super_block {
>
> struct sb_writers s_writers;
>
> + /* START frequently accessed fields block */
> + void *s_fs_info; /* Filesystem private info */
> +
> + /* Granularity of c/m/atime in ns (cannot be worse than a second) */
> + u32 s_time_gran;
> +#ifdef CONFIG_FSNOTIFY
> + __u32 s_fsnotify_mask;
> + struct fsnotify_mark_connector __rcu *s_fsnotify_marks;
> +#endif
> + /* END frequently accessed fields block */
> +
> + /* START seldom accessed fields block */
> char s_id[32]; /* Informational name */
> uuid_t s_uuid; /* UUID */
>
> - void *s_fs_info; /* Filesystem private info */
> unsigned int s_max_links;
> fmode_t s_mode;
>
> - /* Granularity of c/m/atime in ns.
> - Cannot be worse than a second */
> - u32 s_time_gran;
> -
> /*
> * The next field is for VFS *only*. No filesystems have any business
> * even looking at it. You had been warned.
> @@ -1415,6 +1422,7 @@ struct super_block {
> * in /proc/mounts will be "type.subtype"
> */
> char *s_subtype;
> + /* END seldom accessed fields block */
>
> const struct dentry_operations *s_d_op; /* default d_op for dentries */
>
> @@ -1464,11 +1472,6 @@ struct super_block {
>
> spinlock_t s_inode_wblist_lock;
> struct list_head s_inodes_wb; /* writeback inodes */
> -
> -#ifdef CONFIG_FSNOTIFY
> - __u32 s_fsnotify_mask;
> - struct fsnotify_mark_connector __rcu *s_fsnotify_marks;
> -#endif
> } __randomize_layout;
Thread overview: 14+ messages
2018-09-30 6:51 [LKP] [fsnotify] 60f7ed8c7c: will-it-scale.per_thread_ops -5.9% regression kernel test robot
2018-09-30 6:51 ` kernel test robot
2018-09-30 9:00 ` [LKP] " Amir Goldstein
2018-09-30 9:16 ` Amir Goldstein
2018-10-01 9:25 ` Jan Kara
2018-10-01 9:25 ` Jan Kara
2018-10-15 7:51 ` Rong Chen
2018-10-15 9:27 ` [LKP] " Amir Goldstein
2018-10-15 12:26 ` Amir Goldstein
2018-10-18 9:22 ` kernel test robot [this message]
2018-10-01 9:32 ` Jan Kara
2018-10-01 9:32 ` Jan Kara
2018-10-01 9:52 ` [LKP] " Amir Goldstein
2018-10-02 14:49 ` Amir Goldstein