From: Linus Torvalds <torvalds@linux-foundation.org>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: Christoph Hellwig <hch@lst.de>, Dave Chinner <david@fromorbit.com>,
    LKML <linux-kernel@vger.kernel.org>, Bob Peterson <rpeterso@redhat.com>,
    Wu Fengguang <fengguang.wu@intel.com>, LKP <lkp@01.org>
Subject: Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
Date: Thu, 11 Aug 2016 14:40:59 -0700
Message-ID: <CA+55aFx=mbAd0hEnji-TWZH_NReLLk+SbJHK1s+=r1_USV0KXw@mail.gmail.com>
In-Reply-To: <87ziojxazw.fsf@yhuang-mobile.sh.intel.com>

On Thu, Aug 11, 2016 at 2:16 PM, Huang, Ying <ying.huang@intel.com> wrote:
>
> Test result is as follow,

Thanks. No change.

> raw perf data:

I redid my munging, with the old (good) percentages in parentheses:

    intel_idle: 17.66 (16.88)
    copy_user_enhanced_fast_string: 3.25 (3.94)
    memset_erms: 2.56 (3.26)
    xfs_bmapi_read: 2.28
    ___might_sleep: 2.09 (2.33)
    __block_commit_write.isra.24: 2.07 (2.47)
    xfs_iext_bno_to_ext: 1.79
    __block_write_begin_int: 1.74 (1.56)
    up_write: 1.72 (1.61)
    unlock_page: 1.69 (1.69)
    down_write: 1.59 (1.55)
    __mark_inode_dirty: 1.54 (1.88)
    xfs_bmap_search_extents: 1.33
    xfs_iomap_write_delay: 1.23
    mark_buffer_dirty: 1.21 (1.53)
    __radix_tree_lookup: 1.2 (1.32)
    xfs_bmap_search_multi_extents: 1.18
    xfs_iomap_eof_want_preallocate.constprop.8: 1.17
    entry_SYSCALL_64_fastpath: 1.15 (1.47)
    __might_sleep: 1.14 (1.26)
    _raw_spin_lock: 0.97 (1.17)
    vfs_write: 0.94 (1.14)
    xfs_bmapi_delay: 0.93
    iomap_write_actor: 0.9
    pagecache_get_page: 0.89 (1.03)
    xfs_file_write_iter: 0.86 (1.03)
    xfs_file_iomap_begin: 0.81
    iov_iter_copy_from_user_atomic: 0.78 (0.87)
    iomap_apply: 0.77
    generic_write_end: 0.74 (1.36)
    xfs_file_buffered_aio_write: 0.72 (0.84)
    find_get_entry: 0.69 (0.79)
    __vfs_write: 0.67 (0.87)

and it's worth noting a few things:

 - most of the old percentages are bigger, but that's natural: the
   load used to take longer, and the more efficient (old) case thus has
   higher percent values. That doesn't mean it was slower, quite the
   reverse.

 - the main exception is intel_idle, so we do have more idle time.

But the *big* difference is all the functions that didn't use to show
up at all, and have no previous percent values:

    xfs_bmapi_read: 2.28
    xfs_iext_bno_to_ext: 1.79
    xfs_bmap_search_extents: 1.33
    xfs_iomap_write_delay: 1.23
    xfs_bmap_search_multi_extents: 1.18
    xfs_iomap_eof_want_preallocate.constprop.8: 1.17
    xfs_bmapi_delay: 0.93
    iomap_write_actor: 0.9
    xfs_file_iomap_begin: 0.81
    iomap_apply: 0.77

and I think this really can explain the regression. That all adds up
to 12% or so of "new overhead". Which is fairly close to the
regression.

(Ok, that is playing fast and loose with percentages, but I think it
might be "close enough" in practice).

So for some reason the new code doesn't do a lot more per-page
operations (the unlock_page() etc costs are fairly similar), but it
has a *much* more expensive footprint in the xfs_bmap/iomap functions.

The old code had almost no XFS footprint at all, and didn't need to
look up block mappings etc, and worked almost entirely with the vfs
caches (so used the block numbers in the buffers etc).

And I know that DaveC often complains about vfs overhead, but the fact
is, the VFS layer is optimized to hell and back and does really really
well. Having to call down to filesystem routines (for block mappings
etc) is when performance goes down. I think this is an example of
that.

And hey, maybe I'm just misreading things, or reading too much into
those profiles. But it does look like that commit
68a9f5e7007c1afa2cf6830b690a90d0187c0684 ends up causing more xfs bmap
activity.

              Linus
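
As a check on the "12% or so" figure in the message above, here is a minimal sketch that sums the new-only profile entries and compares the total with the reported -13.6% regression. The variable names are made up for illustration, and the runtime comparison rests on two assumptions that are not in the message: that the benchmark does a fixed amount of work (so runtime is inversely proportional to jobs-per-min), and that the new-only functions are pure added cost with nothing else changing.

```python
# Percentages of the functions that appear only in the new (regressed)
# profile, copied from the list in the message above.
NEW_ONLY = {
    "xfs_bmapi_read": 2.28,
    "xfs_iext_bno_to_ext": 1.79,
    "xfs_bmap_search_extents": 1.33,
    "xfs_iomap_write_delay": 1.23,
    "xfs_bmap_search_multi_extents": 1.18,
    "xfs_iomap_eof_want_preallocate.constprop.8": 1.17,
    "xfs_bmapi_delay": 0.93,
    "iomap_write_actor": 0.9,
    "xfs_file_iomap_begin": 0.81,
    "iomap_apply": 0.77,
}

# Total share of the new profile spent in functions with no "old" value.
new_overhead = sum(NEW_ONLY.values())

# Reported regression in aim7.jobs-per-min, from the subject line.
regression = 13.6

# Assumption: fixed work, so old_runtime / new_runtime equals
# new_throughput / old_throughput.
observed_old_over_new = 1 - regression / 100

# Assumption: the new-only functions are pure added cost, so removing
# them from the new run would give back the old runtime.
predicted_old_over_new = 1 - new_overhead / 100

print(f"new overhead: {new_overhead:.2f}%")
print(f"old/new runtime, observed:  {observed_old_over_new:.3f}")
print(f"old/new runtime, predicted: {predicted_old_over_new:.3f}")
```

The ten entries sum to about 12.39%, which is the "12% or so" quoted above. Under the stated assumptions the predicted and observed runtime ratios land within roughly a percentage point of each other, which is consistent with the "close enough" remark, though it does not account for the change in idle time.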