linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Huang\, Ying" <ying.huang@intel.com>
To: Minchan Kim <minchan@kernel.org>
Cc: "Huang\, Ying" <ying.huang@intel.com>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Rik van Riel <riel@redhat.com>, "Michal Hocko" <mhocko@suse.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Michal Hocko <mhocko@kernel.org>,
	"Vinayak Menon" <vinmenon@codeaurora.org>,
	Mel Gorman <mgorman@suse.de>,
	Andrew Morton <akpm@linux-foundation.org>, <lkp@01.org>
Subject: Re: [LKP] [lkp] [mm] 5c0a85fad9: unixbench.score -6.3% regression
Date: Fri, 17 Jun 2016 12:26:51 -0700	[thread overview]
Message-ID: <87y463hb0k.fsf@yhuang-mobile.sh.intel.com> (raw)
In-Reply-To: <20160617054156.GB2374@bbox> (Minchan Kim's message of "Fri, 17 Jun 2016 14:41:56 +0900")

Minchan Kim <minchan@kernel.org> writes:

> On Thu, Jun 16, 2016 at 03:27:44PM -0700, Huang, Ying wrote:
>> Minchan Kim <minchan@kernel.org> writes:
>> 
>> > On Thu, Jun 16, 2016 at 07:52:26AM +0800, Huang, Ying wrote:
>> >> "Kirill A. Shutemov" <kirill@shutemov.name> writes:
>> >> 
>> >> > On Tue, Jun 14, 2016 at 05:57:28PM +0900, Minchan Kim wrote:
>> >> >> On Wed, Jun 08, 2016 at 11:58:11AM +0300, Kirill A. Shutemov wrote:
>> >> >> > On Wed, Jun 08, 2016 at 04:41:37PM +0800, Huang, Ying wrote:
>> >> >> > > "Huang, Ying" <ying.huang@intel.com> writes:
>> >> >> > > 
>> >> >> > > > "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> writes:
>> >> >> > > >
>> >> >> > > >> On Mon, Jun 06, 2016 at 10:27:24AM +0800, kernel test robot wrote:
>> >> >> > > >>> 
>> >> >> > > >>> FYI, we noticed a -6.3% regression of unixbench.score due to commit:
>> >> >> > > >>> 
>> >> >> > > >>> commit 5c0a85fad949212b3e059692deecdeed74ae7ec7 ("mm: make faultaround produce old ptes")
>> >> >> > > >>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
>> >> >> > > >>> 
>> >> >> > > >>> in testcase: unixbench
>> >> >> > > >>> on test machine: lituya: 16 threads Haswell High-end Desktop (i7-5960X 3.0G) with 16G memory
>> >> >> > > >>> with following parameters: cpufreq_governor=performance/nr_task=1/test=shell8
>> >> >> > > >>> 
>> >> >> > > >>> 
>> >> >> > > >>> Details are as below:
>> >> >> > > >>> -------------------------------------------------------------------------------------------------->
>> >> >> > > >>> 
>> >> >> > > >>> 
>> >> >> > > >>> =========================================================================================
>> >> >> > > >>> compiler/cpufreq_governor/kconfig/nr_task/rootfs/tbox_group/test/testcase:
>> >> >> > > >>>   gcc-4.9/performance/x86_64-rhel/1/debian-x86_64-2015-02-07.cgz/lituya/shell8/unixbench
>> >> >> > > >>> 
>> >> >> > > >>> commit: 
>> >> >> > > >>>   4b50bcc7eda4d3cc9e3f2a0aa60e590fedf728c5
>> >> >> > > >>>   5c0a85fad949212b3e059692deecdeed74ae7ec7
>> >> >> > > >>> 
>> >> >> > > >>> 4b50bcc7eda4d3cc 5c0a85fad949212b3e059692de 
>> >> >> > > >>> ---------------- -------------------------- 
>> >> >> > > >>>        fail:runs  %reproduction    fail:runs
>> >> >> > > >>>            |             |             |    
>> >> >> > > >>>           3:4          -75%            :4     kmsg.DHCP/BOOTP:Reply_not_for_us,op[#]xid[#]
>> >> >> > > >>>          %stddev     %change         %stddev
>> >> >> > > >>>              \          |                \  
>> >> >> > > >>>      14321 .  0%      -6.3%      13425 .  0%  unixbench.score
>> >> >> > > >>>    1996897 .  0%      -6.1%    1874635 .  0%  unixbench.time.involuntary_context_switches
>> >> >> > > >>>  1.721e+08 .  0%      -6.2%  1.613e+08 .  0%  unixbench.time.minor_page_faults
>> >> >> > > >>>     758.65 .  0%      -3.0%     735.86 .  0%  unixbench.time.system_time
>> >> >> > > >>>     387.66 .  0%      +5.4%     408.49 .  0%  unixbench.time.user_time
>> >> >> > > >>>    5950278 .  0%      -6.2%    5583456 .  0%  unixbench.time.voluntary_context_switches
>> >> >> > > >>
>> >> >> > > >> That's weird.
>> >> >> > > >>
>> >> >> > > >> I don't understand why the change would reduce number or minor faults.
>> >> >> > > >> It should stay the same on x86-64. Rise of user_time is puzzling too.
>> >> >> > > >
>> >> >> > > > unixbench runs in fixed time mode.  That is, the total time to run
>> >> >> > > > unixbench is fixed, but the work done varies.  So the minor_page_faults
>> >> >> > > > change may reflect only the work done.
>> >> >> > > >
>> >> >> > > >> Hm. Is reproducible? Across reboot?
>> >> >> > > >
>> >> >> > > 
>> >> >> > > And FYI, there is no swap setup for test, all root file system including
>> >> >> > > benchmark files are in tmpfs, so no real page reclaim will be
>> >> >> > > triggered.  But it appears that active file cache reduced after the
>> >> >> > > commit.
>> >> >> > > 
>> >> >> > >     111331 .  1%     -13.3%      96503 .  0%  meminfo.Active
>> >> >> > >      27603 .  1%     -43.9%      15486 .  0%  meminfo.Active(file)
>> >> >> > > 
>> >> >> > > I think this is the expected behavior of the commit?
>> >> >> > 
>> >> >> > Yes, it's expected.
>> >> >> > 
>> >> >> > After the change faularound would produce old pte. It means there's more
>> >> >> > chance for these pages to be on inactive lru, unless somebody actually
>> >> >> > touch them and flip accessed bit.
>> >> >> 
>> >> >> Hmm, tmpfs pages should be in anonymous LRU list and VM shouldn't scan
>> >> >> anonymous LRU list on swapless system so I really wonder why active file
>> >> >> LRU is shrunk.
>> >> >
>> >> > Hm. Good point. I don't why we have anything on file lru if there's no
>> >> > filesystems except tmpfs.
>> >> >
>> >> > Ying, how do you get stuff to the tmpfs?
>> >> 
>> >> We put root file system and benchmark into a set of compressed cpio
>> >> archive, then concatenate them into one initrd, and finally kernel use
>> >> that initrd as initramfs.
>> >
>> > I see.
>> >
>> > Could you share your 4 full vmstat(/proc/vmstat) files?
>> >
>> > old:
>> >
>> > cat /proc/vmstat > before.old.vmstat
>> > do benchmark
>> > cat /proc/vmstat > after.old.vmstat
>> >
>> > new:
>> >
>> > cat /proc/vmstat > before.new.vmstat
>> > do benchmark
>> > cat /proc/vmstat > after.new.vmstat
>> >
>> > IOW, I want to see stats related to reclaim.
>> 
>> Hi,
>> 
>> The /proc/vmstat for the parent commit (parent-proc-vmstat.gz) and first
>> bad commit (fbc-proc-vmstat.gz) are attached with the email.
>> 
>> The contents of the file is more than the vmstat before and after
>> benchmark running, but are sampled every 1 seconds.  Every sample begin
>> with "time: <time>".  You can check the first and last samples.  The
>> first /proc/vmstat capturing is started at the same time of the
>> benchmark, so it is not exactly the vmstat before the benchmark running.
>> 
>
> Thanks for the testing!
>
> nr_active_file was shrunk 48% but the vaule itself is not huge so
> I don't think it affects performance a lot.
>
> There was no reclaim activity for testing. :(
>
> pgfault, 6% reduced. Given that, pgalloc/free reduced 6%, too
> because unixbench was time fixed mode and 6% regressed so no
> doubt.
>
> No interesting data.
>
> It seems you tested it with THP, maybe always mode?

Yes.  With following in kconfig.

CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y

> I'm so sorry but could you test it with disabling CONFIG_TRANSPARENT_HUGEPAGE=n
> again? it might you already did.
> Is it still 6% regressed with disabling THP?

Yes.  I disabled THP via

echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag

The regression is the same as before.

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_task/rootfs/tbox_group/test/testcase/thp_defrag/thp_enabled:
  gcc-4.9/performance/x86_64-rhel/1/debian-x86_64-2015-02-07.cgz/lituya/shell8/unixbench/never/never

commit: 
  4b50bcc7eda4d3cc9e3f2a0aa60e590fedf728c5
  5c0a85fad949212b3e059692deecdeed74ae7ec7

4b50bcc7eda4d3cc 5c0a85fad949212b3e059692de 
---------------- -------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     14332 ±  0%      -6.2%      13438 ±  0%  unixbench.score
   6662206 ±  0%      -6.2%    6252260 ±  0%  unixbench.time.involuntary_context_switches
 5.734e+08 ±  0%      -6.2%  5.376e+08 ±  0%  unixbench.time.minor_page_faults
      2527 ±  0%      -3.2%       2446 ±  0%  unixbench.time.system_time
      1291 ±  0%      +5.4%       1361 ±  0%  unixbench.time.user_time
  19875455 ±  0%      -6.3%   18622488 ±  0%  unixbench.time.voluntary_context_switches
   6570355 ±  0%     -11.9%    5787517 ±  0%  cpuidle.C1-HSW.usage
     17257 ± 34%     -59.1%       7055 ±  7%  latency_stats.sum.ep_poll.SyS_epoll_wait.entry_SYSCALL_64_fastpath
      5976 ±  0%     -43.0%       3404 ±  0%  proc-vmstat.nr_active_file
     45729 ±  1%     -22.5%      35439 ±  1%  meminfo.Active
     23905 ±  0%     -43.0%      13619 ±  0%  meminfo.Active(file)
      8465 ±  3%     -29.8%       5940 ±  3%  slabinfo.pid.active_objs
      8476 ±  3%     -29.9%       5940 ±  3%  slabinfo.pid.num_objs
      3.46 ±  0%     +12.5%       3.89 ±  0%  turbostat.CPU%c3
     67.09 ±  0%      -2.1%      65.65 ±  0%  turbostat.PkgWatt
     96090 ±  0%      -5.8%      90479 ±  0%  vmstat.system.cs
      9083 ±  0%      -2.7%       8833 ±  0%  vmstat.system.in
    467.35 ± 78%    +416.7%       2414 ± 45%  sched_debug.cfs_rq:/.MIN_vruntime.avg
      7477 ± 78%    +327.7%      31981 ± 39%  sched_debug.cfs_rq:/.MIN_vruntime.max
      1810 ± 78%    +360.1%       8327 ± 40%  sched_debug.cfs_rq:/.MIN_vruntime.stddev
    467.35 ± 78%    +416.7%       2414 ± 45%  sched_debug.cfs_rq:/.max_vruntime.avg
      7477 ± 78%    +327.7%      31981 ± 39%  sched_debug.cfs_rq:/.max_vruntime.max
      1810 ± 78%    +360.1%       8327 ± 40%  sched_debug.cfs_rq:/.max_vruntime.stddev
    -10724 ± -7%     -12.0%      -9433 ± -3%  sched_debug.cfs_rq:/.spread0.avg
    -17721 ± -4%      -9.8%     -15978 ± -2%  sched_debug.cfs_rq:/.spread0.min
     90355 ±  9%     +14.1%     103099 ±  5%  sched_debug.cpu.avg_idle.min
      0.12 ± 35%    +325.0%       0.52 ± 46%  sched_debug.cpu.cpu_load[0].min
     21913 ±  2%     +29.1%      28288 ± 14%  sched_debug.cpu.curr->pid.avg
     49953 ±  3%     +30.2%      65038 ±  0%  sched_debug.cpu.curr->pid.max
     23062 ±  2%     +30.1%      29996 ±  4%  sched_debug.cpu.curr->pid.stddev
    274.39 ±  5%     -10.2%     246.27 ±  3%  sched_debug.cpu.nr_uninterruptible.max
    242.73 ±  4%     -13.5%     209.90 ±  2%  sched_debug.cpu.nr_uninterruptible.stddev

Best Regards,
Huang, Ying

>                  nr_free_pages      -6663      -6461    96.97%
>                 nr_alloc_batch       2594       4013   154.70%
>               nr_inactive_anon        112        112   100.00%
>                 nr_active_anon       2536       2159    85.13%
>               nr_inactive_file       -567       -227    40.04%
>                 nr_active_file        648        315    48.61%
>                 nr_unevictable          0          0     0.00%
>                       nr_mlock          0          0     0.00%
>                  nr_anon_pages       2634       2161    82.04%
>                      nr_mapped        511        530   103.72%
>                  nr_file_pages        207        215   103.86%
>                       nr_dirty         -7         -6    85.71%
>                   nr_writeback          0          0     0.00%
>            nr_slab_reclaimable        158        328   207.59%
>          nr_slab_unreclaimable       2208       2115    95.79%
>            nr_page_table_pages        268        247    92.16%
>                nr_kernel_stack        143         80    55.94%
>                    nr_unstable          1          1   100.00%
>                      nr_bounce          0          0     0.00%
>                nr_vmscan_write          0          0     0.00%
>    nr_vmscan_immediate_reclaim          0          0     0.00%
>              nr_writeback_temp          0          0     0.00%
>               nr_isolated_anon          0          0     0.00%
>               nr_isolated_file          0          0     0.00%
>                       nr_shmem        131        131   100.00%
>                     nr_dirtied         67         78   116.42%
>                     nr_written         74         84   113.51%
>               nr_pages_scanned          0          0     0.00%
>                       numa_hit  483752446  453696304    93.79%
>                      numa_miss          0          0     0.00%
>                   numa_foreign          0          0     0.00%
>                numa_interleave          0          0     0.00%
>                     numa_local  483752445  453696304    93.79%
>                     numa_other          1          0     0.00%
>             workingset_refault          0          0     0.00%
>            workingset_activate          0          0     0.00%
>         workingset_nodereclaim          0          0     0.00%
>  nr_anon_transparent_hugepages          1          0     0.00%
>                    nr_free_cma          0          0     0.00%
>             nr_dirty_threshold      -1316      -1274    96.81%
>  nr_dirty_background_threshold       -658       -637    96.81%
>                         pgpgin          0          0     0.00%
>                        pgpgout          0          0     0.00%
>                         pswpin          0          0     0.00%
>                        pswpout          0          0     0.00%
>                    pgalloc_dma          0          0     0.00%
>                  pgalloc_dma32   60130977   56323630    93.67%
>                 pgalloc_normal  457203182  428863437    93.80%
>                pgalloc_movable          0          0     0.00%
>                         pgfree  517327743  485181251    93.79%
>                     pgactivate    2059556    1930950    93.76%
>                   pgdeactivate          0          0     0.00%
>                        pgfault  572723351  537107146    93.78%
>                     pgmajfault          0          0     0.00%
>                    pglazyfreed          0          0     0.00%
>                   pgrefill_dma          0          0     0.00%
>                 pgrefill_dma32          0          0     0.00%
>                pgrefill_normal          0          0     0.00%
>               pgrefill_movable          0          0     0.00%
>             pgsteal_kswapd_dma          0          0     0.00%
>           pgsteal_kswapd_dma32          0          0     0.00%
>          pgsteal_kswapd_normal          0          0     0.00%
>         pgsteal_kswapd_movable          0          0     0.00%
>             pgsteal_direct_dma          0          0     0.00%
>           pgsteal_direct_dma32          0          0     0.00%
>          pgsteal_direct_normal          0          0     0.00%
>         pgsteal_direct_movable          0          0     0.00%
>              pgscan_kswapd_dma          0          0     0.00%
>            pgscan_kswapd_dma32          0          0     0.00%
>           pgscan_kswapd_normal          0          0     0.00%
>          pgscan_kswapd_movable          0          0     0.00%
>              pgscan_direct_dma          0          0     0.00%
>            pgscan_direct_dma32          0          0     0.00%
>           pgscan_direct_normal          0          0     0.00%
>          pgscan_direct_movable          0          0     0.00%
>         pgscan_direct_throttle          0          0     0.00%
>            zone_reclaim_failed          0          0     0.00%
>                   pginodesteal          0          0     0.00%
>                  slabs_scanned          0          0     0.00%
>              kswapd_inodesteal          0          0     0.00%
>   kswapd_low_wmark_hit_quickly          0          0     0.00%
>  kswapd_high_wmark_hit_quickly          0          0     0.00%
>                     pageoutrun          0          0     0.00%
>                     allocstall          0          0     0.00%
>                      pgrotated          0          0     0.00%
>                 drop_pagecache          0          0     0.00%
>                      drop_slab          0          0     0.00%
>               numa_pte_updates          0          0     0.00%
>          numa_huge_pte_updates          0          0     0.00%
>               numa_hint_faults          0          0     0.00%
>         numa_hint_faults_local          0          0     0.00%
>            numa_pages_migrated          0          0     0.00%
>              pgmigrate_success          0          0     0.00%
>                 pgmigrate_fail          0          0     0.00%
>        compact_migrate_scanned          0          0     0.00%
>           compact_free_scanned          0          0     0.00%
>               compact_isolated          0          0     0.00%
>                  compact_stall          0          0     0.00%
>                   compact_fail          0          0     0.00%
>                compact_success          0          0     0.00%
>            compact_daemon_wake          0          0     0.00%
>       htlb_buddy_alloc_success          0          0     0.00%
>          htlb_buddy_alloc_fail          0          0     0.00%
>         unevictable_pgs_culled          0          0     0.00%
>        unevictable_pgs_scanned          0          0     0.00%
>        unevictable_pgs_rescued          0          0     0.00%
>        unevictable_pgs_mlocked          0          0     0.00%
>      unevictable_pgs_munlocked          0          0     0.00%
>        unevictable_pgs_cleared          0          0     0.00%
>       unevictable_pgs_stranded          0          0     0.00%
>                thp_fault_alloc      22731      21604    95.04%
>             thp_fault_fallback          0          0     0.00%
>             thp_collapse_alloc          1          0     0.00%
>      thp_collapse_alloc_failed          0          0     0.00%
>                 thp_split_page          0          0     0.00%
>          thp_split_page_failed          0          0     0.00%
>        thp_deferred_split_page      22731      21604    95.04%
>                  thp_split_pmd          0          0     0.00%
>            thp_zero_page_alloc          0          0     0.00%
>     thp_zero_page_alloc_failed          0          0     0.00%
>                balloon_inflate          0          0     0.00%
>                balloon_deflate          0          0     0.00%
>                balloon_migrate          0          0     0.00%

  reply	other threads:[~2016-06-17 19:26 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-06-06  2:27 [lkp] [mm] 5c0a85fad9: unixbench.score -6.3% regression kernel test robot
2016-06-06  9:51 ` Kirill A. Shutemov
2016-06-08  7:21   ` [LKP] " Huang, Ying
2016-06-08  8:41     ` Huang, Ying
2016-06-08  8:58       ` Kirill A. Shutemov
2016-06-12  0:49         ` Huang, Ying
2016-06-12  1:02           ` Linus Torvalds
2016-06-13  9:02             ` Huang, Ying
2016-06-14 13:38               ` Minchan Kim
2016-06-15 23:42                 ` Huang, Ying
2016-06-13 12:52             ` Kirill A. Shutemov
2016-06-14  6:11               ` Linus Torvalds
2016-06-14  8:26                 ` Kirill A. Shutemov
2016-06-14 16:07                   ` Rik van Riel
2016-06-14 14:03                 ` Christian Borntraeger
2016-06-14  8:57         ` Minchan Kim
2016-06-14 14:34           ` Kirill A. Shutemov
2016-06-15 23:52             ` Huang, Ying
2016-06-16  0:13               ` Minchan Kim
2016-06-16 22:27                 ` Huang, Ying
2016-06-17  5:41                   ` Minchan Kim
2016-06-17 19:26                     ` Huang, Ying [this message]
2016-06-20  0:06                       ` Minchan Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87y463hb0k.fsf@yhuang-mobile.sh.intel.com \
    --to=ying.huang@intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lkp@01.org \
    --cc=mgorman@suse.de \
    --cc=mhocko@kernel.org \
    --cc=mhocko@suse.com \
    --cc=minchan@kernel.org \
    --cc=riel@redhat.com \
    --cc=torvalds@linux-foundation.org \
    --cc=vinmenon@codeaurora.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).